Here is a summary of the system limits for users of our computing resources. Memory is the maximum you can allocate per node in practice; Nodes is the total number of nodes in the partition; and Cores is the maximum total per user, not the number of cores per node.
These are the default values; individual users and units may have limits that are higher or lower than these (hardware permitting).
Deigo has a number of partitions, each with its own restrictions on maximum cores and time.
| Partition | Memory per node | Nodes | Cores or nodes per user | Memory per user | Runtime per job | Notes |
|---|---|---|---|---|---|---|
| Compute | 500G | 446 | 2000 | 7500G | 4 days | (1) |
| Short | 500G | 622 | 4000 | 6500G | 2 hours | (1) |
| Largemem | 500G | 44 | 5 nodes | — | ∞ | unrestricted(2) |
| Largemem | 750G | 14 | 5 nodes | — | ∞ | unrestricted(2) |
| Bigmem | 2978G | 1 | 8 | — | ∞ | unrestricted(2) |
| Bigmem | 1500G | 1 | 8 | — | ∞ | unrestricted(2) |
| Largejob | 500G | 150 | 50 nodes | — | 2 days | Special(3) |
| datacp | 185G | 4 | 4 | 19G | ∞ | for moving data(4) |
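As a concrete illustration, a job that stays within the Compute partition's defaults could be requested with a Slurm batch script along these lines. This is a minimal sketch: the lowercase partition name, the program name, and the resource values are assumptions for illustration, so check `sinfo` for the real partition names and adjust the numbers to your own work.

```bash
#!/bin/bash
# Minimal sketch of a job within the Compute partition defaults
# (4-day runtime limit, ~500G usable per node, 2000 cores per user in total).
# Partition name, program name, and values are illustrative.
#SBATCH --partition=compute
#SBATCH --time=3-00:00:00      # must stay within the 4-day limit
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32     # counts against your total per-user core limit
#SBATCH --mem=128G             # per node; at most about 500G in practice

srun ./my_program              # replace with your actual program
```
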
| Storage | Amount | Notes |
|---|---|---|
| /home | 50G | Per-user limit |
| /flash | 10T | Per-unit limit |
| /bucket | 50T | Per-unit limit, expandable |
| naruto | — | Tape for archiving; expands as needed |
If you need more compute time than the current limit, you can ask us to increase that limit in exchange for fewer cores, as detailed here.
Largemem and Bigmem have no fixed maximum time. However, it’s a bad idea to run for more than a couple of weeks: the risk that a hardware or software error, an electrical outage, or emergency maintenance will kill your job prematurely increases greatly.
Largejob is for many-core computations, and requires an application from the user’s unit leader. You can apply using this form.
The datacp partition has four nodes, and is only for transferring large data volumes between Bucket and Flash. Do not use these nodes for any kind of computation.
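For instance, a data-moving job on datacp might look like the sketch below. The unit name and directories are placeholders, and rsync is simply one common way to copy data; use whatever transfer tool you prefer.

```bash
#!/bin/bash
# Sketch of a data-transfer job on the datacp partition.
# "YourUnit" and the directories are placeholders for your own unit's paths.
#SBATCH --partition=datacp
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

# Copy finished results from Flash to Bucket; rsync is one common choice.
rsync -a /flash/YourUnit/results/ /bucket/YourUnit/results/
```
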
Saion is a general accelerated computing system with three subsystems. You can read more about this on this page.
| Partition | Memory per node | Nodes | Cores per user | GPUs per user | Runtime per job | Notes |
|---|---|---|---|---|---|---|
| test-gpu | 497G | 6 | 18 | 2 | — | preemptible(1) |
| gpu | 497G | 16 | 36 | 4 | 7 days | P100, V100(2) |
| largegpu | 2T | 4 | varies | varies | varies | A100(3) |

| Storage | Amount | Notes |
|---|---|---|
| /home | 50G | Per-user limit; same system as Deigo |
| /bucket | 50T | Per-unit limit; same system as Deigo |
| /work | 10T | Per-unit limit |
The test-gpu partition is accessible to anybody. However, it is a low-priority partition; if somebody using a restricted partition wants to use the hardware, your job may be suspended or killed. This is best used for development and testing, not long computations.
If you need it, you can instead ask for 8 GPUs and 72 cores on the gpu partition, for up to two days.
The LargeGPU partition has 8 A100 GPUs per node, with 80GB memory per GPU, and is intended for jobs that can’t fit on the regular GPU partition. Allocation is determined by your need and by how busy the partition already is. Please contact us for details.
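As a sketch, a single-GPU job on the gpu partition might be requested as follows; again, the exact partition name, program name, and resource values are assumptions for illustration and should be adapted to your own code.

```bash
#!/bin/bash
# Sketch of a single-GPU job on the Saion gpu partition
# (by default up to 4 GPUs and 36 cores per user, 7-day runtime limit).
# Partition name, program name, and values are illustrative.
#SBATCH --partition=gpu
#SBATCH --time=2-00:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
#SBATCH --gres=gpu:1           # one GPU; the default cap is 4 per user

srun ./my_gpu_program          # replace with your actual program
```
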
This summarizes the public storage systems at OIST and which compute system uses each one. Also see our pages on the research storage system for more up-to-date information.
| System | Storage | Amount | Notes |
|---|---|---|---|
| Deigo, Saion | /home | 50G | Per-user |
| Deigo | /flash | 10T | Per-unit |
| Deigo, Saion | /bucket | 50T | Per-unit |
| Saion | /work | 10T | Per-unit |
| — | Comspace | 5T | Per-unit |
| — | naruto | — | Tape for archiving |
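If you want a rough idea of how much of these allocations you are actually using, standard tools like `du` work anywhere; the paths below are placeholders for your own unit's directories, and `du` can take a while on a large tree.

```bash
# Rough usage check with standard tools; paths are placeholders for your unit.
du -sh /bucket/YourUnit
du -sh /flash/YourUnit
du -sh "$HOME"
```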