At OIST we install scientific software on the clusters as modules. Modules let us have several versions of the same software available. You can use the latest release or you can stay with a trusted older version, and you can switch between versions with a single command.
You can also create your own modules for yourself or for your unit. The details of doing that are available here.
You use the module system to list the available modules and to load the
software that you need. Use the ‘module’ command to interact with the
module system. All commands are of the form ‘module <command>’. The
longer commands can be abbreviated, usually to 2-3 characters, for
example ‘av’ for ‘avail’ or ‘li’ for ‘list’.
In addition there is a very convenient short-form command ‘ml’ that you can
use instead of ‘module’. We use both interchangeably below.
First, let us see what modules are available. We do that with
‘module avail’ (short for “available”) or ‘ml av’:
$ module avail
---------------------- /apps/.metamodules81 ----------------------
amd-modules sango-legacy-modules
bioinfo-ugrp-modules user-modules
intel-modules
-------------------- /apps/.modulefiles81 --------------------
AIMAll/19.10 hdf5.icc/1.10.6
BUSCO/3.0.2 hmmer/3.1b2
BUSCO/4.0.6 igv/2.3.82
BUSCO/4.1.2 (D) isoseq3/3.4.0
Gaussian/09RE01R2 java-jdk/1.8.0_20
Gaussian/09RE01 java-jdk/11
Gaussian/16RC01 (D) java-jdk/14
HTSeq/0.9.1 java-jdk/17
MaterialsStudio/2016 java-jdk/21 (D)
MrBayes.mpi/3.2.3 jellyfish/2.2.7
...
‘avail’ or ‘av’ lists the software modules on the
system. As you can see, each package is listed by name and version,
and many packages have multiple versions. Each package version is a
separate module.
We also have “metamodules”, listed at the top. Metamodules are collections of modules that belong together in some way. We will talk more about these further down.
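Because each version is its own module, it can be handy to pull out just the versions of one package. The sketch below assumes that the terse listing (‘module --terse avail’) prints one name/version entry per line and writes to standard error, which is the usual behaviour but may be configured differently on a given system. ‘versions_of’ is a hypothetical helper name, not part of the module system.

```shell
# versions_of NAME: read a terse module listing on stdin and print only
# the versions of package NAME (hypothetical helper, not part of Lmod).
versions_of() {
    awk -v name="$1/" '
        # keep lines that start with "NAME/" and have a version after it
        index($0, name) == 1 && length($0) > length(name) {
            print substr($0, length(name) + 1)
        }'
}

# On the cluster: list the installed BUSCO versions.
# Assumes the terse listing goes to stderr, hence the 2>&1.
if command -v module >/dev/null 2>&1; then
    module --terse avail 2>&1 | versions_of BUSCO
fi
```

The filter matches the name literally, so a module name containing dots (such as ‘hdf5.icc’) is not treated as a pattern.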
You load a module with the ‘load’ command:
$ module load julia
The short form is simply:
$ ml julia
Let’s see what version of Julia we loaded:
$ julia --version
julia version 1.9.4
If you give no version, ‘load’ picks the default version of the software,
marked with “(D)” in the list; this is normally the latest version. If you
want a specific version, give the module name and the version separated by a slash:
$ module load julia/1.3.1
$ julia --version
julia version 1.3.1
You can see what modules you have loaded with ‘list’ or ‘li’:
$ module li
# or short form:
$ ml
Currently Loaded Modules:
1) julia/1.3.1
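In a script you may want to check for a specific loaded version rather than read the list by eye. The following is a minimal sketch, assuming the module system records loaded modules in the colon-separated LOADEDMODULES environment variable (this is the common convention, but check it on your system). ‘is_loaded’ is a hypothetical helper name.

```shell
# is_loaded NAME/VERSION: succeed if exactly that module is loaded.
# Assumes the module system keeps the loaded modules in the
# colon-separated LOADEDMODULES environment variable.
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example: warn before running an analysis with the wrong Julia.
if ! is_loaded julia/1.3.1; then
    echo "warning: julia/1.3.1 is not loaded" >&2
fi
```

The check needs the full name with version; a bare ‘julia’ does not match.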
You can remove a module again with the ‘unload’ command:
$ module unload julia
$ module list
No modules loaded
With the short form we can list modules with just ‘ml’, and we can
unload modules by adding a minus sign “-” in front:
# load julia, ruse and bamtools at once
$ ml julia ruse bamtools
$ ml
Currently Loaded Modules:
1) julia/1.9.4 2) ruse/1.0 3) bamtools/2.5.1
# unload ruse and bamtools, and switch to Julia 1.3.1
$ ml -ruse -bamtools julia/1.3.1
Some modules depend on other modules, and will load them automatically. Take, for instance, the BUSCO application:
$ ml BUSCO
$ ml
Currently Loaded Modules:
1) openmpi.gcc/4.0.3 4) hmmer/3.1b2 7) Prodigal/2.6.2
2) python/3.7.3 5) bamtools/2.4.1 8) BUSCO/4.0.6
3) ncbi-blast/2.7.1+ 6) augustus/3.3
BUSCO depends on a number of other modules, some of which depend on others in turn. The module system loads all the modules we need for us. The module system keeps track of how a module was loaded, so when you unload BUSCO these dependencies will also be unloaded.
Often you want to clear out all modules. Instead of unloading modules
one by one you can use the ‘purge’ command to unload all loaded
modules at once:
$ module purge
$ module list
No modules loaded
This is also useful in scripts when you want to make sure that you’re
starting from a clean slate. Begin the script with a ‘module purge’
and you won’t have any other modules interfering by accident.
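A script preamble along these lines is one way to do that. It is only a sketch: ‘fresh_modules’ is a hypothetical helper name, the module names are taken from the examples above, and it assumes that ‘module load’ reports failure through its exit status.

```shell
#!/bin/bash
# fresh_modules MODULE...: clear all loaded modules, then load exactly
# the ones named as arguments (hypothetical helper).
fresh_modules() {
    module purge || return 1
    for mod in "$@"; do
        # assumes 'module load' returns non-zero when loading fails
        module load "$mod" || { echo "could not load $mod" >&2; return 1; }
    done
}

# Only run this where the module system is present, e.g. on the cluster.
if command -v module >/dev/null 2>&1; then
    fresh_modules julia/1.3.1 ruse || exit 1
fi
```

Naming explicit versions in the script means a change of the default version later will not silently change what the script runs.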
We have organised the modules on Deigo into separate areas. We now have a common area where most software is installed, and several specialised areas, each reached through a metamodule.
| Module area | Purpose |
|---|---|
| common | The default area. Most software is installed here. |
| intel-modules | Software that runs best or only on the Intel nodes. |
| amd-modules | Software that runs best or only on the AMD nodes. |
| bioinfo-ugrp-modules | Modules maintained by the Bioinformatics User Group. |
To use, say, software from the Bioinformatics user group, you load the “bioinfo-ugrp-modules” metamodule:
$ module load bioinfo-ugrp-modules
If you then look at available modules:
$ module av
------------------------------- /apps/.bioinfo-ugrp-modulefiles81 --------------------------------
DB/Dfam/3.6 Other/canu/2.1.1
DB/Dfam/3.8 (D) Other/compleasm/0.2.2
DB/Dfam_RepeatMasker/3.6__4.1.3 Other/deepvariant/1.1.0
DB/Pfam/34.0 Other/deepvariant/1.6.0 (D)
DB/Pfam/35.0 (D) Other/dovetail_tools/20210914
DB/blastDB/ncbi/238 Other/edirect/18.2
DB/blastDB/ncbi/2021-11-28 Other/fasttree/2.1.11
DB/blastDB/ncbi/2022-07-nr Other/genescope/2021.03.26
...
-------------------------------------- /apps/.metamodules81 --------------------------------------
amd-modules intel-modules user-modules
bioinfo-ugrp-modules (L) sango-legacy-modules
-------------------------------------- /apps/.modulefiles81 --------------------------------------
AIMAll/19.10 comsol/43a matlab/MCR
BUSCO/3.0.2 comsol/43b matlab/R2009b
BUSCO/4.0.6 comsol/44 matlab/R2011b
...
The Bioinformatics user-group modules are now listed first, followed by the metamodules for the different areas, and then the common modules. You could now load, say, “canu” or “deepvariant”.
The modules in “amd-modules” will work on all systems. Modules in “intel-modules” will be faster on the Intel nodes, but will crash on AMD nodes. The Intel compiler in “intel-modules” is an exception: it works everywhere (and should perhaps have been installed in the common module area).
For more information, see the Deigo page.
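The choice between the two CPU-specific areas can be scripted at the start of a job. The following is only a sketch, assuming a Linux node where /proc/cpuinfo reports the vendor as ‘GenuineIntel’ or ‘AuthenticAMD’. The function names are hypothetical, and the check has to run on the compute node itself, not on the login node.

```shell
# metamodule_for_vendor VENDOR: print the module area matching a CPU
# vendor string; fail for anything unrecognised (hypothetical helper).
metamodule_for_vendor() {
    case "$1" in
        GenuineIntel) echo intel-modules ;;
        AuthenticAMD) echo amd-modules ;;
        *)            return 1 ;;
    esac
}

# cpu_vendor: print the vendor_id field of the first CPU (Linux only).
cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null
}

# Load the matching area, but only where the module system exists.
if command -v module >/dev/null 2>&1 &&
   area=$(metamodule_for_vendor "$(cpu_vendor)"); then
    module load "$area"
fi
```

If the vendor is not recognised, nothing is loaded and only the common area remains available.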
How do you find the module you want? You may know what you need, but not the name of the module. Or maybe you want to know more about how a module is installed.
The ‘spider’ subcommand will let you search for modules by name:
# 'spider' searches for any module matching the text:
$ ml spider trimmo
--------------------------------------------------------------------------
Trimmomatic: Trimmomatic/0.33
--------------------------------------------------------------------------
Description:
A flexible trimmer for Illumina sequence data.
...
# 'spider' by itself lists all modules with a short description:
$ module spider
--------------------------------------------------------------------------
The following is a list of the modules and extensions currently available:
--------------------------------------------------------------------------
BUSCO: BUSCO/3.0.2, BUSCO/4.0.6
Assess genome assembly and annotation completeness with benchmarking
universal single-copy orthologs.
Gaussian: Gaussian/09RE01
HTSeq: HTSeq/0.9.1
High-throughput sequencing data analysis with Python.
...
The ‘whatis’ command will show you a brief description of a
module:
$ ml whatis augustus
augustus/3.3.3 : Name: augustus
augustus/3.3.3 : Version: 3.3.3
augustus/3.3.3 : URL: http://bioinf.uni-greifswald.de/augustus/
augustus/3.3.3 : Category: bioinformatics
augustus/3.3.3 : Keywords: sequencing, analysis
augustus/3.3.3 : Description: Predict genes in eukaryotic genomic sequences.
‘module help’ will give you in-depth information on a single module:
$ ml help qiime2
---------------------- Module Specific Help for "qiime2/2019.1" ----------------------
Powerful, extensible, and decentralized microbiome analysis package with a
focus on data and analysis transparency. A complete redesign and rewrite of
QIIME 1.
QIIME 2 comes distributed as a container. We add a small script that lets you
run it as 'qiime' without having to deal with the container directly.
Some modules have only a brief description. Some have more information, including helpful tips for running them on the cluster. If you feel a module description could be improved, please let us know!
The ‘key’ subcommand (short for ‘keyword’) searches the tags and the words in the
description of every module. This is useful when you don’t know exactly what you are looking for:
$ ml key numeric
----------------------------------------------------------------------------------
The following modules match your search criteria: "numeric"
----------------------------------------------------------------------------------
OpenBLAS.gcc: OpenBLAS.gcc/0.3.9
An optimized BLAS and lapack library.
R: R/3.4.2
A popular software environment for statistical computing and graphics.
aocl.aocc: aocl.aocc/2.1
A set of numerical libraries tuned specifically for the AMD EPYC processor
family. This is the AOCC version.
...
Your unit may have software installed in your own unit-specific area. If
you want to use that software, you need to tell the module system where
to find those module files. If they have been installed according to
our instructions, you can add your software modules with the
‘module use’ command:
$ module use /apps/unit/[unit name]U/.modulefiles/
The module commands will now look in that directory as well for module
files, and you will be able to use any software you have installed
there. If you use this often, it might be a good idea to add this
command to your .bashrc file so it gets run each time you log in.
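If you do put it in .bashrc, a guarded version avoids error messages on machines where the directory or the module system is missing. This is a sketch under that assumption; ‘use_modules_if_present’ and ‘UNIT_MODULEFILES’ are hypothetical names, and the path keeps the ‘[unit name]’ placeholder that you need to replace.

```shell
# use_modules_if_present DIR: add DIR to the module search path, but
# only if DIR exists and the module command is available
# (hypothetical helper for ~/.bashrc).
use_modules_if_present() {
    [ -d "$1" ] || return 1
    command -v module >/dev/null 2>&1 || return 1
    module use "$1"
}

# Replace [unit name] with your unit's name before using this.
UNIT_MODULEFILES="/apps/unit/[unit name]U/.modulefiles/"
use_modules_if_present "$UNIT_MODULEFILES" || true
```

The trailing ‘|| true’ keeps a missing directory from leaving a failing status behind at login.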
If you want to remove the unit-specific modules again, the
‘module unuse’ command will do that for you:
$ module unuse /apps/unit/[unit name]U/.modulefiles/
Here’s a summary of our commands, with the ‘ml’ command, the equivalent
‘module’ command, and the effect:
| command | full module command | effect |
|---|---|---|
| ml | module list | lists modules you have loaded |
| ml <module> | module load <module> | loads <module> |
| ml -<module> | module unload <module> | unloads module (note the minus sign for ml) |
| ml av | module av | lists available modules (also ‘avail’ and ‘available’) |
| ml purge | module purge | removes all loaded modules |
| ml spider <text> | module spider <text> | searches for “text” in the names of modules |
| ml whatis <module> | module whatis <module> | shows brief information about the module |
| ml help <module> | module help <module> | shows more in-depth information about the module |
| ml key <text> | module key <text> | searches tags and descriptions for “text” |
| ml use <path> | module use <path> | adds the modules stored under “path” |
| ml unuse <path> | module unuse <path> | stops using the modules stored under “path” |
The ‘ml’ command can take all the same subcommands as ‘module’.
The one thing it can’t do is load a module that happens to have the
same name as a module subcommand. If you have a module named “purge”
for instance, ‘ml purge’ would purge all loaded modules, not load the
“purge” module. In such a case, use ‘module load purge’ instead.