This page explains how to deploy an MPI (and/or GNU parallel) cluster with an NFS filesystem in the KASI cloud. Here, OpenMPI is used as the MPI implementation. If you would like to use the Intel oneAPI toolkit and its MPI implementation, see the tips section of this page. The Slurm workload manager can be installed in the cluster too.
This how-to assumes that users know how to use a single VM by following the guide given in KASI Science Cloud : VM instances. In particular, following KASI Science Cloud : VM instances#Step5.RemoteaccesstoVMinstancesviaSSHtunneling will help you access the created clusters by using SSH.
The basic usage scenario is: 1) the user prepares codes and data in the created NFS volume, 2) the prepared code is compiled or run with OpenMPI (or Intel MPI), with or without Slurm, 3) output files are stored in the NFS volume, and 4) if needed, an external NAS volume is accessed in the VMs to transfer data between the created MPI cluster and the NAS (see KASI Science Cloud : VM instances#Step4.ConfiguretheVMinstanceforremotedesktop&externaldatastore). The same scenario also works when using GNU parallel with other codes.
The related codes and shell scripts mentioned in this how-to are available in a GitHub repository. Cloning the GitHub repository to the NFS volume is the easiest way to use the provided materials.
If you have questions or suggestions about this tutorial page or related problems, please contact Min-Su Shin.
Step 1. Choose a cluster template: KASI-OpenMPI-Cluster or KASI-OpenMPI-Cluster-Slurm
As presented in the following figure, two options are available for the cluster. If you need Slurm in your cluster, choose the KASI-OpenMPI-Cluster-Slurm template in Project → Cluster Infra → KASI Cluster Templates. If you simply need an MPI (or GNU parallel)-enabled cluster, choose KASI-OpenMPI-Cluster.
Click the Next button on the following page after checking that you are about to run the right cluster template.
Step 2. Configure a cluster by typing configuration parameters
- Stack Name: the name of the cluster which determines the hostnames of the master and slave VM nodes in the cluster.
- Password for user: password required to control the created cluster in certain situations.
- Image: VM system image.
- Flavor: VM flavor.
- Network: choose kasi-user-network.
- Minion VMs Number: the number of slave nodes. If you plan to use Slurm, this is the number of Slurm worker nodes, which does not include the master node.
- NFS Mount Path: NFS directory path which will be prepared in all nodes including both master and slave nodes.
- NFS Size: the size of NFS volume.
- SSH Keys: ssh key used to access created VMs.
- Root Password: root password for root account in all nodes of the cluster. You may want to change the password after the cluster is created.
- User Script: shell commands that will be executed in all VM nodes of the cluster. Type custom commands as a single line, chaining them with ;, &&, or || where needed. If you would like to use GNU parallel in the cluster, type apt install parallel -y as shown above. If you are not familiar with the apt command in Ubuntu, consult its documentation.
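Because the User Script field takes a single line, it helps to know how the chaining operators behave. The following is a minimal sketch using true and false in place of real commands, so it can be tried safely on any node; the one-line value in the first comment is a hypothetical example, not a value required by the template.

```shell
#!/bin/bash
# Hypothetical one-line User Script value:
#   apt update && apt install parallel -y || echo "install failed"

# Demonstrates how ;, && and || chain commands on a single line.
chain_demo() {
    echo "A: first command" ; echo "B: ';' runs the next command regardless"
    true  && echo "C: '&&' runs the next command only after success"
    false && echo "never printed: the command before '&&' failed"
    false || echo "D: '||' runs the next command only after failure"
}

chain_demo
```

Only the lines labelled A, B, C, and D are printed; the line after the failing false && is skipped.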
Step 3. Checking the creation process
You can check the progress of creating the cluster in Cluster Infra → KASI Clusters, Compute → Instances, and Share → Shares as shown in the following figures.
Step 4. (Optional) tasks after creating the cluster
Because it takes time to build all VM nodes in the cluster, you may need to confirm that all nodes are ready with the required tools. The following shell script performs this check.
#!/bin/bash
CLUSTERNAME="mycluster"
MINIONLASTIND="14"

echo "... checking ${CLUSTERNAME}-master"
res=$(which mpirun | wc -l)
if [ ${res} -ne "1" ]
then
    echo "[WARNING] ${CLUSTERNAME}-master is not ready yet."
fi

for ind in $(seq 0 ${MINIONLASTIND})
do
    echo "... checking ${CLUSTERNAME}-minion-${ind}"
    res=$(ssh ${CLUSTERNAME}-minion-${ind} "which mpirun" | wc -l)
    if [ ${res} -ne "1" ]
    then
        echo "[WARNING] ${CLUSTERNAME}-minion-${ind} is not ready yet."
    fi
done
The above script tests whether mpirun is available in all cluster VM nodes. A similar script conducts the same test for Slurm (by checking for munged) as well as OpenMPI, as shown below.
#!/bin/bash
CLUSTERNAME="mycluster"
MINIONLASTIND="14"

echo "... checking ${CLUSTERNAME}-master"
res=$(which munged mpirun | wc -l)
if [ ${res} -ne "2" ]
then
    echo "[WARNING] ${CLUSTERNAME}-master is not ready yet."
fi

for ind in $(seq 0 ${MINIONLASTIND})
do
    echo "... checking ${CLUSTERNAME}-minion-${ind}"
    res=$(ssh ${CLUSTERNAME}-minion-${ind} "which munged mpirun" | wc -l)
    if [ ${res} -ne "2" ]
    then
        echo "[WARNING] ${CLUSTERNAME}-minion-${ind} is not ready yet."
    fi
done
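The checks above run once and only print warnings. If you would rather wait until a node becomes ready, a small retry helper can wrap the same test. The following is a sketch only: the function name wait_until, the attempt count, and the delay are assumptions introduced for illustration and are not part of the provided scripts.

```shell
#!/bin/bash
# Retry a command until it succeeds or the attempts run out.
# usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while [ "${attempt}" -le "${max_attempts}" ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "${delay}"
    done
    return 1
}

# Example (hypothetical values): try up to 30 times, 20 seconds apart.
# wait_until 30 20 ssh mycluster-minion-0 "which mpirun" \
#     || echo "[WARNING] mycluster-minion-0 is not ready yet."
```

The example call is left commented out because it needs a running cluster; the node name follows the mycluster naming used in the scripts above.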
Step 5. Erasing the cluster
Select the cluster in Cluster Infra → KASI Clusters and click Delete Stacks. If some VM nodes are not erased cleanly, delete the VMs by following the instructions given in KASI Science Cloud : VM instances.
Useful Tips
Running MPI codes
Without Slurm, you can simply run MPI codes with mpirun. The following example compiles the provided example C++ MPI codes and runs them.
mpic++ -o a.out ex_mpi_hostname.cpp
mpic++ -o a.out ex_mpi_montecarlo_pi.cpp
mpirun --allow-run-as-root -np 32 --hostfile ./ex_mpirun_hostfile.txt ./a.out
See the mpirun documentation for its options. You need to prepare a hostfile for mpirun, which is ex_mpirun_hostfile.txt in the above example. For example, the hostfile looks like the following.
mycluster-master
mycluster-minion-0
mycluster-minion-1
When your cluster is equipped with Slurm, you may need to use Slurm commands and follow Slurm's way of submitting jobs; see the Slurm documentation. In the following example, the ex_slurm_openmpi.job file is submitted via the sbatch command.
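For larger clusters, typing the hostfile by hand becomes tedious. The following sketch prints the same kind of list from the cluster name and the last minion index, reusing the naming convention of the readiness scripts; the function name make_hostfile is an assumption introduced here for illustration.

```shell
#!/bin/bash
# Print one hostname per line: the master first, then minion-0 .. minion-<last>.
# usage: make_hostfile <cluster_name> <last_minion_index>
make_hostfile() {
    local clustername="$1" minionlastind="$2"
    local ind
    echo "${clustername}-master"
    for ind in $(seq 0 "${minionlastind}"); do
        echo "${clustername}-minion-${ind}"
    done
}

# Example: print the three-node hostfile for a cluster named mycluster.
make_hostfile "mycluster" 1
```

Redirect the output to a file, for example make_hostfile "mycluster" 1 > ex_mpirun_hostfile.txt, and pass that file to mpirun with --hostfile.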
sbatch -N 3 -n 24 ex_slurm_openmpi.job
where ex_slurm_openmpi.job is the following
#!/bin/bash
mpirun --allow-run-as-root ./a.out
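The -N and -n options given to sbatch on the command line can instead be recorded in the job file as #SBATCH directives, so the resource request travels with the script. The following sketch writes such a job file; the function name write_job_file, the output file pattern, and the job file name in the example are assumptions for illustration, while the mpirun line is the one from the job file above.

```shell
#!/bin/bash
# Write a Slurm job file whose #SBATCH lines carry the node and task counts.
# usage: write_job_file <job_file> <nodes> <ntasks>
write_job_file() {
    local jobfile="$1" nodes="$2" ntasks="$3"
    cat > "${jobfile}" <<EOF
#!/bin/bash
#SBATCH --nodes=${nodes}
#SBATCH --ntasks=${ntasks}
#SBATCH --output=slurm-%j.out
mpirun --allow-run-as-root ./a.out
EOF
}

# Example (hypothetical file name), equivalent to "sbatch -N 3 -n 24 ...":
# write_job_file ./ex_slurm_directives.job 3 24
# sbatch ./ex_slurm_directives.job
```

Options passed on the sbatch command line still take precedence over the directives in the file.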
Running GNU Parallel
You can run GNU parallel to execute jobs on remote hosts, i.e., cluster slave nodes; see the GNU parallel documentation. The following example runs some simple shell commands on the nodes listed in ex_parallel_hostfile.txt.
parallel --nonall --sshloginfile ex_parallel_hostfile.txt hostname
parallel --workdir /mnt/mpi --sshloginfile ex_parallel_hostfile.txt 'hostname; touch $RANDOM-$(hostname)-{}.txt' ::: 3 4 5 6 7 8 9 10 11 12
where ex_parallel_hostfile.txt is like the following
:
mycluster-minion-0
mycluster-minion-1
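In a GNU parallel sshlogin file, the special entry : stands for the local machine (no SSH), so the file above runs jobs on the master as well as on the two minions. The following sketch generates such a file and lets you leave the local entry out; the function name make_sshloginfile and its third argument are assumptions introduced for illustration.

```shell
#!/bin/bash
# Print an sshlogin list for GNU parallel, one entry per line.
# usage: make_sshloginfile <cluster_name> <last_minion_index> [yes|no]
# The optional third argument controls whether ":" (the local machine)
# is listed first; it defaults to "yes".
make_sshloginfile() {
    local clustername="$1" minionlastind="$2" include_local="${3:-yes}"
    local ind
    if [ "${include_local}" = "yes" ]; then
        echo ":"
    fi
    for ind in $(seq 0 "${minionlastind}"); do
        echo "${clustername}-minion-${ind}"
    done
}

# Example: local machine plus two minions.
make_sshloginfile "mycluster" 1
```

Passing no as the third argument keeps the jobs off the node where parallel is started.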
Changing root password in multiple VM nodes
SSH remote execution can be used to change root passwords in all VM nodes, as in the following script.
#!/bin/bash
CLUSTERNAME="mycluster"
MINIONLASTIND="14"
PWUSER="root"
NEWPASSWORD="xxxxxxxxxx"

echo "... changing ${CLUSTERNAME}-master : ${PWUSER}"
echo -e "${NEWPASSWORD}\n${NEWPASSWORD}" | passwd ${PWUSER}

for ind in $(seq 0 ${MINIONLASTIND})
do
    echo "... changing ${CLUSTERNAME}-minion-${ind} : ${PWUSER}"
    ssh ${CLUSTERNAME}-minion-${ind} "echo -e \"${NEWPASSWORD}\n${NEWPASSWORD}\" | passwd ${PWUSER}"
done
where PWUSER is a user account and NEWPASSWORD is a new password.
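As an alternative to feeding passwd twice, the chpasswd command reads user:password lines from standard input. The sketch below is not the provided script; it swaps in chpasswd and pipes the line through SSH, so the password never appears inside a quoted remote command. The function names are assumptions introduced for illustration.

```shell
#!/bin/bash
# Format one "user:password" line, the input format read by chpasswd.
chpasswd_line() {
    printf '%s:%s\n' "$1" "$2"
}

# Change a password on one remote node by piping the line to chpasswd over SSH.
# usage: change_password_on <host> <user> <new_password>
change_password_on() {
    local host="$1" user="$2" password="$3"
    chpasswd_line "${user}" "${password}" | ssh "${host}" chpasswd
}

# Example (commented out; it needs a running cluster and root access):
# read -r -s -p "New password: " NEWPASSWORD; echo
# for ind in $(seq 0 14); do
#     change_password_on "mycluster-minion-${ind}" root "${NEWPASSWORD}"
# done
```

Reading the password with read -s also avoids storing it in the script file.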
Install Intel oneAPI and use its MPI
It is possible to install Intel oneAPI and use its MPI implementation instead of OpenMPI. The following script can be used to install the Intel oneAPI Base and HPC Toolkits in all VM nodes of the cluster.
#!/bin/bash
# See
# using-package-managers/apt.html
install_intel_oneapi='cd /tmp; wget; apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB; rm GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB; echo "deb all main" | tee /etc/apt/sources.list.d/oneAPI.list; add-apt-repository "deb all main"; apt install -y intel-basekit intel-hpckit'
CLUSTERNAME="mycluster"
MINIONLASTIND="14"

echo "... install on ${CLUSTERNAME}-master"
echo $install_intel_oneapi | bash

for ind in $(seq 0 ${MINIONLASTIND})
do
    echo "... install on ${CLUSTERNAME}-minion-${ind}"
    ssh ${CLUSTERNAME}-minion-${ind} "${install_intel_oneapi}"
done
After installing the Intel toolkits, you need to set up the shell environment for the Intel tools; see the Intel guide on setting up the environment. Here, simply source the environment script under /opt/intel/oneapi/. As shown below, you can compile and test MPI programs by using the installed Intel toolkits.
> which mpiicpc
/opt/intel/oneapi/mpi/2021.5.1/bin/mpiicpc
> mpiicpc ./ex_mpi_montecarlo_pi.cpp
> ldd ./a.out
> which mpirun
/opt/intel/oneapi/mpi/2021.5.1/bin/mpirun
See the Intel MPI documentation to figure out how to compile and run MPI programs with the Intel toolkits.
(Future update) Installing software with apt
Installing conda and preparing conda environments
The following script, provided with the other materials, installs miniconda and sets up a specific conda environment by using the network-shared volume.
#!/bin/bash
CLUSTERNAME="mycluster"
NFSDIR="/mnt/mpi"
CONDAENV="xclass"
CONDAURL=""

# additional apt packages
apt install zip

# installation of miniconda
cd ${NFSDIR}
wget "${CONDAURL}" -O ./
bash ./ -b -p ${NFSDIR}/miniconda
eval "$(${NFSDIR}/miniconda/bin/conda shell.bash hook)"
conda init
conda update -y -n base -c defaults conda

# creating the environment
conda create -y -n ${CONDAENV} python=2.7

# adding new conda packages
conda install -y -n ${CONDAENV} numpy
conda install -y -n ${CONDAENV} scipy
conda install -y -n ${CONDAENV} matplotlib
conda install -y -n ${CONDAENV} astropy
conda install -y -n ${CONDAENV} sqlite

# adding pip packages
conda activate ${CONDAENV}
pip install pyfits

echo "Do the following things to use the environment ${CONDAENV}"
echo "1) source ~/.bashrc"
echo "2) conda activate ${CONDAENV}"
You can also install conda and prepare environments in your local home directory instead of the NFS volume. The following script can be used.
#!/bin/bash
CLUSTERNAME="mycluster"
TMPDIR="/tmp"
CONDAENV="xclass"
CONDAURL=""

# additional apt packages
apt install zip

# installation of miniconda
cd ${TMPDIR}
wget "${CONDAURL}" -O ./
bash ./ -b -p ${HOME}/miniconda
eval "$(${HOME}/miniconda/bin/conda shell.bash hook)"
conda init
conda update -y -n base -c defaults conda

# creating the environment
conda create -y -n ${CONDAENV} python=2.7

# adding new conda packages
conda install -y -n ${CONDAENV} numpy
conda install -y -n ${CONDAENV} scipy
conda install -y -n ${CONDAENV} matplotlib
conda install -y -n ${CONDAENV} astropy
conda install -y -n ${CONDAENV} sqlite

# adding pip packages
conda activate ${CONDAENV}
pip install pyfits

echo "Do the following things to use the environment ${CONDAENV}"
echo "1) source ~/.bashrc"
echo "2) conda activate ${CONDAENV}"