This page explains how to deploy an MPI (and/or GNU parallel) cluster with an NFS filesystem in the KASI cloud. OpenMPI (https://www.open-mpi.org/) is used as the MPI implementation. If you would like to use the Intel oneAPI toolkit and its MPI implementation, see the tip section of this page. The Slurm (https://slurm.schedmd.com/) workload manager can also be installed in the cluster. This how-to assumes that users already know how to use a single VM, as described in the guide KASI Science Cloud : VM instances.
The basic usage scenario is: 1) the user prepares code and data in the created NFS volume, 2) the code is compiled or run with OpenMPI (or Intel MPI), with or without Slurm, 3) output files are stored in the NFS volume, and 4) if needed, an external NAS volume is accessed from the VMs to transfer data between the created MPI cluster and the NAS. The same scenario also applies when GNU parallel is used with other codes.
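The compile-and-run part of the usage scenario could be sketched roughly as follows. This is only an illustration: the NFS path /mnt/mpi, the source file hello.c, the hostfile named hosts, and the process count are all hypothetical placeholders, not values defined by the cluster templates. The helper run prints each command by default, so the sketch can be tried safely on a machine without OpenMPI; set DRY_RUN=0 on the cluster to actually execute the commands.

```shell
# Hypothetical values: replace with your own NFS Mount Path and job size.
NFS_DIR="${NFS_DIR:-/mnt/mpi}"
NP="${NP:-4}"

# Dry-run helper: print the command when DRY_RUN=1 (default), run it otherwise.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Compile the code stored in the NFS volume with the OpenMPI compiler wrapper.
run mpicc "$NFS_DIR/hello.c" -o "$NFS_DIR/hello"

# Launch NP processes on the nodes listed in a hostfile (assumed to exist).
run mpirun -np "$NP" --hostfile "$NFS_DIR/hosts" "$NFS_DIR/hello"
```

Because the binary and its output live in the NFS volume, every node sees the same files without any manual copying.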
Step 1. Choose a cluster template: KASI-OpenMPI-Cluster or KASI-OpenMPI-Cluster-Slurm
As presented in the following figure, two options are available for the cluster. If you need Slurm in your cluster, choose the KASI-OpenMPI-Cluster-Slurm template in Project → Cluster Infra → KASI Cluster Templates. If you simply need an MPI (or GNU parallel)-enabled cluster, choose KASI-OpenMPI-Cluster.
After confirming that you have selected the correct cluster template, click the Next button on the following page.
Step 2. Configure a cluster by typing configuration parameters
- Stack Name: the name of the cluster which determines the hostnames of the master and slave VM nodes in the cluster.
- Password for user: password required to control the created cluster in certain situations.
- Image: VM system image.
- Flavor: VM flavor.
- Network: choose kasi-user-network.
- Minion VMs Number: the number of slave nodes. If you plan to use Slurm, this would typically be the number of Slurm worker nodes, not counting the master node.
- NFS Mount Path: NFS directory path which will be prepared in all nodes including both master and slave nodes.
- NFS Size: the size of NFS volume.
- SSH Keys: ssh key used to access created VMs.
- Root Password: root password for root account in all nodes of the cluster. You may want to change the password after the cluster is created.
- User Script: shell commands that will be executed on all VM nodes of the cluster. Type custom commands as a single line (see https://dev.to/0xbf/run-multiple-commands-in-one-line-with-and-linux-tips-5hgm for how to chain commands with ;, &&, and ||). If you would like to use GNU parallel in the cluster, type apt install parallel -y as shown above. If you are not familiar with the apt command in Ubuntu, see https://ubuntu.com/server/docs/package-management.
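As a rough illustration of how commands can be chained on a single line for the User Script field, the sketch below demonstrates the three operators with harmless commands. The function name chain_demo and the commented one-liner at the end (including the apt update step and the error message) are assumptions for illustration only; only apt install parallel -y comes from this guide.

```shell
# ';'  runs the next command regardless of how the previous one ended
# '&&' runs the next command only if the previous one succeeded
# '||' runs the next command only if the previous one failed
chain_demo() {
  echo "first" ; echo "second"
  true  && echo "and: ran after success"
  false || echo "or: ran after failure"
}
chain_demo

# A hypothetical single-line User Script built from the same operators
# (left as a comment so that this sketch does not install anything):
# apt update && apt install parallel -y || echo "parallel install failed" >&2
```

Note that in a chain of the form a && b || c, the command c runs if either a or b fails.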
Step 3. Check the creation process
You can check the progress of creating the cluster in Cluster Infra → KASI Clusters, Compute → Instances, and Share → Shares as shown in the following figures.
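Once the cluster is up and you have logged in to a node, one possible way to confirm that the NFS directory is actually mounted is a small check against the system mount table. This is a sketch under assumptions: the helper name is_mounted is hypothetical, and /mnt/mpi stands in for whatever you entered as NFS Mount Path.

```shell
# Succeed if the given path is listed as a mount point in a mount table.
# The table defaults to /proc/mounts (Linux); a second argument overrides it.
is_mounted() {
  path="$1"
  table="${2:-/proc/mounts}"
  # Field 2 of each mount table line is the mount point.
  awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "$table"
}

# Hypothetical NFS Mount Path: replace with the value used in Step 2.
if is_mounted /mnt/mpi; then
  echo "NFS directory is mounted"
else
  echo "NFS directory is not mounted (yet)"
fi
```

Running the same check on the master node and on one of the other nodes gives a quick indication that the shared directory is available across the cluster.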