C5n

Community Images

To run this benchmark we fetch community images from the Amazon ECR Public Gallery at gallery.ecr.aws/hpc.

We’ll use Thread-MPI images with baked-in settings for how many MPI ranks and OpenMP threads should be spawned.

Two images with different tags:

  1. c5n_18xl_on: built for a c5n.18xlarge with hyperthreading on. This image will use 72 MPI ranks.
  2. c5n_18xl_off: built for a c5n.18xlarge with hyperthreading off. This image will use 36 MPI ranks.

sarus pull public.ecr.aws/hpc/spack/gromacs/2021.1/threadmpi:c5n_18xl_on
sarus pull public.ecr.aws/hpc/spack/gromacs/2021.1/threadmpi:c5n_18xl_off
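
Once both pulls have finished, you can verify that the images are available locally:

sarus images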

36 Ranks

Our first job is equivalent to the native run with 36 ranks and 2 OpenMP threads, even though Thread-MPI uses process threads to mimic MPI ranks.
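
Since the rank count is baked into the image, the batch script below does not pass any mdrun flags itself. For reference, here is a hypothetical sketch of the kind of entrypoint such an image could bake in (illustrative only; the actual command is defined inside the image):

# Hypothetical entrypoint baked into the c5n_18xl_off image:
# 36 Thread-MPI ranks via the standard -ntmpi flag of gmx mdrun.
gmx mdrun -ntmpi 36 -s ${INPUT}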

cat > gromacs-single-node-sarus-c5n-threadmpi-36x2.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=gromacs-single-node-sarus-c5n-threadmpi-36x2
#SBATCH --exclusive
#SBATCH --output=/fsx/logs/%x_%j.out
#SBATCH --partition=c5n

mkdir -p /fsx/jobs/${SLURM_JOBID}
export INPUT=/fsx/input/gromacs/benchRIB.tpr
sarus run --workdir=/fsx/jobs/${SLURM_JOBID} public.ecr.aws/hpc/spack/gromacs/2021.1/threadmpi:c5n_18xl_off
EOF

Let’s submit two of those jobs. We are using $(genSlurmDep c5n sleep-inf) to generate a dependency string for sbatch.
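
In case you did not carry the helper over from the earlier lab, here is a minimal sketch of what a genSlurmDep-style function might look like (an assumption for illustration, not necessarily the workshop’s exact definition):

# Assumed helper: emit an sbatch --dependency string covering all jobs in
# a partition whose job name matches a pattern (prints nothing if no match).
genSlurmDep() {
  local partition=$1 pattern=$2 ids
  ids=$(squeue -p "${partition}" --noheader -o "%i %j" \
        | awk -v pat="${pattern}" '$2 ~ pat {printf ":%s", $1}')
  [ -n "${ids}" ] && echo "--dependency=afterany${ids}"
}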

sbatch $(genSlurmDep c5n sleep-inf) -N1 -p c5n gromacs-single-node-sarus-c5n-threadmpi-36x2.sbatch
sbatch $(genSlurmDep c5n sleep-inf) -N1 -p c5n gromacs-single-node-sarus-c5n-threadmpi-36x2.sbatch
squeue

The sleep-inf job is still blocking the jobs.

Let us create a new sleep-inf job to use the instances once the GROMACS runs are done.
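
If you no longer have the sleep.sbatch script from the earlier lab, here is a minimal sketch that matches the sleep-inf job name used below (assumed; the workshop’s original may differ):

cat > sleep.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=sleep-inf
#SBATCH --exclusive
sleep infinity
EOF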

sbatch $(genSlurmDep c5n "gromacs.*") -N2 -p c5n sleep.sbatch
squeue

Now, let us cancel the running sleep-inf job and let the GROMACS jobs loose.

scancel $(squeue --states=RUNNING --name=sleep-inf -p c5n -o "%i" --noheader) && squeue && sleep 2 && squeue

72 Ranks

This job is equivalent to the native run with 72 ranks and 1 OpenMP thread, even though Thread-MPI uses process threads to mimic MPI ranks.

cat > gromacs-single-node-sarus-c5n-threadmpi-72x1.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=gromacs-single-node-sarus-c5n-threadmpi-72x1
#SBATCH --exclusive
#SBATCH --output=/fsx/logs/%x_%j.out
#SBATCH --partition=c5n

mkdir -p /fsx/jobs/${SLURM_JOBID}
export INPUT=/fsx/input/gromacs/benchRIB.tpr
sarus run --workdir=/fsx/jobs/${SLURM_JOBID} public.ecr.aws/hpc/spack/gromacs/2021.1/threadmpi:c5n_18xl_on
EOF

Let’s submit two of those jobs.

We are not using genSlurmDep for this submission, because the two nodes we are keeping alive are still occupied. The jobs will therefore be scheduled onto additional --exclusive nodes, which Slurm will bring up.

sbatch -N1 -p c5n gromacs-single-node-sarus-c5n-threadmpi-72x1.sbatch
sbatch -N1 -p c5n gromacs-single-node-sarus-c5n-threadmpi-72x1.sbatch
squeue

The nodes are in the CONFIGURING state, as Slurm is bringing them up. This will take a couple of minutes (depending on how much post-install work happens - most likely a lot in this workshop).
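
You can watch the nodes come up and the jobs start:

# Refresh the partition and queue view every 10 seconds (Ctrl-C to quit).
watch -n 10 'sinfo -p c5n; squeue -p c5n'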

Results

After those runs are done, we grep the performance results.

grep -B2 Performance /fsx/logs/gromacs-single-node-sarus-*
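
If you only want the ns/day figures, you can pull them out directly, assuming the standard GROMACS log footer where ns/day is the first value on the Performance: line:

# Print each log file name together with its ns/day number.
awk '/^Performance:/ {print FILENAME, $2}' /fsx/logs/gromacs-single-node-sarus-*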

This extends the table we started in the decomposition section of the gromacs-on-pcluster workshop.

 #  execution  spec                         instance  Ranks x Threads  ns/day
 1  native     gromacs@2021.1               c5n.18xl  18 x 4           4.7
 2  native     gromacs@2021.1               c5n.18xl  36 x 2           5.3
 3  native     gromacs@2021.1               c5n.18xl  72 x 1           5.5
 4  native     gromacs@2021.1 ^intel-mkl    c5n.18xl  36 x 2           5.4
 5  native     gromacs@2021.1 ^intel-mkl    c5n.18xl  72 x 1           5.5
 6  native     gromacs@2021.1 ~mpi          c5n.18xl  36 x 2           5.5
 7  native     gromacs@2021.1 ~mpi          c5n.18xl  72 x 1           5.7
 8  native     gromacs@2021.1 +cuda ~mpi    g4dn.8xl  1 x 32           6.3
 9  sarus      gromacs@2021.1 ~mpi          c5n.18xl  36 x 2           5.7
10  sarus      gromacs@2021.1 ~mpi          c5n.18xl  72 x 1           5.7

Please note that the containerized runs yield the same performance as the native runs!