Refine Config

Refine Base Config

We will reuse the guided configuration we just created for different clusters. Whichever cluster we create, we can refine the config up front so that it already contains some common settings.

cd ${HOME}
cp .parallelcluster/config pcluster-base.ini

Adjust Basic Configuration

Changes to the configuration file are going to be presented in two ways. First, as the snippet that needs to change:

Key: value

And afterwards as a command that applies the change using wildq and sponge.

cat file.ini \
  |wildq -i ini -M '.Key = "value"' \
  |sponge file.ini

This does the following.

  1. cat file.ini outputs the file content to stdout
  2. wildq -i ini -M '.Key = "value"' takes the content (which it expects to be in the ini format) and sets the value of Key to value. If the key does not exist, it is created.
  3. sponge file.ini reads all output of the previous command (the altered content) and only then overwrites file.ini, just before it exits, so that the original file is replaced. This detour is needed because cat file.ini > file.ini results in an empty file no matter the initial content: the shell truncates the target of the redirect before cat gets to read it. To see this, try each step in echo test > test.txt; cat test.txt; cat test.txt > test.txt; cat test.txt.
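
The truncation problem from step 3, and one way around it when sponge is not installed, can be sketched as follows. This is a minimal illustration and not part of the workshop itself; the throwaway directory, the variable names and the tr command (standing in for any filter such as wildq) are assumptions made for the example.

```shell
# Work in a throwaway directory so no real file is touched.
workdir="$(mktemp -d)"
cd "${workdir}"

echo test > test.txt

# The shell truncates test.txt for the redirect before cat reads it,
# so the file ends up empty. Some cat implementations also report an
# error here; the file is truncated either way.
cat test.txt > test.txt || true
size_after_redirect="$(wc -c < test.txt)"
echo "bytes after self-redirect: ${size_after_redirect}"

# Alternative without sponge: write to a temporary file first,
# then rename it over the original once the filter has succeeded.
echo test > test.txt
tr 'a-z' 'A-Z' < test.txt > test.txt.tmp && mv test.txt.tmp test.txt
cat test.txt
```

The temporary-file variant needs a second file name but works with standard tools only. sponge keeps the pipeline to a single line, which is why the commands in this section use it.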

Postinstall

Let us add the post-install configuration, which sets the post.install.sh script for ParallelCluster to execute.

The required adjustment looks like this.

[cluster default]
s3_read_resource = arn:aws:s3:::*
post_install = s3://pcluster-2021-09-17-7d41cb8c/post.install.sh
post_install_args = pcluster-2021-09-17-7d41cb8c

This can be accomplished by running this wildq command.

cat pcluster-base.ini \
      |wildq -i ini -M '."cluster default".s3_read_resource = "arn:aws:s3:::*"' \
      |wildq -i ini -M '."cluster default".post_install = "s3://BUCKET_NAME/post.install.sh"' \
      |wildq -i ini -M '."cluster default".post_install_args = "BUCKET_NAME"' \
      |sed -e "s/BUCKET_NAME/${BUCKET_NAME}/" \
      |sponge pcluster-base.ini
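
The placeholder substitution at the end of this pipeline can be tried in isolation. The sketch below feeds the two affected lines through the same sed expression. The bucket name example-bucket and the variable rendered are made up for the example, and the guard line is an optional addition that the pipeline above does not contain.

```shell
# Hypothetical bucket name, set only so that this sketch runs on its own.
BUCKET_NAME="example-bucket"

# Abort with a message if the variable is unset or empty; otherwise
# sed would silently write an empty bucket name into the config.
: "${BUCKET_NAME:?BUCKET_NAME must be set}"

# Same sed expression as in the pipeline, applied to the two lines
# that carry the placeholder.
rendered="$(printf '%s\n' \
  'post_install = s3://BUCKET_NAME/post.install.sh' \
  'post_install_args = BUCKET_NAME' \
  | sed -e "s/BUCKET_NAME/${BUCKET_NAME}/")"
echo "${rendered}"
```

sed replaces only the first match on each line, which is sufficient here because every line contains the placeholder at most once.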

SSH Key

The SSH key is defined within the cluster config like this.

[cluster default]
key_name = pc-key-XYZ

The following command overwrites (or creates) the desired key.

cat pcluster-base.ini \
      |wildq -i ini -M '."cluster default".key_name = "SSH_KEY"' \
      |sed -e "s/SSH_KEY/${SSH_KEY}/" \
      |sponge pcluster-base.ini

Remove Compute

The default configuration created by pcluster configure contains a compute queue with compute resources. We’ll add queues and compute resources with specific settings later, so please remove these sections from the ini file.

[queue compute]
enable_efa = false
enable_efa_gdr = false
compute_resource_settings = default

[compute_resource default]
instance_type = t2.micro

Again, wildq makes this a quick command.

cat pcluster-base.ini \
      |wildq -i ini -M 'del(."queue compute")' \
      |wildq -i ini -M 'del(."compute_resource default")' \
      |sponge pcluster-base.ini
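
Whether the two sections are really gone can be checked with grep. The sketch below runs against a small stand-in file rather than the real config so that it works on its own. The file content and the variable names config and leftover are assumptions for illustration; point config at pcluster-base.ini to check the actual file.

```shell
# Stand-in for pcluster-base.ini after the deletion (hypothetical content).
config="$(mktemp)"
cat > "${config}" <<'EOF'
[cluster default]
key_name = pc-key-XYZ
EOF

# List all section headers that are still in the file.
grep '^\[' "${config}"

# Report whether either of the removed sections is still present.
if grep -Eq '^\[(queue compute|compute_resource default)\]' "${config}"; then
  leftover="yes"
else
  leftover="no"
fi
echo "compute sections still present: ${leftover}"
```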

Software Share

ParallelCluster exports a shared file system from the head node. By default, /shared is usually 20 GB in size, which is rather small for all the HPC apps.

We can override the default by specifying our own.

[cluster default]
ebs_settings = shared

[ebs shared]
volume_size = 250
shared_dir = /shared
volume_type = gp2

The following command adds ebs_settings to the cluster section and creates an [ebs shared] section as well.

cat pcluster-base.ini \
    |wildq -i ini -M '."cluster default".ebs_settings = "shared"' \
    |wildq -i ini -M '."ebs shared".volume_size = "250"' \
    |wildq -i ini -M '."ebs shared".shared_dir = "/shared"' \
    |wildq -i ini -M '."ebs shared".volume_type = "gp2"' \
    |sponge pcluster-base.ini

FSx

In case you need a fast, shared filesystem you can enable FSx for Lustre using this snippet:

[cluster default]
fsx_settings = myfsx

[fsx myfsx]
shared_dir = /fsx
storage_capacity = 1200

wildq does that for you:

cat pcluster-base.ini \
    |wildq -i ini -M '."cluster default".fsx_settings = "myfsx"' \
    |wildq -i ini -M '."fsx myfsx".shared_dir = "/fsx"' \
    |wildq -i ini -M '."fsx myfsx".storage_capacity = "1200"' \
    |sponge pcluster-base.ini

Operating System

Depending on your use case, you might want to change the OS. The default is Amazon Linux 2 (alinux2).

[cluster default]
base_os = alinux2

This command overwrites whatever the current value is:

cat pcluster-base.ini \
    |wildq -i ini -M '."cluster default".base_os = "alinux2"' \
    |sponge pcluster-base.ini

Instance Root Volume Size

ParallelCluster defaults to 35 GB root volumes on the head node and compute instances. As we might use container images, which are downloaded to local storage, we’ll increase the root volume of both the compute nodes and the head node to 100 GB.

[cluster default]
compute_root_volume_size = 100
master_root_volume_size = 100

wildq to the rescue:

cat pcluster-base.ini \
    |wildq -i ini -M '."cluster default".compute_root_volume_size = "100"' \
    |wildq -i ini -M '."cluster default".master_root_volume_size = "100"' \
    |sponge pcluster-base.ini