Support for Placement Groups #32

@ghost

Description

As part of Terraform/#63 (AWS EFA support), support for AWS placement groups is required. I've been contemplating this a bit recently, as placement groups (AWS, Azure) and GCP Group Placement Policies are somewhat important for good performance with certain HPC jobs.

Placement groups are a great match for a single HPC job, or for a static set of nodes. They're not really conducive to very elastic environments, or to environments where you may mix & match instance types. While they can work there, you're more likely to hit capacity issues and instances failing to launch.

There are also some restrictions that are challenging to support:

  • On GCP, Group Placement Policies are limited to C2 node types (which aren't really supported by CitC yet), and to at most 22 VM instances. The number of instances that will be in the Group Placement Policy must be set when creating the policy.
  • On AWS, cluster placement groups don't support all VM types (e.g., burstable vCPU (T-series) and Mac instances).

Thus, placement groups need to be an optional feature, and it would be nice to treat AWS and GCP similarly, even though they have different restrictions.

I don't believe that we can create the placement groups as part of the Terraform process: at that point, limits.yaml doesn't exist, and we don't know how big the cluster could get (which matters for GCP, where the policy size is fixed at creation).

I don't believe that we can create the placement groups as part of the SLURM ResumeProgram call to startnode.py, as that call isn't directly linked to a single job. Creating a group for every startnode call would get messy, since the nodes don't all terminate at a set time, so cleanup becomes a challenge. That said, I do believe startnode ought to change so that all the nodes SLURM wishes to start at once are launched in a single API call: the cloud scheduler is more likely to find space for the whole set, placed compactly (in the placement group), if they are all started in one call.
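For illustration, here's a minimal sketch of what such a batched launch could look like on AWS with boto3. The function name, the launch-template parameter, and the all-or-nothing choice are my assumptions, not existing CitC code:

```python
import boto3

ec2 = boto3.client("ec2")

def start_nodes_batched(node_count, instance_type, launch_template_id,
                        placement_group=None):
    """Launch a whole set of nodes in a single RunInstances call.

    Hypothetical helper: MinCount == MaxCount makes the request
    all-or-nothing, so EC2 must find capacity for the full set
    (compactly, when a placement group is given).
    """
    params = {
        "LaunchTemplate": {"LaunchTemplateId": launch_template_id},
        "InstanceType": instance_type,
        "MinCount": node_count,
        "MaxCount": node_count,
    }
    if placement_group:
        params["Placement"] = {"GroupName": placement_group}
    return ec2.run_instances(**params)
```

With MinCount == MaxCount the request either places the whole set or fails fast, which seems like the behaviour you'd want before falling back to launching without the group.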

Suggested course

I'm currently thinking that update_config.py is our best spot for creating placement groups. Each call to update_config could clean up/terminate the existing placement groups that belong to our ${cluster_id}, and create new placement group(s).
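As a rough sketch of the cleanup half on the AWS side (the cluster tag key and the helper name are assumptions on my part):

```python
import boto3

ec2 = boto3.client("ec2")

def cleanup_placement_groups(cluster_id):
    """Delete placement groups previously created for this cluster.

    Assumes each group was tagged cluster=<cluster_id> at creation;
    DeletePlacementGroup fails if a group still contains instances.
    """
    resp = ec2.describe_placement_groups(
        Filters=[{"Name": "tag:cluster", "Values": [cluster_id]}]
    )
    for group in resp["PlacementGroups"]:
        ec2.delete_placement_group(GroupName=group["GroupName"])
```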

I feel like creating a placement group per shape defined in limits.yaml makes the most sense. That way we would, for example, group C5n instances together and C6gn instances together, rather than asking AWS to find a way to compactly mix ARM and x86 instances.
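The creation half might then look something like this; the limits.yaml layout, the naming scheme, and shape_supports_placement (sketched further below) are all assumptions:

```python
import boto3
import yaml

ec2 = boto3.client("ec2")

def create_placement_groups(cluster_id, limits_path="limits.yaml"):
    """Create one cluster placement group per shape in limits.yaml.

    Assumes limits.yaml maps shape names (e.g. c5n.18xlarge) to limits.
    """
    with open(limits_path) as f:
        limits = yaml.safe_load(f)
    for shape in limits:
        if not shape_supports_placement(shape):  # hypothetical check, see below
            continue
        ec2.create_placement_group(
            GroupName=f"{cluster_id}-{shape}",
            Strategy="cluster",
            TagSpecifications=[{
                "ResourceType": "placement-group",
                "Tags": [{"Key": "cluster", "Value": cluster_id}],
            }],
        )
```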

We would also want to update startnode to add the placement policy to the instance starts, in the cases where a placement group was created (i.e., we wouldn't create them for AWS t3a instances, as they're burstable, or for n1 instances on GCP).
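The eligibility check itself could be as simple as a prefix test; the list below is illustrative, not exhaustive, and only covers the AWS side:

```python
# Instance families that don't support cluster placement groups on AWS
# (illustrative, not exhaustive: burstable T-series and Mac instances).
UNSUPPORTED_PREFIXES = ("t2.", "t3.", "t3a.", "t4g.", "mac")

def shape_supports_placement(shape):
    """Return True if a cluster placement group should be used for this shape."""
    return not shape.startswith(UNSUPPORTED_PREFIXES)
```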

Is there already work in progress to support Placement Groups? If not, does my suggested course of action seem reasonable? I can work on this, and offer patches, but I wanted to make sure that the plan seems reasonable to the core team first.
