Add res-demo template that allows CIDR and LDIF configuration (#256)
The LDIF file creates admins, users, and groups in the regular USERS OU.
It also creates a RES OU with admins, users, and groups in it.
The RES OU users and groups are prefixed with "res".

Resolves #251

Add install_vscode role and playbook

Update custom-amis.md
cartalla authored Sep 12, 2024
1 parent 1263a10 commit 55517db
Showing 18 changed files with 6,888 additions and 20 deletions.
8 changes: 6 additions & 2 deletions docs/custom-amis.md
By default, ParallelCluster will use pre-built AMIs for the OS that you select.
The exception is Rocky 8 and 9, for which ParallelCluster does not provide pre-built AMIs.
To use Rocky Linux, you must first build a custom AMI and specify it in your config file at **slurm/ParallelClusterConfig/Os/CustomAmi**.

The easiest way to create a new AMI is to start an EC2 instance from an existing ParallelCluster AMI, update it with your changes, and create a new AMI from that instance.
You can find the official ParallelCluster AMIs using the ParallelCluster UI: click **Images** to see the list of **Official Images**.
After you create the new AMI, add it to your configuration file.

ParallelCluster can also automate this process for you using EC2 ImageBuilder.
When you build your cluster, example ParallelCluster build configuration files
charges when run on AWS EC2 instances to develop FPGA images that can be run on
First subscribe to the FPGA developer AMI in the [AWS Marketplace](https://us-east-1.console.aws.amazon.com/marketplace/home?region=us-east-1#/landing).
There are two versions: one for [CentOS 7](https://aws.amazon.com/marketplace/pp/prodview-gimv3gqbpe57k?ref=cns_1clkPro) and one for [Amazon Linux 2](https://aws.amazon.com/marketplace/pp/prodview-iehshpgi7hcjg?ref=cns_1clkPro).

**Note**: The FPGA Developer AMI hasn't been ported to the latest OS versions, so it will not show up in the build file templates.

## Deploy or update the Cluster

After the AMI is built, add it to the config and create or update your cluster to use the AMI.
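
For example, the added entry might look like the following (a sketch only; the AMI id is a placeholder, and the exact nesting should be checked against **slurm/ParallelClusterConfig/Os/CustomAmi** in the configuration reference):

```
slurm:
  ParallelClusterConfig:
    Os: 'rocky8'
    CustomAmi: ami-0123456789abcdef0
```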
100 changes: 82 additions & 18 deletions docs/res_integration.md
# RES Integration

First you will need to deploy RES.
The easiest way is to [deploy the demo environment](https://docs.aws.amazon.com/res/latest/ug/create-demo-env.html) which provides all of the prerequisites and completely automates the deployment.
If you want to use an existing VPC or Active Directory, then you will need to follow the instructions to [deploy the product](https://docs.aws.amazon.com/res/latest/ug/deploy-the-product.html).

## RES Setup

After you've deployed RES, you need to configure it so that the remote desktops can be used as external login nodes and so that they have access to any file systems that you created.

### Onboard your file systems

RES natively supports EFS, FSx for NetApp ONTAP, and FSx for Lustre file systems.
It can create them for you or you can onboard existing file systems.

* Expand **Environment Management**
* Click **File Systems**
* Click **Onboard File System** or **Create File System**

### Create RES Project

* Expand **Environment Management**
* Click **Projects**
* Click **Create Project**
* Fill in the required fields.
* Add any file systems that you created so they will be automatically mounted on the desktops that belong to the project.
* Expand **Advanced Options** under **Resource Configurations** and add the **SlurmLoginNodeSG** so that it is attached to remote desktops automatically, giving them access to the external file systems and Slurm clusters.
* Add the groups and users that can use the project.

### Give the project access to software stacks

Next, you'll need to give the project access to a Software Stack.
You can either create a new Software Stack or update an existing one.

* Select **Software Stacks** under **Session Management**.
* Select an existing stack, such as the RHEL 8 stack.
* Select **Actions**, **Edit Stack**.
* Select your project under Projects and enable it to use the stack.

### Create virtual desktop

Now you can create a virtual desktop using the project that you just created.

* Select **My Virtual Desktops** under **Desktops**.
* Click **Launch New Virtual Desktop**
* Give it a descriptive name, select the project, operating system, and software stack.
* I suggest using a t3 instance for virtual desktops, such as a t3.large. If you need more cores or memory, use your ParallelCluster compute nodes.
* I usually increase the storage size to 20 GB so I can install additional packages.
* Click **Submit** and then wait for the desktop to be provisioned. You may need to refresh the page to update the desktop status.

You can switch to the EC2 console to verify that the instance has been launched and that it has the required security group attached.

## ParallelCluster Configuration

Integration with [Research and Engineering Studio (RES)](https://docs.aws.amazon.com/res/latest/ug/overview.html) is straightforward.
You simply specify the **--RESStackName** option for the `install.sh` script or add the **RESStackName** configuration parameter
to your configuration file.
The install script will set the following configuration parameters based on your RES environment, or check them for consistency with your RES environment if you have already set them.
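
An invocation might look like the following (the option name comes from the text above; the config file name is a placeholder and other options are elided):

```
./install.sh --config-file res-eda.yml --RESStackName res-eda
```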
The intention is to completely automate the deployment of ParallelCluster and set up the RES environment so that users can submit jobs from their virtual desktops.

| Parameter | Description | Example
|-----------|-------------|------
| VpcId | VPC id for the RES cluster | vpc-xxxxxx
| SubnetId | Subnet in the RES VPC. | subnet-xxxxx
| slurm/ExternalLoginNodes | Information about the instances to be configured as external login nodes |
| slurm/DomainJoinedInstance | Tags of the cluster manager, which will be used to create users_groups.json |
| slurm/storage/ExtraMounts | The mount parameters for the /home directory. This is required for access to the home directory. |
| slurm/SlurmCtl/AdditionalSecurityGroups | Security group that allows access to the EFS /home |
| slurm/InstanceConfig/AdditionalSecurityGroups | Security group that allows access to the EFS /home |
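
For reference, an `slurm/storage/ExtraMounts` entry for /home might look roughly like this (the key names and values here are assumptions for illustration; the install script normally derives this from the RES environment, so you do not write it by hand):

```
slurm:
  storage:
    ExtraMounts:
      - dest: /home
        src: fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/
        type: nfs4
        options: nfsv4.1
```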

You must also create security groups as described in [Security Groups for Login Nodes](deployment-prerequisites.md#security-groups-for-login-nodes).
You must either specify **AdditionalSecurityGroupsStackName** or specify the SlurmHeadNodeSG in the `slurm/SlurmCtl/AdditionalSecurityGroups` parameter and the SlurmComputeNodeSG in the `slurm/InstanceConfig/AdditionalSecurityGroups` parameter.

When you specify **RESStackName**, a Lambda function runs SSM commands to create a cron job on a RES domain-joined instance that updates the users_groups.json file every hour. Another Lambda function automatically configures all running VDI hosts to use the cluster.

The following example shows the configuration parameters for a RES cluster with a stack named res-eda.

```
---
StackName: res-eda-pc-3-9-1-rhel8-x86-config
Region: <region>
SshKeyPair: <key-name>
AdditionalSecurityGroupsStackName: res-eda-SlurmSecurityGroups
RESStackName: res-eda
ErrorSnsTopicArn: <topic-arn>
TimeZone: 'US/Central'
slurm:
  ClusterName: res-eda-pc-3-9-1-rhel8-x86
  ParallelClusterConfig:
    Version: '3.10.1'
    Image:
      Os: 'rhel8'
      Architecture: 'x86_64'
    Slurmdbd:
      SlurmdbdStackName: pcluster-slurm-dbd-res-eda-3-10-1
  SlurmCtl: {}
  # Configure typical EDA instance types
  # A partition will be created for each combination of Base OS, Architecture, and Spot
  InstanceConfig:
    UseSpot: true
    NodeCounts:
      DefaultMaxCount: 10
```

## Connect to the virtual desktop

When the cluster deployment finishes, you are ready to run jobs from your RES DCV desktop.

## Create custom AMI for virtual desktops

Connect to your virtual desktop and install packages and software, configure ParallelCluster clusters, mount file systems, and make whatever other changes you need for your project.
You will normally require root access to do this.
When you are done, remove the following files, or else new virtual desktops created from the image will fail to provision.

```
rm /root/bootstrap/semaphore/*.lock
```
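
The glob only matches the `.lock` files, so any other bootstrap state is left alone. A quick local illustration in a scratch directory (the path is made up for the demo; on a real desktop the directory is `/root/bootstrap/semaphore`):

```shell
# Simulate the semaphore directory in /tmp to show what the glob removes
mkdir -p /tmp/bootstrap-demo/semaphore
touch /tmp/bootstrap-demo/semaphore/step1.lock /tmp/bootstrap-demo/semaphore/done.marker
rm /tmp/bootstrap-demo/semaphore/*.lock
ls /tmp/bootstrap-demo/semaphore    # only done.marker remains
```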
3 changes: 3 additions & 0 deletions res/.gitignore

.venv
rendered_templates/
196 changes: 196 additions & 0 deletions res/create-ldif.py
#!/usr/bin/env python3
"""
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: MIT-0
Permission is hereby granted, free of charge, to any person obtaining a copy of this
software and associated documentation files (the "Software"), to deal in the Software
without restriction, including without limitation the rights to use, copy, modify,
merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""

"""
Create an LDIF file that creates 10 admins, 2 groups, and 100 users per group.
"""

import logging
import os
import os.path
from os.path import dirname, realpath
from textwrap import dedent

config = {
    'OU': 'CORP',
    'DC': 'corp,dc=res,dc=com',
    'DirectoryDomain': 'corp.res.com',
    'groups': [
        {
            'prefix': 'group',
            'OU': None,
            'first_gid': 2000,
            'index0': 1,
            'number': 2,
            'description': 'Non-RES group'
        },
        {
            'name': 'RESAdministrators',
            'OU': 'RES',
            'gid': 2020,
            'description': 'Represents the group of sudoers'
        }
    ],
    'users': [
        {
            'prefix': 'admin',
            'OU': None,
            'number': 10,
            'first_uid': 3000,
            'first_gid': 4000,
            'groups': [
                'RESAdministrators',
                'group01',
                'group02'
            ]
        },
        {
            'prefix': 'user',
            'OU': None,
            'number': 100,
            'first_uid': 5000,
            'first_gid': 6000,
            'groups': [
                'group01'
            ]
        },
        {
            'prefix': 'user',
            'OU': None,
            'number': 100,
            'first_index': 101,
            'first_uid': 5100,
            'first_gid': 6100,
            'groups': [
                'group02'
            ]
        }
    ]
}

logger = logging.getLogger(__file__)
logger_formatter = logging.Formatter('%(levelname)s: %(message)s')
logger_streamHandler = logging.StreamHandler()
logger_streamHandler.setFormatter(logger_formatter)
logger.addHandler(logger_streamHandler)
logger.propagate = False
logger.setLevel(logging.INFO)

# Use script location as current working directory
script_directory = dirname(realpath(__file__))
logger.info(f"Working directory: {script_directory}")
os.chdir(script_directory)

fh = open('res-demo-with-cidr/res.ldif', 'w')

fh.write(dedent("""
# Create an OU to be used by RES
dn: OU=RES,OU=${OU},DC=${DC}
changetype: add
objectClass: top
objectClass: organizationalUnit
ou: RES
description: The RES application will limit syncing groups and group-members in the RES OU

# Create an OU to be used by RES to create computers
dn: OU=Computers,OU=RES,OU=${OU},DC=${DC}
changetype: add
objectClass: top
objectClass: organizationalUnit
ou: Computers
description: The RES application will limit creating computers to this OU

# Create an OU to be used by RES to create groups and add users to
dn: OU=Users,OU=RES,OU=${OU},DC=${DC}
changetype: add
objectClass: top
objectClass: organizationalUnit
ou: Users
description: The RES application will limit syncing groups and group-members in the RES OU
"""))

group_members = {}
for user_dict in config['users']:
    user_prefix = user_dict['prefix']
    first_index = user_dict.get('first_index', 1)
    for user_index in range(first_index, first_index + user_dict['number']):
        userid = user_prefix + f"{user_index:04}"
        ou = user_dict['OU']
        # LDIF comment lines must start with '#'
        comment = f"# Create a user: {userid}"
        if ou:
            comment += f" in {ou} OU"
        dn = f"CN={userid},OU=Users,"
        if ou:
            dn += f"OU={ou},"
        dn += "OU=${OU},DC=${DC}"
        fh.write(dedent(f"""
            {comment}
            dn: {dn}
            changetype: add
            objectClass: top
            objectClass: person
            objectClass: organizationalPerson
            objectClass: user
            cn: {userid}
            sAMAccountName: {userid}
            name: {userid}
            userPrincipalName: {userid}@${{DirectoryDomain}}
            mail: {userid}@${{DirectoryDomain}}
            uidNumber: {user_dict['first_uid'] + user_index}
            gidNumber: {user_dict['first_gid'] + user_index}
            unixHomeDirectory: /home/{userid}
            loginShell: /bin/bash
            """))
        for group in user_dict['groups']:
            if group not in group_members:
                group_members[group] = []
            group_members[group].append(dn)

for group_dict in config['groups']:
    ou = group_dict['OU']
    for group_index in range(group_dict.get('index0', 1), group_dict.get('number', 1) + 1):
        if 'index0' in group_dict:
            group_name = f"{group_dict['prefix']}{group_index:02}"
            gid = group_dict['first_gid'] + group_index
        else:
            group_name = group_dict['name']
            gid = group_dict['gid']
        comment = f"# Create a group: {group_name}"
        if ou:
            comment += f" in {ou} OU"
        dn = f"CN={group_name},OU=Users,"
        if ou:
            dn += f"OU={ou},"
        dn += "OU=${OU},DC=${DC}"
        fh.write(dedent(f"""
            {comment}
            dn: {dn}
            changetype: add
            objectClass: top
            objectClass: group
            cn: {group_name}
            description: {group_dict['description']}
            distinguishedName: {dn}
            name: {group_name}
            sAMAccountName: {group_name}
            objectCategory: CN=Group,CN=Schema,CN=Configuration,DC=${{DC}}
            gidNumber: {gid}
            """))
        for member_dn in group_members.get(group_name, []):
            fh.write(f"member: {member_dn}\n")
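
The generated res.ldif intentionally leaves `${OU}`, `${DC}`, and `${DirectoryDomain}` as placeholders rather than expanding them in Python; they are presumably filled in when the templates are rendered at deployment time. A minimal sketch of such a substitution (the use of `string.Template` here is an assumption for illustration, not the project's actual rendering step):

```python
from string import Template

# One line of the generated LDIF, still containing ${...} placeholders
line = Template("dn: OU=RES,OU=${OU},DC=${DC}\n")

# Values taken from the config dict at the top of create-ldif.py
rendered = line.safe_substitute(OU='CORP', DC='corp,dc=res,dc=com')
print(rendered)  # dn: OU=RES,OU=CORP,DC=corp,dc=res,dc=com
```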
14 changes: 14 additions & 0 deletions res/download-res-templates.sh
#!/bin/bash -xe
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0
#
# Download the original templates from s3
# This is so that they can be used to create modified versions

script_dir=$(dirname "$(realpath "$0")")
cd "$script_dir"
aws s3 cp s3://aws-hpc-recipes/main/recipes/res/res_demo_env/assets/res-demo-stack.yaml res-demo-original/.
aws s3 cp s3://aws-hpc-recipes/main/recipes/res/res_demo_env/assets/bi.yaml res-demo-original/.
aws s3 cp s3://aws-hpc-recipes/main/recipes/net/hpc_large_scale/assets/main.yaml res-demo-original/networking.yaml

aws s3 cp s3://aws-hpc-recipes/main/recipes/res/res_demo_env/assets/res.ldif res-demo-original/.
2 changes: 2 additions & 0 deletions res/requirements.txt
boto3
jinja2