# NNF Deployment
To clone this project, use the additional --recurse-submodules option to retrieve its submodules:
git clone --recurse-submodules [email protected]:NearNodeFlash/nnf-deploy
To update the submodules in this work area, run the update.sh script. Use this to pick up recent changes in any of the submodules.
Warning: If the current submodules have already been deployed to a K8s system, tear down and delete any workflows, then run nnf-deploy undeploy to remove the old CRDs and pods before updating the submodules. An update may pull in CRD changes that are incompatible with resources already on the K8s system.
The update.sh command will update each submodule directory to the head of its master branch.
tools/update.sh
Any submodule can be set to a specific revision and it will be used by the nnf-deploy command. Note the warning above prior to setting a submodule to a specific revision.
To set a submodule to a specific revision, change into that submodule's directory and switch to that revision or branch:
cd nnf-sos
git switch branch-with-my-fixes
cd ..
The update.sh command will switch that submodule back to the head of its master branch.
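Before deploying, it can help to confirm which branch a submodule is actually on. The following is a minimal sketch, not part of this repository: `show_branch` is a hypothetical helper name, and it assumes git 2.22 or newer for `git branch --show-current`.

```shell
#!/bin/sh
# Hypothetical helper (not part of nnf-deploy): report the branch that is
# checked out in a submodule directory.
show_branch() {
    dir="$1"
    # Prints an empty string when HEAD is detached (no branch checked out).
    branch="$(git -C "$dir" branch --show-current)" || return 1
    echo "$dir: ${branch:-(detached)}"
}
```

For example, `show_branch nnf-sos` run from the top-level directory reports the branch selected with `git switch`. Running `git submodule status` from the same directory lists the commit each submodule is checked out at.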
nnf-deploy is a Go executable that builds components of the Rabbit software stack locally and deploys those components to, or undeploys them from, the k8s cluster specified by the current kubeconfig.
Build using: make
Prior to running, ensure that the correct NNF systems are listed in ./config/systems.yaml and that the correct ghcr repositories are defined in ./config/repositories.yaml.
./nnf-deploy --help
Usage: nnf-deploy <command>
Flags:
-h, --help Show context-sensitive help.
--debug Enable debug mode.
--dry-run Show what would be run.
--systems="config/systems.yaml"
path to the systems config file
--repos="config/repositories.yaml"
path to the repositories config file
--daemons="config/daemons.yaml"
path to the daemons config file
Commands:
deploy [<only> ...]
Deploy to current context.
undeploy [<only> ...]
Undeploy from current context.
make <command> [<only> ...]
Run make [COMMAND] in every repository.
install [<node> ...]
Install daemons (EXPERIMENTAL).
init
Initialize cluster.
Run "nnf-deploy <command> --help" for more information on a command.
The init subcommand will install ArgoCD via helm. The user must have the helm CLI installed. This init command should be run only once on a new cluster.
./nnf-deploy init
To restore legacy init behavior, in which init installs cert manager, mpi-operator, lustre-csi-driver, and lustre-fs-operator, copy the config/overlay-legacy.yaml-template file to ./overlay-legacy.yaml. This init command only needs to be run once on a new cluster, or again when one of those components changes.
cp config/overlay-legacy.yaml-template overlay-legacy.yaml
./nnf-deploy init
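If overlay-legacy.yaml may already exist with local edits, a plain `cp` would overwrite it. The following is a minimal sketch of a more cautious copy; `enable_legacy_init` is a hypothetical helper name and is not provided by this repository. It only prepares the overlay file and leaves running init to the user.

```shell
#!/bin/sh
# Hypothetical helper (not part of nnf-deploy): copy the legacy overlay
# template into place without overwriting an existing overlay file.
enable_legacy_init() {
    template="config/overlay-legacy.yaml-template"
    target="overlay-legacy.yaml"

    # The template path is the one named in this README.
    [ -f "$template" ] || { echo "missing $template" >&2; return 1; }

    if [ -e "$target" ]; then
        echo "$target already exists; leaving it untouched"
        return 0
    fi
    cp "$template" "$target"
}
```

Call `enable_legacy_init` from the top-level nnf-deploy directory, then run `./nnf-deploy init` as shown above.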
Deploy all the submodules using the deploy command:
./nnf-deploy deploy
To deploy only specific repositories, list the desired modules after the deploy command. For example, to deploy only the dws and nnf-sos repositories, use:
./nnf-deploy deploy dws nnf-sos
WARNING! Before you undeploy, delete any user- or administrator-created resources, such as lustrefilesystems and workflows, using kubectl commands:
kubectl delete workflows.dws.cray.hpe.com --all
kubectl delete lustrefilesystems.cray.hpe.com --all
Undeploy all the submodules using the undeploy command:
./nnf-deploy undeploy
As with deploy, you may undeploy specific repositories by listing the desired modules after the undeploy command. For example, to undeploy only dws and nnf-sos, use:
./nnf-deploy undeploy dws nnf-sos
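The cleanup and undeploy steps above can be chained so that the undeploy never runs if a cleanup step fails. This is a minimal sketch under stated assumptions: `safe_undeploy` and the `RUNNER` variable are hypothetical names introduced here, not features of nnf-deploy. Setting `RUNNER=echo` prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of nnf-deploy): delete user-created
# resources, then undeploy. RUNNER is an illustrative knob; when it is
# empty the commands run directly, and RUNNER=echo only prints them.
RUNNER="${RUNNER:-}"

safe_undeploy() {
    # Same kubectl commands, in the same order, as listed in this README.
    $RUNNER kubectl delete workflows.dws.cray.hpe.com --all || return 1
    $RUNNER kubectl delete lustrefilesystems.cray.hpe.com --all || return 1
    # Any arguments (for example: dws nnf-sos) are passed through to undeploy.
    $RUNNER ./nnf-deploy undeploy "$@"
}
```

For example, `safe_undeploy` undeploys everything, while `safe_undeploy dws nnf-sos` limits the undeploy to those two repositories.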
The make subcommand provides direct access to makefile targets in each submodule of nnf-deploy by executing make <command> within each submodule. For example, the following command performs a docker-build within each submodule:
./nnf-deploy make docker-build
Kind clusters are built and deployed using locally compiled images. The following commands:
- Create a kind cluster
- Build all docker images for Rabbit modules
- Push those images into the Kind cluster
- Deploy those images onto the Kind cluster nodes
./tools/kind.sh reset
./nnf-deploy make docker-build
./nnf-deploy make kind-push
./nnf-deploy deploy
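When these four commands are run as a script, a failed build should not be followed by a push or a deploy. The following is a minimal sketch that stops at the first failing step; `step`, `kind_redeploy`, and the `RUNNER` variable are hypothetical names introduced here, not part of nnf-deploy or tools/kind.sh.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of nnf-deploy): run the kind workflow from
# this README and stop at the first step that fails.
# RUNNER is an illustrative knob; RUNNER=echo prints the commands only.
RUNNER="${RUNNER:-}"

step() {
    # Run one command (through RUNNER, if set) and report a failure.
    $RUNNER "$@" || { echo "step failed: $*" >&2; return 1; }
}

kind_redeploy() {
    step ./tools/kind.sh reset &&
    step ./nnf-deploy make docker-build &&
    step ./nnf-deploy make kind-push &&
    step ./nnf-deploy deploy
}
```

Call `kind_redeploy` from the top-level nnf-deploy directory. A non-zero return status means one of the steps failed and the later steps were skipped.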
The install subcommand will compile and install the daemons on the compute nodes, along with the proper certs and tokens. Systemd files are used to manage and start the daemons. This is necessary for data movement.
./nnf-deploy install