Emu is simulation software designed to show the architecture and operational mechanics of the supercomputers used at Oak Ridge National Laboratory. It helps beginners become familiar with job submission, queue systems, and schedulers in an HPC environment before they work on real HPC systems.
- Customizable Machine Creation and Configuration
  - Users can create virtual HPC machines by specifying parameters such as nodes, CPUs, cores, and GCDs per node.
  - Helps beginners understand HPC and supercomputer architecture concepts.
- Job Creation and Execution
  - The web interface on each machine helps users design and submit small HPC jobs and understand `slurm` scripts.
  - The `slurm` script updates dynamically as users enter each value in the GUI.
- Command Line Support
  - The CLI environment supports a limited set of commands for users who haven't logged in.
  - After logging in via SSH, users gain full CLI support, implemented with WebSockets, that replicates a real HPC environment.
- Feedback
  - Errors during job submission produce detailed feedback, helping users understand and correct their mistakes.
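The features above can be made concrete with a small sketch. The shell function below is not Emu's implementation; it is a hypothetical illustration of how values entered in a form could be turned into a Slurm script, with an explanatory error when the request does not fit the machine. The variables `MACHINE_NODES`, `CORES_PER_NODE`, and `GCDS_PER_NODE`, their values, and the function name `render_sbatch` are all assumptions made for this example. The `#SBATCH` options (`--job-name`, `--nodes`, `--time`) and `srun` are standard Slurm.

```shell
#!/usr/bin/env bash
# Hypothetical machine description; names and values are illustrative only.
MACHINE_NODES=4      # nodes in the virtual machine
CORES_PER_NODE=64    # CPU cores on each node
GCDS_PER_NODE=8      # GCDs on each node

# Print a Slurm batch script for a job name, node count, and walltime.
# On an invalid request, explain the problem on stderr and return 1.
render_sbatch() {
    local job_name=$1 nodes=$2 walltime=$3

    # Reject empty or non-numeric node counts before comparing them.
    case $nodes in
        ''|*[!0-9]*)
            echo "error: node count must be a positive integer, got '$nodes'" >&2
            return 1
            ;;
    esac

    if [ "$nodes" -lt 1 ] || [ "$nodes" -gt "$MACHINE_NODES" ]; then
        echo "error: requested $nodes node(s), but this machine has $MACHINE_NODES" >&2
        return 1
    fi

    # The unquoted heredoc delimiter lets the shell expand the variables.
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=$job_name
#SBATCH --nodes=$nodes
#SBATCH --time=$walltime
# Requested resources: $((nodes * CORES_PER_NODE)) core(s), $((nodes * GCDS_PER_NODE)) GCD(s)

srun hostname
EOF
}

# Example: a 2-node job with a 5-minute walltime.
render_sbatch hello 2 00:05:00
```

Calling the function again with different arguments prints a fresh script, which mirrors the idea of the script following the form values. A request such as `render_sbatch hello 99 00:05:00` prints an error instead of a script.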
- Download a local copy of the source code from https://github.com/olcf/ssd_emu using `git clone git@github.com:olcf/ssd_emu.git`
- Install and set up Docker if you haven't already
  - Install Docker following the setup instructions in the Docker Docs
  - Add your user to the `docker` group with `sudo usermod -aG docker $USER`, then run `newgrp docker`
  - Restart Docker to apply the changes with `systemctl restart docker`
- Build the Docker components inside our application
  - `cd ssd_emu/slurm-docker-cluster`
  - Use `docker compose build` or follow the tutorials in the readme there.
  - Run `docker compose up -d` to deploy the containers.
- Install Ruby with rbenv (Fedora)
  - Install the rbenv dependency from Fedora rbenv
  - When rbenv is installed from the official Fedora repository, it won't have `3.2.2`, which is why we need to reinstall `rbenv` manually.
  - Clone ruby-build with `git clone https://github.com/rbenv/ruby-build.git`
  - Run the following commands to install `ruby-build` and Ruby 3.2.2:
    - `cd ruby-build`
    - `chmod u+x ./install.sh`
    - `./install.sh`
    - `rbenv install 3.2.2`
  - Check the version by running `rbenv --version`
- Set up PostgreSQL
  - Install PostgreSQL following the instructions in the fedora-postgresql docs.
  - Install the `postgresql-devel` package with `sudo dnf install postgresql-devel` (`bundle install` won't work without this package)
  - Make sure you grant all privileges to the current user.
- Install Ruby dependencies using Bundler
  - `cd ssd_emu/rails-server`
  - `rbenv local 3.2.2`
  - `bundle install`
- Set up databases
  - Create the database using `bin/rails db:create`
  - Load the database seed using `bin/rails db:seed`
  - Migrate the database using `bin/rails db:migrate`
- Launch the server with `bin/rails server`
- Install Node.js and the node modules required for this project.
  - Install Node.js using `sudo dnf install nodejs` on Fedora (our project uses Node.js version `v22.11.0`)
  - Confirm the Node.js installation with `node --version`, which outputs something like `v22.11.0`.
- Install node modules
  - `cd ssd_emu/vue-client`
  - Install the node modules by running `npm i`.
- Run the project with `npm run dev`
- Check localhost:5173 to make sure it is running.
- Create a new machine by clicking `Create new machine`, available in the left navigation pane under `List all machines`
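Since the steps above depend on several command-line tools, a quick check of what is already on `PATH` can save time. The helper below is a minimal sketch and not part of the project. The function name `check_tools` is hypothetical, and the tool list is an assumption drawn from the commands used above, with `psql` standing in for the PostgreSQL installation.

```shell
#!/usr/bin/env bash
# Report which of the given commands are available on PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools referenced in the installation steps (the list is an assumption).
check_tools git docker rbenv psql node npm \
    || echo "Install the missing tools before continuing."
```

The check only confirms that each command exists. It does not verify versions, so `node --version` and `rbenv --version` are still worth running as described above.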
@ssd_team
- Integrate docker component with slurm built inside.
- Use websocket to simulate ssh environment.
- Preserve directory on each command in CLI on frontend and backend
- Update missions page with documentation and challenge.
- Implement job submission via `slurm cluster sbatch` instead of regular `bash` scripts
- Add an architecture image on each machine page using mermaidJS