This is a small collection of Ansible playbooks for my Raspberry Pi cluster.
These really aren't set up to be general at the moment (there are still hard-coded IPs and names in the config files, plus other rough edges), but if you want to try to use this to build your own cluster, feel free to give it a shot. :) It should at least be pretty straightforward to read and understand the playbooks.
Ansible is a really cool tool for automating system configuration and deployment tasks. (However, this particular set of playbooks is a bit of a mess, and I'm meaning to split it up to conform to the new roles system introduced in Ansible 1.2... watch this space.) This set of playbooks will set up an HPC cluster with some of the following libraries and services:
- Cluster Scheduler: SLURM
- Message-passing library: OpenMPI
- Benchmark: HPCC
- Shared filesystem: NFS
- Time server: NTP
- DNS: dnsmasq
- Performance monitoring: Ganglia
Important: the notes below describe roughly what I did, but there was a lot of trial-and-error and I didn't note all the blind alleys. I haven't reproduced with this set of instructions yet, so you might have to troubleshoot a bit.
- Get 2 or more Raspberry Pi Model B's. One will be the head node, the others will be compute nodes.
- Download the most recent Raspbian Linux and copy the raw image onto an SD card. (I used the "dd" command on a Linux laptop to do this, but this page has additional information on how to do that.)
- I booted up the SD card on one of the Raspberry Pi's. I changed the password, made sure SSH was turned on and X Windows was turned off, then shut it back down.
- Copy the contents of the Raspbian SD card to N identical SD cards for the other Raspberry Pi's.
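The imaging steps above can be sketched as a small shell helper around "dd". This is only a sketch: the image file name and the /dev/sdX device are placeholders (not taken from my notes), and the flags assume GNU dd on Linux. Check the real device name with lsblk first, because dd will overwrite whatever you point it at.

```shell
#!/bin/sh
# write_image IMAGE TARGET: copy a raw image onto TARGET with dd.
# TARGET is normally the SD card's block device. It is a placeholder here,
# so confirm the real device name (for example with lsblk) before running.
write_image() {
    image="$1"
    target="$2"
    if [ ! -r "$image" ]; then
        echo "cannot read image: $image" >&2
        return 1
    fi
    # bs=4M speeds up the copy; conv=fsync flushes the data before dd exits.
    dd if="$image" of="$target" bs=4M conv=fsync
}

# Example (run as root; both names are placeholders):
#   write_image raspbian.img /dev/sdX
# To clone a configured card, read it into a file first, then write it back out:
#   dd if=/dev/sdX of=pi-base.img bs=4M
#   write_image pi-base.img /dev/sdX
```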
I set up all my Raspberry Pi's with static addresses in the 192.168.42.0/24 subnet, and gave the head node a second interface that gets its address via DHCP.
Mostly I did this so that I could run the cluster "stand-alone", i.e. without an external DHCP server, but so that I could also plug into a larger network and have the head node reachable from the rest of the network.
It also mirrors how HPC clusters usually keep all their compute nodes on a "private" network, and I was trying to make it look like a "real" cluster. :)
I probably could have figured out how to do this with Ansible, but I did this by hand at the time.
- Get an ethernet switch and connect all the Raspberry Pi's. Also temporarily connect this switch to your home router, or other DHCP server.
- Turn on all the Raspberry Pi's and let them boot.
- Use nmap or look at your router config to figure out what their temporary IPs are.
- SSH into the head node and set up two interfaces on eth0. I configured eth0 to have the static address 192.168.42.1 and eth0:1 to use DHCP, so the head node could NAT to the outside world when connected to my home router. I also changed the head node hostname to "pihead" at this point.
- SSH into each "compute" Raspberry Pi and configure a static IP on eth0. Change the hostname to something like "pi01", "pi02", etc. Then reboot.
- Create an /etc/hosts file on the head node to list the compute node names and IPs.
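The /etc/hosts entries from the last step could be generated with something like the sketch below. Only the subnet, the head node's 192.168.42.1, and the "pihead"/"pi01" naming come from the notes above; the compute node addresses (192.168.42.11 and up) are an assumption, so adjust them to whatever static IPs you actually assigned.

```shell
#!/bin/sh
# gen_hosts COUNT: print /etc/hosts entries for the head node and COUNT compute nodes.
# Assumption: compute node piNN lives at 192.168.42.(10 + NN).
gen_hosts() {
    count="$1"
    printf '192.168.42.1\tpihead\n'
    i=1
    while [ "$i" -le "$count" ]; do
        printf '192.168.42.%d\tpi%02d\n' "$((10 + i))" "$i"
        i=$((i + 1))
    done
}

# Print entries for a three-node cluster; review them before adding to /etc/hosts.
gen_hosts 3
```

On the head node, the output could be appended with `gen_hosts 3 | sudo tee -a /etc/hosts` once it looks right.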
At this point it's also a good idea to set up passwordless SSH with keys from your laptop into the head node, and from the head node to all the compute nodes. I used the "pi" user for this and gave "pi" sudo rights on all nodes.
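A rough sketch of the key setup, assuming the standard OpenSSH tools (ssh-keygen and ssh-copy-id) are installed. The copy_key helper and the SSH_COPY_ID override are made up for this example, and the host names are the ones used above.

```shell
#!/bin/sh
# copy_key USER HOST...: install the local public key on each host.
# SSH_COPY_ID can be overridden (for example with "echo") to preview the targets.
copy_key() {
    user="$1"
    shift
    for host in "$@"; do
        "${SSH_COPY_ID:-ssh-copy-id}" "$user@$host" || return 1
    done
}

# Typical use, first on the laptop and then again on the head node:
#   ssh-keygen -t rsa -b 4096   # leave the passphrase empty, or use ssh-agent
#   copy_key pi pihead          # laptop -> head node
#   copy_key pi pi01 pi02       # head node -> compute nodes
```

Afterwards, `ssh pi@pihead` from the laptop (and `ssh pi01` from the head node) should log in without asking for a password.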
If you haven't installed Ansible yet, you should do so! Run "pip install ansible" or "easy_install ansible" to do this.
Edit the "hosts.pi" file and enter the IP address of the head node, and the names or IPs of the compute nodes.
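For orientation, an INI-style Ansible inventory looks roughly like what this snippet prints. The group names "headnode" and "computes" are guesses made up for this example; use whatever groups the "hosts:" lines in the playbooks actually name, and keep the layout of the hosts.pi file that ships with the repo.

```shell
#!/bin/sh
# Print an example inventory to stdout (nothing is written to hosts.pi).
# The group names below are hypothetical; match them to the playbooks.
example_inventory() {
    cat <<'EOF'
[headnode]
# Head node's static address; use its DHCP address instead if you reach it
# through your home network.
192.168.42.1

[computes]
pi01
pi02
EOF
}

example_inventory
```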
You should (hopefully) be able to set up all the cluster services using the following command:
ansible-playbook -i hosts.pi headnode-main.yml
The compute nodes are on a private network, so you have to configure them from the head node.
SSH into the head node as "pi" and confirm that this repo is present as "ansible-pi-cluster". Then you should be able to just run:
ansible-playbook -i hosts.pi computes-main.yml
And with luck, you should then have your SLURM cluster!
For information about how to use SLURM, see the quickstart guide.