EManager stands for Execution Manager, a module developed within the AIMES project. The EManager takes a workflow description as input and executes its tasks by means of a pilot framework on a dynamically chosen set of resources. The number of pilots required to execute the given tasks is determined dynamically on the basis of the duration and number of cores required by each task and of the information acquired about the target resources.
Currently, EManager supports:
- RADICAL-pilot as PilotJob framework;
- AIMES skeletons as a synthetic workflow descriptor; and
- AIMES bundles as information system about resource properties.
EManager is still under development. At the moment, a proof of concept is available from this repository in the form of a demo script. The script was used at Supercomputing 2014 (SC2014) to illustrate the progress and current state of the art of the AIMES project.
The demo script requires:
- A Linux operating system. Apple OS X should work too, but it has not been tested.
- OS-level applications.
- Accounts and allocations on XSEDE and NERSC.
- A selected number of Python modules:
  - radical.pilot: installed from PyPI
  - radical.utils: installed by radical.pilot from PyPI
  - saga-python: installed by radical.pilot from PyPI
  - aimes.skeleton: installed from GitHub
  - aimes.bundle: installed from GitHub
  - pandas: installed from PyPI
  - Pyro4: installed from PyPI
The demo can be run on a reduced set of resources but all the above resources are required to get a run consistent with the one demoed at SC2014.
To run the demo script, prepare a dedicated python environment:
virtualenv ~/Virtualenvs/AIMES-DEMO-SC2014
. ~/Virtualenvs/AIMES-DEMO-SC2014/bin/activate
Install the required python modules:
pip install radical.pilot
pip install --upgrade git+https://github.com/applicationskeleton/Skeleton.git@master#egg=Skeleton
pip install --upgrade git+https://github.com/Francis-Liu/aimes.bundle.git@master#egg=aimes.bundle
pip install pandas
pip install Pyro4
Install the execution manager:
pip install --upgrade git+https://github.com/mturilli/aimes.emanager.git@master#egg=aimes.emanager
This demo has not been designed to be portable or to be shared among multiple users. As such, the demo requires an extensive and fairly rigid configuration of its running environment.
The following programs need to be installed and made available within the OS:
- gnuplot >= 4.6
- mutt
Gnuplot is used to generate a diagrammatic representation of the demo run. This diagram is mailed alongside run statistics and logs to a configurable list of recipients via mutt. A mail agent/server is assumed to be configured and available on the host on which the demo is executed. Without one, mutt will not be able to send the e-mail.
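The two requirements above can be checked before a run. The following is a minimal sketch, not part of the demo: the function names `version_ge` and `check_demo_tools` are hypothetical, and the version comparison assumes a `sort` that supports `-V` (GNU coreutils).

```bash
#!/usr/bin/env bash
# Pre-flight check for the OS-level tools used by the demo (illustrative sketch).

# Return 0 if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report missing tools and a gnuplot older than 4.6.
check_demo_tools() {
    local status=0 tool gnuplot_version
    for tool in gnuplot mutt; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            status=1
        fi
    done
    if command -v gnuplot >/dev/null 2>&1; then
        # `gnuplot --version` prints a line such as "gnuplot 4.6 patchlevel 6".
        gnuplot_version=$(gnuplot --version | awk '{print $2}')
        if ! version_ge "$gnuplot_version" 4.6; then
            echo "gnuplot $gnuplot_version is older than 4.6" >&2
            status=1
        fi
    fi
    return "$status"
}
```

After sourcing these definitions, `check_demo_tools && echo "tools OK"` reports whether both programs are available. The check does not verify that a mail agent is configured for mutt.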
Edit the following file in your preferred editor:
~/Virtualenvs/AIMES-DEMO-SC2014/bin/demo_SC2014_env_setup.sh
uncomment the following block of text:
# if test "$username" = "<INSERT_YOUR_USERNAME>"
# then
# #export EMANAGER_DEBUG
# export AIMES_USER_ID=<INSERT_YOUR_USERNAME>
# export AIMES_USER_KEY=<INSERT_PATH_TO_YOUR_SSH_PUBLIC_KEY>
# export DEMO_FOLDER=/home/<INSERT_YOUR_USERNAME>/AIMES_demo_SC2014
# export BUNDLE_CONF=~/Virtualenvs/AIMES-DEMO-SC2014/etc/bundle_demo_SC2014.conf
# export SKELETON_CONF=~/Virtualenvs/AIMES-DEMO-SC2014/etc/skeleton_demo_SC2014.conf
# export XSEDE_PROJECT_ID_STAMPEDE=<INSERT_YOUR_STAMPEDE_ALLOCATION>
# export XSEDE_PROJECT_ID_TRESTLES=<INSERT_YOUR_TRESTLES_ALLOCATION>
# export XSEDE_PROJECT_ID_GORDON=<INSERT_YOUR_GORDON_ALLOCATION>
# export XSEDE_PROJECT_ID_BLACKLIGHT=<INSERT_YOUR_BLACKLIGHT_ALLOCATION>
# export RECIPIENTS=<INSERT_RECIPIENT_EMAIL_ADDRESS>,<INSERT_RECIPIENT_EMAIL_ADDRESS>
# fi
and replace:
- `<INSERT_YOUR_USERNAME>` with the name of the account from which you will run the demo. The command `id -un` can be used to find out the name of the account to be used.
- `<INSERT_PATH_TO_YOUR_SSH_PUBLIC_KEY>` with the path of the SSH public key that will be used to authenticate on all the target resources. This parameter needs to be specified only when more than one private key is present in `~/.ssh`. Example of a valid parameter: `/home/test/.ssh/test_rsa.pub`.
- `<INSERT_YOUR_STAMPEDE_ALLOCATION>` with the allocation you want to use on stampede.
- `<INSERT_YOUR_TRESTLES_ALLOCATION>` with the allocation you want to use on trestles.
- `<INSERT_YOUR_GORDON_ALLOCATION>` with the allocation you want to use on gordon.
- `<INSERT_YOUR_BLACKLIGHT_ALLOCATION>` with the allocation you want to use on blacklight.
- `<INSERT_RECIPIENT_EMAIL_ADDRESS>` with one or more comma-delimited e-mail address(es) to which you want to send the report e-mail once a demo run has terminated.
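A forgotten placeholder in this block is easy to miss. As a rough sketch, the helper below lists variables that are unset, empty, or still contain an `<INSERT_...>` placeholder. The name `missing_vars` and the `required_vars` array are hypothetical; the check assumes it is run in a shell where the variables of the block above have already been exported. `AIMES_USER_KEY` is left out because it is optional, and `AIMES_USER_ID` should be dropped from the list when it is deliberately left unset.

```bash
#!/usr/bin/env bash
# Variables exported by the block in demo_SC2014_env_setup.sh.
# AIMES_USER_KEY is omitted: it is needed only when several keys are present.
required_vars=(AIMES_USER_ID DEMO_FOLDER BUNDLE_CONF SKELETON_CONF
               XSEDE_PROJECT_ID_STAMPEDE XSEDE_PROJECT_ID_TRESTLES
               XSEDE_PROJECT_ID_GORDON XSEDE_PROJECT_ID_BLACKLIGHT
               RECIPIENTS)

# Hypothetical helper: print the name of every variable given as argument
# that is unset, empty, or still holds an "<INSERT_..." placeholder.
missing_vars() {
    local name value
    for name in "$@"; do
        value=${!name-}          # indirect expansion, empty if unset
        case $value in
            ''|*'<INSERT_'*) echo "$name" ;;
        esac
    done
}
```

Calling `missing_vars "${required_vars[@]}"` prints nothing when every variable has been filled in.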
Edit the following file in your preferred editor:
~/Virtualenvs/AIMES-DEMO-SC2014/etc/bundle_demo_SC2014.conf
and replace `<INSERT_STAMPEDE_USERNAME>`, `<INSERT_TRESTLES_USERNAME>`, `<INSERT_GORDON_USERNAME>`, `<INSERT_BLACKLIGHT_USERNAME>`, and `<INSERT_HOPPER_USERNAME>` with your username on the named resources.
If the demo needs to be run on a reduced set of resources, all the unneeded resources should be commented out in this file.
NOTE
When different IDs are used on target resources, further configuration is needed:
- Unset/comment out the environment variable `AIMES_USER_ID` in `demo_SC2014_env_setup.sh`.
- Set an ID for each target resource in your `~/.ssh/config`. For example:
host stampede stampede.tacc.xsede.org stampede.tacc.utexas.edu
    User = tg803521
    Hostname = login1.stampede.tacc.utexas.edu
host trestles trestles.sdsc.xsede.org
    User = amerzky
    Hostname = trestles-login.sdsc.edu
host login.archer.ac.uk
    User = merzky
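To confirm which user and host name ssh will actually use for an alias, the effective configuration can be printed with `ssh -G` (available in OpenSSH 6.8 and later). The sketch below only summarizes that output; the function name `summarize_ssh_config` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `ssh -G <alias>` output on stdin and print
# "<alias> user=<user> hostname=<hostname>".
summarize_ssh_config() {
    awk -v name="$1" '
        $1 == "user"     { user = $2 }
        $1 == "hostname" { host = $2 }
        END { printf "%s user=%s hostname=%s\n", name, user, host }
    '
}

# Usage (requires a local OpenSSH client with the -G option):
#   ssh -G stampede | summarize_ssh_config stampede
```

With the example configuration above, the summary for `stampede` should show the user `tg803521` and the host name `login1.stampede.tacc.utexas.edu`.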
### Execution environment
Create the directory from which to run the demo:
mkdir ~/AIMES_demo_SC2014
### Authentication
Bundles and radical.pilot require key-based ssh authentication and **do not handle** password requests for password-protected private keys. You can either create a password-less private key or, more securely, use an ssh-agent to manage password requests for your keys. In order to run this demo you will need to set up key-based ssh authentication on: stampede, trestles, gordon, blacklight, and hopper.
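Whether key-based authentication works without any prompt can be tested before the demo is started. The following sketch tries a non-interactive login on each host it is given. The function name `check_key_auth` and the `SSH_CMD` override are hypothetical, and the host names passed to it are assumed to be resolvable by your ssh client (for example as aliases in `~/.ssh/config`).

```bash
#!/usr/bin/env bash
# SSH_CMD can be overridden for a dry run; it defaults to the real ssh client.
SSH_CMD=${SSH_CMD:-ssh}

# Hypothetical helper: attempt a non-interactive login on each host argument.
check_key_auth() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password
        # or passphrase, which mirrors what the demo requires.
        if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=10 "$host" true </dev/null; then
            echo "ok: $host"
        else
            echo "failed: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

When using an agent, a typical sequence is `eval "$(ssh-agent -s)"`, then `ssh-add` for your key, then `check_key_auth stampede trestles gordon blacklight hopper`.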
## Initialization
The bundle module needs to be initialized before running the demo. Execute the following command:
aimes-bundle-manager -c ~/Virtualenvs/AIMES-DEMO-SC2014/etc/bundle_demo_SC2014.conf -m mongodb -u mongodb://54.221.194.147:24242/AIMES_bundle_<USERNAME>/ -v
The string `<USERNAME>` has to be replaced by the same username that has been set in the file `~/Virtualenvs/AIMES-DEMO-SC2014/bin/demo_SC2014_env_setup.sh` as per the instructions in the Section `Configuration files` above.
The `aimes-bundle-manager` command will take a few minutes to populate the bundle database with all the resource information. This demo should not be run before the database has been fully populated.
The bundle database will have to be purged and reinitialized in case of repeated runs of the demo with different target resources. The following command can be used to purge the bundle database:
radical-utils-mongodb.py -m remove -d mongodb://54.221.194.147:24242/AIMES_bundle_<USERNAME>/
The content of the bundle database can be inspected with one of the following commands, depending on the output formatting required:
radical-utils-mongodb.py -m tree -d mongodb://54.221.194.147:24242/AIMES_bundle_<USERNAME>/
radical-utils-mongodb.py -m dump -d mongodb://54.221.194.147:24242/AIMES_bundle_<USERNAME>/
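When the target resources change between runs, the purge and the initialization have to be repeated with the same database URL. The sketch below wraps the two commands shown above so that the URL is built only once. It rests on two assumptions: that the `<USERNAME>` placeholder is the final part of the database name (`AIMES_bundle_<USERNAME>`), and that `BUNDLE_CONF` points to the bundle configuration file as in `demo_SC2014_env_setup.sh`. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# MongoDB host and port used by the commands in this section.
BUNDLE_DB_HOST=54.221.194.147:24242

# Assumption: the database is named AIMES_bundle_<USERNAME>.
bundle_db_url() {
    printf 'mongodb://%s/AIMES_bundle_%s/\n' "$BUNDLE_DB_HOST" "$1"
}

# Hypothetical helper: purge the bundle database, then repopulate it.
# Expects BUNDLE_CONF to hold the path of the bundle configuration file.
reset_bundle_db() {
    local url
    url=$(bundle_db_url "$1")
    radical-utils-mongodb.py -m remove -d "$url" &&
        aimes-bundle-manager -c "$BUNDLE_CONF" -m mongodb -u "$url" -v
}
```

For example, `reset_bundle_db "$(id -un)"` would reset the database of the current account, provided the username configured for the demo matches the local one.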
## Execution
Execute the AIMES SC2014 demo as follows:
cd ~/AIMES_demo_SC2014
demo_SC2014.sh
The script will output all the steps of the demo on the console and, once completed, will send an e-mail with the summary of the run and its diagrammatic representation to the e-mail address(es) indicated in the demo configuration file. Here is an example of a diagram produced for a successful run of the demo:
![Diagrammatic representation of a demo run](https://raw.githubusercontent.com/mturilli/aimes.emanager/master/doc/54c64b2323769c240b19d396.png)
**Note that the pilot on blacklight is supposed to fail. This illustrates the fault-tolerant properties of the scheduler used to late-bind the tasks of the given skeleton on a dynamic number of pilots.**
The following directories will be written into the demo directory:
* `run-21-<SID>`: directory containing all the files related to the demo run, including diagrams, logs, and statistics. If the e-mail fails to be delivered, all the files are still available within this directory. Each run creates its own directory.
* `Stage_1_Input`: directory with the input files for the tasks of the first stage of the skeleton.
* `Stage_1_Output`: directory with the output files of the tasks of the first stage of the skeleton. These files are transferred from the remote resource back to the host from which the demo has been run.
* `Stage_2_Output`: directory with the output files of the tasks of the second stage of the skeleton. These files are transferred from the remote resource back to the host from which the demo has been run.
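Since every run adds a new `run-21-<SID>` directory, it can be handy to locate the most recent one, for instance when the report e-mail was not delivered. A minimal sketch follows; the function name `latest_run_dir` is hypothetical and the selection is based on directory modification times.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified run-* directory
# inside the demo directory given as first argument.
# Returns a non-zero status if no run directory exists.
latest_run_dir() {
    local demo_dir=$1 newest= dir
    for dir in "$demo_dir"/run-*/; do
        [ -d "$dir" ] || continue            # glob did not match anything
        if [ -z "$newest" ] || [ "$dir" -nt "$newest" ]; then
            newest=$dir
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "${newest%/}"
}
```

For example, `ls "$(latest_run_dir ~/AIMES_demo_SC2014)"` lists the diagrams, logs, and statistics of the latest run.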
The skeleton executed by the demo is limited to 21 tasks due to the time constraints imposed by a live execution. The skeleton can be modified by editing the file:
~/Virtualenvs/AIMES-DEMO-SC2014/etc/skeleton_demo_SC2014.conf
The current code should support runs of up to, but not more than, 4096 tasks per stage.