GitHub - In-For-Disaster-Analytics/cookbook-conda-template: A template cookbook using conda

This template is the first in a series of templates that guide you through creating a cookbook and running it on TACC systems. The series ranges from simple cookbooks that run a single command to more complex ones that run a Python script using conda or a Jupyter Notebook.

Requirements

A GitHub account
A TACC account. If you don't have one, you can request one here.
To access TACC systems, you should have an allocation.
- You can see your allocations here.
- If you don't have an allocation, you can request one here.

Template Overview

This template creates a simple Python script that will be used to demonstrate how to run a cookbook on a TACC cluster and obtain the output using a UI. The cookbook will use a CSV file stored on TACC storage and run a Python script that reads it, calculates the average of the values in the first column, and writes the result to a file.

In this case, the file is small for demonstration purposes. However, you can use the same process to analyze large files.
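The processing step described above can be sketched as follows. This is a minimal illustration, not the repository's actual main.py: the function names, the `has_header` parameter, and the use of the standard-library `csv` module (rather than pandas, which the template's environment installs) are assumptions made so the sketch runs without extra dependencies.

```python
import csv
import sys
from pathlib import Path


def first_column_average(csv_path, has_header=True):
    """Return the mean of the numeric values in the first column of a CSV file."""
    values = []
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row, if the file has one
        for row in reader:
            if row and row[0].strip():  # ignore blank lines and empty cells
                values.append(float(row[0]))
    if not values:
        raise ValueError(f"no values found in the first column of {csv_path}")
    return sum(values) / len(values)


def main(argv):
    """Hypothetical entry point: main.py <input.csv> <output.txt>."""
    if len(argv) != 3:
        print(f"usage: {argv[0]} <input.csv> <output.txt>", file=sys.stderr)
        return 2
    average = first_column_average(argv[1])
    Path(argv[2]).write_text(f"{average}\n")  # write the result as a single line
    return 0
```

In a script, `main(sys.argv)` would typically be called under an `if __name__ == "__main__":` guard. Because the file is read row by row, the same approach works for files that do not fit in memory.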

How does it work?

app.json file: contains the definition of the Tapis application, including the application's name, description, Docker image, input files, and advanced options.
Dockerfile: a Docker image is built from the Dockerfile. The Docker image defines the runtime environment for the application and the files that will be used by the application.
run.sh: contains all the commands that will be executed on the TACC cluster.

Upload files to TACC storage

One of the goals of the template is to demonstrate how to use the TACC storage system to store input and output files, so start by uploading the CSV file to TACC storage.

Go to the TACC Portal.
Click on the "Data Files" tab.
Click on the "Add +" button.
Click on the "Upload" button.
Select the file you want to upload and click Upload Selected.

Modify the Dockerfile

The Dockerfile defines the Docker image in which the Python script runs. In this case, the image is built from the microconda base image, a minimal image that contains conda.

For example, the Dockerfile instruction below installs curl using apt-get and then removes the apt package lists to keep the image small. This approach is useful if you need packages that are not available in conda.

RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

Define conda dependencies using `environment.yaml`

The environment.yaml file defines the conda environment in which the Python script runs. In this case, it pins the Python and pandas versions the script needs.

name: base
channels:
  - conda-forge
dependencies:
  - python=3.9.1
  - pandas=1.2.1
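
One way to confirm that the built image matches these pins is to compare them against what is actually installed. The sketch below is an illustration only: the function names are hypothetical, and it assumes the conda packages ship standard Python package metadata that `importlib.metadata` can read.

```python
import platform
from importlib import metadata

# Pins copied from the environment.yaml shown above.
EXPECTED = {"python": "3.9.1", "pandas": "1.2.1"}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    if name == "python":
        # The interpreter itself has no importlib metadata entry.
        return platform.python_version()
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected):
    """Return (name, expected, found) for every pin that does not match."""
    mismatches = []
    for name, wanted in expected.items():
        found = installed_version(name)
        if found != wanted:
            mismatches.append((name, wanted, found))
    return mismatches
```

Running `find_mismatches(EXPECTED)` inside the container returns an empty list when both pins are satisfied.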

Job run script

The run.sh file runs the Python script. It changes into the directory where the input files are staged and then calls the script with the input file name and the output path.

#!/bin/bash
set -xe

cd ${_tapisExecSystemInputDir}
python /code/main.py billing.csv ${_tapisExecSystemOutputDir}/output.txt

The run.sh script uses two variables, _tapisExecSystemInputDir and _tapisExecSystemOutputDir, to locate the input and output directories. Both are set automatically by the Tapis system.

_tapisExecSystemInputDir: The directory where the input files are staged
_tapisExecSystemOutputDir: The directory where the application writes the output files
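
The same two directories could also be resolved from Python. This is a sketch under an assumption: that Tapis exports both variables to the job's environment, so a Python process can read them. The run.sh above only shows them being expanded by the shell. The helper name `tapis_dirs` and the fallback to the current directory (for local testing) are hypothetical.

```python
import os
from pathlib import Path


def tapis_dirs(env=None):
    """Return (input_dir, output_dir), defaulting to the current directory."""
    env = os.environ if env is None else env
    input_dir = Path(env.get("_tapisExecSystemInputDir", "."))
    output_dir = Path(env.get("_tapisExecSystemOutputDir", "."))
    return input_dir, output_dir


input_dir, output_dir = tapis_dirs()
csv_path = input_dir / "billing.csv"     # same input name used in run.sh
result_path = output_dir / "output.txt"  # same output name used in run.sh
```

The fallback means the same code can be exercised on a laptop, where neither variable is set, before the job is submitted.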

Create your cookbook

You can use this repository as a template to create your cookbook. Follow the steps below to create your cookbook.

Create a new repository

Click on the "Use this template" button to create a new repository
Fill in the form with the information for your new repository

Build the Docker image

Clone the repository
Build the Docker image using the command below

docker build -t cookbook-python .

Push the Docker image to a container registry

docker tag cookbook-python <your-registry>/cookbook-python
docker push <your-registry>/cookbook-python
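
If you rebuild often, the build, tag, and push commands above can be generated from one place. The sketch below only assembles and prints the commands. The function name, the explicit `tag` parameter, and the registry value are illustrative assumptions, not part of the template.

```python
import shlex


def docker_commands(registry, image="cookbook-python", tag="latest"):
    """Return the build, tag, and push commands as argument lists."""
    remote = f"{registry}/{image}:{tag}"
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "tag", image, remote],
        ["docker", "push", remote],
    ]


# "registry.example.org/myuser" is a placeholder, not a real registry.
for command in docker_commands("registry.example.org/myuser"):
    print(shlex.join(command))
```

To execute the commands instead of printing them, each list can be passed to `subprocess.run(command, check=True)`.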

Modify the `app.json` file

Each app has a unique id and description. So, you should change these fields to match your app's name and description.

Download the app.json file
Change the values of the id and description fields to the name and description you want for your app.
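
The edit can also be scripted. The sketch below assumes that id and description are top-level keys in app.json. The function name and the example values are hypothetical.

```python
import json
from pathlib import Path


def update_app_definition(path, app_id, description):
    """Rewrite the id and description fields, leaving other fields untouched."""
    app_file = Path(path)
    definition = json.loads(app_file.read_text())
    definition["id"] = app_id
    definition["description"] = description
    app_file.write_text(json.dumps(definition, indent=2) + "\n")
    return definition
```

For example, `update_app_definition("app.json", "my-cookbook", "Averages the first column of a CSV file")` updates the downloaded file in place.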

Create a New Application on the Cookbook UI

Go to Cookbook UI
Click on the "Create Application" button
Fill in the form with the information from your app.json file
Click "Create Application"
A new application will be created, and you will be redirected to the application's page

Run your Cookbook

Go to the application's page on the Cookbook UI, if you are not already there
Click on the "Run" button on the right side of the page. This will open the Portal UI
Click on the "Select" button to choose the input file
Click "Run"

Check the output

After the job finishes, you can check the output by clicking on the "Output location" link on the job's page
You will be redirected to the output location, where you can see the output files generated by the job
Click on a file to see its content. In this case, the file is named output.txt

Next templates

Authors

William Mobley - [email protected]
Maximiliano Osorio - [email protected]
