
1. Getting started

Jim Schwoebel edited this page Aug 11, 2020 · 20 revisions

Mac or Linux

First, clone the repository:

git clone git@github.com:jim-schwoebel/allie.git
cd allie

Set up a virtual environment (to keep dependencies isolated and behavior consistent across operating systems):

python3 -m pip install --user virtualenv
python3 -m venv env
source env/bin/activate
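
Before installing dependencies, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch, assuming the environment was created with `python3 -m venv` as above; the helper name `in_virtualenv` is hypothetical and not part of Allie:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If this reports `False`, re-run `source env/bin/activate` in the same shell before continuing.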

Now install required dependencies and perform unit tests to make sure everything works:

python3 setup.py

Note that the installation and unit tests above take roughly 10-15 minutes to complete. They verify that you can featurize, model, and load model files (to make predictions) with the default featurizers and modeling techniques. It may be best to go grab lunch or coffee while waiting. :-)

After everything is done, you can use the Allie CLI by typing in:

python3 allie.py -h

This should print the ways you can use Allie:

Usage: allie.py [options]

Options:
  -h, --help            show this help message and exit
  --c=command, --command=command
                        the target command (annotate API = 'annotate',
                        augmentation API = 'augment',  cleaning API = 'clean',
                        datasets API = 'data',  features API = 'features',
                        model prediction API = 'predict',  preprocessing API =
                        'transform',  model training API = 'train',  testing
                        API = 'test',  visualize API = 'visualize',
                        list/change default settings = 'settings')
  --p=problemtype, --problemtype=problemtype
                        specify the problem type ('c' = classification or 'r'
                        = regression)
  --s=sampletype, --sampletype=sampletype
                        specify the type files that you'd like to operate on
                        (e.g. 'audio', 'text', 'image', 'video', 'csv')
  --n=common_name, --name=common_name
                        specify the common name for the model (e.g. 'gender'
                        for a male/female problem)
  --i=class_, --class=class_
                        specify the class that you wish to annotate (e.g.
                        'male')
  --d=dir, --dir=dir    an array of the target directory (or directories) that
                        contains sample files for the annotation API,
                        prediction API, features API, augmentation API,
                        cleaning API, and preprocessing API (e.g.
                        '/Users/jim/desktop/allie/train_dir/teens/')
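
As an illustration of how these options combine, the sketch below assembles a command line from the long option names shown in the help text. It is not part of Allie: the helper `build_allie_command` and the example directory are hypothetical, and which options each command actually requires is an assumption you should check against the CLI tutorial.

```python
import shlex
from typing import List, Optional


def build_allie_command(command: str,
                        sampletype: Optional[str] = None,
                        directory: Optional[str] = None,
                        problemtype: Optional[str] = None,
                        name: Optional[str] = None,
                        python: str = "python3") -> List[str]:
    """Assemble an argument list for allie.py from the documented long options."""
    argv = [python, "allie.py", "--command", command]
    # Only append the options that were actually supplied.
    optional = [
        ("--problemtype", problemtype),
        ("--sampletype", sampletype),
        ("--name", name),
        ("--dir", directory),
    ]
    for flag, value in optional:
        if value is not None:
            argv += [flag, value]
    return argv


if __name__ == "__main__":
    # Hypothetical directory; replace it with a folder of your own audio files.
    cmd = build_allie_command("features", sampletype="audio",
                              directory="/path/to/audio_files/")
    # Print a copy-pasteable command instead of running it.
    print(shlex.join(cmd))
    # To execute it from the repository root: subprocess.run(cmd, check=True)
```

Building the arguments as a list (rather than one string) avoids shell-quoting problems if a directory name contains spaces.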

For more information on how to use the Allie CLI, check out the Allie CLI tutorial.

Windows

recommended installation (Docker)

You can run Allie in a Docker container fairly easily (the container is 10-11 GB and runs on top of Ubuntu Linux):

git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie
docker build -t allie_image .
docker run -it --entrypoint=/bin/bash allie_image
cd ..

You will then have a shell inside the Docker container with access to Allie's folder structure. From there, you can run the tests:

cd tests
python3 test.py

Note you can quickly download datasets from AWS buckets and train machine learning models from there.

alternative

Note that many Python libraries are incompatible with Windows, so I encourage you to instead run Allie in a Docker container with Ubuntu or on Windows Subsystem for Linux.
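
If you are unsure which environment a shell is running in, a quick check can tell native Windows apart from Windows Subsystem for Linux. This is a rough sketch and not part of Allie: the function `describe_platform` is hypothetical, and looking for "microsoft" in the kernel release string is a common heuristic for WSL rather than an official API.

```python
import platform


def describe_platform(system: str, release: str) -> str:
    """Classify the host as 'windows', 'wsl', or 'unix-like' from uname fields."""
    if system == "Windows":
        return "windows"
    # WSL kernels typically include "microsoft" in the release string;
    # this is a heuristic and may not hold for every WSL build.
    if system == "Linux" and "microsoft" in release.lower():
        return "wsl"
    return "unix-like"


if __name__ == "__main__":
    kind = describe_platform(platform.system(), platform.release())
    print("detected platform:", kind)
    if kind == "windows":
        print("Consider running Allie in Docker or under WSL instead.")
```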

If you still want to try to use Allie with Windows, you can do so below.

First, install various dependencies:

Now clone Allie and run the setup.py script:

git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie
git checkout windows
python3 -m pip install --user virtualenv
python3 -m venv env
python3 setup.py

Note that some functions (e.g. featurization and modeling scripts) are limited on Windows due to lack of compatibility.