1. Getting started
First, clone the repository:
git clone git@github.com:jim-schwoebel/allie.git
cd allie
Set up a virtual environment (to keep the setup consistent across operating systems):
python3 -m pip install --user virtualenv
python3 -m venv env
source env/bin/activate
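To confirm that the environment is active before installing anything, a quick check can be run from Python itself. This is a minimal sketch and not part of Allie: the helper name `in_virtualenv` is hypothetical, and it relies on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `python3 -m venv`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints `False` after running `source env/bin/activate`, the shell is probably still using the system interpreter.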
Now install the required dependencies and run the unit tests to make sure everything works:
python3 setup.py
Note that the installation process and unit tests above take roughly 10-15 minutes to complete. They make sure that you can featurize, model, and load model files (to make predictions) via your default featurizers and modeling techniques. It may be best to go grab lunch or coffee while waiting. :-)
After everything is done, you can use the Allie CLI by typing in:
python3 allie.py -h
This should output some ways you can use Allie with commands in the API:
Usage: allie.py [options]
Options:
-h, --help show this help message and exit
--c=command, --command=command
the target command (annotate API = 'annotate',
augmentation API = 'augment', cleaning API = 'clean',
datasets API = 'data' features API = 'features' model
prediction API = 'predict' preprocessing API =
'transform' model training API = 'train' testing API =
'test' visualize API = visualize)
--p=problemtype, --problemtype=problemtype
specify the problem type ('c' = classification or 'r'
= regression)
--s=sampletype, --sampletype=sampletype
specify the type files that you'd like to operate on
(e.g. 'audio', 'text', 'image', 'video', 'csv')
--n=common_name, --name=common_name
specify the common name for the model (e.g. 'gender'
for a male/female problem)
--i=class_, --class=class_
specify the class that you wish to annotate for (e.g.
'male')
--a=ldir, --adir=ldir
the directory full of files to annotate (e.g.
'/Users/jim/desktop/allie/train_dir/males/')
--l=ldir, --ldir=ldir
the directory full of files to make model predictions;
if not here will default to ./load_dir (e.g.
'/Users/jim/desktop/allie/load_dir/newfiles/')
--t1=tdir1, --tdir1=tdir1
the directory in the ./train_dir that represent the
folders of files that the transform API will operate
upon (e.g. 'males')
--t2=tdir2, --tdir2=tdir2
the directory in the ./train_dir that represent the
folders of files that the transform API will operate
upon (e.g. 'females')
--d1=dir1, --dir1=dir1
the target directory that contains sample files for
the features API, augmentation API, and cleaning API
(e.g. '/Users/jim/desktop/allie/train_dir/teens/').
--d2=dir2, --dir2=dir2
the target directory that contains sample files for
the features API, augmentation API, and cleaning API
(e.g. '/Users/jim/desktop/allie/train_dir/twenties/').
--d3=dir3, --dir3=dir3
the target directory that contains sample files for
the features API, augmentation API, and cleaning API
(e.g. '/Users/jim/desktop/allie/train_dir/thirties/').
--d4=dir4, --dir4=dir4
the target directory that contains sample files for
the features API, augmentation API, and cleaning API
(e.g. '/Users/jim/desktop/allie/train_dir/fourties/')
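The options above can be combined into a single invocation. The following is a minimal sketch and not part of Allie: the helper `build_allie_command` is hypothetical, the command names and long option names are copied from the help output above, and which options a given command actually requires is an assumption that should be checked against `python3 allie.py -h`.

```python
import sys

# Command names as listed in the help output for --command.
COMMANDS = {
    "annotate", "augment", "clean", "data", "features",
    "predict", "transform", "train", "test", "visualize",
}

# Long option names as listed in the help output.
OPTIONS = {
    "problemtype", "sampletype", "name", "class", "adir", "ldir",
    "tdir1", "tdir2", "dir1", "dir2", "dir3", "dir4",
}


def build_allie_command(command, **options):
    """Build an argv list for allie.py (hypothetical helper)."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    # Use the current interpreter so the active virtual environment is respected.
    argv = [sys.executable, "allie.py", f"--command={command}"]
    for key, value in options.items():
        if key not in OPTIONS:
            raise ValueError(f"unknown option: {key!r}")
        argv.append(f"--{key}={value}")
    return argv


if __name__ == "__main__":
    # Assumed example: featurize the audio files in one training folder.
    argv = build_allie_command(
        "features", sampletype="audio", dir1="train_dir/males/"
    )
    print(" ".join(argv))
    # To actually run it from the allie directory, pass argv to
    # subprocess.run(argv, check=True).
```

Because `class` is a reserved word in Python, that option would need to be passed as `**{"class": "male"}` with this helper.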
You can run Allie in a Docker container fairly easily (the container is 10-11 GB and runs on top of Linux/Ubuntu):
git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie
docker build -t allie_image .
docker run -it --entrypoint=/bin/bash allie_image
cd ..
You will then have access to the Docker container and Allie's folder structure. You can then run the tests with:
cd tests
python3 test.py
Note you can quickly download datasets from AWS buckets and train machine learning models from there.
Note that many Python libraries are incompatible with Windows, so I encourage you to instead run Allie in a Docker container with Ubuntu or on Windows Subsystem for Linux.
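Before choosing an installation route, it can help to check which platform Python reports. This is a minimal sketch and not part of Allie: `recommended_setup` is a hypothetical helper, and detecting Windows Subsystem for Linux by looking for "microsoft" in the kernel release string is a common heuristic rather than a guaranteed test.

```python
import platform


def recommended_setup(system=None, release=None):
    """Suggest an installation route based on the reported platform."""
    # Fall back to the values reported by the running interpreter.
    system = system or platform.system()
    release = (release or platform.release()).lower()
    if system == "Windows":
        return "windows: prefer Docker or WSL"
    # Heuristic (assumption): WSL kernels include "microsoft" in the release.
    if system == "Linux" and "microsoft" in release:
        return "wsl: follow the Linux steps"
    return "native: follow the standard steps"


if __name__ == "__main__":
    print(recommended_setup())
```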
If you still want to try to use Allie with Windows, you can do so below.
First, install various dependencies:
- Download Microsoft Visual C++ (https://www.visualstudio.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=15).
- Download SWIG, compile it locally, and add it to your environment variables (http://www.swig.org/download.html).
- Follow the instructions to set up TensorFlow on Windows.
Now clone Allie and run the setup.py script:
git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie
git checkout windows
python3 -m pip install --user virtualenv
python3 -m venv env
python3 setup.py
Note that some functions are limited (e.g. featurization / modeling scripts) due to lack of Windows compatibility.