1. Getting started

Installing dependencies

First, clone the repository:

git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie

Set up a virtual environment (to ensure consistent behavior across operating systems):

python3 -m pip install --user virtualenv
python3 -m venv env
source env/bin/activate
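
Before installing anything, it can help to confirm that the environment is actually active. The snippet below is a minimal sketch, not part of the repository: the function name in_virtualenv is hypothetical, and the check relies only on the standard library's sys.prefix and sys.base_prefix (plus the legacy real_prefix attribute that older virtualenv releases set).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv/virtualenv.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix points at the base Python installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return sys.prefix != base_prefix or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this reports that no environment is active, re-run the source env/bin/activate step in the same shell before continuing.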

Now install required dependencies:

python3 setup.py

Now run the unit tests to make sure everything works:

cd tests
python3 test.py

Note that the tests above take roughly 5-10 minutes to complete. They verify that you can featurize data, train models, and load model files (to make predictions) with your default featurizers and modeling techniques.

Navigating folders

The table below describes the folder structure of this repository. Use these descriptions to get started quickly with featurizing and modeling data samples.

folder name	description of folder
datasets	an extensive list of open source datasets that can be used for curating and augmenting datasets.
features	a list of audio, text, image, video, and csv featurization scripts (these can be specified in the settings.json files).
load_dir	a directory where you can put audio, text, image, video, or .CSV files and make model predictions with the models in the ./models directory.
models	for loading/storing machine learning models and making model predictions for files put in the load_dir.
production	a folder for outputting production-ready repositories via the YAML.py script.
tests	for running local tests and making sure everything works as expected.
train_dir	a directory where you can put in audio, text, image, video, or .CSV files in folders and train machine learning models from the model.py script in the ./training/ directory.
training	for training machine learning models via specified model training scripts.
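
As a quick sanity check after cloning, the folder names in the table above can be verified programmatically. This is a minimal sketch under stated assumptions: the helper name missing_folders and the EXPECTED_FOLDERS list are hypothetical and simply mirror the table; the repository does not ship this script.

```python
from pathlib import Path

# Folder names copied from the table above (an assumption that the
# table matches the checked-out repository).
EXPECTED_FOLDERS = [
    "datasets",
    "features",
    "load_dir",
    "models",
    "production",
    "tests",
    "train_dir",
    "training",
]


def missing_folders(repo_root):
    """Return the expected folder names that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the repository root (the directory entered with "cd allie").
    missing = missing_folders(".")
    if missing:
        print("missing folders:", ", ".join(missing))
    else:
        print("all expected folders are present")
```

An empty result means every folder from the table exists; any names returned point to an incomplete clone, for example one made without --recurse-submodules.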
