1. Getting started
Jim Schwoebel edited this page Jul 9, 2019
·
20 revisions
First, clone the repository:
git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie
Set up a virtual environment (to ensure consistent behavior across operating systems):
python3 -m pip install --user virtualenv
python3 -m venv env
source env/bin/activate
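To confirm that the environment is active before installing dependencies, a quick check along these lines can help. This is a minimal sketch and not part of Allie: the function name `in_virtualenv` is hypothetical, and it assumes an environment created with `python3 -m venv`, where `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints `False` after running `source env/bin/activate`, the shell is probably still using the system interpreter.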
Now install required dependencies:
python3 setup.py
Now run the unit tests to make sure everything works:
cd tests
python3 test.py
Note that the test above takes roughly 5-10 minutes to complete. It verifies that you can featurize data, train models, and load model files (to make predictions) using your default featurizers and modeling techniques.
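Because the test run is long, it can be convenient to launch it from a small wrapper that reports the exit code and elapsed time. The following is a sketch only: `run_script` is a hypothetical helper, and it assumes that `tests/test.py` signals failure through a non-zero exit code, which this page does not confirm.

```python
import subprocess
import sys
import time
from pathlib import Path
from typing import Tuple


def run_script(script) -> Tuple[int, float]:
    """Run a Python script from its own directory; return (exit code, seconds)."""
    script = Path(script).resolve()
    start = time.monotonic()
    # Reuse the current interpreter so the active virtual environment applies.
    result = subprocess.run([sys.executable, script.name], cwd=script.parent)
    return result.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    target = Path("tests") / "test.py"
    if target.exists():
        code, seconds = run_script(target)
        print(f"exit code {code} after {seconds / 60:.1f} minutes")
    else:
        print(f"{target} not found; run this from the repository root")
```

Running the script from its own directory mirrors the `cd tests` step above, so any relative paths inside the test script resolve the same way.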
The table below describes the folder structure of this repository. These descriptions can help you get started quickly with featurizing and modeling data samples.
folder name | description of folder |
---|---|
datasets | an extensive list of open source datasets that can be used for curating and augmenting datasets. |
features | a list of audio, text, image, video, and csv featurization scripts (these can be specified in the settings.json files). |
load_dir | a directory where you can put audio, text, image, video, or .CSV files and make model predictions using models from the ./models directory. |
models | for loading/storing machine learning models and making model predictions for files put in the load_dir. |
production | a folder for outputting production-ready repositories via the YAML.py script. |
tests | for running local tests and making sure everything works as expected. |
train_dir | a directory where you can put in audio, text, image, video, or .CSV files in folders and train machine learning models from the model.py script in the ./training/ directory. |
training | for training machine learning models via specified model training scripts. |
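
After cloning, one way to check that the checkout matches the layout in the table above is to look for each folder by name. This is a minimal sketch: `missing_folders` and `EXPECTED_FOLDERS` are hypothetical names, and the folder list is copied from the table rather than from the repository itself.

```python
from pathlib import Path
from typing import List

# Folder names taken from the table above.
EXPECTED_FOLDERS = [
    "datasets",
    "features",
    "load_dir",
    "models",
    "production",
    "tests",
    "train_dir",
    "training",
]


def missing_folders(repo_root) -> List[str]:
    """Return the expected folder names that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    missing = missing_folders(".")
    print("missing folders:", missing if missing else "none")
```

A non-empty result may simply mean the repository layout has changed since this page was written, or that the clone did not complete.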