5. Command Line Interface

Jim Schwoebel edited this page Aug 7, 2020 · 42 revisions

Allie has a rich command-line interface (CLI) that exposes many of the API functions. This section of the wiki explains how to use the Allie CLI.

To follow along with these examples, quickly seed some data (51 male files / 51 female files):

cd allie
cd datasets
python3 seed_test.py

Help

To get started, you can explore the Allie CLI commands by typing:

cd ~ 
cd allie
python3 allie.py -h

This should print the options you can use to call each part of the API:

Usage: allie.py [options]

Options:
  -h, --help            show this help message and exit
  --c=command, --command=command
                        the target command (annotate API = 'annotate',
                        augmentation API = 'augment',  cleaning API = 'clean',
                        datasets API = 'data',  features API = 'features',
                        model prediction API = 'predict',  preprocessing API =
                        'transform',  model training API = 'train',  testing
                        API = 'test',  visualize API = 'visualize',
                        list/change default settings = 'settings')
  --p=problemtype, --problemtype=problemtype
                        specify the problem type ('c' = classification or 'r'
                        = regression)
  --s=sampletype, --sampletype=sampletype
                        specify the type files that you'd like to operate on
                        (e.g. 'audio', 'text', 'image', 'video', 'csv')
  --n=common_name, --name=common_name
                        specify the common name for the model (e.g. 'gender'
                        for a male/female problem)
  --i=class_, --class=class_
                        specify the class that you wish to annotate (e.g.
                        'male')
  --t1=tdir1, --tdir1=tdir1
                        the directory in the ./train_dir that represent a
                        folder of files that the transform API will operate
                        upon (e.g. 'males')
  --t2=tdir2, --tdir2=tdir2
                        the directory in the ./train_dir that represent a
                        folder of files that the transform API will operate
                        upon (e.g. 'females')
  --d=dir, --dir=dir    an array of the target directory (or directories) that
                        contains sample files for the annotation API,
                        prediction API, features API, augmentation API, and
                        cleaning API (e.g.
                        '/Users/jim/desktop/allie/train_dir/teens/')
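The layout of this help text matches what Python's standard optparse module prints, so the following is a minimal sketch of how a few of these options could be declared and parsed. It assumes optparse and is not taken from allie.py itself; build_parser is a hypothetical helper and only five of the options are reproduced.

```python
from optparse import OptionParser


def build_parser():
    """Hypothetical parser covering a subset of the options in the help text."""
    parser = OptionParser(prog="allie.py")
    parser.add_option("--c", "--command", dest="command", metavar="command",
                      help="the target command (e.g. 'annotate', 'train')")
    parser.add_option("--p", "--problemtype", dest="problemtype",
                      metavar="problemtype",
                      help="'c' = classification or 'r' = regression")
    parser.add_option("--s", "--sampletype", dest="sampletype",
                      metavar="sampletype",
                      help="e.g. 'audio', 'text', 'image', 'video', 'csv'")
    parser.add_option("--i", "--class", dest="class_", metavar="class_",
                      help="the class to annotate (e.g. 'male')")
    parser.add_option("--d", "--dir", dest="dir", metavar="dir",
                      help="target directory of sample files")
    return parser


if __name__ == "__main__":
    # The abbreviated and full spellings of an option are interchangeable.
    argv = ["--c", "annotate", "--sampletype", "audio", "--p", "c",
            "--i", "male", "--dir", "train_dir/males"]
    options, leftover = build_parser().parse_args(argv)
    print(options.command, options.sampletype, options.problemtype,
          options.class_, options.dir)
```

With a parser like this, a misspelled option name is rejected with a "no such option" error instead of being silently ignored.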

Classification problem

You can annotate a folder of audio files as a classification problem with the label male using a command like this:

python3 allie.py --command annotate --sampletype audio --problemtype c --i male --dir /Users/jim/desktop/allie/train_dir/males

Regression problem

To switch to a regression problem, change the problemtype to r and the class (--i) to a regression target (e.g. age):

python3 allie.py --command annotate --sampletype audio --problemtype r --i age --dir /Users/jim/desktop/allie/train_dir/males
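The two annotate commands differ only in the problem type and the class, which a small wrapper can make explicit. The sketch below is a hypothetical helper (build_annotate_command is not part of Allie) that validates the values against the ones listed in the help text and returns an argument list suitable for subprocess.run.

```python
import shlex

PROBLEM_TYPES = {"c", "r"}  # 'c' = classification, 'r' = regression
SAMPLE_TYPES = {"audio", "text", "image", "video", "csv"}


def build_annotate_command(sampletype, problemtype, class_, directory):
    """Return the argument list for an annotate call (hypothetical helper)."""
    if sampletype not in SAMPLE_TYPES:
        raise ValueError(f"unknown sampletype: {sampletype!r}")
    if problemtype not in PROBLEM_TYPES:
        raise ValueError(f"problemtype must be 'c' or 'r', got {problemtype!r}")
    return ["python3", "allie.py",
            "--command", "annotate",
            "--sampletype", sampletype,
            "--problemtype", problemtype,
            "--i", class_,
            "--dir", directory]


if __name__ == "__main__":
    males = "/Users/jim/desktop/allie/train_dir/males"
    # Classification: label every file in the folder as 'male'.
    print(shlex.join(build_annotate_command("audio", "c", "male", males)))
    # Regression: annotate the same folder with an 'age' value.
    print(shlex.join(build_annotate_command("audio", "r", "age", males)))
```

Passing the list to subprocess.run(cmd, check=True) from inside the allie directory would execute the call; printing it first is a cheap way to check the arguments.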

You can augment data with the default augmenters from the settings (e.g. default_audio_augmenters) like this:

python3 allie.py --command augment --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males

You can clean data with the default cleaners from the settings (e.g. default_audio_cleaners) like this:

python3 allie.py --command clean --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males

You can call the Allie datasets API with this command:

python3 allie.py --command data

You can featurize data just like augmentation and cleaning:

python3 allie.py --command features --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males
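The augment, clean, and features commands take the same two arguments (a sample type and a directory), so they are easy to batch over several folders. The following is a rough sketch only: build_batch is a hypothetical helper, the train_dir/females folder is assumed to exist alongside train_dir/males, and the clean-then-featurize order is an illustrative choice rather than something Allie requires.

```python
DIR_COMMANDS = ("augment", "clean", "features")


def build_dir_command(command, sampletype, directory):
    """Argument list for one directory-based command (hypothetical helper)."""
    if command not in DIR_COMMANDS:
        raise ValueError(f"expected one of {DIR_COMMANDS}, got {command!r}")
    return ["python3", "allie.py",
            "--command", command,
            "--sampletype", sampletype,
            "--dir", directory]


def build_batch(commands, sampletype, directories):
    """Every command for the first directory, then for the next, and so on."""
    return [build_dir_command(command, sampletype, directory)
            for directory in directories
            for command in commands]


if __name__ == "__main__":
    batch = build_batch(("clean", "features"), "audio",
                        ["train_dir/males", "train_dir/females"])
    for cmd in batch:
        print(" ".join(cmd))
```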

You can train machine learning models quickly with:

python3 allie.py --command train

To make predictions, you need a model that you have trained with Allie in the ./models/[sampletype]_models directory. For example, an audio model that detects gender may be in this tree structure:


Then you can run the predict command:

python3 allie.py --command predict
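Before running predict, it can help to confirm that something is actually present in the model directory for your sample type. The sketch below only relies on the ./models/[sampletype]_models naming described above; list_models is a hypothetical helper and makes no assumption about what the entries inside that folder look like.

```python
from pathlib import Path


def list_models(models_root, sampletype):
    """Names of the entries in <models_root>/<sampletype>_models, sorted.

    Returns an empty list when the directory does not exist.
    """
    model_dir = Path(models_root) / f"{sampletype}_models"
    if not model_dir.is_dir():
        return []
    return sorted(entry.name for entry in model_dir.iterdir())


if __name__ == "__main__":
    found = list_models("models", "audio")
    if found:
        print("audio models:", ", ".join(found))
    else:
        print("no audio models found; train one first")
```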

You can make a transformer to reduce or select features with:

python3 allie.py --command transform --tdir1 males --tdir2 females

Where 'males' and 'females' are the two directories in the train_dir that are being used to complete the transformation.
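Because both names must refer to folders inside train_dir, a quick existence check can catch a typo before the transform starts. This is a minimal sketch with a hypothetical helper (build_transform_command); it assumes train_dir is reachable from the current working directory.

```python
from pathlib import Path


def build_transform_command(train_dir, tdir1, tdir2):
    """Check both class folders exist, then return the transform arguments."""
    root = Path(train_dir)
    missing = [name for name in (tdir1, tdir2) if not (root / name).is_dir()]
    if missing:
        raise FileNotFoundError(
            f"not found in {root}: {', '.join(missing)}")
    return ["python3", "allie.py",
            "--command", "transform",
            "--tdir1", tdir1,
            "--tdir2", tdir2]


if __name__ == "__main__":
    try:
        print(" ".join(build_transform_command("train_dir", "males", "females")))
    except FileNotFoundError as err:
        print("cannot run transform:", err)
```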

You can run unit tests with:

python3 allie.py --command test

You can visualize multi-class problems that have featurized folders with:

python3 allie.py --command visualize

This will take you through a prompt to set the classes and structure a visualization session, which is output in the 'visualization_session' folder.

You can change settings within Allie with:

python3 allie.py --command settings

This will open a list of questions that let you specify new settings within Allie or view the existing settings, as stored in the settings.json database.

For example, you may want to turn off the transcribe_video setting by setting it to False:

{'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': True, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}


Would you like to change any of these settings? Yes (-y) or No (-n)
y
What setting would you like to change?
transcribe_video
What setting would you like to set here?
False
<class 'bool'>
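The final line of the session shows that the typed text False ends up as a boolean. The same change could in principle be scripted; the sketch below assumes settings.json is a plain JSON object with the keys printed above, and both coerce_value and update_setting are hypothetical helpers, not Allie functions.

```python
import ast
import json


def coerce_value(text):
    """Turn typed text into a Python value ('False' -> False, '20' -> 20).

    Text that is not a Python literal (e.g. 'tpot') is returned unchanged.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def update_setting(path, key, text):
    """Set one existing key in a JSON settings file and return the new value."""
    with open(path) as handle:
        settings = json.load(handle)
    if key not in settings:
        raise KeyError(f"unknown setting: {key}")
    settings[key] = coerce_value(text)
    with open(path, "w") as handle:
        json.dump(settings, handle, indent=2)
    return settings[key]


if __name__ == "__main__":
    # Assumes this runs from the directory that holds settings.json.
    new_value = update_setting("settings.json", "transcribe_video", "False")
    print(type(new_value))
```

Rejecting unknown keys keeps a misspelling such as video_transcribe from silently adding a new entry to the file.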