5. Command Line Interface
AAA lllllll lllllll iiii
A:::A l:::::l l:::::l i::::i
A:::::A l:::::l l:::::l iiii
A:::::::A l:::::l l:::::l
A:::::::::A l::::l l::::l iiiiiii eeeeeeeeeeee
A:::::A:::::A l::::l l::::l i:::::i ee::::::::::::ee
A:::::A A:::::A l::::l l::::l i::::i e::::::eeeee:::::ee
A:::::A A:::::A l::::l l::::l i::::i e::::::e e:::::e
A:::::A A:::::A l::::l l::::l i::::i e:::::::eeeee::::::e
A:::::AAAAAAAAA:::::A l::::l l::::l i::::i e:::::::::::::::::e
A:::::::::::::::::::::A l::::l l::::l i::::i e::::::eeeeeeeeeee
A:::::AAAAAAAAAAAAA:::::A l::::l l::::l i::::i e:::::::e
A:::::A A:::::A l::::::ll::::::li::::::ie::::::::e
A:::::A A:::::A l::::::ll::::::li::::::i e::::::::eeeeeeee
A:::::A A:::::A l::::::ll::::::li::::::i ee:::::::::::::e
AAAAAAA AAAAAAAlllllllllllllllliiiiiiii eeeeeeeeeeeeee
_____ _ _ _
/ __ \ | | | | (_)
| / \/ ___ _ __ ___ _ __ ___ __ _ _ __ __| | | | _ _ __ ___
| | / _ \| '_ ` _ \| '_ ` _ \ / _` | '_ \ / _` | | | | | '_ \ / _ \
| \__/\ (_) | | | | | | | | | | | (_| | | | | (_| | | |___| | | | | __/
\____/\___/|_| |_| |_|_| |_| |_|\__,_|_| |_|\__,_| \_____/_|_| |_|\___|
_____ _ __
|_ _| | | / _|
| | _ __ | |_ ___ _ __| |_ __ _ ___ ___
| || '_ \| __/ _ \ '__| _/ _` |/ __/ _ \
_| || | | | || __/ | | || (_| | (_| __/
\___/_| |_|\__\___|_| |_| \__,_|\___\___|
Allie has a rich command-line interface that exposes many of the API functions. In this section of the wiki you can learn how to use the Allie CLI.
To follow along with these examples, quickly seed some data (51 male files / 51 female files):
cd allie
cd datasets
python3 seed_test.py
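If you want to sanity-check the seeded data before moving on, here is a minimal sketch (assuming seed_test.py fills train_dir/males and train_dir/females under the allie root, the folders used in the examples below):

import os

# count the seeded samples; run this from the allie root folder
for label in ['males', 'females']:
    folder = os.path.join('train_dir', label)
    wavs = [f for f in os.listdir(folder) if f.lower().endswith('.wav')]
    print(label, '->', len(wavs), 'wav files')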
You can also use any of the links below to jump to the section of interest:
- help
- annotating files
- augmenting files
- cleaning files
- collecting data
- featurizing files
- training models
- model predictions
- preprocessing / making transformers
- unit tests
- visualizing data
- new settings
To get started, you can explore the Allie CLI commands by typing:
cd ~
cd allie
python3 allie.py -h
This should output the options for calling Allie's APIs from the command line:
Usage: allie.py [options]
Options:
-h, --help show this help message and exit
--c=command, --command=command
the target command (annotate API = 'annotate',
augmentation API = 'augment', cleaning API = 'clean',
datasets API = 'data', features API = 'features',
model prediction API = 'predict', preprocessing API =
'transform', model training API = 'train', testing
API = 'test', visualize API = 'visualize',
list/change default settings = 'settings')
--p=problemtype, --problemtype=problemtype
specify the problem type ('c' = classification or 'r'
= regression)
--s=sampletype, --sampletype=sampletype
specify the type files that you'd like to operate on
(e.g. 'audio', 'text', 'image', 'video', 'csv')
--n=common_name, --name=common_name
specify the common name for the model (e.g. 'gender'
for a male/female problem)
--i=class_, --class=class_
specify the class that you wish to annotate (e.g.
'male')
--d=dir, --dir=dir an array of the target directory (or directories) that
contains sample files for the annotation API,
prediction API, features API, augmentation API,
cleaning API, and preprocessing API (e.g.
'/Users/jim/desktop/allie/train_dir/teens/')
You can annotate a directory of audio files as a classification problem with the label male using a command like this:
python3 allie.py --command annotate --sampletype audio --problemtype c --class male --dir /Users/jim/desktop/allie/train_dir/males
It will then play back each audio file for you to annotate against the specified class:
0%| | 0/51 [00:00<?, ?it/s]playing file... 16.WAV
16.wav:
File Size: 137k Bit Rate: 256k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 16000Hz
Replaygain: off
Duration: 00:00:04.29
In:100% 00:00:04.29 [00:00:00.00] Out:189k [ | ] Hd:5.9 Clip:0
Done.
MALE label 1 (yes) or 0 (no)?
yes
error annotating, annotating again...
error - file 16.wav not recognized
2%|β | 1/51 [00:07<06:09, 7.39s/it]playing file... 17.WAV
17.wav:
File Size: 229k Bit Rate: 256k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 16000Hz
Replaygain: off
Duration: 00:00:07.17
In:100% 00:00:07.17 [00:00:00.00] Out:316k [ | ] Clip:0
To change to a regression problem, you just need to change the problemtype to r and the class (--i) to a regression target (e.g. age):
python3 allie.py --command annotate --sampletype audio --problemtype r --i age --dir /Users/jim/desktop/allie/train_dir/males
This similarly allows you to annotate for regression problems:
0%| | 0/51 [00:00<?, ?it/s]playing file... 16.WAV
16.wav:
File Size: 137k Bit Rate: 256k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 16000Hz
Replaygain: off
Duration: 00:00:04.29
In:100% 00:00:04.29 [00:00:00.00] Out:189k [ | ] Hd:5.9 Clip:0
Done.
AGE value?
50
[{'age': {'value': 50.0, 'datetime': '2020-08-07 12:22:06.180569', 'filetype': 'audio', 'file': '16.wav', 'problemtype': 'r', 'annotate_dir': '/Users/jim/desktop/allie/train_dir/males'}}]
2%|β | 1/51 [00:11<09:37, 11.55s/it]playing file... 17.WAV
17.wav:
File Size: 229k Bit Rate: 256k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 16000Hz
Replaygain: off
Duration: 00:00:07.17
In:100% 00:00:07.17 [00:00:00.00] Out:316k [ | ] Clip:0
Done.
AGE value?
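Each annotation is stored as a small record like the one printed above. A minimal sketch of reading such a record back in Python (the record is copied from the session output; the on-disk storage format is not shown here):

# an annotation record as printed by the annotate API above
record = {'age': {'value': 50.0,
                  'datetime': '2020-08-07 12:22:06.180569',
                  'filetype': 'audio',
                  'file': '16.wav',
                  'problemtype': 'r',
                  'annotate_dir': '/Users/jim/desktop/allie/train_dir/males'}}

label = list(record.keys())[0]
print(record[label]['file'], '->', label, '=', record[label]['value'])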
You can augment data via the default augmentation settings (default_audio_augmenters in settings.json) like this:
python3 allie.py --command augment --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males --dir /Users/jim/desktop/allie/train_dir/females
You now have an augmented set of files in both directories:
males: 0%| | 0/52 [00:00<?, ?it/s](87495,)
males: 2%|β | 1/52 [00:00<00:46, 1.09it/s](88906,)
males: 4%|ββ | 2/52 [00:01<00:34, 1.44it/s](94551,)
males: 6%|βββ | 3/52 [00:01<00:26, 1.87it/s](90317,)
males: 8%|βββ | 4/52 [00:01<00:20, 2.38it/s](90317,)
males: 10%|ββββ | 5/52 [00:01<00:16, 2.79it/s](158055,)
males: 12%|βββββ | 6/52 [00:02<00:16, 2.73it/s](114308,)
males: 13%|βββββ | 7/52 [00:02<00:15, 2.82it/s](104429,)
males: 15%|ββββββ | 8/52 [00:02<00:14, 2.98it/s](104429,)
males: 17%|βββββββ | 9/52 [00:02<00:12, 3.37it/s](129831,)
males: 19%|βββββββ | 10/52 [00:03<00:12, 3.38it/s](228615,)
males: 21%|ββββββββ | 11/52 [00:03<00:13, 3.08it/s](103018,)
males: 23%|βββββββββ | 12/52 [00:03<00:13, 2.96it/s](101607,)
males: 25%|βββββββββ | 13/52 [00:04<00:12, 3.09it/s](87495,)
males: 27%|ββββββββββ | 14/52 [00:04<00:10, 3.54it/s](94551,)
males: 29%|βββββββββββ | 15/52 [00:04<00:09, 3.75it/s](129831,)
males: 31%|βββββββββββ | 16/52 [00:04<00:09, 3.81it/s](91728,)
males: 33%|ββββββββββββ | 17/52 [00:05<00:08, 4.30it/s](198980,)
males: 35%|βββββββββββββ | 18/52 [00:05<00:08, 3.85it/s](143943,)
males: 37%|ββββββββββββββ | 19/52 [00:05<00:08, 3.76it/s](124186,)
males: 38%|ββββββββββββββ | 20/52 [00:05<00:08, 3.82it/s](114308,)
males: 40%|βββββββββββββββ | 21/52 [00:06<00:07, 3.93it/s](107252,)
males: 42%|ββββββββββββββββ | 22/52 [00:06<00:07, 4.25it/s](97373,)
males: 44%|ββββββββββββββββ | 23/52 [00:06<00:06, 4.62it/s](541901,)
males: 46%|βββββββββββββββββ | 24/52 [00:07<00:12, 2.28it/s](203213,)
males: 48%|ββββββββββββββββββ | 25/52 [00:07<00:11, 2.39it/s](214503,)
males: 50%|ββββββββββββββββββ | 26/52 [00:08<00:09, 2.61it/s](94551,)
males: 52%|βββββββββββββββββββ | 27/52 [00:08<00:08, 3.08it/s](111485,)
males: 54%|ββββββββββββββββββββ | 28/52 [00:08<00:07, 3.29it/s](303408,)
males: 56%|ββββββββββββββββββββ | 29/52 [00:09<00:08, 2.71it/s](155232,)
males: 58%|βββββββββββββββββββββ | 30/52 [00:09<00:07, 3.02it/s](94551,)
males: 60%|ββββββββββββββββββββββ | 31/52 [00:09<00:06, 3.45it/s](90317,)
males: 62%|βββββββββββββββββββββββ | 32/52 [00:09<00:05, 3.84it/s](117130,)
males: 63%|βββββββββββββββββββββββ | 33/52 [00:09<00:04, 4.05it/s](128420,)
males: 65%|ββββββββββββββββββββββββ | 34/52 [00:10<00:04, 3.94it/s](115719,)
males: 67%|βββββββββββββββββββββββββ | 35/52 [00:10<00:04, 3.97it/s](134064,)
males: 69%|βββββββββββββββββββββββββ | 36/52 [00:10<00:04, 3.70it/s](152410,)
males: 71%|ββββββββββββββββββββββββββ | 37/52 [00:10<00:03, 3.79it/s](145354,)
males: 73%|βββββββββββββββββββββββββββ | 38/52 [00:11<00:03, 3.64it/s](90317,)
males: 75%|βββββββββββββββββββββββββββ | 39/52 [00:11<00:03, 3.82it/s](108663,)
males: 77%|ββββββββββββββββββββββββββββ | 40/52 [00:11<00:02, 4.13it/s](119952,)
males: 79%|βββββββββββββββββββββββββββββ | 41/52 [00:11<00:02, 4.02it/s](108663,)
males: 81%|βββββββββββββββββββββββββββββ | 42/52 [00:12<00:02, 4.35it/s](115719,)
males: 83%|ββββββββββββββββββββββββββββββ | 43/52 [00:12<00:02, 4.26it/s](124186,)
males: 85%|βββββββββββββββββββββββββββββββ | 44/52 [00:12<00:01, 4.51it/s](94551,)
males: 87%|ββββββββββββββββββββββββββββββββ | 45/52 [00:12<00:01, 4.72it/s](136887,)
males: 88%|ββββββββββββββββββββββββββββββββ | 46/52 [00:13<00:01, 4.54it/s](136887,)
males: 90%|βββββββββββββββββββββββββββββββββ | 47/52 [00:13<00:01, 4.41it/s](121364,)
males: 92%|ββββββββββββββββββββββββββββββββββ | 48/52 [00:13<00:00, 4.64it/s](403604,)
males: 94%|ββββββββββββββββββββββββββββββββββ | 49/52 [00:14<00:01, 2.06it/s](94551,)
males: 96%|βββββββββββββββββββββββββββββββββββ | 50/52 [00:15<00:00, 2.09it/s](396548,)
males: 100%|ββββββββββββββββββββββββββββββββββββ| 52/52 [00:16<00:00, 3.15it/s]
females: 0%| | 0/51 [00:00<?, ?it/s](208858,)
females: 2%|β | 1/51 [00:01<00:50, 1.01s/it](224381,)
females: 4%|ββ | 2/51 [00:01<00:39, 1.23it/s](90317,)
females: 6%|ββ | 3/51 [00:01<00:30, 1.59it/s](156644,)
females: 8%|βββ | 4/51 [00:01<00:24, 1.89it/s](598349,)
females: 10%|ββββ | 5/51 [00:02<00:30, 1.50it/s](93140,)
females: 12%|ββββ | 6/51 [00:03<00:23, 1.89it/s](248372,)
females: 14%|βββββ | 7/51 [00:03<00:21, 2.02it/s](129831,)
females: 16%|ββββββ | 8/51 [00:03<00:18, 2.37it/s](196157,)
females: 18%|βββββββ | 9/51 [00:04<00:17, 2.47it/s](213092,)
females: 20%|βββββββ | 10/51 [00:04<00:15, 2.60it/s](107252,)
females: 22%|ββββββββ | 11/51 [00:04<00:12, 3.12it/s](104429,)
females: 24%|ββββββββ | 12/51 [00:04<00:11, 3.45it/s](129831,)
females: 25%|βββββββββ | 13/51 [00:05<00:09, 3.82it/s](118541,)
females: 27%|ββββββββββ | 14/51 [00:05<00:09, 4.07it/s](98784,)
females: 29%|ββββββββββ | 15/51 [00:05<00:08, 4.19it/s](103018,)
females: 31%|βββββββββββ | 16/51 [00:05<00:08, 4.22it/s](90317,)
females: 33%|ββββββββββββ | 17/51 [00:05<00:07, 4.31it/s](249783,)
females: 35%|ββββββββββββ | 18/51 [00:06<00:08, 3.85it/s](124186,)
females: 37%|βββββββββββββ | 19/51 [00:06<00:08, 3.88it/s](324576,)
females: 39%|ββββββββββββββ | 20/51 [00:06<00:09, 3.13it/s](143943,)
females: 41%|ββββββββββββββ | 21/51 [00:07<00:09, 3.22it/s](93140,)
females: 43%|βββββββββββββββ | 22/51 [00:07<00:07, 3.73it/s](153821,)
females: 45%|ββββββββββββββββ | 23/51 [00:07<00:07, 3.62it/s](156644,)
females: 47%|ββββββββββββββββ | 24/51 [00:07<00:07, 3.60it/s](321754,)
females: 49%|βββββββββββββββββ | 25/51 [00:08<00:08, 3.06it/s](589882,)
females: 51%|ββββββββββββββββββ | 26/51 [00:09<00:12, 2.01it/s](242727,)
females: 53%|ββββββββββββββββββ | 27/51 [00:09<00:11, 2.09it/s](93140,)
females: 55%|βββββββββββββββββββ | 28/51 [00:09<00:09, 2.55it/s](104429,)
females: 57%|ββββββββββββββββββββ | 29/51 [00:10<00:07, 2.90it/s](235671,)
females: 59%|ββββββββββββββββββββ | 30/51 [00:10<00:07, 2.86it/s](101607,)
females: 61%|βββββββββββββββββββββ | 31/51 [00:10<00:06, 3.30it/s](87495,)
females: 63%|ββββββββββββββββββββββ | 32/51 [00:10<00:05, 3.80it/s](101607,)
females: 65%|ββββββββββββββββββββββ | 33/51 [00:11<00:04, 4.22it/s](122775,)
females: 67%|βββββββββββββββββββββββ | 34/51 [00:11<00:04, 4.23it/s](101607,)
females: 69%|ββββββββββββββββββββββββ | 35/51 [00:11<00:03, 4.50it/s](91728,)
females: 71%|ββββββββββββββββββββββββ | 36/51 [00:11<00:03, 4.59it/s](98784,)
females: 73%|βββββββββββββββββββββββββ | 37/51 [00:11<00:03, 4.62it/s](87495,)
females: 75%|ββββββββββββββββββββββββββ | 38/51 [00:12<00:02, 5.01it/s](166522,)
females: 76%|ββββββββββββββββββββββββββ | 39/51 [00:12<00:02, 4.08it/s](134064,)
females: 78%|βββββββββββββββββββββββββββ | 40/51 [00:12<00:03, 3.43it/s](118541,)
females: 80%|ββββββββββββββββββββββββββββ | 41/51 [00:13<00:03, 3.26it/s](149588,)
females: 82%|ββββββββββββββββββββββββββββ | 42/51 [00:13<00:02, 3.37it/s](206036,)
females: 84%|βββββββββββββββββββββββββββββ | 43/51 [00:13<00:02, 3.28it/s](87495,)
females: 86%|ββββββββββββββββββββββββββββββ | 44/51 [00:13<00:01, 3.76it/s](211680,)
females: 88%|ββββββββββββββββββββββββββββββ | 45/51 [00:14<00:01, 3.45it/s](325988,)
females: 90%|βββββββββββββββββββββββββββββββ | 46/51 [00:14<00:01, 2.55it/s](241316,)
females: 92%|ββββββββββββββββββββββββββββββββ | 47/51 [00:15<00:01, 2.71it/s](87495,)
females: 94%|ββββββββββββββββββββββββββββββββ | 48/51 [00:15<00:00, 3.20it/s](90317,)
females: 96%|βββββββββββββββββββββββββββββββββ | 49/51 [00:15<00:00, 3.63it/s](90317,)
females: 98%|ββββββββββββββββββββββββββββββββββ| 50/51 [00:15<00:00, 4.10it/s](101607,)
females: 100%|ββββββββββββββββββββββββββββββββββ| 51/51 [00:15<00:00, 3.20it/s]
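Which augmenters run is controlled by settings.json. A minimal sketch for checking the audio augmentation defaults (assuming settings.json sits at the allie root, as used elsewhere in this wiki):

import json

# inspect the augmentation-related defaults in settings.json
settings = json.load(open('settings.json'))
print(settings['default_audio_augmenters'])  # e.g. ['augment_tsaug']
print(settings['augment_data'])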
You can clean data via the default cleaning settings (default_audio_cleaners in settings.json) like this:
python3 allie.py --command clean --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males --dir /Users/jim/desktop/allie/train_dir/females
You now have a set of cleaned files in both directories.
males: 0%| | 0/102 [00:00<?, ?it/s]ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '6ab789e1-5994-4796-b4ab-63ff9f20cf09.wav':
Metadata:
encoder : Lavf58.29.100
Duration: 00:00:04.10, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '6ab789e1-5994-4796-b4ab-63ff9f20cf09_cleaned.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 128kB time=00:00:04.09 bitrate= 256.2kbits/s speed=2.32e+03x
video:0kB audio:128kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.059509%
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : 5.0
Input #0, wav, from '8caa6f96-04e8-48ee-8817-8b9da97734b2.wav':
Duration: 00:00:08.00, bitrate: 1764 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, 5.0, s16, 1764 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '8caa6f96-04e8-48ee-8817-8b9da97734b2_cleaned.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 250kB time=00:00:08.00 bitrate= 256.1kbits/s speed= 276x
video:0kB audio:250kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.030469%
males: 2%|β | 2/102 [00:00<00:09, 10.18it/s]ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '49b96c77-5971-409a-9cb0-a2abfd9b1f37.wav':
Metadata:
encoder : Lavf58.29.100
Duration: 00:00:04.74, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '49b96c77-5971-409a-9cb0-a2abfd9b1f37_cleaned.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 148kB time=00:00:04.73 bitrate= 256.1kbits/s speed=3.48e+03x
video:0kB audio:148kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.051467%
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '06499054-2859-4861-88d8-841fbaec0365.wav':
Metadata:
encoder : Lavf58.29.100
Duration: 00:00:06.21, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '06499054-2859-4861-88d8-841fbaec0365_cleaned.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 194kB time=00:00:06.20 bitrate= 256.1kbits/s speed=4.28e+03x
video:0kB audio:194kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.039264%
males: 4%|ββ | 4/102 [00:00<00:08, 11.29it/s]ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
...
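Cleaned copies are written with a _cleaned.wav suffix, as shown in the ffmpeg output above. A quick, hypothetical check that the cleaned files exist:

import glob

for folder in ['/Users/jim/desktop/allie/train_dir/males',
               '/Users/jim/desktop/allie/train_dir/females']:
    cleaned = glob.glob(folder + '/*_cleaned.wav')
    print(folder, '->', len(cleaned), 'cleaned files')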
You can call the Allie datasets API with this command:
python3 allie.py --command data
You can then download a dataset by following the instructions on its website. Note that each dataset has its own download procedure, so Allie only takes you to the relevant website and leaves the download to you; in future versions of Allie, these datasets may be downloadable directly through an API.
/usr/local/lib/python3.7/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
what dataset would you like to download? (1-audio, 2-text, 3-image, 4-video, 5-csv)
1
found 34 datasets...
----------------------------
here are the available AUDIO datasets
----------------------------
TIMIT dataset
Parkinson's speech dataset
ISOLET Data Set
AudioSet
Multimodal EmotionLines Dataset (MELD)
Free Spoken Digit Dataset
Speech Accent Archive
2000 HUB5 English
Emotional Voice dataset - Nature
LJ Speech
VoxForge
Million Song Dataset
Free Music Archive
Common Voice
Spoken Commands dataset
Bird audio detection challenge
Environmental audio dataset
Urban Sound Dataset
Ted-LIUM
Noisy Dataset
Librispeech
Emotional Voices Database
CMU Wilderness
Arabic Speech Corpus
Flickr Audio Caption
CHIME
Tatoeba
Freesound dataset
Spoken Wikipeida Corpora
Karoldvl-ESC
Zero Resource Speech Challenge
Speech Commands Dataset
Persian Consonant Vowel Combination (PCVC) Speech Dataset
VoxCeleb
what audio dataset would you like to download?
Speech Commmands
found dataset: Speech Commands Dataset
The dataset (1.4 GB) has 65,000 one-second long utterances of 30 short words, by thousands of different people, contributed by members of the public through the AIY website.
just confirming, do you want to download the Speech Commands Dataset dataset? (Y - yes, N - no)
yes
You can featurize data in the same way as augmenting and cleaning:
python3 allie.py --command features --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males --dir /Users/jim/desktop/allie/train_dir/females
This will featurize both folders with the default_audio_features specified in settings.json.
males: 0%| | 0/102 [00:00<?, ?it/s]deepspeech_dict transcribing: 17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.wav
--2020-08-07 12:29:42-- https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/db3b3f80-84bd-11ea-93d7-1ddb76a21efe?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T162942Z&X-Amz-Expires=300&X-Amz-Signature=75a04415e8839d00e611a7414d420fc4a1a465de88a93e80e9417ee7e55c4325&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.pbmm&response-content-type=application%2Foctet-stream [following]
--2020-08-07 12:29:42-- https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/db3b3f80-84bd-11ea-93d7-1ddb76a21efe?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T162942Z&X-Amz-Expires=300&X-Amz-Signature=75a04415e8839d00e611a7414d420fc4a1a465de88a93e80e9417ee7e55c4325&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.pbmm&response-content-type=application%2Foctet-stream
Resolving github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)... 52.216.204.83
Connecting to github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|52.216.204.83|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 188916323 (180M) [application/octet-stream]
Saving to: 'deepspeech-0.7.0-models.pbmm'
deepspeech-0.7.0-mo 100%[===================>] 180.16M 6.88MB/s in 18s
2020-08-07 12:30:00 (9.94 MB/s) - 'deepspeech-0.7.0-models.pbmm' saved [188916323/188916323]
--2020-08-07 12:30:00-- https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/49dcc500-84df-11ea-9cb6-ec1d98c50dd4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T163001Z&X-Amz-Expires=300&X-Amz-Signature=ed079c1be3b63caf76b2daf1ad6d62537d0e4a6aa856c2428995993484bd2872&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.scorer&response-content-type=application%2Foctet-stream [following]
--2020-08-07 12:30:01-- https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/49dcc500-84df-11ea-9cb6-ec1d98c50dd4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T163001Z&X-Amz-Expires=300&X-Amz-Signature=ed079c1be3b63caf76b2daf1ad6d62537d0e4a6aa856c2428995993484bd2872&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.scorer&response-content-type=application%2Foctet-stream
Resolving github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)... 52.216.236.67
Connecting to github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|52.216.236.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 953363776 (909M) [application/octet-stream]
Saving to: 'deepspeech-0.7.0-models.scorer'
deepspeech-0.7.0-mo 100%[===================>] 909.20M 10.4MB/s in 88s
2020-08-07 12:31:29 (10.3 MB/s) - 'deepspeech-0.7.0-models.scorer' saved [953363776/953363776]
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.wav':
Metadata:
encoder : Lavf58.45.100
Duration: 00:00:02.00, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned_newaudio.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 63kB time=00:00:02.00 bitrate= 256.3kbits/s speed=1.24e+03x
video:0kB audio:62kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.121875%
deepspeech --model /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm --scorer /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer --audio "17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned_newaudio.wav" >> "17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.txt"
Loading model from file /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm
TensorFlow: v1.15.0-24-gceb46aae58
DeepSpeech: v0.7.4-0-gfcd9563f
2020-08-07 12:31:30.122614: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.0155s.
Loading scorer from files /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer
Loaded scorer in 0.00188s.
Running inference.
Inference took 2.470s for 2.000s audio file.
DEEPSPEECH_DICT
-->
librosa featurizing: 17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.wav
/usr/local/lib/python3.7/site-packages/librosa/beat.py:306: DeprecationWarning: np.asscalar(a) is deprecated since NumPy v1.16, use a.item() instead
hop_length=hop_length))
[15.0, 44.8, 27.914631766632116, 82.0, 3.0, 52.0, 143.5546875, 1.236462812624379, 0.7251315935053164, 3.334862198133343, 0.0, 1.0859751751040547, 1.0, 0.0, 1.0, 1.0, 1.0, 0.9021154304582399, 0.011871022692166161, 0.9248579351103645, 0.8845252162754503, 0.9007761885347338, 0.8005566025338086, 0.02351835274803656, 0.846290326519876, 0.7666702190624551, 0.7974848758054224, 0.765299804936602, 0.028837831871710528, 0.8210615152089579, 0.723447224850167, 0.7616938878984344, 0.7718090402633874, 0.030151369356669896, 0.8289804789632119, 0.7266534111141562, 0.7686879868973036, 0.7936400196140749, 0.031036953487073464, 0.8507660198788851, 0.7451751937203362, 0.7913818436015906, 0.774629021009383, 0.03215688854479966, 0.8344764767038448, 0.7251595846469461, 0.7719273208794816, 0.7428815548766532, 0.035011430707789774, 0.8084967674473139, 0.6894574657533005, 0.7397077230156677, 0.7471335383622011, 0.03463006342656048, 0.811789409800113, 0.6939570618543236, 0.7441413516783196, 0.7610523695125583, 0.033862478442252354, 0.8235134515207413, 0.7081080052043068, 0.7585638476404928, 0.789820226591492, 0.03384767249624557, 0.8500833679586846, 0.7343945280617632, 0.7885333494859879, 0.8059179015451146, 0.03157630618281619, 0.8615948245740482, 0.7535417225113215, 0.8050269828120753, 0.7935417840439638, 0.031145337061902156, 0.8492406377587695, 0.7427011806455776, 0.7922497709713059, -381.8200653053766, 23.009903107621383, -321.89837910119815, -441.6259904054828, -379.3488122081372, 149.04439804929382, 15.356164419049225, 172.85739766597234, 110.28800952925451, 153.08285883743733, -32.84528000164682, 13.009141709326732, -2.154076829327365, -64.91296682470796, -30.99198128144861, 40.70623621978138, 17.548974836043755, 73.18507958780387, 8.337746078102892, 40.63827945609428, -52.86069958238985, 13.478379908189092, -27.553997955729045, -87.5715612441206, -51.58811003068236, 31.42738418944771, 6.795009930398713, 49.46758858300626, 18.299603573231376, 31.6997571738992, -35.82303959204243, 8.486268198834747, -14.57639089253998, -55.40606622608898, -34.90037102114016, 15.955209103884254, 8.934103499373093, 44.50048758909077, -5.494667426263748, 15.650980212978268, -17.338170873356056, 5.727612678025376, -3.1374263092378176, -33.480526476176806, -17.446097772263684, -0.4383376230378039, 6.2672128421452875, 15.720566519612913, -14.330033302127145, -0.244906066419725, -1.467000393875816, 6.138683208427911, 15.463481385114175, -15.812384056333133, -2.0209526024605786, -6.972125329645311, 4.816668688995419, 6.338229281172403, -17.349015718809397, -6.496008401327131, 6.688298302126343, 6.351559382372022, 20.66368904480788, -9.92049214526477, 7.446377744032864, -3.146738423029468e-05, 1.0080761356243433e-05, -1.2325730935412251e-05, -6.256804392224112e-05, -3.133272391000924e-05, 0.25608241330935894, 0.07833864054721744, 0.5017299271817217, 0.1071248558508767, 0.2536783052990988, 1698.1068839811428, 247.34317762284775, 2332.782952940378, 1298.7956436686768, 1652.8623945849333, 1916.493769636128, 217.42031398082, 2404.6618216796505, 1537.6071015613695, 1882.6348073434424, 15.195013434264473, 4.030761691034897, 26.309048213128285, 5.981068616687288, 15.426375628186392, 0.0004132288449909538, 0.000528702512383461, 0.004117688629776239, 7.970369915710762e-05, 0.0002881725667975843, 3901.964911099138, 915.5047430674098, 5792.431640625, 2196.38671875, 3757.5439453125, 0.0726977819683908, 0.013766841258384812, 0.11962890625, 0.03857421875, 0.0712890625, 0.009517103433609009, 0.0026407463010400534, 0.01786264032125473, 
0.004661164246499538, 0.009199550375342369]
males: 1%|β | 1/102 [01:51<3:08:13, 111.82s/it]deepspeech_dict transcribing: d2a57cd6-f757-435d-9768-cac1667f79e1_cleaned.wav
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'd2a57cd6-f757-435d-9768-cac1667f79e1_cleaned.wav':
Metadata:
encoder : Lavf58.45.100
Duration: 00:00:08.00, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'd2a57cd6-f757-435d-9768-cac1667f79e1_cleaned_newaudio.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 250kB time=00:00:08.00 bitrate= 256.1kbits/s speed=1.37e+03x
video:0kB audio:250kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.030469%
deepspeech --model /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm --scorer /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer --audio "d2a57cd6-f757-435d-9768-cac1667f79e1_cleaned_newaudio.wav" >> "d2a57cd6-f757-435d-9768-cac1667f79e1_cleaned.txt"
Loading model from file /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm
TensorFlow: v1.15.0-24-gceb46aae58
DeepSpeech: v0.7.4-0-gfcd9563f
2020-08-07 12:31:34.542205: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.0235s.
Loading scorer from files /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer
Loaded scorer in 0.000517s.
Running inference.
...
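Featurization writes its results into a JSON file alongside each sample, using the standard Allie dictionary shown in the prediction example further below. A minimal sketch for reading the librosa features back out ('sample.json' is a placeholder name):

import json

# 'sample.json' stands in for any featurized sample's JSON file
data = json.load(open('sample.json'))
librosa = data['features']['audio']['librosa_features']
for name, value in zip(librosa['labels'], librosa['features']):
    print(name, value)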
You can train machine learning models quickly with:
python3 allie.py --command train
This will then walk you through training a model from the CLI. Note that since we have already featurized the folders, the modeling process will be faster.
is this a classification (c) or regression (r) problem?
c
what problem are you solving? (1-audio, 2-text, 3-image, 4-video, 5-csv)
1
OK cool, we got you modeling audio files
how many classes would you like to model? (2 available)
2
these are the available classes:
['females', 'males']
what is class #1
males
what is class #2
females
what is the 1-word common name for the problem you are working on? (e.g. gender for male/female classification)
gender
-----------------------------------
LOADING MODULES
-----------------------------------
Requirement already satisfied: scikit-learn==0.22.2.post1 in /usr/local/lib/python3.7/site-packages (0.22.2.post1)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (0.15.1)
Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.4.1)
Requirement already satisfied: numpy>=1.11.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.18.4)
WARNING: You are using pip version 20.2; however, version 20.2.1 is available.
You should consider upgrading via the '/usr/local/opt/python/bin/python3.7 -m pip install --upgrade pip' command.
-----------------------------------
______ _____ ___ _____ _ _______ _____ ___________ _ _ _____
| ___| ___|/ _ \_ _| | | | ___ \_ _|___ /_ _| \ | | __ \
| |_ | |__ / /_\ \| | | | | | |_/ / | | / / | | | \| | | \/
| _| | __|| _ || | | | | | / | | / / | | | . ` | | __
| | | |___| | | || | | |_| | |\ \ _| |_./ /____| |_| |\ | |_\ \
\_| \____/\_| |_/\_/ \___/\_| \_|\___/\_____/\___/\_| \_/\____/
______ ___ _____ ___
| _ \/ _ \_ _/ _ \
| | | / /_\ \| |/ /_\ \
| | | | _ || || _ |
| |/ /| | | || || | | |
|___/ \_| |_/\_/\_| |_/
-----------------------------------
-----------------------------------
FEATURIZING MALES
-----------------------------------
males: 0%| | 0/51 [00:00<?, ?it/s]librosa featurizing: 38.wav
...
... [skipping a lot of output in terminal]
...
WARNING: You are using pip version 20.2; however, version 20.2.1 is available.
You should consider upgrading via the '/usr/local/opt/python/bin/python3.7 -m pip install --upgrade pip' command.
Warning: xgboost.XGBClassifier is not available and will not be used by TPOT.
Generation 1 - Current best internal CV score: 0.8882352941176471
Generation 2 - Current best internal CV score: 0.8882352941176471
To make predictions, you need a model trained with Allie in the ./models/[sampletype]_models directory (for example, the audio model detecting gender lives under ./models/audio_models). Since we have already trained the gender model above, we just need to put a sample file into the ./load_dir folder to make a prediction.
Now call the CLI:
python3 allie.py --command predict
A gender prediction is then made:
'gender_tpot_classifier']
['gender_tpot_classifier']
[]
[]
[]
[]
[]
[]
error
------------------------------
IDENTIFIED MODELS
------------------------------
{'audio_models': ['gender_tpot_classifier'], 'text_models': [], 'image_models': [], 'video_models': [], 'csv_models': []}
-----------------------
FEATURIZING AUDIO_MODELS
-----------------------
/Users/jim/Desktop/allie/allie/features/audio_features
load_dir: 100%|βββββββββββββββββββββββββββββββββ| 3/3 [00:00<00:00, 1154.71it/s]
-----------------------
MODELING AUDIO_MODELS
-----------------------
audio_models
--> predicting gender_tpot_classifier
gender_tpot_classifier_transform.pickle
tpot
audio_models
['jim.wav', 'jim.json', 'README.md']
['jim.json']
audio
{'sampletype': 'audio', 'transcripts': {'audio': {}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'features': {'audio': {'librosa_features': {'features': [48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, -14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 
484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955], 'labels': ['onset_length', 'onset_detect_mean', 'onset_detect_std', 'onset_detect_maxv', 'onset_detect_minv', 'onset_detect_median', 'tempo', 'onset_strength_mean', 'onset_strength_std', 'onset_strength_maxv', 'onset_strength_minv', 'onset_strength_median', 'rhythm_0_mean', 'rhythm_0_std', 'rhythm_0_maxv', 'rhythm_0_minv', 'rhythm_0_median', 'rhythm_1_mean', 'rhythm_1_std', 'rhythm_1_maxv', 'rhythm_1_minv', 'rhythm_1_median', 'rhythm_2_mean', 'rhythm_2_std', 'rhythm_2_maxv', 'rhythm_2_minv', 'rhythm_2_median', 'rhythm_3_mean', 'rhythm_3_std', 'rhythm_3_maxv', 'rhythm_3_minv', 'rhythm_3_median', 'rhythm_4_mean', 'rhythm_4_std', 'rhythm_4_maxv', 'rhythm_4_minv', 'rhythm_4_median', 'rhythm_5_mean', 'rhythm_5_std', 'rhythm_5_maxv', 'rhythm_5_minv', 'rhythm_5_median', 'rhythm_6_mean', 'rhythm_6_std', 'rhythm_6_maxv', 'rhythm_6_minv', 'rhythm_6_median', 'rhythm_7_mean', 'rhythm_7_std', 'rhythm_7_maxv', 'rhythm_7_minv', 'rhythm_7_median', 'rhythm_8_mean', 'rhythm_8_std', 'rhythm_8_maxv', 'rhythm_8_minv', 'rhythm_8_median', 'rhythm_9_mean', 'rhythm_9_std', 'rhythm_9_maxv', 'rhythm_9_minv', 'rhythm_9_median', 'rhythm_10_mean', 'rhythm_10_std', 'rhythm_10_maxv', 'rhythm_10_minv', 'rhythm_10_median', 'rhythm_11_mean', 'rhythm_11_std', 'rhythm_11_maxv', 'rhythm_11_minv', 'rhythm_11_median', 'rhythm_12_mean', 'rhythm_12_std', 'rhythm_12_maxv', 'rhythm_12_minv', 'rhythm_12_median', 'mfcc_0_mean', 'mfcc_0_std', 'mfcc_0_maxv', 'mfcc_0_minv', 'mfcc_0_median', 'mfcc_1_mean', 'mfcc_1_std', 'mfcc_1_maxv', 'mfcc_1_minv', 'mfcc_1_median', 'mfcc_2_mean', 'mfcc_2_std', 'mfcc_2_maxv', 'mfcc_2_minv', 'mfcc_2_median', 'mfcc_3_mean', 'mfcc_3_std', 'mfcc_3_maxv', 'mfcc_3_minv', 'mfcc_3_median', 'mfcc_4_mean', 'mfcc_4_std', 'mfcc_4_maxv', 'mfcc_4_minv', 'mfcc_4_median', 'mfcc_5_mean', 'mfcc_5_std', 'mfcc_5_maxv', 'mfcc_5_minv', 'mfcc_5_median', 'mfcc_6_mean', 'mfcc_6_std', 'mfcc_6_maxv', 'mfcc_6_minv', 'mfcc_6_median', 'mfcc_7_mean', 'mfcc_7_std', 'mfcc_7_maxv', 'mfcc_7_minv', 'mfcc_7_median', 'mfcc_8_mean', 'mfcc_8_std', 'mfcc_8_maxv', 'mfcc_8_minv', 'mfcc_8_median', 'mfcc_9_mean', 'mfcc_9_std', 'mfcc_9_maxv', 'mfcc_9_minv', 'mfcc_9_median', 'mfcc_10_mean', 'mfcc_10_std', 'mfcc_10_maxv', 'mfcc_10_minv', 'mfcc_10_median', 'mfcc_11_mean', 'mfcc_11_std', 'mfcc_11_maxv', 'mfcc_11_minv', 'mfcc_11_median', 'mfcc_12_mean', 'mfcc_12_std', 'mfcc_12_maxv', 'mfcc_12_minv', 'mfcc_12_median', 'poly_0_mean', 'poly_0_std', 'poly_0_maxv', 'poly_0_minv', 'poly_0_median', 'poly_1_mean', 'poly_1_std', 'poly_1_maxv', 'poly_1_minv', 'poly_1_median', 'spectral_centroid_mean', 'spectral_centroid_std', 'spectral_centroid_maxv', 'spectral_centroid_minv', 'spectral_centroid_median', 'spectral_bandwidth_mean', 'spectral_bandwidth_std', 'spectral_bandwidth_maxv', 'spectral_bandwidth_minv', 'spectral_bandwidth_median', 'spectral_contrast_mean', 'spectral_contrast_std', 'spectral_contrast_maxv', 'spectral_contrast_minv', 'spectral_contrast_median', 'spectral_flatness_mean', 'spectral_flatness_std', 'spectral_flatness_maxv', 'spectral_flatness_minv', 'spectral_flatness_median', 'spectral_rolloff_mean', 'spectral_rolloff_std', 'spectral_rolloff_maxv', 'spectral_rolloff_minv', 'spectral_rolloff_median', 'zero_crossings_mean', 'zero_crossings_std', 'zero_crossings_maxv', 'zero_crossings_minv', 'zero_crossings_median', 
'RMSE_mean', 'RMSE_std', 'RMSE_maxv', 'RMSE_minv', 'RMSE_median']}}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'models': {'audio': {'females': [{'sample type': 'audio', 'created date': '2020-08-07 12:43:08.041610', 'device info': {'time': '2020-08-07 12:43', 'timezone': ['EST', 'EDT'], 'operating system': 'Darwin', 'os release': '19.5.0', 'os version': 'Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64', 'cpu data': {'memory': [8589934592, 2205638656, 74.3, 4275126272, 94564352, 2113089536, 1905627136, 2162036736], 'cpu percent': 77.4, 'cpu times': [157014.42, 0.0, 65743.85, 543406.97], 'cpu count': 4, 'cpu stats': [73245, 527841, 971986315, 465241], 'cpu swap': [2147483648, 1335623680, 811859968, 62.2, 542256906240, 1029312512], 'partitions': [['/dev/disk1s6', '/', 'apfs', 'ro,local,rootfs,dovolfs,journaled,multilabel'], ['/dev/disk1s5', '/System/Volumes/Data', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s4', '/private/var/vm', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s1', '/Volumes/Macintosh HD - Data', 'apfs', 'rw,local,dovolfs,journaled,multilabel'], ['/dev/disk1s3', '/Volumes/Recovery', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel']], 'disk usage': [499963174912, 10985529344, 339299921920, 3.1], 'disk io counters': [11783216, 8846330, 592488718336, 504655544320, 11689626, 6788646], 'battery': [56, -2, True], 'boot time': 1596543104.0}, 'space left': 339.29992192}, 'session id': 'b4a4b8e8-d8cc-11ea-8e12-acde48001122', 'classes': ['males', 'females'], 'problem type': 'classification', 'model name': 'gender_tpot_classifier.pickle', 'model type': 'tpot', 'metrics': {'accuracy': 0.7, 'balanced_accuracy': 0.6666666666666667, 'precision': 0.6666666666666666, 'recall': 0.5, 'f1_score': 0.5714285714285715, 'f1_micro': 0.7, 'f1_macro': 0.6703296703296704, 'roc_auc': 0.6666666666666666, 'roc_auc_micro': 0.6666666666666666, 'roc_auc_macro': 0.6666666666666666, 'confusion_matrix': [[5, 1], [2, 2]], 'classification_report': ' precision recall f1-score support\n\n males 0.71 0.83 0.77 6\n females 0.67 0.50 0.57 4\n\n accuracy 0.70 10\n macro avg 0.69 0.67 0.67 10\nweighted avg 0.70 0.70 0.69 10\n'}, 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 
'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}, 'transformer name': 'gender_tpot_classifier_transform.pickle', 'training data': ['gender_all.csv', 'gender_train.csv', 'gender_test.csv', 'gender_all_transformed.csv', 'gender_train_transformed.csv', 'gender_test_transformed.csv'], 'sample X_test': [-0.5905520916740901, -0.5769542196749199, -0.4206917011248405, 0.7529668802365918, 0.09174953222020645, -0.6629166129037093, 0.7375134654386479, -0.15802980098738767, -0.15805735479905567, -0.6081655509519702, 0.9011267037791326, 1.0261687941190032, 0.20454394530543832, -0.8448466788837747, 0.35198391256345163, 0.8316422800614328, -0.8654169740282134, 0.8593094113006882, 0.8285863806706361, -0.7025236163101211], 'sample y_test': 0}]}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'labels': ['load_dir'], 'errors': [], 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': False, 'select_features': False, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False, 'default_dimensionionality_reducer': ['pca']}}
['librosa_features']
audio
Pipeline(memory=None,
steps=[('standard_scaler',
StandardScaler(copy=True, with_mean=True, with_std=True)),
('rfe',
RFE(estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3,
epsilon=0.1, gamma='scale', kernel='linear',
max_iter=-1, shrinking=True, tol=0.001,
verbose=False),
n_features_to_select=20, step=1, verbose=0))],
verbose=False)
[48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, -14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 
0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955]
[[ 0.06457865 -0.71415559 -1.50986974 -0.70250164 -1.28613767 0.25201476
1.62472075 1.76165016 -1.54345567 -2.54907196 1.38612324 2.71927715
0.06395576 -0.53985087 1.10550038 0.54977639 -0.55146853 3.84376773
1.70959641 -0.73457093]]
tpot
[1]
{'males': [0], 'females': [1]}
1
females
females
Note that predictions are stored within the standard data dictionary (as a .JSON file):
{'sampletype': 'audio', 'transcripts': {'audio': {}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'features': {'audio': {'librosa_features': {'features': [48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, -14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 
484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955], 'labels': ['onset_length', 'onset_detect_mean', 'onset_detect_std', 'onset_detect_maxv', 'onset_detect_minv', 'onset_detect_median', 'tempo', 'onset_strength_mean', 'onset_strength_std', 'onset_strength_maxv', 'onset_strength_minv', 'onset_strength_median', 'rhythm_0_mean', 'rhythm_0_std', 'rhythm_0_maxv', 'rhythm_0_minv', 'rhythm_0_median', 'rhythm_1_mean', 'rhythm_1_std', 'rhythm_1_maxv', 'rhythm_1_minv', 'rhythm_1_median', 'rhythm_2_mean', 'rhythm_2_std', 'rhythm_2_maxv', 'rhythm_2_minv', 'rhythm_2_median', 'rhythm_3_mean', 'rhythm_3_std', 'rhythm_3_maxv', 'rhythm_3_minv', 'rhythm_3_median', 'rhythm_4_mean', 'rhythm_4_std', 'rhythm_4_maxv', 'rhythm_4_minv', 'rhythm_4_median', 'rhythm_5_mean', 'rhythm_5_std', 'rhythm_5_maxv', 'rhythm_5_minv', 'rhythm_5_median', 'rhythm_6_mean', 'rhythm_6_std', 'rhythm_6_maxv', 'rhythm_6_minv', 'rhythm_6_median', 'rhythm_7_mean', 'rhythm_7_std', 'rhythm_7_maxv', 'rhythm_7_minv', 'rhythm_7_median', 'rhythm_8_mean', 'rhythm_8_std', 'rhythm_8_maxv', 'rhythm_8_minv', 'rhythm_8_median', 'rhythm_9_mean', 'rhythm_9_std', 'rhythm_9_maxv', 'rhythm_9_minv', 'rhythm_9_median', 'rhythm_10_mean', 'rhythm_10_std', 'rhythm_10_maxv', 'rhythm_10_minv', 'rhythm_10_median', 'rhythm_11_mean', 'rhythm_11_std', 'rhythm_11_maxv', 'rhythm_11_minv', 'rhythm_11_median', 'rhythm_12_mean', 'rhythm_12_std', 'rhythm_12_maxv', 'rhythm_12_minv', 'rhythm_12_median', 'mfcc_0_mean', 'mfcc_0_std', 'mfcc_0_maxv', 'mfcc_0_minv', 'mfcc_0_median', 'mfcc_1_mean', 'mfcc_1_std', 'mfcc_1_maxv', 'mfcc_1_minv', 'mfcc_1_median', 'mfcc_2_mean', 'mfcc_2_std', 'mfcc_2_maxv', 'mfcc_2_minv', 'mfcc_2_median', 'mfcc_3_mean', 'mfcc_3_std', 'mfcc_3_maxv', 'mfcc_3_minv', 'mfcc_3_median', 'mfcc_4_mean', 'mfcc_4_std', 'mfcc_4_maxv', 'mfcc_4_minv', 'mfcc_4_median', 'mfcc_5_mean', 'mfcc_5_std', 'mfcc_5_maxv', 'mfcc_5_minv', 'mfcc_5_median', 'mfcc_6_mean', 'mfcc_6_std', 'mfcc_6_maxv', 'mfcc_6_minv', 'mfcc_6_median', 'mfcc_7_mean', 'mfcc_7_std', 'mfcc_7_maxv', 'mfcc_7_minv', 'mfcc_7_median', 'mfcc_8_mean', 'mfcc_8_std', 'mfcc_8_maxv', 'mfcc_8_minv', 'mfcc_8_median', 'mfcc_9_mean', 'mfcc_9_std', 'mfcc_9_maxv', 'mfcc_9_minv', 'mfcc_9_median', 'mfcc_10_mean', 'mfcc_10_std', 'mfcc_10_maxv', 'mfcc_10_minv', 'mfcc_10_median', 'mfcc_11_mean', 'mfcc_11_std', 'mfcc_11_maxv', 'mfcc_11_minv', 'mfcc_11_median', 'mfcc_12_mean', 'mfcc_12_std', 'mfcc_12_maxv', 'mfcc_12_minv', 'mfcc_12_median', 'poly_0_mean', 'poly_0_std', 'poly_0_maxv', 'poly_0_minv', 'poly_0_median', 'poly_1_mean', 'poly_1_std', 'poly_1_maxv', 'poly_1_minv', 'poly_1_median', 'spectral_centroid_mean', 'spectral_centroid_std', 'spectral_centroid_maxv', 'spectral_centroid_minv', 'spectral_centroid_median', 'spectral_bandwidth_mean', 'spectral_bandwidth_std', 'spectral_bandwidth_maxv', 'spectral_bandwidth_minv', 'spectral_bandwidth_median', 'spectral_contrast_mean', 'spectral_contrast_std', 'spectral_contrast_maxv', 'spectral_contrast_minv', 'spectral_contrast_median', 'spectral_flatness_mean', 'spectral_flatness_std', 'spectral_flatness_maxv', 'spectral_flatness_minv', 'spectral_flatness_median', 'spectral_rolloff_mean', 'spectral_rolloff_std', 'spectral_rolloff_maxv', 'spectral_rolloff_minv', 'spectral_rolloff_median', 'zero_crossings_mean', 'zero_crossings_std', 'zero_crossings_maxv', 'zero_crossings_minv', 'zero_crossings_median', 
'RMSE_mean', 'RMSE_std', 'RMSE_maxv', 'RMSE_minv', 'RMSE_median']}}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'models': {'audio': {'females': [{'sample type': 'audio', 'created date': '2020-08-07 12:43:08.041610', 'device info': {'time': '2020-08-07 12:43', 'timezone': ['EST', 'EDT'], 'operating system': 'Darwin', 'os release': '19.5.0', 'os version': 'Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64', 'cpu data': {'memory': [8589934592, 2205638656, 74.3, 4275126272, 94564352, 2113089536, 1905627136, 2162036736], 'cpu percent': 77.4, 'cpu times': [157014.42, 0.0, 65743.85, 543406.97], 'cpu count': 4, 'cpu stats': [73245, 527841, 971986315, 465241], 'cpu swap': [2147483648, 1335623680, 811859968, 62.2, 542256906240, 1029312512], 'partitions': [['/dev/disk1s6', '/', 'apfs', 'ro,local,rootfs,dovolfs,journaled,multilabel'], ['/dev/disk1s5', '/System/Volumes/Data', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s4', '/private/var/vm', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s1', '/Volumes/Macintosh HD - Data', 'apfs', 'rw,local,dovolfs,journaled,multilabel'], ['/dev/disk1s3', '/Volumes/Recovery', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel']], 'disk usage': [499963174912, 10985529344, 339299921920, 3.1], 'disk io counters': [11783216, 8846330, 592488718336, 504655544320, 11689626, 6788646], 'battery': [56, -2, True], 'boot time': 1596543104.0}, 'space left': 339.29992192}, 'session id': 'b4a4b8e8-d8cc-11ea-8e12-acde48001122', 'classes': ['males', 'females'], 'problem type': 'classification', 'model name': 'gender_tpot_classifier.pickle', 'model type': 'tpot', 'metrics': {'accuracy': 0.7, 'balanced_accuracy': 0.6666666666666667, 'precision': 0.6666666666666666, 'recall': 0.5, 'f1_score': 0.5714285714285715, 'f1_micro': 0.7, 'f1_macro': 0.6703296703296704, 'roc_auc': 0.6666666666666666, 'roc_auc_micro': 0.6666666666666666, 'roc_auc_macro': 0.6666666666666666, 'confusion_matrix': [[5, 1], [2, 2]], 'classification_report': ' precision recall f1-score support\n\n males 0.71 0.83 0.77 6\n females 0.67 0.50 0.57 4\n\n accuracy 0.70 10\n macro avg 0.69 0.67 0.67 10\nweighted avg 0.70 0.70 0.69 10\n'}, 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 
'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}, 'transformer name': 'gender_tpot_classifier_transform.pickle', 'training data': ['gender_all.csv', 'gender_train.csv', 'gender_test.csv', 'gender_all_transformed.csv', 'gender_train_transformed.csv', 'gender_test_transformed.csv'], 'sample X_test': [-0.5905520916740901, -0.5769542196749199, -0.4206917011248405, 0.7529668802365918, 0.09174953222020645, -0.6629166129037093, 0.7375134654386479, -0.15802980098738767, -0.15805735479905567, -0.6081655509519702, 0.9011267037791326, 1.0261687941190032, 0.20454394530543832, -0.8448466788837747, 0.35198391256345163, 0.8316422800614328, -0.8654169740282134, 0.8593094113006882, 0.8285863806706361, -0.7025236163101211], 'sample y_test': 0}, {'sample type': 'audio', 'created date': '2020-08-07 12:43:08.041610', 'device info': {'time': '2020-08-07 12:43', 'timezone': ['EST', 'EDT'], 'operating system': 'Darwin', 'os release': '19.5.0', 'os version': 'Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64', 'cpu data': {'memory': [8589934592, 2205638656, 74.3, 4275126272, 94564352, 2113089536, 1905627136, 2162036736], 'cpu percent': 77.4, 'cpu times': [157014.42, 0.0, 65743.85, 543406.97], 'cpu count': 4, 'cpu stats': [73245, 527841, 971986315, 465241], 'cpu swap': [2147483648, 1335623680, 811859968, 62.2, 542256906240, 1029312512], 'partitions': [['/dev/disk1s6', '/', 'apfs', 'ro,local,rootfs,dovolfs,journaled,multilabel'], ['/dev/disk1s5', '/System/Volumes/Data', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s4', '/private/var/vm', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s1', '/Volumes/Macintosh HD - Data', 'apfs', 'rw,local,dovolfs,journaled,multilabel'], ['/dev/disk1s3', '/Volumes/Recovery', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel']], 'disk usage': [499963174912, 10985529344, 339299921920, 3.1], 'disk io counters': [11783216, 8846330, 592488718336, 504655544320, 11689626, 6788646], 'battery': [56, -2, True], 'boot time': 1596543104.0}, 'space left': 339.29992192}, 'session id': 'b4a4b8e8-d8cc-11ea-8e12-acde48001122', 'classes': ['males', 'females'], 'problem type': 'classification', 'model name': 'gender_tpot_classifier.pickle', 'model type': 'tpot', 'metrics': {'accuracy': 0.7, 'balanced_accuracy': 0.6666666666666667, 'precision': 0.6666666666666666, 'recall': 0.5, 'f1_score': 0.5714285714285715, 'f1_micro': 0.7, 'f1_macro': 0.6703296703296704, 'roc_auc': 0.6666666666666666, 'roc_auc_micro': 0.6666666666666666, 'roc_auc_macro': 0.6666666666666666, 'confusion_matrix': [[5, 1], [2, 2]], 'classification_report': ' precision recall f1-score support\n\n males 0.71 0.83 0.77 6\n females 0.67 0.50 0.57 4\n\n accuracy 0.70 10\n macro avg 0.69 0.67 0.67 10\nweighted avg 0.70 0.70 0.69 10\n'}, 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': 
['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}, 'transformer name': 'gender_tpot_classifier_transform.pickle', 'training data': ['gender_all.csv', 'gender_train.csv', 'gender_test.csv', 'gender_all_transformed.csv', 'gender_train_transformed.csv', 'gender_test_transformed.csv'], 'sample X_test': [-0.5905520916740901, -0.5769542196749199, -0.4206917011248405, 0.7529668802365918, 0.09174953222020645, -0.6629166129037093, 0.7375134654386479, -0.15802980098738767, -0.15805735479905567, -0.6081655509519702, 0.9011267037791326, 1.0261687941190032, 0.20454394530543832, -0.8448466788837747, 0.35198391256345163, 0.8316422800614328, -0.8654169740282134, 0.8593094113006882, 0.8285863806706361, -0.7025236163101211], 'sample y_test': 0}]}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'labels': ['load_dir'], 'errors': [], 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': False, 'select_features': False, 'test_size': 0.1, 'transcribe_audio': False, 
'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False, 'default_dimensionionality_reducer': ['pca']}}
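If you want to work with these predictions programmatically, you can load the resulting .JSON file and pull out the fields you need. Below is a minimal sketch; the filename sample.json is only a placeholder for whatever features file Allie wrote next to your audio sample:
import json

# load the standard Allie data dictionary written next to the audio sample
# (the filename 'sample.json' is a placeholder for this example)
with open('sample.json') as f:
    data = json.load(f)

# the sample type and the librosa features/labels computed during featurization
print(data['sampletype'])
features = data['features']['audio']['librosa_features']['features']
labels = data['features']['audio']['librosa_features']['labels']
print(len(features), 'features, e.g.', labels[0], '=', features[0])

# metadata and metrics for the model that made the prediction
model_info = data['models']['audio']['females'][0]
print(model_info['model name'], '-', model_info['metrics']['accuracy'])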
You can make a transformer to reduce or select features after folders of files have been featurized. Note that 'males' and 'females' are the two directories used to complete the transformation:
python3 allie.py --command transform --dir /Users/jim/desktop/allie/allie/train_dir/males --dir /Users/jim/desktop/allie/allie/train_dir/females --sampletype audio --problemtype c --name gender
This makes the transformer based on the defaults set in the settings.json file:
Requirement already satisfied: scikit-learn==0.22.2.post1 in /usr/local/lib/python3.7/site-packages (0.22.2.post1)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (0.15.1)
Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.4.1)
Requirement already satisfied: numpy>=1.11.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.18.4)
WARNING: You are using pip version 20.2; however, version 20.2.1 is available.
You should consider upgrading via the '/usr/local/opt/python/bin/python3.7 -m pip install --upgrade pip' command.
/Users/jim/Desktop/allie/allie
True
False
True
['standard_scaler']
['pca']
['rfe']
['males']
['males', 'females']
----------LOADING MALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1904.04it/s]
----------LOADING FEMALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1596.84it/s]
[30.0, 92.66666666666667, 49.01451032319126, 169.0, 3.0, 94.5, 129.19921875, 1.690250229326713, 1.2789544717499288, 7.456481484968146, 0.0, 1.367037342965302, 1.0, 0.0, 1.0, 1.0, 1.0, 0.8144353054485143, 0.07315829369998768, 0.9417573794204208, 0.719033364501182, 0.8028530517913689, 0.6516314010377938, 0.12866423184791223, 0.880534700043115, 0.48682582807073504, 0.628550582813504, 0.6557855743777458, 0.12156072712457477, 0.876943271385895, 0.5062201640787684, 0.6303779638265605, 0.7009278935840026, 0.11771103035069283, 0.9076673271146752, 0.5511437216636424, 0.6804123245018233, 0.7258621781459784, 0.11636256125038048, 0.9226262209665014, 0.5658213459263224, 0.7115627650092918, 0.7319688473295565, 0.10607996699023634, 0.9132149573567513, 0.5828179013969554, 0.7190717663201714, 0.6810593533546547, 0.1271240541410719, 0.8928223130579849, 0.49494090316773426, 0.6701480413560182, 0.6492148665611919, 0.1313768776855918, 0.874588980402631, 0.4626074503575165, 0.6337619863577791, 0.6913642725773188, 0.11647170482925652, 0.8893955618442694, 0.5215713488179923, 0.6793508362465139, 0.740905465414844, 0.11388770333587857, 0.9189510886031139, 0.5580632361822792, 0.7396580862023646, 0.706541518233447, 0.12965432917680048, 0.909703948901588, 0.5021640861839305, 0.7040640631286661, 0.660178894654137, 0.13467299472507263, 0.881431610290283, 0.45922877094491754, 0.6508665445976783, -125.02876889048967, 39.94989669951198, -41.153294052622755, -241.5324712414671, -123.15837201863796, 127.28465582663468, 35.80741987232664, 192.13303272873583, 26.363771628602464, 131.39178180883965, -31.40824555387448, 14.845912346019078, 4.214112235102621, -63.02792432547794, -31.674782272416806, 63.59218904191833, 19.29518727295757, 113.68750652424006, 2.028838106725491, 66.10692313577907, -26.962807839040785, 21.096820187821937, 21.987230456807126, -72.99876213857725, -26.07311047818838, 28.076659576584003, 19.396170848691963, 83.59413022430225, -8.32134239204613, 28.358152011527395, -26.198379123200496, 13.790690985287558, -0.4330725985216038, -67.13887308266585, -24.024634643719104, 8.944904763952549, 11.567269550826811, 45.93168128672215, -13.803048109141683, 8.970779559926964, -14.501981105396572, 9.903767724994214, 5.117014585578213, -37.86970568591489, -13.963186489353376, -9.611464362952907, 9.47798222378478, 10.076670810295305, -33.566944481425, -10.062288102168846, -3.327091218464487, 7.47132694440408, 18.9357242293349, -21.052624308721697, -2.9141128249673565, -10.727285681938708, 7.619622699284192, 10.784910450523757, -33.111268975426995, -10.533412108789724, 6.26325664564468, 7.931147571087225, 25.507855843506437, -14.300663616055672, 6.416538653827497, -0.000515034409211756, 0.0002809336537487639, -7.005439972082589e-05, -0.0012700827705259812, -0.0005047523754684405, 4.271716503539869, 2.1772810824505493, 9.883547207471688, 0.6583119006445217, 4.221365244890755, 1862.442990522799, 877.294093498969, 4822.9006596619165, 860.6399518698654, 1546.5592748248223, 1794.3816987793941, 382.3749135091646, 2659.687547401115, 1050.0506924306242, 1772.359533399369, 22.382069858248713, 5.600814877396231, 46.28314734561523, 10.567706808766474, 22.745121605457474, 0.00013835863501299173, 0.0002091049827868119, 0.0008215561974793673, 4.916078069072682e-06, 2.9649212592630647e-05, 3742.1810752467104, 1489.038670641279, 6696.826171875, 1130.4931640625, 3402.24609375, 0.10214501096491228, 0.06874772038732836, 0.37109375, 0.0107421875, 0.08349609375, 0.20060665905475616, 0.09223830699920654, 0.3933629095554352, 0.021954059600830078, 
0.22151805460453033]
['onset_length', 'onset_detect_mean', 'onset_detect_std', 'onset_detect_maxv', 'onset_detect_minv', 'onset_detect_median', 'tempo', 'onset_strength_mean', 'onset_strength_std', 'onset_strength_maxv', 'onset_strength_minv', 'onset_strength_median', 'rhythm_0_mean', 'rhythm_0_std', 'rhythm_0_maxv', 'rhythm_0_minv', 'rhythm_0_median', 'rhythm_1_mean', 'rhythm_1_std', 'rhythm_1_maxv', 'rhythm_1_minv', 'rhythm_1_median', 'rhythm_2_mean', 'rhythm_2_std', 'rhythm_2_maxv', 'rhythm_2_minv', 'rhythm_2_median', 'rhythm_3_mean', 'rhythm_3_std', 'rhythm_3_maxv', 'rhythm_3_minv', 'rhythm_3_median', 'rhythm_4_mean', 'rhythm_4_std', 'rhythm_4_maxv', 'rhythm_4_minv', 'rhythm_4_median', 'rhythm_5_mean', 'rhythm_5_std', 'rhythm_5_maxv', 'rhythm_5_minv', 'rhythm_5_median', 'rhythm_6_mean', 'rhythm_6_std', 'rhythm_6_maxv', 'rhythm_6_minv', 'rhythm_6_median', 'rhythm_7_mean', 'rhythm_7_std', 'rhythm_7_maxv', 'rhythm_7_minv', 'rhythm_7_median', 'rhythm_8_mean', 'rhythm_8_std', 'rhythm_8_maxv', 'rhythm_8_minv', 'rhythm_8_median', 'rhythm_9_mean', 'rhythm_9_std', 'rhythm_9_maxv', 'rhythm_9_minv', 'rhythm_9_median', 'rhythm_10_mean', 'rhythm_10_std', 'rhythm_10_maxv', 'rhythm_10_minv', 'rhythm_10_median', 'rhythm_11_mean', 'rhythm_11_std', 'rhythm_11_maxv', 'rhythm_11_minv', 'rhythm_11_median', 'rhythm_12_mean', 'rhythm_12_std', 'rhythm_12_maxv', 'rhythm_12_minv', 'rhythm_12_median', 'mfcc_0_mean', 'mfcc_0_std', 'mfcc_0_maxv', 'mfcc_0_minv', 'mfcc_0_median', 'mfcc_1_mean', 'mfcc_1_std', 'mfcc_1_maxv', 'mfcc_1_minv', 'mfcc_1_median', 'mfcc_2_mean', 'mfcc_2_std', 'mfcc_2_maxv', 'mfcc_2_minv', 'mfcc_2_median', 'mfcc_3_mean', 'mfcc_3_std', 'mfcc_3_maxv', 'mfcc_3_minv', 'mfcc_3_median', 'mfcc_4_mean', 'mfcc_4_std', 'mfcc_4_maxv', 'mfcc_4_minv', 'mfcc_4_median', 'mfcc_5_mean', 'mfcc_5_std', 'mfcc_5_maxv', 'mfcc_5_minv', 'mfcc_5_median', 'mfcc_6_mean', 'mfcc_6_std', 'mfcc_6_maxv', 'mfcc_6_minv', 'mfcc_6_median', 'mfcc_7_mean', 'mfcc_7_std', 'mfcc_7_maxv', 'mfcc_7_minv', 'mfcc_7_median', 'mfcc_8_mean', 'mfcc_8_std', 'mfcc_8_maxv', 'mfcc_8_minv', 'mfcc_8_median', 'mfcc_9_mean', 'mfcc_9_std', 'mfcc_9_maxv', 'mfcc_9_minv', 'mfcc_9_median', 'mfcc_10_mean', 'mfcc_10_std', 'mfcc_10_maxv', 'mfcc_10_minv', 'mfcc_10_median', 'mfcc_11_mean', 'mfcc_11_std', 'mfcc_11_maxv', 'mfcc_11_minv', 'mfcc_11_median', 'mfcc_12_mean', 'mfcc_12_std', 'mfcc_12_maxv', 'mfcc_12_minv', 'mfcc_12_median', 'poly_0_mean', 'poly_0_std', 'poly_0_maxv', 'poly_0_minv', 'poly_0_median', 'poly_1_mean', 'poly_1_std', 'poly_1_maxv', 'poly_1_minv', 'poly_1_median', 'spectral_centroid_mean', 'spectral_centroid_std', 'spectral_centroid_maxv', 'spectral_centroid_minv', 'spectral_centroid_median', 'spectral_bandwidth_mean', 'spectral_bandwidth_std', 'spectral_bandwidth_maxv', 'spectral_bandwidth_minv', 'spectral_bandwidth_median', 'spectral_contrast_mean', 'spectral_contrast_std', 'spectral_contrast_maxv', 'spectral_contrast_minv', 'spectral_contrast_median', 'spectral_flatness_mean', 'spectral_flatness_std', 'spectral_flatness_maxv', 'spectral_flatness_minv', 'spectral_flatness_median', 'spectral_rolloff_mean', 'spectral_rolloff_std', 'spectral_rolloff_maxv', 'spectral_rolloff_minv', 'spectral_rolloff_median', 'zero_crossings_mean', 'zero_crossings_std', 'zero_crossings_maxv', 'zero_crossings_minv', 'zero_crossings_median', 'RMSE_mean', 'RMSE_std', 'RMSE_maxv', 'RMSE_minv', 'RMSE_median']
STANDARD_SCALER
RFE - 20 features
[('standard_scaler', StandardScaler(copy=True, with_mean=True, with_std=True)), ('rfe', RFE(estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,
gamma='scale', kernel='linear', max_iter=-1, shrinking=True,
tol=0.001, verbose=False),
n_features_to_select=20, step=1, verbose=0))]
11
11
transformed training size
[ 0.87867072 -0.2927148 -1.3942374 -2.21466181 -1.24338953 -0.5532292
0.02975783 0.42827433 -0.28430065 -1.2838709 0.90746239 1.67629585
-1.48610134 1.03165105 1.35402715 0.73145188 -0.61561207 0.41546984
0.63114357 1.48055371]
/Users/jim/Desktop/allie/allie/preprocessing
your transform can now be found in the ./preprocessing/audio_transforms directory
For more information about Allie's preprocessing capabilities, see this link.
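If you want to reuse the saved transform outside of the Allie CLI, you can load the pickle and apply it to new feature vectors. The sketch below is hedged: the filename gender_transform.pickle is a placeholder for whatever .pickle file appears in ./preprocessing/audio_transforms, and it assumes the pickle holds either a fitted scikit-learn Pipeline or the list of (name, step) tuples printed above:
import pickle
import numpy as np
from sklearn.pipeline import Pipeline

# load the saved transform; check ./preprocessing/audio_transforms for the
# actual filename produced by your run ('gender_transform.pickle' is a placeholder)
with open('preprocessing/audio_transforms/gender_transform.pickle', 'rb') as f:
    transform = pickle.load(f)

# if the pickle stores the list of (name, step) tuples shown above,
# wrap it in a Pipeline so it can be applied in one call
if isinstance(transform, list):
    transform = Pipeline(transform)

# apply it to a new librosa feature vector (187 values in this example run);
# the random array here is only a stand-in for real features
features = np.random.rand(1, 187)
print(transform.transform(features).shape)  # (1, 20) with feature_number = 20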
You can run unit tests with:
python3 allie.py --command test
This will then show whether the tests succeeded or failed:
----------------------------------------------------------------------
Ran 28 tests in 551.517s
OK
-----------------^^^-----------------------
-------------^^^^---^^^^-------------------
-----------CLEANUP TEMP FILES--------------
---------^^^^^^^^^^^^^^^^^^^^^^------------
deleting temp files from FFmpeg and SoX tests
-------------------------------------------
deleting temp files load_dir tests
-------------------------------------------
deleting temp model files (audio, text, image, and video)
-------------------------------------------
Note that these unit tests are contextualized around the settings that you create in the settings.json database.
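If you want to run these tests as part of a script or a CI job, one option is to invoke the same command through subprocess and check the exit code. This sketch assumes the test command returns a non-zero exit status when a test fails:
import subprocess
import sys

# run Allie's unit tests exactly as you would on the command line
result = subprocess.run([sys.executable, 'allie.py', '--command', 'test'])

# assumes a non-zero return code signals failing tests
if result.returncode != 0:
    raise SystemExit('Allie unit tests failed')
print('all Allie unit tests passed')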
You can visualize multi-class problems across folders of featurized files with:
python3 allie.py --command visualize
You will then be prompted to specify the folders to run the visualization on:
what is the problem that you are going after (e.g. "audio", "text", "image","video","csv")
audio
audio
how many classes do you want to model? (e.g. 2)
2
what is class #1
males
what is class #2
females
minimum length is...
51
----------LOADING MALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1593.25it/s]
----------LOADING FEMALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1227.09it/s]
...
This will then take you through a visualization prompt to set the classes and structure a visualization session, with the output placed in the 'visualization_session' folder.
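Because the visualize command is interactive, you can also script a visualization session by piping the answers shown above into the prompts. This is only a sketch and assumes the prompt order stays exactly as in this walkthrough:
import subprocess
import sys

# answers to the interactive prompts, in the order they are asked above:
# sample type, number of classes, class #1, class #2
answers = '\n'.join(['audio', '2', 'males', 'females']) + '\n'

subprocess.run([sys.executable, 'allie.py', '--command', 'visualize'],
               input=answers, text=True, check=True)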
You can change Allie's settings quite easily with:
python3 allie.py --command settings
This will then open up a list of questions that lets you specify new settings within Allie or view the existing settings, as stored in the settings.json database.
For example, you may want to turn off the transcribe_video setting by setting it to False:
{'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': True, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}
Would you like to change any of these settings? Yes (-y) or No (-n)
y
What setting would you like to change?
transcribe_video
What setting would you like to set here?
False
<class 'bool'>
Note that a list of all possible settings to change can be found here.
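If you prefer to change settings without the interactive prompt, you can also edit the settings.json database directly. The sketch below assumes settings.json sits in the root of the allie repository and flips transcribe_video to False:
import json

# load the existing settings, change one value, and write the file back
with open('settings.json') as f:
    settings = json.load(f)

settings['transcribe_video'] = False

with open('settings.json', 'w') as f:
    json.dump(settings, f, indent=4)

print('transcribe_video is now', settings['transcribe_video'])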