0.4. Data modeling

Jim Schwoebel edited this page Aug 17, 2018 · 31 revisions

This section gives an overview of all the scripts in the Chapter 4: Data modeling folder.

Definitions

| Term | Definition |
| --- | --- |
| features | descriptive numerical representations of an object. |
| machine learning | the process of teaching a machine something that is useful. |
| classification model | a model whose goal is to separate samples into classes (e.g. male or female); this is known as a classification problem. |
| regression model | a model whose goal is to measure correlation with a variable, where the output is a numerical range (e.g. often between 0 and 1) rather than a class; this is known as a regression problem. |
| deep learning models | models that are trained using a neural network. |
| unsupervised learning | learning in which machines do not require labels (e.g. they need only features). |
| supervised learning | learning in which machines require labels (e.g. male or female as separate feature arrays). |
| training set | data, in the form of feature arrays, that is fed to machines during training; algorithms compress the patterns in these feature arrays into models. |
| testing set | data that is left out during training so that the model's accuracy can be calculated (e.g. using cross-validation techniques). |
| validation set | data that is left out during training to tune hyperparameters (often used in deep learning modeling). |
| label | a tag on a featurized audio sample (e.g. male or female) to aid in supervised learning. |
| cross-validation | how the performance of ML models is assessed (in terms of accuracy). |
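
To make the training set, testing set, and cross-validation definitions above more concrete, here is a minimal sketch using only the Python standard library. It is not one of the chapter's scripts: the feature values, the function names, and the nearest-centroid classifier are all assumptions chosen for illustration, and a real project would more likely use a library such as scikit-learn.

```python
import random
from statistics import mean

# Made-up feature arrays (two numbers per sample) and labels, for illustration only.
features = [[110.0, 0.20], [120.0, 0.30], [105.0, 0.25], [125.0, 0.22],
            [210.0, 0.60], [220.0, 0.70], [205.0, 0.65], [225.0, 0.62]]
labels = ["male"] * 4 + ["female"] * 4


def train_test_split(features, labels, test_fraction=0.25, seed=0):
    """Shuffle the samples and hold out a fraction of them as the testing set."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [features[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)


def fit_centroids(features, labels):
    """'Train' a toy model: one mean feature vector (centroid) per label."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*rows)] for y, rows in grouped.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: dist(model[y]))


def accuracy(model, features, labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(model, x) == y for x, y in zip(features, labels))
    return hits / len(labels)


def cross_validate(features, labels, k=4):
    """k-fold cross-validation: each fold is held out once while the rest train the model."""
    samples = list(zip(features, labels))
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(samples), k))
        train = [s for i, s in enumerate(samples) if i not in test_idx]
        test = [s for i, s in enumerate(samples) if i in test_idx]
        model = fit_centroids(*zip(*train))
        scores.append(accuracy(model, *zip(*test)))
    return mean(scores)


if __name__ == "__main__":
    (train_x, train_y), (test_x, test_y) = train_test_split(features, labels)
    model = fit_centroids(train_x, train_y)
    print("held-out accuracy:", accuracy(model, test_x, test_y))
    print("cross-validated accuracy:", cross_validate(features, labels, k=4))
```

The labels make this a supervised classification problem. Dropping the labels and grouping the feature arrays by similarity alone would be the unsupervised counterpart.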

4.2 - Obtaining training data

make_playlist.py (from CLI)

cd ~
cd voicebook/chapter_4_modeling/youtube_scrape
python3 make_playlist.py 
what is the name of this playlist?
what is the playlist id or URL?
… ['n' to stop making playlist]
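
The prompt above accepts either a playlist ID or a URL. How make_playlist.py handles that input is not shown on this page, so the following is only a sketch of one way to normalize the two forms. The helper name `playlist_id_from_input` and the example ID are hypothetical. It assumes a YouTube playlist URL that carries the ID in its `list` query parameter.

```python
from urllib.parse import urlparse, parse_qs


def playlist_id_from_input(text):
    """Return a playlist ID given either a bare ID or a playlist URL.

    Hypothetical helper for illustration; not taken from make_playlist.py.
    """
    text = text.strip()
    if "://" not in text:
        # No URL scheme present, so assume the user typed a bare playlist ID.
        return text
    # Playlist URLs carry the ID in the 'list' query parameter.
    ids = parse_qs(urlparse(text).query).get("list")
    if not ids:
        raise ValueError("no 'list' parameter found in URL: " + text)
    return ids[0]


if __name__ == "__main__":
    # 'PL_example123' is a made-up ID used only for this demonstration.
    print(playlist_id_from_input("https://www.youtube.com/playlist?list=PL_example123"))
    print(playlist_id_from_input("PL_example123"))
```

Normalizing the input this way means that later steps only ever deal with a plain ID.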

download_playlist.py (from CLI)

python3 download_playlist.py 
what is the name of the playlist to download?
… downloads playlist to /playlist folder 

4.3 - Labeling training data

Sample text

4.4 - Classification models

Sample text

4.5 - Regression models

Sample text

4.6 - Deep learning models

Sample text

4.7 - AutoML approaches

Sample text