The implementation here focuses on binary detection of orca calls (which are in the audible range, hence fun to listen to and annotate :) ). We change the audio-preprocessing front-end to better match this task and fine-tune the fully-connected layers and classification head of the AudioSet model, specifically a PyTorch port of the model/weights. The model generates local predictions on a fixed window size of ~2.45s. Sampling and aggregation strategies for more global detection at minute/hourly/day-wise time scales would be a welcome contribution (helpful for a real-time detection pipeline, or for processing 2-3 months of historical data from different hydrophone nodes).
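As a starting point for the aggregation idea mentioned above, here is a minimal sketch of one possible strategy: bucket the per-window probabilities into fixed time spans and flag a bucket when enough windows exceed a threshold. This is not part of the repository; `aggregate_detections`, the non-overlapping hop, the 0.5 threshold and the minimum of 3 positive windows are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

WINDOW_S = 2.45  # approximate window length stated above


def aggregate_detections(
    window_probs: List[float],
    hop_s: float = WINDOW_S,        # assumed: consecutive, non-overlapping windows
    bucket_s: float = 60.0,         # aggregate to one decision per minute
    prob_threshold: float = 0.5,    # assumed per-window decision threshold
    min_positive_windows: int = 3,  # assumed vote count needed to flag a bucket
) -> List[Tuple[float, int, bool]]:
    """Group per-window call probabilities into fixed time buckets.

    Returns one (bucket_start_seconds, positive_window_count, detected)
    tuple per bucket, ordered by time.
    """
    counts: Dict[int, int] = {}
    for i, prob in enumerate(window_probs):
        bucket = int((i * hop_s) // bucket_s)  # bucket index of this window's start time
        counts.setdefault(bucket, 0)
        if prob >= prob_threshold:
            counts[bucket] += 1
    return [
        (bucket * bucket_s, n, n >= min_positive_windows)
        for bucket, n in sorted(counts.items())
    ]
```

Requiring several positive windows per bucket trades a little recall for robustness against isolated false positives; the same function can be reused for hourly or daily summaries by changing `bucket_s`.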
- The model was bootstrapped with scraped open data from WHOI Marine Mammal Database (see src.scraper and notebooks/DataPreparation for details)
- Labelled data in live conditions from Orcasound hydrophones has subsequently been added using the Pod.Cast tool by prioritizing labelling in an active-learning-like fashion after the initial bootstrap. (DataArchives details on all datasets)
- The mel spectrogram generation is changed to better suit this task (for details on choice of filterbank see notebooks/DataPreparation. Implementation is in data_ml/src.params and data_ml/src.dataloader)
- Given limited domain data, and the need for robustness to different acoustic conditions (hydrophone nodes, SNR, noise/disturbances) in live conditions, the baseline uses transfer learning.
- Data augmentation in the style of SpecAug is also implemented, which acts as a helpful form of regularization
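To illustrate the SpecAug-style augmentation mentioned in the list above, here is a minimal sketch of frequency and time masking on a mel spectrogram. It is not the code in src.dataloader: the function name `mask_spectrogram`, the mask widths and the fill value are assumptions, and plain Python lists are used only to keep the example dependency-free (a real pipeline would operate on arrays or tensors).

```python
import random
from typing import List, Optional

Spectrogram = List[List[float]]  # rows = mel bins, columns = time frames


def mask_spectrogram(
    spec: Spectrogram,
    max_freq_width: int = 8,    # assumed maximum number of mel bins to mask
    max_time_width: int = 16,   # assumed maximum number of frames to mask
    fill_value: float = 0.0,    # assumed fill; the spectrogram mean is another option
    rng: Optional[random.Random] = None,
) -> Spectrogram:
    """Return a copy of `spec` with one random frequency band and one random time band masked."""
    rng = rng or random.Random()
    n_mels, n_frames = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: blank out `f` consecutive mel bins starting at `f0`.
    f = rng.randint(0, min(max_freq_width, n_mels))
    f0 = rng.randint(0, n_mels - f)
    for m in range(f0, f0 + f):
        out[m] = [fill_value] * n_frames

    # Time mask: blank out `t` consecutive frames starting at `t0` in every bin.
    t = rng.randint(0, min(max_time_width, n_frames))
    t0 = rng.randint(0, n_frames - t)
    for row in out:
        row[t0:t0 + t] = [fill_value] * t

    return out
```

Applying a fresh random mask each time a training example is loaded means the network rarely sees exactly the same input twice, which is where the regularizing effect comes from.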
- data_ml (current directory)
  - train.py
  - test.py
  - src (module library)
  - notebooks (for evaluation, data preparation)
  - tools
  - models
  - runs
- live_inference (deploy trained model)
See documentation at DataArchives for details on how to access and read datasets in a standard form.
This is a convenience script to download & uncompress latest combined training (Round1,2,3 etc.) & test datasets.
python data_ml/tools/download_datasets.py <LOCATION> (--only_train/--only_test)
Pardon the brevity here; this is just a rough starting point that will evolve significantly! Some of the code is still pretty rough, but src.model and src.dataloader are useful places to start.
Training converges quite fast (~5 minutes on a GPU). Train/validation tensorboard logs & model checkpoints are saved to a directory in runRootPath.
python train.py -dataPath ../train_data -runRootPath ../runs/test --preTrainedModelPath ../models/pytorch_vggish.pth -model AudioSet_fc_all -lr 0.0005
See notebook Evaluation.ipynb (might be pretty rough, but should give a general idea)
- Download the test data by using the script
- Download a trained model from model path to a location on your machine <model-download-location>
- Regenerate test results
python tools/prepare_test_and_model_data.py --test_path <test-data-download-dir> --model_path <model-download-dir>
python data_ml/test.py --test_path <test-data-download-location> --model_path <model-download-location>
1. [Windows] Get pyenv-win to manage python versions:
   - git clone https://github.com/pyenv-win/pyenv-win.git %USERPROFILE%/.pyenv
   - Add the following to your shell PATH: %USERPROFILE%\.pyenv\pyenv-win\bin, %USERPROFILE%\.pyenv\pyenv-win\shims
2. [Mac] Get pyenv to manage python versions:
   - Use homebrew and run brew update && brew install pyenv
   - Follow from step 3 onwards here. This essentially adds the pyenv init command to your shell on startup
   - FYI this is a commands reference
3. [Common] Install and maintain the right Python version (3.6.8)
   - Run pyenv --version to check installation
   - Run pyenv rehash from your home directory, install python 3.6.8 with pyenv install 3.6.8 (use 3.6.8-amd64 on Windows if relevant) and run pyenv rehash again
   - Cd to /PodCast and set a local python version with pyenv local 3.6.8 (or 3.6.8-amd64). This saves a .python-version file that tells pyenv what to use in this dir
   - Type python --version and check you're using the right one

(feel free to skip 1, 2, 3 if you prefer to use your own Python setup and are familiar with much of this)

4. Create a virtual environment to isolate and install package dependencies
   - In your working directory, run python -m venv podcast-venv. This creates a directory podcast-venv with relevant files/scripts.
   - On Mac, activate this environment with source podcast-venv/bin/activate and when you're done, deactivate. On Windows, activate with .\podcast-venv\Scripts\activate.bat and run .\podcast-venv\Scripts\deactivate.bat when done
   - In an active environment, cd to /data_ml and run python -m pip install --upgrade pip && pip install -r requirements.txt
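If you want to double-check the result of the setup steps above from inside Python, a small sanity check along these lines may help. It is only a sketch and not part of the repository; the helper names `version_matches` and `in_virtualenv` are made up for this example, and the check relies on the standard behaviour that venv makes sys.prefix differ from sys.base_prefix.

```python
import sys
from typing import Sequence, Tuple

REQUIRED_PYTHON = (3, 6, 8)  # version named in the setup steps above


def version_matches(
    version: Sequence, required: Tuple[int, int, int] = REQUIRED_PYTHON
) -> bool:
    """True if the (major, minor, micro) part of `version` equals `required`."""
    return tuple(version[:3]) == required


def in_virtualenv(prefix: str, base_prefix: str) -> bool:
    """Inside a venv, sys.prefix points at the environment and differs from sys.base_prefix."""
    return prefix != base_prefix


if __name__ == "__main__":
    print("python version ok:", version_matches(sys.version_info))
    print("virtualenv active:", in_virtualenv(sys.prefix, sys.base_prefix))
```

Run it with the podcast-venv environment activated; both lines should print True if pyenv and the virtual environment are set up as described.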