Salamon & Bello 2017 Replication

Replication of the paper "Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification" by Salamon & Bello. The implementation is based on tensorflow.keras.

Outline

  • setup : instructions for setting up the environment
  • Functions:
    • evaluation.py : evaluation function
    • model.py : model building functions
    • preprocessing.py : data extraction, saving and loading from wav files with a fixed length of 3 seconds
    • preprocessing_augmented.py : data augmentation, extraction, saving and loading for wav+jams files
    • preprocessing_multi.py : data extraction, saving and loading from wav files
  • Google Colab notebooks:
    • augmented_preprocessing.ipynb : preprocess dataset from jams
    • usk8:cnn_baseline.ipynb : preprocess dataset and train models on dataset with full lengths++
    • usk8:cnn_baseline_crop.ipynb : preprocess dataset and train models on dataset
  • Desktop Notebooks:
    • results_visualization.ipynb : visualizations of the results of every scenario
    • usk8_cnn_salomon.ipynb : train model on augmented dataset (desktop)**
    • usk8_cnn_wavelet.ipynb : train improved model on augmented dataset (desktop)**

++ This trains the model as described in the paper: for training, a random 3-second excerpt is taken from each sample longer than 3 seconds; for testing, the output is averaged over all possible excerpts.

** These had to be trained on a desktop because Google Colab could not hold the dataset in memory.
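
The excerpt scheme in the ++ note can be sketched roughly as follows. This is a minimal illustration with NumPy, not the code from the notebooks: the function names (`random_excerpt`, `all_excerpts`, `predict_clip`), the `(bands, frames)` array layout, the `hop` parameter and the zero-padding of short clips are all assumptions made for the example.

```python
import numpy as np


def random_excerpt(spec, n_frames, rng):
    """Pick one random window of n_frames columns from a (bands, frames) array."""
    total = spec.shape[1]
    if total <= n_frames:
        # Assumption: clips shorter than the window are zero-padded on the right.
        return np.pad(spec, ((0, 0), (0, n_frames - total)))
    start = rng.integers(0, total - n_frames + 1)  # upper bound is exclusive
    return spec[:, start:start + n_frames]


def all_excerpts(spec, n_frames, hop=1):
    """Stack every window of n_frames columns, stepping by hop frames."""
    total = spec.shape[1]
    if total <= n_frames:
        padded = np.pad(spec, ((0, 0), (0, n_frames - total)))
        return padded[np.newaxis]
    starts = range(0, total - n_frames + 1, hop)
    return np.stack([spec[:, s:s + n_frames] for s in starts])


def predict_clip(predict_fn, spec, n_frames, hop=1):
    """Average the per-excerpt class probabilities to get one clip-level output."""
    batch = all_excerpts(spec, n_frames, hop)
    probs = predict_fn(batch)  # expected shape: (n_excerpts, n_classes)
    return probs.mean(axis=0)
```

During training, `random_excerpt` would be called once per sample and per epoch, so the network sees a different window each time. At test time, `predict_clip` could be given something like a Keras model's `predict` method as `predict_fn`, after adding whatever channel axis the model expects.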