Urban Sound Classification Using Deep Learning

Version 1.0.0 (12.5 MB) by Cong Dong Ngoc Minh

This example demonstrates how to classify sounds with deep learning, starting from an extracted audio feature (the spectrogram).

Downloads: 336

Updated 2021/7/21

Classifying Urban Sounds Using Deep Learning

This package includes three main files: SC1_preprocessing.mlx, SC2_extract_feature.mlx, and SC3_train_network.mlx. Two other files, SoundClassify.m and SoundClassifySample.m, are used with Library Compiler.

Dataset:

UrbanSound dataset

For this project we use the UrbanSound8K dataset. It contains 8732 sound excerpts (<= 4 s each) of urban sounds from 10 classes:

Air Conditioner
Car Horn
Children Playing
Dog Bark
Drilling
Engine Idling
Gun Shot
Jackhammer
Siren
Street Music

The accompanying metadata contains a unique ID for each sound excerpt along with its given class name.

A sample of this dataset is included with the accompanying Git repo, and the full dataset can be downloaded from https://urbansounddataset.weebly.com/urbansound8k.html.

Audio sample file data overview

These sound excerpts are digital audio files in .wav format.

Sound waves are digitised by sampling them at discrete intervals. The number of samples taken per second is the sampling rate (typically 44.1 kHz for CD-quality audio, meaning samples are taken 44,100 times per second).

Each sample is the amplitude of the wave at a particular point in time. The bit depth determines how finely that amplitude is resolved, and therefore the dynamic range of the signal (typically 16 bits, which means a sample can take one of 65,536 amplitude values).
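
As a quick check of the numbers above, the following Python sketch works through the arithmetic for one excerpt of maximum length. It is illustrative only and not part of the MATLAB package; the variable names are made up for this example, and it assumes a mono clip recorded at the "typical" values quoted above.

```python
# Illustrative arithmetic only, using the typical values quoted in the text.
sample_rate_hz = 44_100   # CD-quality sampling rate
bit_depth = 16            # bits per sample
max_duration_s = 4        # UrbanSound8K excerpts are at most 4 s long

samples_per_clip = sample_rate_hz * max_duration_s        # samples in a 4 s mono clip
amplitude_levels = 2 ** bit_depth                         # distinct values per sample
bytes_per_mono_clip = samples_per_clip * bit_depth // 8   # raw size without the file header

print(samples_per_clip)     # 176400
print(amplitude_levels)     # 65536
print(bytes_per_mono_clip)  # 352800
```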

Deep Learning Workflow

Access Data -> Pre-processing -> Extract signal feature (Spectrogram) -> Train neural network -> Deployment (optional).

Step 1: Data preparation with SC1_preprocessing.mlx:

Create a new folder for each class, based on the class ID name, and move the audio files into their class folders.
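
The package does this step in a MATLAB live script. Purely as an illustration of the idea, here is a minimal Python sketch that reads a metadata CSV and moves each listed file into a folder named after its class. The function name organise_by_class, the flat audio folder, and the column names slice_file_name and class are assumptions made for this example, so adjust them to the metadata you actually have.

```python
import csv
import shutil
import tempfile
from pathlib import Path

def organise_by_class(metadata_csv, audio_dir, out_dir,
                      file_col="slice_file_name", class_col="class"):
    """Move each audio file listed in the metadata into a folder named after its class."""
    audio_dir, out_dir = Path(audio_dir), Path(out_dir)
    moved = 0
    with open(metadata_csv, newline="") as f:
        for row in csv.DictReader(f):
            src = audio_dir / row[file_col]
            if not src.exists():   # skip rows whose file is absent (e.g. a partial sample)
                continue
            class_dir = out_dir / row[class_col]
            class_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(class_dir / src.name))
            moved += 1
    return moved

# Tiny demo on throwaway files (hypothetical names, empty placeholder audio).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "audio").mkdir()
    (tmp / "audio" / "a.wav").write_bytes(b"")
    (tmp / "audio" / "b.wav").write_bytes(b"")
    (tmp / "meta.csv").write_text(
        "slice_file_name,class\na.wav,siren\nb.wav,dog_bark\nc.wav,siren\n")
    moved = organise_by_class(tmp / "meta.csv", tmp / "audio", tmp / "by_class")
    layout = sorted(p.relative_to(tmp / "by_class").as_posix()
                    for p in (tmp / "by_class").rglob("*.wav"))

print(moved, layout)  # 2 ['dog_bark/b.wav', 'siren/a.wav']
```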

Step 2: Feature extraction with SC2_extract_feature.mlx:

Pre-process the audio data and extract the spectrogram feature.

Convert each audio signal to a spectrogram, using fs as the sampling frequency, and save the spectrogram using the same directory layout as the original audio files.
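
To make the idea of a spectrogram concrete, the sketch below computes one with a short-time Fourier transform in plain Python: the signal is cut into overlapping frames, each frame is multiplied by a Hann window, and the magnitude of its discrete Fourier transform is kept. This is not the code in SC2_extract_feature.mlx; the function name, the frame length of 64 samples, and the hop of 32 samples are assumptions chosen to keep the example small, and the naive DFT is far slower than an FFT.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame holds magnitudes for bins 0..frame_len//2."""
    # Periodic Hann window, applied to every frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):   # non-negative frequencies only
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

# Demo: a 1 kHz tone sampled at fs = 8 kHz for 0.1 s (800 samples).
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs // 10)]
spec = spectrogram(tone)

# Bin k corresponds to the frequency k * fs / frame_len.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64

print(len(spec), len(spec[0]), peak_hz)  # 24 33 1000.0
```

Each row of spec is one time step and each column one frequency bin, which is the grid that is saved as an image for the network.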

Step 3: Create neural network and train it with SC3_train_network.mlx

Using the extracted spectrogram data, we create a simple neural network for training and classification. The images are stored in the folder Spectrograms. The data for each class is separated into subfolders, and each image is labelled by its folder name.

Split the data so that 80% of the images are used for training, 10% are used for validation, and the rest are used for testing. Because of limited time, I used only 25% of the whole dataset for training.
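
The split can be sketched as follows. This is a Python illustration rather than the code in SC3_train_network.mlx, and it reflects one reading of the description: first keep a random 25% of each class, then split that subset 80/10/10. The function name split_per_class, the fixed seed, and the file names in the demo are all hypothetical.

```python
import random

def split_per_class(files_by_class, subset=0.25, train=0.8, val=0.1, seed=0):
    """Keep a random fraction of each class, then split it into train/val/test."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, files in sorted(files_by_class.items()):
        files = sorted(files)
        rng.shuffle(files)
        files = files[: round(len(files) * subset)]   # e.g. 25% of the class
        n_train = round(len(files) * train)
        n_val = round(len(files) * val)
        splits["train"] += [(f, label) for f in files[:n_train]]
        splits["val"] += [(f, label) for f in files[n_train:n_train + n_val]]
        splits["test"] += [(f, label) for f in files[n_train + n_val:]]   # the rest
    return splits

# Demo with made-up file names: two classes of 400 images each.
demo = {
    "siren": [f"siren_{i}.png" for i in range(400)],
    "drilling": [f"drilling_{i}.png" for i in range(400)],
}
splits = split_per_class(demo)
sizes = {name: len(items) for name, items in splits.items()}

print(sizes)  # {'train': 160, 'val': 20, 'test': 20}
```

Splitting inside each class keeps the class proportions the same in all three sets.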

The training accuracy is 92% (see the training-progress picture).

The test accuracy is 91% (see the confusion-matrix picture).
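
For readers unfamiliar with the relationship between the two, the Python sketch below builds a confusion matrix from true and predicted labels and derives the accuracy from its diagonal. The labels are toy data invented for this example, not the results reported here, and the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy labels, invented for illustration only.
classes = ["dog_bark", "siren", "drilling"]
true_labels = ["dog_bark", "dog_bark", "siren", "siren",
               "drilling", "drilling", "drilling", "siren"]
predicted = ["dog_bark", "siren", "siren", "siren",
             "drilling", "drilling", "dog_bark", "siren"]

cm = confusion_matrix(true_labels, predicted, classes)
acc = accuracy(cm)

print(cm)   # [[1, 1, 0], [0, 3, 0], [1, 0, 2]]
print(acc)  # 0.75
```

Off-diagonal entries show which classes are confused with each other, which a single accuracy figure hides.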

Step 4: Deployment (Optional)

In this step, I used MATLAB Compiler SDK to create a Python library.

SoundClassify.m is the main function used to create the library.

SoundClassifySample.m is the sample used to create the Python sample driver file. You can change the script in this file to use another sample image.

You can see the result after running the sample.

*Note: To run the library, you need "trainednet.mat" (the trained neural network) and a test image.

I hope this is useful for everyone.

Thank you!

Cite As

Cong Dong Ngoc Minh (2024). Urban Sound Classification Using Deep Learning (https://www.mathworks.com/matlabcentral/fileexchange/96148-urban-sound-classification-using-deep-learning), MATLAB Central File Exchange. Retrieved April 25, 2024.

MATLAB Release Compatibility

Created with R2021a

Compatible with any release

Platform Compatibility

Windows macOS Linux

Version	Published	Release Notes
1.0.0	2021/7/21