Apply LSTM network to .ogg files

3 views (last 30 days)
Pooyan Mobtahej
Pooyan Mobtahej on 26 Oct 2020
Edited: Pooyan Mobtahej on 27 Oct 2020
I need to apply LSTM and get results for large datasets of .ogg audio files (datasets) in Matlab, Data can be separated into three parts. For example, 80% of all normal and anomaly signals for training (2 classes), 10% for validation, and 10% for testing.
I have used the following code but you can suggest me proper modification:
How to define Normal and Anomaly arrays with different sizes?
How to define Test?
%ADS = audioDatastore('/Users/pooyan/OneDrive - lamar.edu','FileExtensions','.ogg')
folder='/Users/pooyan/Documents/computer Vision';
audio_files=dir(fullfile(folder,'*.ogg'));
j=length(audio_files);
normal = zeros(132300,1); %return matrix size(normal_name)
anomaly = zeros(132300,1);
Fs=44100; %sample rate according to .ogg file
for i = 1:length(audio_files)
normal_name = strcat('normal_',num2str(i),'.ogg');
anomoly_name = strcat('anomaly_',num2str(i),'.ogg');
%[y,Fs] = audioread(filename)
[normal(i)] = audioread(normal_name);
[anomaly(i)] = audioread(anomaly_name); %can add Fs sample rate?
%normal(i) = zeros(size(normal_name),1); %return matrix size(normal_name)
%anomaly(i) = zeros(size(anomaly_name),1);
end
audioTrain = [normal(:,0.8*(1:length(audio_files))),anomaly(:,0.8*(1:length(audio_files)))]; %precentage
audioValidation = [normal(:,0.1*(1:length(audio_files))),anomaly(:,0.1*(1:length(audio_files)))];
% Create an audioFeatureExtractor object
%to extract the centroid and slope of the mel spectrum over time.
aFE = audioFeatureExtractor("SampleRate",Fs, ... %Fs
"SpectralDescriptorInput","melSpectrum", ...
"spectralCentroid",true, ...
"spectralSlope",true);
featuresTrain = extract(aFE,audioTrain);
[numHopsPerSequence,numFeatures,numSignals] = size(featuresTrain);
numHopsPerSequence;
numFeatures;
numSignals;
%treat the extracted features as sequences and use a
%sequenceInputLayer as the first layer of your deep learning model.
featuresTrain = permute(featuresTrain,[2,1,3]);
featuresTrain = squeeze(num2cell(featuresTrain,[1,2]));
numSignals = numel(featuresTrain);
[numFeatures,numHopsPerSequence] = size(featuresTrain{1});
%Extract the validation features.
featuresValidation = extract(aFE,audioValidation);
featuresValidation = permute(featuresValidation,[2,1,3]);
featuresValidation = squeeze(num2cell(featuresValidation,[1,2]));
%Define the network architecture.
layers = [ ...
sequenceInputLayer(numFeatures)
lstmLayer(50,"OutputMode","last")
fullyConnectedLayer(numel(unique(audioTrain))) %%labelTrain=audio
softmaxLayer
classificationLayer];
%To define the training options
options = trainingOptions("adam", ...
"Shuffle","every-epoch", ...
"ValidationData",{featuresValidation,audioValidation}, ... %%labelValidatin=audioValidation
"Plots","training-progress", ...
"Verbose",false);
%To train the network
net = trainNetwork(featuresTrain,audioTrain,layers,options);
%Test the network %10 preccent
normalTest = normal(:,0.1*(1:length(audio_files)));
classify(net,extract(aFE,normalTest)')
anomalyTest = anomaly(:,0.1*(1:length(audio_files)));
classify(net,extract(aFE,anomalyTest)')

Answers (0)

Products


Release

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!