Can anyone explain this code for mel-spaced filter banks? I have to use it in my speaker recognition project.
function m = melfb(p, n, fs)
% MELFB Determine matrix for a mel-spaced filterbank
%
% Inputs:  p   number of filters in filterbank
%          n   length of fft
%          fs  sample rate in Hz
%
% Outputs: m   a (sparse) matrix containing the filterbank amplitudes
%              size(m) = [p, 1+floor(n/2)]
f0 = 700 / fs;
fn2 = floor(n/2);
lr = log(1 + 0.5/f0) / (p+1);
% convert to fft bin numbers with 0 for DC term
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1));
b1 = floor(bl(1)) + 1;
b2 = ceil(bl(2));
b3 = floor(bl(3));
b4 = min(fn2, ceil(bl(4))) - 1;
pf = log(1 + (b1:b4)/n/f0) / lr;
fp = floor(pf);
pm = pf - fp;
r = [fp(b2:b4) 1+fp(1:b3)];
c = [b2:b4 1:b3] + 1;
v = 2 * [1-pm(b2:b4) pm(1:b3)];
m = sparse(r, c, v, p, 1+fn2);
Answers (1)
Hari
on 11 Jun 2025
Hi,
I understand that you’re trying to use the melfb function in your speaker recognition project, and you're seeking an explanation of how the code generates mel filter banks from FFT bins.
I assume that you're familiar with basic signal processing and the concept of mel-scale filtering, but need clarity on how the code translates mel-scale logic into filter bank computation using MATLAB.
To understand how this code computes mel-spaced triangular filter banks, follow the explanation below:
Step 1: Set up mel frequency conversion parameters
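The mel scale approximates how human hearing perceives pitch. It is commonly written as:
mel(f) = 2595 * log10(1 + f/700)
The code uses the unscaled natural-log form log(1 + f/700); the constant factor cancels because only the spacing between points matters. The variable f0 = 700/fs is the 700 Hz constant expressed as a fraction of the sample rate, so log(1 + 0.5/f0) is the log-scaled value at the Nyquist frequency fs/2. Dividing it by p+1 gives lr, the step between adjacent filter centres, because p filters need p+2 evenly spaced edge points.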
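As a quick check of these two parameters, here is a minimal sketch. The values fs = 8000 and p = 20 are hypothetical (they are not from the question), and the helper names hz2mel and mel2hz are made up for illustration. It recovers the p+2 edge frequencies implied by f0 and lr and compares them with points spaced evenly on the conventional mel scale 2595*log10(1 + f/700):

```matlab
% Hypothetical example values (not from the question)
fs = 8000;            % sample rate in Hz
p  = 20;              % number of filters

f0 = 700 / fs;                     % 700 Hz as a fraction of the sample rate
lr = log(1 + 0.5/f0) / (p+1);      % log-step between adjacent filter centres

% Edge/centre frequencies in Hz implied by the code, for index 0 .. p+1
k      = 0:p+1;
f_code = fs * f0 * (exp(k * lr) - 1);

% Same points from the conventional mel formula (illustrative helpers)
hz2mel = @(f) 2595 * log10(1 + f/700);
mel2hz = @(mel) 700 * (10.^(mel/2595) - 1);
f_mel  = mel2hz(linspace(0, hz2mel(fs/2), p+2));

% The scale constant cancels, so both sets agree up to rounding error
max_diff = max(abs(f_code - f_mel));
fprintf('first = %.1f Hz, last = %.1f Hz, max diff = %.2g Hz\n', ...
        f_code(1), f_code(end), max_diff);
```

The first point is 0 Hz and the last is the Nyquist frequency fs/2, which is why the natural log in the code can stand in for the usual mel formula.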
Step 2: Determine filter bank edges in FFT bins
The code computes boundaries for p filters across the spectrum:
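- bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1)) maps four evenly spaced mel points back to linear frequency, expressed as fractional FFT bin numbers (DC is bin 0): the lower edge of the first filter, the centre of the first filter, the centre of the last filter, and the upper edge of the last filter.
- b1 and b4 are the first and last bins that receive any weight; b2 is the first bin at or above the centre of the first filter, and b3 is the last bin at or below the centre of the last filter.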
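To make the bin boundaries concrete, the sketch below evaluates them for hypothetical values p = 20, n = 256 and fs = 8000 (chosen for illustration, not taken from the question). The statements themselves are copied from melfb; only the comments and the printing are added:

```matlab
% Hypothetical example values (not from the question)
p = 20; n = 256; fs = 8000;

f0  = 700 / fs;
fn2 = floor(n/2);
lr  = log(1 + 0.5/f0) / (p+1);

% Fractional FFT bin positions (DC = bin 0) of four key points:
% lower edge of filter 1, centre of filter 1, centre of filter p,
% upper edge of filter p
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1));

b1 = floor(bl(1)) + 1;            % first bin above DC
b2 = ceil(bl(2));                 % first bin at or above the centre of filter 1
b3 = floor(bl(3));                % last bin at or below the centre of filter p
b4 = min(fn2, ceil(bl(4))) - 1;   % last bin below the Nyquist bin

fprintf('bl = [%.2f %.2f %.2f %.2f]\n', bl);
fprintf('b1 = %d, b2 = %d, b3 = %d, b4 = %d\n', b1, b2, b3, b4);
```

Bins b1 to b2-1 only touch the rising slope of the first filter, and bins b3+1 to b4 only touch the falling slope of the last filter. All bins in between contribute to two adjacent filters.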
Step 3: Compute triangular weights for filters
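- pf converts each bin number b1..b4 to a position in filter-index units, so an integer value k means the bin sits exactly on the centre of filter k.
- fp = floor(pf) is the index of the filter whose centre lies at or below the bin (0 for bins below the first centre).
- pm = pf - fp is the fractional part used for linear interpolation: the bin gets weight 2*(1-pm) on filter fp (falling slope) and 2*pm on filter fp+1 (rising slope).
- The vectors r, c, and v hold the row indices (filter numbers), column indices (bin number + 1, since MATLAB indexing starts at 1), and values of the sparse matrix.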
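If the vectorised indexing is hard to follow, it may help to compare it with an explicit loop over FFT bins. The sketch below is only an illustration, not part of the original function: the values p = 20, n = 256, fs = 8000 and the names M, pos, lo, frac, loop_err and col_sums are assumptions introduced here. It rebuilds the same matrix one bin at a time and checks it against the sparse version:

```matlab
% Hypothetical example values (not from the question)
p = 20; n = 256; fs = 8000;

% Statements from melfb in the question, inlined so the sketch runs on its own
f0  = 700 / fs;
fn2 = floor(n/2);
lr  = log(1 + 0.5/f0) / (p+1);
bl  = n * (f0 * (exp([0 1 p p+1] * lr) - 1));
b1  = floor(bl(1)) + 1;
b2  = ceil(bl(2));
b3  = floor(bl(3));
b4  = min(fn2, ceil(bl(4))) - 1;
pf  = log(1 + (b1:b4)/n/f0) / lr;
fp  = floor(pf);
pm  = pf - fp;
r   = [fp(b2:b4) 1+fp(1:b3)];
c   = [b2:b4 1:b3] + 1;
v   = 2 * [1-pm(b2:b4) pm(1:b3)];
m   = sparse(r, c, v, p, 1+fn2);

% Explicit loop version (illustrative): one FFT bin at a time
M = zeros(p, 1+fn2);
for k = b1:b4                         % k is the bin number, DC = bin 0
    pos  = log(1 + k/n/f0) / lr;      % position in filter-index units
    lo   = floor(pos);                % filter whose centre is at or below bin k
    frac = pos - lo;                  % how far bin k lies towards filter lo+1
    if lo >= 1
        M(lo, k+1) = 2 * (1 - frac);  % falling slope of filter lo
    end
    if lo + 1 <= p
        M(lo+1, k+1) = 2 * frac;      % rising slope of filter lo+1
    end
end

loop_err = max(max(abs(full(m) - M)));   % difference between both versions
col_sums = sum(M(:, (b2:b3) + 1), 1);    % total weight of each interior bin
fprintf('max difference = %.2g\n', loop_err);
```

Every bin between b2 and b3 splits a total weight of 2 between two neighbouring filters, which is what produces the overlapping triangles.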
Step 4: Construct sparse filter bank matrix
- m = sparse(r, c, v, p, 1+fn2) builds a p × (1 + floor(n/2)) matrix, one row per filter and one column per non-negative FFT bin.
- Each row of m is a filter that is triangular on the mel axis, with peak height 2 and its centre at a mel-spaced frequency.
This matrix m is used to transform the FFT magnitude spectrum into mel filterbank energies, essential in computing MFCCs for speaker recognition.
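As a rough usage sketch, the matrix can be multiplied with the spectrum of one windowed frame. Everything specific here is an assumption for illustration: the values p = 20, n = 256, fs = 8000, the 1000 Hz test tone, the hand-written Hamming window, and the choice of the power spectrum (the magnitude spectrum can be used in the same way):

```matlab
% Hypothetical example values (not from the question)
p = 20; n = 256; fs = 8000;

% Filterbank: same statements as melfb in the question, inlined here
f0 = 700/fs;  fn2 = floor(n/2);  lr = log(1 + 0.5/f0)/(p+1);
bl = n*(f0*(exp([0 1 p p+1]*lr) - 1));
b1 = floor(bl(1)) + 1;  b2 = ceil(bl(2));
b3 = floor(bl(3));      b4 = min(fn2, ceil(bl(4))) - 1;
pf = log(1 + (b1:b4)/n/f0)/lr;  fp = floor(pf);  pm = pf - fp;
m  = sparse([fp(b2:b4) 1+fp(1:b3)], [b2:b4 1:b3] + 1, ...
            2*[1-pm(b2:b4) pm(1:b3)], p, 1+fn2);

% Test frame: a 1000 Hz tone with a Hamming window written out by hand
t     = (0:n-1) / fs;
w     = 0.54 - 0.46*cos(2*pi*(0:n-1)/(n-1));
frame = w .* sin(2*pi*1000*t);

% Power spectrum of the non-negative frequencies, as a column vector
spec  = abs(fft(frame, n)).^2;
half  = spec(1:1+fn2).';

% Mel filterbank energies: one value per filter
energies  = m * half;                 % p-by-1
[~, peak] = max(energies);            % filter that responds most to the tone
fprintf('strongest filter: %d of %d\n', peak, p);
```

In an MFCC pipeline the next steps would typically be taking the logarithm of energies and applying a discrete cosine transform.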
Refer to the documentation of:
- "sparse": https://www.mathworks.com/help/matlab/ref/sparse.html
- "floor": https://www.mathworks.com/help/matlab/ref/floor.html
- "exp": https://www.mathworks.com/help/matlab/ref/exp.html
Hope this helps!