MATLAB Answers

How pwelch computes confidence intervals

16 views (last 30 days)
Stephen
Stephen on 30 Apr 2020
Answered: William Rose on 26 Feb 2021
Can Mathworks please point to exactly which formulae are being used to compute confidence intervals in pwelch? Perhaps state the formulae being used in the code? This would include how exacty the number of degrees of freedom are being computed.
  3 Comments
Stephen
Stephen on 15 Feb 2021
There are several good books on this - the original Jenkins and Watt book is excellent (although out of print and not evailble in pdf form) as is a book aimed at oceanographers by Thomson and Emery (Data Analysis Methods in Physical Oceanography). The reason I originally asked this question was because I suspect that the confidence intervals given by pwelch are not quite calculated correctly. One can check this by doing spectra for a random process that has a known spectrum - Jenkins and Watt give an excellent example of such a process in their book. I wrote a script - below - that generates a time series using the auto-regressive process J&W use for their examples. With apologies for the poor coding technique (I first learned programming in Fortran 66), you will see that if one inverts the CIs that Matlab computes and swap the results between the upper and lower CIs, one gets something that looks pretty good whereas if one sticks with the CIs pwelch gives, the lower and upper bounds are both high, meaning that the lower bound is not low enough and the upper bound is too high. So, this is what makes me think there is something wrong in pwelch.
%
% Examine spectral confidence intervals using a random process with a known spectrum
% The time series used is generated via an AR1 process detailed in Jenkins and Watts on p 269
% S. Monismith
% Stanford University
% 2/15/21
%
clear
M=16;
seriesize=2^M; % length of time series
t=0:seriesize-1; % times
X=ones(seriesize,1); %
Z=randn(seriesize,1);
factor=0.5;
for j=3:seriesize
X(j)=X(j-1)-factor*X(j-2)+Z(j);
end
X(1)=X(seriesize)-factor*X(seriesize-1)+Z(1);
X(2)=X(1)-factor*X(seriesize)+Z(2);
idothese=seriesize;
%
% Key adjustment - how many segments
%
nsegs=2;
%
% Now set up and do pwelch estimation
%
windowlength=idothese/nsegs;
nfft=windowlength;
[PXX,f,PXXc] = pwelch(X,windowlength,0.5*windowlength,nfft,1,'ConfidenceLevel',0.95);
%
% Now compute bounds - note that they are multiplicative factors on known spectrum
% First, we will use what pwelch gives
%
factor1=median(PXXc(:,1)./PXX);
factor2=median(PXXc(:,2)./PXX);
subplot(1,2,1)
semilogy(f,PXX,'ko','markeredgecolor',0.7*[1,1,1],...
'markerfacecolor',0.7*[1,1,1],'markersize',4)
hold on
Pxxtheory=var(X)*(0.834)./(2.25-3*cos(2*pi*f)+cos(4*pi*f));
hold on
semilogy(f,Pxxtheory,'r','linewidth',2)
semilogy(f,Pxxtheory*factor1,'b--','linewidth',2)
semilogy(f,Pxxtheory*factor2,'r--','linewidth',2)
hold off
ylim([1e-1 1e2])
grid
set(gca,'fontsize',20)
ylabel('\Phi_{XX}','fontsize',20)
xlabel('f','fontsize',20)
title('CIs as given by pwelch','fontsize',20)
%
% Now re-do the confidence intervals with a suggested correction
%
factor1=median(PXX./PXXc(:,1));
factor2=median(PXX./PXXc(:,2));
%
subplot(1,2,2)
semilogy(f,PXX,'ko','markeredgecolor',0.7*[1,1,1],...
'markerfacecolor',0.7*[1,1,1],'markersize',4)
hold on
Pxxtheory=var(X)*(0.834)./(2.25-3*cos(2*pi*f)+cos(4*pi*f));
hold on
semilogy(f,Pxxtheory,'r','linewidth',2)
semilogy(f,Pxxtheory*factor1,'b--','linewidth',2)
semilogy(f,Pxxtheory*factor2,'r--','linewidth',2)
hold off
ylim([1e-1 1e2])
grid
set(gca,'fontsize',20)
ylabel('\Phi_{XX}','fontsize',20)
xlabel('f','fontsize',20)
title('CIs inverted and swapped','fontsize',20)

Sign in to comment.

Answers (2)

Kiran Felix Robert
Kiran Felix Robert on 5 Feb 2021
Hi Stephen,
Refer the pwelchfunction documentation for the details.
  2 Comments
William Rose
William Rose on 15 Feb 2021
Kiran,
I read the source code for pwelch.m (C:\Program Files\MATLAB\R2018b\toolbox\signal\signal\pwelch.m). It does not reveal how the confidence interval is computed. pwelch.m calls welch.m (C:\Program Files\MATLAB\R2018b\toolbox\signal\signal\+spectrum\welch.m), which I inspected. It calls another version of pwelch() which I oculd not find, probably because it is in a compiled library. Therefore I would appreciate an actual answer to Stephen's question. Thank you.

Sign in to comment.


William Rose
William Rose on 26 Feb 2021
The 2-sided confidence interval (C.I.) with probability p on the power spectral denisty (p.s.d.), which is returned by pwelch(), is given by
Pxxhat(f)*k/chi2((1+p)/2,k) < Pxx(f) < Pxxhat(f)*k/chi2((1-p)/2,k)
where Pxxhat(f) is the experimental estimate of the p.s.d. at frequency f, and Pxx(f) is the true, but unknown, value of the p.s.d., and k is the degrees of freedom. This is analagous to a confidence interval for a variance. The degrees of freedom is given by
k = 2*K
where upper case K is the number of segments or windows used when esitmating the p.s.d.
If the window length and offset are chosen so that the windows fit the signal without any leftover bits, then
K=(N-L)/D+1
where N=sinal length, L=window length, D=offset. (Don't confuse overlap and offset. Overlap=L-D. pwelch() takes overlap as an argument, but Welch (1967) uses offset=D in his formulas.
Example: Signal length=1024, window=Hamming, with length 256, half-overlapped. Then N=1024, L=256, and D=offset=128. Therefore K=7, by the formula above, and we can verify that 7 half overlapped windows cover the signal exactly. Then k=2*K = 14.
Prob{Pxxhat*14/chi2(.975,14) < Pxx < Pxxhat*14/chi2(.025,14)} = 0.95
=> Prob{Pxxhat*0.536 < Pxx < Pxxhat*2.487} = 0.95
I have checked the above formulas for various combinations of N, L, D, and probability p. The formuals correctly reproduce the confidence intervals reported by pwelch().
The formulas above are not correct from a statistical point of view, when the overlap is more than zero, but they are the formulas that pwelch() uses. The error is that k, the degrees of freedom, should be somewhat less than 2*K when the windows overlap. The exact formula is given by Welch, IEEE Trans Audio Electroacoustics AU-15:70-73, 1967, https://ieeexplore.ieee.org/document/1161901. It is complicated so I will not reproduce it here. It has a varible number of terms, depending on window shape and amount of overlap. The C.I. from pwelch() does not take window shape or overlap into account. pwelch()'s confidence intervals are correct when the windows do not overlap. The error in the confidence interval reported by pwelch() is relatively small when using Hamming or Hann or similar windows at half-overlap. In general, the C.I. reported by pwelch() is narrower than it should be. See table below:
The errors in the confidence intervals reported by pwelch() become more severe if the overlap is greater. By more severe I mean that the C.I. from pwelch() is significantly more narrow that than the true C.I.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!