Apply average smoothing to data from text file?

I have tried looking up various way to take my data from a text file and apply a smoothing/moving average filter (without a loop), but NOTHING I do seems to work. I may be completely approaching this wrong, but any help in the right direction would be appreciated!
The file I am working with is attached: SN_d_tot_V2.txt
This is my latest attempt, and while it finally gives me a plot display, it doesn't give me the original data or the smoothed data:
if true
ssnmatrix = dlmread('SN_d_tot_V2.txt'); %data has been truncated to only show 1849-2015
Q = size(ssnmatrix);
nrows = Q(1);
ncolumns = Q(2);
n = 365/2;
ssnave = ssnraw;
ssnave = ssnave/(2*n+1);
% pick out the two relevant columns
decyear = ssnmatrix(:,4);
ssnraw = ssnmatrix(:,5);
% attempt to create filter coefficients
a = 1;
b = [2 1];
%smooth data and plot
y = filter(b,a,ssnave);
t = 1:length(ssnave);
plot(t,decyear,ssnraw,ssnave)
end
Could anyone possible explain what I'm doing wrong? I simply want to smooth the data and get my plot!

3 Comments

I tried some other things, and now I updated the code to be:
if true
load SN_d_tot_V2.txt
x = SN_d_tot_V2(:,4);
y = SN_d_tot_V2(:,5);
a = 1;
b = [2 1];
d = filter(a,b,x);
e = filter(a,b,y);
t = 1:length(x);
plot(t,x,'--',t,y,'-')
end
But the plot still isn't right, now I'm getting this plot:
What do you want to do with your signal?
Are there frequency components you want to filter out?
Moving average filters are rarely a good choice for signal processing.
I guess the simplest way to put it is that I want to take my data and average n adjacent values, and to plot the average. The other way the data was smoothed is as follows:
if true
clear;
clc;
ssnmatrix = dlmread('SN_d_tot_V2.txt'); %data has been truncated to only show 1849-2015
Q = size(ssnmatrix);
nrows = Q(1);
ncolumns = Q(2);
% pick out the two relevant columns
decyear = ssnmatrix(:,4);
ssnraw = ssnmatrix(:,5);
% Smooth the data set
% create a centered average of half width n
% monthly average n=15
% yearly average n=365/2
n = 365/2;
ssnave = ssnraw;
for shift = 1:n
ssnave = ssnave + circshift(ssnraw,[shift,0]) + circshift(ssnraw,[-shift,0]);
end
ssnave = ssnave/(2*n+1);
% Plot raw data and smoothed data
figure(1)
plot(decyear,ssnraw,decyear,ssnave);
ylabel('SSN')
xlabel('Time (Years)')
title('Sunspot Numbers from 1849-2015 with Smoothing')
end
Which results in a plot of:

Sign in to comment.

 Accepted Answer

I'm not saying this is the best, but you can try this:
data = importdata('SN_d_tot_V2.txt');
col4 = data(:,4);
col5 = data(:,5);
t = 1:length(col4);
plot(t, col4, 'r-', 'LineWidth', 2);
hold on;
plot(t, col5, 'b-', 'LineWidth', 2);
grid on;
order = 2;
frameLength = 901; % <=== Adjust this to get different levels of smoothing.
smoothCol4 = sgolayfilt(col4, order, frameLength);
smoothCol5 = sgolayfilt(col5, order, frameLength);
plot(t, smoothCol4, 'm-', 'LineWidth', 4);
plot(t, smoothCol5, 'c-', 'LineWidth', 4);

8 Comments

This looks like progress, at least! However, now the major issue is that that pink line shouldn't be there, and the data isn't spread over correct axis values, one method of smoothing this data resulted in this plot:
But thank you for at least trying to help, I greatly appreciate it! I'm going to try some modifications and see if I can remove that line and fix my axes!
The red and magenta lines are your 4th column. I'm not sure what it represents. Is it the time? If you don't want it, then take out the col 4 plotting calls.
The fourth column are time in years, with daily sampling frequency, so years+fraction. I plotted the fft, and the noise turns out to be broadband.
The Savitzky-Golay filter is the only option for smoothing it, or any other signal with broadband noise. A discrete filter — including a moving-average filter or any FIR or IIR filter — will retain the noise within its passband.
Thank you both so much! I will try to work from here and see what I can fix ! I think my biggest confusion was in which filter to use in terms of averaging, I didn't think that the Savitzky-Golay filter would be better than the moving average.
So, I've gotten it to finally look somewhat better by going off of the code you both so generously helped me with:
But, how can I fix my x-axis to show my col4 or my Time in years from 1849-2015? Do I need to invert variables in one of my plot() lines?
Just plot it as a function of Column 4, so Column 4 is the first variable in your plot call.
figure(9)
plot(col4, smoothCol5)
grid
Sorry, to ask for help again, but if anyone knows whether the data set I've provided results in revealing that there is, in fact, an 11-year cycle that is evident once applying an FFT to the data set, then please just verify for me. I have tried multiple ways to use FFT but nothing is giving me an 10-11 year max in cycle frequency for this data set..
How about just using findpeaks() on the smoothed signal and then using diff() on the peak locations (x axis = years) and seeing what the average difference between peak years is?

Sign in to comment.

More Answers (3)

My filter approach was correct. It works to filter out the noise and smooth the signal. Also, the 11-year period is obvious if you plot it as a period and not as frequency.
The Code:
d = load('Zoe Zontos SN_d_tot_V2.txt');
t = d(:,4); % Time
sn = d(:,5); % Sunspot Numbers
L = size(sn,1);
tsts = [mean(diff(t)) std(diff(t)) min(t) max(t)];
figure(1)
plot(t, sn)
grid
Ts = tsts(1); % Sampling Interval (Approximate)
Fs = 1/Ts; % Sampling Frequency (Appriximate)
Fn = Fs/2; % Nyquist Frequency (Approximate)
FTsn = fft(sn)/L; % Fourier Transform
Fv = linspace(0, 1, fix(L/2)+1)*Fn; % Frequency Vector
Iv = 1:length(Fv); % Index Vector
% % Fvi = [NaN 1./Fv(2:end)];
Wp = 1/Fn; % Normalised Passband
Ws = 1.5/Fn; % Normalised Stopband
Rp = 20; % Passband Ripple
Rs = 40; % Stopband Ripple
[n,Wn] = buttord(Wp,Ws,Rp,Rs); % Filter Order
[b,a] = butter(n,Wn); % Filter Transfer Function Coefficients
[sos,g] = tf2sos(b,a); % Second-Order-Section For Stability
snf = filtfilt(sos,g,sn); % Phase-Neutral Filtering
figure(3)
freqz(sos) % Bode Plot
figure(2)
semilogy(Fv, abs(FTsn(Iv))*2)
grid
axis([0 25 ylim])
xlabel('Frequency (Y)')
figure(4)
plot(t, sn)
hold on
plot(t, snf, '-r', 'LineWidth',2)
hold off
grid
FTsnf = fft(snf)/L; % Fourier Transform Of Filtered Signal
[sn_max,max_idx] = max(abs(FTsnf(Iv(2:end)))*2);
figure(5)
plot(1./Fv, abs(FTsnf(Iv))*2)
grid
axis([0 15 0 60])
xlabel('Period (Years/Cycle)')
text(1./Fv(max_idx+1), sn_max, sprintf('Mean Number %.2f\nMean Period %.2f years', sn_max, 1./Fv(max_idx+1)), 'HorizontalALignment','center', 'VerticalAlignment','bottom')
The Plots:
EDIT Corrected units of x-axis label to ‘Years/Cycle’ in second plot figure and code.
You forgot to attach 'SN_d_tot_V2.txt', in case people want to test their code. How do you want to smooth? Personally I'd use conv:
smoothed = conv(ssnraw, ones(1,3)/3, 'same');
But there are other options like different window sizes, sgolayfilt(), etc.

1 Comment

Sorry about that, I attached the file now and updated what I tried doing! And I just want to do a general smoothing of the data or a moving average, just any other way of smoothing that doesn't require a loop of any kind because I am looking for alternative ways.

Sign in to comment.

As per the documentation of filter, the coefficient b for a moving average filter is:
b = (1/windowSize)*ones(1,windowSize);
All values of b must be equal and their product must be 1 for it to be a moving average filter.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!