Script to remove polynomial/quadratic error from CSV data
[tl;dr: read a CSV, fit a curve, subtract it from the data and write back to the CSV]
Hello everyone,
for a research project I have large amounts of data coming off a profilometer. In case you don't know, this is a device that measures a surface profile, in my case that of a thin film on a piece of glass, and stores it as X/Y data in .csv form. Inherent to this data is an error caused by the curvature of the glass plate, which needs to be removed. One such measurement produces about 40000 lines of data.
I have determined that a quadratic compensation is good enough for what I'm looking to measure. There are areas in front of and behind the film, as well as in the middle, where there is no film, and these can be used to fit a quadratic polynomial. The data is quite noisy, so you need to average over a few hundred points. What I would like to do is write a script that reads a CSV file, fits a quadratic polynomial to the areas that are known to be bare glass, and subtracts this polynomial from the data. I would then hopefully end up with data that is compensated for the curvature of the glass plate, which is added to the CSV file, ideally in a third column, if that is even possible.
Unfortunately, I am quite new to MATLAB. Although I managed to cobble together a script in the past that could read a CSV file and plot it, I don't even know where to start with this one. Has anyone done this before, or does anyone know how to do it?
Best, IJ
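For reference, the workflow described in the question (read the CSV, fit a quadratic to the bare-glass areas, subtract it, write a third column) might look roughly like the sketch below. The function name removeQuadraticBaseline, the file names and the x-ranges in the usage comment are hypothetical placeholders, and the file is assumed to hold two numeric columns (x, y) without a header. Fitting all samples inside the glass areas by least squares is one way of doing the averaging over the noise; it is not necessarily the only sensible one.

```matlab
function yCorrected = removeQuadraticBaseline(x, y, glassRanges)
% Subtract a quadratic fitted to the bare-glass parts of a profile.
%   glassRanges is an N-by-2 matrix of [xStart xEnd] rows (assumed known).
%
% Usage sketch (file names and ranges are placeholders):
%   data = readmatrix('profile.csv');               % R2019a or later
%   yc   = removeQuadraticBaseline(data(:,1), data(:,2), [0 1; 4.5 5.5; 9 10]);
%   writematrix([data, yc], 'profile_corrected.csv'); % x, y, corrected y

x = x(:);
y = y(:);

% Logical mask of all samples that lie inside one of the glass ranges.
isGlass = false(size(x));
for k = 1:size(glassRanges, 1)
    isGlass = isGlass | (x >= glassRanges(k, 1) & x <= glassRanges(k, 2));
end
if nnz(isGlass) < 3
    error('Need at least three samples inside the glass ranges.');
end

% Least-squares quadratic through the glass samples only. The third
% output centres and scales x, which keeps the fit well conditioned.
[p, ~, mu] = polyfit(x(isGlass), y(isGlass), 2);

% Evaluate the fitted curve over the whole profile and remove it.
baseline   = polyval(p, x, [], mu);
yCorrected = y - baseline;
end
```

After the subtraction the glass areas should sit around zero, so the remaining height in the film areas is measured relative to the plate.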
6 Comments
dpb
on 10 May 2021
Ah...that's a lot less restrictive of a problem statement than I had inferred from prior... :)
Are the spikes "real" in that they're going to be influencing this estimate across the sample or would/should rejecting them be part of the algorithm?
I've not looked at the rest, but there are relatively few really large spikes, from 2-3X to 5-6X the surrounding area, that are extremely large excursions at the beginning/ending, although they have some noise/structure at the peak (that may or may not be real?). Would it be desirable/acceptable to remove those and replace them with, say, a spline interpolant in between?
That likely could be done reasonably robustly and then, having done that in your three selected areas, just fit the parabola to the means of those locations. You could investigate the effect of fitting the raw data as well, but I suspect it wouldn't help much and would, in fact, bring back more noise than it is worth.
I've got other tasks right now, but I'll try to look again later this evening...but that is what I'd probably try. findpeaks, if you have the Signal Processing Toolbox, could be very helpful for locating the peaks.
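A rough sketch of that idea, for what it's worth. It does not use findpeaks; instead it flags spikes by their distance from a moving median, which is a swapped-in technique that needs no toolbox (movmedian requires R2016a or later). The function name fitGlassParabola and the tuning values win and nSigma are hypothetical and would have to be chosen by looking at the real data; in particular win should be clearly wider than the widest spike.

```matlab
function p = fitGlassParabola(x, y, glassRanges, win, nSigma)
% Despike a profile, then fit a parabola to the mean of each glass area.
%   glassRanges : N-by-2 matrix of [xStart xEnd] rows (assumed known)
%   win         : moving-median window length in samples (tuning value)
%   nSigma      : spike threshold in robust standard deviations (tuning value)

x = x(:);
y = y(:);

% Flag spikes: samples far from a moving median, measured against a
% robust noise estimate (1.4826 * median absolute deviation).
resid   = y - movmedian(y, win);
sigma   = 1.4826 * median(abs(resid - median(resid)));
isSpike = abs(resid) > nSigma * sigma;

% Replace the flagged samples with a spline through the remaining ones.
yClean = y;
yClean(isSpike) = interp1(x(~isSpike), y(~isSpike), x(isSpike), 'spline');

% Reduce each glass area to a single (mean x, mean y) point.
nAreas = size(glassRanges, 1);
xm = zeros(nAreas, 1);
ym = zeros(nAreas, 1);
for k = 1:nAreas
    in    = x >= glassRanges(k, 1) & x <= glassRanges(k, 2);
    xm(k) = mean(x(in));
    ym(k) = mean(yClean(in));
end

% With exactly three areas the parabola passes through the three means.
p = polyfit(xm, ym, 2);
end
```

The corrected profile would then be y - polyval(p, x). One caveat: the mean of a curved segment is not the curve's value at the mean x, so the constant term picks up a small offset that depends on the width of the areas; with areas of equal width the curvature and slope terms are unaffected.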
Answers (2)