How to generate a regression fit of a 2D surface from 4D data

Question

Hi all, first time posting, so please forgive any etiquette errors.

I have been using polyfitn to generate a regression fit of 3D noisy data onto a 2D surface. I have also been using polyfitn to generate a regression fit of 4D noisy data onto a 3D surface.

However, I now want to fit 4D noisy data onto a 2D surface. Unfortunately, I cannot find a reference to any MATLAB functions that support this. As far as I can see, I will need a parametric surface (<http://en.wikipedia.org/wiki/Parametric_surface>), but I have been unable to devise a method that makes the MATLAB functions fit one.

A third- or fourth-order polynomial is expected to give a reasonable fit to the function, as it is not very far from a plane, but it is actually an exponential function a*e^(m*x+c). We have around a million data points, in the range of roughly 0.5 to 1.0 on each axis.
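As a rough check of that expectation, the sketch below fits a cubic to an exponential of the stated form over the stated range. It is written in Python with NumPy rather than MATLAB, and the constants a, m and c are hypothetical placeholders, since the post does not give them.

```python
import numpy as np

# Hypothetical constants for illustration only; the post does not state a, m or c.
a, m, c = 1.0, 1.0, 0.0

# Sample the exponential over the range quoted in the post.
x = np.linspace(0.5, 1.0, 201)
y = a * np.exp(m * x + c)

# Least-squares cubic fit and its worst-case error on the sampled range.
coeffs = np.polyfit(x, y, 3)
max_abs_err = np.max(np.abs(np.polyval(coeffs, x) - y))
print("max abs error of cubic fit:", max_abs_err)
```

With these placeholder constants the cubic tracks the exponential closely on this short interval; steeper exponents (larger m) would need a higher order or a direct exponential fit.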

Can anyone make some suggestions about how I can go about generating this fit?

(Edit: Further Clarification of data available)

4 Comments
John D'Errico on 18 Dec 2014

Polyfitn will allow you to estimate a model of the form z=f(x,y), so we can think of that as fitting a 2-d surface from 3-d data. It works as long as you have a functional relationship to estimate. Likewise, polyfitn will estimate a function w=f(x,y,z), so you can call that a 3-d surface from 4-d data.

I suppose you might call the first case a 2-manifold, embedded in a 3 dimensional space. The latter case would be a 3-manifold, embedded in R^4, a 4-d space.

Now what you seem to be asking for is a method to fit a 2-manifold that lives in a 4-d space, and to do it from noisy data. I find myself with at least one big question that needs to be resolved.

What are your goals? What will you do with this surface? After all, you cannot really use it to plot something in 4-d. In the other cases, one typically sees a person using these fits for predictive purposes, allowing you to predict a smoothed value z=f(x,y) for example. But here you are talking about a parametric 2-manifold, embedded in an R^4 space.

Yes, it is possible to estimate a planar 2-manifold embedded in a 4-d space. Typically this would involve a singular value decomposition, which would allow us to reduce the problem to a 2-dimensional subspace. But you are talking about a nonlinear relationship here, in a general parametric form. As such, it would have no predictive value, since the parameters in that form would be unknown.
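The planar case mentioned here can be sketched as follows, in Python with NumPy rather than MATLAB. The function names `fit_plane_svd` and `distance_to_plane` and the synthetic data are assumptions made for illustration; they are not part of polyfitn or any other toolbox.

```python
import numpy as np

def fit_plane_svd(points, k=2):
    """Fit a k-dimensional affine subspace to the rows of `points` via SVD."""
    centroid = points.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, vt[:k], s

def distance_to_plane(points, centroid, basis):
    """Orthogonal distance of each row of `points` from the fitted subspace."""
    centered = points - centroid
    in_plane = centered @ basis.T @ basis
    return np.linalg.norm(centered - in_plane, axis=1)

# Synthetic example: noisy samples of a random 2-D plane embedded in R^4.
rng = np.random.default_rng(0)
true_basis = np.linalg.qr(rng.normal(size=(4, 2)))[0].T   # 2 x 4, orthonormal rows
uv = rng.uniform(0.5, 1.0, size=(2000, 2))
data = uv @ true_basis + 0.01 * rng.normal(size=(2000, 4))

centroid, basis, s = fit_plane_svd(data)
dist = distance_to_plane(data, centroid, basis)
print("singular values:", s)
print("mean distance to fitted plane:", dist.mean())
```

Two large singular values followed by two small ones indicate that the data are close to a 2-D plane; the same distance function could score how far a new 4-D sample lies from that plane.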

I suppose you might use a locally linear method, wherein a SVD might be used only on the data near a point on the surface. It would not seem to be of much value though, resulting in a set of discontinuous planar segments. Nothing you can plot, nothing to look at, nothing to predict.
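A minimal sketch of that locally linear idea, again in Python with NumPy, is shown below. The neighbour count, the helper names and the test surface are all assumptions chosen for illustration; a brute-force neighbour search like this would be slow on a million points.

```python
import numpy as np

def local_plane(points, query, n_neighbors=50, k=2):
    """SVD plane fit using only the data points nearest to `query`."""
    d = np.linalg.norm(points - query, axis=1)
    nbrs = points[np.argsort(d)[:n_neighbors]]
    centroid = nbrs.mean(axis=0)
    _, _, vt = np.linalg.svd(nbrs - centroid, full_matrices=False)
    return centroid, vt[:k]

def local_distance(points, query, n_neighbors=50, k=2):
    """Distance from `query` to the plane fitted through its neighbours."""
    centroid, basis = local_plane(points, query, n_neighbors, k)
    r = query - centroid
    return np.linalg.norm(r - basis.T @ (basis @ r))

def surface(u, v):
    # Hypothetical smooth 2-D surface in R^4, used only to generate test data.
    return np.column_stack([u, v, np.exp(u - 1.0), u * v])

rng = np.random.default_rng(0)
u, v = rng.uniform(0.5, 1.0, (2, 5000))
cloud = surface(u, v) + 0.002 * rng.normal(size=(5000, 4))

on_surface = surface(np.array([0.75]), np.array([0.75]))[0]
off_surface = on_surface + np.array([0.0, 0.0, 0.2, 0.0])
d_on = local_distance(cloud, on_surface)
d_off = local_distance(cloud, off_surface)
print("on-surface distance:", d_on, " off-surface distance:", d_off)
```

As the comment notes, each query gets its own planar patch, so the patches do not join into one continuous surface.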

It might help to explain something about what the data represents. For example, one place I have seen and worked with problems where there was a 2-manifold in a 4-d space is the color gamut of a printing device that uses 4 colors to print, thus a CMYK printer. There though, I would generally not want to represent it as a surface in the form of a polynomial model, since such gamuts often have sharp edges in them. There are also other ways to reduce that problem to a 2-manifold anyway.

William on 18 Dec 2014

Edited: William on 18 Dec 2014

John, thank you for the response. Your points are concise and well made.

We are planning on using the surface in a predictive manner. Unfortunately, the details of exactly what we are trying to achieve are bound by NDAs. However, I realise I may not have explained myself sufficiently.

Each axis represents a measurement of a sample being repeated under different conditions. Although we have four measurements, the samples vary continuously in only two dimensions. We have successfully used the method to reduce three measurements to a 2D manifold and four measurements to a 3D manifold. The 3D manifold is less constrained than the actual problem and gives worse results. As such, we would like to move to a solution where the surface has only two varying dimensions (say U and V), which represent the values of the sample, and from them determine the expected four measurements.

We are looking to use the fitted manifold to identify how similar a new set of samples are, by determining how close the new sample points are to the surface.

The more I think about it, the more I feel the problem is that we don't know the two values which determine the sample. If we did, we could express the problem as a parametric equation using the two values U and V (sample variability) and then generate the expected measurement values x, y, z and w from U and V.
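If U and V were known for each sample, the parametric fit would reduce to four ordinary polynomial regressions, one per measurement. The sketch below illustrates this in Python with NumPy; the measurement model, the cubic degree and the helper name `design_matrix` are assumptions for illustration only.

```python
import numpy as np

def design_matrix(u, v, degree=3):
    """Columns u^i * v^j for all i + j <= degree (10 columns for a cubic)."""
    cols = [u**i * v**j for i in range(degree + 1) for j in range(degree + 1 - i)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
u = rng.uniform(0.5, 1.0, 3000)
v = rng.uniform(0.5, 1.0, 3000)

# Hypothetical measurement model: four smooth functions of the two sample values.
clean = np.column_stack([np.exp(u - 1.0), np.exp(v - 1.0),
                         np.exp(0.5 * (u + v) - 1.0), u * v])
xyzw = clean + 0.005 * rng.normal(size=clean.shape)

# One least-squares solve fits all four measurements at once.
A = design_matrix(u, v)
coef, *_ = np.linalg.lstsq(A, xyzw, rcond=None)   # shape (10, 4)
pred = A @ coef
rms = np.sqrt(np.mean((pred - clean) ** 2))
print("RMS error against the noise-free surface:", rms)
```

The hard part of the real problem is that U and V are not observed, so they would have to be estimated along with the coefficients.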

Maybe I should be asking how I can generate a sensible U,V mesh for this data set. Any thoughts would be appreciated, but I may have to think on the problem further.

Thank you

John D'Errico on 18 Dec 2014

You will notice that I have not chosen to answer this question. :)

I did once attempt a general code that would take a set of (noisy) data in n dimensions and attempt to find a k-manifold (for specified k) that represented the data. You won't find anything posted on the FEX though, as I had no real success in the endeavor.

Your problem belongs to the general class of nonlinear errors-in-variables problems. I also worked on one of these problems long ago for a client. I was never really happy with the result there either. Planar fits are fine for this problem using the SVD, but beyond that point, it gets nasty when things are nonlinear.

So were I to try to solve what I understand so far of your problem, I would think of starting with a very coarse triangulated mesh in the 4-d domain space of your data. In my thoughts this would be no more than two triangles, floating in that R^4 domain.

For each point in your point cloud, compute the distance to the closest point on this triangulated 2-manifold. Think of that as a spring connecting the fixed data point to your manifold. Each spring exerts a force on the manifold proportional to its extension, so it stores potential energy proportional to the square of that extension. Now use an SOR-like scheme to relax the triangulated mesh toward minimum potential energy. You could also add an energy term that penalizes a highly curved mesh. Relax the mesh until the system reaches a minimum potential energy state. Then look for areas (triangles in the mesh) where many points lie a long distance from the mesh, and refine the triangulation in those regions. Repeat the relaxation step until you are happy with the results.
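A minimal sketch of the relaxation step alone is given below, in Python with NumPy. It makes several assumptions not in the comment: a damped Jacobi-style vertex update stands in for the SOR-like scheme, there is no curvature penalty and no mesh refinement, and the data surface and starting mesh are invented for illustration.

```python
import numpy as np

def closest_barycentric(p, a, b, c):
    """Barycentric weights of the point on triangle abc closest to p (any dimension)."""
    ab, ac, ap = b - a, c - a, p - a
    d1, d2 = ab @ ap, ac @ ap
    if d1 <= 0 and d2 <= 0:
        return np.array([1.0, 0.0, 0.0])              # nearest to vertex a
    bp = p - b
    d3, d4 = ab @ bp, ac @ bp
    if d3 >= 0 and d4 <= d3:
        return np.array([0.0, 1.0, 0.0])              # nearest to vertex b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        t = d1 / (d1 - d3)
        return np.array([1 - t, t, 0.0])              # nearest to edge ab
    cp = p - c
    d5, d6 = ab @ cp, ac @ cp
    if d6 >= 0 and d5 <= d6:
        return np.array([0.0, 0.0, 1.0])              # nearest to vertex c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        t = d2 / (d2 - d6)
        return np.array([1 - t, 0.0, t])              # nearest to edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        t = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return np.array([0.0, 1 - t, t])              # nearest to edge bc
    return np.array([va, vb, vc]) / (va + vb + vc)    # interior of the triangle

def attach(points, verts, tris):
    """Row n holds the vertex weights of the mesh point closest to data point n."""
    B = np.zeros((len(points), len(verts)))
    for n, p in enumerate(points):
        best = np.inf
        for tri in tris:
            w = closest_barycentric(p, *verts[tri])
            d = np.sum((p - w @ verts[tri]) ** 2)
            if d < best:
                best, row = d, (tri, w)
        B[n, row[0]] = row[1]
    return B

def energy(points, verts, B):
    """Sum of squared spring extensions (spring constant taken as 1)."""
    return np.sum((points - B @ verts) ** 2)

def relax(points, verts, tris, n_iter=30, omega=1.0):
    """Repeatedly re-attach the springs and move each vertex along its net pull."""
    verts = verts.copy()
    for _ in range(n_iter):
        B = attach(points, verts, tris)
        pull = B.T @ (points - B @ verts)             # net spring pull per vertex
        weight = np.maximum(B.sum(axis=0), 1e-12)     # guard vertices with no springs
        verts += omega * pull / weight[:, None]
    return verts

# Hypothetical noisy 2-D surface in R^4 and a two-triangle starting mesh.
rng = np.random.default_rng(0)
u, v = rng.uniform(0.5, 1.0, (2, 400))
points = np.column_stack([u, v, np.exp(u - 1.0), u * v])
points += 0.005 * rng.normal(size=points.shape)
verts0 = np.array([[0.5, 0.5, 0.5, 0.5], [1.0, 0.5, 0.5, 0.5],
                   [1.0, 1.0, 0.5, 0.5], [0.5, 1.0, 0.5, 0.5]])
tris = np.array([[0, 1, 2], [0, 2, 3]])

e_before = energy(points, verts0, attach(points, verts0, tris))
verts = relax(points, verts0, tris)
e_after = energy(points, verts, attach(points, verts, tris))
print("energy before:", e_before, " after:", e_after)
```

The per-point Python loop is only practical for small samples; on a million points the closest-point search would need vectorising or a spatial index.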

Overall, not trivial to write, and long to run on a zillion points, but potentially doable.

William on 19 Dec 2014

Hi John,

I think you are right. A linear fit would be fairly straightforward. I like your solution of a spring balancing system; I'll have a think about the practicalities.

Answers (0)
