How to work around the precision problem?

23 views (last 30 days)
I am aware that certain numbers cannot be perfectly represented in binary, and as a result some of the numbers are not exact.
I am trying to calculate the numerical derivative of x^4 + 3x^3 - 2x^2 + x - 2
n = 10001;
lb = 0;
ub = 2;
dx = (ub-lb)/(n-1);
t = 1:n;
x = t*dx - dx;
u0 = x.^4+3*x.^3-2*x.^2+x-2;
figure;
hold on;
plot(x,u0);
drawnow;
u = u0;
ntemp = n;
ntempstart = 1;
for i = 1:4
du_dx = dudx(u,dx,length(ntempstart:ntemp));
ntempstart = ntempstart+1;
ntemp = ntemp - 1;
plot(x(ntempstart:ntemp),du_dx);
drawnow;
u = du_dx;
end
function du_dx = dudx(u,dx,n)
du_dx = zeros(1,n-2);
for i = 2:n-1
du_dx(i-1) = (u(i+1) - u(i-1))/(2*dx);
end
end
This is a debugging process for a numerical solution of the KdV equation and I am asked to solve the derivatives in KdV using derivative definition such as the one defined here. I am trying to get to the bottom of this:
When I run the above code with n = 1001 or smaller number of increments, it produces the expected derivative results. But when I increase it to 10001 as shown, I get the following. I believe the problem is caused by the precision limitations. Is the only solution to not use such a high number of increments? I tried to round the values for the third and fourth derivative, but they do not help. Rounding for all cases makes it worse.
  2 Comments
Walter Roberson
Walter Roberson on 29 Mar 2020
x = t*dx - dx;
would be more precise as
x = (t-1)*dx;
Zhangxi Feng
Zhangxi Feng on 29 Mar 2020
Thanks for the tips. It didn't make a difference though...

Sign in to comment.

Accepted Answer

Walter Roberson
Walter Roberson on 29 Mar 2020
I worked through symbolically.
Suppose that you create your x vector, starting at 0, and that each entry is theoretically perfect accuracy, except that in the last entry there is a noise with magnitude N. And then create u0 perfect according to those x values, with no extra noise introduced in the calculation (for simplification.) Then run through your dudx 4 times, each time with perfect accuracy, with no extra noise introduced into that calculation (for simplification.)
Now, under this simplified model of precision, if every x was perfect except for noise in the last one, then all entries will be exactly 24 (the 4th derivative of the function), except that the last entry would be
39062500000000*N^4 + 117781250000000*N^3 - 76785678125000*N^2 + 38473835136250*N + 24
solve for that equalling 26 (2 higher than expected) and the two real solutions are N = -3.631231264177921 (a huge value compared to your range, so irrelevant), and also N = 5.19833802093675e-14 .
That is, without even one other error, an error on the order of 234*eps in representing one of the x entries would suffice to explain an error of magnitude 2.
You do not actually get a difference that large in representing x entries; the maximum representation error I find is 2.26307861339592e-16 but since there noise in adjacent entries, assuming the rest of the calculations are perfect, that leads to 4th derivatives that are off by about +/- 1.58, even without taking into account round-off noise in calculating u0 or the differences. When all of the errors combine, the actual range is about +/- 4.2
What can you do about this? Well, you can work with the Symbolic Toolbox. Or you can use techniques such as compensated summation https://en.wikipedia.org/wiki/Kahan_summation_algorithm -- but you have to do it consistently, including when calculating u0. Even then you face the problem that floating point numbers cannot exactly represent multiples of 1/5000...
  4 Comments
Zhangxi Feng
Zhangxi Feng on 29 Mar 2020
The Fortran code and Matlab code produced the same issues so I decided to debug in Matlab. What you said makes sense, however, as after I fixed a few bugs, the Fortran code seems to cause more value loss at the boundaries, leading to derivatives above the fourth one to have issues. Thanks!
Les Beckham
Les Beckham on 29 Mar 2020
Edited: Les Beckham on 29 Mar 2020
Walter's last note in his answer ("Even then you face the problem that floating point numbers cannot exactly represent multiples of 1/5000...") got me thinking, so I started playing around with this.
I found that you can get some fascinating behavior by using a dx that is a fraction of a power of 2.
For example, changing to n = 8193 (dx = 1/4096), I get this:
Using n = 16385 (dx = 1/8192):
It appears that a dx of 1/4096 is about as good as it gets. Note that there is a bit of a 'blip' in the first plot at the same point that the second one goes in to oscillation (apparently a limit cycle due to the non-linearity of the round-off errors). And the second plot has a similar small blip at about half the number of samples.
Decreasing the dx further just gets worse. Here is n = 32769 (dx = 1/16384). It is interesting that using fractions of powers of two for dx seems to cause this kind of rapid transition into limit cycles rather than the slow build-up in errors that other values exhibit (such as your original plot).
I don't know if this helps but I thought it was interesting. It does indicate that judicious choice of the step size may allow you to get cleaner results than you would get with arbitrary step sizes.
Regards,
Les
BTW - you may have noticed from the appearance of the figure screenshots that my experimenting was done in Octave, but it should be using the same IEEE floating point math. So, YMMV.

Sign in to comment.

More Answers (0)

Categories

Find more on Fortran with MATLAB in Help Center and File Exchange

Products


Release

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!