My old college buddy Mark VandeWettering tweeted a challenge right before Pi Day: he was going to write code to calculate a value for pi in a novel way. What were the rest of us slobs going to do?
I had a pretty good idea how he was going to do it, as he’d recently posted something really cool about a way to use the Mandelbrot set to find pi. (Sure enough that’s what he did.)
I decided to play around with using MATLAB (my current favorite programming environment) to see how many interesting ways I could calculate pi. I stayed up all night, had fun, learned a lot, and came up with a couple of things worthy of posting. (At least by the standards of this blog. Mark doesn’t have anything to worry about.)
To warm up, I wrote a Monte Carlo simulation, with a small wrinkle: I got a load of random numbers from random.org and used those, then compared it to MATLAB’s
rand function. No noticeable difference.
%% Happy Pi Day 2011!
% This program uses a Monte Carlo technique to estimate the value of pi.
% Using a list of random numbers between 0 and 1, it uses every possible
% pair of numbers to create [X,Y] points. These points are then tested to
% see whether they lie within the unit circle. The ratio of points in the
% circle to total points should be approximately pi/4.
%% ***** Get 10000 random numbers
% Downloaded from random.org - numbers evenly distributed between 0 and 1
% randnums = textread('randoms.txt');
% rcount = length(randnums);
randnums = rand([10000,1]);
rcount = length(randnums);
%% ***** Calculate pi
incircle = 0;
trials = 0;
for ii = 1:rcount-1
for jj = ii:rcount
pt = [randnums(ii), randnums(jj)];
len = norm(pt); % Calculate the distance from the origin
if len < 1 % It's inside the circle
incircle = incircle + 1;
if len ~= 1 % Don't count it if it's ON the circle
trials = trials + 1;
fprintf('Happy Pi Day 2011! For %d trials, Pi is about %f\n', ...
trials, incircle/trials * 4)
Happy Pi Day 2011! For 50004999 trials, Pi is about 3.152321
Elapsed time is 73.081508 seconds.
I noticed that this technique converges excruciatingly slowly, so I figured anything I could come up with would have to be faster.
I looked through the Wikipedia articles about estimating pi and was impressed with Ramanujan’s summation formulae, but I didn’t understand where they come from so it seemed like cheating.
Now, I was able to follow the derivation of Machin’s arctan formula, but doing that with a computer seemed… trivial.
Should I try to do something similar with the arcsin function? Naaah, it was getting late and my talents seem to lie mostly with computer algorithms rather than number theory. Let’s try some naïve brute-force code.
And hey, instead of adding up the area inside the circle like in the Monte Carlo simulation, let’s see if we can figure out how long the arc is. That should take a lot fewer points.
Here’s a brute-force arc-drawing routine:
function result = arcwalk(r)
% ARCWALK(R) - generate pixels for quarter-circle arc of radius R
% preallocate array of points
pts = zeros(2*r+1, 2);
x = r;
y = 0;
idx = 1;
%% Walk from [r,0] to [0,r], moving up or left
while x >= 0
pts(idx,[1 2]) = [x y];
up = norm([x, y+1]);
left = norm([x-1, y]);
if abs(r - up) < abs(r - left)
y = y + 1;
x = x - 1;
idx = idx + 1;
result = pts;
arcwalk(10) produces these points:
The arcwalk function's output for radius=10
Kewl, now all I have to do is count up the number of pixels it took, assume that’s approximately how long the arc is, and Bob’s your uncle:
maxpow = 5;
for r = 10 .^ [1:maxpow]
pts = arcwalk(r);
s = length(pts);
fprintf('For radius %d, pi is about %f\n', r, 2*s/r)
For radius 10, pi is about 4.200000
For radius 100, pi is about 4.020000
For radius 1000, pi is about 4.002000
For radius 10000, pi is about 4.000200
For radius 100000, pi is about 4.000020
Elapsed time is 1.026044 seconds.
Wha… oh. Maybe not quite that naïve. Bonus points if you figure out what’s going on. Hint: I had already figured this out in the
arcwalk function, right there on line 5.
What if I simplify that arc? Hey, I can use one of my favorite algorithms, the recursive Ramer-Douglas-Peukert algorithm! I used it last summer while (ahem!) interning at NASA. Here’s my implementation:
%% DPSIMP - quick implementation of the Douglas-Peucker algorithm
% PS = DPSIMP(P, E) - simplify polyline P with epsilon E
% DPSIMP uses the Ramer-Douglas-Peucker algorithm to simplify a polyline
% made up of an ordered list of X,Y points. For more information, see
% P is an N-by-2 matrix containing an X,Y point in each of the N rows.
% E is the value of epsilon (the maximum distance from the simplified
% polyline to the given polyline).
% PS is an N-by-2 matrix containing the X,Y points of the simplified
% by Doug Weathers, email@example.com
function result = dpsimp(p, epsilon)
% pldist - find distance from point to line
% The point is p0, and the line runs through p1 and p2. The points are row
% vectors [x y].
function result = pldist(p0, p1, p2)
% First, put the line in the form Ax + By + C = 0
A = p1(2) - p2(2);
B = p2(1) - p1(1);
C = -A * p1(1) - B * p1(2);
% Then, calculate the orthogonal distance with this simple formula:
result = abs(A * p0(1) + B * p0(2) + C) / sqrt(A^2 + B^2);
% Find point with maximum distance from the line that runs between the first
% and last points.
dmax = 0;
index = 0;
[rows, cols] = size(p);
% Draw a line between the endpoints. Delete it later if it's not
% simplified enough. This lets you watch the recursion in action.
% h = line([p(1,1) p(end,1)], [p(1,2) p(end,2)], 'LineStyle','-', 'Marker','x');
for i = 2:rows-1
d = pldist(p(i,:), p(1,:), p(end,:));
if d > dmax
index = i;
dmax = d;
% Is there a point farther away than epsilon? If so, we need to simplify
if dmax >= epsilon
% Yes, there's a point far enough away from the line to be included in
% the simplified version.
% Split the line into two segments and call this routine recursively on
% each segment.
recresult1 = dpsimp(p(1:index,:), epsilon);
recresult2 = dpsimp(p(index:end,:), epsilon);
% Connect the simplified polylines back together! Only keep one copy
% of the point at the index.
result = [ recresult1(1:end-1,:); recresult2 ];
% The line we drew earlier needs to be erased
% None of the interior points of the polyline are far enough away from
% the line joining the endpoints to worry about.
% We're done! We can skip all the intervening points!
result = [ p(1,:); p(end,:) ];
Here’s how it looks when applied to the
arcwalk(10) pixels, with an epsilon of 1:
Quarter-arc, arcwalk pixels, and the DP simplified polyline
Now, how to pick a good value for
epsilon? I played with a few values and settled on the square root of 2. It might be worth trying to optimize this some day.
Let’s try to calculate pi with this technique. (Observe how my comments are getting briefer as time passes.)
maxpow = 5;
epsilon = sqrt(2);
for r = 10 .^ (1:maxpow)
pts = arcwalk(r); % make the arc
spts = dpsimp(pts, epsilon); % simplify it
path = spts(2:end,:) - spts(1:end-1,:); % vector subtraction
s = sum(hypot(path(:,1), path(:,2))); % add up the lengths
fprintf('For radius %d, pi is about %f\n', r, 2*s/r)
For radius 10, pi is about 3.164823
For radius 100, pi is about 3.152301
For radius 1000, pi is about 3.142237
For radius 10000, pi is about 3.141706
For radius 100000, pi is about 3.141607
Elapsed time is 42.576056 seconds.
Reasonably accurate, but pretty slow. I suspect the problem is with recursion in MATLAB – it’s really inefficient.
Let’s see if we can speed it up and get a better answer by using some calculus. I looked up the formula for the length of the curve of a function, plugged the equation for our quarter-circle into it, simplified it, and ended up with the following code:
pts = 1000000;
x = linspace(0, sqrt(2)/2, pts);
dx = x(2)-x(1);
s = sum(sqrt(1 ./ (1 - x.^2))) .* dx;
fprintf('\nFor %d points, pi is about %f\n', pts, 4*s)
For 1000000 points, pi is about 3.141596
Elapsed time is 0.078690 seconds.
Stupidly fast! A million points, forsooth. I calculated only one-eighth of the circle instead of one quarter, but the big time-saver was the opportunity to vectorize the code. In MATLAB you can perform element-by-element operations on arrays by preceding the operator with a dot. It’s worth doing. No loops – I made an array of all the x-values, then calculated a y-value for each x in one operation.
After this I went to bed and tried to ignore the birds greeting the sun. It wasn’t too hard.
Note: during the writing of this post, I got to wondering about my naïve arc-drawing algorithm. Is there a better one available? Of course there is, and it’s got some serious potential of speeding up my brute-force code. More later – right now I need to see if I can get to sleep before the birds wake up.