Today almost every computer, every tablet and every smartphone has a multi-core processor installed. Unfortunately, software does not necessarily automatically use all cores of the CPU (multithreading). So the programmer has to do some preliminary work so that his software can take full advantage of the system and run in multiple threads.

### The Leibniz formula

In this tutorial we want to calculate an approximation of PI. For this we use the so-called Leibniz formula, which calculates PI by the following sum:

The formula is basically easy to implement with a simple for-loop. For the sum of k = 0 to k = 1000, you could calculate an approximation of PI with the following loop:

double sum = 0; for (int k = 0; k <= 1000; k++) { sum += Math.Pow(-1, k) / (2 * k + 1); } double pi = sum * 4;

The result is: pi = 3,14259165433954.

We see that the result is still pretty inaccurate – even though we’ve already run k up to 1000. For this reason, we need a significantly larger k and thus significantly more loop passes, which is very time consuming. At the moment everything is running on a single core of the CPU. So to shorten our computation time, we would like to compute PI under full CPU utilization in multiple threads on all cores. At the same time we want to be able to calculate PI in a precision that we can define. For this we need an unlimited number of decimal places as well as an unlimited maximum k (= number of loop passes).

Unfortunately, C # does not give us an option to work with an infinite number of decimal places, so we have to do some tricks and use the class BigInteger (System.Numerics), which contains at least infinitely large numbers (limited by RAM) from the whole infinite series of integer numbers (…; -3, -2, -1, 0, 1, 2, 3, …).

### Implementation in C#

Let’s start with a class called PICalculator and some necessary variables:

class PICalculator { BigInteger f; BigInteger[] i; BigInteger[] sum; BigInteger depth; bool[] finish; }

The **variable f** stands for our factor. Since we do not count on floating-point numbers, but on integers, we simply multiply the numerator of our sum by a factor f corresponding to one power of ten. At the end we shift the comma by the amount of power to the front.

Example with f = 10^4 = 10000:

After shifting the coma, the result would be: pi = 3,2849

The **variable i** is our k. Since we work in separate threads, the array will be the size of the number of threads. So for each thread i can be increased separately.

The same applies to the **variable sum**, which stores the sum for each thread.

The **variable depth** corresponds to our maximum k – the total number of loop passes.

The **variable finish** stores whether the respective thread has finished its loop.

#### Example: Calculating in 4 threads with f = 10000 and depth = 1000

Result: pi = (10000 + sum[0] + sum[1] + sum[2] + sum[3]) * 4 = 31428

After the division by f or shifting the comma, we get pi = 3,1428.

So we divide the calculation of the sum into 4 equal loops. Each thread has (depth / number of threads) loop iterations.

### The constructor

public PICalculator(int precision, BigInteger searchdepth, int threads) { f = BigInteger.Pow(10, precision); sum = new BigInteger[threads]; i = new BigInteger[threads]; finish = new bool[threads]; depth = searchdepth; for (int p = 0; p < threads; p++) { i[p] = depth / threads * p + 1; sum[p] = 0; finish[p] = false; } Task oTask = new Task(outputTask); oTask.Start(); }

The constructor takes three parameters: the precision, which is the power of ten of our f, the maximum total number of loop passes, and the number of threads.

Furthermore, the variables are initialized and the arrays filled. Each i is set to the start value depending on the number of threads.

**Example for depth = 1000 and threads = 4:**

i[0] = 1 i[1] = 251 i[2] = 501 i[3] = 701

The task oTask (outputTask) is only used to run a task in the background, which makes it possible to display the progress of the individual threads in percent when a button is pressed. The implementation can be seen in the full code example at the end of this tutorial and does not matter at first.

### The calculation of a loop step in a thread

public bool performSteps(int thread) { if (i[thread] > (depth / i.Length * (thread+1))) { finish[thread] = true; return false; } if (BigInteger.Remainder(i[thread], 2) == 0) { sum[thread] += f / (i[thread] * 2 + 1); } else { sum[thread] -= f / (i[thread] * 2 + 1); } i[thread]++; return true; }

The performSteps (int thread) method calculates a single loop pass and increments counter i by 1 in the corresponding thread.

At first we check if the maximum height of i for this thread has already been reached by comparing the current i with the start value of the next thread (thread + 1). When the end is reached, finish for this thread is set to true and no further computation is performed.

if (BigInteger.Remainder(i[thread], 2) == 0)

Here it is checked whether the current i is an even or an odd number. According to the Leibniz formula we subtract at an odd i and add at an even i. The function BigInteger.Remainder corresponds in principle to a modulo operation.

### Addition of all individual results of the threads

public BigInteger getResult() { BigInteger pi = f; foreach (BigInteger s in sum) { pi += s; } pi *= 4; return pi; }

The method getResult () takes all sum of the threads, multiplies the result by 4 and returns the result as an integer (without comma shift).

#### The main class and void Main ()

class Program { static void Main(string[] args) { int numThreads = 25; int precision = 20; BigInteger depth = 1000000000; // One Billion = 10^9 PICalculator pi = new PICalculator(precision, depth, numThreads); Console.WriteLine("Calculation started with " + numThreads.ToString() + " threads, press space to see progress."); Console.WriteLine(); Parallel.For(0, numThreads, i => { bool unfinished = true; while (unfinished == true) { unfinished = pi.performSteps(i); } }); StringBuilder sb = new StringBuilder(); sb.Append(pi.getResult().ToString()); sb.Insert(1,","); string piString = sb.ToString(); Console.WriteLine("Final Result: " + piString); ConsoleKeyInfo key = Console.ReadKey(); while (key.Key != ConsoleKey.Escape) key = Console.ReadKey(); } }

In our Main method, we first define in variables how many threads we want to work with, how big our f is and how many maximum loop passes we want to have.

In this example, we want to have 25 threads, of which the finished program always uses up to the maximum possible number. The precision of 20 corresponds to a f of 10 ^ 20 = 1000000000000000000. In total, a billion loop passes should be made. The calculation will take some time, even with multithreading. It can be assumed that in a quad-core CPU with activated hyperthreading in this case, the calculation takes only about one-eighth of the time, as if the calculation without multithreading (as in the first example at the top) would perform.

### Parallel.For (…)

Parallel.For(0, numThreads, i => { bool unfinished = true; while (unfinished == true) { unfinished = pi.performSteps(i); } });

Let us now come to the most important part. C # provides us with the Parallel.For function (from System.Threading.Tasks), which is similar to a for-loop, but it is automatically distributed to a maximum number of threads. That means, this Parallel.For loop traverses i from 0 to numThreads and starts a new thread for each i. In each of these threads, we call the above implemented method performSteps in a while-loop. As parameters, we pass the i, which corresponds to the current thread. The while loop runs until the performSteps method has reached the end of its computations and returns false.

Finally, in our example, we have 25 while-loops, which in principle should run in parallel. But of the fact that there are less CPUs that support 25 threads running in parallel, some threads only run in the moments where other threads are currently performing no calculation.

In the last part of the Main Method, we just put the comma after the first position of the result and output it. Then we wait for the user to press the escape key to exit the program.

### The whole Code

In the following you can study, copy and test the complete code. It also contains two methods that have not been explicitly discussed here. This involves determining the progress of the calculation in the individual thread and the output in percent.

using System; using System.Text; using System.Threading.Tasks; using System.Numerics; namespace CalculatePI { class PICalculator { BigInteger f; BigInteger[] i; BigInteger[] sum; BigInteger depth; bool[] finish; public PICalculator(int precision, BigInteger searchdepth, int threads) { f = BigInteger.Pow(10, precision); sum = new BigInteger[threads]; i = new BigInteger[threads]; finish = new bool[threads]; depth = searchdepth; for (int p = 0; p < threads; p++) { i[p] = depth / threads * p + 1; sum[p] = 0; finish[p] = false; } Task oTask = new Task(outputTask); oTask.Start(); } void outputTask() { bool continueThread = false; while (continueThread == false) { continueThread = true; for (int p = 0; p < finish.Length; p++) { continueThread = continueThread && finish[p]; } if (Console.KeyAvailable) { ConsoleKeyInfo key = Console.ReadKey(); if (key.Key == ConsoleKey.Spacebar) { Console.Clear(); Console.CursorLeft = 0; float[] pos = getPositions(); for (int p = 0; p < pos.Length; p++) { Console.WriteLine("Thread " + p + ": " + " {0:0.0000000000} %", pos[p]); } StringBuilder sb = new StringBuilder(); sb.Append(getResult().ToString()); sb.Insert(1, ","); string piString = sb.ToString(); Console.WriteLine("Interim result: " + piString); Console.WriteLine(); } } } } public bool performSteps(int thread) { if (i[thread] > (depth / i.Length * (thread+1))) { finish[thread] = true; return false; } if (BigInteger.Remainder(i[thread], 2) == 0) { sum[thread] += f / (i[thread] * 2 + 1); } else { sum[thread] -= f / (i[thread] * 2 + 1); } i[thread]++; return true; } public float[] getPositions() { float[] positions = new float[i.Length]; for (int p = 0; p < positions.Length; p++) { BigInteger size = depth / i.Length; BigInteger cleanpos = i[p] - size * p; positions[p] = (float)(cleanpos * 1000000000000 / size)/10000000000f; } return positions; } public BigInteger getResult() { BigInteger pi = f; foreach (BigInteger s in sum) { pi += s; } pi *= 4; return pi; } } class Program { static void Main(string[] args) { int numThreads = 25; int precision = 20; BigInteger depth = 1000000000; // One Billion = 10^9 PICalculator pi = new PICalculator(precision, depth, numThreads); Console.WriteLine("Calculation started with " + numThreads.ToString() + " threads, press space to see progress."); Console.WriteLine(); Parallel.For(0, numThreads, i => { bool unfinished = true; while (unfinished == true) { unfinished = pi.performSteps(i); } }); StringBuilder sb = new StringBuilder(); sb.Append(pi.getResult().ToString()); sb.Insert(1,","); string piString = sb.ToString(); Console.WriteLine("Final Result: " + piString); ConsoleKeyInfo key = Console.ReadKey(); while (key.Key != ConsoleKey.Escape) key = Console.ReadKey(); } } }

The output after pressing the space bar after a few seconds. You can see that some threads have not or have barely begun their work:

Thread 0: 21,1294500000 % Thread 1: 0,0000025000 % Thread 2: 20,8750300000 % Thread 3: 0,0000025000 % Thread 4: 20,5217100000 % Thread 5: 0,0000025000 % Thread 6: 20,3607900000 % Thread 7: 0,0000025000 % Thread 8: 20,4376300000 % Thread 9: 0,0000025000 % Thread 10: 20,0706800000 % Thread 11: 0,0000025000 % Thread 12: 20,5839600000 % Thread 13: 0,0000025000 % Thread 14: 20,7066900000 % Thread 15: 0,0000025000 % Thread 16: 20,2433900000 % Thread 17: 0,0000025000 % Thread 18: 20,3013200000 % Thread 19: 0,0000025000 % Thread 20: 20,1487700000 % Thread 21: 0,0000025000 % Thread 22: 20,5862100000 % Thread 23: 0,0000025000 % Thread 24: 0,0000025000 % Interim result: 3,14159247739307800504

Thank you for this nice guide about C# multithreading with Parallel. This really helped me, can i share it?

Where do you want to share it?

how you would like to compute PI under full CPU utilization?

tnx i learned a lot