Click here to Skip to main content
15,881,882 members
Please Sign up or sign in to vote.
3.00/5 (2 votes)
See more:
I've written a small benchmark program as a small test of some languages I've been looking at. It doesn't say a lot about the language, but it gives me a bit of experience using the language and also gives a minor benchmark of performance. The same algorithm is used for all languages. I've used this with C++, Rust, Go, Crystal, "V", Java, Dart, JS. I'm really only interested in a compiled back-end language, but I included some others out of interest. The benchmark is to simply find the prime numbers up to a certain number. The algorithm is probably not the fastest, but it is the same algorithm for all programs.

To calculate the prime numbers up to 10 million on my i7 PC with almost all of these languages takes less than 13 seconds. Java takes about 14 seconds.

However, I also wrote this program using "C", and it took about 675 seconds and I don't know why. I'm not a "C" programmer, but I am experienced with a few languages. What I want to know is what I have done wrong in the program. It is extremely strange to me that it works as expected in "C" on Linux, but not on Win10.

I've posted the code here:

https://controlc.com/3179b32a
C++
/* CALCULATE PRIME NUMBERS */
/* THIS IS NOT DESIGNED TO BE SUPER-OPTIMAL, BUT AS A BENCHMARK. */

#include <stdio.h>
#include <stdlib.h>
#include "strings.h"
#include <sysinfoapi.h>
//
long fnCalcSqrt(long);
int main(void) {
    printf("\nCalculate number of prime integers from 2 to n million");
    long iMillions = 0;
    char sMillions[10];
    while (iMillions < 1 || iMillions > 100) {
        printf("\nEnter number of millions (1 to 100): ");
        fgets(sMillions, sizeof(sMillions), stdin);
        if (strlen(sMillions) < 2) {
           return 0;
        }
        iMillions = atoi(sMillions);
        if (iMillions < 1 || iMillions > 100) {
            printf("Must be from 1 to 100");
        }
    }
    long iEndVal = iMillions * 1000000;
    long iCurrent, iDiv, iSqrt, tfPrime, iPrimeTot = 0;
    printf("Started: calculating primes from 2 to %ld,000,000 .......\n", iMillions);
    long iElapsedMs = GetTickCount();
    for (iCurrent = 2; iCurrent <= iEndVal; iCurrent++) {
        if (iCurrent % 2 != 0 || iCurrent == 2) {
            iSqrt = fnCalcSqrt(iCurrent);
            tfPrime = 1;
            for (iDiv = 2; iDiv <= iSqrt; iDiv++) {
                if ((iCurrent % iDiv) == 0) {
                    if (iDiv != iCurrent) {
                        tfPrime = 0;
                    }
                    break;
                }
            }
            iPrimeTot += tfPrime;
        }
    }
  
    iElapsedMs = GetTickCount() -iElapsedMs;
    printf("Finished. prime total = %ld\n", iPrimeTot);
    printf("Elapsed = %ld.%ld seconds\n", iElapsedMs/1000, iElapsedMs % 1000);
    return 0;
}

long fnCalcSqrt(long iCurrent) /* CALC APPROXIMATE SQRT (NOT EXACT) */
{
    long iProd = 0, iPrevDiv = 0, iDiff = 0;
    long iDiv = iCurrent / 10;
    if (iDiv < 2) {
        iDiv = 2;
    }
    while (1) {
        iProd = iDiv * iDiv;
        if (iPrevDiv < iDiv) {
            iDiff = (iDiv - iPrevDiv) / 2;
        } else {
            iDiff = (iPrevDiv - iDiv) / 2;
        }

        iPrevDiv = iDiv;
        if (iProd < iCurrent) /* iDiv IS TOO LOW */ {
            if (iDiff < 1) {
                iDiff = 1;
            }
            iDiv += iDiff;
        } else {
            if (iDiff < 2) {
                return iDiv;
            }
            iDiv -= iDiff;
        }
    }
}


What I have tried:

I presumed that I had made a mistake, and I presume that I have, but I couldn't find where. So, I set up WSL2 on my PC and compiled and ran the program on Ubuntu and it took 12.5 seconds to run. This is the exact same program that I ran on Win10 that took 675 seconds - about 50 times longer. So, I presumed it must have been the compiler on Windows, so I tried different compilers on Windows - MS, GCC, Tiny C, Pelles C. The result was the same with all - 675 seconds or more. It was then that I coded it in JS in order to transpile to "C", however that did not work (did not compile), and if I modified it to work I would have ended up with essentially my own "C" program. Thinking that it may be my PC, I tried on another Win10 PC, but essentially the same problem.
Posted
Updated 24-Jul-20 4:30am
v3
Comments
Shao Voon Wong 24-Jul-20 23:55pm    
Remember to turn on the optimizer and build in Release mode in VC++. Debug mode has all sorts of checks which slows down the execution.

I have no doubt the problem is in the square root stuff, which is a relatively expensive operation, with a performance that very much depends on how it gets handled.

Now the crux here is you don't need any square root at all, rather than
for (iDiv = 2; iDiv <= iSqrt; iDiv++) {

do
for (iDiv = 2; iDiv * iDiv <= iCurrent; iDiv++) {

which basically replaces a square root by a multiplication!

This will speed up your slowest runs a lot, and probably the fastest ones a bit too.

When using some algorithm to get an idea on system performance, first of all make sure your algorithm is OK. When your measuring stick isn't any good, so won't be your measurements...

:)
 
Share this answer
 
v2
Comments
Richard MacCutchan 24-Jul-20 10:14am    
I would be interested to see which is faster: doing the multiplication for each iteration, or getting the square root once. As near as makes no difference the times are the same.
Luc Pattyn 24-Jul-20 10:21am    
A square root inside a simple comparison typically can be replaced by a single multiplication. Yes, this results in many more multiplications but these are very cheap, I've never seen the float/double approach be faster than the pure integer approach, but feel free to prove me wrong...

PS: most numbers aren't prime; the inner loop will most often end with a small divisor (2, 3, 5 or 7), long before the square root has been amortized...

PS2: and IF one needs sequential square roots, their values are known to be rising monotonically, more specifically each square root equals the previous one plus either zero or one, no need to calculate them from scratch each time...

Richard MacCutchan 24-Jul-20 10:34am    
All very true.
This code can be simplified:
C++
if (iPrevDiv < iDiv) {
    iDiff = (iDiv - iPrevDiv) / 2;
} else {
    iDiff = (iPrevDiv - iDiv) / 2;
}

to
C++
iDiff = abs(iDiv - iPrevDiv) / 2;

and it makes it faster.

When it comes to finding a factor as a way to tell if a number is a prime.
A simple thinking tells you that once you know that 2 is not a factor, no other even numbers (4, 6, 8, 10, 12 ...) will be. You can skip those.
It only double the speed of your algorithm.

Getting even faster:
- Take advantage of your knowledge: 2 and 3 are primes, only marginal optimization here.
- You are checking all numbers in sequence, imply that the next square root will be the same until next number reach next square.
Once n reach 16, sqrt=4 and will not change until n=25, and your can deduce that next sqrt will be 5, and same until n=36 ....
In fact, with a little code, one can completely avoid square roots.

Here is an article I wrote: Integer Factorization: Trial Division Algorithm[^]
 
Share this answer
 
v2
I corrected the link in your question and downloaded your code. The problem is caused by your square root calculation. I have changed the code as follows:
C++
#include <math.h>    // added this include file

// iSqrt = fnCalcSqrt(iCurrent); remove this
iSqrt = (int)sqrt((double)iCurrent); // use the math library sqrt function

// also changed the following block
if ((iCurrent % iDiv) == 0) {
    if (iDiv != iCurrent) {
        tfPrime = 0;
        break; // move the break statement here
    }
}

The results are:
Enter number of millions (1 to 100): 10
Started: calculating primes from 2 to 10,000,000 .......
Finished. prime total = 664579
Elapsed = 6.765 seconds


BTW, in future please include your code in the question rather than posting elsewhere.
 
Share this answer
 
Comments
Member 1045828 24-Jul-20 21:43pm    
I appreciate people taking the time to look at this. I know that I could use the sqrt function. The whole point of the exercise is not to make it faster by changing the algorithm, because the same code is used for all languages. This exact same code runs fast on Linux and is 50 times slower on Win10. If I move the break statement, it gives the wrong result. The whole point of the exercise is to use the same algorithm, not to make it a faster algorithm. I'm not using the sqrt function for the other languages, because part of the exercise is to calculate it. In addition the code on Linux is the exact same code and it doesn't use the sqrt function.

The whole point of the question is why does this "same code" run 50 times slower on Win10 than on Linux? In addition, it runs 50 times slower than Rust, Go, Dart, Java, JS on Win10. I am not a "C" programmer, so I am probably doing something wrong, but I don't know what. Using the sqrt function obviously makes it faster, but it doesn't solve the problem by answering the question as to the huge difference, and in addition the benchmark is to calculate it using that algorithm. The same compiler was used on Linux (GCC) which is one of the ones that I used on Win10. Maybe the compiler adds some smarts, but using a profiler, it appeared to me that it doesn't spend much time calculating the sqrt. However, I've never used a profiler before and I should probably check that, but it wouldn't solve the problem. I don't want to use the sqrt function because part of the exercise is to calculate it using that algorithm for all languages. Perhaps I introduced a bug, but if so why does it run fine on Linux - using the same (GCC) compiler?
Richard MacCutchan 25-Jul-20 4:16am    
We have not seen the Linux version of your code so cannot compare it. As I said above you need to show all the code that you are referring to.

OK, I just removed the 'Windows' parts from your code and ran it on my Ubuntu system (hosted in Windows) and it is as fast as you say. That is very strange, but impossible to see what is different since there are no system calls in the calculation parts of the code.
Richard MacCutchan 25-Jul-20 4:40am    
It appears that your square root calculation runs extremely fast in Linux, but slow on Windows. I have no idea why that should be, but I think it may be worth taking up with the compiler team at Microsoft.
Richard MacCutchan 25-Jul-20 10:19am    
OK, I give up. I have tried various modifications of the code but nothing improves the time more than half a second. I really think the compiler team at Microsoft need to to see this and figure out what is wrong.
Member 1045828 26-Jul-20 1:23am    
Thanks a lot Richard, I appreciate you taking the time to look at it. Yes, the exact code was used on Linux. It is extremely strange that all compilers on WIn10 give the same slow run-time. I'll attempt to progress the problem with MS or a compiler provider.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900