Fun to Program – Prime numbers

Date: 2013/08/05 (initial publish), 2021/08/02 (last update)

Source: en/fun2-00005.md

Previous Post Top Next Post

TOC

This was originally written and created around 2013 and may require to be updated. (2021)

Prime numbers

A prime number (or a prime) is a natural number greater than 1 that has no positive divisors other than 1 and itself.

Let’s check simple code snippets to obtain prime numbers via the same trial division algorithm implemented in different languages to study the following:

Please note this algorithm to obtain prime numbers is not the best one.

Shell

Here is a shell program to search the prime numbers.

Source code for the Shell script: prime

#!/bin/sh
N_MAX=$1
P="" # list containing all primes found
for N in $(seq 2 $N_MAX); do
    FLAG_PRIME=1 # True
    # search for all pimes found
    for I in $P; do
        N_DIV_I=$(($N / $I))
        N_MOD_I=$(($N % $I))
        if [ $N_MOD_I = 0 ]; then
            FLAG_PRIME=0 # False
            break # found not to be prime
        fi
        if [ $N_DIV_I -lt $I ]; then
            break # no use doing more i-loop if N < I*I
        fi
    done
    if [ $FLAG_PRIME = 1 ] ; then
        P="$P $N"
    fi
done
# space separated to line break separated
echo "$P" |xargs -n1 echo

Let’s check time to compute prime numbers up to some numbers.

Behavior of the shell script

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 23.91
user 19.38
sys 0.48

Python

Here is a Python program for the prime numbers.

Source code for the Python script: prime

#!/usr/bin/python3
import sys
n_max = int(sys.argv[1])
p = [] # list containing all primes found
for n in range(2, n_max):
    flag_prime = True
    # search for all pimes found
    for i in p:
        (n_div_i, n_mod_i) = divmod(n,i)
        if n_mod_i == 0:
            flag_prime = False
            break # found not to be prime
        if n_div_i < i:
            break # no loop over i if n < i*i
    if flag_prime:
        p.append(n)
for i in p:
    print(i)

Behavior of the Perl script

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.16
user 0.16
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 4.86
user 4.86
sys 0.00

Lua

Here is a Lua program for the prime numbers with BASIC style “for”.

Source code for the Lua script: prime

#!/usr/bin/lua
n_max = tonumber(arg[1])
True = 1
False = 0
p = {}; -- list containing all primes found
for n = 2, n_max do
    flag_prime = True
    -- search for all pimes found
    for ii = 1, (#p) do
        i = p[ii]
        n_div_i = math.floor(n / i)
        n_mod_i = n % i
        if n_mod_i == 0 then
            flag_prime = False
            break -- found not to be prime
        end
        if n_div_i < i then
            break -- no use doing more i-loop if n < i*i
        end
    end
    if flag_prime == True then
        table.insert(p,n)
    end
end
for ii = 1, (#p) do
    print(p[ii])
end

Here is an alternative Lua program with “for ... in ” and a iterator function.

Source code for the Lua script with “for ... in ”: prime0

#!/usr/bin/lua
n_max = tonumber(arg[1])
True = 1
False = 0
function iter(a) -- ordered iterator
  local i = 0
  return function()
    i = i + 1
    return a[i]
  end
end
p = {}; -- list containing all primes found
for n = 2, n_max do
    flag_prime = True
    -- search for all pimes found
    for i in iter(p) do
        n_div_i = math.floor(n / i)
        n_mod_i = n % i
        if n_mod_i == 0 then
            flag_prime = False
            break -- found not to be prime
        end
        if n_div_i < i then
            break -- no use doing more i-loop if n < i*i
        end
    end
    if flag_prime == True then
        table.insert(p,n)
    end
end
for i in iter(p) do
    print(i)
end

Behavior of the Lua script

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.16
user 0.16
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 5.23
user 5.22
sys 0.00
$ /usr/bin/time -p ./prime0 "$(echo 2^20 | bc)">/dev/null
real 5.11
user 5.10
sys 0.01

Perl

Here is a Perl program for the prime numbers.

Source code for the Perl script: prime

#!/usr/bin/perl -w
use constant false => 0;
use constant true  => 1;
$n_max = "$ARGV[0]" + 0;
@p = (); # list containing all primes found
for ($n = 2; $n < $n_max; $n++) {
    $flag_prime = true;
    # search for all pimes found
    foreach $i (@p) {
        $n_div_i = int($n / $i);
        $n_mod_i = $n % $i;
        if ($n_mod_i == 0) {
            # found to be non-prime
            $flag_prime = false;
            last; # found not to be prime
        }
        if ($n_div_i < $i) {
            last; # no use doing more i-loop if n < i*i
        }
    }
    if ($flag_prime == true) {
        push (@p, $n);
    }
}
foreach $i (@p) {
    print "$i\n";
}

Behavior of the Perl script

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.16
user 0.16
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 4.86
user 4.86
sys 0.00

C

C with the array

Here is a C program with the array for the prime numbers.

Source code for the C program with the array: prime.c

#include <stdlib.h>
#include <stdio.h>

#define ARRAY_SIZE 100000000
#define TRUE 1
#define FALSE 0

int main (int argc, char ** argv) {
    long n_max = atol (argv[1]);
    long j = 1;
    long p[ARRAY_SIZE];
    long n, k, i, n_div_i, n_mod_i;
    int flag_prime;
    for (n = 2; n < n_max; n++) {
        flag_prime = TRUE;
        /* search for all pimes found */
        for (k = 1; k < j; k++) {
            i = p[k];
            n_div_i = n / i;
            n_mod_i = n % i;
            if (n_mod_i == 0) {
                flag_prime = FALSE;
                break; /* found not to be prime */
            }
            if (n_div_i < i) {
                break; /* no use doing more i-loop if n < i*i */
            }
        }
        if (flag_prime == TRUE) {
            p[j]=n;
            j++;
            if (j>ARRAY_SIZE) {
                printf ("E: Overflow array after %ld\n", p[j-1]);
                return EXIT_FAILURE;
            }
        }
    }
    for (k=1; k < j; k++) {
        printf ("%ld\n", p[k]);
    }
    return EXIT_SUCCESS;
}

Let’s compile this into a binary executable.

Compilation of the C program with the array

$ gcc -O9 -Wall -o prime prime.c

Behavior of the C program with the array

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.00
user 0.00
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 0.16
user 0.16
sys 0.00

C with the list

Here is a C program with the list for the prime numbers.

Source code for the C program with the list: prime.c

#include <stdlib.h>
#include <stdio.h>
#define TRUE 1
#define FALSE 0

struct _primelist {
    long prime; 
    struct _primelist *next;
    };
typedef struct _primelist primelist;

primelist *head=NULL, *tail=NULL;

int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    for (p = head; p != NULL ; p = p->next) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
    }
    return flag;
}

int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max;
    n_max = atol(argv[1]);
    for (n = 2; n <= n_max; n++) {
        if (checkprime(n)) {
            q= calloc(1, sizeof(primelist));
            q->prime = n;
            if (head == NULL) {
                head = q;
            } else {
                tail->next=q;
            }
            tail = q;
        }
    }
    for (p = head; p != NULL; p = p->next) {
        printf ("%ld\n", p->prime);
    }
    for (p = head; p != NULL; q = p->next, free(p), p = q) {}
    return EXIT_SUCCESS;
}

TIP: It is a good practice to free memory properly to avoid common errors in C dynamic memory allocation even when OS may take care.

Let’s compile this into a binary executable.

Compilation of the C program with the list

$ gcc -O9 -Wall -o prime prime.c

Behavior of the C program with the list

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.00
user 0.00
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 0.17
user 0.17
sys 0.00

C with the list (variants)

Since this is very good example to learn C with the dynamic memory allocation, I made a combination code of several styles.

Source code for the C program with the list: prime-all.c

#include <stdlib.h>
#include <stdio.h>
#define TRUE 1
#define FALSE 0

#if STYLE==1
typedef struct _primelist {
    long prime; 
    struct _primelist *next;
    } primelist;
#elif STYLE==2
typedef struct _primelist primelist;
struct _primelist {
    long prime; 
    primelist *next;
    };
#else /* default */
struct _primelist {
    long prime; 
    struct _primelist *next;
    };
typedef struct _primelist primelist;
#endif

primelist *head=NULL, *tail=NULL;

int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
#if STYLE == 4
    p = head;
    while(p) {
#else /* default */
    for (p = head; p != NULL ; p = p->next) {
#endif
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
#if STYLE == 4
        p = p->next;
#endif
    }
    return flag;
}

int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max;
    n_max = atol(argv[1]);
#if STYLE == 4
    n = 1;
    while(n <= n_max) {
        n++;
#else /* default */
    for (n = 2; n <= n_max; n++) {
#endif
        if (checkprime(n)) {
#if STYLE == 3
            q = malloc(sizeof(primelist));
            q->next = NULL;
#else /* default */
            q= calloc(1, sizeof(primelist));
#endif
            q->prime = n;
            if (head == NULL) {
                head = q;
            } else {
                tail->next=q;
            }
            tail = q;
        }
    }
#if STYLE == 4
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        free(p);
        p = q;
    }
#else /* default */
    for (p = head; p != NULL; p = p->next) {
        printf ("%ld\n", p->prime);
    }
    for (p = head; p != NULL; q = p->next, free(p), p = q) {}
#endif
    return EXIT_SUCCESS;
}

This code uses CPP macro to change its style.

Here, I used the _primelist tag for struct to distinguish from the primelist identifier for typedef. The _primelist tags can be replaced with primelist without problem in C since struct tags and typedefs are in separate namespaces in C (But C++ is a different story …).

The first 3 memory allocation styles generate exactly the same code. The difference between calloc and malloc codes is because only calloc sets the allocated memory to zero.

The “-D NAME=DEFINITION” compiler option is processed effectively as “#define NAME DEFINITION” at the top of the source. (See more in the info page for CPP.)

Let’s compile this into 5 binaries using this “-D NAME=DEFINITION” compiler option.

Compilation of the C program with the list in several styles

$ gcc -O9 -Wall -D'STYLE=0' -o prime0 prime-all.c
$ gcc -O9 -Wall -D'STYLE=1' -o prime1 prime-all.c
$ gcc -O9 -Wall -D'STYLE=2' -o prime2 prime-all.c
$ gcc -O9 -Wall -D'STYLE=3' -o prime3 prime-all.c
$ gcc -O9 -Wall -D'STYLE=4' -o prime4 prime-all.c

All these generated binary executable programs work the same (i.e., the same execution speed).

The prime-all.c code can be simplified to prime.c by removing preprocessor conditionals with unifdef(1).

Removal of preprocessor conditionals from prime-all.c

$ unifdef -D'STYLE=0' -o prime.c prime-all.c
$ unifdef -D'STYLE=1' -o prime1.c prime-all.c
$ unifdef -D'STYLE=2' -o prime2.c prime-all.c
$ unifdef -D'STYLE=3' -o prime3.c prime-all.c
$ unifdef -D'STYLE=4' -o prime4.c prime-all.c

Vala (Glib)

Here is a Vala program for the prime numbers using only Glib.

(This is a bad program example. I know this is problematic only after making this program and analyzing the problem I faced. See Benchmark.)

Source code for the Vala program: prime.vala

int main (string[] args) {
    long n_max = long.parse(args[1]);
    var p = new List<long> ();
    for (long n = 2; n < n_max; n++) {
        bool flag_prime = true;
        // search for all pimes found
        foreach (long i in p) {
            long n_div_i = n / i;
            long n_mod_i = n % i;
            if (n_mod_i == 0) {
                flag_prime = false;
                break; // found not to be prime
            }
            if (n_div_i < i) {
                break; // no use doing more i-loop if n < i*i
            }
        }
        if (flag_prime == true) {
            p.append(n);
        }
    }
    foreach (long i in p) {
        stdout.printf ("%lld\n", i);
    }
    return 0;
}

Behavior of the Vala program

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.07
user 0.06
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 9.45
user 9.44
sys 0.00

Havoc Pennington said “g_list_last() and g_list_append() are defined to be O(n) operations; they are not fast for long lists. GList is modeled on the lists found in Scheme, Haskell, etc. which behave the same way.” I should have known …

Vala (Gee)

Here is a Vala program for the prime numbers using Gee.

(This is an improved program example over previous Vala (GLib) one. See Benchmark.)

ArrayList<G> provided by Gee is used instead of List<G> provided by Glib. This type of list structure is very fast for adding data at the end of the list.

Source code for the Vala program: prime.vala

using Gee;
int main (string[] args) {
    long n_max = long.parse(args[1]);
    var p = new ArrayList<long> ();
    for (long n = 2; n < n_max; n++) {
        bool flag_prime = true;
        // search for all pimes found
        foreach (long i in p) {
            long n_div_i = n / i;
            long n_mod_i = n % i;
            if (n_mod_i == 0) {
                flag_prime = false;
                break; // found not to be prime
            }
            if (n_div_i < i) {
                break; // no use doing more i-loop if n < i*i
            }
        }
        if (flag_prime == true) {
            p.add(n);
        }
    }
    foreach (long i in p) {
        stdout.printf ("%lld\n", i);
    }
    return 0;
}

Behavior of the Vala program

$ /usr/bin/time -p ./prime "$(echo 2^16 | bc)">/dev/null
real 0.02
user 0.02
sys 0.00
$ /usr/bin/time -p ./prime "$(echo 2^20 | bc)">/dev/null
real 0.57
user 0.57
sys 0.00

Benchmark

Here are benchmark results of various program languages.

Speed benchmark of various program languages

Program real(2^16) user(2^16) sys(2^16) real(2^20) user(2^20) sys(2^20)
Shell 23.91 19.38 0.48
Python 0.19 0.19 0.00 5.58 5.58 0.00
Lua 0.16 0.16 0.00 5.23 5.22 0.00
Lua (for … in) 5.11 5.10 0.01
Perl 0.16 0.16 0.00 4.86 4.86 0.00
C (array) 0.00 0.00 0.00 0.16 0.16 0.00
C (list) 0.00 0.00 0.00 0.17 0.17 0.00
Vala (glib) 0.07 0.06 0.00 9.45 9.44 0.00
Vala (gee) 0.02 0.02 0.00 0.57 0.57 0.00

Here, the time reported by the /usr/bin/time -p command is in POSIX standard 1003.2 style:

Please disregard minor differences such as 20% since this benchmark is not meant to be very accurate.

The speed differences are several orders of magnitudes.

A code written in any compiler language is faster than one written in any interpreter in normal situation.

But looking at real (2^20) case, the Vala program with GLib result is very odd. For now, let me just say “The root cause is the way GLib is used. There are many operations to add an item at the end of the long list. This is slow o(n) operation.”

This will be analyzed in Profiling prime.

TIP: The use of the compiler does not guarantee the faster code. Algorithm is important.

Previous Post Top Next Post