Fun to Program -- Library

This was originally written around 2013 and may need to be updated. (2021)

Library

Static and dynamic libraries

Compilation can be stopped at the object file stage with the -c option. You can bundle such object files into a single archive or shared object. This is called a library.

static library: libfoo.a
- simple archive of object files (*.o) created by “ar rcs libfoo.a *.o”
- *.a may be used just like a bunch of *.o files while linking.
dynamic library: libfoo.so
- all object files (*.o) are compiled with the gcc option -fPIC.
- shared object file created by “gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o”.
- associated symbolic links created by “ldconfig”.

TIP: In order to load a library file with the GCC -l option, its name must start with lib.

I do not go into details here, but the gccintro package provides a good tutorial, “Introduction to GCC by Brian J. Gough”, with examples.

GNU C Library

The Debian libc6:amd64 package offers the embedded GNU C library, which contains the standard libraries that are used by nearly all programs on the system.

Shared libraries offered by the libc6:amd64 package.

$ dpkg -L libc6:amd64|grep ^/lib/x86_64-linux-gnu/.*\.so$
/lib/x86_64-linux-gnu/libpthread-2.17.so
/lib/x86_64-linux-gnu/ld-2.17.so
/lib/x86_64-linux-gnu/libanl-2.17.so
/lib/x86_64-linux-gnu/libBrokenLocale-2.17.so
/lib/x86_64-linux-gnu/libc-2.17.so
/lib/x86_64-linux-gnu/libcidn-2.17.so
/lib/x86_64-linux-gnu/libcrypt-2.17.so
/lib/x86_64-linux-gnu/libdl-2.17.so
/lib/x86_64-linux-gnu/libm-2.17.so
/lib/x86_64-linux-gnu/libmemusage.so
/lib/x86_64-linux-gnu/libnsl-2.17.so
/lib/x86_64-linux-gnu/libnss_compat-2.17.so
/lib/x86_64-linux-gnu/libnss_dns-2.17.so
/lib/x86_64-linux-gnu/libnss_files-2.17.so
/lib/x86_64-linux-gnu/libnss_hesiod-2.17.so
/lib/x86_64-linux-gnu/libnss_nis-2.17.so
/lib/x86_64-linux-gnu/libnss_nisplus-2.17.so
/lib/x86_64-linux-gnu/libpcprofile.so
/lib/x86_64-linux-gnu/libresolv-2.17.so
/lib/x86_64-linux-gnu/librt-2.17.so
/lib/x86_64-linux-gnu/libSegFault.so
/lib/x86_64-linux-gnu/libthread_db-1.0.so
/lib/x86_64-linux-gnu/libutil-2.17.so

libc

Most of the standard C library functions are included in the libc library.

You do not need to request linking to the libc library explicitly with the -l option to GCC. It is always linked.

Most of the basic functions of the libc library are explained in many C programming tutorials such as “The C Programming Language, by B. W. Kernighan and Dennis M. Ritchie”. I will skip most of those mentioned in such tutorials.

The GNU C Library manual is also a good source of information.

libc: macro

There are some macros defined in the libc library. They tend to make programs easier to read.

Macro for exit status.

$ grep "#define.*EXIT_" /usr/include/stdlib.h
#define    EXIT_FAILURE    1    /* Failing exit status.  */
#define    EXIT_SUCCESS    0    /* Successful exit status.  */

TIP: The exit status values match the shell convention. But some programs return -1 instead of 1 as the non-zero value when errors are encountered.

TIP: Defining TRUE and FALSE macros as the Boolean values 1 and 0 is popular in C programs. They are not defined in the libc library, so normally the user has to define them.

libc: errno.h

Here are some notable items for error handling with the libc library.

The errno integer variable is set to a non-zero value when a library function encounters an error.
The strerror(errno) function returns a pointer to a string that describes the meaning of the error for errno.
The perror("foo") function prints a message for errno to the standard error output as "foo: <error message>".
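The three items above can be combined as in the following minimal sketch. The helper name report_open is hypothetical and not part of libc; it only illustrates saving errno right after the failing call, since later library calls may overwrite it.

```c
#include <errno.h>   /* errno */
#include <fcntl.h>   /* open */
#include <stdio.h>   /* fprintf, perror */
#include <string.h>  /* strerror */
#include <unistd.h>  /* close */

/* Hypothetical helper: try to open path read-only and report any error.
 * Returns 0 on success, or the errno value saved right after the failure. */
int report_open(const char *path)
{
    int fd, saved;

    fd = open(path, O_RDONLY);
    if (fd == -1) {
        saved = errno;   /* save it first: later calls may change errno */
        perror(path);    /* prints "<path>: <error message>" to stderr */
        fprintf(stderr, "errno = %d (%s)\n", saved, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

Calling report_open() with a path that does not exist should return ENOENT and print the matching message twice, once through perror and once through strerror.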

The macros for the error are explained in the manpage errno(3).

The values of the macros for the errors are defined in the <errno.h> header file, which is a bit convoluted. Arch-dependent symlinks are marked with (*):

header file	action	target
`<errno.h>`	defines	“`extern int errno;`”.
`<errno.h>`	includes	`<bits/errno.h>`.
`<bits/errno.h>` (*)	includes	`<linux/errno.h>`.
`<linux/errno.h>`	includes	`<asm/errno.h>`.
`<asm/errno.h>` (*)	includes	`<asm-generic/errno.h>`.
`<asm-generic/errno.h>`	defines	many error values.
`<asm-generic/errno.h>`	includes	`<asm-generic/errno-base.h>`.
`<asm-generic/errno-base.h>`	defines	important error values from 1 to 34.

TIP: Make sure to include <errno.h> in the source if a program needs to deal with errors generated by the libc library.

libc: string operations

Unfortunately, some C string functions are known to be troublemakers.

Safe coding recommendations by busybox coders

troublemaker functions	overrun concerns	recommendation
`strcpy`(3)	`dest` string	`safe_strncpy`
`strncpy`(3)	may fail to 0-terminate `dst`	`safe_strncpy`
`strcat`(3)	`dest` string	`strncat`(3)
`gets`(3)	string it gets	`fgets`(3)
`getwd`(3)	`buf` string	`getcwd`(3)
`[v]sprintf`(3)	`str` buffer	`[v]snprintf`(3)
`realpath`(3)	`path` buffer	use with `pathconf`(3)
`[vf]scanf`(3)	its arguments	just avoid it

Although [vf]scanf(3) are marked as “just avoid it”, this is not the end of the world for coding scanf-like logic.

The combination of getline(3) and sscanf(3) is the most portable solution for the safe scanf alternative. If the incoming data is not delimited by the newline “\n” code, getdelim(3) may alternatively be used in place of getline(3).
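As a minimal sketch of the getline(3) and sscanf(3) combination, the following hypothetical helper sum_values reads lines of the assumed form "name value" and adds up the values. The line format, the helper name, and the 32-byte name buffer are assumptions made only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline(3) */
#include <stdio.h>   /* getline, sscanf */
#include <stdlib.h>  /* free */

/* Hypothetical helper: read "name value" lines from a stream.
 * Stores the sum of the values in *sum and returns the number of
 * lines that matched the expected format. */
int sum_values(FILE *in, long *sum)
{
    char *line = NULL;  /* getline allocates and grows this buffer */
    size_t cap = 0;
    char name[32];
    long value;
    int parsed = 0;

    *sum = 0;
    while (getline(&line, &cap, in) != -1) {
        /* %31s limits the field width so name[] can not overrun */
        if (sscanf(line, "%31s %ld", name, &value) == 2) {
            *sum += value;
            parsed++;
        }
    }
    free(line);
    return parsed;
}
```

Since getline grows the line buffer as needed and the width in %31s bounds the copy into name, neither step depends on the length of the incoming data.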

The use of the “m” flag in the format string, as described in scanf(3), is the other solution for a safe scanf alternative. (You need a newer POSIX.1-2008 compliant libc.) It uses “[vf]scanf("%ms", &char)” with free instead of “[vf]scanf("%s", char)” alone.
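A minimal sketch of the “m” flag follows, shown with sscanf(3) for brevity. The helper name first_word is hypothetical. The library allocates the buffer, so the caller only has to free it.

```c
#define _POSIX_C_SOURCE 200809L  /* for the "m" flag of scanf(3) */
#include <stdio.h>   /* sscanf */
#include <stdlib.h>  /* free (used by the caller) */

/* Hypothetical helper: return a malloc'ed copy of the first
 * whitespace-delimited word in text, or NULL if there is none.
 * The caller must free() the result. */
char *first_word(const char *text)
{
    char *word = NULL;

    /* "%ms" makes sscanf allocate a buffer of sufficient size */
    if (sscanf(text, "%ms", &word) != 1) {
        return NULL;
    }
    return word;
}
```

For example, first_word("  hello world") should return a newly allocated "hello".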

libc: safe_strncpy

The safe_strncpy recommended by busybox coders is essentially strncpy with guaranteed 0-termination and a safe data length. Since it is missing from the libc library, it should be emulated by adding a custom function definition as:

Alternative safe string copy function safe_strncpy: safe_strncpy.c.

#include <string.h>
char* safe_strncpy(char *dst, const char *src, size_t size)
{
    if (size == 0) {
        return dst;
    } else {
        size--;
        dst[size] = '\0';
        return strncpy(dst, src, size);
    }
}

Alternative safe string copy function safe_strncpy: safe_strncpy.h.

char* safe_strncpy(char *dst, const char *src, size_t size);

Let’s test string copy functions: strcpy, strncpy, and safe_strncpy.

Test code for string copy functions: test_strcpy.c.

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "safe_strncpy.h"

char s1[10], s2[10];
int main(int argc, char ** argv)
{
    char *sc1 = "0123456789"; /* 11 bytes */
    char *sc2 = "I gotcha!";  /* 10 bytes */

    printf("%s\n", "Constant strings");
    printf("\tsc1 = %s\n", sc1);
    printf("\tsc2 = %s\n", sc2);
    safe_strncpy(s1, sc1, 10);
    safe_strncpy(s2, sc2, 10);
    printf("%s\n", "Copied strings: safe_strncpy\n"
        "\t\tIt should drop 9 at the end to be safe.");
    printf("\ts1  = %s\n", s1);
    printf("\ts2  = %s\n", s2);
    strncpy(s1, sc1, 10);
    printf("%s\n", "Copied strings: strncpy(s1, sc1, 10)\n"
        "\t\tprintf(..., s1) can not stop at the end.");
    printf("\ts1  = %s\n", s1);
    printf("\ts2  = %s\n", s2);
    strcpy(s1, sc1);
    printf("%s\n", "Copied strings: strcpy(s1, sc1)\n"
        "\t\tstrcpy overwrites onto s2.");
    printf("\ts1  = %s\n", s1);
    printf("\ts2  = %s\n", s2);
    return EXIT_SUCCESS;
}

Test result of string copy functions by test_strcpy.c.

$ gcc -Wall -o test_strcpy safe_strncpy.c test_strcpy.c
$ ./test_strcpy
Constant strings
    sc1 = 0123456789
    sc2 = I gotcha!
Copied strings: safe_strncpy
        It should drop 9 at the end to be safe.
    s1  = 012345678
    s2  = I gotcha!
Copied strings: strncpy(s1, sc1, 10)
        printf(..., s1) can not stop at the end.
    s1  = 0123456789
    s2  = I gotcha!
Copied strings: strcpy(s1, sc1)
        strcpy overwrites onto s2.
    s1  = 0123456789
    s2  = I gotcha!

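Only safe_strncpy is safe here. strncpy leaves s1 without a terminating 0 and strcpy writes 11 bytes into the 10-byte s1. Both are bugs even when the output happens to look harmless, as it does in the run above.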
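The recommendation table above also lists [v]snprintf(3). As a sketch of that route, snprintf can copy a string with guaranteed 0-termination and also tell whether the source was truncated. The helper name copy_checked is hypothetical.

```c
#include <stdio.h>  /* snprintf */

/* Hypothetical helper: copy src into dst, which holds size bytes.
 * dst is always 0-terminated when size > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_checked(char *dst, size_t size, const char *src)
{
    /* snprintf returns the length the full output would have had */
    int n = snprintf(dst, size, "%s", src);

    return n >= 0 && (size_t) n < size;
}
```

With a 10-byte destination, copying "I gotcha!" fits and returns 1, while copying "0123456789" stores "012345678" and returns 0.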

libc: file operations

File operations in C can be done at different levels.

low level file descriptor based operations:
- open(2): open and possibly create the file and return a new file descriptor
- close(2): close the file descriptor
- lseek(2): reposition read/write file offset associated with the file descriptor
- read(2): read from the file descriptor
- write(2): write to the file descriptor
- mmap(2): map file associated with the file descriptor into memory
- fcntl(2): manipulate file descriptor

Predefined file descriptor macros

$ grep FILENO /usr/include/unistd.h
#define    STDIN_FILENO    0    /* Standard input.  */
#define    STDOUT_FILENO    1    /* Standard output.  */
#define    STDERR_FILENO    2    /* Standard error output.  */
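As a small illustration of the descriptor-based calls listed above, note that write(2) may transfer fewer bytes than requested. The following sketch uses a hypothetical helper write_all that loops until everything is written. It is not the method used by the example programs in this document.

```c
#include <errno.h>   /* errno, EINTR */
#include <unistd.h>  /* write, ssize_t */

/* Hypothetical helper: write all len bytes of buf to fd, retrying on
 * short writes and on interruption by a signal.
 * Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR) {
                continue;  /* interrupted before writing: try again */
            }
            return -1;
        }
        p += n;            /* skip what has been written so far */
        len -= (size_t) n;
    }
    return 0;
}
```

For example, write_all(STDOUT_FILENO, "ok\n", 3) writes to the standard output without going through the stdio buffering.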

high level stream IO based operations:
- fopen(3): open and possibly create the file and associate the stream with the file
- fclose(3): close the stream
- feof(3): test the end-of-file indicator for the stream
- ferror(3): test the error indicator for the stream
- getc(3): read a character from the binary stream
- putc(3): write a character to the binary stream
- fread(3): read blocks of data from the binary stream
- fwrite(3): write blocks of data to the binary stream
- fprintf(3): formatted text stream output conversion
- fscanf(3): formatted text stream input conversion

Predefined file stream macros

$ grep "# *define *std" /usr/include/stdio.h
#define stdin stdin
#define stdout stdout
#define stderr stderr

Let’s learn the fundamentals of file operations by creating simple programs, such as one that obtains the size of a file or one that copies a file. These examples are not meant to be the fastest or the shortest code.

libc: size of a file

We can think of 4 different methods to obtain the size of a file.

char : read and count characters one at a time
block : read blocks and count characters
lseek : move the file offset to the end of the file and use the resulting offset as the size
stat : obtain the file size from the file metadata (inode)

Note that stat(2) follows symlinks and reports the size of the file they point to; lstat(2) is the variant that reports on the symlink itself. The lseek method seems to be the most popular one used.

Here are benchmark results of these methods using perf (See Debug: level 4: perf).

Speed benchmark of various methods to obtain the file size.

Performance counter stats	char	block	lseek	stat
seconds time elapsed	0.060168937	0.004092420	0.002638296	0.002588425
task-clock	59.126037	3.051782	1.616662	1.634113
context-switches	5	0	0	0
cpu-migrations	1	1	0	1
page-faults	193	193	191	191
cycles	154,305,859	2,428,799		1,297,107
stalled-cycles-frontend	32,413,607	1,237,525	787,158	805,663
stalled-cycles-backend	3,089,296	828,200	612,162	627,711
branches	87,138,178	382,322	209,395	208,836
branch-misses	14,795		9,023

If you wish to do more than just count characters, the other methods may give a good starting point for such programs. I list all the variants of size.c below.

Read size of a file (char)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdio.h>      /* printf, perror */
#include <errno.h>      /* perror */
#include <stdlib.h>     /* exit */
#include <fcntl.h>      /* open */
#include <sys/stat.h>   /* open */
#include <sys/types.h>  /* open lseek */
#include <unistd.h>     /* lseek */
#include <locale.h>     /* setlocale */

int
main(int argc, char* argv[])
{
    FILE *f;
    off_t size = 0;
    if (argc != 2) {
        printf("E: Need a filename as an argument.\n");
        return EXIT_FAILURE;
    }
    f = fopen(argv[1], "r");
    if (f == NULL) {
        perror("E: Can not open input file");
        exit(EXIT_FAILURE);
    }

    for (;;) {
        fgetc(f);
        if (ferror(f)) {
            perror("E: Error reading input file");
            exit(EXIT_FAILURE);
        }
        if (feof(f)) {
            break;
        } else {
            size += 1;
        }
    }
    if (fclose(f)) {
        perror("E: Can not close input file");
        exit(EXIT_FAILURE);
    }

    setlocale(LC_ALL,"");
    /* off_t is not ssize_t, so cast it and print it as long long */
    printf("\nFile size: %'lld\n", (long long) size);
    return EXIT_SUCCESS;
}

Read size of a file (block)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h>  /* printf, perror, fread */
#include <errno.h>  /* perror */
#include <locale.h> /* setlocale */
#define BUFFSIZE (1024*4)
int
main(int argc, char* argv[])
{
    FILE *f;
    char *buf;
    size_t n = BUFFSIZE, i, size = 0;
    if (argc != 2) {
        printf("E: Need a filename as an argument.\n");
        return EXIT_FAILURE;
    }
    if ((buf = (char *) malloc(n)) == NULL) {
        perror("E: Can not make a buffer");
        exit(EXIT_FAILURE);
    }
    if ((f = fopen(argv[1], "r")) == NULL) {
        perror("E: Can not open input file");
        exit(EXIT_FAILURE);
    }
    for (;;) {
        i = fread(buf, 1, n, f);
        if (ferror(f)) {
            perror("E: Error reading input file");
            exit(EXIT_FAILURE);
        } else if (i == 0) {
            break;
        } else {
            size += i;
        }
    }
    if (fclose(f)) {
        perror("E: Can not close input file");
        exit(EXIT_FAILURE);
    }
    setlocale(LC_ALL,"");
    printf("\nFile size: %'zu\n", size); /* size is a size_t here */
    return EXIT_SUCCESS;
}

Read size of a file (lseek)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdio.h>      /* printf, perror */
#include <errno.h>      /* perror */
#include <stdlib.h>     /* exit */
#include <fcntl.h>      /* open */
#include <sys/stat.h>   /* open */
#include <sys/types.h>  /* open, lseek */
#include <unistd.h>     /* lseek */
#include <locale.h>     /* setlocale */

int
main(int argc, char* argv[])
{
    int fd;
    off_t size;
    if (argc != 2) {
        printf("E: Need a filename as an argument.\n");
        return EXIT_FAILURE;
    }
    if ((fd = open(argv[1], O_RDONLY)) == -1) {
        perror("E: Can not open input file");
        exit(EXIT_FAILURE);
    }
    size = lseek(fd, 0, SEEK_END);
    if (size == (off_t) -1) {
        perror("E: Can not seek input file");
        exit(EXIT_FAILURE);
    }
    setlocale(LC_ALL,"");
    printf("\nFile size: %'lld\n", (long long) size);
    return EXIT_SUCCESS;
}

Read size of a file (stat)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdio.h>      /* printf, perror */
#include <errno.h>      /* perror */
#include <stdlib.h>     /* exit */
#include <sys/types.h>  /* stat */
#include <sys/stat.h>   /* stat */
#include <unistd.h>     /* stat */
#include <locale.h>     /* setlocale */

int
main(int argc, char* argv[])
{
    struct stat st;
    off_t size;
    if (argc != 2) {
        printf("E: Need a filename as an argument.\n");
        return EXIT_FAILURE;
    }
    if (stat(argv[1], &st) == -1) {
        perror("E: Can not stat input file");
        exit(EXIT_FAILURE);
    }
    size = st.st_size;
    setlocale(LC_ALL,"");
    printf("\nFile size: %'lld\n", (long long) size);
    return EXIT_SUCCESS;
}

Each of the above examples can be compiled as follows.

Example of compiling size.c.

$ gcc -Wall -o size size.c

libc: copy a file

We can think of 5 different methods to copy a file.

char : copy a character at a time
block : copy a block (4 KiB) at a time
block big : copy a block (4 MiB) at a time
mmap memcpy : use mmap(2) to map input and output files while copying data with memcpy(3).
mmap write : use mmap(2) to map input file while writing data with write(2) from the memory.

Here are benchmark results of these methods using perf (See Debug: level 4: perf).

Speed benchmark of various methods to copy a large file of about 2.4 MiB.

Performance counter stats	char	block	block big	mmap memcpy	mmap write
seconds time elapsed	0.089096423	0.015654253	0.016963338	0.018813373	0.018040972
task-clock	78.600617	5.916846	7.286212	8.114573	6.747344
context-switches	12	5	5	5	5
cpu-migrations	0	0	0	0	0
page-faults	132	132	715	1,283	694
cycles	213,622,432	8,787,602	9,975,035	9,506,449	8,258,812
stalled-cycles-frontend	46,249,968	5,289,960	4,939,101	4,771,343	4,421,422
stalled-cycles-backend	8,069,075	3,617,957	3,660,122	3,965,894	3,223,960
branches	118,042,429	1,614,855	1,639,492	1,555,953	1,442,130
branch-misses	20,800	12,048	10,295	8,539

Speed benchmark of various methods to copy a small file of about 2 KiB.

Performance counter stats	char	block	block big	mmap memcpy	mmap write
seconds time elapsed	0.002350776	0.001894332	0.001848917	0.002023421	0.002016262
task-clock	1.335467	1.008275	0.954096	1.053580	1.013171
context-switches	1	1	1	1	1
cpu-migrations	0	0	0	0	0
page-faults	132	132	132	117	111
cycles		731,955	769,188	606,346	616,053
stalled-cycles-frontend	696,792	625,996	648,758	579,552	552,678
stalled-cycles-backend	536,087	495,138	513,756	455,248	434,162
branches	153,252	125,526	127,090	102,813	99,467
branch-misses	3,292		2,232	2,474

The char method is the slowest, as expected.

For the large file, the block method is slightly faster than the other methods, apart from the char method, which is significantly slower. For the small file, all methods take about the same time.

If you wish to do more than just copy a file, the other methods may give a good starting point for such programs. For example, if many programs access the same file simultaneously, the use of mmap(2) should have a major advantage over the simple block method. I list all the variants of cp.c below.

Copy a file (char)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h>     /* exit */
#include <stdio.h>      /* printf, perror */
#include <errno.h>      /* perror */
#include <locale.h>     /* setlocale */

int
main(int argc, char* argv[])
{
    FILE *fi, *fo;
    int i;
    if (argc != 3) {
        printf("E: Need 2 filenames as arguments.\n");
        return EXIT_FAILURE;
    }
    if ((fi = fopen(argv[1], "r")) == NULL) {
        perror("E: Can not open input file");
        return EXIT_FAILURE;
    }
    if ((fo = fopen(argv[2], "w")) == NULL) {
        perror("E: Can not open output file");
        return EXIT_FAILURE;
    }
    for (;;) {
        i = getc(fi);
        if (ferror(fi)) {
            perror("E: Error reading input file");
            exit(EXIT_FAILURE);
        } else if (feof(fi)) {
            break;
        } else {
            putc(i, fo);
        }
    }
    if (fclose(fi)) {
        perror("E: Can not close input file");
        exit(EXIT_FAILURE);
    }
    if (fclose(fo)) {
        perror("E: Can not close output file");
        exit(EXIT_FAILURE);
    }
    return EXIT_SUCCESS;
}

Copy a file (block)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h>  /* printf, perror, fread */
#include <errno.h>  /* perror */
#include <locale.h> /* setlocale */
#define BUFFSIZE (1024*4)

int
main(int argc, char* argv[])
{
    FILE *fi, *fo;
    char *buf;
    size_t n = BUFFSIZE, i;
    if (argc != 3) {
        printf("E: Need 2 filenames as arguments.\n");
        return EXIT_FAILURE;
    }
    if ((fi = fopen(argv[1], "r")) == NULL) {
        perror("E: Can not open input file");
        return EXIT_FAILURE;
    }
    if ((fo = fopen(argv[2], "w")) == NULL) {
        perror("E: Can not open output file");
        return EXIT_FAILURE;
    }
    if ((buf = (char *) malloc(n)) == NULL) {
        perror("E: Can not make a buffer");
        exit(EXIT_FAILURE);
    }
    for (;;) {
        i = fread(buf, 1, n, fi);
        if (ferror(fi)) {
            perror("E: Error reading input file");
            exit(EXIT_FAILURE);
        } else if (i == 0) {
            break;
        } else {
            if (fwrite(buf, 1, i, fo) != i) {
                perror("E: Error writing output file");
                exit(EXIT_FAILURE);
            }
        }
    }
    if (fclose(fi)) {
        perror("E: Can not close input file");
        exit(EXIT_FAILURE);
    }
    if (fclose(fo)) {
        perror("E: Can not close output file");
        exit(EXIT_FAILURE);
    }
    return EXIT_SUCCESS;
}

Copy a file (big block) replaces #define BUFFSIZE (1024*4) in the above with #define BUFFSIZE (1024*1024*4).

Copy a file (mmap+memcpy)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h>  /* printf, perror */
#include <errno.h>  /* perror */
#include <fcntl.h>      /* open */
#include <unistd.h>     /* lseek, write */
#include <sys/stat.h>   /* open */
#include <sys/types.h>  /* open, lseek */
#include <sys/mman.h>   /* mmap */
#include <locale.h> /* setlocale */
#include <string.h>     /* memcpy */
int
main(int argc, char* argv[])
{
    int fdi, fdo;
    void *src, *dst;
    size_t size;
    ssize_t offset;
    if (argc != 3) {
        printf("E: Need 2 filenames as arguments.\n");
        return EXIT_FAILURE;
    }
    if ((fdi = open(argv[1], O_RDONLY)) < 0) {
        perror("E: Can not open input file");
        exit(EXIT_FAILURE);
    }
    size = lseek(fdi, 0, SEEK_END);
    src = mmap(NULL, size, PROT_READ, MAP_SHARED, fdi, 0);
    if (src == MAP_FAILED) {
        perror("E: Can not map input file");
        exit(EXIT_FAILURE);
    }
    if ((fdo = open(argv[2], O_CREAT | O_RDWR | O_TRUNC, 00666)) < 0) {
        perror("E: Can not open output file");
        exit(EXIT_FAILURE);
    }
    offset = lseek(fdo, size -1, SEEK_SET);
    if (offset == (ssize_t) -1) {
        perror("E: lseek() error");
        exit(EXIT_FAILURE);
    }
    offset = write(fdo, "", 1); /* dummy write at the end */
    if (offset == (ssize_t) -1) {
        perror("E: write() error");
        exit(EXIT_FAILURE);
    }
    dst = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fdo, 0);
    if (dst == MAP_FAILED) {
        perror("E: Can not map output file");
        exit(EXIT_FAILURE);
    }
    memcpy(dst, src, size);
    if (munmap(src, size)) {
        perror("E: Can not unmap input file");
        exit(EXIT_FAILURE);
    }
    if (munmap(dst, size)) {
        perror("E: Can not unmap output file");
        exit(EXIT_FAILURE);
    }
    if (close(fdi)) {
        perror("E: Can not close input file");
        exit(EXIT_FAILURE);
    }
    if (close(fdo)) {
        perror("E: Can not close output file");
        exit(EXIT_FAILURE);
    }
    return EXIT_SUCCESS;
}

TIP: The dummy write of 1 byte of 0 to the output file after lseek is the idiom to set the size of the output file with mmap(2). It will be overwritten by the subsequent memcpy(3).
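As an alternative to the dummy write, ftruncate(2) can set the size of the output file before mapping it. The following is only a sketch under that assumption, not the method used in the listing above. The helper name map_output is hypothetical.

```c
#include <fcntl.h>      /* open */
#include <sys/mman.h>   /* mmap, MAP_FAILED */
#include <sys/types.h>  /* off_t */
#include <unistd.h>     /* ftruncate, close */

/* Hypothetical helper: create (or truncate) the file at path, extend
 * it to size bytes, and map it for writing.
 * Returns the mapping, or MAP_FAILED on any error.
 * The size must not be 0, since mmap(2) rejects empty mappings. */
void *map_output(const char *path, size_t size)
{
    void *dst;
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0666);

    if (fd == -1) {
        return MAP_FAILED;
    }
    if (ftruncate(fd, (off_t) size) == -1) { /* set the file size */
        close(fd);
        return MAP_FAILED;
    }
    dst = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return dst;
}
```

The caller copies data into the returned region, for example with memcpy(3), and releases it with munmap(2).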

Copy a file (mmap+write)

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h>     /* exit, malloc */
#include <stdio.h>      /* printf, perror */
#include <errno.h>      /* perror */
#include <fcntl.h>      /* open */
#include <unistd.h>     /* stat */
#include <sys/types.h>  /* open, lseek */
#include <sys/stat.h>   /* open stat */
#include <sys/mman.h>   /* mmap */
#include <locale.h>     /* setlocale */
#include <string.h>     /* memcpy */
int
main(int argc, char* argv[])
{
    int fdi, fdo;
    struct stat st;
    void *src;
    size_t size;
    ssize_t offset;
    if (argc != 3) {
        printf("E: Need 2 filenames as arguments.\n");
        return EXIT_FAILURE;
    }
    if ((fdi = open(argv[1], O_RDONLY)) < 0) {
        perror("E: Can not open input file");
        exit(EXIT_FAILURE);
    }
    if (stat(argv[1], &st) == -1) {
        perror("E: Can not stat input file");
        exit(EXIT_FAILURE);
    }
    size = st.st_size;
    src = mmap(NULL, size, PROT_READ, MAP_SHARED, fdi, 0);
    if (src == MAP_FAILED) {
        perror("E: Can not map input file");
        exit(EXIT_FAILURE);
    }
    if ((fdo = open(argv[2], O_CREAT | O_RDWR | O_TRUNC, 00666)) < 0) {
        perror("E: Can not open output file");
        exit(EXIT_FAILURE);
    }
    offset = write(fdo, src, size);
    if (offset == (ssize_t) -1) {
        perror("E: write() error");
        exit(EXIT_FAILURE);
    }
    if (munmap(src, size)) {
        perror("E: Can not unmap input file");
        exit(EXIT_FAILURE);
    }
    if (close(fdi)) {
        perror("E: Can not close input file");
        exit(EXIT_FAILURE);
    }
    if (close(fdo)) {
        perror("E: Can not close output file");
        exit(EXIT_FAILURE);
    }
    return EXIT_SUCCESS;
}

Example of compiling cp.c.

$ gcc -Wall -o cp cp.c

libc: setlocale

For decimal conversion functions provided by the libc library such as printf(3), the 3-digit-grouping behavior depends on the locale. Use setlocale(3) to set the locale.

Localization example of printf as lprintf.c

/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h>
#include <stdio.h>
#include <locale.h>

int
main(int argc, char* argv[])
{
    int n = 12345678;
    printf("init        %%i  -> %i\n", n);
    printf("init        %%'i -> %'i\n", n);
    setlocale(LC_ALL,"");
    printf("env         %%i  -> %i\n", n);
    printf("env         %%'i -> %'i\n", n);
    setlocale(LC_ALL,"C");
    printf("C           %%i  -> %i\n", n);
    printf("C           %%'i -> %'i\n", n);
    setlocale(LC_ALL,"en_US.UTF-8");
    printf("en_US.UTF-8 %%i  -> %i\n", n);
    printf("en_US.UTF-8 %%'i -> %'i\n", n);
    setlocale(LC_ALL,"fr_FR.UTF-8");
    printf("fr_FR.UTF-8 %%i  -> %i\n", n);
    printf("fr_FR.UTF-8 %%'i -> %'i\n", n);
    setlocale(LC_ALL,"de_DE.UTF-8");
    printf("de_DE.UTF-8 %%i  -> %i\n", n);
    printf("de_DE.UTF-8 %%'i -> %'i\n", n);
    setlocale(LC_ALL,"ja_JP.UTF-8");
    printf("ja_JP.UTF-8 %%i  -> %i\n", n);
    printf("ja_JP.UTF-8 %%'i -> %'i\n", n);
    return EXIT_SUCCESS;
}

Run lprintf

$ gcc -Wall -o lprintf lprintf.c
$ ./lprintf
init        %i  -> 12345678
init        %'i -> 12345678
env         %i  -> 12345678
env         %'i -> 12,345,678
C           %i  -> 12345678
C           %'i -> 12345678
en_US.UTF-8 %i  -> 12345678
en_US.UTF-8 %'i -> 12,345,678
fr_FR.UTF-8 %i  -> 12345678
fr_FR.UTF-8 %'i -> 12 345 678
de_DE.UTF-8 %i  -> 12345678
de_DE.UTF-8 %'i -> 12.345.678
ja_JP.UTF-8 %i  -> 12345678
ja_JP.UTF-8 %'i -> 12,345,678

TIP: The text translation mechanism also uses the locale. See gettext(3) and “info gettext”.

libm

Although most of the standard C library functions are included in the libc library, some math-related library functions are in the separate libm library.

So a program using them needs to be linked not just to libc but also to libm.

Let’s consider math.c to calculate sin(60 degrees).

Source code: math.c

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
double
sindeg(double x)
{
    return sin(x * 3.141592 / 180.0);
}

int
main()
{
    double x, y;
    x = 60.0;
    y = sindeg(x);
    printf("angle = %f degree, sin(angle) = %f\n", x, y);
    exit(0);
}

Let’s compile math.c while linking to the libm library to create an ELF executable object math and run it.

$ gcc -Wall -omath math.c -lm
$ ./math
angle = 60.000000 degree, sin(angle) = 0.866025

TIP: The linked library is specified after the -l option to GCC while removing the leading lib from the library name.

Let’s list the libraries linked to the ELF executable object math.

$ ldd math

The libraries reported by ldd are as follows.

linux-vdso.so.1 : Linux Virtual Dynamic Shared Object
libm.so.6 : The GNU C Library (glibc, support math functions)
libc.so.6 : The GNU C Library (glibc)
/lib64/ld-linux-x86-64.so.2 : dynamic linker/loader

We can split math.c into 3 files and compile each code piece.

Source code: math-main.c containing the main() function only

#include <stdio.h>
#include <stdlib.h>
#include "sindeg.h"
int main()
{
    double x, y;
    x = 60.0;
    y = sindeg(x);
    printf("angle = %f degree, sin(angle) = %f\n", x, y);
    exit(0);
}

Source code: sindeg.c containing the sindeg() function only

#include <math.h>
double sindeg(double x)
{
    return sin(x * 3.141592 / 180.0);
}

Source code: sindeg.h containing the header information of sindeg()

double sindeg(double x);

Let’s compile these into separate object files (*.o files) with the -c option and link them into an executable math-main specified with the -o option.

Building math-main via separate object files and running it.

$ gcc -Wall -c math-main.c
$ gcc -Wall -c sindeg.c
$ gcc -Wall -o math-main math-main.o sindeg.o -lm
$ ./math-main
angle = 60.000000 degree, sin(angle) = 0.866025

libdl

libdl provides the following generic functions for the dynamic loader.

dlopen(3): POSIX
dlerror(3): POSIX
dlsym(3): POSIX
dlclose(3): POSIX
dladdr(3): Glibc extensions
dlvsym(3): Glibc extensions

Let’s convert math.c to math-dl.c, which uses libm via the dynamic linking loader provided by libdl. Here, the functions whose names are prefixed with v are wrappers for the dlopen, dlsym, and dlclose functions that provide verbose error reporting. So the core is just main().

Source code: math-dl.c

#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

void *
vdlopen(const char *filename, int flag) {
    void *dl_handle;
    dl_handle = dlopen(filename, flag); /* honor the caller's flag */
    if (!dl_handle) {
        printf("Load %s library, error: %s\n", filename, dlerror());
        exit(EXIT_FAILURE);
    } else {
        printf("Load %s library, success!\n", filename);
    }
    return dl_handle;
}

void *
vdlsym(void *handle, const char *symbol) {
    double (*func)(double); /* Yep ... this is only for double */
    func = dlsym(handle, symbol);
    if (!func) {
        printf("Load %s symbol, error: %s\n", symbol, dlerror());
        exit(EXIT_FAILURE);
    } else {
        printf("Load %s symbol, success!\n", symbol);
    }
    return func;
}

int
vdlclose(void *handle) {
    int i;
    i = dlclose(handle);
    if (i) {
        printf("Unload error %i: %s\n", i, dlerror());
        exit(EXIT_FAILURE);
    } else {
        printf("Unload, success!\n");
    }
    return i;
}

int
main()
{
    double x, y;
    x = 60.0;
    char *lib = "libm.so";
    char *method = "sin";
    void *dl_handle;
    double (*func)(double);
    dl_handle = vdlopen(lib, RTLD_LAZY);
    func = vdlsym(dl_handle, method);
    y = (*func)(x * 3.141592 / 180.0);
    printf("angle = %f degree, sin(angle) = %f\n", x, y);
    vdlclose(dl_handle);
    exit(0);
}

Let’s compile this math-dl.c by not directly linking to libm but to libdl.

Building math-dl and running it.

$ gcc -Wall -rdynamic -omath-dl math-dl.c -ldl
$ ./math-dl
Load libm.so library, success!
Load sin symbol, success!
angle = 60.000000 degree, sin(angle) = 0.866025
Unload, success!

The -rdynamic option adds all symbols to the dynamic symbol table; this is needed for some uses of dlopen and for obtaining backtraces from within the program. The -ldl option is for the libdl.so library. You do not need the -lm option here, but the libm.so library must be installed in the system.

libpthread

Threads can efficiently implement parallelism on shared-memory multiprocessor architectures, such as SMPs. Unlike the fork-exec multiprocessing mechanism of UNIX-like systems, thread creation does not copy all the resources of the process. POSIX threads are supported by modern GNU/Linux (with Linux kernel 2.6 or newer) through the libpthread library. Here are some references and tutorials.

Introduction to Parallel Computing
POSIX Threads Programming
POSIX threads explained (updated, based on IBM developerWorks: POSIX threads explained.)

TIP: We should focus on reading tutorials written after native POSIX thread library (NPTL) support arrived. This means a tutorial should be newer than 2006.
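Before looking at the prime number programs, here is a minimal sketch of the basic pthread_create(3) and pthread_join(3) pattern. It is not one of the benchmarked programs. The names parallel_sum, sum_slice, struct slice, and MAX_THREADS are hypothetical. Each thread sums its own slice of an array and writes only to its own struct, so no locking is needed.

```c
#include <pthread.h>  /* pthread_create, pthread_join */
#include <stddef.h>   /* size_t */

#define MAX_THREADS 8  /* arbitrary cap chosen for this sketch */

struct slice {
    const long *data;
    size_t begin, end;  /* half-open range [begin, end) */
    long sum;           /* result written by the thread */
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    size_t i;

    s->sum = 0;
    for (i = s->begin; i < s->end; i++) {
        s->sum += s->data[i];
    }
    return NULL;
}

/* Sum n values using up to nthreads threads.
 * Returns 0 and stores the result in *total on success,
 * or the pthread_create error number on failure. */
int parallel_sum(const long *data, size_t n, int nthreads, long *total)
{
    pthread_t tid[MAX_THREADS];
    struct slice sl[MAX_THREADS];
    int t, started = 0, rc = 0;

    if (nthreads < 1) nthreads = 1;
    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;
    for (t = 0; t < nthreads; t++) {
        sl[t].data = data;
        sl[t].begin = n * t / nthreads;
        sl[t].end = n * (t + 1) / nthreads;
        rc = pthread_create(&tid[t], NULL, sum_slice, &sl[t]);
        if (rc != 0) break;
        started++;
    }
    *total = 0;
    for (t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);  /* wait, then collect the result */
        *total += sl[t].sum;
    }
    return rc;
}
```

Such code is typically compiled with the -pthread option to GCC. Whether it runs faster than a plain loop depends on the amount of work per thread.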

The actual execution speed of a program on the modern CPU can be affected by many issues other than the utilization of CPU cores:

Out-of-order execution
Instruction-level parallelism
Instruction pipeline
CPU cache
Cache coherence
Amdahl’s law

I am no expert who can tell how all these fit together. But seeing is believing. Let’s use touched-up C programs that list prime numbers, based on the while-loop style in C with the list (variants), to experiment with POSIX thread programming. This algorithm has a somewhat sequential nature, so partitioning the program into parallel and mostly independent code is non-trivial. The observed execution times differ significantly.

prime5.c: fast
- single-threaded program, an uninterrupted tight code.
prime6.c: slow
- multi-threaded program, a thread started for each integer.
prime7.c: very slow
- multi-threaded program, a fixed number of threads are started and controlled via a semaphore.
prime8.c: very fast
- multi-threaded program, a fixed number of threads are started only for the time-consuming large number portion while each thread is written as an uninterrupted tight code.

Here are the benchmark results of execution times for these programs, which are listed below.

Speed benchmark of the prime number programs

Program	real(2^20)	user(2^20)	sys(2^20)	real(2^24)	user(2^24)	sys(2^24)
`prime5.c`	0.15	0.15	0.00	5.62	5.60	0.02
`prime6.c`	0.48	0.09	1.15	7.38	1.52	18.30
`prime7.c`	0.77	0.26	1.80	12.01	5.18	27.82
`prime8.c`	0.05	0.28	0.00	1.61	10.27	0.08

Here, the time reported by the /usr/bin/time -p command is in POSIX standard 1003.2 style:

real: Elapsed real (wall clock) time used by the process, in seconds.
user: Total number of CPU-seconds that the process used directly (in user mode), in seconds.
sys: Total number of CPU-seconds used by the system on behalf of the process (in kernel mode), in seconds.

It seems that the user time and the sys time are summed over all threads, so they may add up to more than the real time for multi-threaded programs. There is a similar sar command offered by the sysstat and atsar packages which has more functionality. But if you are looking for more insight into where the time goes, you should consider using the perf command. See Debug: level 4; perf.

TIP: Unless used properly, the thread mechanism does not guarantee faster code.

`prime5.c`: single-threaded program, an uninterrupted tight code.

Source code for the C program: prime5.c

#include <stdlib.h>
#include <stdio.h>
#define TRUE 1
#define FALSE 0

struct _primelist {
    long prime; 
    struct _primelist *next;
    };
typedef struct _primelist primelist;

primelist *head=NULL, *tail=NULL;

int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}

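int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    n = 2;
    while(n < n_max) { /* check 3 .. n_max, not n_max + 1 */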
        n++;
        if (checkprime(n)) {
            q= calloc(1, sizeof(primelist));
            tail->next = q;
            tail = q;
            tail->prime = n;
        }
    }
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        free(p);
        p = q;
    }
    return EXIT_SUCCESS;
}

Behavior of the C program: prime5.c

$ /usr/bin/time -p ./prime5 "$(echo 2^20 | bc)">/dev/null
real 0.15
user 0.15
sys 0.00
$ /usr/bin/time -p ./prime5 "$(echo 2^24 | bc)">/dev/null
real 5.62
user 5.60
sys 0.02

`prime6.c`: multi-threaded program, a thread started for each integer.

Source code for the C program: prime6.c

#include <stdlib.h>
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#define TRUE 1
#define FALSE 0
#define TMAX 64L

struct _primelist {
    long prime; 
    struct _primelist *next;
    };
typedef struct _primelist primelist;

struct _thdata {
    pthread_t           th;
    int                 flag;
};
typedef struct _thdata thdata;

primelist *head=NULL, *tail=NULL;

thdata       thd[TMAX];

int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}

/* thread entry point with the signature pthread_create() expects */
void *checkprime_thread(void *arg) {
    return (void *) (long) checkprime((long) arg);
}

int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max, m;
    void *ret;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    n = 2; /* last number for which a check was started */
    m = 2; /* last number for which a check was completed */
    while(m < n_max) {
        if ((n + 1 - m < TMAX) && (n + 1 <= m * m) && (n + 1 <= n_max)) {
            n = n + 1;
            /* start checkprime(n) */
            if (pthread_create(&thd[n%TMAX].th,
                        NULL,
                        checkprime_thread,
                        (void *) n) ) {
                printf ("E: error creating thread at %li\n", n);
            }
        }
        if ((n + 1 - m >= TMAX) || (n + 1 > m * m) || (n + 1 > n_max) ) {
            m++;
            /* close checkprime(m) */
            pthread_join(thd[m%TMAX].th, &ret);
            thd[m%TMAX].flag = (ret != NULL);
            if (thd[m%TMAX].flag) {
                /* if prime, update list with m; fill in the node before
                   linking it, since threads may be walking the list */
                q = calloc(1, sizeof(primelist));
                q->prime = m;
                tail->next = q;
                tail = q;
            }
        }
    }
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        /* free(p); */
        p = q;
    }
    return EXIT_SUCCESS;
}

Behavior of the C program: prime6.c

$ /usr/bin/time -p ./prime6 "$(echo 2^16 | bc)">/dev/null
real 0.48
user 0.09
sys 1.15
$ /usr/bin/time -p ./prime6 "$(echo 2^20 | bc)">/dev/null
real 7.38
user 1.52
sys 18.30

`prime7.c`: multi-threaded program, fixed number of threads are started and controlled via semaphore.

Source code for the C program: prime7.c

#include <stdlib.h>
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#define TRUE 1
#define FALSE 0
#define TMAX 64L

struct _primelist {
    long prime; 
    struct _primelist *next;
    };
typedef struct _primelist primelist;

struct _thdata {
    pthread_t           th;
    long                n;
    int                 flag;
    sem_t               read_ready;
    sem_t               read_done;
    sem_t               write_ready;
    sem_t               write_done;
};
typedef struct _thdata thdata;

primelist *head=NULL, *tail=NULL;

thdata       thd[TMAX];

int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}

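/* thread entry point: pthread_create() expects void *(*)(void *) */
void *subthread(void *arg) {
    thdata *thd = arg;
    while(TRUE) {
        sem_post(&(thd->read_ready));   /* ready to receive a number */
        sem_wait(&(thd->read_done));    /* main has stored thd->n */
        thd->flag = checkprime(thd->n);
        sem_post(&(thd->write_ready));  /* result is in thd->flag */
        sem_wait(&(thd->write_done));   /* main has read the result */
    }
    return NULL; /* not reached; main ends the thread with pthread_cancel() */
}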

int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max, m;
    int i;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    for (i=0;i<TMAX;i++){
        sem_init(&thd[i].read_ready, 0, 0);
        sem_init(&thd[i].read_done, 0, 0);
        sem_init(&thd[i].write_ready, 0, 0);
        sem_init(&thd[i].write_done, 0, 0);
        if (pthread_create(&thd[i].th,
                NULL, 
                (void *) subthread, 
                (void *) &(thd[i]) ) ) {
            printf ("E: error creating thread at %i\n", i);
        }
    }
    n = 2; /* last number started   checking of prime*/
    m = 2; /* last number completed checking of prime */
    while(m < n_max) {
        if ((n + 1 - m < TMAX) && (n + 1 <= m * m) && (n + 1 <= n_max)) {
            n = n + 1;
            /* start checkprime(n) */
            sem_wait(&(thd[n%TMAX].read_ready));
            thd[n%TMAX].n = n;
            sem_post(&(thd[n%TMAX].read_done));
        }
        if ((n + 1 - m >= TMAX) || (n >= m * m) || (n >= n_max) ) {
            m++;
            /* close checkprime(m) */
            sem_wait(&(thd[m%TMAX].write_ready));
            if (thd[m%TMAX].flag) {
                /* if prime, update list with m; fill in the node before
                   linking it, since threads may be walking the list */
                q = calloc(1, sizeof(primelist));
                q->prime = m;
                tail->next = q;
                tail = q;
            }
            sem_post(&(thd[m%TMAX].write_done));
        }
    }
    for (i=0;i<TMAX;i++){
        if (pthread_cancel(thd[i].th)) {
            printf ("E: error canceling thread at %i\n", i);
        }
    }
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        /* free(p); */
        p = q;
    }
    return EXIT_SUCCESS;
}

Behavior of the C program: prime7.c

$ /usr/bin/time -p ./prime7 "$(echo 2^16 | bc)">/dev/null
real 0.77
user 0.26
sys 1.80
$ /usr/bin/time -p ./prime7 "$(echo 2^20 | bc)">/dev/null
real 12.01
user 5.18
sys 27.82

`prime8.c`: multi-threaded program, fixed number of threads are started only for the time-consuming large-number portion while each thread is written as an uninterrupted tight code.

Source code for the C program: prime8.c

#include <stdlib.h>
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#define TRUE 1
#define FALSE 0
#define TMAX 64L

struct _primelist {
    long prime; 
    struct _primelist *next;
    };
typedef struct _primelist primelist;

primelist *head=NULL, *tail=NULL;

struct _thdata {
    pthread_t           th;
    long                n0;
    long                n1;
    primelist           *head;
    primelist           *tail;
};
typedef struct _thdata thdata;

thdata       thd[TMAX];

int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}

/* thread entry point: pthread_create() expects void *(*)(void *) */
void *subthread(void *arg) {
    thdata *thd = arg;
    long i;
    primelist *p=NULL, *q=NULL;
    thd->head = NULL;
    for (i = thd->n0; i <= thd->n1; i++) {
        if (checkprime(i)) {
            q = calloc(1, sizeof(primelist));
            q->prime = i;
            if (!thd->head) {
                thd->head = q;
                p = q;
            } else {
                p->next = q;
                p = q;
            }
            thd->tail = q;
        }
    }
    return NULL;
}

int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max, i, nd;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    n = 2;
    while((n - 1) * (n - 1) <= n_max) {
        n++;
        if (checkprime(n)) {
            q= calloc(1, sizeof(primelist));
            tail->next = q;
            tail = q;
            tail->prime = n;
        }
    }
    nd = (n_max - n ) / (long) TMAX + 1L;
    for (i=0; i < TMAX; i++) {
        /* TMAX thread of checkprime loop */
        thd[i].n0 = n;
        thd[i].n1 = n + nd;
        if (thd[i].n1 >= n_max) {
            thd[i].n1 = n_max;
        }
        n = thd[i].n1;
        if (pthread_create(&thd[i].th,
                NULL, 
                (void *) subthread, 
                (void *) &(thd[i]) ) ) {
            printf ("E: error creating thread at %li\n", i);
        }
    }
    for (i=0; i < TMAX; i++) {
        /* TMAX thread of checkprime loop */
        if (pthread_join(thd[i].th, (void *) NULL) ) {
            printf ("E: error joining thread at %li\n", i);
        }
        tail->next = thd[i].head;
        tail = thd[i].tail;
    }

    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        free(p);
        p = q;
    }
    return EXIT_SUCCESS;
}

Behavior of the C program: prime8.c

$ /usr/bin/time -p ./prime8 "$(echo 2^20 | bc)">/dev/null
real 0.05
user 0.28
sys 0.00
$ /usr/bin/time -p ./prime8 "$(echo 2^24 | bc)">/dev/null
real 1.61
user 10.27
sys 0.08

Actually, this program is buggy: it crashes for some small n_max values such as 1090. We will debug this later.

Buggy behavior of the C program for 1090: prime8.c

$ ./prime8 "1090">/dev/null; echo $?
Segmentation fault
139
$ ./prime8 "1091">/dev/null; echo $?
0


Date: 2013/08/08 (initial publish), 2021/08/02 (last update)

Source: fun2-00008

