This was originally written around 2013 and may need updating. (2021)
Library
Static and dynamic libraries
Compiling a source file and stopping at the object file can be done with the -c option.
You can bundle such object files into a single archive or shared object. This is called a library.
- static library: libfoo.a
  - simple archive of object files (*.o) as “ar rcs libfoo.a *.o”
  - *.a may be used just like a bunch of *.o files while linking.
- dynamic library: libfoo.so
  - all object files (*.o) compiled with the gcc option -fPIC.
  - shared object file created by “gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o”.
  - associated symbolic links created by “ldconfig”.
TIP: In order to link a library file with the GCC -l option, its name must start with lib.
I do not go into detail here, but the gccintro package provides a good tutorial, “Introduction to GCC by Brian J. Gough”, with examples.
See also:
- How To Write Shared Libraries by Ulrich Drepper (2011-12-10)
- Good Practices in Library Design, Implementation, and Maintenance by Ulrich Drepper (2002-03-07)
- ELF Symbol Versioning by Ulrich Drepper
- Shared Library Search Paths by Russ Allbery (2011-11-11)
- RPATH issue (wiki.debian.org)
- Dynamic Linking in Linux and Windows part one and part two by Reji Thomas and Bhasker Reddy (2010-11-02)
- Library Interface Versioning in Solaris and Linux by David J. Brown and Karl Runge (October 2010-10-10)
- Learn Linux, 101: Manage shared libraries by Ian Shields (IBM DW, Date: 31 Aug 2011)
- Practice: Manage shared libraries by Tracy Bost (IBM DW, Date: 21 Jun 2011)
- Anatomy of Linux dynamic libraries by M. Tim Jones (IBM DW, Date: 20 Aug 2008)
- Use shared objects on Linux by Sachin Agrawal (IBM DW, Date: 11 May 2004)
- Program Library HOWTO by David A. Wheeler (TLDP, 11 April 2003)
- Shared objects for the object disoriented! by Ashish Bansal (IBM DW, Date: 01 Apr 2001)
GNU C Library
The Debian libc6:amd64 package offers the embedded GNU C library, which contains the standard libraries that are used by nearly all programs on the system.
Shared libraries offered by the libc6:amd64 package:
$ dpkg -L libc6:amd64|grep ^/lib/x86_64-linux-gnu/.*\.so$
/lib/x86_64-linux-gnu/libpthread-2.17.so
/lib/x86_64-linux-gnu/ld-2.17.so
/lib/x86_64-linux-gnu/libanl-2.17.so
/lib/x86_64-linux-gnu/libBrokenLocale-2.17.so
/lib/x86_64-linux-gnu/libc-2.17.so
/lib/x86_64-linux-gnu/libcidn-2.17.so
/lib/x86_64-linux-gnu/libcrypt-2.17.so
/lib/x86_64-linux-gnu/libdl-2.17.so
/lib/x86_64-linux-gnu/libm-2.17.so
/lib/x86_64-linux-gnu/libmemusage.so
/lib/x86_64-linux-gnu/libnsl-2.17.so
/lib/x86_64-linux-gnu/libnss_compat-2.17.so
/lib/x86_64-linux-gnu/libnss_dns-2.17.so
/lib/x86_64-linux-gnu/libnss_files-2.17.so
/lib/x86_64-linux-gnu/libnss_hesiod-2.17.so
/lib/x86_64-linux-gnu/libnss_nis-2.17.so
/lib/x86_64-linux-gnu/libnss_nisplus-2.17.so
/lib/x86_64-linux-gnu/libpcprofile.so
/lib/x86_64-linux-gnu/libresolv-2.17.so
/lib/x86_64-linux-gnu/librt-2.17.so
/lib/x86_64-linux-gnu/libSegFault.so
/lib/x86_64-linux-gnu/libthread_db-1.0.so
/lib/x86_64-linux-gnu/libutil-2.17.so
libc
Most of the standard C library functions are included in the libc library.
You do not need to link to the libc library explicitly using the -l option to GCC. It is always linked.
Most of the basic functions of the libc library are explained in many C programming tutorials such as “The C Programming Language, by B. W. Kernighan and Dennis M. Ritchie”. I will skip most of those mentioned in such tutorials.
The GNU C Library manual is also a good source of information.
libc: macro
There are some macros defined in the libc
library. They tend to make
programs easier to read.
Macro for exit status.
$ grep "#define.*EXIT_" /usr/include/stdlib.h
#define EXIT_FAILURE 1 /* Failing exit status. */
#define EXIT_SUCCESS 0 /* Successful exit status. */
TIP: The exit status values match the shell convention. But some programs return -1 as the non-zero value instead of 1 when errors are encountered.
TIP: Defining TRUE and FALSE macros as the Boolean values 1 and 0 is popular in C programs. They are not defined in the libc library, so normally the user has to define them.
libc: errno.h
Here are some notable items for the error handling of the libc library.
- The errno integer variable is set to a non-zero value when library functions encounter an error.
- The strerror(errno) function returns a pointer to a string that describes the meaning of the error for errno.
- The perror("foo") function produces a message on the standard error output for errno as "foo: <error message>".
The macros for the errors are explained in the manpage errno(3).
The values of the macros for the errors are defined in the <errno.h> header file, which is a bit convoluted. Arch dependent symlinks are marked as (*):
header file | action | target |
---|---|---|
<errno.h> | defines | “extern int errno;”. |
<errno.h> | includes | <bits/errno.h>. |
<bits/errno.h> (*) | includes | <linux/errno.h>. |
<linux/errno.h> | includes | <asm/errno.h>. |
<asm/errno.h> (*) | includes | <asm-generic/errno.h>. |
<asm-generic/errno.h> | defines | many error values. |
<asm-generic/errno.h> | includes | <asm-generic/errno-base.h>. |
<asm-generic/errno-base.h> | defines | important error values from 1 to 34. |
TIP: Make sure to include <errno.h> in the source if a program needs to deal with the errors generated by the libc library.
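As a minimal sketch of how errno and strerror(3) are typically used together, the following helper saves errno right after a failing open(2) and prints its description. The helper name try_open and the message format are assumptions made for illustration; they are not part of the libc library.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of libc): try to open path read-only.
 * Returns 0 on success, or the errno value set by open(2) on failure. */
int try_open(const char *path)
{
    int fd, saved;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1) {
        saved = errno;  /* save at once: later library calls may change errno */
        fprintf(stderr, "E: %s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

The returned value can be compared with the macros from errno(3), such as ENOENT for a missing file.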
libc: string operations
Unfortunately, some C string functions are known to be troublemakers.
Safe coding recommendations by busybox coders
troublemaker functions | overrun concerns | recommendation |
---|---|---|
strcpy(3) | dest string | safe_strncpy |
strncpy(3) | may fail to 0-terminate dst | safe_strncpy |
strcat(3) | dest string | strncat(3) |
gets(3) | string it gets | fgets(3) |
getwd(3) | buf string | getcwd(3) |
[v]sprintf(3) | str buffer | [v]snprintf(3) |
realpath(3) | path buffer | use with pathconf(3) |
[vf]scanf(3) | its arguments | just avoid it |
Although [vf]scanf
(3) are marked as “just avoid it”, it is not the end of the
world for the coding of the scanf
-like logic.
The combination of getline
(3) and sscanf
(3) is the most portable solution
for the safe scanf
alternative. If the incoming data is not delimited by the
newline “\n
” code, getdelim
(3) may alternatively be used in place of
getline
(3).
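The getline(3) and sscanf(3) combination can be sketched as follows. The helper name read_name_value, the “name value” line layout, and the buffer size NAME_LEN are assumptions chosen for this illustration only.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline(3) */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NAME_LEN 32  /* assumed buffer size for this sketch */

/* Hypothetical helper: read one line of the form "name value" from fp.
 * Returns the number of fields converted (0 to 2),
 * or -1 at end of file or on a read error. */
int read_name_value(FILE *fp, char name[NAME_LEN], long *value)
{
    char *line = NULL;  /* getline allocates and grows this buffer */
    size_t cap = 0;
    int n;

    assert(fp != NULL && name != NULL && value != NULL);
    if (getline(&line, &cap, fp) == -1) {
        free(line);
        return -1;
    }
    /* %31s stores at most NAME_LEN - 1 characters plus the terminator */
    n = sscanf(line, "%31s %ld", name, value);
    free(line);
    return n < 0 ? 0 : n;  /* sscanf returns EOF for an empty line */
}
```

Since getline reads the whole line into a buffer it sizes itself, only the sscanf conversions need explicit width limits.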
The use of the “m” flag in the format string, as described in scanf(3), is the other solution for the safe scanf alternative. (You need a newer POSIX.1-2008 compliant libc.) It uses “[vf]scanf("%ms", &char)” with free instead of “[vf]scanf("%s", char)” alone.
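A minimal sketch of the “m” flag follows, using sscanf(3) so that the input is a plain string. The helper name first_word is an assumption for illustration, and the code relies on a libc that supports the POSIX.1-2008 “m” modifier (glibc does).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'ed copy of the first
 * whitespace-delimited word in text, or NULL if there is none.
 * The caller must free() the result. */
char *first_word(const char *text)
{
    char *word = NULL;

    assert(text != NULL);
    /* %ms makes sscanf allocate a buffer large enough for the word */
    if (sscanf(text, "%ms", &word) != 1) {
        return NULL;
    }
    return word;
}
```

Note that the argument for “%ms” is a pointer to a char pointer, and that the buffer is allocated only when the conversion succeeds.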
libc: safe_strncpy
The safe_strncpy recommended by busybox coders is essentially strncpy with guaranteed 0-termination and a safe data length. Since it is missing in the libc library, it should be emulated by adding a custom function definition as:
Alternative safe string copy function safe_strncpy
: safe_strncpy.c
.
#include <string.h>
char* safe_strncpy(char *dst, const char *src, size_t size)
{
if (size == 0) {
return dst;
} else {
size--;
dst[size] = '\0';
return strncpy(dst, src, size);
}
}
Alternative safe string copy function safe_strncpy
: safe_strncpy.h
.
char* safe_strncpy(char *dst, const char *src, size_t size);
Let’s test string copy functions: strcpy
, strncpy
, and safe_strncpy
.
Test code for string copy functions: test_strcpy.c
.
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "safe_strncpy.h"
char s1[10], s2[10];
int main(int argc, char ** argv)
{
char *sc1 = "0123456789"; /* 11 bytes */
char *sc2 = "I gotcha!"; /* 10 bytes */
printf("%s\n", "Constant strings");
printf("\tsc1 = %s\n", sc1);
printf("\tsc2 = %s\n", sc2);
safe_strncpy(s1, sc1, 10);
safe_strncpy(s2, sc2, 10);
printf("%s\n", "Copied strings: safe_strncpy\n"
"\t\tIt should drop 9 at the end to be safe.");
printf("\ts1 = %s\n", s1);
printf("\ts2 = %s\n", s2);
strncpy(s1, sc1, 10);
printf("%s\n", "Copied strings: strncpy(s1, sc1, 10)\n"
"\t\tprintf(..., s1) can not stop at the end.");
printf("\ts1 = %s\n", s1);
printf("\ts2 = %s\n", s2);
strcpy(s1, sc1);
printf("%s\n", "Copied strings: strcpy(s1, sc1)\n"
"\t\tstrcpy overwrites onto s2.");
printf("\ts1 = %s\n", s1);
printf("\ts2 = %s\n", s2);
return EXIT_SUCCESS;
}
Test result of string copy functions by test_strcpy.c
.
$ gcc -Wall -o test_strcpy safe_strncpy.c test_strcpy.c
$ ./test_strcpy
Constant strings
sc1 = 0123456789
sc2 = I gotcha!
Copied strings: safe_strncpy
It should drop 9 at the end to be safe.
s1 = 012345678
s2 = I gotcha!
Copied strings: strncpy(s1, sc1, 10)
printf(..., s1) can not stop at the end.
s1 = 0123456789
s2 = I gotcha!
Copied strings: strcpy(s1, sc1)
strcpy overwrites onto s2.
s1 = 0123456789
s2 = I gotcha!
Only safe_strncpy
works safely as seen above.
libc: file operations
File operations in C can be done at different levels.
- low level file descriptor based operations:
  - open(2): open and possibly create the file and return a new file descriptor
  - close(2): close the file descriptor
  - lseek(2): reposition read/write file offset associated with the file descriptor
  - read(2): read from the file descriptor
  - write(2): write to the file descriptor
  - mmap(2): map file associated with the file descriptor into memory
  - fcntl(2): manipulate file descriptor
Predefined file descriptor macros
$ grep FILENO /usr/include/unistd.h
#define STDIN_FILENO 0 /* Standard input. */
#define STDOUT_FILENO 1 /* Standard output. */
#define STDERR_FILENO 2 /* Standard error output. */
- high level stream IO based operations:
  - fopen(3): open and possibly create the file and associate the stream with the file
  - fclose(3): close the stream
  - feof(3): test the end-of-file indicator for the stream
  - ferror(3): test the error indicator for the stream
  - getc(3): read a character from the binary stream
  - putc(3): write a character to the binary stream
  - fread(3): read blocks of data from the binary stream
  - fwrite(3): write blocks of data to the binary stream
  - fprintf(3): formatted text stream output conversion
  - fscanf(3): formatted text stream input conversion
Predefined file stream macros
$ grep "# *define *std" /usr/include/stdio.h
#define stdin stdin
#define stdout stdout
#define stderr stderr
Let’s learn the fundamentals of file operations by writing simple programs that obtain the size of a file or copy a file. These examples are not meant to be the fastest or the shortest code.
libc: size of a file
We can think of 4 different methods to obtain the size of a file.
- char : read and count characters
- block : read blocks and count characters
- lseek : move file offset and count characters
- stat : obtain file size from the metadata (inode) of the file
The stat method follows symlinks and reports the size of the target file (lstat(2) reports on the link itself). The lseek method seems to be the most popular one used.
Here are benchmark results of these methods using perf
(See Debug: level 4: perf).
Speed benchmark of various methods to obtain the file size.
Performance counter stats | char | block | lseek | stat |
---|---|---|---|---|
seconds time elapsed | 0.060168937 | 0.004092420 | 0.002638296 | 0.002588425 |
task-clock | 59.126037 | 3.051782 | 1.616662 | 1.634113 |
context-switches | 5 | 0 | 0 | 0 |
cpu-migrations | 1 | 1 | 0 | 1 |
page-faults | 193 | 193 | 191 | 191 |
cycles | 154,305,859 | 2,428,799 | 1,297,107 | |
stalled-cycles-frontend | 32,413,607 | 1,237,525 | 787,158 | 805,663 |
stalled-cycles-backend | 3,089,296 | 828,200 | 612,162 | 627,711 |
branches | 87,138,178 | 382,322 | 209,395 | 208,836 |
branch-misses | 14,795 | 9,023 |
If you wish to do more than just count characters, the other methods may give a good starting point for such programs. I list the source of each variant of size.c below.
Read size of a file (char)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdio.h> /* printf, perror */
#include <errno.h> /* perror */
#include <stdlib.h> /* exit */
#include <fcntl.h> /* open */
#include <sys/stat.h> /* open */
#include <sys/types.h> /* open lseek */
#include <unistd.h> /* lseek */
#include <locale.h> /* setlocale */
int
main(int argc, char* argv[])
{
FILE *f;
off_t size = 0;
if (argc != 2) {
printf("E: Need a filename as an argument.\n");
return EXIT_FAILURE;
}
f = fopen(argv[1], "r");
if (f == NULL) {
perror("E: Can not open input file");
exit(EXIT_FAILURE);
}
for (;;) {
fgetc(f);
if (ferror(f)) {
perror("E: Error reading input file");
exit(EXIT_FAILURE);
}
if (feof(f)) {
break;
} else {
size += 1;
}
}
if (fclose(f)) {
perror("E: Can not close input file");
exit(EXIT_FAILURE);
}
setlocale(LC_ALL,"");
printf("\nFile size: %'lld\n", (long long) size);
return EXIT_SUCCESS; /* returning the printf result would give a non-zero exit status */
}
Read size of a file (block)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h> /* printf, perror, fread */
#include <errno.h> /* perror */
#include <locale.h> /* setlocale */
#define BUFFSIZE (1024*4)
int
main(int argc, char* argv[])
{
FILE *f;
char *buf;
size_t n = BUFFSIZE, i, size = 0;
if (argc != 2) {
printf("E: Need a filename as an argument.\n");
return EXIT_FAILURE;
}
if ((buf = (char *) malloc(n)) == NULL) {
perror("E: Can not make a buffer");
exit(EXIT_FAILURE);
}
if ((f = fopen(argv[1], "r")) == NULL) {
perror("E: Can not open input file");
exit(EXIT_FAILURE);
}
for (;;) {
i = fread(buf, 1, n, f);
if (ferror(f)) {
perror("E: Error reading input file");
exit(EXIT_FAILURE);
} else if (i == 0) {
break;
} else {
size += i;
}
}
if (fclose(f)) {
perror("E: Can not close input file");
exit(EXIT_FAILURE);
}
setlocale(LC_ALL,"");
printf("\nFile size: %'zu\n", size);
return EXIT_SUCCESS; /* returning the printf result would give a non-zero exit status */
}
Read size of a file (lseek)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdio.h> /* printf, perror */
#include <errno.h> /* perror */
#include <stdlib.h> /* exit */
#include <fcntl.h> /* open */
#include <sys/stat.h> /* open */
#include <sys/types.h> /* open, lseek */
#include <unistd.h> /* lseek */
#include <locale.h> /* setlocale */
int
main(int argc, char* argv[])
{
int fd;
off_t size;
if (argc != 2) {
printf("E: Need a filename as an argument.\n");
return EXIT_FAILURE;
}
if ((fd = open(argv[1], O_RDONLY)) == -1) {
perror("E: Can not open input file");
exit(EXIT_FAILURE);
}
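size = lseek(fd, 0, SEEK_END);
if (size == (off_t) -1) { /* e.g. the input is a pipe */
perror("E: Can not seek input file");
exit(EXIT_FAILURE);
}
setlocale(LC_ALL,"");
printf("\nFile size: %'lld\n", (long long) size);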
return EXIT_SUCCESS;
}
Read size of a file (stat)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdio.h> /* printf, perror */
#include <errno.h> /* perror */
#include <stdlib.h> /* exit */
#include <sys/types.h> /* stat */
#include <sys/stat.h> /* stat */
#include <unistd.h> /* stat */
#include <locale.h> /* setlocale */
int
main(int argc, char* argv[])
{
struct stat st;
off_t size;
if (argc != 2) {
printf("E: Need a filename as an argument.\n");
return EXIT_FAILURE;
}
if (stat(argv[1], &st) == -1) {
perror("E: Can not stat input file");
exit(EXIT_FAILURE);
}
size = st.st_size;
setlocale(LC_ALL,"");
printf("\nFile size: %'lld\n", (long long) size);
return EXIT_SUCCESS;
}
Each of the above examples can be compiled as follows.
Example of compiling size.c
.
$ gcc -Wall -o size size.c
libc: copy a file
We can think of 5 different methods to copy a file.
- char : copy a character at a time
- block : copy a block (4 KiB) at a time
- block big : copy a block (4 MiB) at a time
- mmap memcpy : use mmap(2) to map input and output files while copying data with memcpy(3).
- mmap write : use mmap(2) to map input file while writing data with write(2) from the memory.
Here are benchmark results of these methods using perf
(See Debug: level 4: perf).
Speed benchmark of various methods to copy a large file about 2.4 MiB.
Performance counter stats | char | block | block big | mmap memcpy | mmap write |
---|---|---|---|---|---|
seconds time elapsed | 0.089096423 | 0.015654253 | 0.016963338 | 0.018813373 | 0.018040972 |
task-clock | 78.600617 | 5.916846 | 7.286212 | 8.114573 | 6.747344 |
context-switches | 12 | 5 | 5 | 5 | 5 |
cpu-migrations | 0 | 0 | 0 | 0 | 0 |
page-faults | 132 | 132 | 715 | 1,283 | 694 |
cycles | 213,622,432 | 8,787,602 | 9,975,035 | 9,506,449 | 8,258,812 |
stalled-cycles-frontend | 46,249,968 | 5,289,960 | 4,939,101 | 4,771,343 | 4,421,422 |
stalled-cycles-backend | 8,069,075 | 3,617,957 | 3,660,122 | 3,965,894 | 3,223,960 |
branches | 118,042,429 | 1,614,855 | 1,639,492 | 1,555,953 | 1,442,130 |
branch-misses | 20,800 | 12,048 | 10,295 | 8,539 |
Speed benchmark of various methods to copy a small file about 2 KiB.
Performance counter stats | char | block | block big | mmap memcpy | mmap write |
---|---|---|---|---|---|
seconds time elapsed | 0.002350776 | 0.001894332 | 0.001848917 | 0.002023421 | 0.002016262 |
task-clock | 1.335467 | 1.008275 | 0.954096 | 1.053580 | 1.013171 |
context-switches | 1 | 1 | 1 | 1 | 1 |
cpu-migrations | 0 | 0 | 0 | 0 | 0 |
page-faults | 132 | 132 | 132 | 117 | 111 |
cycles | 731,955 | 769,188 | 606,346 | 616,053 | |
stalled-cycles-frontend | 696,792 | 625,996 | 648,758 | 579,552 | 552,678 |
stalled-cycles-backend | 536,087 | 495,138 | 513,756 | 455,248 | 434,162 |
branches | 153,252 | 125,526 | 127,090 | 102,813 | 99,467 |
branch-misses | 3,292 | 2,232 | 2,474 |
The char method is the slowest, as expected.
For the large file, the block method is slightly faster than the remaining methods. For the small file, all methods finish within about 0.5 ms of each other.
If you wish to do more than just copy a file, the other methods may give a good starting point for such programs. For example, if many programs access the same file simultaneously, use of mmap(2) should have a major advantage over the simple block method. I list the source of each variant of cp.c below.
Copy a file (char)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit */
#include <stdio.h> /* printf, perror */
#include <errno.h> /* perror */
#include <locale.h> /* setlocale */
int
main(int argc, char* argv[])
{
FILE *fi, *fo;
int i;
if (argc != 3) {
printf("E: Need 2 filenames as arguments.\n");
return EXIT_FAILURE;
}
if ((fi = fopen(argv[1], "r")) == NULL) {
perror("E: Can not open input file");
return EXIT_FAILURE;
}
if ((fo = fopen(argv[2], "w")) == NULL) {
perror("E: Can not open output file");
return EXIT_FAILURE;
}
for (;;) {
i = getc(fi);
if (ferror(fi)) {
perror("E: Error reading input file");
exit(EXIT_FAILURE);
} else if (feof(fi)) {
break;
} else {
putc(i, fo);
}
}
if (fclose(fi)) {
perror("E: Can not close input file");
exit(EXIT_FAILURE);
}
if (fclose(fo)) {
perror("E: Can not close output file");
exit(EXIT_FAILURE);
}
return EXIT_SUCCESS;
}
Copy a file (block)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h> /* printf, perror, fread, fwrite */
#include <errno.h> /* perror */
#include <locale.h> /* setlocale */
#define BUFFSIZE (1024*4)
int
main(int argc, char* argv[])
{
FILE *fi, *fo;
char *buf;
size_t n = BUFFSIZE, i;
if (argc != 3) {
printf("E: Need 2 filenames as arguments.\n");
return EXIT_FAILURE;
}
if ((fi = fopen(argv[1], "r")) == NULL) {
perror("E: Can not open input file");
return EXIT_FAILURE;
}
if ((fo = fopen(argv[2], "w")) == NULL) {
perror("E: Can not open output file");
return EXIT_FAILURE;
}
if ((buf = (char *) malloc(n)) == NULL) {
perror("E: Can not make a buffer");
exit(EXIT_FAILURE);
}
for (;;) {
i = fread(buf, 1, n, fi);
if (ferror(fi)) {
perror("E: Error reading input file");
exit(EXIT_FAILURE);
} else if (i == 0) {
break;
} else {
if (fwrite(buf, 1, i, fo) != i) {
perror("E: Error writing output file");
exit(EXIT_FAILURE);
}
}
}
if (fclose(fi)) {
perror("E: Can not close input file");
exit(EXIT_FAILURE);
}
if (fclose(fo)) {
perror("E: Can not close output file");
exit(EXIT_FAILURE);
}
return EXIT_SUCCESS;
}
Copy a file (big block) replaces #define BUFFSIZE (1024*4)
in the above with
#define BUFFSIZE (1024*1024*4)
.
Copy a file (mmap+memcpy)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h> /* printf, perror */
#include <errno.h> /* perror */
#include <fcntl.h> /* open */
#include <unistd.h> /* lseek, write */
#include <sys/stat.h> /* open */
#include <sys/types.h> /* open, lseek */
#include <sys/mman.h> /* mmap */
#include <locale.h> /* setlocale */
#include <string.h> /* memcpy */
int
main(int argc, char* argv[])
{
int fdi, fdo;
void *src, *dst;
size_t size;
ssize_t offset;
if (argc != 3) {
printf("E: Need 2 filenames as arguments.\n");
return EXIT_FAILURE;
}
if ((fdi = open(argv[1], O_RDONLY)) < 0) {
perror("E: Can not open input file");
exit(EXIT_FAILURE);
}
size = lseek(fdi, 0, SEEK_END);
src = mmap(NULL, size, PROT_READ, MAP_SHARED, fdi, 0);
if (src == MAP_FAILED) {
perror("E: Can not map input file");
exit(EXIT_FAILURE);
}
if ((fdo = open(argv[2], O_CREAT | O_RDWR | O_TRUNC, 00666)) < 0) {
perror("E: Can not open output file");
exit(EXIT_FAILURE);
}
offset = lseek(fdo, size -1, SEEK_SET);
if (offset == (ssize_t) -1) {
perror("E: lseek() error");
exit(EXIT_FAILURE);
}
offset = write(fdo, "", 1); /* dummy write at the end */
if (offset == (ssize_t) -1) {
perror("E: write() error");
exit(EXIT_FAILURE);
}
dst = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fdo, 0);
if (dst == MAP_FAILED) {
perror("E: Can not map output file");
exit(EXIT_FAILURE);
}
memcpy(dst, src, size);
if (munmap(src, size)) {
perror("E: Can not unmap input file");
exit(EXIT_FAILURE);
}
if (munmap(dst, size)) {
perror("E: Can not unmap output file");
exit(EXIT_FAILURE);
}
if (close(fdi)) {
perror("E: Can not close input file");
exit(EXIT_FAILURE);
}
if (close(fdo)) {
perror("E: Can not close output file");
exit(EXIT_FAILURE);
}
return EXIT_SUCCESS;
}
TIP: The dummy write of 1 byte of 0 to the output file after lseek is the idiom to set the size of the output file for mmap(2). It will be overwritten by the subsequent memcpy(3).
Copy a file (mmap+write)
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h> /* exit, malloc */
#include <stdio.h> /* printf, perror */
#include <errno.h> /* perror */
#include <fcntl.h> /* open */
#include <unistd.h> /* stat */
#include <sys/types.h> /* open, lseek */
#include <sys/stat.h> /* open stat */
#include <sys/mman.h> /* mmap */
#include <locale.h> /* setlocale */
#include <string.h> /* memcpy */
int
main(int argc, char* argv[])
{
int fdi, fdo;
struct stat st;
void *src;
size_t size;
ssize_t offset;
if (argc != 3) {
printf("E: Need 2 filenames as arguments.\n");
return EXIT_FAILURE;
}
if ((fdi = open(argv[1], O_RDONLY)) < 0) {
perror("E: Can not open input file");
exit(EXIT_FAILURE);
}
if (stat(argv[1], &st) == -1) {
perror("E: Can not stat input file");
exit(EXIT_FAILURE);
}
size = st.st_size;
src = mmap(NULL, size, PROT_READ, MAP_SHARED, fdi, 0);
if (src == MAP_FAILED) {
perror("E: Can not map input file");
exit(EXIT_FAILURE);
}
if ((fdo = open(argv[2], O_CREAT | O_RDWR | O_TRUNC, 00666)) < 0) {
perror("E: Can not open output file");
exit(EXIT_FAILURE);
}
offset = write(fdo, src, size);
if (offset == (ssize_t) -1) {
perror("E: write() error");
exit(EXIT_FAILURE);
}
if (munmap(src, size)) {
perror("E: Can not unmap input file");
exit(EXIT_FAILURE);
}
if (close(fdi)) {
perror("E: Can not close input file");
exit(EXIT_FAILURE);
}
if (close(fdo)) {
perror("E: Can not close output file");
exit(EXIT_FAILURE);
}
return EXIT_SUCCESS;
}
Example of compiling cp.c
.
$ gcc -Wall -o cp cp.c
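One point the mmap+write variant does not cover: write(2) may write fewer bytes than requested, and it only returns -1 on an outright error. A common way to handle this is to loop until everything is written. The helper below is a minimal sketch; the name write_all is an assumption and is not part of the libc library.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes from buf to fd.
 * Returns 0 on success, or -1 on error with errno set by write(2). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    ssize_t n;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR) {
                continue;  /* interrupted by a signal: retry */
            }
            return -1;
        }
        p += n;            /* advance past the bytes already written */
        len -= (size_t) n;
    }
    return 0;
}
```

In the mmap+write example, a call such as write_all(fdo, src, size) could stand in for the single write(2) call.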
libc: setlocale
For decimal conversion functions provided by the libc
library such as
printf
(3), the 3-digit-grouping behavior depends on the locale. Use
setlocale
(3) to set the locale.
Localization example of printf
as lprintf.c
/* vi:set ts=4 sts=4 expandtab: */
#include <stdlib.h>
#include <stdio.h>
#include <locale.h>
int
main(int argc, char* argv[])
{
int n = 12345678;
printf("init %%i -> %i\n", n);
printf("init %%'i -> %'i\n", n);
setlocale(LC_ALL,"");
printf("env %%i -> %i\n", n);
printf("env %%'i -> %'i\n", n);
setlocale(LC_ALL,"C");
printf("C %%i -> %i\n", n);
printf("C %%'i -> %'i\n", n);
setlocale(LC_ALL,"en_US.UTF-8");
printf("en_US.UTF-8 %%i -> %i\n", n);
printf("en_US.UTF-8 %%'i -> %'i\n", n);
setlocale(LC_ALL,"fr_FR.UTF-8");
printf("fr_FR.UTF-8 %%i -> %i\n", n);
printf("fr_FR.UTF-8 %%'i -> %'i\n", n);
setlocale(LC_ALL,"de_DE.UTF-8");
printf("de_DE.UTF-8 %%i -> %i\n", n);
printf("de_DE.UTF-8 %%'i -> %'i\n", n);
setlocale(LC_ALL,"ja_JP.UTF-8");
printf("ja_JP.UTF-8 %%i -> %i\n", n);
printf("ja_JP.UTF-8 %%'i -> %'i\n", n);
return EXIT_SUCCESS;
}
Run lprintf
$ gcc -Wall -o lprintf lprintf.c
$ ./lprintf
init %i -> 12345678
init %'i -> 12345678
env %i -> 12345678
env %'i -> 12,345,678
C %i -> 12345678
C %'i -> 12345678
en_US.UTF-8 %i -> 12345678
en_US.UTF-8 %'i -> 12,345,678
fr_FR.UTF-8 %i -> 12345678
fr_FR.UTF-8 %'i -> 12 345 678
de_DE.UTF-8 %i -> 12345678
de_DE.UTF-8 %'i -> 12.345.678
ja_JP.UTF-8 %i -> 12345678
ja_JP.UTF-8 %'i -> 12,345,678
TIP: The text translation mechanism also uses the locale. See gettext
(3) and “info gettext
”.
libm
Although most of the standard C library functions are included in the libc library, some math related library functions are in the separate libm library.
So such a program needs to be linked not just to libc but also to libm.
Let’s consider math.c
to calculate sin(60 degree)
.
Source code: math.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
double
sindeg(double x)
{
return sin(x * 3.141592 / 180.0);
}
int
main()
{
double x, y;
x = 60.0;
y = sindeg(x);
printf("angle = %f degree, sin(angle) = %f\n", x, y);
exit(0);
}
Let’s compile math.c
while linking to the libm
library to create an ELF executable object math
and run it.
$ gcc -Wall -omath math.c -lm
$ ./math
angle = 60.000000 degree, sin(angle) = 0.866025
TIP: The linked library is specified after the -l
option to GCC while removing the leading lib
from the library name.
Let’s list the libraries linked to the ELF executable object math.
$ ldd math
The listed libraries are:
- linux-vdso.so.1: Linux Virtual Dynamic Shared Object
- libm.so.6: The GNU C Library (glibc, math function support)
- libc.so.6: The GNU C Library (glibc)
- /lib64/ld-linux-x86-64.so.2: dynamic linker/loader
We can split math.c
into 3 files and compile each code piece.
Source code: math-main.c containing the main() function only
#include <stdio.h>
#include <stdlib.h>
#include "sindeg.h"
int main()
{
double x, y;
x = 60.0;
y = sindeg(x);
printf("angle = %f degree, sin(angle) = %f\n", x, y);
exit(0);
}
Source code: sindeg.c containing the sindeg() function only
#include <math.h>
double sindeg(double x)
{
return sin(x * 3.141592 / 180.0);
}
Source code: sindeg.h containing the header information of sindeg()
double sindeg(double x);
Let’s compile these into separate object files (*.o files) with the -c option and link them into an executable math-main specified with the -o option.
Building math-main via separate object files and running it.
$ gcc -Wall -c math-main.c
$ gcc -Wall -c sindeg.c
$ gcc -Wall -o math-main math-main.o sindeg.o -lm
$ ./math-main
angle = 60.000000 degree, sin(angle) = 0.866025
libdl
libdl provides the following generic functions for the dynamic loader.
- dlopen(3): POSIX
- dlerror(3): POSIX
- dlsym(3): POSIX
- dlclose(3): POSIX
- dladdr(3): Glibc extensions
- dlvsym(3): Glibc extensions
Let’s convert math.c
to math-dl.c
which uses libm
via dynamic linking
loader provided by libdl
. Here, function names prefixed with v
are defined
as wrappers for the dlopen
, dlsym
, and dlclose
functions providing
verbose error reporting. So the core is just main()
.
Source code: math-dl.c
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
void *
vdlopen(const char *filename, int flag) {
void *dl_handle;
dl_handle = dlopen(filename, flag); /* honor the caller's flag */
if (!dl_handle) {
printf("Load %s library, error: %s\n", filename, dlerror());
exit(EXIT_FAILURE);
} else {
printf("Load %s library, success!\n", filename);
}
return dl_handle;
}
void *
vdlsym(void *handle, const char *symbol) {
double (*func)(double); /* Yep ... this is only for double */
func = dlsym(handle, symbol);
if (!func) {
printf("Load %s symbol, error: %s\n", symbol, dlerror());
exit(EXIT_FAILURE);
} else {
printf("Load %s symbol, success!\n", symbol);
}
return func;
}
int
vdlclose(void *handle) {
int i;
i = dlclose(handle);
if (i) {
printf("Unload error %i: %s\n", i, dlerror());
exit(EXIT_FAILURE);
} else {
printf("Unload, success!\n");
}
return i;
}
int
main()
{
double x, y;
x = 60.0;
char *lib = "libm.so";
char *method = "sin";
void *dl_handle;
double (*func)(double);
dl_handle = vdlopen(lib, RTLD_LAZY);
func = vdlsym(dl_handle, method);
y = (*func)(x * 3.141592 / 180.0);
printf("angle = %f degree, sin(angle) = %f\n", x, y);
vdlclose(dl_handle);
exit(0);
}
Let’s compile this math-dl.c
by not directly linking to libm
but to libdl
.
Building math-dl and running it.
$ gcc -Wall -rdynamic -omath-dl math-dl.c -ldl
$ ./math-dl
Load libm.so library, success!
Load sin symbol, success!
angle = 60.000000 degree, sin(angle) = 0.866025
Unload, success!
The -rdynamic
option is used to add all symbols to the dynamic symbol table
to permit backtraces with the use of dlopen. The -ldl
option is for the
libdl.so
library. You do not need the -lm
option here but need libm.so
library installed in the system.
libpthread
Threads can efficiently implement parallelism on shared memory multiprocessor architectures, such as SMPs. Creating a thread does not copy every resource, unlike the fork-exec multiprocessing mechanism of UNIX-like systems. POSIX threads are supported by modern GNU/Linux (with Linux kernel 2.6 or newer) through the libpthread library. Here are some references and tutorials.
- Introduction to Parallel Computing
- POSIX Threads Programming
- POSIX threads explained (updated, based on IBM developerWorks: POSIX threads explained.)
TIP: We should focus on reading tutorials which were written after the native POSIX thread library (NPTL) support arrived. This means a tutorial should be newer than 2006.
The actual execution speed of a program on the modern CPU can be affected by many issues other than the utilization of CPU cores:
- Out-of-order execution
- Instruction-level parallelism
- Instruction pipeline
- CPU cache
- Cache coherence
- Amdahl’s law
I am no expert to tell how all these fit together. But seeing is believing. Let’s use touched-up C programs that list prime numbers, based on the while-loop style in C with the list (variants), to experiment with POSIX thread programming. This algorithm has some sequential nature, so partitioning the program into parallel and mostly independent code is non-trivial. The observed execution times differ significantly.
- prime5.c: fast
  - single-threaded program, an uninterrupted tight code.
- prime6.c: slow
  - multi-threaded program, a thread started for each integer.
- prime7.c: very slow
  - multi-threaded program, a fixed number of threads are started and controlled via semaphore.
- prime8.c: very fast
  - multi-threaded program, a fixed number of threads are started only for the time consuming large number portion while each thread is written as an uninterrupted tight code.
Here are the benchmark results of execution times for the programs listed below.
Speed benchmark of various programs
Program | real(2^20) | user(2^20) | sys(2^20) | real(2^24) | user(2^24) | sys(2^24) |
---|---|---|---|---|---|---|
prime5.c | 0.15 | 0.15 | 0.00 | 5.62 | 5.60 | 0.02 |
prime6.c | 0.48 | 0.09 | 1.15 | 7.38 | 1.52 | 18.30 |
prime7.c | 0.77 | 0.26 | 1.80 | 12.01 | 5.18 | 27.82 |
prime8.c | 0.05 | 0.28 | 0.00 | 1.61 | 10.27 | 0.08 |
Here, the time reported by the /usr/bin/time -p
command is in POSIX standard 1003.2 style:
real
: Elapsed real (wall clock) time used by the process, in seconds.user
: Total number of CPU-seconds that the process used directly (in user mode), in seconds.sys
: Total number of CPU-seconds used by the system on behalf of the process (in kernel mode), in seconds.
The `user` time and the `sys` time are summed over all threads, so for multi-threaded programs they may end up larger than the `real` time. There is a similar `sar` command offered by the `sysstat` and `atsar` packages, which has more functionality. But if you are looking for more insight into where the time goes, you should consider using the `perf` command. See Debug: level 4; perf.
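To see the same effect from inside a program, here is a minimal sketch that compares wall-clock time with process CPU time around a group of busy threads. It is not part of the prime programs; `now()`, `spin()`, `measure()`, and `NTHREADS` are hypothetical names chosen for this illustration. `CLOCK_PROCESS_CPUTIME_ID` reports user and system CPU time combined for all threads of the process, so it roughly corresponds to `user` + `sys` above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

#define NTHREADS 4 /* hypothetical thread count for this sketch */

/* Read the given clock and return its value in seconds. */
static double now(clockid_t id) {
    struct timespec ts;
    clock_gettime(id, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* Busy loop that burns CPU time without sleeping. */
static void *spin(void *arg) {
    volatile unsigned long x = 0;
    unsigned long i, count = *(unsigned long *) arg;
    for (i = 0; i < count; i++) {
        x += i;
    }
    return NULL;
}

/* Run NTHREADS spinners and store the elapsed wall and CPU seconds.
   Returns 0 on success, -1 if a thread could not be started. */
int measure(unsigned long count, double *wall, double *cpu) {
    pthread_t th[NTHREADS];
    int i, started = 0;
    double w0, c0;
    assert(wall != NULL && cpu != NULL);
    w0 = now(CLOCK_MONOTONIC);
    c0 = now(CLOCK_PROCESS_CPUTIME_ID);
    for (i = 0; i < NTHREADS; i++) {
        if (pthread_create(&th[i], NULL, spin, &count) != 0) {
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(th[i], NULL);
    }
    *wall = now(CLOCK_MONOTONIC) - w0;
    *cpu = now(CLOCK_PROCESS_CPUTIME_ID) - c0;
    return started == NTHREADS ? 0 : -1;
}
```

Calling `measure()` from `main()` and printing both values on a multi-core machine would typically show the CPU figure exceeding the wall figure, just as `user` exceeds `real` for `prime8`. Link with `-pthread`.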
TIP: Unless used properly, threads do not guarantee faster code.
prime5.c
: single-threaded program written as uninterrupted tight code.
Source code for the C program: prime5.c
#include <stdlib.h>
#include <stdio.h>
#define TRUE 1
#define FALSE 0
struct _primelist {
    long prime;
    struct _primelist *next;
};
typedef struct _primelist primelist;
primelist *head=NULL, *tail=NULL;
/* Return TRUE if no prime found so far divides n. */
int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}
int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    n = 2;
    while(n < n_max) { /* n is incremented first, so this checks 3..n_max */
        n++;
        if (checkprime(n)) {
            q= calloc(1, sizeof(primelist));
            tail->next = q;
            tail = q;
            tail->prime = n;
        }
    }
    /* print the list of primes */
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    /* release the list */
    p=head;
    while(p) {
        q = p->next;
        free(p);
        p = q;
    }
    return EXIT_SUCCESS;
}
Behavior of the C program: prime5.c
$ /usr/bin/time -p ./prime5 "$(echo 2^20 | bc)">/dev/null
real 0.15
user 0.15
sys 0.00
$ /usr/bin/time -p ./prime5 "$(echo 2^24 | bc)">/dev/null
real 5.62
user 5.60
sys 0.02
prime6.c
: multi-threaded program in which a thread is started for each integer.
Source code for the C program: prime6.c
#include <stdlib.h>
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#define TRUE 1
#define FALSE 0
#define TMAX 64L
struct _primelist {
    long prime;
    struct _primelist *next;
};
typedef struct _primelist primelist;
struct _thdata {
    pthread_t th;
    int flag;
};
typedef struct _thdata thdata;
primelist *head=NULL, *tail=NULL;
thdata thd[TMAX];
/* Return TRUE if no prime found so far divides n. */
int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}
/* Thread start routine with the signature pthread_create() expects. */
void *checkprime_thread(void *arg) {
    return (void *) (long) checkprime((long) arg);
}
int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max, m;
    void *ret;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    /* thdata = calloc(TMAX, sizeof(thdata)); */
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    n = 2; /* last number checking for prime */
    m = 2; /* last number checked for prime */
    while(m < n_max) {
        if ((n + 1 - m < TMAX) && (n + 1 <= m * m) && (n + 1 <= n_max)) {
            n = n + 1;
            /* start checkprime(n) */
            if (pthread_create(&thd[n%TMAX].th,
                    NULL,
                    checkprime_thread,
                    (void *) n) ) {
                printf ("E: error creating thread at %li\n", n);
            }
        }
        if ((n + 1 - m >= TMAX) || (n + 1 > m * m) || (n + 1 > n_max) ) {
            m++;
            /* close checkprime(m) */
            pthread_join(thd[m%TMAX].th, &ret);
            thd[m%TMAX].flag = (int) (long) ret;
            if (thd[m%TMAX].flag) {
                /* if prime, update list with m */
                q = calloc(1, sizeof(primelist));
                q->prime = m; /* fill in the node before linking it */
                tail->next = q;
                tail = q;
            }
        }
    }
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        /* free(p); */
        p = q;
    }
    return EXIT_SUCCESS;
}
Behavior of the C program: prime6.c
$ /usr/bin/time -p ./prime6 "$(echo 2^16 | bc)">/dev/null
real 0.48
user 0.09
sys 1.15
$ /usr/bin/time -p ./prime6 "$(echo 2^20 | bc)">/dev/null
real 7.38
user 1.52
sys 18.30
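`pthread_create()` expects a start routine of type `void *(*)(void *)`, and `pthread_join()` hands back a `void *`. Below is a minimal sketch of passing a `long` in and getting an `int` flag out through that interface. It assumes a `long` fits in an `intptr_t` (true on common Linux ABIs); `is_prime()`, `check_thread()`, and `check_in_thread()` are hypothetical names for this illustration, and `is_prime()` is a simplified stand-in for `checkprime()`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Plain trial division; a simplified stand-in for checkprime(). */
static int is_prime(long n) {
    long i;
    if (n < 2) {
        return 0;
    }
    for (i = 2; i <= n / i; i++) {
        if (n % i == 0) {
            return 0;
        }
    }
    return 1;
}

/* Start routine: unpack the number from the pointer-sized argument
   and pack the 0/1 result into the returned pointer value. */
static void *check_thread(void *arg) {
    long n = (long) (intptr_t) arg;
    return (void *) (intptr_t) is_prime(n);
}

/* Check n in a separate thread.
   Returns 1 (prime), 0 (not prime), or -1 on a thread error. */
int check_in_thread(long n) {
    pthread_t th;
    void *ret;
    assert(sizeof(long) <= sizeof(intptr_t)); /* assumption of this sketch */
    if (pthread_create(&th, NULL, check_thread, (void *) (intptr_t) n) != 0) {
        return -1;
    }
    if (pthread_join(th, &ret) != 0) {
        return -1;
    }
    return (int) (intptr_t) ret;
}
```

Receiving the result in a `void *` variable and converting it afterwards avoids letting `pthread_join()` write a pointer-sized value into an `int`.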
prime7.c
: multi-threaded program in which a fixed number of threads are started and controlled via semaphores.
Source code for the C program: prime7.c
#include <stdlib.h>
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#define TRUE 1
#define FALSE 0
#define TMAX 64L
struct _primelist {
    long prime;
    struct _primelist *next;
};
typedef struct _primelist primelist;
struct _thdata {
    pthread_t th;
    long n;
    int flag;
    sem_t read_ready;
    sem_t read_done;
    sem_t write_ready;
    sem_t write_done;
};
typedef struct _thdata thdata;
primelist *head=NULL, *tail=NULL;
thdata thd[TMAX];
/* Return TRUE if no prime found so far divides n. */
int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}
/* Worker loop: wait for a number, check it, and hand back the flag. */
void *subthread(void *arg) {
    thdata *thd = arg;
    while(TRUE) {
        sem_post(&(thd->read_ready));
        sem_wait(&(thd->read_done));
        thd->flag = checkprime(thd->n);
        sem_post(&(thd->write_ready));
        sem_wait(&(thd->write_done));
    }
    return NULL; /* not reached; the thread is ended by pthread_cancel() */
}
int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max, m;
    int i;
    if (argc < 2) {
        fprintf(stderr, "usage: %s n_max\n", argv[0]);
        return EXIT_FAILURE;
    }
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    for (i=0;i<TMAX;i++){
        sem_init(&thd[i].read_ready, 0, 0);
        sem_init(&thd[i].read_done, 0, 0);
        sem_init(&thd[i].write_ready, 0, 0);
        sem_init(&thd[i].write_done, 0, 0);
        if (pthread_create(&thd[i].th,
                NULL,
                subthread,
                &(thd[i]) ) ) {
            printf ("E: error creating thread at %i\n", i);
        }
    }
    n = 2; /* last number started checking of prime*/
    m = 2; /* last number completed checking of prime */
    while(m < n_max) {
        if ((n + 1 - m < TMAX) && (n + 1 <= m * m) && (n + 1 <= n_max)) {
            n = n + 1;
            /* start checkprime(n) */
            sem_wait(&(thd[n%TMAX].read_ready));
            thd[n%TMAX].n = n;
            sem_post(&(thd[n%TMAX].read_done));
        }
        if ((n + 1 - m >= TMAX) || (n >= m * m) || (n >= n_max) ) {
            m++;
            /* close checkprime(m) */
            sem_wait(&(thd[m%TMAX].write_ready));
            if (thd[m%TMAX].flag) {
                /* if prime, update list with m */
                q = calloc(1, sizeof(primelist));
                q->prime = m; /* fill in the node before linking it */
                tail->next = q;
                tail = q;
            }
            sem_post(&(thd[m%TMAX].write_done));
        }
    }
    for (i=0;i<TMAX;i++){
        if (pthread_cancel(thd[i].th)) {
            printf ("E: error canceling thread at %i\n", i);
        }
    }
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        /* free(p); */
        p = q;
    }
    return EXIT_SUCCESS;
}
Behavior of the C program: prime7.c
$ /usr/bin/time -p ./prime7 "$(echo 2^16 | bc)">/dev/null
real 0.77
user 0.26
sys 1.80
$ /usr/bin/time -p ./prime7 "$(echo 2^20 | bc)">/dev/null
real 12.01
user 5.18
sys 27.82
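The semaphore handshake used above can be hard to follow with four semaphores per thread. Here is a reduced sketch with a single long-lived worker and two semaphores, one per direction. It is an illustration and not the code of `prime7.c`: `struct job`, `worker()`, and `square_all()` are hypothetical names, the worker squares a number instead of checking primality, and a negative input is used as an assumed "please exit" signal so the thread can be joined instead of cancelled.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

struct job {
    sem_t request;  /* posted by the caller when input is ready */
    sem_t response; /* posted by the worker when output is ready */
    long input;     /* a negative value asks the worker to exit */
    long output;
};

/* Worker loop: wait for a request, compute, signal the response. */
static void *worker(void *arg) {
    struct job *j = arg;
    for (;;) {
        sem_wait(&j->request);
        if (j->input < 0) {
            break;
        }
        j->output = j->input * j->input;
        sem_post(&j->response);
    }
    return NULL;
}

/* Square values[0..count-1] in place using one worker thread.
   The values must be non-negative. Returns 0 on success, -1 on error. */
int square_all(long *values, int count) {
    struct job j;
    pthread_t th;
    int i;
    assert(values != NULL && count >= 0);
    sem_init(&j.request, 0, 0);
    sem_init(&j.response, 0, 0);
    if (pthread_create(&th, NULL, worker, &j) != 0) {
        sem_destroy(&j.request);
        sem_destroy(&j.response);
        return -1;
    }
    for (i = 0; i < count; i++) {
        j.input = values[i];
        sem_post(&j.request);  /* hand the input to the worker */
        sem_wait(&j.response); /* block until the result is written */
        values[i] = j.output;
    }
    j.input = -1;              /* ask the worker to leave its loop */
    sem_post(&j.request);
    pthread_join(th, NULL);
    sem_destroy(&j.request);
    sem_destroy(&j.response);
    return 0;
}
```

Every item costs one post and one wait in each direction. When the work per item is as small as a single primality check, this synchronization can easily cost more than the work itself, which is in line with the large `sys` times observed for `prime7`.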
prime8.c
: multi-threaded program in which a fixed number of threads are started only for the time-consuming large-number portion, with each thread written as uninterrupted tight code.
Source code for the C program: prime8.c
#include <stdlib.h>
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#define TRUE 1
#define FALSE 0
#define TMAX 64L
struct _primelist {
    long prime;
    struct _primelist *next;
};
typedef struct _primelist primelist;
primelist *head=NULL, *tail=NULL;
struct _thdata {
    pthread_t th;
    long n0;
    long n1;
    primelist *head;
    primelist *tail;
};
typedef struct _thdata thdata;
thdata thd[TMAX];
int checkprime(long n) {
    primelist *p;
    long i, n_div_i, n_mod_i;
    int flag;
    flag = TRUE;
    p = head;
    while(p) {
        i = p->prime;
        n_div_i = n / i;
        n_mod_i = n % i;
        if (n_mod_i == 0) {
            flag = FALSE;
            break; /* found not to be prime */
        }
        if (n_div_i < i) {
            break; /* no use doing more i-loop if n < i*i */
        }
        p = p->next;
    }
    return flag;
}
void subthread(thdata *thd) {
    long i;
    primelist *p=NULL, *q=NULL;
    thd->head = NULL;
    for (i = thd->n0; i <= thd->n1; i++) {
        if (checkprime(i)) {
            q = calloc(1, sizeof(primelist));
            q->prime = i;
            if (!thd->head) {
                thd->head = q;
                p = q;
            } else {
                p->next = q;
                p = q;
            }
            thd->tail = q;
        }
    }
}
int main(int argc, char **argv) {
    primelist *p=NULL, *q=NULL;
    long n, n_max, i, nd;
    n_max = atol(argv[1]);
    head = calloc(1, sizeof(primelist));
    tail = head;
    tail->prime = 2;
    n = 2;
    while((n - 1) * (n - 1) <= n_max) {
        n++;
        if (checkprime(n)) {
            q= calloc(1, sizeof(primelist));
            tail->next = q;
            tail = q;
            tail->prime = n;
        }
    }
    nd = (n_max - n ) / (long) TMAX + 1L;
    for (i=0; i < TMAX; i++) {
        /* TMAX thread of checkprime loop */
        thd[i].n0 = n;
        thd[i].n1 = n + nd;
        if (thd[i].n1 >= n_max) {
            thd[i].n1 = n_max;
        }
        n = thd[i].n1;
        if (pthread_create(&thd[i].th,
                NULL,
                (void *) subthread,
                (void *) &(thd[i]) ) ) {
            printf ("E: error creating thread at %li\n", i);
        }
    }
    for (i=0; i < TMAX; i++) {
        /* TMAX thread of checkprime loop */
        if (pthread_join(thd[i].th, (void *) NULL) ) {
            printf ("E: error joining thread at %li\n", i);
        }
        tail->next = thd[i].head;
        tail = thd[i].tail;
    }
    p=head;
    while(p) {
        printf ("%ld\n", p->prime);
        p = p->next;
    }
    p=head;
    while(p) {
        q = p->next;
        free(p);
        p = q;
    }
    return EXIT_SUCCESS;
}
Behavior of the C program: prime8.c
$ /usr/bin/time -p ./prime8 "$(echo 2^20 | bc)">/dev/null
real 0.05
user 0.28
sys 0.00
$ /usr/bin/time -p ./prime8 "$(echo 2^24 | bc)">/dev/null
real 1.61
user 10.27
sys 0.08
Actually, this program is buggy for arguments of 1090 or smaller. We will debug this later.
Buggy behavior of the C program for 1090: prime8.c
$ ./prime8 "1090">/dev/null; echo $?
Segmentation fault
139
$ ./prime8 "1091">/dev/null; echo $?
0
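The approach of `prime8.c` depends on dividing a range of integers among a fixed number of threads. As a general illustration, here is a minimal sketch of one way to split an inclusive range into contiguous chunks that do not overlap. It is not the partitioning code of `prime8.c`; `struct range` and `split_range()` are hypothetical names for this example.

```c
#include <assert.h>

/* Inclusive range of integers; it is empty when lo > hi. */
struct range {
    long lo;
    long hi;
};

/* Split [lo, hi] into nchunks contiguous ranges without overlap.
   The first (total % nchunks) chunks get one extra element.
   out must have room for nchunks entries. */
void split_range(long lo, long hi, int nchunks, struct range *out) {
    long total, base, extra, start, len;
    int i;
    assert(nchunks > 0);
    total = (hi >= lo) ? hi - lo + 1 : 0;
    base = total / nchunks;
    extra = total % nchunks;
    start = lo;
    for (i = 0; i < nchunks; i++) {
        len = base + (i < extra ? 1 : 0);
        out[i].lo = start;
        out[i].hi = start + len - 1; /* lo > hi when len is 0 */
        start += len;
    }
}
```

For example, splitting 10..19 into 4 chunks gives 10..12, 13..15, 16..17, and 18..19. When there are fewer numbers than chunks, some chunks come back empty, so whatever consumes the chunks has to tolerate a chunk that produces no result.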