Fun to Program – GCC

Date: 2013/08/07 (initial publish), 2021/08/02 (last update)

Source: en/fun2-00007.md

This was originally written around 2013 and may need to be updated. (2021)

GCC

The gccintro package provides a good tutorial, “An Introduction to GCC” by Brian J. Gough, which covers the GCC basics for compiling C programs.

GCC version

Check gcc version and defaults:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.8.1-9' --with-bu...
Thread model: posix
gcc version 4.8.1 (Debian 4.8.1-9)
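The same version information is also available inside a C program through the compiler's predefined macros. The following is a minimal sketch that assumes a GCC-compatible compiler; the helper names compiler_version_number and print_compiler_version are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: encode the compiler version as one number,
 * e.g. 40801 for GCC 4.8.1, or 0 if the GCC version macros are absent. */
int compiler_version_number(void)
{
#if defined(__GNUC__) && defined(__GNUC_MINOR__) && defined(__GNUC_PATCHLEVEL__)
    return __GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__;
#else
    return 0;
#endif
}

/* Print the encoded number and, if available, the version string. */
void print_compiler_version(void)
{
    printf("compiler version number: %d\n", compiler_version_number());
#ifdef __VERSION__
    printf("compiler version string: %s\n", __VERSION__);
#endif
}
```

Calling print_compiler_version() from main should report the same version as the last line of gcc -v, provided the program was built with that gcc.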

Basic options

Basic GCC syntax from the top few lines of its manpage:

gcc [-c|-S|-E] [-std=standard]
    [-g] [-pg] [-Olevel]
    [-Wwarn...] [-pedantic]
    [-Idir...] [-Ldir...]
    [-Dmacro[=defn]...] [-Umacro]
    [-foption...] [-mmachine-option...]
    [-o outfile] [@file] infile...

The manpage for gcc is too long. Here are the parts I should remember.

Please note that gcc usually takes the argument directly after the option switch with no space (e.g., -Idir, -Dmacro) and uses a single leading - even for most long options (e.g., -pedantic).
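As an illustration of the -Dmacro[=defn] form in the synopsis, here is a minimal sketch of C code whose behavior is fixed at compile time by such a macro. The macro name GREETING_COUNT and the function names are assumptions made up for this example.

```c
#include <stdio.h>

/* GREETING_COUNT is a made-up macro name for this sketch.
 * Override it at compile time, e.g. with -DGREETING_COUNT=5
 * (no space between -D and the macro name). */
#ifndef GREETING_COUNT
#define GREETING_COUNT 2
#endif

/* Return the value that was fixed when the file was compiled. */
int greeting_count(void)
{
    return GREETING_COUNT;
}

/* Print the greeting GREETING_COUNT times; return how many were printed. */
int greet(void)
{
    int i;

    for (i = 0; i < GREETING_COUNT; i++)
        printf("Hello, world! (%d)\n", i + 1);
    return i;
}
```

Without any -D option the fallback value 2 is used; -UGREETING_COUNT on the command line would likewise leave the fallback in effect.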

The C default for this GCC version (4.8) is -std=gnu90, which is the GNU dialect of ISO C90 including some C99 features.

The C++ default for this GCC version (4.8) is -std=gnu++98, which is the GNU dialect of the 1998 ISO C++ standard plus amendments, including some C++11 features.
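Which C standard is actually in effect can be checked from within the code. The sketch below relies on the standard __STDC_VERSION__ macro, which is not defined in C90 modes such as -std=gnu90; the helper names are hypothetical.

```c
#include <stdio.h>

/* Return __STDC_VERSION__, or 0 when the macro is not defined
 * (as in C90 modes such as -std=c90 or -std=gnu90). */
long c_standard_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Hypothetical helper: map the macro value to a readable name. */
const char *c_standard_name(long version)
{
    if (version >= 201112L)
        return "C11 or later";
    if (version >= 199901L)
        return "C99";
    if (version >= 199409L)
        return "C90 with Amendment 1";
    return "C90";
}

/* Print the standard selected by the -std= option (or the default). */
void print_c_standard(void)
{
    long version = c_standard_version();

    printf("__STDC_VERSION__ = %ld (%s)\n", version, c_standard_name(version));
}
```

Compiling the same file once with -std=gnu90 and once with -std=gnu99 and calling print_c_standard() from main makes the difference visible.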

TIP: The meaning of inline in C is different between the default -std=gnu90 and the rest of the world (-std=gnu99|-std=c99|...). See An Inline Function is As Fast As a Macro.
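One way to sidestep the difference is to use static inline, which means the same thing in the GNU C90 dialect and in C99 and later: the function is private to the file, so no external definition is involved. A minimal sketch follows (function names are made up; strict -std=c90 does not accept the inline keyword at all).

```c
#include <stdio.h>

/* "static inline" behaves the same under -std=gnu90 and -std=gnu99:
 * the definition is local to this translation unit. */
static inline int square(int x)
{
    return x * x;
}

/* An ordinary external function that uses the inline helper. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

/* Print one worked example. */
void print_sum_of_squares(int a, int b)
{
    printf("%d^2 + %d^2 = %d\n", a, b, sum_of_squares(a, b));
}
```

A plain inline (without static or extern) is where the dialects disagree: as I understand the GCC documentation, -std=gnu90 also emits an ordinary external definition for it, while C99 does not.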

Assembler code

GCC with the -S option produces assembler code in the AT&T assembler style, as shown in the “Hello World!” example.

It is not so difficult to grok roughly what the GCC-generated assembler code does. (Writing code in assembler from scratch requires serious knowledge.)

Some basic register names, command mnemonic names, and command mnemonic suffix conventions need to be noted.

TIP: “mov op1, op2” moves data “op1 -> op2” in the AT&T assembler style (GCC default); while “mov op1, op2” moves data “op1 <- op2” in the Intel assembler style (NASM default). These are in the opposite order.

Examples of assembly code

AT&T                               Intel                       quasi-C
movq $0x12345678, %rax             mov rax, 12345678h          rax = 0x12345678
movq $0xff, %rax                   mov rax, 0ffh               rax = 0xff
movq -8(%rbp), %rax                mov rax, [rbp-8]            rax = *(rbp - 8)
movq -0x10(%rbp, %rdx, 8), %rax    mov rax, [rbp+rdx*8-10h]    rax = *(rbp + rdx * 8 - 0x10)
movq (%rcx), %rax                  mov rax, [rcx]              rax = *(rcx)
movq %rcx, %rax                    mov rax, rcx                rax = rcx
leaq 8(,%rcx,8), %rax              lea rax, [rcx*8+8]          rax = rcx * 8 + 8
leaq (%rbx,%rcx,4), %rax           lea rax, [rbx+rcx*4]        rax = rbx + rcx * 4
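The memory operand form disp(base, index, scale) used in these examples can be modeled in plain C. This is only a sketch of the address arithmetic, with made-up function names and a byte array standing in for memory; it is not how the CPU is actually programmed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the AT&T operand disp(base, index, scale):
 * address = base + index * scale + disp.
 * leaq stores this address itself in the destination register. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int64_t disp)
{
    /* x86-64 only allows these four scale factors. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint64_t)disp;
}

/* Model of movq with a memory source: load the 8 bytes found at the
 * given byte offset of a simulated memory block. */
uint64_t load_qword(const unsigned char *memory, uint64_t address)
{
    uint64_t value;

    memcpy(&value, memory + address, sizeof value);
    return value;
}
```

For instance, effective_address(0, rcx, 8, 8) corresponds to leaq 8(,%rcx,8), %rax, and passing that result to load_qword corresponds to the movq form of the same operand.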

Some basic 64-bit (= 8 bytes) integer ABI conventions under x86-64 (amd64) Linux need to be noted.

There are some memory alignment requirements for x86-64 under GCC/Linux.
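The alignment the compiler applies can be observed from C with offsetof: a member placed directly after a single char starts at the first offset that satisfies its alignment. The following is a minimal sketch with made-up struct and function names.

```c
#include <stddef.h>
#include <stdio.h>

/* Each struct puts one char in front of the member of interest, so the
 * member's offset equals the alignment the compiler requires for it. */
struct char_then_long   { char c; long l; };
struct char_then_double { char c; double d; };
struct char_then_ptr    { char c; void *p; };

size_t long_alignment(void)    { return offsetof(struct char_then_long, l); }
size_t double_alignment(void)  { return offsetof(struct char_then_double, d); }
size_t pointer_alignment(void) { return offsetof(struct char_then_ptr, p); }

/* Print size and alignment of a few basic types on the current platform. */
void print_alignments(void)
{
    printf("long:    size %lu, alignment %lu\n",
           (unsigned long)sizeof(long), (unsigned long)long_alignment());
    printf("double:  size %lu, alignment %lu\n",
           (unsigned long)sizeof(double), (unsigned long)double_alignment());
    printf("pointer: size %lu, alignment %lu\n",
           (unsigned long)sizeof(void *), (unsigned long)pointer_alignment());
    printf("struct { char; long; }: size %lu\n",
           (unsigned long)sizeof(struct char_then_long));
}
```

On x86-64 Linux all three types should report size 8 and alignment 8, so the struct with a char followed by a long occupies 16 bytes because of the padding.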

TIP: These register usages and function call conventions are architecture- and OS-specific. For example, i386 passes all function arguments on the stack by pushing them in right-to-left order.

TIP: There is a strange situation with fdivp and fdivrp: see Debian Bug #372528: as/i386: reverses fdivp and fdivrp

String in C function

The tricky problem of modifying a string in a C function becomes simple once you inspect the generated assembler code.

Here is C code, string-array.c, which manipulates a string.

string-array.c with “char[]”

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
    char foo[] = "abcdefgh";
    printf("Before foo[] = '%s'\n\n", foo);
    foo[3] = '@';
    printf("After  foo[] = '%s'\n\n", foo);
    return EXIT_SUCCESS;
}

This string-array.c compiles fine and runs without problem.

Compile and run of string-array.c

$ gcc -o string-array string-array.c
$ ./string-array
Before foo[] = 'abcdefgh'

After  foo[] = 'abc@efgh'

Here is similar-looking but buggy C code, string-pointer.c.

string-pointer.c with “char *”

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
    char* bar = "abcdefgh";
    printf("Before bar* = '%s'\n\n", bar);
    bar[3] = '@';
    printf("After  bar* = '%s'\n\n", bar);
    return EXIT_SUCCESS;
}

This string-pointer.c compiles fine but fails to run.

Compile and run of string-pointer.c

$ gcc -o string-pointer string-pointer.c
$ ./string-pointer
Segmentation fault

The reason becomes clear when we look at the assembler code of both programs, generated by compiling with the -S option.

Assembler code from string-array.c

$ gcc -S string-array.c
$ cat string-array.s
    .file    "string-array.c"
    .section    .rodata
.LC0:
    .string    "Before foo[] = '%s'\n\n"
.LC1:
    .string    "After  foo[] = '%s'\n\n"
    .text
    .globl    main
    .type    main, @function
main:
.LFB2:
    .cfi_startproc
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $32, %rsp
    movl    %edi, -20(%rbp)
    movq    %rsi, -32(%rbp)
    movabsq    $7523094288207667809, %rax
    movq    %rax, -16(%rbp)
    movb    $0, -8(%rbp)
    leaq    -16(%rbp), %rax
    movq    %rax, %rsi
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    movb    $64, -13(%rbp)
    leaq    -16(%rbp), %rax
    movq    %rax, %rsi
    movl    $.LC1, %edi
    movl    $0, %eax
    call    printf
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE2:
    .size    main, .-main
    .ident    "GCC: (Debian 4.8.1-9) 4.8.1"
    .section    .note.GNU-stack,"",@progbits

Here, upon execution of the main function, stack space for foo[] is reserved at run time and the value “abcdefgh” is stored into it by the somewhat obfuscated assignment operations below. The decimal constant 7523094288207667809 is 0x6867666564636261, i.e., the ASCII codes of “hgfedcba”, and the movb stores the terminating NUL byte:

        movabsq    $7523094288207667809, %rax
        movq    %rax, -16(%rbp)
        movb    $0, -8(%rbp)

Please note that x86-64 (= amd64) is a little-endian architecture (the least significant byte is stored at the lowest address), thus 'a' = 0x61 comes first in the stack.
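The relation between the string and the constant 7523094288207667809 in the string-array.c listing can be checked with a few lines of C. This is a minimal sketch with made-up function names; it builds the value by shifting, so the arithmetic does not depend on the byte order of the machine it runs on.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build the 64-bit value whose little-endian byte layout is the first
 * eight characters of s: s[0] lands in the least significant byte. */
uint64_t pack_little_endian(const char s[8])
{
    uint64_t value = 0;
    int i;

    for (i = 0; i < 8; i++)
        value |= (uint64_t)(unsigned char)s[i] << (8 * i);
    return value;
}

/* Return 1 if this machine stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Print the packed value in decimal and hexadecimal. */
void print_packed(const char s[8])
{
    uint64_t v = pack_little_endian(s);

    printf("%.8s -> %llu (0x%llx)\n", s,
           (unsigned long long)v, (unsigned long long)v);
}
```

For "abcdefgh" the packed value is 7523094288207667809 (0x6867666564636261); its low 32 bits are 0x64636261 ("abcd") and its high 32 bits are 0x68676665 ("efgh").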

Assembler code from string-pointer.c

$ gcc -S string-pointer.c
$ cat string-pointer.s
    .file    "string-pointer.c"
    .section    .rodata
.LC0:
    .string    "abcdefgh"
.LC1:
    .string    "Before bar* = '%s'\n\n"
.LC2:
    .string    "After  bar* = '%s'\n\n"
    .text
    .globl    main
    .type    main, @function
main:
.LFB2:
    .cfi_startproc
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $32, %rsp
    movl    %edi, -20(%rbp)
    movq    %rsi, -32(%rbp)
    movq    $.LC0, -8(%rbp)
    movq    -8(%rbp), %rax
    movq    %rax, %rsi
    movl    $.LC1, %edi
    movl    $0, %eax
    call    printf
    movq    -8(%rbp), %rax
    addq    $3, %rax
    movb    $64, (%rax)
    movq    -8(%rbp), %rax
    movq    %rax, %rsi
    movl    $.LC2, %edi
    movl    $0, %eax
    call    printf
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE2:
    .size    main, .-main
    .ident    "GCC: (Debian 4.8.1-9) 4.8.1"
    .section    .note.GNU-stack,"",@progbits

Here, the value “abcdefgh” is stored in the section marked as .rodata, i.e., read-only data, and bar only holds its address. So the ./string-pointer command tries to overwrite this read-only data, which causes the segmentation fault.

This run-time error can be turned into a compile-time error by adding “const” to the line defining the string.

string-const-pointer.c with “const char *”

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
    const char* bar = "abcdefgh";
    printf("Before bar* = '%s'\n\n", bar);
    bar[3] = '@';
    printf("After  bar* = '%s'\n\n", bar);
    return EXIT_SUCCESS;
}

This string-const-pointer.c fails to compile.

Compile error of string-const-pointer.c

$ gcc -o string-const-pointer string-const-pointer.c
string-const-pointer.c: In function ‘main’:
string-const-pointer.c:8:5: error: assignment of read-only location ‘*(bar + 3u)’...
     bar[3] = '@';
     ^
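When a modified version of a string literal is really needed, one option is to keep the literal const and work on a writable copy. Below is a minimal sketch of that idea; the helper copy_with_replacement is hypothetical and not part of the original examples.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of src in which the
 * character at position index is replaced by ch. An index at or beyond
 * the end of the string leaves the copy unchanged. Returns NULL if the
 * allocation fails. The caller must free() the result. */
char *copy_with_replacement(const char *src, size_t index, char ch)
{
    size_t len = strlen(src);
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len + 1);   /* includes the terminating NUL */
    if (index < len)
        copy[index] = ch;
    return copy;
}

/* Show the untouched literal next to the modified copy. */
int show_replacement(void)
{
    const char *bar = "abcdefgh";  /* read-only literal stays const */
    char *baz = copy_with_replacement(bar, 3, '@');

    if (baz == NULL)
        return -1;
    printf("Original bar = '%s'\n", bar);
    printf("Modified baz = '%s'\n", baz);
    free(baz);
    return 0;
}
```

The literal in .rodata is never written to; only the copy on the heap changes, much like foo[] on the stack in string-array.c.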

Buffer overflow protection

Defining the macro _FORTIFY_SOURCE with the -D option replaces high-risk functions of the GNU libc library (such as strcpy) with checked variants to protect against buffer overflows. This requires gcc to be run with -O1 or higher optimization. It works on all CPU architectures as long as the program is linked to the GNU libc library.
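Conceptually, such a checked variant compares the length of the source with the size of the destination object, as far as the compiler can determine it. The following is only a rough sketch of that idea and not glibc's actual code: the names checked_strcpy and checked_strcpy_impl are made up, the GCC builtin __builtin_object_size is assumed to be available, and the sketch returns an error code where the real mechanism aborts the program.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical checked copy. dest_size is the known size of the
 * destination, or (size_t)-1 when the size is unknown, in which case no
 * check is possible and the copy is as unsafe as a plain strcpy.
 * Returns 0 on success and -1 if the copy would overflow dest. */
int checked_strcpy_impl(char *dest, const char *src, size_t dest_size)
{
    if (dest_size != (size_t)-1 && strlen(src) >= dest_size) {
        fprintf(stderr, "checked_strcpy: %lu bytes do not fit into %lu\n",
                (unsigned long)(strlen(src) + 1), (unsigned long)dest_size);
        return -1;
    }
    strcpy(dest, src);
    return 0;
}

/* GCC-specific: let the compiler supply the destination size when it
 * can work it out, or (size_t)-1 when it cannot. */
#define checked_strcpy(dest, src) \
    checked_strcpy_impl((dest), (src), __builtin_object_size((dest), 1))
```

Whether the builtin can report a size depends on what the compiler knows about the destination at that point, which is presumably why the real feature needs optimization to be enabled.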

GCC’s Stack Smashing Protector (SSP), which protects against buffer overflows of unknown cause, was developed by IBM and originally called ProPolice. It only works on some CPU architectures. SSP can be enabled by the GCC flags -fstack-protector (optionally tuned with --param=ssp-buffer-size=N) or -fstack-protector-all.

Let’s try these compiler options on an example, bof.c, which has a buffer overflow risk.

bof.c with the buffer overflow risk:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define DESTLEN 8
int main(int argc, char** argv)
{
    char dest[DESTLEN];
    if (argc == 2) {
        printf(">>> Before the possible buffer over flow >>>\n");
        strcpy(dest, argv[1]);
        printf("<<< After the possible buffer over flow <<<\n");
    } else {
        fprintf(stderr,"Usage: %s ARG\n", argv[0]);
        fprintf(stderr,"  Length(ARG) < %i bytes\n", DESTLEN);
        exit(EXIT_FAILURE);
    }
    return EXIT_SUCCESS;
}

Buffer overflow protection: None

$ gcc -fno-stack-protector -o bof-unsafe bof.c
$ ./bof-unsafe "0123456789"
>>> Before the possible buffer over flow >>>
<<< After the possible buffer over flow <<<

Buffer overflow protection: -D_FORTIFY_SOURCE=2

$ gcc -D_FORTIFY_SOURCE=2 -O2 -o bof-fortify bof.c
$ ./bof-fortify "0123456789"
*** buffer overflow detected ***: ./bof-fortify terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x2aaaaadcbd17]
/lib/x86_64-linux-gnu/libc.so.6(+0xfbcd0)[0x2aaaaadcacd0]
./bof-fortify[0x400578]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2aaaaacf0995]
./bof-fortify[0x4005f5]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fe:01 3146660                            /path/to...
00600000-00601000 rw-p 00000000 fe:01 3146660                            /path/to...
01231000-01252000 rw-p 00000000 00:00 0                                  [heap]
2aaaaaaab000-2aaaaaacc000 r-xp 00000000 fe:01 655581                     /lib/x86...
2aaaaaacc000-2aaaaaad0000 rw-p 00000000 00:00 0
2aaaaaafa000-2aaaaaafc000 rw-p 00000000 00:00 0
2aaaaaccc000-2aaaaaccd000 r--p 00021000 fe:01 655581                     /lib/x86...
2aaaaaccd000-2aaaaaccf000 rw-p 00022000 fe:01 655581                     /lib/x86...
2aaaaaccf000-2aaaaae71000 r-xp 00000000 fe:01 656361                     /lib/x86...
2aaaaae71000-2aaaab071000 ---p 001a2000 fe:01 656361                     /lib/x86...
2aaaab071000-2aaaab075000 r--p 001a2000 fe:01 656361                     /lib/x86...
2aaaab075000-2aaaab077000 rw-p 001a6000 fe:01 656361                     /lib/x86...
2aaaab077000-2aaaab07b000 rw-p 00000000 00:00 0
2aaaab07b000-2aaaab090000 r-xp 00000000 fe:01 655396                     /lib/x86...
2aaaab090000-2aaaab290000 ---p 00015000 fe:01 655396                     /lib/x86...
2aaaab290000-2aaaab291000 rw-p 00015000 fe:01 655396                     /lib/x86...
7fff5d517000-7fff5d538000 rw-p 00000000 00:00 0                          [stack]
7fff5d5d7000-7fff5d5d9000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscal...
>>> Before the possible buffer over flow >>>
Aborted

Buffer overflow protection: -fstack-protector --param=ssp-buffer-size=4

$ gcc -fstack-protector --param=ssp-buffer-size=4 -o bof-safe bof.c
$ ./bof-safe "0123456789"
*** stack smashing detected ***: ./bof-safe terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x2aaaaadcbd17]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x0)[0x2aaaaadcbce0]
./bof-safe[0x400732]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2aaaaacf0995]
./bof-safe[0x4005b9]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fe:01 3146663                            /path/to...
00600000-00601000 rw-p 00000000 fe:01 3146663                            /path/to...
00d82000-00da3000 rw-p 00000000 00:00 0                                  [heap]
2aaaaaaab000-2aaaaaacc000 r-xp 00000000 fe:01 655581                     /lib/x86...
2aaaaaacc000-2aaaaaad0000 rw-p 00000000 00:00 0
2aaaaaafa000-2aaaaaafc000 rw-p 00000000 00:00 0
2aaaaaccc000-2aaaaaccd000 r--p 00021000 fe:01 655581                     /lib/x86...
2aaaaaccd000-2aaaaaccf000 rw-p 00022000 fe:01 655581                     /lib/x86...
2aaaaaccf000-2aaaaae71000 r-xp 00000000 fe:01 656361                     /lib/x86...
2aaaaae71000-2aaaab071000 ---p 001a2000 fe:01 656361                     /lib/x86...
2aaaab071000-2aaaab075000 r--p 001a2000 fe:01 656361                     /lib/x86...
2aaaab075000-2aaaab077000 rw-p 001a6000 fe:01 656361                     /lib/x86...
2aaaab077000-2aaaab07b000 rw-p 00000000 00:00 0
2aaaab07b000-2aaaab090000 r-xp 00000000 fe:01 655396                     /lib/x86...
2aaaab090000-2aaaab290000 ---p 00015000 fe:01 655396                     /lib/x86...
2aaaab290000-2aaaab291000 rw-p 00015000 fe:01 655396                     /lib/x86...
7fff36900000-7fff36921000 rw-p 00000000 00:00 0                          [stack]
7fff369fe000-7fff36a00000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscal...
>>> Before the possible buffer over flow >>>
<<< After the possible buffer over flow <<<
Aborted
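The “stack smashing detected” message above comes from a guard value (a canary) that is placed behind the local buffer when the function starts and verified when it returns, which is why the “After” line is still printed before the abort. The simulation below is a sketch of that principle only: the frame layout, the fixed canary bytes and all names are made up, whereas a real implementation picks its guard value at program start.

```c
#include <stdio.h>
#include <string.h>

#define BUF_LEN 8
#define CANARY_LEN 8

/* Made-up guard value for this simulation only. */
static const unsigned char CANARY[CANARY_LEN] =
    { 0x00, 0xde, 0xad, 0xbe, 0xef, 0x0a, 0xff, 0x0d };

/* Simulated stack frame: BUF_LEN bytes of buffer followed by the canary. */
struct frame {
    unsigned char bytes[BUF_LEN + CANARY_LEN];
};

/* Function entry: clear the buffer and place the canary behind it. */
void frame_enter(struct frame *f)
{
    memset(f->bytes, 0, BUF_LEN);
    memcpy(f->bytes + BUF_LEN, CANARY, CANARY_LEN);
}

/* An unchecked copy in the spirit of strcpy. It is capped at the size of
 * the whole simulated frame so that the simulation stays well defined. */
void frame_unchecked_copy(struct frame *f, const char *src)
{
    size_t n = strlen(src) + 1;

    if (n > sizeof f->bytes)
        n = sizeof f->bytes;
    memcpy(f->bytes, src, n);
}

/* Function exit: return 1 if the canary is intact, 0 if overwritten. */
int frame_leave(const struct frame *f)
{
    if (memcmp(f->bytes + BUF_LEN, CANARY, CANARY_LEN) != 0) {
        fprintf(stderr, "simulated stack smashing detected\n");
        return 0;
    }
    return 1;
}
```

Copying "0123456" leaves the canary intact, while "0123456789" spills three bytes into it and is caught at frame_leave. The overflow itself has already happened by then; the check only prevents the function from returning normally.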

Buffer overflow protection: -fstack-protector-all

$ gcc -fstack-protector-all -o bof-safest bof.c
$ ./bof-safest "0123456789"
*** stack smashing detected ***: ./bof-safest terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x2aaaaadcbd17]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x0)[0x2aaaaadcbce0]
./bof-safest[0x400732]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2aaaaacf0995]
./bof-safest[0x4005b9]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fe:01 3146665                            /path/to...
00600000-00601000 rw-p 00000000 fe:01 3146665                            /path/to...
01c1b000-01c3c000 rw-p 00000000 00:00 0                                  [heap]
2aaaaaaab000-2aaaaaacc000 r-xp 00000000 fe:01 655581                     /lib/x86...
2aaaaaacc000-2aaaaaad0000 rw-p 00000000 00:00 0
2aaaaaafa000-2aaaaaafc000 rw-p 00000000 00:00 0
2aaaaaccc000-2aaaaaccd000 r--p 00021000 fe:01 655581                     /lib/x86...
2aaaaaccd000-2aaaaaccf000 rw-p 00022000 fe:01 655581                     /lib/x86...
2aaaaaccf000-2aaaaae71000 r-xp 00000000 fe:01 656361                     /lib/x86...
2aaaaae71000-2aaaab071000 ---p 001a2000 fe:01 656361                     /lib/x86...
2aaaab071000-2aaaab075000 r--p 001a2000 fe:01 656361                     /lib/x86...
2aaaab075000-2aaaab077000 rw-p 001a6000 fe:01 656361                     /lib/x86...
2aaaab077000-2aaaab07b000 rw-p 00000000 00:00 0
2aaaab07b000-2aaaab090000 r-xp 00000000 fe:01 655396                     /lib/x86...
2aaaab090000-2aaaab290000 ---p 00015000 fe:01 655396                     /lib/x86...
2aaaab290000-2aaaab291000 rw-p 00015000 fe:01 655396                     /lib/x86...
7ffffde13000-7ffffde34000 rw-p 00000000 00:00 0                          [stack]
7ffffdffe000-7ffffe000000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscal...
>>> Before the possible buffer over flow >>>
<<< After the possible buffer over flow <<<
Aborted
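All of the protections above only detect the overflow at run time; the underlying bug in bof.c is the unbounded strcpy. A possible source-level fix is a bounded copy that reports truncation, sketched here with snprintf. The function names are made up, and snprintf is assumed to be available (it is a C99 function that glibc also provides in the default GNU C90 mode).

```c
#include <stdio.h>
#include <string.h>

#define DESTLEN 8

/* Copy src into dest without ever writing more than dest_len bytes.
 * Returns 0 if the whole string fitted, -1 if it had to be truncated
 * (dest then holds the first dest_len - 1 characters plus a NUL). */
int bounded_copy(char *dest, size_t dest_len, const char *src)
{
    int written = snprintf(dest, dest_len, "%s", src);

    return (written >= 0 && (size_t)written < dest_len) ? 0 : -1;
}

/* What the body of main in bof.c could call instead of strcpy. */
int handle_argument(const char *arg)
{
    char dest[DESTLEN];

    if (bounded_copy(dest, sizeof dest, arg) != 0) {
        fprintf(stderr, "ARG too long: Length(ARG) must be < %i bytes\n",
                DESTLEN);
        return -1;
    }
    printf("copied '%s'\n", dest);
    return 0;
}
```

With this check, an argument such as "0123456789" is rejected with an error message instead of overwriting the stack.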