Padding initialization

1. Padding value when a variable is fully initialized

C99 and C11 are both clear about padding values when a variable is fully initialized: the values are unspecified (see C99 Annex J.1, page 487, and C11 Annex J.1, page 554).

C99 Section 6.2.6.1, page 38, (C11 Section 6.2.6.1, page 44):

"When a value is stored in an object of structure or union type, including in a member

object, the bytes of the object representation that correspond to any padding bytes take

unspecified values."

C99 Section 7.21.4.1, page 327 (C11 Section 7.24.4.1, page 365), in a footnote to memcmp:

"The contents of ‘‘holes’’ used as padding for purposes of alignment within structure objects are

indeterminate"
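Because padding bytes are not part of any member's value, comparing two structures with memcmp() can report a difference even when every member matches. A minimal sketch of a member-wise comparison follows, assuming the three-member test_x layout examined in this section; the helper name test_x_equal is hypothetical and not part of the original notes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same member layout as the test_x structure examined in this section. */
typedef struct {
    char a;
    uint32_t b;
    char c;
} test_x;

/* Hypothetical helper: compare member by member so that padding bytes,
 * whose values are unspecified, never influence the result. */
static bool test_x_equal(const test_x *lhs, const test_x *rhs)
{
    return lhs->a == rhs->a && lhs->b == rhs->b && lhs->c == rhs->c;
}
```

A call such as test_x_equal(&p, &q) depends only on the members, whatever the padding bytes of p and q happen to hold.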

Breakpoint 1, main () at 1.c:23

23 test_x x = {'\0', 0, '\0'} =======> fully initialized;

(gdb) ptype x

type = struct {

char a; ====> garbage value for the padding between 'a' and 'b'

uint32_t b;

char c;

}

(gdb) p &x

$1 = (test_x *) 0x7fffffffdcd0

(gdb) x/32bx 0x7fffffffdcd0

0x7fffffffdcd0: 0xc0 0xdd 0xff 0xff 0xff 0x7f 0x00 0x00

0x7fffffffdcd8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

0x7fffffffdce0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

0x7fffffffdce8: 0x15 0xab 0xa3 0xf7 0xff 0x7f 0x00 0x00

(gdb) n

25 test_y y = {0};

(gdb) x/32bx 0x7fffffffdcd0

0x7fffffffdcd0: 0x00 0xdd 0xff 0xff 0x00 0x00 0x00 0x00

0x7fffffffdcd8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

0x7fffffffdce0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

0x7fffffffdce8: 0x15 0xab 0xa3 0xf7 0xff 0x7f 0x00 0x00

(gdb) p sizeof(x)

$2 = 12
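To relate a memory dump like the one above to the structure layout, the position and size of each padding gap can be computed with offsetof. This is only a sketch: the function names are hypothetical, and the numbers it returns depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Same member layout as the test_x structure in the transcript. */
typedef struct {
    char a;
    uint32_t b;
    char c;
} test_x;

/* Internal padding between the end of 'a' and the start of 'b'. */
static size_t padding_after_a(void)
{
    return offsetof(test_x, b) - (offsetof(test_x, a) + sizeof(char));
}

/* Internal padding between the end of 'b' and the start of 'c'. */
static size_t padding_after_b(void)
{
    return offsetof(test_x, c) - (offsetof(test_x, b) + sizeof(uint32_t));
}

/* Trailing padding between the end of 'c' and the end of the structure. */
static size_t padding_after_c(void)
{
    return sizeof(test_x) - (offsetof(test_x, c) + sizeof(char));
}
```

With the layout in the transcript (sizeof(x) is 12 and uint32_t is 4-byte aligned), this gives 3 bytes after a, none after b, and 3 bytes after c.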

The transcript above shows this for auto variables. What about global and static variables with a full initializer?

Since the padding value is unspecified, the implementation is free to choose whatever is most efficient (in terms of compile time, run time, load time, image size, etc.). Global and static variables with a full or partial initializer are typically compiled into the .data section. It is only natural for a compiler to emit zeros for the padding there as well, even though C99 and C11 do not demand it.

/* Global variables with a full or partial initializer go to the .data section, where padding is set to 0 as the most efficient way to compile the code */

test_x abc = {'a', 12, 'c'};

test_x cef = {'b',12, 'c'};

test_x aaa = {'c', 12,'c'};

Disassembly of section .data:

0000000000601028 <__data_start>:

601028: 00 00 add %al,(%rax)

60102a: 00 00 add %al,(%rax)

000000000060102c <abc>:

60102c: 61 (bad)

60102d: 00 00 add %al,(%rax)

60102f: 00 0c 00 add %cl,(%rax,%rax,1)

601032: 00 00 add %al,(%rax)

601034: 63 00 movslq (%rax),%eax

601036: 00 00 add %al,(%rax)

0000000000601038 <cef>:

601038: 62 (bad)

601039: 00 00 add %al,(%rax)

60103b: 00 0c 00 add %cl,(%rax,%rax,1)

60103e: 00 00 add %al,(%rax)

601040: 63 00 movslq (%rax),%eax

601042: 00 00 add %al,(%rax)

0000000000601044 <aaa>:

601044: 63 00 movslq (%rax),%eax

601046: 00 00 add %al,(%rax)

601048: 0c 00 or $0x0,%al

60104a: 00 00 add %al,(%rax)

60104c: 63 00 movslq (%rax),%eax

60104e: 00 00 add %al,(%rax)

(gdb) p &abc

$1 = (test_x *) 0x60102c <abc>

(gdb) x/32bx 0x60102c

0x60102c <abc>: 0x61 0x00 0x00 0x00 0x0c 0x00 0x00 0x00

0x601034 <abc+8>: 0x63 0x00 0x00 0x00 0x62 0x00 0x00 0x00

0x60103c <cef+4>: 0x0c 0x00 0x00 0x00 0x63 0x00 0x00 0x00

0x601044 <aaa>: 0x63 0x00 0x00 0x00 0x0c 0x00 0x00 0x00

(gdb) ptype abc

type = struct {

char a;

uint32_t b;

char c;

}

(gdb)

2. Padding values when a variable is not initialized

Both C99 and C11 are clear on the behavior when a variable is declared and defined without any initializer.

If the variable is an auto variable, every byte inside the variable is indeterminate.

If the variable is global or static, every byte inside the variable (including internal and trailing padding) is zero.

C11, Section 6.7.9.10 Page 140:

"If an object that has automatic storage duration is not initialized explicitly, its value is

indeterminate. If an object that has static or thread storage duration is not initialized

explicitly, then:

— if it has pointer type, it is initialized to a null pointer;

— if it has arithmetic type, it is initialized to (positive or unsigned) zero;

— if it is an aggregate, every member is initialized (recursively) according to these rules,

and any padding is initialized to zero bits;

— if it is a union, the first named member is initialized (recursively) according to these

rules, and any padding is initialized to zero bits;"

The two clauses "and any padding is initialized to zero bits" are not present in the corresponding C99 Section 6.7.8.10, Page 126.

It is interesting to note this difference between C99 and C11. In practice, the difference is moot for the reasons below:

1. A global or static variable without an initializer is stored in the .bss section, which takes no space in the executable image (only the recorded section size and the symbol table grow), because its value is known to be zero. This only makes sense if all padding is zero as well.

2. In addition, when .bss is set up in memory at load time, it is only natural to zero the whole block with machine-optimized instructions, instead of one element at a time (which would leave the padding with random values).
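The C11 rule for objects with static storage duration can be checked by scanning the object representation byte by byte. The sketch below assumes a C11 implementation and a three-member structure like the one used earlier; the names uninitialized_static and all_bytes_zero are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    char a;
    uint32_t b;
    char c;
} test_x;

/* Static storage duration and no initializer: under C11 the members are
 * zero and any padding is initialized to zero bits. */
static test_x uninitialized_static;

/* Returns true when every byte of the object representation is zero. */
static bool all_bytes_zero(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Calling all_bytes_zero(&uninitialized_static, sizeof uninitialized_static) should return true under C11. Applying the same check to an auto variable without an initializer would not be meaningful, since its bytes are indeterminate.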

3. Padding values when a variable is partially initialized

C99 and C11 say the following about padding values under partial initialization:

C99 Section 6.7.8.21, Page 127: ( same as C11 6.7.9.21 Page 141 )

"If there are fewer initializers in a brace-enclosed list than there are elements or members

of an aggregate, or fewer characters in a string literal used to initialize an array of known

size than there are elements in the array, the remainder of the aggregate shall be

initialized implicitly the same as objects that have static storage duration."

The text above makes it quite clear what happens if a structure is partially initialized:

1. The padding between initialized elements is still "unspecified". If a structure has two elements and only the first one is initialized, the second element is initialized to zero, along with the trailing padding. However, the padding between the first element and the second element can still be "unspecified".

2. The rest of the elements, along with their padding, are treated like a static variable without an initializer. As per C11, their values and their internal and trailing padding must be set to zero.

It is only natural for a compiler in such a case to compile the elements without an initializer into zero, including internal and trailing padding:

Breakpoint 1, main () at 1.c:17

17 test_x x = {'a', 1, 'b'};

(gdb) ptype x

type = struct {

char a;

uint64_t b;

char c;

uint64_t d;

char e;

}

(gdb) p sizeof(x)

$2 = 40

(gdb) x/12x &x

0x7fffffffdca0: 0x00000000 0x00000000 0x00000000 0x00000000

0x7fffffffdcb0: 0x00400540 0x00000000 0x00400400 0x00000000

0x7fffffffdcc0: 0xffffddb0 0x00007fff 0x00000000 0x00000000

(gdb) set *((long long *)&x + 0) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 1) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 2) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 3) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 4) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 5) = 0xffffffffffffffff

(gdb) x/12x &x

0x7fffffffdca0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

0x7fffffffdcb0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

0x7fffffffdcc0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

(gdb) n

19 return 0;

(gdb) x/12x &x

0x7fffffffdca0: 0x00000061 0x00000000 0x00000001 0x00000000

0x7fffffffdcb0: 0x00000062 0x00000000 0x00000000 0x00000000

0x7fffffffdcc0: 0x00000000 0x00000000 0xffffffff 0xffffffff
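The rule for the remaining members can also be checked from inside the program rather than from gdb. The sketch below reuses the five-member test_x layout from the transcript; the function name is hypothetical. It inspects members only, because the standard makes no promise about every padding byte in this case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same member layout as the test_x structure in the transcript. */
typedef struct {
    char a;
    uint64_t b;
    char c;
    uint64_t d;
    char e;
} test_x;

/* Hypothetical check: initialize only the first three members, then report
 * whether the members without an initializer ended up as zero. */
static bool remaining_members_are_zero(void)
{
    test_x x = {'a', 1, 'b'};   /* d and e have no initializer */
    return x.d == 0 && x.e == 0;
}
```

Both C99 and C11 require this function to return true, since d and e are initialized implicitly as if they had static storage duration.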

4. Padding values due to forced trailing alignment and the {0} initializer

Since the elements without an initializer in a partially initialized structure should be set to zero, along with their internal and trailing padding, we would naturally initialize an auto variable with "{0}", assuming the compiler takes care of the internal padding between the first element and the second element (which has no initializer).

This all looks good:

Breakpoint 1, main () at 1.c:17

17 test_x x = {0};

(gdb) ptype x

type = struct {

char a;

uint64_t b;

char c;

uint64_t d;

char e;

}

(gdb) p sizeof(x)

$1 = 40

(gdb) set *((long long *)&x + 0) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 1) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 2) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 3) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 4) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 5) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 6) = 0xffffffffffffffff

(gdb) x/2x &x

0x7fffffffdca0: 0xffffffff 0xffffffff

(gdb) x/12x &x

0x7fffffffdca0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

0x7fffffffdcb0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

0x7fffffffdcc0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

(gdb) n

19 return 0;

(gdb) x/12x &x

0x7fffffffdca0: 0x00000000 0x00000000 0x00000000 0x00000000

0x7fffffffdcb0: 0x00000000 0x00000000 0x00000000 0x00000000

0x7fffffffdcc0: 0x00000000 0x00000000 0xffffffff 0xffffffff

(gdb)

If padding follows an element that has no initializer, it is set to zero as well, as per C11, even when that padding is forced-alignment trailing padding:

#include <stdlib.h>

#include <stdio.h>

#include <inttypes.h>

#define WORD_ALIGN __attribute__ ((aligned (sizeof(void*))))

typedef struct {

char a;

uint32_t b;

uint64_t c;

char d;

char e WORD_ALIGN;

} test_x;

int main(void)

{

test_x x = {0};

return 0;

}

gcc -std=c11 1.c -g

Breakpoint 1, main () at 1.c:17

17 test_x x = {0};

(gdb) ptype x

type = struct {

char a;

uint32_t b;

uint64_t c;

char d;

char e;

}

(gdb) p sizeof(x)

$1 = 32

(gdb) set *((long long *)&x + 0) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 1) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 2) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 3) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 4) = 0xffffffffffffffff

(gdb) set *((long long *)&x + 5) = 0xffffffffffffffff

(gdb) x/12x &x

0x7fffffffdcb0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

0x7fffffffdcc0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

0x7fffffffdcd0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

(gdb) n

19 return 0;

(gdb) x/12x &x

0x7fffffffdcb0: 0x00000000 0x00000000 0x00000000 0x00000000

0x7fffffffdcc0: 0x00000000 0x00000000 0x00000000 0x00000000

0x7fffffffdcd0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

(gdb)

This all works until the structure has only one element plus forced-alignment trailing padding. In that case {0} is a full initializer and, as per the C11 rule, the trailing padding takes "unspecified values".

#include <stdlib.h>

#include <stdio.h>

#include <inttypes.h>

#define WORD_ALIGN __attribute__ ((aligned (sizeof(void*))))

typedef struct {

char a WORD_ALIGN;

} test_x;

int main(void)

{

test_x x = {0};

return 0;

}

gcc -std=c11 1.c -pedantic -g

Breakpoint 1, main () at 1.c:13

13 test_x x = {0};

(gdb) ptype x

type = struct {

char a;

}

(gdb) p sizeof(x)

$1 = 8

(gdb) set *((long long *)&x + 0) = 0xfffffffffffffff

(gdb) set *((long long *)&x + 1) = 0xfffffffffffffff

(gdb) x/8x &x

0x7fffffffdcc0: 0xffffffff 0x0fffffff 0xffffffff 0x0fffffff

0x7fffffffdcd0: 0x00000000 0x00000000 0xf7a3ab15 0x00007fff

(gdb) n

15 return 0;

(gdb) x/8x &x

0x7fffffffdcc0: 0xffffff00 0x0fffffff 0xffffffff 0x0fffffff

0x7fffffffdcd0: 0x00000000 0x00000000 0xf7a3ab15 0x00007fff

(gdb)

The behavior above conforms to C11: this is an auto variable with only one element, and that element has a full initializer ("0"), hence the padding value is "unspecified".

Note that padding is optional for a compiler (C99 and C11 allow a compiler to use no padding at all), and forced-alignment padding via __attribute__((aligned)) is purely a GCC extension that C99 and C11 do not address. However, if padding exists for whatever reason, it must follow the rules set forth by C11 and C99 above.

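What about forced-alignment internal padding? The aligned attribute can only raise the alignment of the member it is attached to (which may insert padding in front of that member); it cannot request a given amount of padding after a member. For that, you have to manually add dummy or unnamed elements between the members.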
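The amount of forced-alignment trailing padding in the single-element case can be computed rather than read off a dump. The sketch below relies on the same GCC/Clang attribute as the examples above, so it is not standard C; the type name one_member and the function name are hypothetical.

```c
#include <stddef.h>

/* GCC/Clang extension, as in the examples above; not standard C. */
#define WORD_ALIGN __attribute__((aligned(sizeof(void *))))

/* A structure whose only member is forced to pointer alignment. */
typedef struct {
    char a WORD_ALIGN;
} one_member;

/* Trailing padding forced by the alignment attribute on the only member. */
static size_t one_member_trailing_padding(void)
{
    return sizeof(one_member) - (offsetof(one_member, a) + sizeof(char));
}
```

On the x86-64 build in the transcript, where sizeof(x) is 8, this comes out as 7 bytes. Those are the bytes that "{0}" left untouched.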

5. Padding values due to flexible array member

Although a flexible array member counts as an "element" of a structure, in terms of padding values it is treated as if it did not exist. This also conforms to C99 (refer to the notes on data alignment).

Therefore, the padding value due to a flexible array member is "unspecified", and "{0}" won't work.

#include <stdlib.h>

#include <stdio.h>

#include <inttypes.h>

typedef struct {

char a;

uint64_t b[];

} test_x;

int main(void)

{

test_x x = {0};

return 0;

}

gcc -std=c11 1.c -g

Breakpoint 1, main () at 1.c:12

12 test_x x = {0};

(gdb) ptype x

type = struct {

char a;

uint64_t b[];

}

(gdb) p sizeof(x)

$1 = 8

(gdb) p &x.b

$2 = (uint64_t (*)[]) 0x7fffffffdcc8

(gdb) p &x.a

$3 = 0x7fffffffdcc0 "\260\335\377\377\377\177"

(gdb) set *((long long *)&x + 0) = 0xfffffffffffffff

(gdb) set *((long long *)&x + 1) = 0xfffffffffffffff

(gdb) x/8x &x

0x7fffffffdcc0: 0xffffffff 0x0fffffff 0xffffffff 0x0fffffff

0x7fffffffdcd0: 0x00000000 0x00000000 0xf7a3ab15 0x00007fff

(gdb) n

14 return 0;

(gdb) x/8x &x

0x7fffffffdcc0: 0xffffff00 0x0fffffff 0xffffffff 0x0fffffff

0x7fffffffdcd0: 0x00000000 0x00000000 0xf7a3ab15 0x00007fff

(gdb)
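Structures with a flexible array member are usually allocated dynamically so that the array has room for elements. In that situation, one way to get known padding values is to let calloc() zero the whole allocation. This is an alternative sketch, not something the transcript above demonstrates; the function name test_x_alloc_zeroed is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char a;
    uint64_t b[];   /* flexible array member */
} test_x;

/* Hypothetical allocator: room for 'count' elements of b, all bytes zeroed.
 * Returns NULL on allocation failure; the caller releases with free(). */
static test_x *test_x_alloc_zeroed(size_t count)
{
    /* calloc() zero-fills the whole allocation, including the padding
     * between 'a' and 'b'. Overflow of the size computation is not
     * checked in this sketch. */
    return calloc(1, sizeof(test_x) + count * sizeof(uint64_t));
}
```

The caller owns the returned block and releases it with free(). Unlike the auto variable in the transcript, every byte of the block, padding included, starts out as zero.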

6. Guaranteed way to zero out everything

memset() is too expensive. Is there any way to zero out forced-alignment trailing padding when there is only one element in the structure, or when even "{0}" is not guaranteed to work because of a flexible array member?

It depends on the compiler. GCC has an undocumented behavior whereby, if the initializer is "{}", it zeroes out the whole sizeof() of the structure, including forced-alignment trailing padding.
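Where relying on compiler-specific behavior is not acceptable, the portable fallback remains memset() over the whole object, despite its cost. The sketch below pairs it with a byte scan so the result can be verified; both helper names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clear every byte of an object, padding included. */
static void zero_object(void *obj, size_t size)
{
    memset(obj, 0, size);
}

/* Count the bytes of the object representation that are not zero. */
static size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    size_t count = 0;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            count++;
        }
    }
    return count;
}
```

A typical use is zero_object(&x, sizeof x) right after the declaration and before any members are assigned. Compilers often expand small fixed-size memset() calls inline, so the actual cost is worth measuring before ruling this approach out.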

Summary:

1. The padding value is "unspecified" when an auto variable is fully initialized or not initialized at all.

2. Internal padding can still be "unspecified" when an auto variable is partially initialized.

3. The padding value is all zero when a global/static variable is fully initialized (this is undocumented GCC behavior, and it is reasonable to believe that other compilers do the same).

4. The padding value is all zero when a global/static variable is not initialized, as per C11 but not C99.

5. "{0}" may zero out everything, including all padding. However, it is not clear how the compiler handles the padding between the first element and the second element. That padding can still be "unspecified", since C11 does not say anything about this particular case.

6. "{0}" won't work for structures that have only one element, or one element plus a flexible array member. In such cases it is a full initializer and the trailing padding is still "unspecified".

7. The GCC extension offers no way to request forced-alignment padding after an internal element; dummy elements have to be added manually.

8. Forced-alignment trailing padding is part of a GCC extension and must be zeroed out by GCC as per C11 if we use "{0}", except when there is only one element in the structure or when the last element is a flexible array member (in both cases the trailing padding is "unspecified").

9. "{}" as an initializer can zero out the whole sizeof of the structure, including all kinds of padding. This is undocumented GCC behavior.

10. An empty brace pair (curly brackets) "{}" is illegal in both C99 and C11.