1. What is byte alignment, and why is it needed?
Memory in modern computers is addressed in bytes. In theory, a variable of any type could be accessed starting at any address. In practice, a variable of a given type can often only be accessed at particular addresses, so data of different types has to be laid out in memory according to certain rules rather than simply placed one after another. This is alignment.
The role of alignment and the reasons for it: Hardware platforms differ greatly in how they handle memory. Some platforms can only access certain types of data at certain addresses. For example, CPUs of some architectures raise an error when accessing a misaligned variable, so programs for such architectures must ensure alignment. Other platforms tolerate misaligned access, but more commonly, data that is not stored according to the alignment the platform expects costs access efficiency. For example, some platforms begin every read at an even address. If an int (assumed to be 32 bits) is stored starting at an even address, one read cycle fetches all 32 bits; if it is stored starting at an odd address, two read cycles are needed, and the high and low bytes of the two results must be pieced together to obtain the 32-bit value. Reading efficiency clearly drops considerably.
2. The impact of byte alignment on the program:
Let us first look at a few examples (32bit, x86 environment, gcc compiler):
Suppose the structure is defined as follows:
struct A
{
int a;
char b;
short c;
};
struct B
{
char b;
int a;
short c;
};
It is now known that the lengths of various data types on 32-bit machines are as follows:
char:1 (signed and unsigned are the same)
short:2 (signed and unsigned are the same)
int:4 (signed and unsigned are the same)
long:4 (signed and unsigned are the same)
float:4
double:8
So what are the sizes of the above two structures?
The result is:
The value of sizeof(struct A) is 8
The value of sizeof(struct B) is 12
Structure A contains a 4-byte int, a 1-byte char, and a 2-byte short, and the same is true of B. By simple addition, the sizes of A and B should both be 7 bytes.
The sizes are larger than 7 because the compiler aligns the data members in memory. The results above come from the compiler's default alignment settings. Can we change those defaults? Of course we can. For example:
#pragma pack (2)
struct C
{
char b;
int a;
short c;
};
#pragma pack ()
The value of sizeof(struct C) is 8.
Modify the alignment value to 1:
#pragma pack (1)
struct D
{
char b;
int a;
short c;
};
#pragma pack ()
The value of sizeof(struct D) is 7.
We will explain the function of #pragma pack() later.
3. What principles does the compiler align according to?
Let us first look at four important basic concepts:
1. The self-alignment value of a data type:
For char data the self-alignment value is 1; for short it is 2; for int, float, and double it is 4 (in bytes, in the 32-bit x86 gcc environment used here).
2. The self-alignment value of a structure or class: the largest self-alignment value among its members.
3. The specified alignment value: the value given in #pragma pack (value).
4. The effective alignment value of a data member, structure, or class: the smaller of its self-alignment value and the specified alignment value.
With these values, we can discuss how the members of a specific data structure, and the structure itself, are aligned. The effective alignment value N is the one that ultimately determines where data is stored, so it is the most important. An effective alignment of N means "aligned on N", that is, the data's starting address satisfies "starting address %N=0". The members of a data structure are laid out in the order they are declared, and the starting address of the first member is the starting address of the structure. Each member must be aligned, and the structure itself must be rounded up according to its own effective alignment value (that is, the total length occupied by the structure must be an integer multiple of the structure's effective alignment value; the examples that follow illustrate this). With these rules, the values in the examples above can be understood.
Example analysis:
Analysis of example B:
struct B
{
char b;
int a;
short c;
};
Assume that B is placed starting at address 0x0000. No alignment value is specified in this example; in the author's environment the default is 4. The first member b has a self-alignment value of 1, which is smaller than the specified (default) alignment value of 4, so its effective alignment value is 1, and its storage address 0x0000 satisfies 0x0000%1=0. The second member a has a self-alignment value of 4, so its effective alignment value is also 4, and it can only be stored in the four consecutive bytes from 0x0004 to 0x0007, the first address after b that satisfies 0x0004%4=0; the bytes from 0x0001 to 0x0003 are padding. The third member c has a self-alignment value of 2, so its effective alignment value is also 2, and it can be stored in the two bytes 0x0008 to 0x0009, which satisfies 0x0008%2=0. So everything from 0x0000 to 0x0009 holds the contents of B. Now consider the self-alignment value of structure B: it is the largest self-alignment value among its members (here that of a), so it is 4, and the effective alignment value of the structure is also 4. By the rounding requirement of the structure, 0x0000 to 0x0009 is 10 bytes, and (10+2)%4=0, so 0x000A and 0x000B are also occupied by structure B. B therefore occupies 12 bytes in total, from 0x0000 to 0x000B, and sizeof(struct B)=12.
In fact, a single structure on its own would already be aligned, because its starting address is 0. The 2 bytes are added at the end so that arrays of the structure can be accessed efficiently. Imagine that we define an array of structure B. The first element starts at address 0, which is no problem, but what about the second? By the definition of an array, all elements are adjacent. If we did not pad the size of the structure to an integer multiple of 4, the next element would start at 0x000A, which obviously does not satisfy the structure's alignment, so the structure must be padded to an integer multiple of its effective alignment value. In fact, the self-alignment values of the basic types (1 for char, 2 for short, 4 for int, float, and double) are also chosen with arrays in mind; because the lengths of these types are known, their self-alignment values are known as well.
Similarly, analyze the above example C:
#pragma pack (2)
struct C
{
char b;
int a;
short c;
};
#pragma pack ()
The first member b has a self-alignment value of 1 and the specified alignment value is 2, so its effective alignment value is 1. Assume that C starts at 0x0000; then b is stored at 0x0000, which satisfies 0x0000%1=0. The second member a has a self-alignment value of 4 and the specified alignment value is 2, so its effective alignment value is 2, and it is stored in the four consecutive bytes 0x0002, 0x0003, 0x0004, and 0x0005, which satisfies 0x0002%2=0. The third member c has a self-alignment value of 2, so its effective alignment value is 2, and it is stored in 0x0006 and 0x0007, which satisfies 0x0006%2=0. So the eight bytes from 0x0000 to 0x0007 hold the members of C. The self-alignment value of C is 4 and the specified alignment value is 2, so the effective alignment value of C is 2. Since 8%2=0, C occupies exactly the eight bytes from 0x0000 to 0x0007.
So sizeof(struct C)=8.
4. How to modify the default alignment value of the compiler?
1. In the VC IDE: go to [Project]|[Settings], open the C/C++ tab, choose the Code Generation category, and change Struct Member Alignment. The default is 8 bytes.
2. In code, the value can be changed with #pragma pack. Note: it is pragma, not progma.
5. How do we consider byte alignment in programming?
If we want to save space when programming, we only need to assume that the structure starts at address 0 and then arrange the members according to the principles above. The basic rule is to declare the members of the structure in order of type size, from smallest to largest, to reduce the padding between them as far as possible. Another approach trades space for time: we explicitly fill the space needed for alignment by inserting reserved members. For example:
struct A{
char a;
char reserved[3]; // trade space for time
int b;
};
The reserved member means nothing to our program; it only fills space to achieve byte alignment. Even without it, the compiler will usually insert the padding automatically. Adding it ourselves simply makes the padding explicit.
6. Possible hidden dangers caused by byte alignment:
Many alignment hazards in code are implicit, such as those introduced by forced type conversion. For example:
unsigned int i = 0x12345678;
unsigned char *p=NULL;
unsigned short *p1=NULL;
p=(unsigned char *)&i;
*p=0x00;
p1=(unsigned short *)(p+1);
*p1=0x0000;
The last two lines access an unsigned short through an odd address, which obviously does not meet the alignment requirement.
On x86, such operations only affect efficiency, but on MIPS or SPARC they may cause an error, because those architectures require aligned access.
7. How to find byte alignment problems:
If alignment or assignment problems appear, check the following first:
1. The compiler's big-endian/little-endian setting.
2. Whether the system itself supports unaligned access.
3. If it does, whether alignment has been set; if it does not, whether accesses need some special qualifier to mark them as special access operations.