Summary of several techniques you need to know for Linux compile optimization


01. Compile options and kernel compilation

The Linux kernel is an operating system kernel written in C and assembly language. It largely follows the POSIX standard and is released under the GNU General Public License. Strictly speaking, Linux is only a kernel: the system software that provides hardware abstraction, disk and file system control, multitasking, and similar functions.

To begin with, the Linux kernel cannot be built with O0; it is built with either O2 or Os. The top-level Linux Makefile makes this choice: when CONFIG_CC_OPTIMIZE_FOR_SIZE is selected the kernel is compiled with Os, otherwise with O2.

In fact, O2 and Os are each just a set of individual optimization options, which gcc can list:

gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts

gcc -c -Q -Os --help=optimizers > /tmp/Os-opts

The former favors speed, while the latter favors smaller code size. To compare which switches each one enables:

meld /tmp/O2-opts /tmp/Os-opts

The difference is pitifully small:

Both O2 and Os enable -finline-small-functions and -finline-functions-called-once. In the GCC version used here, however, -finline-functions is disabled at O2 and enabled at Os, while -foptimize-strlen is on at O2 and off at Os. The meaning of each option is documented in "man gcc" (when in doubt, consult the man page), for example by searching for inline-functions there.

Going from O0 through O1 and O2 to O3, each level enables progressively more optimization options.

The kernel cannot be compiled with O0 because it was never designed to be: its code assumes that the compiler will optimize it. A simple example illustrates this.

02. A simple example

The following code:
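
As a minimal sketch of the pattern under discussion: f() is declared, called only behind a condition that can never be true, and never defined in optimized builds. The name demo(), the counter f_calls, and the fallback definition guarded by __OPTIMIZE__ (a macro GCC and Clang predefine when optimizing) are assumptions added so the sketch links at every optimization level; they are not part of cc.c.

```c
/* Hypothetical stand-in for cc.c; demo() and f_calls are illustrative names. */
void f(void);               /* declared here, defined only in the fallback below */

static int f_calls;         /* counts how often the fallback f() runs */

#ifndef __OPTIMIZE__
/* Fallback so an unoptimized build still links. Deleting this block
 * reproduces the "undefined reference to `f'" error at -O0. */
void f(void)
{
    f_calls++;
}
#endif

int demo(void)
{
    int a = 1;

    if (a > 2)              /* never true, so the call is dead code */
        f();
    return f_calls;
}
```

When optimizing, the compiler is expected to fold the condition and drop the call, so the missing definition goes unnoticed; at O0 the call is kept and the fallback is what lets the link succeed.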

Compiling with O0 fails at link time, because f() is never defined:

$ gcc -O0 cc.c
cc.c:1:13: warning: 'f' used but never defined [enabled by default]
 void f(void);
    ^
/tmp/ccTwwtHG.o: In function `main':
cc.c:(.text+0x19): undefined reference to `f'
collect2: error: ld returned 1 exit status

But when compiled with O2, there is no problem:

$ gcc -O2 cc.c

The reason is that at O2 the compiler sees that a == 1, so the condition a > 2 can never hold. It removes the dead call, and no reference to f() is left for the linker to resolve.

After changing the code slightly:
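
One kind of change that has this effect, sketched here purely as an assumption, is to make the condition depend on a value the compiler cannot know at compile time. The volatile qualifier and the names demo() and f_calls below are illustrative choices. f() is given a definition so that the sketch links and runs; leaving it undefined, as in the article, is what makes the link fail.

```c
static int f_calls;         /* counts how often f() runs */

/* Defined only so this sketch links; the article's version leaves f()
 * undefined, which is what makes the linker complain. */
void f(void)
{
    f_calls++;
}

/* volatile forces a real load, so the compiler cannot assume a == 1. */
volatile int a = 1;

int demo(void)
{
    if (a > 2)              /* has to be evaluated at run time */
        f();
    return f_calls;
}
```

Because the branch can no longer be proven dead, the call to f() has to stay in the object file at every optimization level.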

Now even O2 fails to link:

$ gcc -O2 cc.c
/tmp/ccXiyBHn.o: In function `main':
cc.c:(.text.startup+0x7): undefined reference to `f'
collect2: error: ld returned 1 exit status

This example shows why the same code can build at O2 but not at O0. A lot of kernel code is written on the assumption that the compiler will optimize it in exactly this way.

03. When we do not want inlining

Because of compiler optimization, some functions (small ones, or ones called from only one place in the whole project) get inlined even though they are not explicitly marked inline. This makes debugging harder, because no symbol for the function can be found.

At this point, we can explicitly state that we don't want to inline certain functions:
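
A minimal sketch, assuming GCC or Clang: the kernel provides a noinline macro for this attribute, and it is defined locally here so the example builds outside the kernel tree. The functions add_one(), twice(), and compute() are hypothetical names chosen for illustration.

```c
/* Local definition of the macro, standing in for the kernel's own. */
#define noinline __attribute__((__noinline__))

/* Small and called from one place: a likely candidate for automatic
 * inlining at O2/Os if it were not marked noinline. */
static noinline int add_one(int x)
{
    return x + 1;
}

static noinline int twice(int x)
{
    return 2 * x;
}

int compute(int x)
{
    return twice(add_one(x));
}
```

With the annotation in place, both helpers should remain separate functions with their own symbols, which a tool such as nm can confirm on the object file.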

Otherwise, such functions may be inlined automatically even though the code never says inline, because O2 and Os enable the relevant inlining options. To refuse inlining, mark the function noinline.

04. When we do not want optimization

When O1, O2, O3, or Os is enabled globally and we want a single function left unoptimized, we can annotate that function with __attribute__((optimize("O0"))). For example, we modify the earlier code, which compiled fine at O2, as follows:
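
A sketch of what such an annotation can look like, under stated assumptions: the attribute is GCC-specific, so the hypothetical no_opt macro expands to nothing on other compilers, and f() is defined here only so the sketch links and runs. The names demo() and f_calls are illustrative as well.

```c
#if defined(__GNUC__) && !defined(__clang__)
#define no_opt __attribute__((optimize("O0")))   /* GCC only */
#else
#define no_opt                                   /* assumption: no-op elsewhere */
#endif

static int f_calls;         /* counts how often f() runs */

/* Defined so the sketch links. Without a definition, the call kept alive
 * in demo() is what the linker fails to resolve. */
void f(void)
{
    f_calls++;
}

/* Compiled without optimization even when the file is built with -O2,
 * so the dead branch and its call to f() stay in the object code. */
no_opt int demo(void)
{
    int a = 1;

    if (a > 2)
        f();
    return f_calls;
}
```

The behavior at run time is unchanged, since the branch is still never taken; only the generated code differs.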

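Recompile with O2:

$ gcc -O2 cc.c
/tmp/cc8M338p.o: In function `main':
cc.c:(.text+0x19): undefined reference to `f'
collect2: error: ld returned 1 exit status

The link fails just as it did at O0: main() is no longer optimized, so the call to f() survives and the linker has to resolve it.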

05. Conclusion

Here are a few practical guidelines:

  1. Do not try to compile the kernel with O0. It does not match real engineering practice and is not well supported by the mainstream Linux community; the kernel depends on the optimizations that O2/Os perform;
  2. Make sure your code is correct at O2; code should hold up under compiler optimization. If something works at O0 but not at O2, look for the cause in your own code first and analyze the generated assembly;
  3. To exempt a specific part from optimization while optimizing globally, make surgical adjustments with noinline, __attribute__((optimize("O0"))), and the like.
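
To illustrate the second guideline, here is a small sketch of code that holds up under optimization. The overflow scenario and the function names are assumptions chosen for illustration. A check written as (x + 1 < x) relies on signed overflow, which is undefined behavior in C, so an optimizer may fold it to false even though it appears to work at O0.

```c
#include <limits.h>
#include <stdbool.h>

/* Fragile form, shown only as a comment because it is undefined behavior:
 *     bool wraps = (x + 1 < x);
 */

/* Robust form: compare against the limit before doing the arithmetic. */
bool increment_would_overflow(int x)
{
    return x == INT_MAX;
}

/* Saturating increment built on the check above. */
int saturating_increment(int x)
{
    return increment_would_overflow(x) ? INT_MAX : x + 1;
}
```

Written this way, the result does not depend on the optimization level, because no step relies on behavior the compiler is free to assume away.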
