Troubleshooting ideas and solutions for high CPU usage in Linux systems

Preface

As Linux operations engineers, we sometimes find that CPU usage on a Linux server has reached 100% and stays there. Sustained high CPU usage disrupts the business systems running on the server and can cause losses for the company.


Many operations staff are at a loss when this happens. The following two methods can usually locate the cause of a CPU overload quickly. Both end with jstack, so they apply to Java processes:

Method 1

Step 1: Run

top

then press Shift+P to sort by CPU usage, and find the PID of the process that is using too much CPU

Step 2: Use

top -H -p [process id]

Find the id of the thread that consumes the most resources in the process

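Step 3: Use

printf "%x\n" [thread id] or echo 'obase=16;[thread id]' | bc

Convert the thread ID to hexadecimal. The letters must be lowercase to match the jstack output: printf "%x" prints lowercase digits, while bc prints uppercase digits, so convert the bc result to lowercase before searching.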

bc is the calculator command in Linux
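
The case difference between the two conversions can be handled with two small helpers. This is a minimal sketch, not part of the original procedure: the function names tid_to_hex and lower_hex are hypothetical, and the sample values are arbitrary.

```shell
# Convert a decimal thread ID to lowercase hexadecimal.
tid_to_hex() {
    printf '%x\n' "$1"
}

# Lowercase a hex string that came from a tool printing uppercase
# digits, as bc does with obase=16.
lower_hex() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}

tid_to_hex 2748    # prints abc
lower_hex ABC      # prints abc
```

With bc, the same result would come from piping its output through tr, for example echo 'obase=16;2748' | bc | tr 'A-F' 'a-f'.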

Step 4: Run

jstack [process id] | grep -A 10 [thread id in hexadecimal]

View thread status information
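
A fixed grep -A 10 can cut a long stack short or run on into the next thread. As an alternative sketch (a swapped-in technique, not the method above), awk's paragraph mode can print the whole entry for one thread. It assumes that each thread header in the dump contains "nid=0x" followed by the lowercase hex thread ID and a space, and that entries are separated by blank lines. The function name thread_stack is hypothetical.

```shell
# Print the complete thread-dump entry for one thread.
# $1 is the decimal thread ID; the thread dump is read from stdin.
# RS='' puts awk in paragraph mode, so each blank-line-separated
# entry is one record. The trailing space in the pattern keeps
# 0xe2a from also matching 0xe2ab.
thread_stack() {
    awk -v RS='' -v pat="nid=0x$(printf '%x' "$1") " 'index($0, pat) { print }'
}

# Usage, with pid and tid set to the values found in steps 1 and 2:
#   jstack "$pid" | thread_stack "$tid"
```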

Method 2

Step 1: Run

top

then press Shift+P to sort by CPU usage, and find the process that is using too much CPU

Step 2: Use

ps -mp [process id] -o THREAD,tid,time | sort -rn

Get thread information and find threads that use up a lot of CPU

Step 3: Use

echo 'obase=16;[thread id]' | bc or printf "%x\n" [thread id]

Convert the required thread ID to hexadecimal format

Step 4: Use

jstack [process id] | grep -A 30 [thread id in hexadecimal]

Print thread stack information
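
The sort -rn in step 2 compares whole lines, so the ordering depends on how ps lays out its columns. A sketch of a more explicit variant, assuming a procps-style ps: it uses ps -L with the tid and pcpu output fields instead of the THREAD format, and picks the maximum with awk. The function name busiest_tid is hypothetical. Note that the %CPU reported by ps is averaged over the thread's lifetime, so it can differ from the instantaneous value shown by top.

```shell
# Read "TID %CPU" pairs on stdin and print the TID with the highest %CPU.
# If several threads tie, the first one listed wins.
busiest_tid() {
    awk 'NR == 1 || $2 + 0 > max { max = $2 + 0; tid = $1 }
         END { if (NR > 0) print tid }'
}

# Usage: -L lists one line per thread; the trailing "=" after each
# field name suppresses the header row.
#   ps -L -p "$pid" -o tid=,pcpu= | busiest_tid
```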

Case Study

Scenario Description

Troubleshooting high CPU usage of a Java process in a production environment

Troubleshooting process

1. The top command shows that the Java process with PID 2633 is using up to 300% CPU, which is causing the fault.

2. With the process identified, how do we locate the specific thread or code? First, list the threads of the process, sorted so that those with the highest CPU usage come first:

[root@localhost ~]# ps -mp 2633 -o THREAD,tid,time | sort -rn

In the output, the thread with TID 3626 has the highest CPU consumption: it has accumulated 12 minutes of CPU time.

3. Convert the thread's TID to hexadecimal format

[root@localhost ~]# printf "%x\n" 3626
e2a

4. Finally, use the jstack command to print the stack information of this thread in the process:

[root@localhost ~]# jstack 2633 |grep "e2a" -A 30
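
Steps 2 to 4 of the case study can be chained into one function. The following is a rough sketch under stated assumptions rather than a tested production script: the function name hot_thread_report is hypothetical, the thread list comes from ps -L with explicit tid and pcpu fields (a substitute for the THREAD format used above), and the thread dump is assumed to print "nid=0x" followed by the lowercase hex thread ID and a space.

```shell
# Usage: hot_thread_report PID [CONTEXT_LINES]
# The body runs in a subshell ( ... ) so its variables do not leak
# into the calling shell.
hot_thread_report() (
    pid=$1
    context=${2:-30}

    # Step 2: thread ID with the highest %CPU in the process.
    tid=$(ps -L -p "$pid" -o tid=,pcpu= | sort -k2,2nr | head -n 1 | awk '{ print $1 }')
    if [ -z "$tid" ]; then
        echo "no threads found for PID $pid" >&2
        return 1
    fi

    # Step 3: decimal thread ID to lowercase hex.
    hex=$(printf '%x' "$tid")
    echo "busiest thread: TID $tid (nid=0x$hex)"

    # Step 4: print that thread's header and the lines that follow it.
    jstack "$pid" | grep -A "$context" "nid=0x$hex "
)

# Usage for the process in this case study:
#   hot_thread_report 2633
```

Run it as the same user that owns the Java process, since jstack normally needs to attach with matching credentials.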

Detecting a fault is just as important as troubleshooting it. Most monitoring software on the market, such as Zabbix, Nagios, and Alibaba Cloud Monitoring (for cloud servers), can observe server load in real time. However, most of it requires operations staff to actively set rules or run checks in order to discover problems. How can we receive alerts passively instead?

I would like to recommend a practical operations tool: Professor Wang. Users whose services are deployed on Alibaba Cloud only need to bind a read-only AccessKey for the resources to be monitored, and alarm information for those cloud resources is then sent promptly to the corresponding team members.

Moving from an active to a passive approach reduces the workload of operations engineers and also reduces the chance that they miss or ignore alarms.

Summary

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. Thank you for your support of 123WORDPRESS.COM.
