Troubleshooting ideas and solutions for high CPU usage in Linux systems


Preface

As Linux operations engineers, we sometimes see the CPU usage of a Linux server reach 100% and stay there. Sustained high CPU usage affects the normal operation of business systems and can cause losses to the company.

Many operations engineers are at a loss when this happens. For a Java process that is overloading the CPU, either of the following two methods can usually locate the problem quickly:

Method 1

Step 1: Use

the top command, then press Shift+P to sort processes by CPU usage

Find the PID of the process that is using too much CPU

Step 2: Use

top -H -p [process id]

Find the id of the thread that consumes the most resources in the process

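Step 3: Use

printf "%x\n" [thread id]

Convert the thread ID to hexadecimal. The letters must be lowercase, because that is how jstack prints thread IDs.

echo 'obase=16;[thread id]' | bc also performs the conversion (bc is the calculator command in Linux), but it prints uppercase letters, which must be converted to lowercase before searching.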

Step 4: Use

jstack [process id] | grep -A 10 [thread id in hexadecimal]

View thread status information
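
Steps 3 and 4 can be sketched as two small shell helpers. This is a minimal sketch, assuming a HotSpot JVM whose jstack output labels each thread with nid=0x followed by lowercase hex (newer JDKs may print this field differently). The function names tid_to_nid and show_thread_stack are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash

# Convert a decimal thread ID (as shown by top -H) to the lowercase
# hexadecimal form that jstack uses in its "nid=0x..." field.
tid_to_nid() {
    printf '%x\n' "$1"
}

# Print the first lines of the jstack entry for one thread.
# Usage: show_thread_stack <pid> <decimal thread id>
# Assumes jstack is on PATH and is run as the user that owns the JVM.
show_thread_stack() {
    local pid=$1 nid
    nid=$(tid_to_nid "$2")
    # -w keeps nid=0xe2a from also matching nid=0xe2a1
    jstack "$pid" | grep -w -A 10 "nid=0x${nid}"
}
```

For example, show_thread_stack 2633 3626 would search the thread dump of process 2633 for the entry belonging to thread 3626.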

Method 2

Step 1: Use

the top command, then press Shift+P to sort processes by CPU usage

Find the PID of the process that is using too much CPU

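Step 2: Use

ps -mp [process id] -o THREAD,tid,time | sort -k2 -rn

Get the thread list and find the threads that use the most CPU. The first column of this listing is the user name, so the sort key is set to the second column (%CPU); a plain sort -rn would not order the rows by CPU usage.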

Step 3: Use

printf "%x\n" [thread id]

Convert the ID of the thread you want to inspect to lowercase hexadecimal. echo 'obase=16;[thread id]' | bc also works, but its uppercase output must be lowercased first.

Step 4: Use

jstack [process id] | grep -A 30 [thread id in hexadecimal]

Print the stack information of that thread
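
Picking the busiest thread in step 2 can also be scripted. The sketch below swaps in a different ps invocation from the one above: ps -L -o tid=,pcpu= prints one "TID %CPU" pair per thread with no header line, which is easier to parse than the THREAD layout. It assumes the procps version of ps; the function names list_threads and hottest_tid are hypothetical.

```shell
#!/usr/bin/env bash

# List the threads of a process as "TID %CPU" pairs.
# -L selects threads; the trailing "=" on each column suppresses its header.
list_threads() {
    ps -L -o tid=,pcpu= -p "$1"
}

# Read "TID %CPU" pairs on stdin and print the TID with the highest %CPU.
# LC_ALL=C makes sort treat "." as the decimal point regardless of locale.
hottest_tid() {
    LC_ALL=C sort -k2,2 -rn | awk 'NR == 1 { print $1 }'
}
```

A typical call would be list_threads [process id] | hottest_tid, whose result can then be converted to hexadecimal as in step 3.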

Case Study

Scenario Description

Troubleshooting high CPU usage of a Java process in a production environment

Solution process

1. The top command shows that the Java process with PID 2633 is using up to 300% CPU, and a fault has occurred.

2. With the process identified, how do we locate the specific thread or code? First, list its threads sorted by CPU usage, highest first:

[root@localhost ~]# ps -mp 2633 -o THREAD,tid,time | sort -k2 -rn

The output shows that thread (TID) 3626 consumes the most CPU and has already accumulated 12 minutes of CPU time.

3. Convert the TID of that thread to hexadecimal format:

[root@localhost ~]# printf "%x\n" 3626
e2a

4. Finally, use the jstack command to print the stack information of this thread within the process:

[root@localhost ~]# jstack 2633 | grep -A 30 "e2a"
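
A fixed context length such as 30 lines can cut a deep stack short or run on into the next thread. As an alternative, here is a minimal sketch that relies on the entries in a jstack thread dump being separated by blank lines, so that awk's paragraph mode returns exactly one entry. It assumes the nid=0x hex labelling of older HotSpot JVMs; thread_entry is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Print the complete jstack entry for one thread, reading the dump on stdin.
# Usage: jstack <pid> | thread_entry <decimal thread id>
thread_entry() {
    local nid
    nid=$(printf '%x' "$1")
    # RS= (empty) switches awk to paragraph mode: each blank-line-separated
    # block is one record. The trailing space in the pattern prevents
    # nid=0xe2a from matching nid=0xe2a1.
    awk -v RS= -v pat="nid=0x${nid} " 'index($0, pat) { print }'
}
```

In this case study the call would be jstack 2633 | thread_entry 3626, with no manual hexadecimal conversion needed.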

Discovering a fault is just as important as troubleshooting it. Most monitoring software on the market, such as Zabbix, Nagios, and Alibaba Cloud Monitoring (for cloud servers), can observe server load in real time. However, most of these tools require operations engineers to set rules or run checks themselves before a problem is discovered. How can we simply receive alerts instead?

I would like to recommend a practical operations tool: Professor Wang. Users whose services run on Alibaba Cloud only need to bind a read-only AccessKey for the account to be monitored, and alarm information for their cloud resources is then sent promptly to the relevant team members.

Moving from actively checking to passively receiving alerts reduces the workload of operations engineers and also lowers the chance that an alarm is missed or ignored.

Summary

That is the full content of this article. I hope it is a useful reference for your study or work.
