Preface

As Linux operations engineers, we sometimes find the CPU load on a Linux server at 100% and staying there. Sustained high CPU usage disrupts the business systems running on the server and can cause losses for the company, yet many operations staff are at a loss when it happens. For CPU overload problems, either of the following two methods can usually locate the cause quickly.

Method 1

Step 1: Use top
Find the PID of the process that is using too much CPU.

Step 2: Use a per-thread view of that process, for example top -H -p <pid>
Find the ID of the thread that consumes the most resources in the process.

Step 3: Use, for example, printf "%x\n" <tid>
Convert the thread ID to hexadecimal (letters should be lowercase).

Step 4: Run jstack <pid> and search its output for the hexadecimal thread ID
View the thread's status information.

Method 2

Step 1: Use top
Find the process that is using too much CPU.

Step 2: Use ps -mp <pid> -o THREAD,tid,time | sort -rn
Get the thread information and find the threads that use a lot of CPU.

Step 3: Use printf "%x\n" <tid>
Convert the required thread ID to hexadecimal format.

Step 4: Use jstack <pid> | grep "<hex-tid>" -A 30
Print the thread's stack information.

Case Study

Scenario Description

Troubleshooting high CPU usage by a Java process in a production environment.

Solution process

1. The top command shows that the Java process with PID 2633 is using up to 300% CPU, which is causing the fault.

2. With the process identified, how do we locate the specific thread or code? First, list the process's threads, sorted so that those with the highest CPU usage come first:

[root@localhost ~]# ps -mp 2633 -o THREAD,tid,time | sort -rn

In the output, the thread with TID 3626 consumes the most CPU and has already accumulated 12 minutes of CPU time.

3. Convert that TID to hexadecimal:

[root@localhost ~]# printf "%x\n" 3626
e2a

4. Finally, use the jstack command to print the stack information of this thread within the process:

[root@localhost ~]# jstack 2633 |grep "e2a" -A 30

Detecting a fault matters just as much as troubleshooting it. Most monitoring software on the market, such as Zabbix, Nagios, and Alibaba Cloud Monitoring (for cloud servers), can observe server load in real time. However, most of these tools require operations staff to actively set rules or run checks in order to discover problems. How can we receive alerts passively instead?

I would like to recommend a practical operations tool: Professor Wang. Users whose workloads run on Alibaba Cloud only need to bind a read-only AccessKey for the account to be monitored, and alarm information for their cloud resources is then sent promptly to the relevant team members. Moving from active checking to passive alerting reduces the workload of operations engineers and also lowers the chance that an alarm is missed or ignored.

Summary

The above is the full content of this article.
I hope this article is a useful reference for your study or work.
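As a recap, the four steps of the case study can be wrapped in a few small shell helpers. This is only a sketch under stated assumptions: the function names (tid_to_hex, pick_busiest_tid, thread_stack) are hypothetical and do not come from the article, the thread list is expected as plain "TID %CPU" pairs rather than the ps -mp layout used above, and jstack is assumed to print each thread's native ID in the form nid=0x<hex>, which may not hold for every JDK version.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names appear in the original article.

# Convert a decimal thread ID to lowercase hexadecimal.
tid_to_hex() {
    printf '%x\n' "$1"
}

# Read "TID %CPU" pairs on stdin and print the TID with the highest %CPU.
pick_busiest_tid() {
    awk 'NR == 1 || $2 + 0 > max { max = $2 + 0; tid = $1 }
         END { if (NR) print tid }'
}

# Print the jstack entry for decimal thread ID $2 in Java process $1.
# Assumption: this JDK's jstack labels threads as nid=0x<hex>.
thread_stack() {
    hex=$(tid_to_hex "$2")
    # -w keeps a short ID from matching inside a longer one;
    # -A 30 prints the 30 lines after the match, as in the article.
    jstack "$1" | grep -w -A 30 "nid=0x${hex}"
}
```

With procps ps, one way to produce the input pairs is ps -L -p 2633 -o tid=,pcpu= | pick_busiest_tid, and the resulting TID can then be passed to thread_stack 2633 <tid>. Note that ps reports CPU usage averaged over a thread's lifetime, so a thread that only recently became busy may be easier to spot in top's per-thread view.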