Troubleshooting record: high CPU usage of a Tomcat process


This article records how a Tomcat process with excessive CPU usage was investigated; the cause turned out to be too many TCP connections.

Problem Description

On Linux, a Tomcat web service was using a very high amount of CPU: top showed over 200%. Requests were not being answered, and repeated restarts did not change the behavior.

Troubleshooting

1. Get process information

The JVM process can be found quickly with the jps command that ships with the JDK; it prints the pid of each running JVM.

jps
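If several JVMs run on the host, the Tomcat pid can be picked out of the jps output by main class. The sketch below assumes Tomcat was started through its usual launcher class, org.apache.catalina.startup.Bootstrap, and that the input comes from jps -l, which prints fully qualified main class names. The name tomcat_pid is a hypothetical helper, not part of the JDK.

```shell
#!/bin/bash
# Hypothetical helper: read `jps -l` output on stdin and print the pid(s)
# whose main class is Tomcat's standard launcher (an assumption about how
# Tomcat was started on this host).
tomcat_pid() {
    awk '$2 == "org.apache.catalina.startup.Bootstrap" { print $1 }'
}

# Typical use (requires a JDK on the PATH):
#   jps -l | tomcat_pid
```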

2. View jstack information

jstack pid

The dump shows a large number of threads blocked in log4j, waiting for a lock:

org.apache.log4j.Category.callAppenders(org.apache.log4j.spi.LoggingEvent) @bci=12, line=201 (Compiled frame)
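To get a feel for how many threads are blocked, the states in a dump can be tallied. This is only a sketch: it assumes the dump is in the standard jstack text format, where each thread carries a java.lang.Thread.State: line. Forced dumps (jstack -F) may use a different layout and would need a different pattern. The name count_thread_states is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: summarize thread states from a jstack dump on stdin.
# Assumes each thread has a line such as
#   java.lang.Thread.State: BLOCKED (on object monitor)
count_thread_states() {
    awk '/java\.lang\.Thread\.State:/ { count[$2]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Typical use:
#   jstack "$pid" | count_thread_states
```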

After searching for related information, I found that log4j 1.x is known to have a deadlock problem at this point.

Having found this, I changed the log4j configuration to log only at the error level and restarted Tomcat. The blocked threads disappeared from the stack dump, but the CPU usage of the process was still high.

3. Further investigation

To analyze CPU usage per thread, we can use a script contributed by an expert in the community, which reports the CPU usage of each thread in a Java process.

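#!/bin/bash
# Usage: <script> [top_n] [pid]
# Prints the jstack entries of the top_n busiest threads of a Java process,
# each annotated with its %CPU.

typeset top=${1:-10}
# Default to the first java process owned by the current user
typeset pid=${2:-$(pgrep -u "$USER" java | head -n 1)}
typeset tmp_file=/tmp/java_${pid}_$$.trace

"$JAVA_HOME/bin/jstack" "$pid" > "$tmp_file"
# List all threads by ascending %CPU, keep only those of the target process,
# then take the busiest $top (filtering before tail, so that threads of
# other processes do not use up the slots)
ps H -eo user,pid,ppid,tid,time,%cpu --sort=%cpu --no-headers \
    | awk -v "pid=$pid" '$2==pid{print $4"\t"$6}' \
    | tail -n "$top" \
    | while read -r line
do
    # jstack reports the native thread id (nid) in hexadecimal
    typeset nid=$(echo "$line" | awk '{printf("0x%x",$1)}')
    typeset cpu=$(echo "$line" | awk '{print $2}')
    # Print the thread's stack (up to the next blank line) and tag its first line with the CPU share
    awk -v "cpu=$cpu" '/nid='"$nid"'/,/^$/{print $0"\t"(isF++?"":"cpu="cpu"%");}' "$tmp_file"
done

rm -f "$tmp_file"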

Scope of the script

The %CPU value reported by ps is computed from the kernel's accounting data under /proc, and it is not a real-time figure: ps reports the CPU time a thread has used divided by the time it has been running, not the load at the instant jstack took its snapshot. This is why the statistics you see may be inconsistent with the information from jstack. The information is still very helpful when the problem is caused by sustained load from a few threads: those same threads keep consuming CPU, so despite the time difference they are the ones that show up.

Besides this script, a simpler method is to take the process ID and use the following command to view the resource usage of each thread in the process:

top -H -p pid

Take the value in the PID column of this output (in thread mode it is the thread ID), convert it to hexadecimal, and look up the thread with the matching nid in the jstack output.
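The conversion and lookup can be sketched as two small functions. This assumes the standard jstack header format, where the native id appears as nid=0x... followed by a space, and where thread entries are separated by blank lines. The names tid_to_nid and find_thread are hypothetical helpers.

```shell
#!/bin/bash
# Hypothetical helper: convert a decimal thread id (as shown by top -H)
# to the hexadecimal form that jstack prints after "nid=".
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Hypothetical helper: read a jstack dump on stdin and print the entry of
# the thread with the given decimal id, up to the next blank line.
find_thread() {
    local nid
    nid=$(tid_to_nid "$1")
    # The trailing space in the pattern keeps 0x1234 from matching 0x12345
    awk -v pat="nid=$nid " '
        /^$/            { show = 0 }
        index($0, pat)  { show = 1 }
        show            { print }'
}

# Typical use:
#   jstack "$pid" | find_thread 4660
```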

Using the methods above, the cumulative CPU usage of the threads belonging to the Tomcat process came to about 80%, far less than the 200%+ reported by top.
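A cumulative figure like this can be obtained by summing the per-thread %CPU column. The sketch below is one way to do it, not necessarily how the number above was produced; sum_cpu is a hypothetical helper that expects one numeric value per line.

```shell
#!/bin/bash
# Hypothetical helper: sum the %CPU values read from stdin, one per line.
sum_cpu() {
    awk '{ s += $1 } END { printf "%.1f\n", s }'
}

# Typical use with procps ps (-L lists threads, "pcpu=" suppresses the header):
#   ps -L -p "$pid" -o pcpu= | sum_cpu
```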

This means that no single thread occupies the CPU for a long time; the load more likely comes from many short bursts of CPU-intensive work. I then suspected insufficient JVM memory causing frequent GC.

jstat -gc pid

JVM memory usage was not abnormal, but the GC count had risen noticeably.
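To judge how fast the GC count is rising, jstat can be asked for several samples and the counters compared. The sketch below assumes the output has a single header line containing YGC and FGC columns, followed by sample rows; it looks the columns up by name because their positions differ between JDK versions. The name gc_delta is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: read `jstat -gc` output (header plus sample rows) on
# stdin and print how many young (YGC) and full (FGC) collections occurred
# between the first and the last sample.
gc_delta() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
         NR == 2 { y0 = $(col["YGC"]); f0 = $(col["FGC"]) }
                 { y1 = $(col["YGC"]); f1 = $(col["FGC"]) }
         END     { print "YGC +" (y1 - y0), "FGC +" (f1 - f0) }'
}

# Typical use (10 samples, 1000 ms apart):
#   jstat -gc "$pid" 1000 10 | gc_delta
```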

With memory checked, and since this is a network program, the next step was to check the network connections.

4. Locating the problem

Querying the TCP connections on Tomcat's port showed a large number of connections in the ESTABLISHED state plus some in other states, more than 400 in total.

netstat -anp | grep port
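Counting the connections per state makes the picture easier to read than the raw listing. The sketch below assumes the Linux netstat column layout (local address in column 4, state in column 6); conn_states is a hypothetical helper, and 8080 in the usage line is only an example port.

```shell
#!/bin/bash
# Hypothetical helper: read netstat output on stdin and count TCP
# connections on the given local port, grouped by state.
conn_states() {
    local port=$1
    awk -v port="$port" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }' | sort
}

# Typical use:
#   netstat -ant | conn_states 8080
```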

Tracing the source of these connections showed that the application calling this Tomcat service had a large number of background threads polling it frequently. This filled up the service's Tomcat connections, so it could not accept any further requests.
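One way to trace the sources is to group the established connections by remote host. As before, this is a sketch under the assumption of the Linux netstat column layout (remote address in column 5); conn_sources is a hypothetical helper and 8080 an example port.

```shell
#!/bin/bash
# Hypothetical helper: read netstat output on stdin and list the remote
# hosts holding ESTABLISHED connections to the given local port, with the
# busiest host first.
conn_sources() {
    local port=$1
    awk -v port="$port" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") && $6 == "ESTABLISHED" {
            host = $5
            sub(/:[0-9]+$/, "", host)   # strip the remote port
            n[host]++
        }
        END { for (h in n) print n[h], h }' | sort -rn
}

# Typical use:
#   netstat -ant | conn_sources 8080
```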

Netstat status description (netstat itself prints the state names with underscores, for example TIME_WAIT):

  • LISTEN: Waiting for a connection request from any remote TCP and port
  • SYN-SENT: A connection request has been sent; waiting for a matching connection request (if a large number of connections are in this state, check whether the host has been infected)
  • SYN-RECEIVED: A connection request has been received and one sent; waiting for the other side to acknowledge it (a large number of connections in this state may indicate a SYN flood)
  • ESTABLISHED: An open connection over which data can be exchanged
  • FIN-WAIT-1: Waiting for a connection termination request from the remote TCP, or for an acknowledgment of the termination request already sent
  • FIN-WAIT-2: Waiting for a connection termination request from the remote TCP
  • CLOSE-WAIT: Waiting for a connection termination request from the local user
  • CLOSING: Waiting for the remote TCP to acknowledge the connection termination request
  • LAST-ACK: Waiting for an acknowledgment of the connection termination request previously sent to the remote TCP (if many connections linger in this state, check whether the host is under attack)
  • TIME-WAIT: Waiting long enough to be sure the remote TCP received the acknowledgment of its connection termination request
  • CLOSED: No connection state at all

5. Root Cause Analysis

The direct cause was client-side polling: when a request failed, the client kept polling, and new background threads on the client kept joining in, until the server's Tomcat connections were exhausted.
