Troubleshooting high CPU usage in a Tomcat process

This article records how a Tomcat process with excessive CPU usage, ultimately caused by too many TCP connections, was diagnosed.

Problem Description

On Linux, a Tomcat web service was using a very high amount of CPU: top showed more than 200%. Requests were not being answered, and repeated restarts did not change the behavior.

Troubleshooting

1. Get process information

The JVM processes on the machine can be listed quickly with the jps command that ships with the JDK; the first column of the output is the pid.

jps -l
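
When several JVMs run on the same machine, the Tomcat pid can be picked out of the jps listing automatically. The sketch below is only an illustration: tomcat_pids is a hypothetical helper name, and it assumes Tomcat was started through its standard launcher class, org.apache.catalina.startup.Bootstrap, which is how jps -l labels such a process.

```bash
#!/bin/bash
# Hypothetical helper (not part of the JDK): reads `jps -l` output on stdin
# and prints the pid of every JVM whose main class is Tomcat's launcher.
# Assumption: Tomcat was started via org.apache.catalina.startup.Bootstrap.
tomcat_pids() {
    awk '$2 == "org.apache.catalina.startup.Bootstrap" { print $1 }'
}
```

Typical use would be: jps -l | tomcat_pids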

2. View jstack information

jstack pid

The dump shows a large number of threads blocked in log4j, waiting for a lock:

org.apache.log4j.Category.callAppenders(org.apache.log4j.spi.LoggingEvent) @bci=12, line=201 (Compiled frame)
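
To judge how widespread the blocking is, the dump can be summarized instead of read by eye. This is a rough sketch with hypothetical helper names (count_frame, thread_states). thread_states assumes the default jstack output format, where each thread has a java.lang.Thread.State: line; dumps taken with jstack -F use a different layout and would need a different pattern.

```bash
#!/bin/bash
# Hypothetical helpers; both read a thread dump on stdin.

# Count the stack-frame lines that mention the given method name.
count_frame() {
    grep -cF -- "$1"
}

# Print "<count> <state>" for each thread state, most frequent first.
# Assumption: default jstack format with "java.lang.Thread.State: X" lines.
thread_states() {
    awk '/java.lang.Thread.State:/ { n[$2]++ }
         END { for (s in n) print n[s], s }' | sort -rn
}
```

For example: jstack pid | count_frame 'org.apache.log4j.Category.callAppenders'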

After searching for related information, I found that log4j 1.x has a known deadlock problem.

With that in mind, I changed the log4j configuration so that only error-level logs are written, and restarted Tomcat. The blocked threads disappeared from the stack dump, but the CPU usage of the process was still high.

3. Further investigation

To analyze the CPU usage of each thread, we can use a script contributed by a community expert that reports the CPU usage of each thread in a Java process alongside its stack.

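#!/bin/bash
# Print the stacks of the busiest threads of a Java process, tagged with %CPU.
# Usage: <script> [top_n] [pid]

typeset top=${1:-10}
# Defaults to the current user's java process (assumes there is only one).
typeset pid=${2:-$(pgrep -u "$USER" java)}
typeset tmp_file=/tmp/java_${pid}_$$.trace

"$JAVA_HOME/bin/jstack" "$pid" > "$tmp_file"
# One line per thread (H), sorted by %CPU ascending. Filter on the pid first,
# then keep the last $top lines, so the top N are taken from this process only.
ps H -eo user,pid,ppid,tid,time,%cpu --sort=%cpu --no-headers \
    | awk -v "pid=$pid" '$2==pid{print $4"\t"$6}' \
    | tail -n "$top" \
    | while read -r tid cpu
do
    # jstack prints the native thread id in hexadecimal, as nid=0x...
    typeset nid=$(printf '0x%x' "$tid")
    # Print from the matching thread header down to the next blank line.
    awk -v "cpu=$cpu" '/nid='"$nid"'( |$)/,/^$/{print $0"\t"(isF++?"":"cpu="cpu"%");}' "$tmp_file"
done

rm -f "$tmp_file"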

Script application scope

The %CPU figure in ps is computed from the counters under /proc and is not a real-time value: it is CPU time used divided by the time the thread has been running. This is why the figures may not line up with the snapshot taken by jstack. The information is still very helpful for problems caused by sustained load from a few threads, because those same threads keep consuming CPU; even with the time difference, they are the ones responsible.

Besides this script, a simpler method is to find the process ID and use the following command to view the resource usage of each thread in the process:

top -H -p pid

The PID column in this view is actually the thread id. Convert it to hexadecimal and look for the matching nid value in the jstack output to find that thread's stack.
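
The conversion and lookup can be scripted. The following is a minimal sketch; tid_to_nid and thread_stack are hypothetical helper names, and it assumes each thread header in the jstack output carries nid=0x... followed by a space or end of line, with a blank line after each stack.

```bash
#!/bin/bash
# Hypothetical helpers for matching a thread id from top -H to its stack.

# Convert a decimal thread id to the hexadecimal form jstack prints.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# $1 = decimal thread id, stdin = jstack output.
# Prints that thread's block, from its header line to the next blank line.
thread_stack() {
    local nid
    nid=$(tid_to_nid "$1")
    # "( |$)" keeps 0x1b03 from also matching a longer id such as 0x1b034.
    awk -v pat="nid=$nid( |\$)" '$0 ~ pat, /^$/'
}
```

For example, for thread id 6915: jstack pid | thread_stack 6915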

Using the methods above, the CPU usage of all threads in the Tomcat process added up to only about 80%, far less than the 200%+ reported by top.

This suggests that no single thread was occupying the CPU for a long time; instead there were many short bursts of CPU-intensive work. I then suspected insufficient JVM memory and frequent GC.

jstat -gc pid

The JVM memory usage was not abnormal, but the GC count was increasing noticeably.

With memory ruled out, and since this is a network program, the next step was to check the network connections.
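
One way to put a number on "the GC count is increasing" is to let jstat take several samples and compare the first with the last. The sketch below uses a hypothetical helper, gc_delta, and assumes the jstat -gc header contains YGC and FGC columns; it finds them by name because the column positions differ between JDK versions.

```bash
#!/bin/bash
# Hypothetical helper: reads `jstat -gc <pid> <interval> <count>` output on
# stdin and prints how much the young and full GC counters grew between the
# first and the last sample.
gc_delta() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # header: name -> column
        NR == 2 { y0 = $(col["YGC"]); f0 = $(col["FGC"]) }       # first sample
                { y1 = $(col["YGC"]); f1 = $(col["FGC"]) }       # latest sample so far
        END     { print "YGC +" (y1 - y0) ", FGC +" (f1 - f0) }
    '
}
```

For example, ten samples one second apart: jstat -gc pid 1000 10 | gc_delta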

4. Locating the problem

Querying the TCP connections on Tomcat's port showed a large number of connections in the ESTABLISHED state plus some in other states, more than 400 in total.

netstat -anp | grep port

Tracing the source of these connections showed that the application that calls this Tomcat service had a large number of background threads polling it frequently. They filled up the service's Tomcat connections, so it could not accept any more requests.
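
Both checks, the distribution of connection states and the clients holding the connections, can be sketched with awk over the netstat output. conn_states and top_clients are hypothetical helper names, 8080 in the usage line is only an example port, and the column layout assumed is the Linux netstat -ant format (local address in column 4, remote address in column 5, state in column 6).

```bash
#!/bin/bash
# Hypothetical helpers; both read `netstat -ant` output on stdin.
# $1 = local port to inspect.

# Print "<count> <state>" for the connections on that port, largest first.
conn_states() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print n[s], s }
    ' | sort -rn
}

# Print "<count> <remote address>" for that port, largest first.
# $2 = how many clients to show (default 10).
top_clients() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") && $6 != "LISTEN" {
            addr = $5
            sub(/:[0-9]+$/, "", addr)   # drop the remote port
            n[addr]++
        }
        END { for (a in n) print n[a], a }
    ' | sort -rn | head -n "${2:-10}"
}
```

For example: netstat -ant | conn_states 8080, then netstat -ant | top_clients 8080 5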

netstat connection states:

  • LISTEN: waiting for a connection request from any remote TCP and port
  • SYN-SENT: a connection request has been sent and the socket is waiting for a matching connection request (if many connections are in this state, check whether the machine has been compromised)
  • SYN-RECEIVED: a connection request has been both received and sent, and the socket is waiting for the peer's confirmation (a large number of these may indicate a SYN flood)
  • ESTABLISHED: an open connection
  • FIN-WAIT-1: waiting for a connection termination request from the remote TCP, or for an acknowledgment of the termination request already sent
  • FIN-WAIT-2: waiting for a connection termination request from the remote TCP
  • CLOSE-WAIT: waiting for a connection termination request from the local user
  • CLOSING: waiting for the remote TCP to acknowledge the connection termination request
  • LAST-ACK: waiting for acknowledgment of the connection termination request previously sent to the remote TCP (if many connections stay in this state, check whether the machine is under attack)
  • TIME-WAIT: waiting long enough to be sure the remote TCP received the acknowledgment of its connection termination request
  • CLOSED: no connection at all

5. Root Cause Analysis

The direct trigger was client polling: when a request failed, the client kept polling, and new background threads on the client kept joining in, until the server's Tomcat connections were exhausted.
