Observation with top: the %wa field shows the percentage of CPU time spent waiting for I/O. When it stays above 30%, I/O pressure is high. Next, drill down with iostat:

[root@controller ~]# iostat -d -k 1 10
Device:    tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
sda      19.00        0.00      112.00         0       112
sda1      0.00        0.00        0.00         0         0
sda2      0.00        0.00        0.00         0         0
sda3      0.00        0.00        0.00         0         0
sda4      0.00        0.00        0.00         0         0
sda5      3.00        0.00       16.00         0        16
sda6      0.00        0.00        0.00         0         0
sda7     16.00        0.00       96.00         0        96

tps: the number of transfers per second issued to the device. One transfer means one I/O request.
Use -x to get extended statistics, including device utilization (%util) and response time (await):

[root@controller ~]# iostat -d -x -k 1 10
Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   22.00  0.00  18.00    0.00  160.00     17.78      0.07   3.78   3.78   6.80
sda1       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda2       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda3       0.00   15.00  0.00   2.00    0.00   68.00     68.00      0.01   6.50   6.50   1.30
sda4       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda5       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda6       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda7       0.00    7.00  0.00  16.00    0.00   92.00     11.50      0.06   3.44   3.44   5.50
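To spot problem devices programmatically, the -x columns can be parsed from a captured output line. A minimal sketch, assuming sysstat's classic 11-column format; the `parse_row` helper and the 90% threshold are illustrative, not part of iostat:

```python
# Parse one data row of `iostat -d -x -k` output and flag devices
# that look overloaded.
HEADER = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rkB/s", "wkB/s",
          "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

def parse_row(line):
    fields = line.split()
    dev = fields[0]
    stats = dict(zip(HEADER, (float(v) for v in fields[1:])))
    return dev, stats

dev, s = parse_row(
    "sda 0.00 22.00 0.00 18.00 0.00 160.00 17.78 0.07 3.78 3.78 6.80")

# Heuristics from the text: near-100% %util means saturation;
# await much larger than svctm means requests spend time queuing.
saturated = s["%util"] > 90
queue_wait_ms = s["await"] - s["svctm"]
print(dev, saturated, queue_wait_ms)
```

For the sample sda row, await equals svctm, so the computed queue wait is 0 ms and the device is far from saturated.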
If %util is close to 100%, too many I/O requests are being generated, the I/O system is saturated, and the disk may be the bottleneck. When idle drops below 70%, I/O pressure is already high, and reads generally spend more time waiting. You can cross-check with vmstat, looking at the b column (processes blocked waiting for I/O) and the wa column (CPU time spent in I/O wait).

svctm is generally smaller than await, because the time concurrent requests spend waiting in the queue is counted for each of them. svctm is mostly determined by disk performance, though CPU/memory load also affects it, and too many requests will indirectly drive svctm up. await depends on the service time (svctm), the length of the I/O queue, and the pattern in which I/O requests are issued. If svctm is close to await, I/O spends almost no time waiting; if await is much larger than svctm, the I/O queue is too long and the application's response time suffers. If the response time exceeds what users can tolerate, consider replacing the disk with a faster one, tuning the kernel's elevator (I/O scheduler) algorithm, optimizing the application, or upgrading the CPU.

The queue length (avgqu-sz) is also a useful indicator of system I/O load, but since it is an average over the sampling interval, it cannot capture instantaneous I/O bursts.

Someone has a good analogy for this (the I/O system as a supermarket queue). When we queue up to check out at a supermarket, how do we decide which counter to join? The first thing to look at is the number of people in line: five people will always be faster than twenty, right? Besides counting heads, we also look at how much the person in front has bought: if there is an auntie ahead of us with a week's worth of groceries, we might consider switching lines. Then there is the speed of the cashier.
If you run into a novice who can't even count the money, you are in for a wait. Timing also matters: the counter that was crowded five minutes ago may be empty now, and paying then is a pleasure, provided what you did in those five minutes was more meaningful than queuing (though I have yet to find anything more boring than queuing). I/O systems have many similarities with supermarket queues:
We can use this data to analyze the I/O request pattern, as well as I/O speed and response time.

%util: the fraction of the sampling interval during which the device was busy processing I/O. For example, with a 1-second interval, if the device spends 0.8 s processing I/O and is idle for 0.2 s, then %util = 0.8/1 = 80%. This parameter therefore reflects how busy the device is. In general, 100% means the device is running close to full capacity (although if there are multiple disks behind one device, %util can read 100% while the array still has headroom, because the disks can work concurrently).

When deploying a program (I tested one that uploads logs in real time), you must take the system's CPU, memory, I/O and so on into account to keep it running efficiently. If the program processes very small packets with many events, under constant pressure and without pauses, it will consume a lot of CPU. If a disk cache is used instead of a memory cache, resumable uploads can be supported, guaranteeing reliable delivery: after a sudden power outage, the data persisted in the disk cache is still uploaded after recovery and is not lost. The trade-off is more disk reads and writes; if the data volume is small, the speed is still tolerable.
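The %util arithmetic above can be written out directly (a toy calculation for the worked example, not how iostat itself derives the field):

```python
# %util = time the device spent servicing I/O / length of the interval.
def util_pct(busy_seconds, interval_seconds):
    return 100.0 * busy_seconds / interval_seconds

# 0.8 s busy out of a 1 s sample -> 80% utilization, as in the text.
print(util_pct(0.8, 1.0))
```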
Here is an analysis of this output written by someone else:

# iostat -x 1
avg-cpu:  %user  %nice  %sys  %idle
          16.24   0.00  4.31  79.44
Device:            rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
/dev/cciss/c0d0      0.00   44.90  1.02  27.55    8.16  579.59   4.08  289.80     20.57     22.35  78.21   5.00  14.29
/dev/cciss/c0d0p1    0.00   44.90  1.02  27.55    8.16  579.59   4.08  289.80     20.57     22.35  78.21   5.00  14.29
/dev/cciss/c0d0p2    0.00    0.00  0.00   0.00    0.00    0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00

The iostat output above shows 28.57 device I/O operations per second: total I/O per second = r/s + w/s = 1.02 + 27.55 = 28.57 (ops/second), with writes dominating (w:r ≈ 27:1).

Each device I/O operation takes only 5 ms on average to complete, yet each I/O request waits 78 ms. Why? Because so many I/O requests arrive at once (about 29 per second). Assuming these requests are issued simultaneously, the average waiting time can be estimated as:

average wait = single I/O service time * (1 + 2 + ... + (total requests - 1)) / total requests

Applied to this example: average wait = 5 ms * (1 + 2 + ... + 28) / 29 = 70 ms, which is quite close to the 78 ms average wait reported by iostat. This in turn suggests that the I/Os really were initiated more or less simultaneously.

Many I/O requests are issued per second (about 29), but the average queue is not long (only about 2), which shows that these 29 requests do not arrive uniformly: the I/O device is idle most of the time. The queue is non-empty only 14.29% of each second, meaning the I/O system has nothing to do 85.71% of the time, and all 29 I/O requests are processed within about 143 milliseconds.

delta(ruse+wuse)/delta(io) = await = 78.21 => delta(ruse+wuse)/s = 78.21 * delta(io)/s = 78.21 * 28.57 ≈ 2234.5, meaning the I/O requests issued each second wait a combined total of about 2234.5 ms.
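The two calculations above (the simultaneous-arrival wait estimate and the queue-length cross-check) can be reproduced in a few lines; the 5 ms, 78.21 ms, and r/s + w/s figures are taken from the sample output:

```python
# If n requests arrive at once and each takes svctm ms to service,
# the k-th request waits (k-1)*svctm ms; averaging over all n gives:
def avg_wait_ms(svctm_ms, n):
    return svctm_ms * sum(range(n)) / n   # (0 + 1 + ... + (n-1)) / n

print(avg_wait_ms(5, 29))  # 70.0 ms, close to the 78.21 ms from iostat

# Cross-check the queue length via Little's law:
# average queue length = await (ms) * IOPS / 1000
iops = 1.02 + 27.55                    # r/s + w/s
print(round(78.21 * iops / 1000, 2))   # ~2.23, not the 22.35 iostat printed
```

This is the arithmetic behind the claim in the next paragraph that avgqu-sz should be about 2.23.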
So the average queue length should be 2234.5 ms / 1000 ms ≈ 2.23, yet the avgqu-sz reported by iostat is 22.35. Why?! Because of a bug in that version of iostat: the avgqu-sz value should be 2.23, not 22.35.

That is the full content of this article. I hope it is helpful for everyone's study, and I hope everyone will support 123WORDPRESS.COM.