Today, let’s discuss an interesting topic: How much data does a single MySQL table need before you need to consider splitting the database and tables? Some say 20 million rows, others say 5 million rows. So, what value do you think is appropriate? There was once a widely circulated saying in China's Internet technology circle: If the amount of data in a single MySQL table exceeds 20 million rows, the performance will be significantly reduced. In fact, this rumor is said to have originated from Baidu. The specific situation is roughly like this. When the DBA tested MySQL performance, he found that when the volume of a single table reached 20 million rows, the performance of SQL operations dropped sharply. Therefore, the conclusion came from this. Then it was said that Baidu's engineers moved to other companies in the industry and brought this information with them, so this rumor spread in the industry. Later, Alibaba's "Java Development Manual" recommended splitting the database and tables only when the number of rows in a single table exceeds 5 million or the capacity of a single table exceeds 2GB. Alibaba's golden rule supports this. Therefore, many people use this as a standard to perform table partitioning when designing big data storage. So, what value do you think is appropriate? Why not 3 million rows, or 8 million rows, but 5 million rows? Maybe you would say that this might be Ali's best actual combat value? So, the question is, how is this value evaluated? Wait a moment, please think about it for a moment. In fact, this value has nothing to do with the actual number of records, but is related to the configuration of MySQL and the hardware of the machine. Because, in order to improve performance, MySQL will load the table's index into memory. If the InnoDB buffer size is sufficient, it can load all the data into memory and there will be no problem with the query. However, when a single-table database reaches a certain upper limit, the memory cannot store its index, so subsequent SQL queries will generate disk IO, resulting in performance degradation. Of course, this is also related to the design of the specific table structure, and the final problem is memory limitation. Here, increasing the hardware configuration may bring immediate performance improvements. So, my opinion on database and table sharding is that it needs to be combined with actual needs and should not be over-designed. Database and table sharding should not be adopted at the beginning of the project. Instead, as the business grows and optimization cannot be continued, database and table sharding should be considered to improve system performance. In this regard, Alibaba's "Java Development Manual" adds: If you estimate that the amount of data will not reach this level in three years, please do not split the database or table when creating the table. So, back to the original question, what value do you think is appropriate? My suggestion is to make a comprehensive assessment based on the situation of your own machine. If you don't have a standard in mind, then temporarily use 5 million rows as a unified standard, which is a relatively compromising value. Let's take a look at some points about SQL writing, which will be helpful to everyone SQL writing needs to be optimized
You may also be interested in:
|
<<: Implementation and usage scenarios of JS anti-shake throttling function
>>: Solve the problem of installing Theano on Ubuntu 19
I searched the entire web and found all kinds of ...
Table of contents need: Ideas: lesson: Share the ...
The spread operator allows an expression to be ex...
1. Application of multimedia in HTML_falsh animat...
Table of contents View all storage engines InnoDB...
Purpose: Under Linux, the server program may be d...
Scrcpy Installation snap install scrcpy adb servi...
Docker allows network services to be provided by ...
1. position : fixed Locked position (relative to ...
It took me more than an hour to open ssh in Ubunt...
Today, after the game was restarted, I found that...
Solution to "Could not run curl-config"...
This article example shares the specific code of ...
Example source code: https://codepen.io/shadeed/p...
This article shares the specific code of vue+elem...