Application scenarios and design methods of MySQL table and database sharding

Application scenarios and design methods of MySQL table and database sharding

Many friends have asked in forums and message areas under what circumstances MySQL needs to be sharded, and what design method is the best choice. Based on these questions, the editor has compiled some application scenarios and examples of the best design methods for MySQL sharding.

1. Sub-table

Scenario: For large-scale Internet applications, the number of record rows in a single database table may reach tens of millions or even hundreds of millions, and the database faces extremely high concurrent access. MySQL architecture using Master-Slave replication mode,

Only the reading of the database can be expanded, while the writing operation of the database is still concentrated on the Master. In addition, it is impossible to have an unlimited number of Slaves mounted on a single Master. The number of Slaves is limited by the Master's capabilities and load.

Therefore, the database throughput capacity needs to be further expanded to meet the needs of high concurrent access and massive data storage!

For a single table that is accessed very frequently and has a huge amount of data, the first thing we need to do is to reduce the number of records in the single table in order to reduce the time required for data query and improve the database throughput. This is the so-called table sharding!

Before sharding a table, you first need to select an appropriate sharding strategy so that the data can be evenly distributed across multiple tables without affecting normal queries!

For Internet companies, most data is associated with users, so user ID is the most commonly used sub-table field. Because most queries need to carry user ID, this will not affect the query and can make the data more balanced.

Distributed to each table (of course, in some scenarios, the distribution of hot and cold data may be unbalanced), as shown in the following figure:

Suppose there is an order table that records user purchase information. Since the order table has too many records, it will be split into 256 tables.

The split records are stored in the corresponding table according to user_id%256, and the front-end application finds the corresponding order storage table and accesses it according to the corresponding user_id%256.

In this way, user_id becomes a necessary query condition, otherwise the data cannot be accessed because the table where the data is stored cannot be located.

Note: The number of tables after splitting is generally 2 to the power of n, which is the reason why it is split into 256 tables above!

Assume that the order table structure is as follows:

create table order_( 
 order_id bigint(20) primary key auto_increment, 
 user_id bigint(20), 
 user_nick varchar(50), 
 auction_id bigint(20), 
 auction_title bigint(20), 
 price bigint(20), 
 auction_cat varchar(200), 
 seller_id bigint(20), 
 seller_nick varchar(50) 
)

After the table is split, assuming user_id = 257 and auction_id = 100, you need to query the corresponding order information based on auction_id. The corresponding SQL statement is as follows:

select * from order_1 where user_id=257 and auction_id = 100;

Among them, order_1 is calculated based on 257%256, which means the first order table after the partition.

2. Database Division

Scenario: Table sharding can solve the problem of decreased query efficiency caused by excessive data volume in a single table, but it cannot bring about a qualitative improvement in the concurrent processing capability of the database. In the face of high concurrent read and write access, when the database master

When the server cannot bear the pressure of write operations, it is meaningless to expand the slave server no matter how you do it.

Therefore, we must change our thinking and split the database to improve the database writing capability. This is the so-called database partitioning!

Similar to the table sharding strategy, database sharding can use a keyword modulo method to route data access, as shown in the following figure:

Still using the previous order table, assuming that the value of the user_id field is 258, the original single database is divided into 256 databases, then the application's access request to the database will be routed to the second database (258%256 = 2).

3. Sub-library and sub-table

Scenario: Sometimes the database may face the pressure of high concurrent access and the need to store massive data. In this case, it is necessary to adopt both the table sharding strategy and the library sharding strategy for the database in order to expand the system at the same time.

Concurrent processing capabilities and improving the query performance of a single table, this is what is called sharding.

The strategy of sharding is more complicated than the previous strategy of sharding only or only sharding only. A routing strategy of sharding is as follows:

1. Intermediate variable = user_id % (number of sub-databases * number of tables in each database)

2. Library = integer (intermediate variable / number of tables in each library)

3. Table = Intermediate variable % Number of tables in each library

Also use user_id as the routing field. First, use user_id to modulo the number of libraries * the number of each library table to get an intermediate variable. Then divide the intermediate variable by the number of each library table and round it up to get

The corresponding library; and the intermediate variable modulo the number of each library table, that is, the corresponding table is obtained.

The detailed process of the database and table sharding strategy is as follows:

Assume that the original single-database single-table order is split into 256 libraries, each containing 1024 tables. Then, according to the routing strategy mentioned above, for the access of user_id=262145, the routing calculation process is as follows:

1. Intermediate variable = 262145 % (256 * 1024) = 1

2. Library = rounded (1/1024) = 0

3. Table = 1 % 1024 = 1

This means that the query and modification of the order record of user_id=262145 will be routed to the first order_1 table in the 0th database for execution! ! !

You may also be interested in:
  • MYSQL performance optimization sharing (sharding of databases and tables)
  • Mysql database partitioning and table partitioning methods (commonly used)
  • Summary of MYSQL database data splitting: sub-library and sub-table
  • Summary of MySQL database and table sharding
  • In-depth explanation of MySql table, database, sharding and partitioning knowledge
  • How to split data in MySQL table and database
  • Getting Started Guide to MySQL Sharding
  • Summary of MySQL's commonly used database and table sharding solutions
  • MySQL sharding details
  • MySQL sharding project practice

<<:  Four ways to switch tab pages in VUE

>>:  Explanation of installation and configuration of building go environment under linux

Recommend

14 techniques for high-performance websites

Original : http://developer.yahoo.com/performance...

How to create Baidu dead link file

There are two types of dead link formats defined b...

Pure js to achieve the effect of carousel

This article shares the specific code of js to ac...

Steps to deploy Docker project in IDEA

Now most projects have begun to be deployed on Do...

Example usage of Linux compression file command zip

The ".zip" format is used to compress f...

Detailed explanation of vuex persistence in practical application of vue

Table of contents vuex persistence Summarize vuex...

Use nginx to dynamically convert image sizes to generate thumbnails

The Nginx ngx_http_image_filter_module module (ng...

Example analysis of mysql user rights management

This article describes the MySQL user rights mana...

What does this.parentNode.parentNode (parent node of parent node) mean?

The parent node of the parent node, for example, t...

Build nginx virtual host based on domain name, port and IP

There are three types of virtual hosts supported ...

Solution to the error when installing Docker on CentOS version

1. Version Information # cat /etc/system-release ...