Background

More than 10 million new rows are added every month to the business tables in our Alibaba Cloud RDS for MySQL database (MySQL 5.7). As the data volume keeps growing, queries against the large tables have become slow. During peak business hours, slow queries on the main business table take tens of seconds, which seriously affects the business.

Solution Overview

1. Database design and index optimization

MySQL itself is highly flexible, but that flexibility does not guarantee good performance. Performance depends heavily on the developer's table design and index optimization skills. Here are some optimization suggestions.
Leftmost index matching rule

As the name implies, matching starts from the leftmost column. When creating a composite index, place the column used most frequently in the WHERE clause on the leftmost side, according to business needs.

An important question for composite indexes is how to order the columns. For example, if two columns, c1 and c2, appear in the WHERE clause, should the index be (c1, c2) or (c2, c1)? The usual approach is to put the column with fewer duplicate values (higher selectivity) first. For example, if 95% of the values in a column are distinct, that column can generally be placed first.
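As a rough illustration of the column-ordering rule above, the following Python sketch ranks candidate columns by selectivity (distinct values divided by total rows) and builds a CREATE INDEX statement with the most selective column first. The table name `orders`, the sample values, and all helper names are hypothetical. In practice the counts would come from the real table, for example through COUNT(DISTINCT ...).

```python
from typing import Dict, List, Sequence


def selectivity(values: Sequence) -> float:
    """Fraction of distinct values in a column sample (1.0 = all unique)."""
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def order_index_columns(sample: Dict[str, Sequence]) -> List[str]:
    """Order columns from most to least selective (fewest duplicates first)."""
    return sorted(sample, key=lambda col: selectivity(sample[col]), reverse=True)


def create_index_sql(table: str, columns: List[str]) -> str:
    """Build a composite-index DDL string for the given column order."""
    name = "idx_" + "_".join(columns)
    return f"CREATE INDEX {name} ON {table} ({', '.join(columns)});"


# Hypothetical sample: c1 has few distinct values, c2 is almost unique.
sample_rows = {
    "c1": ["paid", "paid", "new", "paid", "new", "paid"],
    "c2": [1001, 1002, 1003, 1004, 1005, 1005],
}
ordered = order_index_columns(sample_rows)
ddl = create_index_sql("orders", ordered)
print(ddl)
```

Selectivity is only one input. The leftmost column still has to appear in the WHERE clause of the queries you care about, so the result should be checked against the actual query patterns.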
2. Switch the database to PolarDB with read-write separation

PolarDB is Alibaba Cloud's self-developed next-generation relational cloud database. It is 100% compatible with MySQL, offers up to 100 TB of storage, and a single cluster can scale out to as many as 16 nodes, making it suitable for a wide range of enterprise database scenarios.

PolarDB separates storage from compute. All compute nodes share one copy of the data, which provides minute-level configuration upgrades and downgrades, second-level fault recovery, global data consistency, and free data backup and disaster recovery services.

Cluster architecture: separation of compute and storage

Read-write separation: When an application connects through the cluster address, PolarDB serves requests through an internal proxy layer (Proxy). Requests pass through the proxy before reaching a database node. Besides security authentication and protection, the proxy parses the SQL, sends write operations (such as transactions, UPDATE, INSERT, DELETE, and DDL) to the primary (master) node, and distributes read operations (such as SELECT) evenly across the read-only nodes, which gives automatic read-write separation. For the application, it is as easy as using a single-node database.

Mixed online and offline workloads: different services use different connection addresses and different data nodes, so they do not affect each other.

Sysbench stress test report, tested configurations: PolarDB 4 cores 16 GB, 2 nodes; PolarDB 8 cores 32 GB, 2 nodes.

3. Migrate historical data into monthly tables on the MySQL 8.0 X-Engine storage engine

The business table retains three months of data (based on company requirements). Older data is split into monthly tables and moved to X-Engine tables in the history database.

Why choose X-Engine tables? What are the advantages?
Cost savings: X-Engine's storage cost is about half that of InnoDB.

Tiered storage improves QPS: X-Engine uses a tiered storage structure that keeps hot data and cold data in different layers, and compresses the layer holding cold data by default.

X-Engine is an online transaction processing (OLTP) database storage engine developed by Alibaba Cloud's database product division.
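The retention rule in this section (keep three months in the business table, move older rows into monthly history tables) can be sketched as follows. This is a minimal sketch, not the actual migration job: the table prefix `orders_history_`, the naming pattern, and the function names are hypothetical, and the three-month window is assumed to be counted in whole calendar months.

```python
from datetime import date

RETENTION_MONTHS = 3  # from the article: the business table keeps three months


def month_index(d: date) -> int:
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + (d.month - 1)


def should_archive(created: date, today: date, keep: int = RETENTION_MONTHS) -> bool:
    """True when the row falls outside the most recent `keep` calendar months."""
    return month_index(today) - month_index(created) >= keep


def history_table(created: date, prefix: str = "orders_history_") -> str:
    """Name of the monthly history table for a row (hypothetical naming)."""
    return f"{prefix}{created.year:04d}{created.month:02d}"


today = date(2024, 5, 15)
for created in (date(2024, 4, 30), date(2024, 2, 1), date(2023, 12, 25)):
    if should_archive(created, today):
        print(created, "->", history_table(created))
    else:
        print(created, "-> stays in business table")
```

Counting in calendar months keeps each history table aligned with exactly one month, which makes it simple to move or drop a whole month at a time.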
4. Alibaba Cloud PolarDB MySQL 8.0 parallel query

After splitting the table, the remaining data volume is still large. Splitting only reduces the size of the business table and does not fully solve the slow query problem. For the remaining slow queries we use PolarDB's parallel query optimization.

PolarDB MySQL 8.0 provides a parallel query framework. When the amount of data a query scans reaches a certain threshold, the framework starts automatically and cuts the query time severalfold. At the storage layer, the data is split across different threads, which compute in parallel. Their results are pipelined to the main thread, which does a simple merge and returns the result to the user.

Parallel query applies to most SELECT statements, such as large table queries, multi-table joins, and computation-heavy queries. For very short queries the benefit is less noticeable.

Parallel query usage: hint syntax can control a single statement. For example, if parallel query is turned off by default at the system level but a frequent slow SQL statement needs to be accelerated, a hint can enable it for that statement only.

SELECT /*+ PARALLEL(x) */ … FROM …; -- x > 0
SELECT /*+ SET_VAR(max_parallel_degree=n) */ * FROM …; -- n > 0

Query test: database configuration 16 cores 32 GB, single table with more than 30 million rows. The query took 4326 ms before parallel query and 525 ms with it, an 8.24x improvement.

5. Hologres interactive analytics

Although parallel query optimization has made slow queries on large tables faster, some requirements, such as real-time reports and real-time dashboards, still cannot be met this way and have to be handled by a big data platform.
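Returning to the parallel query flow in section 4 (split the data across worker threads, compute partial results, merge them in the main thread), the idea can be illustrated in plain Python. This is a conceptual sketch only and not how PolarDB is implemented: the function names, the list standing in for a table column, and the use of SUM as the aggregate are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def split_ranges(n_rows: int, degree: int) -> List[Tuple[int, int]]:
    """Split [0, n_rows) into `degree` contiguous, near-equal row ranges."""
    base, extra = divmod(n_rows, degree)
    ranges, start = [], 0
    for i in range(degree):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def partial_sum(rows: Sequence[int], bounds: Tuple[int, int]) -> int:
    """Worker: aggregate one slice of the table."""
    lo, hi = bounds
    return sum(rows[lo:hi])


def parallel_sum(rows: Sequence[int], degree: int) -> int:
    """Main thread: fan out slices to workers, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = pool.map(lambda b: partial_sum(rows, b),
                            split_ranges(len(rows), degree))
        return sum(partials)


rows = list(range(1, 1001))  # stand-in for one column of a large table
total = parallel_sum(rows, degree=4)
print(total)

# Speedup reported in the article's test: 4326 ms -> 525 ms
print(round(4326 / 525, 2))
```

Because of CPython's global interpreter lock, this sketch does not actually run faster with more threads. It only shows the split and merge structure that the degree of parallelism in the hints controls.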
Postscript

Optimizing tables with tens of millions of rows should be driven by the business scenario, and every optimization comes at a cost. It is not about horizontally splitting and scaling out the database right away. That brings huge challenges to operations and to the business, and in many cases the results are not good. First check whether the database design, index optimization, and table partitioning strategy are in place, then choose the appropriate technology according to business needs.