Practical record of optimizing a MySQL table with tens of millions of rows

Preface

Let me explain something first. Many people online claim that Alibaba's guidelines require sharding into multiple databases and tables once a table reaches 5 million rows. In fact, 5 million is not a hard limit; the real threshold depends on the MySQL configuration and the machine's hardware. To improve performance, MySQL caches table indexes in memory. Once a table grows to the point where its indexes no longer fit in memory, index lookups have to fall back to disk I/O, and performance degrades.
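As a rough sanity check (my addition, assuming the table lives in a schema named test), you can compare a table's index and data size against the buffer pool size:

select table_name,
       round(index_length / 1024 / 1024) as index_mb,
       round(data_length / 1024 / 1024) as data_mb
from information_schema.tables
where table_schema = 'test' and table_name = 'user';

show variables like 'innodb_buffer_pool_size';

If the index size is well beyond the buffer pool, lookups will start hitting disk.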

Practical Tuning

I have a table with 10 million rows and only a primary key index.

CREATE TABLE `user` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `uname` varchar(20) DEFAULT NULL COMMENT 'Account',
  `pwd` varchar(20) DEFAULT NULL COMMENT 'Password',
  `addr` varchar(80) DEFAULT NULL COMMENT 'Address',
  `tel` varchar(20) DEFAULT NULL COMMENT 'Phone',
  `regtime` char(30) DEFAULT NULL COMMENT 'Registration time',
  `age` int(11) DEFAULT NULL COMMENT 'Age',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=10000003 DEFAULT CHARSET=utf8;
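To reproduce the timings below you need test data. The article doesn't show how its 10 million rows were generated; the stored procedure below is a minimal sketch of one way to do it (the procedure name and the value formats are my assumptions):

DELIMITER //
CREATE PROCEDURE gen_users(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  START TRANSACTION;
  WHILE i < n DO
    INSERT INTO `user` (uname, pwd, addr, tel, regtime, age)
    VALUES (RAND() * 10,                                        -- e.g. '6.445329111484186'
            LEFT(MD5(RAND()), 20),                              -- fits varchar(20)
            CONCAT('addr-', i),
            CONCAT('138', LPAD(FLOOR(RAND() * 100000000), 8, '0')),
            NOW(),
            FLOOR(18 + RAND() * 50));
    SET i = i + 1;
    IF i MOD 10000 = 0 THEN   -- commit in batches so the transaction stays bounded
      COMMIT;
      START TRANSACTION;
    END IF;
  END WHILE;
  COMMIT;
END //
DELIMITER ;

CALL gen_users(10000000);   -- takes a while on most machines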

A full-table query takes about 16 seconds, which is quite slow. Usually a table like this sits behind a backend system. Take an e-commerce platform, where this is the user table: the backend management system generally queries this user information and performs operations on it, such as adding new users or deleting users.

So there are two requirements here: one is to query the count, and the other is to query by page.

Let's time the count query and the paging queries separately.

select * from user limit 1, 10;        -- almost no time
select * from user limit 1000000, 10;  -- 0.35s
select * from user limit 5000000, 10;  -- 1.7s
select * from user limit 9000000, 10;  -- 2.8s
select count(1) from user;             -- 1.7s

From the query times above, we can see that the deeper the paging offset, the longer the query takes, and the count query alone takes 1.7s. This obviously does not meet our requirements, so we need to optimize. First, let's try index optimization and look at the execution plan while only the primary key index exists.
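With only the primary key, EXPLAIN reports a full table scan; the output sketched below is typical, not a capture from the original article:

EXPLAIN select * from user limit 9000000, 10;
-- type: ALL, key: NULL  -- every row is scanned, then 9,000,000 rows are skipped

Now let's add a composite index covering the remaining columns and see whether it helps: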

alter table `user` add INDEX `sindex` (`uname`,`pwd`,`addr`,`tel`,`regtime`,`age`);

Looking at the execution plan after adding the index, although type changed from ALL to index and the sindex index is used, the query speed does not actually improve: the server is still scanning the entire index.

In fact, a composite index exists to make conditional queries faster, not full-table queries:

select * from user where uname = '6.445329111484186';  -- 3.5s   (without the composite index)
select * from user where uname = '6.445329111484186';  -- 0.003s (with the composite index)

So that is the difference the composite index makes for a conditional query.

It also basically confirms that a full-table query is slow with or without an index.
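The reason the second query above is fast: uname is the leftmost column of sindex, so the lookup can use ref access on the index instead of a scan. A typical plan (sketched output, illustrative only):

EXPLAIN select * from user where uname = '6.445329111484186';
-- key: sindex, type: ref  -- only the matching index entries are read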

Since indexing can't help here, we have to look for another solution. As I mentioned in my earlier MySQL interview write-up, the count can be stored in a separate table:

CREATE TABLE `attribute` (
  `id` int(11) NOT NULL,
  `formname` varchar(50) COLLATE utf8_bin NOT NULL COMMENT 'table name',
  `formcount` int(11) NOT NULL COMMENT 'Total data of the table',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Note that a table like this is generally queried for a single row by exact table name, never scanned in full, so when creating it you can give formname a hash index.
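For example (a sketch; the index name is mine, and note that InnoDB silently builds a BTREE even if you write USING HASH, since true hash indexes need the MEMORY engine; lookups on a short unique key are near-instant either way):

alter table `attribute` add unique index `idx_formname` (`formname`) using hash;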

select formcount from attribute where formname = 'user';  -- almost no time

With that, the count optimization is done. If the page has filter conditions, create an index on the filter columns instead and count through that index; in that case you don't read the precomputed count.
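The article doesn't say how formcount stays accurate. One minimal sketch is a pair of triggers (an assumption on my part; under heavy write load you would more likely update the counter asynchronously in batches):

CREATE TRIGGER user_after_insert AFTER INSERT ON `user`
FOR EACH ROW
  UPDATE attribute SET formcount = formcount + 1 WHERE formname = 'user';

CREATE TRIGGER user_after_delete AFTER DELETE ON `user`
FOR EACH ROW
  UPDATE attribute SET formcount = formcount - 1 WHERE formname = 'user';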

Count is handled, but how do we optimize the paging query? Here we can use a subquery:

select * from user
where id >= (select id from user limit 9000000, 1)
limit 10;  -- 1.7s

Writing the subquery this way to locate the starting id means the inner query reads only the id column, so it is satisfied entirely by a covering index, and efficiency increases greatly. My test here still takes 1.7s; when my company optimized this before, the query time was lower than that. You can also generate data and test it yourself.
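A common equivalent rewrite (not from the article) is the "deferred join": page over the covering index first, then join back to fetch the full rows:

select u.*
from user u
join (select id from user order by id limit 9000000, 10) t on u.id = t.id;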

But if the data volume is very large, I still recommend using Elasticsearch, or constraining queries with sensible default filters, and keeping the count in its own table as described above.

At this point, the optimization of paged queries over tens of millions of rows is complete.

Summary

This is the end of this article about optimizing MySQL tables with tens of millions of rows. I hope it helps.
