Optimized record of using IN data volume in Mysql

Optimized record of using IN data volume in Mysql

The MySQL version number is 5.7.28. Table A has 3.9 million records and uses the InnoDB engine. The varchar type field mac has been indexed using B-tree. Table B has only 5000+ records.

There is a SQL instruction written like this:

SELECT * FROM A WHERE mac IN("aa:aa:aa:aa:aa:aa","bb:bb:bb:bb:bb:bb:b",...more than 900 entries omitted)

It took 294.428s to get the results through the query. Yes, nearly 5 minutes.

Use EXPLAIN to analyze:

The access type is range, the index has been hit, and there are only 587776 rows. Why does the query take so long?

The indexing method of mac uses B-tree. Let's compare it with HASH and summarize it briefly: B-tree index can be used for =, >, >=, <, <= and between calculations, while HASH can only perform equality operations and cannot perform range searches. Then IN is an equivalent operation, and both indexing methods are applicable. In this case, change the index method of mac to HASH, and the same query takes time.

Since adjusting the index method cannot significantly improve the query performance of the statement, it can only be processed from the statement itself. In fact, anyone with a discerning eye can see at first glance that SELECT * is very performance-intensive. So we only query the fields required by the business and adjust the statement to:

SELECT id,mileage FROM A WHERE mac IN("aa:aa:aa:aa:aa:aa","bb:bb:bb:bb:bb:bb:b",...more than 900 entries omitted)

There is no significant increase in time consumption.

Since the IN method is so difficult to optimize, can we give up using LEFT JOIN? The statement is adjusted to:

SELECT a.id,a.mileage FROM A a LEFT JOIN B b ON b.mac = a.mac WHERE b.create_time >= '2020-01-01'

If it takes more than 5 minutes, give up.

We know that when the number of conditions is small, there is no obvious difference between the effects of EXISTS and IN. But when there are many conditions, IN is more efficient than EXISTS. Let's try EXISTS:

SELECT id,mileage FROM A a WHERE EXISTS(SELECT mac FROM B WHERE create_time >= '2020-01-01' AND mac = a.mac)

It also takes more than 5 minutes. IN is indeed more efficient than EXISTS, so I gave up.

So the final conclusion is that if IN is followed by a String with large amounts of data, you should be cautious.

In the project, I use mac as a unique identifier to create a corresponding table with id. In table A, mac_id is used instead of mac, and IN(1,2,3...) is used when querying. The efficiency will be improved. Currently, using NoSQL is also a way.

Summarize

This is the end of this article about optimizing MySQL using IN for large amounts of data. For more information about optimizing MySQL using IN for large amounts of data, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MYSQL IN and EXISTS optimization examples
  • Optimization of not in and minus in MySQL
  • How to use not in to optimize MySql
  • Detailed Analysis of or, in, union and Index Optimization in MySQL
  • Implementation of MySQL select in subquery optimization

<<:  Web project development VUE mixing and inheritance principle

>>:  How to use libudev in Linux to get USB device VID and PID

Recommend

How to implement responsiveness in Vue source code learning

Table of contents Preface 1. Key Elements of a Re...

Vue implements simple production of counter

This article example shares the simple implementa...

How to install vncserver in Ubuntu 20.04

Ubuntu 20.04 has been officially released in Apri...

Mysql master-slave synchronization configuration scheme under Centos7 system

Preface Recently, when working on a high-availabi...

Native JS to achieve digital table special effects

This article shares a digital clock effect implem...

In-depth analysis of the Linux kernel macro container_of

1. As mentioned above I saw this macro when I was...

How to create dynamic QML objects in JavaScript

1. Dynamically create objects There are two ways ...

MySQL joint index effective conditions and index invalid conditions

Table of contents 1. Conditions for joint index f...

Case study of dynamic data binding of this.$set in Vue

I feel that the explanation of this.$set on the I...

Vue Router loads different components according to background data

Table of contents Requirements encountered in act...

Detailed explanation of how a SQL statement is executed in MySQL

Overview I have recently started learning MySQL r...

Detailed usage of Vue timer

This article example shares the specific code of ...

How to draw a vertical line between two div tags in HTML

Recently, when I was drawing an interface, I enco...

How to call a piece of HTML code together on multiple HTML pages

Method 1: Use script method: Create a common head...