Summary of using MySQL isolation columns and prefix indexes

Summary of using MySQL isolation columns and prefix indexes

Isolate Data Columns

Often, we find queries that prevent MySQL from using indexes. MySQL does not use indexes on columns unless they are used independently in a query. "Isolate" means that the indexed columns should not be part of an expression or in a query function body. For example, the following example will not hit the actor_id index.

SELECT `actor_id` FROM `actor` WHERE `actor_id` + 1 = 2;

It is easy for a human to know that the query condition is actually actor_id = 4, but MySQL does not handle it that way, so develop a habit of simplifying the WHERE decision condition, which means that the index column is alone on one side of the comparison operator. Here is another example of a common mistake:

SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10;

Prefix Indexes and Index Selectivity

Sometimes you need to create an index on a column with very long characters, but this will cause the index to take up a lot of space and slow down the query. One strategy is to use hash index simulation, but sometimes this may not be good enough. What should we do at this time?

Usually, you can index some of the characters in front of the index column to replace the full field index to improve performance and save space. But this approach will result in poor selectivity. The selectivity of an index refers to the proportion of data filtered out by independent index values ​​to the entire data set. Highly selective indexes allow MySQL to filter out more irrelevant data. For example, the selectivity of a unique index is 1. The prefix of the column usually provides good enough performance in terms of selectivity. If you use BLOB or TEXT or very long VARCHAR columns, you must define prefix indexes because MySQL does not allow full-length indexes.

You need to strike a balance between using longer prefixes to get better selectivity and short enough prefixes to save storage space. To determine an appropriate prefix length, find the most frequent value and compare it to the most frequent prefix. For example, taking the city data table as an example, we can use the following statement to count:

SELECT COUNT(*) as cnt, `name` FROM `common_city` GROUP BY `name` ORDER BY cnt DESC LIMIT 10

It can be seen that these city names appear more frequently. Now we can use the 1-word prefix to find the most frequent city name prefixes.

SELECT COUNT(*) as cnt, LEFT(`name`, 1) as pref FROM `common_city` GROUP BY pref ORDER BY cnt DESC LIMIT 10

It can be seen that more data sets are found for one word, which will result in fewer chances of independent selection, so the length of the prefix needs to be adjusted. For example, adjust to 3 words.

SELECT COUNT(*) as cnt, LEFT(`name`, 3) as pref FROM `common_city` GROUP BY pref ORDER BY cnt DESC LIMIT 10

You can see that this is not much different from the full length, so a three-character prefix is ​​actually enough (the original text uses an English city data table with more characters). Another way is to use the ratio of the number of prefixes of different lengths to the number of full fields to evaluate how appropriate it is. For example:

SELECT 
  COUNT(DISTINCT LEFT(`name`, 1)) / COUNT(`name`) as pref1, 
  COUNT(DISTINCT LEFT(`name`, 2)) / COUNT(`name`) as pref2, 
  COUNT(DISTINCT LEFT(`name`, 3)) / COUNT(`name`) as pref3, 
  COUNT(DISTINCT LEFT(`name`, 4)) / COUNT(`name`) as pref4 
FROM `common_city`

The closer the value is to 1, the better the effect. However, it can be seen that the room for improvement decreases as the prefix length increases. It is not a good idea to just look at the average, you also need to check the worst case scenario. You might think 3-4 words are enough, but if your data is very unevenly distributed, there may be pitfalls. Therefore, it is also necessary to check whether there is a situation where the data corresponding to a prefix with fewer prefixes is extremely large compared with the others. Finally, you can add a prefix index to the specified column.

ALTER TABLE `common_city` ADD KEY (name(3));

Prefix indexes perform well in terms of saving space and improving efficiency, but they also have a flaw, which is that the index cannot be used in ORDER BY and GROUP BY (actual verification is also useful in MySQL versions above 5.7). Another common scenario is that in a longer hexadecimal string, such as a stored sessionId, taking the first 8-digit prefix as the index will filter out a lot of irrelevant data, which is very effective.

The above is the detailed content of the summary of the use of MySQL isolation data columns and prefix indexes. For more information about MySQL isolation data columns and prefix indexes, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • MySQL independent index and joint index selection
  • MySQL optimization: how to write high-quality SQL statements
  • MySQL query optimization using custom variables
  • Summary of MySQL logical backup and recovery testing
  • Basic principles of MySQL scalable design
  • Implementation ideas and steps for MySQL master-slave construction (multiple masters and one slave)
  • How to construct a table index in MySQL
  • How to maintain MySQL indexes and data tables
  • Detailed explanation of MySQL's Seconds_Behind_Master

<<:  Analysis of the difference between absolute path and relative path in HTML

>>:  CSS realizes process navigation effect (three methods)

Recommend

How to install Solr 8.6.2 in Docker and configure the Chinese word segmenter

1. Environment version Docker version 19.03.12 ce...

Win10 install Linux ubuntu-18.04 dual system (installation guide)

I installed a Linux Ubuntu system on my computer....

3 codes for automatic refresh of web pages

In fact, it is very simple to achieve this effect,...

How to deploy SpringBoot project using Docker

The development of Docker technology provides a m...

Solution to BT Baota Panel php7.3 and php7.4 not supporting ZipArchive

The solution to the problem that the PHP7.3 versi...

Docker container introduction

1. Overview 1.1 Basic concepts: Docker is an open...

MySQL 20 high-performance architecture design principles (worth collecting)

Open Source Database Architecture Design Principl...

How to enable MySQL remote connection in Linux server

Preface Learn MySQL to reorganize previous non-MK...

Analysis of the Neglected DOCTYPE Description

doctype is one of them: <!DOCTYPE HTML PUBLIC &...

Implementation code for operating mysql database in golang

Preface Golang provides the database/sql package ...

Extract specific file paths in folders based on Linux commands

Recently, there is a need to automatically search...

MySQL 5.6 zip package installation tutorial detailed

Previously, we all used files with the suffix .ms...