Discussion on more reasonable creation rules for MySQL string indexes

Preface

In earlier articles on MySQL indexes we covered the leftmost prefix rule, covering indexes, the choice between unique and ordinary indexes, and how the optimizer selects an index. Today we will discuss how to create indexes on string fields more sensibly.

How to create string indexes better

In MySQL, table data and indexes are stored in B+ trees. The less space an index tree occupies, the more entries fit on each page and the fewer pages a search has to read, so retrieval is generally faster. Some varchar strings can be very long, so how can we index them without wasting space?
Suppose a table t has an email field and we want to index it. The field holds values of the form [email protected].

There are two ways to create indexes:

1. Create an index for the email field directly: alter table t add index index1(email);

In index1, each index entry holds the complete email value together with the primary key.

2. Create a prefix index for email: alter table t add index index2(email(6));

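In index2, each index entry holds only the first 6 characters of the email value together with the primary key, so the index takes less space.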

Now consider this query: select id,name,email from t where email='[email protected]';

When using index1, the execution steps are:

1. Search the index1 tree for the entry whose value is [email protected] and read its primary key value, ID1;

2. Go back to the table with ID1, confirm that the row's email is [email protected], and add the row to the result set;

3. Move to the next entry in index1 and check whether it still equals [email protected]. If not, end the query.

When using index2, the execution steps are:

1. Search the index2 tree for the first entry whose value is zhangs and read its primary key value, ID1;

2. Go back to the table with ID1 and check the full email value. If it is [email protected], add the row to the result set;

3. Move to the next entry in index2. If it is still zhangs, go back to the table again and check whether that row's email is [email protected]. Rows that do not match are skipped;

4. Repeat until the index2 entry is no longer zhangs.

From this analysis, the full-field index needs fewer table lookups than the prefix index, because every entry that shares the prefix zhangs has to be checked against the table. If we lengthen the prefix from 6 to 7 or 8 characters, fewer entries share it and the number of table lookups drops. In other words, with a well-chosen prefix length we can save space without adding much query cost.
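The lookup counts described above can be modelled with a short Python sketch. This is only an illustration of the reasoning, not how MySQL is implemented: the rows, the addresses and the helper names build_index and lookup are all invented for the example, and a sorted list stands in for the B+ tree.

```python
from bisect import bisect_left

# Hypothetical table: primary key -> email (addresses invented for illustration).
rows = {
    1: "zhangsan@example.com",
    2: "zhangsen@example.com",
    3: "zhangshu@example.com",
    4: "zhaoliu@example.com",
    5: "lisi@example.com",
}

def build_index(rows, prefix_len=None):
    """Sorted (key, primary key) pairs; the key is truncated when prefix_len is set."""
    return sorted((email[:prefix_len], pk) for pk, email in rows.items())

def lookup(rows, index, target, prefix_len=None):
    """Equality search. Returns (matching primary keys, number of table lookups)."""
    key = target[:prefix_len]
    pos = bisect_left(index, (key,))      # first entry whose key is >= key
    matches, table_lookups = [], 0
    while pos < len(index) and index[pos][0] == key:
        pk = index[pos][1]
        table_lookups += 1                # go back to the table by primary key
        if rows[pk] == target:            # compare the full value stored in the row
            matches.append(pk)
        pos += 1
    return matches, table_lookups

target = "zhangsan@example.com"
full = lookup(rows, build_index(rows), target)           # ([1], 1)
short = lookup(rows, build_index(rows, 6), target, 6)    # ([1], 3)
longer = lookup(rows, build_index(rows, 8), target, 8)   # ([1], 1)
print(full, short, longer)
```

With this sample data, the 6-character prefix forces three table lookups because three addresses start with zhangs, while the 8-character prefix needs only one, the same as the full-field index.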

So the question is, how do we choose the length of the prefix?

1. Count the distinct values in the field: select count(distinct email) as L from t;

2. Count the distinct values for several prefix lengths in one query:

select
 count(distinct left(email,4)) as L4,
 count(distinct left(email,5)) as L5,
 count(distinct left(email,6)) as L6,
 count(distinct left(email,7)) as L7
from t;

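Then compare L4 through L7 with L. A shorter prefix merges some distinct values, so decide what loss of distinctness is acceptable for your workload and pick the shortest prefix length that stays within it.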
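The selection rule can be sketched in Python as follows. The sample addresses, the function names and the default 5% loss threshold are assumptions made for this illustration. The article does not prescribe a particular ratio.

```python
# Hypothetical sample of email values (invented for illustration).
emails = [
    "zhangsan@example.com",
    "zhangsen@example.com",
    "zhangshu@example.com",
    "zhaoliu@example.com",
    "lisi@example.com",
    "lisa@example.com",
]

def distinct_prefix_counts(values, lengths):
    """Python analogue of count(distinct left(email, n)) for each n."""
    return {n: len({v[:n] for v in values}) for n in lengths}

def shortest_prefix(values, lengths, max_loss=0.05):
    """Shortest length that keeps at least (1 - max_loss) of the distinct values.

    max_loss is an assumed tolerance; returns None if no candidate qualifies.
    """
    total = len(set(values))              # same role as L in the SQL above
    counts = distinct_prefix_counts(values, lengths)
    for n in sorted(lengths):
        if counts[n] >= total * (1 - max_loss):
            return n
    return None

print(distinct_prefix_counts(emails, [4, 5, 6, 7]))  # {4: 4, 5: 4, 6: 4, 7: 6}
print(shortest_prefix(emails, [4, 5, 6, 7]))         # 7
```

On this sample, prefixes of 4 to 6 characters merge the three zhangs addresses into one value, so 7 is the first length that keeps all six values distinct.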

That settles the prefix length, but a prefix index has another drawback: it cannot act as a covering index.
With the full-field index, the query select id,email from t where email='[email protected]'; can be answered from the index alone, because index1 already holds both id and email, so there is no need to go back to the table.

With a prefix index, however, the index entry stores only the truncated value, so MySQL cannot tell whether the stored prefix is the complete email value. Even when the prefix is long enough to contain every value in full, MySQL still goes back to the table by primary key to read and check the email.

So although a prefix index saves space and can stay efficient, it gives up the benefit of a covering index. Whether to use one depends on which of the two matters more for your queries.

Other ways to create string indexes

In practice, not every string field suits a prefix index. ID card numbers and IP addresses are two examples. The leading digits of an ID card number are usually identical for people from the same area, so a prefix distinguishes rows poorly. IP addresses are usually converted to numbers for storage.
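As a rough illustration of storing IP addresses as numbers, the following Python sketch converts between dotted IPv4 text and an integer using the standard ipaddress module. The helper names are invented for the example. In MySQL itself the built-in INET_ATON() and INET_NTOA() functions perform a comparable conversion.

```python
import ipaddress

def ip_to_int(ip):
    """Convert dotted IPv4 text to an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(ip))

def int_to_ip(n):
    """Convert the integer back to dotted IPv4 text."""
    return str(ipaddress.IPv4Address(n))

print(ip_to_int("192.168.1.1"))   # 3232235777
print(int_to_ip(3232235777))      # 192.168.1.1

# Numeric order matches address order, which string order does not.
print(ip_to_int("10.0.0.2") < ip_to_int("10.0.0.10"))  # True
print("10.0.0.2" < "10.0.0.10")                        # False
```

The integer takes 4 bytes and compares in address order, so it is both smaller to index than the text form and usable in range conditions.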

For ID card numbers, we can either store the number reversed and create a prefix index on it, or add a field that holds the crc32() hash of the number (an int value) and index that field.

Reverse: select field_list from t where id_card = reverse('input_id_card_string');

crc32: select field_list from t where id_card_crc=crc32('input_id_card_string') and id_card='input_id_card_string';

Both methods are reasonably efficient. Both support equality lookups, but neither supports range queries.

The reverse method requires calling reverse() on each query, and because it still relies on a prefix index, it may need more table lookups than the hash method.

The hash method requires an extra field to hold the crc32() value, an index on that field, and a call to crc32() on each query. (Note: crc32() results are not guaranteed to be unique. Different strings can produce the same value, although the probability is small, which is why the query also compares id_card itself.) It needs few table lookups, usually close to one per query.
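Both ideas can be sketched in Python. The ID numbers below are made up, the helper names are invented for the example, and zlib.crc32 is used here on the assumption that it computes the same standard CRC-32 checksum as MySQL's crc32().

```python
import zlib

# Hypothetical ID card numbers that share a regional prefix (invented values).
id_cards = [
    "110101199001011234",
    "110101199001021235",
    "110101198512123456",
]

def distinct_prefixes(values, n):
    """Number of distinct n-character prefixes among the values."""
    return len({v[:n] for v in values})

def crc32_of(s):
    """Unsigned 32-bit CRC of the string's UTF-8 bytes."""
    return zlib.crc32(s.encode("utf-8"))

# Reverse method: the same 6-character prefix becomes far more distinctive.
print(distinct_prefixes(id_cards, 6))                        # 1
print(distinct_prefixes([s[::-1] for s in id_cards], 6))     # 3

# Hash method: an extra id_card_crc value is stored next to each id_card.
table = [(card, crc32_of(card)) for card in id_cards]

def find_by_hash(table, target):
    """Filter on the hash first, then compare the full value to rule out collisions."""
    h = crc32_of(target)
    return [card for card, crc in table if crc == h and card == target]

print(find_by_hash(table, id_cards[0]))  # ['110101199001011234']
```

The second comparison in find_by_hash plays the same role as the id_card='input_id_card_string' condition in the SQL above: a hash match alone is not proof that the strings are equal.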

Finally

There are generally several ways to create an index on a string field:

1. If the string is short, index the whole field directly.

2. If the string is long and its prefix is distinctive, create a prefix index.

3. If the string is long and its prefix is not distinctive, store it reversed and create a prefix index, or index a hash of it (neither approach supports range queries).

4. Handle special strings according to their nature, such as storing IP addresses as numbers.
