In-depth study of MySQL composite index

In-depth study of MySQL composite index

A composite index (also called a joint index) is an index created on multiple columns. The most important thing when creating a composite index is the choice of column order, which affects whether the index can be used or how many predicate conditions can use the index. The use of composite indexes follows the leftmost matching principle. Only when the columns on the left of the index are matched can the subsequent columns continue to match. This article mainly explores the creation order and usage of composite indexes.

1. Concept of composite index

An index created on a single column is called a single-column index, and an index created on two or more columns is called a composite index. It is relatively simple to create an index on a single column. Usually, you only need to consider the selectivity of the column. The better the selectivity, the more dispersed the data is, and the better the performance of the created index will be. Usually, the formula for calculating the selectivity of a column is:
selectivity = the number of records returned after applying the predicate condition / the number of records returned without applying the predicate condition. The selectivity value range is (0,1]. The smaller the value, the better the selectivity.
A composite index (also called a joint index) is an index created on multiple columns. The most important thing when creating a composite index is the choice of column order, which affects whether the index can be used or how many predicate conditions can use the index. The use of composite indexes follows the leftmost matching principle. Only when the columns on the left of the index are matched can the subsequent columns continue to match.

2. When to use composite index columns

Composite indexes follow the leftmost matching principle. Only when the leftmost column in the index matches can the next column be matched. If the left column is used for non-equal value query, the column on the right side of the index will not be used for query or sorting.

Experiment: When to use composite indexes

Which fields in the composite index are used is a question we are very concerned about. A classic example on the Internet:

-- Create a test table CREATE TABLE t1(
c1 CHAR(1) not null,
c2 CHAR(1) not null,
c3 CHAR(1) not null,
c4 CHAR(1) not null,
c5 CHAR(1) not null
)ENGINE innodb CHARSET UTF8;

-- Add index alter table t1 add index idx_c1234(c1,c2,c3,c4);

--Insert test datainsert into t1 values('1','1','1','1','1'),('2','2','2','2','2'),
('3','3','3','3','3'),('4','4','4','4','4'),('5','5','5','5','5');

You need to explore which of the following query statements use the index idx_c1234, and which fields of the index are used?

(A) where c1=? and c2=? and c4>? and c3=?

(B) where c1=? and c2=? and c4=? order by c3

(C) where c1=? and c4=? group by c3,c2

(D) where c1=? and c5=? order by c2,c3

(E) where c1=? and c2=? and c5=? order by c2,c3

(F) where c1>? and c2=? and c4>? and c3=?

Option A:

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1='2' and c2='2' and c4>'1' and c3='2';
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------+
| 1 | SIMPLE | t1 | NULL | range | idx_c1234 | idx_c1234 | 12 | NULL | 1 | 100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------+

The index length used is 12, which means that all four fields use the index. Since c1, c2, and c3 are all equal value queries, the following c4 column can also be used.

Note: In utf8 encoding, the length of an index is 3. Here, 12 means that all four fields use this index.

Option B:

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1='2' and c2='2' and c4='2' order by c3;
+----+-------------+-------+------------+------+---------------+-----------+-------------+------+----------+----------+----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+-------------+------+----------+----------+----------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_c1234 | idx_c1234 | 6 | const,const | 1 | 20.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-----------+-------------+------+----------+----------+----------------------+

The index length used is 6, which means that two fields use the index. According to the leftmost use principle, c1 and c2 use indexes. Because there is no c3 predicate condition in the query, the index value is interrupted after using c2, resulting in only using c1 and c2 columns. The SQL here uses order by sorting, but there is no filesort keyword in the Extra part of the execution plan, which means that data can be read in the order of the c3 field in the index.

Note that although the c3 field in the index is not placed at the end of the index, the ordered nature of the c2 field in the index is indeed used because the "fileasort" keyword does not appear in the Extra part of the execution plan. Why is this? The Index Condition Pushdown (ICP) optimization introduced in MySQL 5.6 is used here. The core idea is to use the fields in the index to filter data. Let's sort out the differences between not using ICP and using ICP:

If ICP optimization is not used, the SQL execution steps are as follows:

1. Use index columns c1 and c2 to obtain row data that meets the conditions. where c1='2' and c2='2'

2. Return to the table to query data and use where c4='2' to filter data

3. Sort and output the data

If ICP optimization is used, the SQL execution steps are as follows:

1. Use index columns c1 and c2 to obtain row data that meets the conditions. where c1='2' and c2='2'

2. Use where c4='2' in the index to filter data

3. Because the data is ordered, directly retrieve the data that meets the conditions in order

Option C:

mysql> explain select c2,c3 from t1 where c1='2' and c4='2' group by c3,c2;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_c1234 | idx_c1234 | 3 | const | 2 | 14.29 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+-----------------------------------------------------------+

The index length used is 3, which means that one field uses the index. According to the leftmost use principle, c1 uses the index. Because there is no c2 predicate condition in the query, the index value is interrupted after using c1, resulting in only using the c1 column. The SQL execution process is:

1. Use the index in column c1 to find all rows where c1='2', then return to the table and use c4='2' to filter out unmatched data
2. Based on the results of the previous step, sort c3 and c2 in the results to obtain continuously changing data. At the same time, create a temporary table inside the database to store the results of group by.

C Option Extensions:

mysql> explain select c2,c3 from t1 where c1='2' and c4='2' group by c2,c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+--------------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_c1234 | idx_c1234 | 3 | const | 2 | 14.29 | Using where; Using index |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+--------------------------+

The index length used is 3, which means that one field uses the index. According to the leftmost use principle, c1 uses the index.

Option D:

mysql> explain select c2,c3 from t1 where c1='2' and c5='2' order by c2,c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+------------------------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_c1234 | idx_c1234 | 3 | const | 2 | 14.29 | Using index condition; Using where |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+------------------------------------+

The index length used is 3, which means that all fields use the index. According to the leftmost use principle, c1 uses the index. Because there is no c2 predicate condition in the query, the index value is interrupted after using c1, resulting in only using the c1 column.

D option expansion:

mysql> explain select c2,c3 from t1 where c1='2' and c5='2' order by c3,c2;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_c1234 | idx_c1234 | 3 | const | 2 | 14.29 | Using index condition; Using where; Using filesort |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+------+----------+----------------------------------------------------+

The index length used is 3, which means that all fields use the index. According to the leftmost use principle, c1 uses the index. Because there is no c2 predicate condition in the query, the index value is interrupted after using c1, resulting in only using the c1 column.

Option E:

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1='2' and c2='2' and c5='2' order by c2,c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+------------------------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_c1234 | idx_c1234 | 6 | const,const | 2 | 14.29 | Using index condition; Using where |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+------------------------------------+

The index length used is 6, which means that both fields use indexes. According to the leftmost use principle, c1 and c2 use indexes. Here, the SQL uses order by sorting, but there is no filesort keyword in the Extra part of the execution plan, which means that data can be read in the order of the c3 field in the index (c2 is a constant).

Option F:

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1>'4' and c2='2' and c3='2' and c4='1';
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------+
| 1 | SIMPLE | t1 | NULL | range | idx_c1234 | idx_c1234 | 3 | NULL | 1 | 20.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------+

The index length used is 3, which means that all fields use the index. According to the leftmost use principle, c1 uses the index. Here, c1 uses an inequality query, which causes the subsequent c2 query to be unable to use the index. This case is very worthy of attention. When the predicate condition contains equal query and range query, if the range query is before the index, the equal query will not be able to use the index; if the equal query is before and the range query is after, both can use the index.

(III) How to create a composite index

The difficulty in creating a composite index lies in the selection of field order. My views are as follows:

  • If there are equal value queries and sorting, when creating a composite index, put the equal value query field first and the sorting field last;
  • If there are multiple equal value queries, the ones with better selectivity are placed first, and the ones with poor selectivity are placed at the end;
  • If there are equal value queries, range queries, and sorting. Equal value queries are placed first, and range queries and sorting need to determine the index order based on actual conditions;

In addition, there are several specifications about composite indexes in the Alibaba Java Development Manual - 2020 Latest Songshan Edition. Let's take a look:

1. If there is an order by scenario, please take advantage of the orderliness of the index. The field after order by is part of the composite index and is placed at the end of the composite index to avoid filesort and affect query performance.

Positive example: where a=? b=? order by c; index a_b_c

Counterexample: If the index has a range query, the index order cannot be used. For example: where a>10 order by b; the index a_b cannot be sorted.

2. When building a composite index, the one with the highest discrimination is on the far left. If where a=? and b=?, the value of column a is almost unique, then you only need to build a single-column index idx_a.

Note: When there are mixed judgment conditions with equal signs and non-equal signs, please put the column with the equal sign condition in front when creating the index. For example: where c>? and d=?, then even if the discrimination of c

Higher, we must also put d at the front of the index, that is, create the index idx_d_c.

Experiment: How to create a composite index

Some documents talk about the rules for creating composite indexes: ESR principle: put the fields with exact (Equal) matching at the front, the sorting (Sort) conditions in the middle, and the fields with range (Range) matching at the end. Next, let's explore whether this approach is correct.

Example: There is an employee table employees

mysql> show create table employees;
+-----------+------------------------------
| Table | Create Table                                                                   
+-----------+-------------------------------------
| employees | CREATE TABLE `employees` (
 `emp_no` int(11) NOT NULL,
 `birth_date` date NOT NULL,
 `first_name` varchar(14) NOT NULL,
 `last_name` varchar(16) NOT NULL,
 `gender` enum('M','F') NOT NULL,
 `hire_date` date NOT NULL,
 PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-----------+-------------------------------------

-- The data volume is about 300,000 rowsmysql> select count(*) from employees;
+----------+
| count(*) |
+----------+
| 300024 |
+----------+

Now we need to query the employees whose first_name is "Ebbe" who joined the company after 1998 and sort them in ascending order by date of birth.

The SQL statement is as follows:

select emp_no,birth_date,first_name,last_name,gender,hire_date 
from employees 
where hire_date >= '1998-01-01'
and first_name = 'Ebbe'
order by birth_date;

In order to optimize the performance of this SQL statement, you need to create an index on the table. In order to ensure that both where and order by use the index, you decide to create a composite index. The creation order is as follows:

(A)hire_date,first_name,birth_date

(B)hire_date,birth_date,first_name

(C)first_name,hire_date,birth_date

(D)first_name,birth_date,hire_date

(E)birth_date,first_name,hire_date

(F) birth_date, hire_date, first_name

Determine which order is optimal for creating indexes.

Note:

1. The date type takes up 3 bytes of space, hire_date and birth_date both take up 3 bytes of space.

2. first_name is a variable-length field, which uses 2 more bytes. If NULL values ​​are allowed, 1 more byte is required, occupying 16 bytes.

Option A: hire_date, first_name, birth_date

create index idx_a on employees(hire_date,first_name,birth_date);

Its execution plan is as follows:

+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | employees | NULL | range | idx_a | idx_a | 19 | NULL | 5678 | 10.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+

Here, the key_len length is 19, which is puzzling. Since hire_date is a non-equal value query, theoretically key_len should be 3. By using MySQL workbench to view the execution plan, we can also find that the index only uses the hire_date column (as shown below). Why is it 19 and not 3? It is really puzzling. I have been thinking about it for a long time but still can't figure it out. If anyone knows the answer, I hope you experts can answer it.

Option B: hire_date,birth_date,first_name

To avoid interference, delete the index idx_a created above, and then create idx_b.

create index idx_b on employees(hire_date,birth_date,first_name);

Its execution plan is as follows:

+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | employees | NULL | range | idx_b | idx_b | 3 | NULL | 5682 | 10.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+

Here, the length of key_len is 3, and hire_date is a non-equal value query, which makes the subsequent index columns unusable.

Option C: first_name, hire_date, birth_date

To avoid interference, delete the index idx_b created above, and then create idx_c.

create index idx_c on employees(first_name,hire_date,birth_date);

Its execution plan is as follows:

+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | employees | NULL | range | idx_c | idx_c | 19 | NULL | 5 | 100.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+------+----------+---------------------------------------+

Here, the key_len length is 19, and first_name is an equal value query. The hire_date column can continue to be used. However, the hire_date column is a non-equal value query, which makes it impossible for the index to continue to use birth_date.

Option D: first_name,birth_date,hire_date

To avoid interference, delete the index idx_c created above and then create idx_d.

create index idx_d on employees(first_name,birth_date,hire_date);

Its execution plan is as follows:

+----+-------------+-----------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | employees | NULL | ref | idx_d | idx_d | 16 | const | 190 | 33.33 | Using index condition |
+----+-------------+-----------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+

Here, the length of key_len is 16, first_name is an equal value query, and birth_date is not used in the predicate filtering, resulting in only the first_name column using the upper index, but the birth_date column is used for sorting. The above execution plan shows that the SQL is not sorted in the end, indicating that the data is retrieved from the index in order according to birth_date.

Option E: birth_date, first_name, hire_date

To avoid interference, delete the index idx_d created above, and then create idx_e.

create index idx_e on employees(birth_date,first_name,hire_date);

Its execution plan is as follows:

+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
| 1 | SIMPLE | employees | NULL | ALL | NULL | NULL | NULL | NULL | 299468 | 3.33 | Using where; Using filesort |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+

The index is not used here, which means that the sort column cannot be used if it is placed at the front of the composite index.

Option F: birth_date, hire_date, first_name

To avoid interference, delete the index idx_e created above, and then create idx_f.

create index idx_f on employees(birth_date,hire_date,first_name);

Its execution plan is as follows:

+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
| 1 | SIMPLE | employees | NULL | ALL | NULL | NULL | NULL | NULL | 299468 | 3.33 | Using where; Using filesort |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+

As with option E, the index is not used here, which means that the sort column cannot be used when it is placed at the front of the composite index.

Through the above 6 index tests, we found that when the equal query column and the range query column are placed before the composite index, the composite index can be used, but the columns used may be different. Which is the best way to create an index? MySQL's query optimizer selects the best execution plan based on cost. Let's take a look at the execution costs of the above six indexes.

Index cost
---------- ------------
idx_a 8518
idx_b 8524
idx_c 13
idx_d 228
idx_e 78083
idx_f 78083

From the above expenses, we can see:

  • idx_a and idx_b: The index uses the beginning of the range query field, so the index can only use the first column and cannot eliminate sorting, resulting in high overhead;
  • idx_c and idx_d: The index starts with the equal value query field, and the range query and sorting are at the end, so the overhead is minimal;
  • idx_e and idx_f: The index starts with the sort field, which causes the index to be unusable and requires a full table scan, which is very expensive.

Furthermore, how to choose idx_c and idx_d? idx_c uses the index to perform equal value query + range query, and then sorts the data; idx_d uses the index to perform equal value query + index condition pushdown query, and then directly obtains the data in order. Both methods have their own advantages and disadvantages. Let's take a look at another example:

Add the above six indexes to the table and see which index the following SQL will select.

mysql> show index from employees;
+-----------+------------+----------+--------------+--------------+--------------+-------------+----------+--------+------+------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------+--------------+--------------+--------------+-------------+----------+--------+------+------------+---------+------------+
| employees | 0 | PRIMARY | 1 | emp_no | A | 299468 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_a | 1 | hire_date | A | 5355 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_a | 2 | first_name | A | 290745 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_a | 3 | birth_date | A | 299468 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_b | 1 | hire_date | A | 6237 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_b | 2 | birth_date | A | 297591 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_b | 3 | first_name | A | 299468 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_c | 1 | first_name | A | 1260 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_c | 2 | hire_date | A | 293517 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_c | 3 | birth_date | A | 299468 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_d | 1 | first_name | A | 1218 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_d | 2 | birth_date | A | 294525 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_d | 3 | hire_date | A | 298095 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_e | 1 | birth_date | A | 4767 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_e | 2 | first_name | A | 292761 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_e | 3 | hire_date | A | 299468 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_f | 1 | birth_date | A | 4767 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_f | 2 | hire_date | A | 297864 | NULL | NULL | | BTREE | | |
| employees | 1 | idx_f | 3 | first_name | A | 299468 | NULL | NULL | | BTREE | | |
+-----------+------------+----------+--------------+--------------+--------------+-------------+----------+--------+------+------------+---------+------------+

SQL1

mysql> explain select emp_no,birth_date,first_name,last_name,gender,hire_date 
from employees 
where hire_date >= '1998-01-01'
and first_name = 'Ebbe'
order by birth_date;
+----+-------------+-----------+------------+-------+-------------------------+-------+---------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+-------------------------+-------+---------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | employees | NULL | range | idx_a,idx_b,idx_c,idx_d | idx_c | 19 | NULL | 5 | 100.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+-------------------------+-------+---------+------+------+----------+---------------------------------------+

Here, MySQL automatically selects idx_c because the two fields first_name+hire_date have filtered the data to only 5 rows. Since the data is small, sorting is very fast. On the contrary, if you select idx_d, you need to first filter out 190 rows of data that meet the conditions through the first_name field, and then use hire_date to filter the data, which is a lot of work.

SQL2

mysql> explain select emp_no,birth_date,first_name,last_name,gender,hire_date 
from employees 
where hire_date >= '1980-01-01'
and first_name = 'Ebbe'
order by birth_date;
+----+-------------+-----------+------------+------+-------------------------+-------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+-------------------------+-------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | employees | NULL | ref | idx_a,idx_b,idx_c,idx_d | idx_d | 16 | const | 190 | 50.00 | Using index condition |
+----+-------------+-----------+------------+------+-------------------------+-------+---------+-------+------+----------+-----------------------+

If idx_c is selected, after the first_name+hire_date fields filter the data through the index, the data volume is large, resulting in very slow sorting. MySQL automatically selects idx_d, filters the data by the first_name column of the index, and filters the hire_date field by pushing down the index condition, and then retrieves the data in order from the index. Relatively speaking, since idx_d does not require sorting, the speed will be faster.

(IV) Summary of composite indexes

1. When creating a composite index, if there are multiple equal value queries, put the column with good selectivity at the front and the column with poor selectivity at the back;

2. When creating a composite index, if it involves equal value query and range query, no matter how good the selectivity of the column of non-equal value query is, the field of equal value query should be placed before the non-equal value query;

3. When creating a composite index, if it involves equal value query, range query, and sorting (order by, group by), the equal value query should be placed at the front of the index. The order of range query and sorting should be determined based on the actual scenario. If the range query comes first, the order of the index cannot be used, and filesort is required. This is suitable for SQL that returns fewer results, because fewer results means less sorting overhead. If sorting comes first, the order of the index can be used, but it is necessary to go back to the table (or push down the index condition) to query the data. This is suitable for SQL that returns more results, because there is no need to sort and the data can be retrieved directly.

4. When creating a composite index, do not put the columns of order by and group by at the front of the index, because the where clause is always executed before the order by clause in the query.

5. Using an index for range query will cause subsequent index fields to be unusable. If there is a sort, the filesort sort cannot be eliminated. Example: a_b_c index, where a>? and b = ? order by c, then a can be used, but b cannot be used, and the c field needs filesort.

Summarize

This is the end of this article about MySQL compound indexes. For more information about MySQL compound indexes, please search previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Summary of MySQL composite indexes
  • Implementation of MySQL joint index (composite index)
  • Optimize Mysql limit, reference the compound index of fast paging from one million to ten million and apply it to lightweight framework
  • Optimize Mysql limit, reference the compound index of fast paging from one million to ten million and apply it to lightweight framework
  • How does the composite index of MySQL take effect?

<<:  Implementation of Docker packaging image and configuration modification

>>:  How to implement Echats chart large screen adaptation

Recommend

jQuery plugin to achieve image suspension

This article shares the specific code of the jQue...

How to deploy Redis 6.x cluster through Docker

System environment: Redis version: 6.0.8 Docker v...

Docker Compose installation and usage steps

Table of contents 1. What is Docker Compose? 2. D...

Summary of data interaction between Docker container and host

Preface When using Docker in a production environ...

Linux nohup command principle and example analysis

nohup Command When using Unix/Linux, we usually w...

XHTML introductory tutorial: Web page Head and DTD

Although head and DTD will not be displayed on th...

JavaScript pie chart example

Drawing EffectsImplementation Code JavaScript var...

Two ways to understand CSS priority

Method 1: Adding values Let's go to MDN to se...

Pure CSS drop-down menu

Achieve results Implementation Code html <div ...

Basic usage of UNION and UNION ALL in MySQL

In the database, both UNION and UNION ALL keyword...

VMware Workstation Installation (Linux Kernel) Kylin Graphic Tutorial

This article shares with you how to install Kylin...

CSS3 simple cutting carousel picture implementation code

Implementation ideas First, create a parent conta...

Several techniques for playing sounds with CSS

CSS is the realm of style, layout, and presentati...