Advantages of MySQL covering indexes

A common suggestion is to create indexes for the columns in WHERE conditions, but that tells only part of the story. Indexes should be designed for the whole query, not just its WHERE clause. Indexes do help find rows efficiently, but MySQL can also use an index to retrieve a column's data without reading the row at all. After all, the leaf nodes of an index contain the values they index. Why read the row when reading the index gives you the data you want? An index that contains all the data needed to satisfy a query is called a covering index.

Covering indexes can be a very powerful tool and can significantly improve performance. Consider the benefits of reading only the index instead of the data rows:

  • Index entries are usually much smaller than full rows, so MySQL reads far less data when it reads only the index. This matters a great deal for cached (in-memory) workloads, where much of the response time is spent copying data. It also helps I/O-bound workloads, because indexes are smaller than the row data and fit in memory better (this is especially true for the MyISAM engine, which can pack indexes to make them even smaller).
  • Indexes are stored in order of their index values, so an I/O-bound range access needs less I/O than fetching each row from a random disk location. For some storage engines, such as MyISAM and Percona XtraDB, you can even run OPTIMIZE on the table to get fully sorted indexes, which lets simple range queries use purely sequential index access.
  • Some storage engines, such as MyISAM, cache only indexes in MySQL's memory. Because the operating system caches the data for MyISAM, accessing a row usually requires a system call. This can have a significant performance impact, especially for cached workloads, where the system call is the most expensive part of data access.
  • Because of InnoDB's clustered index, covering indexes are especially helpful for InnoDB tables. InnoDB's secondary indexes store the row's primary key value in their leaf nodes. A secondary index that covers the query therefore avoids the extra lookup in the primary key (clustered) index.

In all of these cases, satisfying a query from the index alone typically costs much less than looking up the data rows. Note that not every type of index can act as a covering index. A covering index must store the values of the columns it indexes. Hash, spatial, and full-text indexes do not store these values, so MySQL can use only B-Tree indexes to cover queries. Moreover, different storage engines implement covering indexes in different ways, and not all storage engines support them (for example, the Memory storage engine currently does not).

When a query is covered by an index, you will see "Using index" in the Extra column of the EXPLAIN output. For example, suppose the store_goods table has a multi-column index on (shop_id, goods_category_id1). MySQL can use the index alone when the query returns only these two columns:

EXPLAIN SELECT `goods_category_id1`,`shop_id` FROM `store_goods` WHERE 1
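
The check above can be tried without a MySQL server. The following is a minimal sketch that uses SQLite through Python's built-in sqlite3 module as a stand-in, not MySQL itself. SQLite reports a covered query as "USING COVERING INDEX" in EXPLAIN QUERY PLAN, which plays roughly the role of MySQL's "Using index". The columns goods_id and goods_name and the index name idx_shop_category are assumptions made for illustration; they do not come from the original table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_goods (
        goods_id INTEGER PRIMARY KEY,          -- hypothetical column
        shop_id INTEGER NOT NULL,
        goods_category_id1 INTEGER NOT NULL,
        goods_name TEXT                        -- hypothetical column outside the index
    );
    -- Hypothetical index name; the column list matches the article.
    CREATE INDEX idx_shop_category ON store_goods (shop_id, goods_category_id1);
""")


def plan(sql):
    """Return the 'detail' strings of SQLite's query plan for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Only indexed columns are selected, so the index alone can answer the query.
covered = plan(
    "SELECT goods_category_id1, shop_id FROM store_goods WHERE shop_id = 1"
)
# SELECT * also needs goods_name, which the index does not contain.
not_covered = plan("SELECT * FROM store_goods WHERE shop_id = 1")

print(covered)
print(not_covered)
```

The first plan should mention a covering index and the second should not. On a real MySQL server, the equivalent check is to run EXPLAIN on both statements and compare the Extra column.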

Index-covered queries have subtleties that can disable this optimization. The MySQL query optimizer decides before executing a query whether an index covers it. Suppose the index covers the WHERE condition but not the entire query. Even when the condition evaluates to false for a row, MySQL 5.5 and earlier versions still fetch that row, although it is not needed, and only then filter it out.

Let's look at why this happens and then at how to rewrite the query to work around the problem. The query is:

EXPLAIN SELECT * FROM products WHERE actor='SEAN CARREY' AND title like '%APOLLO%'

The index cannot cover this query, so MySQL uses it only as an ordinary index, for two reasons:

  • No index covers the query, because we select all columns from the table and no index contains all columns. In theory MySQL still has a shortcut available: the WHERE clause mentions only columns that the index covers, so MySQL could use the index to find the actor, check whether the title matches, and only then read the full rows that satisfy the conditions.
  • MySQL cannot perform the LIKE operation inside the index. This is a limitation of the storage engine API, which in MySQL 5.5 and earlier allows only simple comparisons (such as =, IN, and >=) in index operations. MySQL can evaluate prefix-matching LIKE patterns in the index because it can convert them into simple comparisons. However, a leading wildcard (the leading % in the LIKE pattern) makes it impossible for the storage engine to evaluate the match. MySQL therefore fetches the row and compares against the row's values instead of the index's values.

Both problems can be worked around with a combination of clever indexing and query rewriting. We can extend the index to (actor, title, prod_id) and rewrite the query as follows:

EXPLAIN SELECT *
FROM products
    JOIN (
        SELECT prod_id
        FROM products
        WHERE actor='SEAN CARREY' AND title LIKE '%APOLLO%'
    ) AS t1 ON (t1.prod_id=products.prod_id)

We call this a "deferred join" because it defers access to the columns. MySQL uses the covering index in the first phase of the query, when it finds the matching rows in the subquery. The index does not cover the query as a whole, but this is better than nothing.
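
To see that the deferred-join rewrite returns the same rows as the original query, here is a minimal sketch using SQLite through Python's sqlite3 module as a stand-in for MySQL. It checks only that the two queries are equivalent; it does not reproduce MySQL's execution plan or performance. The sample rows, the description column, and the index name idx_actor_title_prod are made up for illustration. The outer query selects products.* rather than * so that both result sets have the same columns, and ORDER BY is added only to make the comparison deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        prod_id INTEGER PRIMARY KEY,
        actor TEXT NOT NULL,
        title TEXT NOT NULL,
        description TEXT               -- hypothetical wide column outside the index
    );
    -- Extended index from the article; the index name is hypothetical.
    CREATE INDEX idx_actor_title_prod ON products (actor, title, prod_id);
""")

# Made-up sample data for illustration only.
conn.executemany(
    "INSERT INTO products (prod_id, actor, title, description) VALUES (?, ?, ?, ?)",
    [
        (1, "SEAN CARREY", "APOLLO TEEN", "a"),
        (2, "SEAN CARREY", "GRIT CLOCKWORK", "b"),
        (3, "PENELOPE GUINESS", "APOLLO TEEN", "c"),
        (4, "SEAN CARREY", "DRAGON APOLLO", "d"),
    ],
)

original = """
    SELECT * FROM products
    WHERE actor = 'SEAN CARREY' AND title LIKE '%APOLLO%'
    ORDER BY prod_id
"""

# Deferred join: the subquery touches only indexed columns and returns
# primary keys; the outer join then fetches the full rows for those keys.
deferred = """
    SELECT products.*
    FROM products
    JOIN (
        SELECT prod_id
        FROM products
        WHERE actor = 'SEAN CARREY' AND title LIKE '%APOLLO%'
    ) AS t1 ON t1.prod_id = products.prod_id
    ORDER BY products.prod_id
"""

rows_original = conn.execute(original).fetchall()
rows_deferred = conn.execute(deferred).fetchall()
print(rows_original == rows_deferred)
```

Whether the rewrite is actually faster depends on the server version and the data distribution, so it is worth measuring both forms on the real table.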

The effectiveness of this optimization depends on how many rows the WHERE clause finds. Assume that the products table contains one million rows, and compare the performance of the two queries on three datasets of one million rows each:

  • Case 1: There are 30,000 products with the actor "SEAN CARREY", of which 20,000 contain "APOLLO" in the title;
  • Case 2: There are 30,000 products with the actor "SEAN CARREY", of which 40 contain "APOLLO" in the title;
  • Case 3: There are 50 products with the actor "SEAN CARREY", of which 10 contain "APOLLO" in the title.

The comparison results are shown in the following table.

Dataset    Original query    Optimized query
Case 1     5 qps             5 qps
Case 2     7 qps             35 qps
Case 3     2,400 qps         2,000 qps

The results are explained as follows:

  • In the first case, the query returns a very large result set, so the optimization has no visible effect. Most of the time is spent reading and sending data.
  • In the second case, filtering in the subquery with the covering index leaves a small result set, which yields a five-fold performance improvement. The gain comes from reading only 40 full rows instead of 30,000.
  • The third case shows where the subquery is counterproductive. The set of rows left after index filtering is so small that the subquery costs more than reading the data directly from the table.

In most storage engines, an index can cover only queries that access columns that are part of the index. However, InnoDB takes this optimization a step further. Recall that InnoDB's secondary indexes store the primary key values in their leaf nodes. This means that InnoDB's secondary indexes effectively have extra columns that InnoDB can use to cover queries. For example, the sakila.actor table uses InnoDB and has an index on last_name, so this index can cover queries that retrieve the primary key column actor_id, even though that column is not technically part of the index.

EXPLAIN SELECT actor_id, last_name
FROM sakila.actor WHERE last_name = 'HOPPER'
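
A similar effect can be observed in SQLite, whose secondary indexes also carry the row's integer primary key (the rowid). The sketch below uses Python's sqlite3 module as a stand-in and does not demonstrate InnoDB itself. The table is a simplified copy of sakila.actor and the index name idx_actor_last_name is an assumption.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for InnoDB (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified copy of sakila.actor for illustration.
    CREATE TABLE actor (
        actor_id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL
    );
    -- The index lists only last_name; actor_id is not named in it.
    CREATE INDEX idx_actor_last_name ON actor (last_name);
""")

plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT actor_id, last_name FROM actor WHERE last_name = 'HOPPER'"
).fetchall()

# The fourth field of each plan row is the human-readable detail string.
details = [row[3] for row in plan_rows]
print(details)
```

The plan should report a covering index even though actor_id is not listed in the index definition. On MySQL, the corresponding confirmation is "Using index" in the Extra column of the EXPLAIN output for the query above.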
