How to optimize MySQL indexes

1. How MySQL uses indexes

Indexes are used to quickly find rows with specific column values. Without an index, MySQL must start at the first row and then walk the entire table to find the relevant rows. The bigger the table, the more it costs. If the table has an index on the relevant columns, MySQL can quickly determine where to look in the middle of the data file without having to look through all the data. This is much faster than reading each row sequentially.

Most MySQL indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are stored in B-trees. Exceptions: Indexes on spatial data types use R-trees; MEMORY tables also support hash indexes. InnoDB uses inverted lists for FULLTEXT indexes.

MySQL uses indexes to do the following:

  • To quickly find rows that match a WHERE clause.
  • To eliminate rows from consideration. If it can choose between multiple indexes, MySQL usually uses the index that finds the smallest number of rows (the most selective index).
  • To look up rows through any leftmost prefix of a multi-column index (also called a "composite index" or "joint index"). For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).
  • To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared with the same type and size. In this context, VARCHAR and CHAR are considered the same if they are declared with the same size. For example, VARCHAR(10) and CHAR(10) are the same size, but VARCHAR(10) and CHAR(15) are not.
  • For comparisons between nonbinary string columns, both columns should use the same character set. For example, comparing a utf8mb4 column with a latin1 column precludes use of an index.
  • To sort or group a table when the sorting or grouping is done on a leftmost prefix of a usable index (for example, ORDER BY key_part1, key_part2). If all key parts are followed by DESC, the key is read in reverse order.
  • In some cases, to satisfy an ORDER BY clause directly and avoid the extra sorting of a filesort operation.
  • In some cases, to retrieve values without consulting the data rows at all. (An index that provides all the results a query needs is called a covering index.) If a query uses only columns that are included in some index, the selected values can be read from the index tree, which is faster.

Finally, indexes are less important for queries on small tables, or on large tables where the query processes most or all of the rows. When a query needs to access most rows, reading sequentially is faster than working through an index.
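As a rough illustration of the covering-index and ORDER BY points above, the sketch below uses a hypothetical orders table and index name (neither appears in the original article). The actual EXPLAIN output depends on the MySQL version and on table statistics, so the comments describe what one would typically expect, not guaranteed results.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE orders (
  id INT NOT NULL AUTO_INCREMENT,
  customer_id INT NOT NULL,
  created_at DATETIME NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_customer_created (customer_id, created_at)
);

-- Every selected column is part of idx_customer_created, so the index
-- can cover the query; EXPLAIN typically shows "Using index" in Extra.
EXPLAIN SELECT customer_id, created_at
FROM orders
WHERE customer_id = 42;

-- Equality on the first key part plus ORDER BY on the next key part:
-- rows can be read in index order, so no separate filesort is needed.
EXPLAIN SELECT customer_id, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at;

-- Selecting "total" as well means the data rows must be read,
-- because that column is not stored in the index.
EXPLAIN SELECT customer_id, created_at, total
FROM orders
WHERE customer_id = 42;
```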

2. Avoid full table scans

When MySQL uses a full table scan to resolve a query, the output from EXPLAIN displays ALL in the type column. This usually happens when:

  • The table is so small that a full table scan is faster than a key lookup. This is common for tables with fewer than 10 rows and a short row length.
  • There are no usable restrictions on indexed columns in the ON or WHERE clause.
  • You are comparing indexed columns with constant values, and MySQL has calculated (based on the index tree) that the constants cover too large a part of the table, so a table scan would be faster.
  • You are using a key with low cardinality (many rows match the key value) through another column. In this case, MySQL assumes that using the key would probably require many key lookups and that a table scan would be faster.

For small tables, a table scan is usually appropriate and has a negligible impact on performance.

For large tables, you can try the following techniques to keep the optimizer from incorrectly choosing a table scan:

  • Use ANALYZE TABLE tbl_name to update the key distributions for the scanned table.
  • Use FORCE INDEX for the scanned table to tell MySQL that a table scan is very expensive compared to using the given index.
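The two techniques above might be applied as follows. This is a minimal sketch; the shipments table and the idx_warehouse index are made-up names used only for the example.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE shipments (
  id INT NOT NULL AUTO_INCREMENT,
  warehouse_id INT NOT NULL,
  shipped_on DATE NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_warehouse (warehouse_id)
);

-- Refresh the key distribution statistics the optimizer relies on.
ANALYZE TABLE shipments;

-- Inspect the plan: "ALL" in the type column means a full table scan.
EXPLAIN SELECT * FROM shipments WHERE warehouse_id = 7;

-- If the optimizer still prefers a scan, hint that a table scan is
-- very expensive compared to using idx_warehouse.
EXPLAIN SELECT * FROM shipments FORCE INDEX (idx_warehouse)
WHERE warehouse_id = 7;
```

FORCE INDEX is best treated as a last resort: if the data distribution changes later, the hint stays in the query even when a scan would have become the better plan.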

3. Column indexes

The B-tree data structure lets an index quickly find a specific value, a set of values, or a range of values, corresponding to operators such as =, >, ≤, BETWEEN, and IN in a WHERE clause.

Each storage engine defines the maximum number of indexes and the maximum index length per table. All storage engines support at least 16 indexes per table, and the total index length is at least 256 bytes.

Index prefix

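Use col_name(N) in an index definition to index only the first N characters of a string column. For InnoDB tables, a prefix can be up to 767 bytes long with the REDUNDANT or COMPACT row format, and up to 3072 bytes long with the DYNAMIC or COMPRESSED row format.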
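A prefix index could be declared and sized as in the sketch below. The customer_notes table, the index name, and the prefix lengths 10 and 20 are assumptions chosen for the example; the right length depends on your data.

```sql
-- Hypothetical table: only the first 20 characters of note are indexed.
CREATE TABLE customer_notes (
  id INT NOT NULL,
  note VARCHAR(500) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_note_prefix (note(20))
);

-- Compare how selective different prefix lengths are against the full
-- column. A ratio close to sel_full suggests the prefix is long enough.
SELECT
  COUNT(DISTINCT LEFT(note, 10)) / COUNT(*) AS sel_10,
  COUNT(DISTINCT LEFT(note, 20)) / COUNT(*) AS sel_20,
  COUNT(DISTINCT note) / COUNT(*)           AS sel_full
FROM customer_notes;
```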

Full-text index

FULLTEXT indexes are used for full-text searches. FULLTEXT indexes are supported only for the InnoDB and MyISAM storage engines, and only for CHAR, VARCHAR, and TEXT columns. Indexing is always done on the entire column, and column prefix indexes are not supported.
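A minimal full-text sketch follows, assuming InnoDB on MySQL 5.6 or later. The articles table and index name are hypothetical. Note that the column list in MATCH() must be the same as the column list of the FULLTEXT index.

```sql
-- Hypothetical table with a FULLTEXT index over two whole columns.
CREATE TABLE articles (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  title VARCHAR(200),
  body TEXT,
  PRIMARY KEY (id),
  FULLTEXT INDEX ft_title_body (title, body)
) ENGINE=InnoDB;

-- Natural-language search over the indexed columns.
SELECT id, title
FROM articles
WHERE MATCH(title, body) AGAINST('index optimization' IN NATURAL LANGUAGE MODE);
```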

Spatial index

A spatial index is a data structure that arranges spatial objects in a certain order based on their position and shape, or on the spatial relationships between them. MySQL can create such indexes on spatial data types; InnoDB and MyISAM use R-trees for them.

Indexes on the MEMORY storage engine

By default, the MEMORY storage engine uses HASH indexes, but also supports BTREE indexes.
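The index type of a MEMORY table can be chosen per index with USING. In this sketch the session_cache table and its columns are invented for the example.

```sql
-- Hypothetical MEMORY table mixing both index types.
CREATE TABLE session_cache (
  session_id CHAR(32) NOT NULL,
  user_id INT NOT NULL,
  expires_at DATETIME NOT NULL,
  -- No index type given: MEMORY defaults to HASH,
  -- which suits exact-match lookups by session_id.
  PRIMARY KEY (session_id),
  -- Explicit BTREE so that range conditions on expires_at can use it.
  INDEX idx_expires (expires_at) USING BTREE
) ENGINE=MEMORY;

-- SHOW INDEX reports the type of each index in the Index_type column.
SHOW INDEX FROM session_cache;
```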

4. Multi-column index

MySQL can create composite indexes (that is, indexes on multiple columns). An index can contain up to 16 columns.

Suppose a table is defined like this:

CREATE TABLE test (
  id INT NOT NULL,
  last_name CHAR(30) NOT NULL,
  first_name CHAR(30) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_name (last_name,first_name)
);

The idx_name index is an index built on the last_name and first_name columns. This index can be used for queries that specify a combination of last_name and first_name values, or for queries that specify only last_name values, because this index matches the leftmost prefix.

Therefore, the idx_name index can be used for the following queries:

SELECT * FROM test WHERE last_name='Jones';

SELECT * FROM test WHERE last_name='Jones' AND first_name='John';

SELECT * FROM test WHERE last_name='Jones' AND (first_name='John' OR first_name='Jon');

SELECT * FROM test WHERE last_name='Jones' AND first_name >='M' AND first_name < 'N';

However, the idx_name index cannot be used for lookups in the following queries:

SELECT * FROM test WHERE first_name='John';

SELECT * FROM test WHERE last_name='Jones' OR first_name='John';
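One way to check these claims is to run EXPLAIN against the test table defined above and compare the type and key columns. The expectations in the comments are typical, not guaranteed, since the optimizer's choice depends on the MySQL version and the data.

```sql
-- Leftmost prefix present: typically type "ref" with key "idx_name".
EXPLAIN SELECT * FROM test WHERE last_name='Jones';

-- Leftmost prefix missing: no index lookup is possible. Because every
-- column of this small table happens to be contained in idx_name,
-- InnoDB may still read the whole index (type "index"), which is a
-- scan of all entries rather than a lookup.
EXPLAIN SELECT * FROM test WHERE first_name='John';
```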

Consider the following SQL:

SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

If a multi-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer attempts to use the index merge optimization, or tries to find the most restrictive index by deciding which index excludes more rows, and uses that index to fetch the rows.
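The difference can be observed with a small experiment. The t_pairs table and index names below are hypothetical, and whether the optimizer actually picks index merge depends on the data and statistics.

```sql
-- Hypothetical table with two separate single-column indexes.
CREATE TABLE t_pairs (
  id INT NOT NULL AUTO_INCREMENT,
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_col1 (col1),
  INDEX idx_col2 (col2)
);

-- With only single-column indexes, the plan may show type
-- "index_merge", or just the one index judged more restrictive.
EXPLAIN SELECT * FROM t_pairs WHERE col1 = 1 AND col2 = 2;

-- A composite index lets both conditions be resolved in one lookup.
ALTER TABLE t_pairs ADD INDEX idx_col1_col2 (col1, col2);

EXPLAIN SELECT * FROM t_pairs WHERE col1 = 1 AND col2 = 2;
```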

If the table has a multicolumn index, the optimizer can use any leftmost prefix of the index to find rows. For example, if you have a three-column index (col1, col2, col3), then you have index search capabilities on (col1), (col1, col2), (col1, col2, col3).

If the columns do not form a leftmost prefix of the index, MySQL cannot use the index to perform the lookup.

Look at the following SQL statements:

SELECT * FROM tbl_name WHERE col1=val1;
SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

SELECT * FROM tbl_name WHERE col2=val2;
SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;

If there is a composite index on (col1, col2, col3), only the first two queries use it. The last two queries do not use the index to perform the lookup, because (col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).

5. Comparison of B-Tree and Hash indexes

B-tree index characteristics

B-tree indexes can be used for column comparisons in expressions that use the =, >, >=, <, <=, and BETWEEN operators. The index can also be used for LIKE comparisons if the argument to LIKE is a constant string that does not start with a wildcard character.
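For example, LIKE patterns of the following shape can use a B-tree index. The contacts table is a hypothetical stand-in; the column is named key_col to match the placeholders used in this article.

```sql
-- Hypothetical table with a B-tree index on key_col.
CREATE TABLE contacts (
  id INT NOT NULL,
  key_col VARCHAR(50) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_key_col (key_col)
);

-- Constant pattern with no leading wildcard: only rows whose key_col
-- starts with 'Patrick' are considered.
SELECT * FROM contacts WHERE key_col LIKE 'Patrick%';

-- Only the constant part before the first wildcard ('Pat') narrows the
-- index range; the rest of the pattern is checked row by row.
SELECT * FROM contacts WHERE key_col LIKE 'Pat%_ck%';
```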

The following statements do not use indexes:

/* the LIKE value begins with a wildcard character */
SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%';
/* the LIKE value is not a constant */
SELECT * FROM tbl_name WHERE key_col LIKE other_col;

Any index that does not span all AND levels in the WHERE clause is not used to optimize the query. In other words, to be able to use an index, a prefix of the index must be used in every AND group.

The following WHERE clauses use indexes:

... WHERE index_part1=1 AND index_part2=2 AND other_column=3

/* index = 1 OR index = 2 */
... WHERE index=1 OR A=10 AND index=2

/* optimized like "index_part1='hello'" */
... WHERE index_part1='hello' AND index_part3=5

/* Can use index on index1 but not on index2 or index3 */
... WHERE index1=1 AND index2=2 OR index1=3 AND index3=3;

The following WHERE clauses do not use indexes:

/* index_part1 is not used */
... WHERE index_part2=1 AND index_part3=2

/* Index is not used in both parts of the WHERE clause */
... WHERE index=1 OR A=10

/* No index spans all rows */
... WHERE index_part1=1 OR index_part2=10

Sometimes MySQL does not use an index even though one is available. One possible reason for this to occur is that the optimizer estimates that using the index will require accessing a large percentage of the rows in the table. (In this case, a table scan might be faster because it requires fewer lookups.) However, if such a query uses LIMIT to retrieve only certain rows, MySQL still uses the index because it can find the few rows to return more quickly.

Hash index characteristics

Hash indexes have some different characteristics from the indexes just discussed:

  • Hash indexes are only used for equality comparisons with the = or <=> operators (but they are very fast), not for comparison operators to find ranges of values. Systems that rely on such single-value lookups are called “key-value stores”; to use MySQL for such applications, use hash indexes whenever possible.
  • The optimizer cannot use hash indexes to speed up ORDER BY operations. (A hash index cannot be used to search for the next entry in order.)
  • MySQL cannot determine approximately how many rows there are between two values (an estimate the range optimizer uses to decide which index to use).
  • Only whole keys can be used to search for rows. (For a B-tree index, any leftmost prefix of the key can be used to find a row.)
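The characteristics listed above can be seen on a small MEMORY table. The lookup_cache table and idx_key index are hypothetical, and the comments describe the expected tendency, not exact EXPLAIN output.

```sql
-- Hypothetical MEMORY table with an explicit HASH index.
CREATE TABLE lookup_cache (
  cache_key INT NOT NULL,
  cache_value VARCHAR(100) NOT NULL,
  INDEX idx_key USING HASH (cache_key)
) ENGINE=MEMORY;

-- Equality comparison: the hash index can be used.
EXPLAIN SELECT cache_value FROM lookup_cache WHERE cache_key = 42;

-- Range comparison: a hash index cannot locate a range of values,
-- so expect a scan here.
EXPLAIN SELECT cache_value FROM lookup_cache
WHERE cache_key BETWEEN 40 AND 50;

-- Ordering: entries are not stored in key order, so the hash index
-- cannot replace the sort.
EXPLAIN SELECT cache_value FROM lookup_cache ORDER BY cache_key;
```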

B-tree

A tree data structure that is widely used in database indexes. The structure is kept sorted at all times, allowing fast lookups for exact matches (the equals operator) and ranges (for example, the greater than, less than, and BETWEEN operators). This type of index is available for most storage engines, such as InnoDB and MyISAM.

Because a B-tree node can have many children, a B-tree is different from a binary tree, where each node can have at most 2 children.

The term B-tree is used here to refer to a general class of index designs. The B-tree structures used by MySQL storage engines may be regarded as variants, because of complexities not present in the classic B-tree design.

Hash index

A type of index designed for queries that use equality operators rather than range operators such as greater than or BETWEEN. It is available for MEMORY tables. Although hash indexes are the default for MEMORY tables for historical reasons, that storage engine also supports B-tree indexes, which are often a better choice for general-purpose queries.

6. Optimize data size

Design tables so that they take up minimal space on disk. This can provide a huge improvement by reducing the amount of data written to and read from disk. Smaller tables generally require less main memory when processing their contents during query execution. Any space reduction for the table data results in smaller indexes, which can be processed faster.

MySQL supports many different storage engines (table types) and row formats. For each table, you can decide which storage and indexing method to use. Choosing the appropriate table format for your application can significantly improve performance.

Table Columns

  • Use the most efficient (smallest) data types possible. MySQL has many specialized types that save disk space and memory. For example, use the smaller integer types where possible to get smaller tables. MEDIUMINT is often a better choice than INT, provided the values fit its range, because a MEDIUMINT column uses 25% less space (3 bytes instead of 4).
  • Declare columns NOT NULL if possible. This makes SQL operations faster by enabling better use of indexes and by eliminating the overhead of testing whether each value is NULL. It also saves some storage space, one bit per column. If you really need NULL values in your tables, use them; just avoid the default setting that allows NULL values in every column.
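Applied to a table definition, the two recommendations might look like the sketch below. The page_views table is hypothetical, and MEDIUMINT UNSIGNED is only appropriate if page IDs never exceed 16777215.

```sql
-- Hypothetical table using compact types and NOT NULL throughout.
CREATE TABLE page_views (
  page_id MEDIUMINT UNSIGNED NOT NULL,    -- 3 bytes, range 0 to 16777215
  view_date DATE NOT NULL,                -- 3 bytes
  views INT UNSIGNED NOT NULL DEFAULT 0,  -- 4 bytes
  PRIMARY KEY (page_id, view_date)
);
```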

Row Format

To further reduce space by storing table data in compressed form, specify ROW_FORMAT=COMPRESSED when creating an InnoDB table.
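A compressed InnoDB table could be created as follows. The archived_logs table and the KEY_BLOCK_SIZE value are assumptions for the example, and the statement assumes the table gets its own tablespace (innodb_file_per_table enabled, which is the default in recent MySQL versions).

```sql
-- Hypothetical table stored in compressed form.
CREATE TABLE archived_logs (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  message TEXT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB
  ROW_FORMAT=COMPRESSED
  KEY_BLOCK_SIZE=8;   -- compressed page size in KB; 8 is just a starting point
```

Compression trades CPU time for space, so it is worth measuring on a representative workload before adopting it widely.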

Indexes

  • The primary key index of a table should be as short as possible. This makes identifying each row easy and efficient. For InnoDB tables, the primary key columns are repeated in every secondary index entry, so a shorter primary key can save a lot of space if you have many secondary indexes.
  • Create only the indexes you need to improve query performance. Indexes are good for retrieval, but they slow down insert and update operations. If you access a table mostly by searching on a combination of columns, create a single composite index on them rather than a separate index for each column. The first part of the index should be the column used most often. If you always use many columns when selecting from the table, the first column in the index should be the one with the most duplicates, to obtain better compression of the index.
  • If a long string column is very likely to have a unique prefix in its first few characters, it is better to index only that prefix, using MySQL's support for indexing the leftmost part of a column. Shorter indexes are faster, not only because they require less disk space, but also because they give more hits in the index cache and therefore fewer disk seeks.

Joins

  • Declare columns that hold identical information in different tables with identical data types, to speed up joins based on the corresponding columns.
  • Keep column names simple so that you can use the same name across different tables and simplify join queries. For example, in a table named customer, use the column name name instead of customer_name. To make your names portable to other SQL servers, consider keeping them shorter than 18 characters.

Normalization

Generally, try to keep all data non-redundant (what database theory calls third normal form). Instead of repeating lengthy values such as names and addresses, assign them unique IDs, repeat those IDs across as many smaller tables as needed, and join the tables in queries by referencing the IDs in the join clause.
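A small sketch of this idea, using hypothetical city and customer_account tables: the city name is stored once, and customers refer to it through a compact ID declared with the same data type on both sides of the join.

```sql
-- Hypothetical lookup table: each city name is stored exactly once.
CREATE TABLE city (
  id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  PRIMARY KEY (id)
);

-- Customers reference the city by ID instead of repeating its name.
-- city_id has the same type as city.id, which keeps the join efficient.
CREATE TABLE customer_account (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  city_id SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_city (city_id),
  FOREIGN KEY (city_id) REFERENCES city (id)
);

-- Join the tables by referencing the ID in the join clause.
SELECT c.name, ci.name AS city_name
FROM customer_account AS c
JOIN city AS ci ON ci.id = c.city_id;
```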

7. Optimize data types

Numeric Types

For unique IDs or other values that can be represented as either strings or numbers, prefer numeric columns to string columns. Large numeric values take up fewer bytes of storage than the corresponding strings, so transferring and comparing them is faster and uses less memory.

Character and string types

  • When comparing values from different columns, declare those columns with the same character set and collation whenever possible to avoid string conversions when running queries.
  • For column values smaller than 8KB, use binary VARCHAR instead of BLOB. The GROUP BY and ORDER BY clauses can generate temporary tables, and these temporary tables can use the MEMORY storage engine if the original table does not contain any BLOB columns.
  • If a table contains string columns such as name and address, but many queries do not retrieve these columns, consider splitting the string columns into a separate table and using join queries with foreign keys when necessary. When MySQL retrieves any value from a row, it reads a block of data that contains all the columns for that row (and possibly other adjacent rows). Keeping each row small and containing only the most commonly used columns allows you to fit more rows per data block. This compact table reduces disk I/O and memory usage for common queries.
  • When using a randomly generated value as a primary key in an InnoDB table, prefix it with an ascending value, such as the current date and time, if possible. InnoDB can insert and retrieve consecutive primary key values more quickly when they are stored physically adjacent to each other.
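The last point could be implemented along these lines. This is only a sketch: the upload_tokens table, the key layout (a 14-digit timestamp followed by 32 hex characters), and the use of RANDOM_BYTES() as the random source are assumptions made for the example.

```sql
-- Hypothetical table whose primary key is a random value prefixed
-- with an ascending timestamp.
CREATE TABLE upload_tokens (
  token CHAR(46) CHARACTER SET ascii NOT NULL,
  file_name VARCHAR(255) NOT NULL,
  PRIMARY KEY (token)
);

-- 'YYYYMMDDHHMMSS' (14 characters) + 32 hex characters = 46 characters.
-- Keys generated close together in time sort next to each other, so
-- new rows tend to land at the end of the clustered index.
INSERT INTO upload_tokens (token, file_name)
VALUES (
  CONCAT(DATE_FORMAT(NOW(), '%Y%m%d%H%i%s'), HEX(RANDOM_BYTES(16))),
  'report.pdf'
);
```

A key this long is repeated in every secondary index entry, so the layout is a trade-off against the advice to keep the primary key short.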

Other

  • MySQL uses a temporary table when a statement contains an ORDER BY clause and a different GROUP BY clause, or when ORDER BY or GROUP BY in a join query uses columns from a table other than the first table in the join order.
  • MySQL has a hard limit of 4096 columns per table, but for a given table the effective maximum may be less. InnoDB has a limit of 1017 columns per table.
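Whether a query needs a temporary table can be checked with EXPLAIN. In this sketch the sales table is hypothetical, and the plan described in the comment is the typical case, not a guarantee.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE sales (
  id INT NOT NULL AUTO_INCREMENT,
  region_id INT NOT NULL,
  amount DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_region (region_id)
);

-- GROUP BY region_id but ORDER BY the aggregate: the two clauses differ,
-- so the Extra column typically shows "Using temporary; Using filesort".
EXPLAIN SELECT region_id, SUM(amount) AS total
FROM sales
GROUP BY region_id
ORDER BY total DESC;
```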
