1. Introduction Why do we need indexes? In general application systems, the read-write ratio is about 10:1, and insert operations and general update operations rarely have performance issues. In a production environment, the most common and most problematic operations we encounter are still some complex query operations, so the optimization of query statements is obviously a top priority. When it comes to speeding up queries, we have to mention indexes. What is an index? An index, also called a "key" in MySQL, is a data structure used by the storage engine to quickly find records. Indexes are important for good performance Especially when the amount of data in the table grows, the impact of index on performance becomes more and more important. Index optimization should be the most effective means to optimize query performance. Indexes can easily improve query performance by several orders of magnitude. The index is equivalent to the phonetic table of the dictionary. If you want to look up a certain word, if you don’t use the phonetic table, you need to search through hundreds of pages one by one. 30 10 40 5 15 35 66 1 6 11 19 21 39 55 100 Do you have any misunderstandings about indexing? Indexing is an important aspect of application design and development. If there are too many indexes, the performance of your application may suffer. However, if there are too few indexes, query performance will be affected. Finding a balance is crucial to the performance of the application. Some developers always think of adding indexes as an afterthought - I have always believed that this stems from a wrong development model. If you know how your data will be used, you should add indexes where needed from the beginning. Developers often use the database at the application level, such as writing SQL statements, stored procedures, etc. They may not even know the existence of indexes, or think that they can just ask the relevant DBA to add them afterwards. DBAs often do not have a good understanding of the business data flow, and adding indexes requires monitoring a large number of SQL statements to find problems. This step will definitely take much longer than the time required to initially add the index, and some indexes may be missed. Of course, the more indexes, the better. I once encountered a problem like this: the iostat of a MySQL server showed that the disk usage rate was always at 100%. After analysis, it was found that it was because the developers added too many indexes. After deleting some unnecessary indexes, the disk usage rate immediately dropped to 20%. It can be seen that adding indexes is also very technical. 2. The principle of index 1. Indexing Principle The purpose of the index is to improve query efficiency, which is the same as the catalog we use to look up books: first locate the chapter, then locate a subsection under the chapter, and then find the page number. Similar examples include: looking up a dictionary, checking train schedules, flight schedules, etc. The essence is: to filter out the final desired result by continuously narrowing the scope of the data to be obtained, and at the same time turn random events into sequential events. In other words, with this indexing mechanism, we can always use the same search method to lock the data. The same is true for databases, but it is obviously much more complicated because it not only faces equality queries, but also range queries (>, <, between, in), fuzzy queries (like), union queries (or), and so on. How should the database choose to deal with all the problems? Let’s think back to the dictionary example. Can we divide the data into segments and then query them in segments? The simplest way is to divide 1,000 pieces of data into the first section with numbers 1 to 100, the second section with numbers 101 to 200, and the third section with numbers 201 to 300. Then, to check the 250th piece of data, you only need to find the third section, thus eliminating 90% of the invalid data at once. But what if there are 10 million records, how many segments should they be divided into? Students who have a basic knowledge of algorithms will think of the search tree, which has an average complexity of lgN and has good query performance. But here we have overlooked a key issue, the complexity model is based on the cost of the same operation each time. The database implementation is relatively complex. On the one hand, the data is stored on the disk. On the other hand, in order to improve performance, part of the data can be read into the memory for calculation each time. Because we know that the cost of accessing the disk is about 100,000 times that of accessing the memory, a simple search tree is difficult to meet complex application scenarios. 2. Disk IO and Pre-reading We mentioned disk access earlier, so here is a brief introduction to disk IO and pre-reading. Disk reading relies on mechanical movement. The time spent on each data read can be divided into three parts: seek time, rotational delay, and transmission time. Seek time refers to the time required for the magnetic arm to move to the specified track, and mainstream disks are generally below 5ms; rotational delay is the disk rotation speed we often hear about. For example, a disk with 7200 rpm means it can rotate 7200 times per minute, which means it can rotate 120 times per second. The rotational delay is 1/120/2 = 4.17ms; transmission time refers to the time it takes to read or write data from the disk, which is generally a few tenths of a millisecond and can be ignored compared to the first two times. So the time to access a disk, that is, the time for a disk IO, is about 5+4.17 = 9ms, which sounds good, but you have to know that a 500-MIPS (Million Instructions Per Second) machine can execute 500 million instructions per second. Because instructions rely on the nature of electricity, in other words, the time to execute an IO can execute about 4.5 million instructions. Databases often contain hundreds of thousands, millions, or even tens of millions of data. 9 milliseconds each time is obviously a disaster. The following figure is a comparison of computer hardware delays for your reference: Considering that disk IO is a very expensive operation, the computer operating system has made some optimizations. During an IO, not only the data at the current disk address, but also the adjacent data are read into the memory buffer. This is because the principle of local pre-reading tells us that when the computer accesses data at an address, the data adjacent to it will also be accessed quickly. The data read each time by IO is called a page. The specific size of a page depends on the operating system, which is usually 4k or 8k. That is, when we read data in a page, only one IO actually occurs. This theory is very helpful for the design of index data structure. 3. Index data structure Earlier we discussed the basic principles of indexing, the complexity of databases, and the relevant knowledge of operating systems. The purpose is to make everyone understand that any data structure does not come out of thin air, and it must have its background and usage scenarios. Let's now summarize what we need this data structure to do. In fact, it is very simple, that is: each time you look up data, control the number of disk IO times to a very small order of magnitude, preferably a constant order of magnitude. So we wonder if a highly controllable multi-way search tree can meet the needs? In this way, the b+ tree came into being (the B+ tree evolved from the binary search tree, then the balanced binary tree, and the B tree). As shown in the figure above, it is a b+ tree. For the definition of b+ tree, please refer to B+ tree. Here we will only talk about some key points. The light blue block is called a disk block. You can see that each disk block contains several data items (shown in dark blue) and pointers (shown in yellow). For example, disk block 1 contains data items 17 and 35, and contains pointers P1, P2, and P3. P1 represents a disk block less than 17, P2 represents a disk block between 17 and 35, and P3 represents a disk block greater than 35. The real data exists in the leaf nodes 3, 5, 9, 10, 13, 15, 28, 29, 36, 60, 75, 79, 90, and 99. Non-leaf nodes do not store real data, but only store data items that guide the search direction. For example, 17 and 35 do not actually exist in the data table. ###b+ tree search process As shown in the figure, if you want to find data item 29, disk block 1 will be loaded from the disk to the memory first. At this time, an IO occurs. A binary search is used in the memory to determine that 29 is between 17 and 35. The P2 pointer of disk block 1 is locked. The memory time is very short (compared to the disk IO) and can be ignored. Disk block 3 is loaded from the disk to the memory through the disk address of the P2 pointer of disk block 1. The second IO occurs. 29 is between 26 and 30. The P2 pointer of disk block 3 is locked. Disk block 8 is loaded into the memory through the pointer. The third IO occurs. At the same time, a binary search is performed in the memory to find 29, and the query ends. A total of three IOs are performed. The reality is that a 3-layer b+ tree can represent millions of data. If searching millions of data only requires three IOs, the performance improvement will be huge. If there is no index, each data item will require an IO, then a total of millions of IOs will be required, which is obviously very costly. ###b+Tree Properties 1. The index field should be as small as possible: Through the above analysis, we know that the number of IO times depends on the height h of b+number. Assuming that the data in the current data table is N, and the number of data items in each disk block is m, then h=㏒(m+1)N. When the data volume N is constant, the larger m is, the smaller h is; and m = disk block size/data item size. The disk block size is the size of a data page, which is fixed. If the space occupied by the data item is smaller and the number of data items is greater, the height of the tree is lower. This is why each data item, i.e. the index field, should be as small as possible. For example, int occupies 4 bytes, which is half of bigint 8 bytes. This is why the b+ tree requires that the real data be placed in the leaf nodes rather than the inner nodes. Once placed in the inner nodes, the data items of the disk blocks will drop significantly, causing the tree to increase in height. When the data item is equal to 1, it will degenerate into a linear list. 2. The leftmost matching feature of the index: When the data item of the b+ tree is a composite data structure, such as (name, age, sex), the b+ tree builds the search tree in order from left to right. For example, when data such as (Zhang San, 20, F) is retrieved, the b+ tree will first compare the name to determine the next search direction. If the name is the same, then the age and sex are compared in turn to finally get the retrieved data; but when data without a name such as (20, F) comes, the b+ tree does not know which node to check next, because the name is the first comparison factor when the search tree is established, and it is necessary to search according to the name first to know where to query next. For example, when retrieving data like (Zhang San, F), the b+ tree can use name to specify the search direction, but the next field age is missing, so it can only find the data with the name equal to Zhang San, and then match the data with the gender of F. This is a very important property, namely the leftmost matching feature of the index. 4. Clustered index and auxiliary index In the database, the height of the B+ tree is generally 2 to 4 layers, which means that it only takes 2 to 4 IOs at most to find a row record of a certain key value, which is quite good. Because the current general mechanical hard disk can perform at least 100 IOs per second, 2 to 4 IOs means that the query time only takes 0.02 to 0.04 seconds. The B+ tree index in the database can be divided into clustered index and secondary index. The same thing about clustered indexes and auxiliary indexes is that both are in the form of B+ trees, that is, the height is balanced, and the leaf nodes store all the data. The difference between a clustered index and a secondary index is whether the leaf node stores a whole row of information. 1. Clustered Index
One of the benefits of a clustered index is that it can quickly sort and search for primary keys, and the data in the leaf nodes is exactly what the user wants to query. For example, if a user needs to search a table and query the last 10 digits of user information, since the B+ tree index is a bidirectional linked list, the user can quickly find the last data page and retrieve 10 records. The second benefit of clustered indexes: range query. That is, if you want to find data within a certain range of the primary key, you can get the page range through the upper intermediate nodes of the leaf nodes, and then directly read the data page. 2. Auxiliary index Except for the clustered index, all other indexes in the table are secondary indexes (also called non-clustered indexes). The difference from the clustered index is that the leaf nodes of the secondary index do not contain all the data of the row record. In addition to the key value, each leaf node index row also contains a bookmark. The bookmark is used to tell the InnoDB storage engine where to find the row data corresponding to the index. Since the InnoDB storage engine is an index-organized table, the bookmark of the InnoDB storage engine's auxiliary index is the clustered index key of the corresponding row data. As shown below The existence of auxiliary indexes does not affect the organization of data in the clustered index, so each table can have multiple auxiliary indexes, but only one clustered index. When searching for data through auxiliary indexes, the InnoDB storage engine traverses the auxiliary indexes and obtains the primary key of the primary key index through the leaf-level pointer, and then finds a complete row record through the primary key index. For example, if you search for data in an auxiliary index tree with a height of 3, you need to traverse the auxiliary index tree 3 times to find the specified primary key. If the height of the clustered index tree is also 3, you need to search the clustered index tree 3 times to finally find the page where the complete row data is located. Therefore, a total of 6 logical IO accesses are required to get the final data page. 5. MySQL Index Management 1. Function 1. The function of index is to speed up search 2. Primary key, unique, and joint unique in MySQL are also indexes. In addition to speeding up searches, these indexes also have constraints. 2. Commonly used indexes in MySQL Normal index INDEX: speed up search Unique index: -Primary key index: speed up search + constraint (not empty, cannot be repeated) -Unique index UNIQUE: speed up search + constraint (cannot be repeated) Combined index: -PRIMARY KEY(id,name): combined primary key index -UNIQUE(id,name): combined unique index -INDEX(id,name): combined common index 3. Two major types of indexes: hash and btree
4. Syntax for creating/deleting indexes #Method 1: Create a table CREATE TABLE table name ( Field Name 1 Data Type [Integrity Constraints…], Field name 2 Data type [integrity constraint...], [UNIQUE | FULLTEXT | SPATIAL ] INDEX | KEY [index name] (field name [(length)] [ASC | DESC]) ); #Method 2: CREATE creates an index on an existing table CREATE [UNIQUE | FULLTEXT | SPATIAL ] INDEX index name ON table name (field name [(length)] [ASC | DESC]); #Method 3: ALTER TABLE creates an index on an existing table ALTER TABLE table name ADD [UNIQUE | FULLTEXT | SPATIAL ] INDEX Index Name(Field Name[(Length)] [ASC |DESC]); #Delete index: DROP INDEX index name ON table name; #Method 1 create table t1( id int, name char, age int, sex enum('male','female'), unique key uni_id(id), index ix_name(name) #index has no key ); #Method 2 create index ix_age on t1(age); #Method 3 alter table t1 add index ix_sex(sex); #Viewmysql> show create table t1; | t1 | CREATE TABLE `t1` ( `id` int(11) DEFAULT NULL, `name` char(1) DEFAULT NULL, `age` int(11) DEFAULT NULL, `sex` enum('male','female') DEFAULT NULL, UNIQUE KEY `uni_id` (`id`), KEY `ix_name` (`name`), KEY `ix_age` (`age`), KEY `ix_sex` (`sex`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 6. Test Index 1. Preparation #1. Prepare table create table s1( id int, name varchar(20), gender char(6), email varchar(50) ); #2. Create a stored procedure to insert records in batches delimiter $$ #Declare the end symbol of the stored procedure to be $$ create procedure auto_insert1() BEGIN declare i int default 1; while(i<3000000)do insert into s1 values(i,'duoduo','male',concat('duoduo',i,'@oldboy')); set i=i+1; end while; END$$ #$$end delimiter; #Re-declare the semicolon as the end symbol#3. View the stored procedure show create procedure auto_insert1\G #4. Call the stored procedure call auto_insert1(); #Wait for a certain amount of time to see the machine performance Tip: The time it takes to create a table depends on the performance of the machine, so please be patient! 2. Test query speed without index #No index: MySQL has no idea whether there is a record with id equal to 333333333. It can only scan the data table from beginning to end. At this time, there are as many IO operations as there are disk blocks, so the query speed is very slow. mysql> select * from s1 where id=333333333; Empty set (0.33 sec) 3. If a large amount of data already exists in the table, creating an index for a field segment will be very slow 4. After the index is established, the query speed is significantly improved when the field is used as the query condition PS: 1. MySQL first searches the index table based on the B+ tree search principle and quickly finds that the record with id equal to 333333333 does not exist. The IO is greatly reduced, so the speed is significantly improved. 2. We can find the table in the data directory of mysql and see that it occupies more hard disk space. 3. Please note that as shown below V. Conclusion
7. Use indexes correctly 1. Index miss It does not mean that creating an index will definitely speed up the query. If we want to use the index to achieve the expected effect of improving the query speed, we must follow the following issues when adding the index. 1. Scope problem, or the condition is not clear. The following symbols or keywords appear in the condition: >, >=, <, <=, !=, between...and..., like, Greater than, less than Not equal! = between ...and... like 2Try to choose columns with high discrimination as indexes. The formula for discrimination is count(distinct col)/count(*), which indicates the ratio of non-duplicate fields. The larger the ratio, the fewer records we need to scan. The discrimination of a unique key is 1, while some status and gender fields may have a discrimination of 0 in the face of big data. Someone may ask, is there any empirical value for this ratio? This value is difficult to determine in different usage scenarios. Generally, we require that the value of the field to be joined is above 0.1, which means that on average, 1 scan takes 10 records. # First delete all indexes in the table so that we can focus on the problem of discrimination mysql> desc s1; +--------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+-------------+------+-----+---------+-------+ | id | int(11) | YES | MUL | NULL | | | name | varchar(20) | YES | | NULL | | | gender | char(5) | YES | | NULL | | | email | varchar(50) | YES | MUL | NULL | | +--------+-------------+------+-----+---------+-------+ rows in set (0.00 sec) mysql> drop index a on s1; Query OK, 0 rows affected (0.20 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> drop index d on s1; Query OK, 0 rows affected (0.18 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> desc s1; +--------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+-------------+------+-----+---------+-------+ | id | int(11) | YES | | NULL | | | name | varchar(20) | YES | | NULL | | | gender | char(5) | YES | | NULL | | | email | varchar(50) | YES | | NULL | | +--------+-------------+------+-----+---------+-------+ rows in set (0.00 sec) First, delete all the indexes in the table and let's focus on the problem of discrimination.
3. = and in can be in any order, for example, a = 1 and b = 2 and c = 3. You can create an (a, b, c) index in any order, and the MySQL query optimizer will help you optimize it into a form that the index can recognize. 4. Index columns cannot be used in calculations. Keep the columns "clean". For example, if from_unixtime(create_time) = '2014-05-29', the index cannot be used. The reason is simple. The b+ tree stores the field values in the data table. However, when searching, all elements need to be compared with the function, which is obviously too costly. So the statement should be written as create_time = unix_timestamp('2014-05-29') 5. and/or
When the condition on the left is met but the index field has low discrimination (name, speeding up query) 6 The leftmost prefix matching principle is a very important principle. For a combined index, MySQL will keep matching to the right until it encounters a range query (>, <, between, like) and then stop matching (which means that the range is large and the index speed is slow). For example, if a = 1 and b = 2 and c > 3 and d = 4 are created in the order of (a, b, c, d), d will not be used in the index. If an index is created in the order of (a, b, d, c), all of them can be used, and the order of a, b, d can be adjusted arbitrarily. 7. Other situations - Use function select * from tb1 where reverse(email) = 'duoduo'; - Type inconsistency If the column is a string type, the input condition must be enclosed in quotation marks, otherwise... select * from tb1 where email = 999; #If the sorting condition is an index, the select field must also be an index field, otherwise it will not be hit - order by select name from s1 order by email desc; When sorting by index, if the select query field is not indexed, the speed is still slow. select email from s1 order by email desc; Special: If you sort the primary key, it is still very fast: select * from tb1 order by nid desc; - The leftmost prefix of the combined index If the combined index is: (name, email) name and email -- hits indexname -- hits indexemail -- misses index- count(1) or count(column) instead of count(*) makes no difference in mysql- create index xxxx on tb(title(19)) #text type, length must be specified Other considerations
8. Joint Index and Covering Index 1. Joint Index A joint index refers to combining multiple columns on a table into one index. The method of creating a joint index is the same as that of creating a single index. The only difference is that there are multiple index columns, as follows mysql> create table t( -> a int, -> b int, -> primary key(a), -> key idx_a_b(a,b) -> ); Query OK, 0 rows affected (0.11 sec) So when do you need to use a joint index? Before discussing this issue, let's take a look at the results inside the joint index. In essence, a joint index is a B+ tree, the difference is that the number of key values in a joint index is not 1, but >= 2. Next, let's discuss the joint index consisting of two integer columns. Assume that the names of the two key values are a and b as shown in the figure. You can see that this is no different from the single-key B+ tree we saw before. The key values are sorted, and all data can be read logically and sequentially through the leaf nodes. For the example above, (1,1), (1,2), (2,1), (2,4), (3,1), (3,2), the data is stored in the order of (a,b). Therefore, for the query select * from table where a=xxx and b=xxx, it is obvious that the joint index (a,b) can be used. For the query select * from table where a=xxx on a single column a, the index (a,b) can also be used. But for the query of column b, select * from table where b=xxx, the (a,b) index cannot be used. In fact, it is not difficult to find the reason. The values of b on the leaf nodes are 1, 2, 1, 4, 1, 2, which are obviously not sorted. Therefore, the (a,b) index cannot be used for the query of column b. The second benefit of a joint index is that when the first key is the same, the second key has already been sorted. For example, in many cases, an application needs to query a user's shopping history, sort by time, and finally retrieve the three most recent purchase records. In this case, using a joint index can help us avoid an extra sorting operation because the index itself has already been sorted at the leaf node, as shown below. #===========Prepare the table============== create table buy_log( userid int unsigned not null, buy_date date ); insert into buy_log values (1,'2009-01-01'), (2,'2009-01-01'), (3,'2009-01-01'), (1,'2009-02-01'), (3,'2009-02-01'), (1,'2009-03-01'), (1,'2009-04-01'); alter table buy_log add key(userid); alter table buy_log add key(userid,buy_date); #==========Verification============== mysql> show create table buy_log; | buy_log | CREATE TABLE `buy_log` ( `userid` int(10) unsigned NOT NULL, `buy_date` date DEFAULT NULL, KEY `userid` (`userid`), KEY `userid_2` (`userid`,`buy_date`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | #You can see that possible_keys has two indexes available here, namely the single index userid and the joint index userid_2, but the optimizer finally chooses to use the key userid because the leaf node of the index contains a single key value, so theoretically a page can store more records mysql> explain select * from buy_log where userid=2; +----+-------------+---------+------+-----------------+--------+---------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+-----------------+--------+---------+-------+------+-------+ | 1 | SIMPLE | buy_log | ref | userid,userid_2 | userid | 4 | const | 1 | | +----+-------------+---------+------+-----------------+--------+---------+-------+------+-------+ row in set (0.00 sec) #Next, suppose you want to retrieve the three most recent purchase records for userid 1. You will use the joint index userid_2 because in this index, when userid=1, buy_date is already sorted.mysql> explain select * from buy_log where userid=1 order by buy_date desc limit 3; +----+-------------+---------+------+-----------------+----------+---------+-------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+-----------------+----------+---------+-------+------+--------------------------+ | 1 | SIMPLE | buy_log | ref | userid,userid_2 | userid_2 | 4 | const | 4 | Using where; Using index | +----+-------------+---------+------+-----------------+----------+---------+-------+------+--------------------------+ row in set (0.00 sec) #ps: If the extra sorting shows "Using filesort", it means that secondary sorting is required after the data is found. #For the joint index (a, b), the following statement can directly use the index without secondary sorting select ... from table where a=xxx order by b; #Then for the joint index (a,b,c), the following statement can also get the result directly through the index select ... from table where a=xxx order by b; select ... from table where a=xxx and b=xxx order by c; #However, for the joint index (a, b, c), the following statement cannot get the result directly through the index, and you need to perform a filesort operation yourself, because the index (a, c) is not sorted select ... from table where a=xxx order by c; 2. Covering Index The InnoDB storage engine supports covering indexes (or index covering), that is, query records can be obtained from auxiliary indexes without querying records in the clustered index. One benefit of using a covering index is that the auxiliary index does not contain all the information of the entire row record, so its size is much smaller than the clustered index, thus reducing a large number of IO operations. Note: Covering index technology was first completed and implemented in the InnoDB Plugin, which means that for InnoDB versions less than 1.0 or MySQL database versions less than 5.0, the InnoDB storage engine does not support covering index features. For the auxiliary index of the InnoDB storage engine, since it contains the primary key information, the data stored in its leaf node is (primary key1, primary key2, ..., key1, key2, ...). For example select age from s1 where id=123 and name = 'duoduo'; #The id field has an index, but the name field does not. This SQL hits the index but does not cover it. You need to look for detailed information in the clustered index. The best case is that the index field covers everything, so the entire process of indexing can be used to speed up the query and obtain results.mysql> desc s1; +--------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+-------------+------+-----+---------+-------+ | id | int(11) | NO | | NULL | | | name | varchar(20) | YES | | NULL | | | gender | char(6) | YES | | NULL | | | email | varchar(50) | YES | | NULL | | +--------+-------------+------+-----+---------+-------+ rows in set (0.21 sec) mysql> explain select name from s1 where id=1000; #No index+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+ | 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 2688336 | 10.00 | Using where | +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+ row in set, 1 warning (0.00 sec) mysql> create index idx_id on s1(id); #Create index Query OK, 0 rows affected (4.16 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> explain select name from s1 where id=1000; #The auxiliary index is hit, but the index is not covered. You also need to find name from the clustered index +----+-------------+-------+------------+------+---------------+--------+---------+-------+------+------+------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+--------+---------+-------+------+------+------+ | 1 | SIMPLE | s1 | NULL | ref | idx_id | idx_id | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+--------+---------+-------+------+------+------+ row in set, 1 warning (0.08 sec) mysql> explain select id from s1 where id=1000; #All information is found in the auxiliary index. Using index means covering index+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+ | 1 | SIMPLE | s1 | NULL | ref | idx_id | idx_id | 4 | const | 1 | 100.00 | Using index | +----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+ row in set, 1 warning (0.03 sec) Another benefit of covering indexes is for certain statistical problems. Based on the table buy_log created in the previous summary, the query plan is as follows mysql> explain select count(*) from buy_log; +----+-------------+---------+-------+---------------+--------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+--------+---------+------+------+-------------+ | 1 | SIMPLE | buy_log | index | NULL | userid | 4 | NULL | 7 | Using index | +----+-------------+---------+-------+---------------+--------+---------+------+------+-------------+ row in set (0.00 sec) The innodb storage engine does not choose to perform statistics by querying the clustered index. Since the buy_log table has a secondary index, and the secondary index is much smaller than the clustered index, selecting the secondary index can reduce IO operations. Therefore, the optimizer chooses the userid secondary index as the key above. For a joint index of the form (a, b), it is generally not possible to select the so-called query condition in b. However, if it is a statistical operation and it is a covering index, the optimizer will still choose to use the index, as follows #Joint index userid_2 (userid, buy_date). Generally, we cannot use this index according to buy_date. However, in special cases, when the query statement is a statistical operation and it is a covering index, we can also use this joint index when buy_date is used as the query condition.mysql> explain select count(*) from buy_log where buy_date >= '2011-01-01' and buy_date < '2011-02-01'; +----+-------------+---------+-------+---------------+----------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+----------+---------+------+------+--------------------------+ | 1 | SIMPLE | buy_log | index | NULL | userid_2 | 8 | NULL | 7 | Using where; Using index | +----+-------------+---------+-------+---------------+----------+---------+------+------+--------------------------+ row in set (0.00 sec) 9. Query Optimization Tool - explain I believe everyone is familiar with the explain command. For specific usage and field meanings, please refer to the official website explain-output. It should be emphasized here that rows is the core indicator. Most statements with small rows must be executed very quickly (there are exceptions, which will be discussed below). Therefore, optimization statements are basically optimizing rows.
Reference: https://www.jb51.net/article/140759.htm 10. Basic steps for slow query optimization
11. Slow log management
MySQL log management
1. bin-log 1. Enable # vim /etc/my.cnf [mysqld] log-bin[=dir\[filename]] # service mysqld restart 2. Pause //Current session only SET SQL_LOG_BIN=0; SET SQL_LOG_BIN=1; 3. View View All: # mysqlbinlog mysql.000002 By time: # mysqlbinlog mysql.000002 --start-datetime="2012-12-05 10:02:56" # mysqlbinlog mysql.000002 --stop-datetime="2012-12-05 11:02:54" # mysqlbinlog mysql.000002 --start-datetime="2012-12-05 10:02:56" --stop-datetime="2012-12-05 11:02:54" By bytes: # mysqlbinlog mysql.000002 --start-position=260 # mysqlbinlog mysql.000002 --stop-position=260 # mysqlbinlog mysql.000002 --start-position=260 --stop-position=930 4. Truncate bin-log (generate new bin-log file) a. Restart MySQL server b. # mysql -uroot -p123 -e 'flush logs' 5. Delete bin-log files # mysql -uroot -p123 -e 'reset master' 2. Query log Enable general query log # vim /etc/my.cnf [mysqld] log[=dir\[filename]] # service mysqld restart 3. Slow query log Enable slow query log # vim /etc/my.cnf [mysqld] log-slow-queries[=dir\[filename]] long_query_time=n # service mysqld restart MySQL 5.6: slow-query-log=1 slow-query-log-file=slow.log long_query_time=3 View slow query logs test: BENCHMARK(count,expr) SELECT BENCHMARK(50000000,2*3); Summarize The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM. You may also be interested in:
|
<<: Detailed comparison of Ember.js and Vue.js
>>: How to find the specified content of a large file in Linux
When you browse many websites, you will find that ...
Although Microsoft provides T4 templates, I find ...
Table of contents Overview Four examples Example ...
This article shares the specific code for impleme...
This article mainly introduces the case of Vue en...
Table of contents Preface 1. What is 2. How to us...
Sometimes we need to control whether HTML elements...
Table of contents Preface 1. MySQL main storage e...
Later, I also added how to use Jupyter Notebook i...
MySQL startup error Before installing MySQL on Wi...
Preface When making a top menu, you will be requi...
When multiple images are introduced into a page, ...
Yesterday I installed CentOS7 under VMware. I wan...
Table of contents Preface 1. Overview 2. Read-wri...
Preface: After studying the previous article, we ...