MySQL Views Simply put, a MySQL view is a shortcut for defining a SELECT command. We will use a very complex SELECT statement when querying, and we will use this statement frequently in the future. We can generate a view through this statement. A view is a virtual table that does not store data. All the data used is in the real table. The benefits of doing this are: 1. Prevent unauthorized tenants from accessing sensitive data Index principles and query optimization 1. Introduction 1. What is an index? In general application systems, the read-write ratio is about 10:1, and insert operations and general update operations rarely have performance issues. In a production environment, the most common and most problematic operations we encounter are still some complex query operations, so the optimization of query statements is obviously a top priority. When it comes to speeding up queries, we have to mention indexes. 2. Why do we need indexes? An index, also called a "key" in MySQL, is a data structure used by the storage engine to quickly find records. Indexes are critical to good performance, especially as the amount of data in the table grows, the impact of indexes on performance becomes increasingly important. 2. The principle of index One index principle The purpose of the index is to improve query efficiency, which is the same as the catalog we use to look up books: first locate the chapter, then locate a subsection under the chapter, and then find the page number. Similar examples include: looking up a dictionary, checking train schedules, flight schedules, etc. The essence is: to filter out the final desired result by continuously narrowing the scope of the data to be obtained, and at the same time turn random events into sequential events. In other words, with this indexing mechanism, we can always use the same search method to lock the data. The same is true for databases, but it is obviously much more complicated because it not only faces equality queries, but also range queries (>, <, between, in), fuzzy queries (like), union queries (or), and so on. How should the database choose to deal with all the problems? Let’s think back to the dictionary example. Can we divide the data into segments and then query them in segments? The simplest way is to divide 1,000 pieces of data into the first section with numbers 1 to 100, the second section with numbers 101 to 200, and the third section with numbers 201 to 300. Then, to check the 250th piece of data, you only need to find the third section, thus eliminating 90% of the invalid data at once. But what if there are 10 million records, how many segments should they be divided into? Students who have a basic knowledge of algorithms will think of the search tree, which has an average complexity of lgN and has good query performance. But here we have overlooked a key issue, the complexity model is based on the cost of the same operation each time. The database implementation is relatively complex. On the one hand, the data is stored on the disk. On the other hand, in order to improve performance, part of the data can be read into the memory for calculation each time. Because we know that the cost of accessing the disk is about 100,000 times that of accessing the memory, a simple search tree is difficult to meet complex application scenarios. 2. Disk IO and Pre-reading Considering that disk IO is a very expensive operation, the computer operating system has made some optimizations. During an IO, not only the data at the current disk address, but also the adjacent data are read into the memory buffer. This is because the principle of local pre-reading tells us that when the computer accesses data at an address, the data adjacent to it will also be accessed quickly. The data read each time by IO is called a page. The specific size of a page depends on the operating system, which is usually 4k or 8k. That is, when we read data in a page, only one IO actually occurs. This theory is very helpful for the design of index data structure. 3. Index data structure Any data structure does not come out of thin air. It must have its background and usage scenarios. Let's summarize what we need this data structure to do. In fact, it is very simple, that is: each time you look up data, control the number of disk IO times to a very small order of magnitude, preferably a constant order of magnitude. So we wonder if a highly controllable multi-way search tree can meet the needs? In this way, b+ tree came into being. As shown in the figure above, it is a b+ tree. For the definition of b+ tree, please refer to B+ tree. Here we will only talk about some key points. The light blue block is called a disk block. You can see that each disk block contains several data items (shown in dark blue) and pointers (shown in yellow). For example, disk block 1 contains data items 17 and 35, and contains pointers P1, P2, and P3. P1 represents a disk block less than 17, P2 represents a disk block between 17 and 35, and P3 represents a disk block greater than 35. The real data exists in the leaf nodes 3, 5, 9, 10, 13, 15, 28, 29, 36, 60, 75, 79, 90, and 99. Non-leaf nodes do not store real data, but only store data items that guide the search direction. For example, 17 and 35 do not actually exist in the data table. ###b+ tree search process As shown in the figure, if you want to find data item 29, disk block 1 will be loaded from the disk to the memory first. At this time, an IO occurs. A binary search is used in the memory to determine that 29 is between 17 and 35. The P2 pointer of disk block 1 is locked. The memory time is very short (compared to the disk IO) and can be ignored. Disk block 3 is loaded from the disk to the memory through the disk address of the P2 pointer of disk block 1. The second IO occurs. 29 is between 26 and 30. The P2 pointer of disk block 3 is locked. Disk block 8 is loaded into the memory through the pointer. The third IO occurs. At the same time, a binary search is performed in the memory to find 29, and the query ends. A total of three IOs are performed. The reality is that a 3-layer b+ tree can represent millions of data. If searching millions of data only requires three IOs, the performance improvement will be huge. If there is no index, each data item will require an IO, then a total of millions of IOs will be required, which is obviously very costly. ###b+Tree Properties 1. The index field should be as small as possible: Through the above analysis, we know that the number of IO times depends on the height h of b+number. Assuming that the data in the current data table is N, and the number of data items in each disk block is m, then h=㏒(m+1)N. When the data volume N is constant, the larger m is, the smaller h is; and m = disk block size/data item size. The disk block size is the size of a data page, which is fixed. If the space occupied by the data item is smaller and the number of data items is greater, the height of the tree is lower. This is why each data item, i.e. the index field, should be as small as possible. For example, int occupies 4 bytes, which is half of bigint 8 bytes. This is why the b+ tree requires that the real data be placed in the leaf nodes rather than the inner nodes. Once placed in the inner nodes, the data items of the disk blocks will drop significantly, causing the tree to increase in height. When the data item is equal to 1, it will degenerate into a linear list. 4. Mysql index management 1. Function The function of index is to speed up the search for primary key in MySQL. Unique and joint unique are also indexes. In addition to speeding up the search, these indexes also have the function of constraints. 2. MySQL index classification Index Classification 3. Two major types of indexes: hash and btree #When creating the above index, we can specify the index type for it. There are two types of hash type indexes: fast single query and slow range query. Btree type index: B+ tree, the more layers, the exponential growth of data volume (we use it because InnoDB supports it by default) #Different storage engines support different index types. InnoDB supports transactions, row-level locking, B-tree, Full-text and other indexes, but does not support Hash indexes. MyISAM does not support transactions, but supports table-level locking, B-tree, Full-text and other indexes, but does not support Hash indexes; Memory does not support transactions, but supports table-level locking, B-tree, Hash and other indexes, but does not support Full-text indexes; NDB supports transactions, row-level locking, and hash indexes, but does not support B-tree, Full-text, and other indexes. Archive does not support transactions, but supports table-level locking. It does not support B-tree, Hash, Full-text, and other indexes. 4. Syntax for creating/deleting indexes Make good use of the help document help create help create index ================== 1. Create an index - create it when creating the table (a few points to note) create table s1( id int, #You can add primary key here #id int index #You cannot add an index like this, because index is just an index, not a constraint. #You cannot add an index name char(20) when defining a field, like a primary key or unique constraint. age int, email varchar(30) #primary key(id) #You can also add index(id) #You can add it like this); -After creating the table, create index name on s1(name); #Add a common indexcreate unique age on s1(age);Add a unique indexalter table s1 add primary key(id); #Add a housing and construction index, that is, add a primary key constraint to the id fieldcreate index name on s1(id,name); #Add a common joint index2. Delete the indexdrop index id on s1; drop index name on s1; #Delete a common indexdrop index age on s1; #Delete a unique index, just like a common index, you don't need to add unique in front of index to delete it, you can delete it directlyalter table s1 drop primary key; #Delete the primary key (because it was added according to alter, so we also use alter to delete it) Help View 5. Test Index 1. Preparation #1. Prepare table create table s1( id int, name varchar(20), gender char(6), email varchar(50) ); #2. Create a stored procedure to insert records in batches delimiter $$ #Declare the end symbol of the stored procedure to be $$ create procedure auto_insert1() BEGIN declare i int default 1; while(i<3000000)do insert into s1 values(i,concat('egon',i),'male',concat('egon',i,'@oldboy')); set i=i+1; end while; END$$ #$$end delimiter; #Re-declare the semicolon as the end symbol#3. View the stored procedure show create procedure auto_insert1\G #4. Call the stored procedure call auto_insert1(); 2. Test query speed without index #No index: Scan from beginning to end, so the query speed is very slow mysql> select * from s1 where id=333; +------+---------+--------+----------------+ | id | name | gender | email | +------+---------+--------+----------------+ | 333 | egon333 | male | [email protected] | | 333 | egon333 | f | alex333@oldboy | | 333 | egon333 | f | alex333@oldboy | +------+---------+--------+----------------+ rows in set (0.32 sec) mysql> select * from s1 where email='egon333@oldboy'; .... ... rows in set (0.36 sec) 3. Add index #1. Be sure to create an index for the search condition field, for example, select * from t1 where age > 5; then you need to add an index for age. #2. If there is already a large amount of data in the table, creating an index will be very slow and take up hard disk space. Insertion, deletion and update are all very slow. Only query is fast. For example, create index idx on s1(id); will scan all the data in the table, and then use id as the data item to create an index structure and store it in the table on the hard disk. After the creation, the query will be very fast #3. It should be noted that the index of the innodb table will be stored in the s1.ibd file, while the index of the myisam table will have a separate index file table1.MYI 6. Use indexes correctly 1. Covering Index #Analysis select * from s1 where id=123; The SQL hits the index but does not cover it. Use id=123 to locate the location of the id in the hard disk, or in the data table, in the index data structure. However, the field we selected is *, and we need other fields besides id, which means that it is not enough to get the id through the index structure. We also need to use the id to find the other field values of the row where the id is located, which takes time. Obviously, if we only select the id, This trouble is eliminated, as follows: select id from s1 where id=123; This is a covering index. The index is hit, and the address of the id on the hard disk is directly obtained from the index data structure. The speed is very fast. 2. Joint Index 3. Index Merge #Index merging: Merge multiple single-column indexes and use #Analysis: We can use index merging to solve everything that combined indexes can do, such as create index ne on s1(name,email);#Combined index We can create indexes for name and email separately. Combined indexes can hit: select * from s1 where name='egon'; select * from s1 where name='egon' and email='adf'; Index merge can hit: select * from s1 where name='egon'; select * from s1 where email='adf'; select * from s1 where name='egon' and email='adf'; At first glance, it seems that index merging is better: it can hit more cases, but in fact it depends on the case. If name='egon' and email='adf', Then the efficiency of combined index is higher than that of index merge. If it is a single condition query, it is more reasonable to use index merge. If we want to use the index to achieve the desired effect of improving query speed, we must follow the following principles when adding indexes: 1. The leftmost prefix matching principle is a very important principle. create index ix_name_email on s1(name,email,) - Leftmost prefix match: must match from left to right select * from s1 where name='egon'; #ok select * from s1 where name='egon' and email='asdf'; #ok select * from s1 where email='[email protected]'; #No MySQL will keep matching to the right until it encounters a range query (>, <, between, like) and stops matching. For example, if a = 1 and b = 2 and c > 3 and d = 4, then if we create an index in the order (a, b, c, d), There is no need for an index for d. If you create an index of (a,b,d,c), it can all be used. The order of a,b,d can be adjusted arbitrarily. #2.= and in can be in any order, for example, a = 1 and b = 2 and c = 3. You can create an (a,b,c) index in any order. The MySQL query optimizer will help you optimize it into a form that the index can recognize. #3. Try to choose a column with high discrimination as the index. The formula for discrimination is count(distinct col)/count(*). Indicates the proportion of fields that are not repeated. The larger the proportion, the fewer records we scan. The unique key has a discrimination of 1, while some states, The gender field may have a discrimination value of 0 in the face of big data, so someone may ask, is there any empirical value for this ratio? Different usage scenarios, This value is also difficult to determine. Generally, we require the fields that need to be joined to be above 0.1, that is, an average of 10 records are scanned for 1 record. #4. Index columns cannot be used in calculations. Keep the columns "clean", such as from_unixtime(create_time) = '2014-05-29' The index cannot be used. The reason is very simple. The B+ tree stores the field values in the data table. However, when searching, the function needs to be applied to all elements for comparison, which is obviously too costly. So the statement should be written as create_time = unix_timestamp('2014-05-29'); Leftmost prefix demonstration mysql> select * from s1 where id>3 and name='egon' and email='[email protected]' and gender='male'; Empty set (0.39 sec) mysql> create index idx on s1(id,name,email,gender); #Leftmost prefix not followed Query OK, 0 rows affected (15.27 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select * from s1 where id>3 and name='egon' and email='[email protected]' and gender='male'; Empty set (0.43 sec) mysql> drop index idx on s1; Query OK, 0 rows affected (0.16 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> create index idx on s1(name,email,gender,id); #Follow the leftmost prefix Query OK, 0 rows affected (15.97 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select * from s1 where id>3 and name='egon' and email='[email protected]' and gender='male'; Empty set (0.03 sec) When the index cannot be hit, please note: - like '%xx' select * from tb1 where email like '%cn'; - Use function select * from tb1 where reverse(email) = 'wupeiqi'; - or select * from tb1 where nid = 1 or name = '[email protected]'; Special: It will be invalid only when there is a column that is not indexed in the or condition. The following will use the index select * from tb1 where nid = 1 or name = 'seven'; select * from tb1 where nid = 1 or name = '[email protected]' and email = 'alex' - Type inconsistency If the column is a string type, the input condition must be enclosed in quotation marks, otherwise... select * from tb1 where email = 999; Normal index does not mean that the index will not be used - != select * from tb1 where email != 'alex' Special: If it is a primary key, the index will still be used select * from tb1 where nid != 123 -> select * from tb1 where email > 'alex' Special: If the primary key or index is an integer type, the index will still be used select * from tb1 where nid > 123 select * from tb1 where num > 123 #If the sorting condition is an index, the select field must also be an index field, otherwise it will not be hit - order by select name from s1 order by email desc; When sorting by index, if the selected query field is not an index, the index will not be used. select email from s1 order by email desc; Special: If the primary key is sorted, the index is still used: select * from tb1 order by nid desc; - The leftmost prefix of the combined index If the combined index is: (name, email) name and email -- use index name -- use index email -- do not use index - count(1) or count(column) instead of count(*) makes no difference in mysql - create index xxxx on tb(title(19)) #text type, length must be specified - Avoid using select * - count(1) or count(column) instead of count(*) - When creating a table, try to use char instead of varchar - The order of the fields in the table is fixed-length fields first - Composite indexes instead of multiple single-column indexes (when multiple conditions are often used for query) - Use short indexes whenever possible - Use JOINs instead of sub-queries - When joining tables, make sure the condition type is consistent - Index hash values (with few duplicates) are not suitable for indexing, for example: gender is not suitable 7. Basic steps for slow query optimization 0. Run first to see if it is really slow, and set SQL_NO_CACHE 1. Where condition single table query, lock the minimum return record table. This sentence means to apply the where clause of the query statement to the table with the smallest number of records returned, and start the query. Query each field of the table separately to see which field has the highest discrimination. 2. Explain to check whether the execution plan is consistent with the expectation in 1 (start the query from the table with fewer locked records). 3. Use the SQL statement of the form of order by limit to prioritize the sorted table. 4. Understand the usage scenarios of the business side. 5. Refer to the major principles of index creation when adding indexes. 6. Observe the results. If they do not meet expectations, continue to analyze from 0. This concludes this article on the usage and differences of MySQL views and indexes. For more information about MySQL views and indexes, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: Detailed explanation of Linux commands sort, uniq, tr tools
>>: Use js to write a simple snake game
Table of contents Install Tomcat with Docker Use ...
If you want to become a Linux master, then master...
Table of contents Overview Require URL of the app...
1. Download address: http://dev.mysql.com/downloa...
1. Design source code Copy code The code is as fol...
Prepare the ingredients as shown above (ps: hadoo...
Table of contents 1. Ternary operator judgment 2....
Copy code The code is as follows: wmode parameter...
We all know that Apache can easily set rewrites f...
Prerequisite: Percona 5.6 version, transaction is...
Need to export the fields and properties of the t...
The order in which objects call methods: If the m...
What is a Viewport Mobile browsers place web page...
Preface If the query information comes from multi...
It is also very simple to deploy Django projects ...