Specific use of MySQL internal temporary tables

Specific use of MySQL internal temporary tables

UNION

UNION semantics: take the union of the two subquery results, and keep only one row of duplicate rows

Table initialization

CREATE TABLE t1(id INT PRIMARY KEY, a INT, b INT, INDEX(a));
DELIMITER ;;
CREATE PROCEDURE idata()
BEGIN
    DECLARE i INT;

    SET i=1;
    WHILE (i<= 1000) DO
        INSERT INTO t1 VALUES (i,i,i);
        SET i=i+1;
    END WHILE;
END;;
DELIMITER ;
CALL idata();

Execute Statement

(SELECT 1000 AS f) UNION (SELECT id FROM t1 ORDER BY id DESC LIMIT 2);

mysql> EXPLAIN (SELECT 1000 AS f) UNION (SELECT id FROM t1 ORDER BY id DESC LIMIT 2);
+----+--------------+------------+------------+-------+---------------+--------+---------+------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+------------+-------+---------------+--------+---------+------+------+------+----------+----------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
| 2 | UNION | t1 | NULL | index | NULL | PRIMARY | 4 | NULL | 2 | 100.00 | Backward index scan; Using index |
| NULL | UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+------------+-------+---------------+--------+---------+------+------+------+----------+----------------------------------+

The second row has Key=PRIMARY , Using temporary

  • Indicates that a temporary table is used when performing UNION RESULT on the results of the subquery

UNION RESULT

  • Create a temporary memory table. This temporary memory table has only one integer field f, and f is the primary key.
  • Execute the first subquery, get 1000, and store it in the temporary table in memory
  • Execute the second subquery
    • Get the first row with id=1000 and try to insert it into the temporary in-memory table, but the value 1000 already exists in the temporary in-memory table.
      • Unique constraint violated, insert failed, continue execution
    • Get the second row id=999 and insert it into the temporary memory table successfully
  • Fetch data from the temporary table in memory row by row, return the result, and delete the temporary table in memory. The result contains two rows with id=1000 and id=999.
  • The temporary memory table plays the role of temporarily storing data. It also uses the uniqueness constraint of the primary key id of the temporary memory table to implement the semantics of UNION.

UNION ALL

UNION ALL does not have deduplication semantics. The subquery is executed once and the results are sent directly to the client without the need for a temporary table in memory.

mysql> EXPLAIN (SELECT 1000 AS f) UNION ALL (SELECT id FROM t1 ORDER BY id DESC LIMIT 2);
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+------+----------+----------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
| 2 | UNION | t1 | NULL | index | NULL | PRIMARY | 4 | NULL | 2 | 100.00 | Backward index scan; Using index |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+------+----------+----------------------------------+

GROUP BY

Sufficient memory

-- 16777216 Bytes = 16 MB
mysql> SHOW VARIABLES like '%tmp_table_size%';
+----------------+----------+
| Variable_name | Value |
+----------------+----------+
| tmp_table_size | 16777216 |
+----------------+----------+

Execute Statement

-- Execute on MySQL 5.6mysql> EXPLAIN SELECT id%10 AS m, COUNT(*) AS c FROM t1 GROUP BY m;
+----+-------------+-------+-------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY,a | a | 5 | NULL | 1000 | Using index; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+------+---------+------+------+----------------------------------------------+

mysql> SELECT id%10 AS m, COUNT(*) AS c FROM t1 GROUP BY m;
+------+-----+
| m | c |
+------+-----+
| 0 | 100 |
| 1 | 100 |
| 2 | 100 |
| 3 | 100 |
| 4 | 100 |
| 5 | 100 |
| 6 | 100 |
| 7 | 100 |
| 8 | 100 |
| 9 | 100 |
+------+-----+

Using index : Indicates that a covering index is used, index a is selected, and no table return is required.

Using temporary: Indicates that a Using temporary

Using filesort : indicates that sorting is required

Execution process

  • Create a temporary memory table with two fields m and c, m is the primary key
  • Scan the index a of t1, take out the id values ​​on the leaf nodes one by one, calculate id%10, and record it as x
    • If there is no row with primary key x in the temporary memory table, insert a row (x,1)
    • If there is a row with primary key x in the temporary memory table, add 1 to the c value of row x.
  • After the traversal is completed, sort according to field m and get the result set and return it to the client.

Sorting process

ORDER BY NULL

-- Skip the final sorting stage and retrieve data directly from the temporary table mysql> EXPLAIN SELECT id%10 AS m, COUNT(*) AS c FROM t1 GROUP BY m ORDER BY NULL;
+----+-------------+-------+-------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY,a | a | 5 | NULL | 1000 | Using index; Using temporary |
+----+-------------+-------+-------+---------------+------+---------+------+------+------------------------------+

-- The data in t1 starts from 1 mysql> SELECT id%10 AS m, COUNT(*) AS c FROM t1 GROUP BY m ORDER BY NULL;
+------+-----+
| m | c |
+------+-----+
| 1 | 100 |
| 2 | 100 |
| 3 | 100 |
| 4 | 100 |
| 5 | 100 |
| 6 | 100 |
| 7 | 100 |
| 8 | 100 |
| 9 | 100 |
| 0 | 100 |
+------+-----+

Out of Memory

SET tmp_table_size=1024;

Execute Statement

-- The upper limit of the memory temporary table is 1024 Bytes, but the memory temporary table cannot completely hold 100 rows of data. The memory temporary table will be converted to a disk temporary table, and the InnoDB engine is used by default. -- If t1 is very large, the disk temporary table required for this query will take up a lot of disk space mysql> SELECT id%100 AS m, count(*) AS c FROM t1 GROUP BY m ORDER BY NULL LIMIT 10;
+------+----+
| m | c |
+------+----+
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
| 4 | 10 |
| 5 | 10 |
| 6 | 10 |
| 7 | 10 |
| 8 | 10 |
| 9 | 10 |
| 10 | 10 |
+------+----+

Optimization plan

Optimize indexes

Regardless of whether a temporary table is used in memory or on disk, GROUP BY requires constructing a table with a unique index, which is more expensive to execute.

The reason for needing a temporary table: id%100 of each row is out of order, so a temporary table is needed to record and count the results.

If you can ensure that the input data is in order, then when calculating GROUP BY , you only need to scan from left to right and accumulate them one by one.

  • When the first 1 is encountered, X 0s have been accumulated, and the first row in the result set (0,X)
  • When the first 2 is encountered, Y 1s have been accumulated, and the first row in the result set (1,Y)
  • The entire process does not require a temporary table or sorting.
-- Execute ALTER TABLE t1 ADD COLUMN z INT GENERATED ALWAYS AS(id % 100), ADD INDEX(z) on MySQL 5.7;
-- Covering index is used, no temporary table or sorting is requiredmysql> EXPLAIN SELECT z, COUNT(*) AS c FROM t1 GROUP BY z;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | index | z | z | 5 | NULL | 1000 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
2

Direct sorting

If a GROUP BY statement needs to store a large amount of data in a temporary table, it is better to store it in a temporary table in memory first and then degenerate it into a temporary table on disk.

You can use a temporary disk table directly, and use SQL_BIG_RESULT in GROUP BY statement (to tell the optimizer that the amount of data involved is very large).

The disk temporary table originally uses B+ tree storage, which is not as efficient as array storage. When the optimizer sees SQL_BIG_RESULT , it will directly use array storage.

  • That is, give up using the temporary table and go directly to the sorting stage

Execution process

-- No temporary table is used, but the sorting algorithm is used directly mysql> EXPLAIN SELECT SQL_BIG_RESULT id%100 AS m, COUNT(*) AS c FROM t1 GROUP BY m;
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY,a | a | 5 | NULL | 1000 | Using index; Using filesort |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------+

Initialize sort_buffer and put an integer field into it, denoted as m

Scan index a of t1, take out the id values ​​in it one by one, and put the value of id%100 into sort_buffer

After the scan is completed, sort the field m of sort_buffer (if the memory of sort_buffer is insufficient, temporary files on disk will be used to assist in sorting)

After sorting, we get an ordered array. We traverse the ordered array to get the number of times each value appears (similar to the way to optimize the index above).

Comparison with DISTINCT

-- Standard SQL, add an aggregate function COUNT(*) in the SELECT part
SELECT a,COUNT(*) FROM t GROUP BY a ORDER BY NULL;
-- Non-standard SQL
SELECT a FROM t GROUP BY a ORDER BY NULL;

SELECT DISTINCT a FROM t;

Standard SQL: Group by field a and count the number of times a appears in each group

Non-standard SQL: No more COUNT(*) and no need to perform total calculation logic

  • Group by field a, and only return one row with the same value of a, which is consistent with DISTINCT semantics

If no aggregate function is required, the semantics, execution flow, and performance of DISTINCT and GROUP BY are the same.

  • Create a temporary table with a field a and create a unique index on this field a
  • Traverse table t, take out data one by one and insert them into the temporary table
    • If a unique key conflict is found, skip it
    • Otherwise the insertion is successful
  • After the traversal is completed, the temporary table is returned to the client as a result set

summary

  • Scenarios where internal temporary tables are used
    • If the statement can read data and get results at the same time during execution, no additional memory is required.
    • Otherwise, additional memory is required to store intermediate results.
  • join_buffer is an unordered array, sort_buffer is an ordered array, and the temporary table is a two-dimensional table structure
  • If the execution logic requires the use of two-dimensional table features, temporary tables are preferred. If there is no explicit sorting requirement for the results of GROUP BY statement, add ORDER BY NULL (MySQL 5.6)
  • Try to use indexes in GROUP BY process and make sure that the EXPLAIN results do not contain Using temporary and Using filesort
  • If the amount of data to be counted in GROUP BY is not large, try to use a temporary memory table (you can increase tmp_table_size appropriately)
  • If the amount of data is too large, use SQL_BIG_RESULT to tell the optimizer to use the sorting algorithm directly (skipping the temporary table).

References

"MySQL Practice 45 Lectures"

This is the end of this article about the specific use of MySQL internal temporary tables. For more relevant MySQL internal temporary tables, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Summary of MySQL's commonly used database and table sharding solutions
  • MySQL partition table is classified by month
  • MySQL partitions existing tables in the data table
  • How to smoothly go online after MySQL table partitioning
  • MySQL series multi-table join query 92 and 99 syntax examples detailed tutorial
  • A brief discussion on when MySQL uses internal temporary tables
  • Detailed example of collecting and aggregating MySQL table information through Python
  • Comprehensive summary of MYSQL tables

<<:  HTML+CSS3 code to realize the animation effect of the solar system planets

>>:  How to use CSS to center a box horizontally and vertically (8 methods)

Recommend

Example of how to create a database name with special characters in MySQL

Preface This article explains how to create a dat...

Solution to MySql service disappearance for unknown reasons

Solution to MySql service disappearance for unkno...

MySQL database master-slave configuration tutorial under Windows

The detailed process of configuring the MySQL dat...

Vue batch update dom implementation steps

Table of contents Scene Introduction Deep respons...

Vue implements a small weather forecast application

This is a website I imitated when I was self-stud...

Sample code for implementing follow ads with JavaScript

Floating ads are a very common form of advertisin...

Detailed explanation of component development of Vue drop-down menu

This article example shares the specific code for...

Application nesting of HTML ul unordered tables

Application nesting of unordered lists Copy code T...

vue+rem custom carousel effect

The implementation of custom carousel chart using...

How to configure the My.ini file when installing MySQL5.6.17 database

I recently used the MySql database when developin...

MySQL 8.0.13 installation and configuration method graphic tutorial

This article shares the installation and configur...

How to use react-color to implement the front-end color picker

background We can use react-color to implement th...

Implementation of Vue top tags browsing history

Table of contents nonsense Functions implemented ...

Summary of common problems and solutions in Vue (recommended)

There are some issues that are not limited to Vue...