A brief discussion on the underlying principle of mysql join

A brief discussion on the underlying principle of mysql join

join algorithm

MySQL only supports one join algorithm: Nested-Loop Join, but Nested-Loop Join has three variants:

  • Simple Nested-Loop Join, simple nesting
  • Index Nested-Loop Join
  • Block Nested-Loop Join, join buffer nesting (temporary table)

The difference between driven and non-driven tables

The driving table is the main table, and the non-driven table is the slave table. You can see it by looking at the following SQL. A is the driving table and B is the non-driven table.

select * from A left join B

When A joins B, must we check table A first and then table B?

The answer is not necessarily, because MySQL has an optimizer inside, which will make some optimizations based on your query statements. The optimizer also determines which table to check first, but it is certain that the table that is checked first is the driving table, and vice versa. As for the question of which table to check first, we can get the result by looking at the execution plan; just add the explain keyword in front;

explain select * from A join B;

1. Simple Nested-Loop Join, simple nesting-no index

A left join B: Each time, the whole table is matched. Each row of data in table A is matched once in table B. That is to say, if there are 10 records in table A and 1000 records in table B, the number of scans during query is 10*1000, which means that the query needs to scan 10000 times to get the data.

2. Index Nested-Loop Join-Indexed Case

select * from A join B on A.id=B.id where A.id = 1: When querying, the driver table A will search according to the index of the associated field. When a matching value is found on the index, the query will be returned to the table. In other words, the query will be returned to the table only after the index is matched.
If the associated field B.id of the non-driven table B is the primary key, the performance will be very high. If it is not the primary key, multiple table queries will be performed. First, the index will be associated, and then the table query will be performed based on the primary key id of the secondary index. The performance is slower than that of the primary key.

3. Block Nested-Loop Join, join buffer

If there is an index, Index Nested-Loop Join will be used to join the tables. If there is no index on the join column, Block Nested-Loop Join will be used. Join buffer. There is a buffer between the driver table and the non-driver table. When querying, the data of the driver table is first cached in the buffer, and then matched with the non-driver table in batches. This is an optimization solution that combines multiple comparisons into one comparison. Note: not only the columns of the associated table are cached here, but also the columns after the select.

Buffer size

By default, the capacity of the buffer join_biffer_size is 256k. If your data space is larger than 256k, the buffer cannot be used and the join will be converted to the simplest Simple Nested-Loop Join. However, we can manually adjust the buffer size to load large amounts of data. View the join_biffer_size sql: show variables like '%join_biffer_size%'

How to choose the connection order for tables with large and small data volumes

It is best to connect the small table to the big table, which will reduce the number of scans; for example, if the big table has 1,000 data records and the small table has only 10 data records, then the best connection method is: small table joins the big table; why do you do this?

  • If a large table is joined to a small table, and our data is in row 999 of the large table, then we have to scan it at least 999 times to find it when we query the data.
  • If a small table is joined to a large table, and our data is in row 9 of the small table,

detail

  • When querying tables together, it is best not to have more than three tables, because the fields that need to be joined must have the same data type.
  • Inner joins are preferred for joining tables. Outer joins consume more performance than inner joins in querying data.
  • Make sure the columns after on in the associated query or the fields in using() have indexes to speed up data access when joining tables.

This is the end of this article about the underlying principles of MySQL join. For more information about the underlying principles of MySQL join, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Analysis of usage scenarios of JOIN in SQL statements
  • MYSQL database basics - Join operation principle
  • How to solve the problem of invalid left join in MySQL and the precautions for its use
  • The process of quickly converting mysql left join to inner join
  • Why do code standards require SQL statements not to have too many joins?
  • MySQL efficient query left join and group by (plus index)
  • MySQL join buffer principle
  • Detailed explanation of various join summaries of SQL

<<:  Vue implements small search function

>>:  Determine the direction of mouse entry based on CSS

Recommend

CSS and CSS3 flexible box model to achieve element width (height) adaptation

1. CSS realizes fixed width on the left and adapt...

VMware and CentOS system installation method to reset the root password

Today's Tasks 1. Choice of Linux distribution...

Node script realizes automatic sign-in and lottery function

Table of contents 1. Introduction 2. Preparation ...

Implementation code for partial refresh of HTML page

Event response refresh: refresh only when request...

How to configure the pdflatex environment in docker

Technical Background Latex is an indispensable to...

Centos7 implements sample code for restoring data based on MySQL logs

Introduction Binlog logs, that is, binary log fil...

How to import and export Cookies and Favorites in FireFox

FireFox is a commonly used browser with many exte...

Native js encapsulation seamless carousel function

Native js encapsulated seamless carousel plug-in,...

MySQL 5.6.37 (zip) download installation configuration graphic tutorial

This article shares the download, installation an...

Analysis of uniapp entry-level nvue climbing pit record

Table of contents Preface Hello World image Set b...

Tutorial on installing MySQL 8.0.11 using RPM on Linux (CentOS7)

Table of contents 1. Installation preparation 1. ...

Summary of JavaScript JSON.stringify() usage

Table of contents 1. Usage 1. Basic usage 2. The ...