Mysql 8.0.18 hash join test (recommended)

Mysql 8.0.18 hash join test (recommended)

Hash Join

Hash Join does not require any indexes to execute and is more efficient than the current block nested loop algorithm in most cases.

The following example code introduces the MySQL 8.0.18 hash join test. The specific content is as follows:

CREATE TABLE COLUMNS_hj as select * from information_schema.`COLUMNS`;
INSERT INTO COLUMNS SELECT * FROM COLUMNS; -- Insert 250,000 rows for the last time CREATE TABLE COLUMNS_hj2 as select * from information_schema.`COLUMNS`;
explain format=tree
SELECT 
 COUNT(c1.PRIVILEGES),
 SUM(c1.ordinal_position)
FROM
 COLUMNS_hj c1,
 COLUMNS_hj2 c2
WHERE
 c1.table_name = c2.table_name
AND c1.column_name = c2.column_name
GROUP BY
 c1.table_name,
 c1.column_name
ORDER BY
 c1.table_name,
 c1.column_name;

You must use format=tree (a new feature in 8.0.16) to view the execution plan of hash join:

-> Sort: <temporary>.TABLE_NAME, <temporary>.COLUMN_NAME
 -> Table scan on <temporary>
  -> Aggregate using temporary table
   -> Inner hash join (c1.`COLUMN_NAME` = c2.`COLUMN_NAME`), (c1.`TABLE_NAME` = c2.`TABLE_NAME`) (cost=134217298.97 rows=13421218)
    -> Table scan on c1 (cost=1.60 rows=414619)
    -> Hash
     -> Table scan on c2 (cost=347.95 rows=3237)
set join_buffer_size=1048576000;

SELECT 
 COUNT(c1.PRIVILEGES),
 SUM(c1.ordinal_position)
FROM
 COLUMNS_hj c1,
 COLUMNS_hj2 c2
WHERE
 c1.table_name = c2.table_name
AND c1.column_name = c2.column_name
GROUP BY
 c1.table_name,
 c1.column_name
ORDER BY
 c1.table_name,
 c1.column_name;

About 1.5 seconds.


Let’s look at BNL again. Create the index first (it’s fair to optimize them separately and then compare the results).

alter table columns_hj drop index idx_columns_hj;
alter table columns_hj2 drop index idx_columns_hj2;
create index idx_columns_hj on columns_hj(table_name,column_name);
create index idx_columns_hj2 on columns_hj2(table_name,column_name);

-> Sort: <temporary>.TABLE_NAME, <temporary>.COLUMN_NAME
 -> Table scan on <temporary>
  -> Aggregate using temporary table
   -> Nested loop inner join (cost=454325.17 rows=412707)
    -> Filter: ((c2.`TABLE_NAME` is not null) and (c2.`COLUMN_NAME` is not null)) (cost=347.95 rows=3237)
     -> Table scan on c2 (cost=347.95 rows=3237)
    -> Index lookup on c1 using idx_COLUMNS_hj (TABLE_NAME=c2.`TABLE_NAME`, COLUMN_NAME=c2.`COLUMN_NAME`) (cost=127.50 rows=127)

About 4.5 seconds. It can be seen that the hash join effect is still very good.

I have to complain about MySQL's optimizer prompts. It seems that HASH_JOIN/NO_HASH_JOIN are not effective.

In addition to hash_join, the SET_VAR optimizer hint introduced in MySQL 8.0.3 is still very useful. It can be used to set statement-level parameters (Oracle supports it, and I remember MariaDB also supports it), as follows:

mysql> select /*+ set_var(optimizer_switch='index_merge=off') set_var(join_buffer_size=4M) */ c_id from customer limit 1;

List of variables supported by SET_VAR:

auto_increment_increment
auto_increment_offset
big_tables
bulk_insert_buffer_size
default_tmp_storage_engine
div_precision_increment
end_markers_in_json
eq_range_index_dive_limit
foreign_key_checks
group_concat_max_len
insert_id
internal_tmp_mem_storage_engine
join_buffer_size
lock_wait_timeout
max_error_count
max_execution_time
max_heap_table_size
max_join_size
max_length_for_sort_data
max_points_in_geometry
max_seeks_for_key
max_sort_length
optimizer_prune_level
optimizer_search_depth variables
optimizer_switch
range_alloc_block_size
range_optimizer_max_mem_size
read_buffer_size
read_rnd_buffer_size
sort_buffer_size
sql_auto_is_null
sql_big_selects
sql_buffer_result
sql_mode
sql_safe_updates
sql_select_limit
timestamp
tmp_table_size
updatable_views_with_limit
unique_checks
windowing_use_high_precision

Summarize

The above is the Mysql 8.0.18 hash join test introduced by the editor. I hope it will be helpful to everyone. If you have any questions, please leave me a message and the editor will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!
If you find this article helpful, please feel free to reprint it and please indicate the source. Thank you!

You may also be interested in:
  • MySQL joint table query basic operation left-join common pitfalls
  • Analysis of the difference between the usage of left join setting conditions in on and where in mysql
  • Summary of various common join table query examples in MySQL
  • MySQL 8.0.18 Hash Join does not support left/right join left and right join issues
  • MySQL 8.0 New Features: Hash Join
  • MySQL 8.0.18 stable version released! Hash Join is here as expected
  • Detailed explanation of how to use join to optimize SQL in MySQL
  • Query process and optimization method of (JOIN/ORDER BY) statement in MySQL
  • In-depth understanding of MySQL self-connection and join association
  • Mysql join table and id auto-increment example analysis

<<:  Solution to win10 without Hyper-V

>>:  A brief discussion on logic extraction and field display of Vue3 in projects

Recommend

Detailed explanation of Linux system directories sys, tmp, usr, var!

The growth path from a Linux novice to a Linux ma...

js to achieve the pop-up effect

This article example shares the specific code of ...

How to redirect to https through nginx load balancing

Copy the certificate and key on the web scp -rp -...

Linux series of commonly used operation and maintenance commands (summary)

Table of contents 1. System monitoring 2. File Op...

Detailed steps to configure MySQL remote connection under Alibaba Cloud

Preface As we all know, by default, the MySQL ins...

A brief discussion on MySQL count of rows

We are all familiar with the MySQL count() functi...

MySQL 5.7.15 installation and configuration method graphic tutorial (windows)

Because I need to install MySQL, I record the ins...

Example code for using @media in CSS3 to achieve web page adaptation

Nowadays, the screen resolution of computer monit...

How to use Vue to implement CSS transitions and animations

Table of contents 1. The difference between trans...

How to build a multi-node Elastic stack cluster on RHEL8 /CentOS8

Elastic stack, commonly known as ELK stack, is a ...

Difference and principle analysis of Nginx forward and reverse proxy

1. The difference between forward proxy and rever...