MySQL Interview Questions: How to Set Up Hash Indexes

In addition to B-Tree indexes, MySQL also provides the following index types:

  • Hash index

Can be created explicitly only on the Memory and NDB engines; suited to simple equality lookups.

  • R-Tree index

A spatial index, originally specific to MyISAM and mainly used for geographic spatial data types. InnoDB has supported spatial indexes since MySQL 5.7.

  • Full-text index

Originally specific to MyISAM and mainly used for full-text search. Starting from MySQL 5.6, InnoDB also supports full-text indexes.

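+-----------------+---------------+-----------------------+---------------+
| Index type      | MyISAM        | InnoDB                | Memory        |
+-----------------+---------------+-----------------------+---------------+
| B-Tree index    | Supported     | Supported             | Supported     |
| Hash index      | Not supported | Not supported         | Supported     |
| R-Tree index    | Supported     | Supported (since 5.7) | Not supported |
| Full-text index | Supported     | Supported (since 5.6) | Not supported |
+-----------------+---------------+-----------------------+---------------+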

The most commonly used index types are B-Tree and hash, and only the Memory and NDB engines let you create hash indexes explicitly. A hash index suits key-value lookups, and an equality lookup through a hash index is generally faster than one through a B-Tree index. However, hash indexes do not support range comparisons such as <, >, <=, and >=. The Memory engine uses a hash index only for equality comparisons (= and <=>).

MySQL 8.0 (from 8.0.13) supports functional indexes, which index the result of an expression rather than a raw column value. Before that, apart from whole columns, only the leading part of a column could be indexed (a prefix index). For example, for a title field, only the first 10 characters of the title could be indexed. A prefix index greatly reduces the size of the index file, but it also has a drawback: it cannot be used for ORDER BY and GROUP BY operations.

create index idx_title on film(title(10));
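For comparison with the prefix index above, here is a minimal sketch of a functional index. It assumes MySQL 8.0.13 or later and the same film table with a title column; the index name idx_title_upper and the sample title are made up for illustration.

```sql
-- Functional index: the expression goes inside an extra pair of parentheses.
CREATE INDEX idx_title_upper ON film ((UPPER(title)));

-- The optimizer can consider the index only when the query uses
-- the same expression that was indexed.
EXPLAIN SELECT title FROM film WHERE UPPER(title) = 'ACADEMY DINOSAUR';
```

Unlike the prefix index, this indexes the whole computed value, so the size saving of title(10) does not apply here.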

1 Features

A hash table is essentially an array: a hash function converts the key into a position in the array, and the value is stored at that position. Hashing naturally allows collisions, where different keys map to the same position, and MySQL resolves them with chaining (the "zipper method"), in which each position holds a linked list of entries.

Hash indexes are implemented on top of hash tables. A hash index can be used only when the query conditions exactly match all the columns in the index. For each row, the storage engine computes a hash code over all the indexed columns and stores that hash code, together with a pointer to the row, in the hash index.

  • For example, consider a table that maintains ID numbers and names and looks up the name for a given ID number through a hash index.

For example, to look up the username corresponding to ID_card_n4:

  • Apply the hash function to ID_card_n4 to get position A
  • Traverse the entries stored at position A in order until User4 is found

The four ID_card_n values are not necessarily stored in increasing order, so adding a new User is fast: the entry only needs to be appended to the end of its chain. The disadvantage is just as obvious. Because the entries are not ordered, a hash index is very slow for interval queries. For example, to find all users whose ID numbers fall in the range [ID_card_X, ID_card_Y], the whole table must be scanned.
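The lookup described above can be tried on a Memory table. This is only a sketch: the table id_card_user, its columns, and the sample values are hypothetical and chosen to mirror the example.

```sql
-- Hypothetical table; the Memory engine accepts USING HASH explicitly.
CREATE TABLE id_card_user (
  id_card VARCHAR(18) NOT NULL,
  name    VARCHAR(64) NOT NULL,
  INDEX idx_id_card USING HASH (id_card)
) ENGINE = MEMORY;

INSERT INTO id_card_user (id_card, name) VALUES
  ('ID_card_n1', 'User1'),
  ('ID_card_n2', 'User2'),
  ('ID_card_n3', 'User3'),
  ('ID_card_n4', 'User4');

-- Equality lookup: the hash index can be used.
EXPLAIN SELECT name FROM id_card_user WHERE id_card = 'ID_card_n4';

-- Range lookup: the hash index cannot help, so expect a full table scan.
EXPLAIN SELECT name FROM id_card_user
WHERE id_card BETWEEN 'ID_card_n1' AND 'ID_card_n3';
```

Comparing the key column of the two EXPLAIN results shows whether idx_id_card was chosen in each case.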

2 Hash index defects

  • Two lookups are needed: the index stores only hash codes and row pointers, so after finding the entry the engine must still read the row itself
  • Partial-key matching and range searches are not supported, because the hash code is computed over all the indexed columns
  • Hash codes can collide. If the hash algorithm is poorly designed, there will be too many collisions and performance will deteriorate.
  • The index stores hash values, so only the equality comparisons =, <=>, and IN are supported.
  • The index cannot be used for sorting, because the order of the stored hash values has no relation to the order of the original key values.
  • Reading the table rows cannot be avoided: the Memory engine supports non-unique hash indexes, and different index keys may have the same hash value, so the actual row values must be compared.
  • Because a hash table locates entries directly by memory position, a hash index requires the data to be held in memory, which consumes a lot of memory.
  • If all queries are equality queries, a hash index is indeed fast, but in practice range queries are more common.
  • It can handle only full-value matching of the key
  • The hash function, not the length of the indexed columns, determines the size of the index key
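Several of these limits can be observed with a composite hash index. The sketch below assumes a hypothetical Memory table person_demo; all names and values are illustrative only.

```sql
-- Hypothetical table with a two-column hash index.
CREATE TABLE person_demo (
  last_name  VARCHAR(32) NOT NULL,
  first_name VARCHAR(32) NOT NULL,
  INDEX idx_name USING HASH (last_name, first_name)
) ENGINE = MEMORY;

INSERT INTO person_demo (last_name, first_name) VALUES
  ('Allen', 'Cuba'),
  ('Allen', 'Kim'),
  ('Barrymore', 'Julia');

-- Full-key equality: idx_name can be used.
EXPLAIN SELECT * FROM person_demo
WHERE last_name = 'Allen' AND first_name = 'Cuba';

-- Leading column only: the hash covers both columns, so idx_name cannot be used.
EXPLAIN SELECT * FROM person_demo WHERE last_name = 'Allen';

-- Sorting: the hash index holds no key order, so a separate sort is needed.
EXPLAIN SELECT * FROM person_demo ORDER BY last_name, first_name;
```

With a B-Tree index on the same two columns, the second and third queries could use the index, which is the practical difference the list above describes.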

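InnoDB and MyISAM do not let you create hash indexes explicitly. InnoDB builds an adaptive hash index internally and automatically for frequently accessed index values, but you cannot define it yourself. To get hash-style lookups on these engines, you can build a pseudo-hash index.

To do so, add a column that stores the hash value, create an ordinary index on that column, and create triggers that write the computed hash into the table on every insert and update.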

The hash table structure is suitable for scenarios where only equal value queries are required, such as Memcached.

3 Case Application

Suppose there is a very large table, and when a user logs in you need to retrieve the user by email. If you create an index directly on the email column, every lookup involves string comparisons in addition to the index range match. That is acceptable when the email is short, but the cost grows when it is long. If you instead store a hash of the email in an integer column, index that column, and search by the integer, the lookup is much faster than a string comparison.

Hash Algorithms

To create a hash index, you must first select a hash algorithm, such as the CRC32 algorithm mentioned in "High Performance MySQL".
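As a quick illustration of what CRC32 returns, the two calls below use the sample values given in the MySQL reference manual. The result is a 32-bit unsigned integer, which is why an int unsigned column can hold it.

```sql
SELECT CRC32('MySQL');  -- 3259397556
SELECT CRC32('mysql');  -- 2501908538
```

The two results differ because CRC32 works on the raw bytes of the string, so inputs that differ only in letter case hash to different values.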

INSERT UPDATE SELECT Operations

Add a hash value field to the table:

ALTER TABLE `User` ADD COLUMN email_hash int unsigned NOT NULL DEFAULT 0;

The next step is to automatically update the email_hash field during UPDATE and INSERT, which is achieved through triggers:

DELIMITER |
CREATE TRIGGER user_hash_insert BEFORE INSERT ON `User` FOR EACH ROW BEGIN
SET NEW.email_hash=crc32(NEW.email);
END;
|
CREATE TRIGGER user_hash_update BEFORE UPDATE ON `User` FOR EACH ROW BEGIN
SET NEW.email_hash=crc32(NEW.email);
END;
|
DELIMITER ;
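The triggers only cover rows written from now on, and the new column is not yet indexed. A minimal sketch of the two remaining steps follows; the index name idx_email_hash is an assumption, not something from the original setup.

```sql
-- Backfill the hash for rows that existed before the triggers were created.
UPDATE `User` SET email_hash = CRC32(email);

-- Index the integer hash column; on InnoDB or MyISAM this is an ordinary
-- B-Tree index over hash values, which is what makes it a pseudo-hash index.
CREATE INDEX idx_email_hash ON `User` (email_hash);
```

Without this index, the email_hash condition in a query would still require a full table scan.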

The SELECT request will then become:

SELECT `email`, `email_hash` FROM `User` WHERE
    email_hash = CRC32("[email protected]")
    AND `email` = "[email protected]";

+-------------------+------------+
| email             | email_hash |
+-------------------+------------+
| [email protected] | 2765311122 |
+-------------------+------------+

The extra condition AND email = "[email protected]" guards against returning the wrong row when two different emails produce the same CRC32 value (a hash collision).
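To see how often collisions actually occur in the data, a query along these lines can help. It is a sketch that assumes the `User` table and email_hash column defined earlier; the alias rows_per_hash is illustrative.

```sql
-- List hash values shared by more than one row.
SELECT email_hash, COUNT(*) AS rows_per_hash
FROM `User`
GROUP BY email_hash
HAVING COUNT(*) > 1;
```

If this returns many rows, the index on email_hash narrows the search less effectively, and a hash function with a wider output may be worth considering.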
