SELECT * from table where username like '%陈哈哈%' and hobby like '%牛逼' This is a fuzzy query method that we often use in MySQL, which uses the wildcard % for matching. In fact, this is just the tip of the iceberg. In MySQL, there are many methods that support fuzzy matching, and each has its own advantages. Well, today, let me take you to lift up the skirt of MySQL and see how many little-known good things are hidden under fuzzy query. 1. MySQL wildcard fuzzy query (%,_)1-1. Classification of wildcards
1-2. Use of wildcards1) % Wildcard: -- Fuzzy matching of data containing the word "网" SELECT * from app_info where appName like '%网%'; -- Fuzzy matching of data ending with "网" SELECT * from app_info where appName like '%网'; -- Fuzzy matching of data starting with "网" SELECT * from app_info where appName like '网%'; -- Exact match, appName like '网' is equivalent to: appName = '网' SELECT * from app_info where appName = '网'; -- Equivalent to SELECT * from app_info where appName like '网'; -- Fuzzy matching of data containing "xxx net xxx car xxx", such as "TuTu online car-hailing driver terminal, online car-hailing platform" SELECT * from app_info where appName like '%网%车%'; 2) _ Wildcard: -- Query data ending with "网" and with a length of three characters, such as "联家网". SELECT * from app_info where appName like '__网'; Note: '%__网、__%网' is equivalent to '%网' -- The first three characters of the query are XX.com, and the following characters can be matched arbitrarily, such as "Chengtong network disk, mold network platform" SELECT * from app_info where appName like '__网%'; -- Fuzzy matching of data containing "xx网x车xxx", such as: "Ctrip Online Car-hailing Client" SELECT * from app_info where appName like '__网_车%';
1-3. Tips and suggestions:As you can see, MySQL wildcards are quite useful. This functionality comes at a price, however: wildcard searches generally take longer to process than the other searches discussed previously, and consume more memory and other resources. Here are some tips to remember when using wildcards.
Careful friends will find that if there are symbols such as "%" and "_" in the data, wouldn't it conflict with the wildcard? SELECT * from app_info where appName LIKE '%%%'; SELECT * from app_info where appName LIKE '%_%'; Indeed, the above two SQL statements query the entire table data, rather than the specified data with "%" and "_". The ESCAPE keyword is needed here to escape. As shown below, ESCAPE is followed by a character with something in it. MySQL treats that symbol as an escape character. I usually write it as "/". Then, just like the escape characters in C language, such as '\n' and '\t', write this character before the % sign you need to escape. SELECT * from app_info where appName LIKE '%/_%' ESCAPE '/'; But is there a higher-end solution to this situation? The kind that can make colleagues or leaders who check your code look at you with new eyes~~ Of course, let's take a look at the second type of MySQL fuzzy matching method - built-in function query 2. MySQL built-in function retrieval (locate, position, instr)Continuing from the previous article, matching is performed through the built-in functions locate, position, and instr, which is equivalent to the str.contains() method in Java. It returns the position of the matching content in the string, which is better than wildcard matching in terms of efficiency and usability. SELECT * from app_info where INSTR(`appName`, '%') > 0; SELECT * from app_info where LOCATE('%', `appName`) > 0; SELECT * from app_info where POSITION( '%' IN `appName`) > 0; As shown above, the default value for the three built-in functions is: > 0, so the following > 0 can be added or not, and adding it makes it more readable. OK, let’s take a look at how to use these three built-in functions. First of all, let's make it clear that the subscripts in MySQL start from 1 from left to right, unlike Java where the leftmost subscript is 0. Therefore, in MySQL, when the subscript is 0, it means it does not exist. 2-1. LOCATE() function
Returns the position of the first occurrence of substr in str. If substr does not exist in str, the return value is 0. If substr exists in str, the return value is: the position where substr first appears in str. Note: LOCATE(substr, str) and POSITION(substr IN str) are synonyms and have the same functionality.
The position of the first occurrence of substring substr in string str starting at position pos. If substr is not in str, returns 0. If substr or str is NULL, returns NULL. SELECT locate('a', 'banana'); -- 2 SELECT locate('a', 'banana', 3); -- 4 SELECT locate('z', 'banana'); -- 0 SELECT locate(10, 'banana'); -- 0 SELECT locate(NULL , 'banana'); -- null SELECT locate('a' , NULL ); -- null Examples: -- Use the LOCATE keyword for fuzzy matching, equivalent to: "like '%网%'" SELECT * from app_info where LOCATE('网', `appName`) > 0; -- Use the LOCATE keyword for fuzzy matching, and match "网" from the second character, so data such as "NetEase Cloud Games, Wanglai Merchants" will be filtered out SELECT * from app_info where LOCATE('网', `appName`, 2) > 0; 2-2. POSITION() method
This method can be understood as an alias for the locate(substr, str) method, because it has the same function as the locate(substr, str) method. Examples: -- Use the POSITION keyword for fuzzy matching, equivalent to: "like '%网%'" SELECT * from app_info where POSITION('网' IN `appName`); 2-3. INSTR() method
Returns the position of the first occurrence of substring substr within string str. The two-argument form of INSTR() is identical to that of LOCATE(), except that the order of the arguments is reversed. Examples: -- Use the INSTR keyword for fuzzy matching, which has the same function as LIKE, equivalent to: "like '%网%'" SELECT * from app_info where INSTR(`appName`, '网'); -- The instr function is generally used to retrieve the position of a character in a string, which is equivalent to: "like '%网%'" SELECT * from app_info where INSTR(`appName`, '网') > 0; 3. MySQL regular matching query based on regexp and rlike
REGEXP does not support wildcards "%, _", but supports regular matching rules . It is a more detailed and elegant matching method. Let's take a look. -- Here are the parameter types contained in regexp
-- REGEXP '网' is equivalent to like '%网%' SELECT * from app_info where appName REGEXP '网'; -- Equivalent to SELECT * from app_info where appName like '%网%'; 3-1. OR in regexp: |Function: You can search for one of multiple strings, equivalent to or -- Supports "|" 'or' symbols, matching data containing "China" or "Internet" or "University", and supports overlapping multiple SELECT * from app_info where appName REGEXP 'China|Internet|University'; -- To match data that hits "中国" and "网" at the same time, you can use ".+" to connect them, which means China xxxx network. Any number of characters are allowed in the middle, and the order cannot be reversed. SELECT * from app_info where appName REGEXP '中国.+网'; 3-2. Regular expression matching in REGEXP: []Function: Match one of the characters in the [] symbol, support parsing regular expressions -- Matches data containing English characters, case-insensitive by default SELECT * from app_info where appName REGEXP '[az]'; -- Just like like, just negate the set and add "not REGEXP", I won't go into details below SELECT * from app_info where appName not REGEXP '[az]'; --Matches data containing uppercase English characters. By default, case is ignored and the "BINARY" keyword needs to be added . For example, where appName REGEXP BINARY 'Hello' -- Regarding case sensitivity: Regular expression matching in MySQL (since version 3.23.4) is not case sensitive. SELECT * from app_info where appName REGEXP BINARY '[AZ]'; -- Matches data containing numbers SELECT * from app_info where appName REGEXP '[0-9]'; -- Matches data containing numbers or English. SELECT * from app_info where appName REGEXP '[a-z0-9]'; az, 0-9 are all considered as one unit, do not add extra symbols . A few days ago, I found a special case, a very interesting bug, share it with him -- I added an extra "|" symbol when writing the query statement, thinking it was "or", and didn't pay attention to it, but I never expected that the number of results would be different. SELECT * from app_info where appName REGEXP '[567]'; -- 87 results SELECT * from app_info where appName REGEXP '[5|6|7]'; -- 88 results I'm confused. Let's see which one is wrong. -- It turns out that the "|" symbol also participates in the matching and is recognized as a unit. Coincidentally, there is a data: "Wireless Mixer" which matches exactly. What the hell is this DJ? SELECT * from app_info where appName REGEXP '[5|6|7]' and pid not in (SELECT pid from app_info where appName REGEXP '[567]'); -- Query data starting with one of 5, 6, or 7 SELECT * from app_info where appName REGEXP '^[5|6|7]'; -- Query data ending with one of 5, 6, or 7 SELECT * from app_info where appName REGEXP '[5|6|7]$';
-- Query data with appName byte length of 10 and any content SELECT * from app_info where appName REGEXP '^.{10}$'; -- Query data where the length of appName bytes is 10 and all are in English SELECT * from app_info where appName REGEXP '^[az]{10}$' ; -- Query the data whose appName byte length is 10 and all are in uppercase English, and add BINARY SELECT * from app_info where appName REGEXP BINARY '^[AZ]{10}$'; -- Query data whose version_name byte length is 6 and all of them are numbers or "." SELECT * from app_info where version_name REGEXP '^[0-9.]{6}$'; -- Query data whose version_name byte length is 6 and all of them are numbers or "."; the first digit is required to be 1 SELECT * from app_info where version_name REGEXP '^1[0-9.]{5}$' ; -- Query data whose version_name byte length is 6 and all of them are numbers or "."; the first digit is 1 and the last digit is 7 SELECT * from app_info where version_name REGEXP '^1[0-9.]{4}7$' ; -- Query data whose version_name byte length is more than 6 characters and all of them are numbers or "."; the first character must be 1 and the last character must be 7 SELECT * from app_info where version_name REGEXP '^1[0-9.]{4,}7$' ; -- Query the data whose version_name byte length is 6 to 8 characters and all of them are numbers or "."; the first digit must be 1 and the last digit must be 7 SELECT * from app_info where version_name REGEXP '^1[0-9.]{4,6}7$' ; -- The first character is not Chinese SELECT * from app_info where appName REGEXP '^[ -~]'; -- The first character is Chinese SELECT * from app_info where appName REGEXP '^[^ -~]'; -- Query data that does not contain Chinese characters SELECT * from app_info where appName REGEXP '^([az]|[0-9]|[AZ])+$'; -- Data starting with 5 or F and containing English SELECT * from app_info where appName REGEXP BINARY '^[5F][a-zA-Z].'; To match special symbols, such as ., you need to add \\ (note that there are two slashes), but if it is in [], you don’t need to add it: -- Match name containing . select * from app_info where appName regexp '\\.'; -- Match name containing . select * from app_info where appName regexp '[.]'; 3-3. Character class matching (posix)There are some special symbols in MySQL that can represent different types of matches: -- Matches names containing numbers select * from app_info where appName regexp '[[:digit:]]'; Other character classes of this type include:
This type of character class requires an additional layer of [] around the main character class. 3-4. [:<:] and [:>:]There are two special ones in the character class above. These two are about position. [:<:] matches the beginning of the word, and [:>:] matches the end of the word. They are different from ^ and $. The latter matches the beginning and end of the entire thing, while the former matches the beginning and end of a word. -- Only matches the string that starts with a, such as abcd select * from app_info where appName regexp '^a'; -- It can match the whole word starting with a, and it can also match the word in the middle starting with a, such as: dance after. select * from app_info where appName regexp '[[:<:]]a'; [[:<:]] and [[:>:]] match an empty string at the beginning and end of a word respectively. The beginning and end of this word are not characters contained in alnum and cannot be an underscore. select "a word a" REGEXP "[[:<:]]word[[:>:]]"; -- 1 (match) select "a xword a" REGEXP "[[:<:]]word[[:>:]]"; -- 0 (meaning no match) select "weeknights" REGEXP "^(wee|week)(knights|nights)$"; -- 1 (match) IV. ConclusionWell, this article ends here. Those who can see this are destined to be here. I hope this article can help you further understand MySQL. For more relevant MySQL fuzzy query content, please search 123WORDPRESS.COM's previous articles or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: Solutions to common problems using Elasticsearch
>>: HTML table tag tutorial (7): background color attribute BGCOLOR
The content involved in Web front-end development...
Table of contents Preface Check and uninstall Ope...
1. Introduction Recently I found that there are m...
Zabbix deployment documentation After zabbix is ...
MySQL creates users and authorizes and revokes us...
[LeetCode] 197.Rising Temperature Given a Weather...
An interesting discovery: There is a table with a...
I recently discussed "advertising" with...
This article example shares the specific code of ...
Preface If someone asks you "What are the ch...
Subquery Classification Classification by returne...
1.Lock? 1.1 What is a lock? The real meaning of a...
The following HTML tags basically include all exis...
Installation introduction under Windows: Check ou...
Step 1: Use Notepad to open the "my.ini"...