The principles and defects of MySQL full-text indexing

The principles and defects of MySQL full-text indexing

MySQL full-text index is a special index that generates a list of all words that appear in a certain column of a certain table.

alter table tablename add fulltext(column1,column2)

illustrate:

Can only be created in MyISAM tables

The full-text index can only be searched by separating words with spaces or punctuation marks. Chinese words cannot be searched (there are special applications that support Chinese word segmentation and can search Chinese words, but they are not ideal)

Words with less than 3 characters will not be included in the full-text index. You can modify this option by modifying my.cnf

ft_min_word_len=3

Restart the MySQL server and use repair table tablename quick to regenerate the full-text index for the relevant data tables.

select * from tablename where match(column1,column2) against('word1 word2 word3')>0.001

match ... against finds data records that contain at least one of the three words word1, word2, and word3 in the data columns column1 and column2. The data column after the keyword match must be the same as the data column for creating the full-text index. The search terms are not case-sensitive and are not sequence-sensitive. Words with less than 3 characters are usually ignored. The match... against ... expression returns a floating point number as the result of its own evaluation, which reflects the degree to which the resulting record matches the searched word. If no records are matched, or too many matching result records are ignored, the expression will return 0. The expression > 0.001 is used to exclude result records whose match return values ​​are too small.

select *,match(column1,column2) against ('word1 word2 word3') as mtch
from tablename
having mtch>0.01
order by mtch desc
limit 5

Find the five most matching records. You cannot use pseudonyms in the where clause, so use having.

The Boolean full-text search capability supports the following operators:

+word: A leading plus sign indicates that the word must appear at the beginning of each line returned.

-word: A leading minus sign indicates that the word must not appear in any of the returned lines.

(no operator): In the default state (when no + or – is specified), the word is optional, but the line containing the word has a higher rank. This is similar to how MATCH() ... AGAINST() works without the IN BOOLEAN MODE modification.

> <These two operators are used to change the effect of a word on the associated value assigned to a row. The > operator increases its effect, while the < operator decreases it. See the example below.

( ) Brackets are used to separate words into subexpressions. Brackets may be nested.

~word: A leading tilde acts as a negation character, negating the effect of word on the relevance of the line. This is useful for marking "noise" words. Rows containing such words are ranked lower than other rows.

word*: Search for words starting with word, only allowed to appear at the end of the word

"word1 word" : The given word must appear in the data record, the order must also match, and the letters are case sensitive.

select * from tablename where match(column1,column2) against ('+word1 +word2 -word3' in boolean mode')

Boolean searches can only return 1 or 0, and no longer return floating-point numbers indicating the degree of match.

Disadvantages of full-text indexing:

1. The larger the data table, the better the full-text index effect. Smaller data tables may return some incomprehensible results.

2. Full-text search uses the entire word as the matching object. If a word is transformed (with a suffix or plural form), it is considered to be another word.

3. Only strings consisting of letters, numbers, single quotes, and underscores are considered words. Letters with phonetic symbols are still letters, but C++ no longer considers them as words.

4. Case-insensitive

5. Can only be used on MyISAM

6. The creation of full-text indexes is slow, and the modification of various data with full-text indexes is also slow

7. Does not support Chinese

The above is the detailed content of the principles and defects of MySQL full-text indexing. For more information about MySQL full-text indexing, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • MySQL index failure principle
  • MySQL index principle and usage example analysis
  • Understanding the MySQL query optimization process
  • MySQL paging query optimization techniques
  • MySQL group query optimization method
  • MySQL index principle and query optimization detailed explanation

<<:  How to remove the dividing line of a web page table

>>:  Docker uses CMD or ENTRYPOINT commands to start multiple services at the same time

Recommend

How to enable slow query log in MySQL

1.1 Introduction By enabling the slow query log, ...

Solution to docker suddenly not being accessible from the external network

According to the methods of the masters, the caus...

How to solve the mysql insert garbled problem

Problem description: When inserting Chinese chara...

Summary of basic usage of js array

Preface Arrays are a special kind of object. Ther...

Several ways to center a box in Web development

1. Record several methods of centering the box: 1...

Use the njs module to introduce js scripts in nginx configuration

Table of contents Preface 1. Install NJS module M...

Undo log in MySQL

Concept introduction: We know that the redo log i...

Simple principles for web page layout design

This article summarizes some simple principles of...

A quick solution to the problem of PC and mobile adaptation

When making a web page, we usually need to consid...

JavaScript implements asynchronous acquisition of form data

This article example shares the specific code for...

Getting Started Tutorial on GDB in Linux

Preface gdb is a very useful debugging tool under...