1. Content OverviewWhen using MySQL, it is very important to understand the concepts of character sets and character sequences, as well as the impact of different settings on data storage and comparison. The "garbled characters" problem encountered by many students in their daily work is most likely caused by insufficient understanding of character sets and character sequences, as well as incorrect settings. This article introduces the following contents from the basics to the complex ones:
2. Concepts and Relationships of Character Sets and Character OrdersWhen storing data, MySQL provides support for different character sets. When comparing data, different character order supports are provided. MySQL provides settings at different levels, including server level, database level, table level, and column level, which can provide very precise settings. What are character sets and character sequences? In simple terms:
For example: There are four characters: A, B, a, b. The codes of these four characters are A = 0, B = 1, a = 2, b = 3 respectively. The characters + encoding here constitute the character set. What if we want to compare the size of two characters? For example, A and B, or a and b, the most intuitive way to compare is to use their encoding, for example, because 0 < 1, so A < B. In addition, for A and a, although they have different encodings, we think that uppercase and lowercase characters should be equal, that is, A == a. Two comparison rules are defined above, and the set of these comparison rules is collation.
3. Character sets and character sequences supported by MySQLMySQL supports multiple character sets and character sequences.
The above is rather abstract. Let’s look at the next few sections to understand what’s going on. 1. View supported character sets You can view the character sets supported by MYSQL in the following ways. Method 1: mysql> SHOW CHARACTER SET; +----------+-----------------------------+---------------------+--------+ | Charset | Description | Default collation | Maxlen | +----------+-----------------------------+---------------------+--------+ | big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 | | dec8 | DEC West European | dec8_swedish_ci | 1 | ...omitted Method 2: mysql> use information_schema; mysql> select * from CHARACTER_SETS; +--------------------+----------------------+-----------------------------+--------+ | CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION | MAXLEN | +--------------------+----------------------+-----------------------------+--------+ | big5 | big5_chinese_ci | Big5 Traditional Chinese | 2 | | dec8 | dec8_swedish_ci | DEC West European | 1 | ...omitted When using SHOW CHARACTER SET to view, you can also add WHERE or LIKE restrictions. Example 1: Using the WHERE condition. mysql> SHOW CHARACTER SET WHERE Charset="utf8"; +---------+---------------+-------------------+--------+ | Charset | Description | Default collation | Maxlen | +---------+---------------+-------------------+--------+ | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | +---------+---------------+-------------------+--------+ 1 row in set (0.00 sec) Example 2: Use the LIKE condition. mysql> SHOW CHARACTER SET LIKE "utf8%"; +---------+---------------+--------------------+--------+ | Charset | Description | Default collation | Maxlen | +---------+---------------+--------------------+--------+ | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | | utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 | +---------+---------------+--------------------+--------+ 2 rows in set (0.00 sec) 2. View the supported character sequences Similarly, you can view the character sequences supported by MYSQL in the following way. Method 1: View through SHOW COLLATION. As you can see, the utf8 character set has more than 10 character sequences. Check whether the default character sequence is used by checking whether the value of Default is Yes. mysql> SHOW COLLATION WHERE Charset = 'utf8'; +--------------------------+---------+-----+---------+----------+---------+ | Collation | Charset | Id | Default | Compiled | Sortlen | +--------------------------+---------+-----+---------+----------+---------+ | utf8_general_ci | utf8 | 33 | Yes | Yes | 1 | | utf8_bin | utf8 | 83 | | Yes | 1 | ...slightly Method 2: Query information_schema.COLLATIONS. mysql> USE information_schema; mysql> SELECT * FROM COLLATIONS WHERE CHARACTER_SET_NAME="utf8"; +--------------------------+--------------------+-----+------------+-------------+---------+ | COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN | +--------------------------+--------------------+-----+------------+-------------+---------+ | utf8_general_ci | utf8 | 33 | Yes | Yes | 1 | | utf8_bin | utf8 | 83 | | Yes | 1 | | utf8_unicode_ci | utf8 | 192 | | Yes | 8 | 3. Naming conventions for character sequences The character sequences are named with their corresponding character set as prefix, as shown below. For example, the character sequence utf8_general_ci indicates that it is the character sequence of the character set utf8. For more rules, please refer to the official documentation. MariaDB [information_schema]> SELECT CHARACTER_SET_NAME, COLLATION_NAME FROM COLLATIONS WHERE CHARACTER_SET_NAME="utf8" limit 2; +--------------------+-----------------+ | CHARACTER_SET_NAME | COLLATION_NAME | +--------------------+-----------------+ | utf8 | utf8_general_ci | | utf8 | utf8_bin | +--------------------+-----------------+ 2 rows in set (0.00 sec) 4. Server character set and character sequencePurpose: When you create a database and do not specify a character set or character order, the server character set and server character order will be used as the default character set and sorting rule for the database. How to specify: You can specify it through command line parameters when starting the MySQL service. It can also be specified via variables in the configuration file. Server default character set and character order: specified through compilation parameters when MySQL is compiled. character_set_server and collation_server correspond to the server character set and server character sequence respectively. 1. Check the server character set and character sequence They correspond to the character_set_server and collation_server system variables respectively. mysql> SHOW VARIABLES LIKE "character_set_server"; mysql> SHOW VARIABLES LIKE "collation_server"; 2. Specify when starting the service You can specify the server character set and character sequence when starting the MySQL service. If not specified, the default character order is latin1, latin1_swedish_ci mysqld --character-set-server=latin1 \ --collation-server=latin1_swedish_ci Specify the server character set separately. In this case, the server character sequence is the default character sequence of latin1, latin1_swedish_ci. mysqld --character-set-server=latin1 3. Configuration file specification In addition to specifying it in the command line parameters, you can also specify it in the configuration file, as shown below. [client] default-character-set=utf8 [mysql] default-character-set=utf8 [mysqld] collation-server = utf8_unicode_ci init-connect='SET NAMES utf8' character-set-server = utf8 4. Runtime modification Example: Modification during runtime (will become invalid after restart. If you want to keep it unchanged after restart, you need to write it into the configuration file) mysql> SET character_set_server = utf8; 5. Specify the default character set and character sequence when compiling The default values of character_set_server and collation_server can be specified through compilation options when MySQL is compiled: cmake . -DDEFAULT_CHARSET=latin1 \ -DDEFAULT_COLLATION=latin1_german1_ci 5. Database character set and character sequencePurpose: To specify the character set and character sequence at the database level. Databases under the same MySQL service can specify different character sets/character sequences. 1. Set the character set/character sequence of the data When creating or modifying a database, you can specify the character set and collation of the database through CHARACTER SET and COLLATE. Create the database: CREATE DATABASE db_name [[DEFAULT] CHARACTER SET charset_name] [[DEFAULT] COLLATE collation_name] Modify the database: ALTER DATABASE db_name [[DEFAULT] CHARACTER SET charset_name] [[DEFAULT] COLLATE collation_name] Example: Create a database test_schema and set the character set to utf8. The default collation is utf8_general_ci. CREATE DATABASE `test_schema` DEFAULT CHARACTER SET utf8; 2. Check the character set/character sequence of the database There are 3 ways to view the character set/character order of the database. Example 1: Check the character set and collation of test_schema. (Need to switch the default database) mysql> use test_schema; Database changed mysql> SELECT @@character_set_database, @@collation_database; +--------------------------+----------------------+ | @@character_set_database | @@collation_database | +--------------------------+----------------------+ | utf8 | utf8_general_ci | +--------------------------+----------------------+ 1 row in set (0.00 sec) Example 2: You can also use the following command to view the character set and database of test_schema (no need to switch the default database) mysql> SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE schema_name="test_schema"; +-------------+----------------------------+------------------------+ | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | +-------------+----------------------------+------------------------+ | test_schema | utf8 | utf8_general_ci | +-------------+----------------------------+------------------------+ 1 row in set (0.00 sec) Example 3: You can also view the character set by viewing the statement for creating the database. mysql> SHOW CREATE DATABASE test_schema; +-------------+---------------------------------------------------------------------+ | Database | Create Database | +-------------+---------------------------------------------------------------------+ | test_schema | CREATE DATABASE `test_schema` /*!40100 DEFAULT CHARACTER SET utf8 */ | +-------------+---------------------------------------------------------------------+ 1 row in set (0.00 sec) 3. How are database character sets and character sequences determined? When creating a database, if CHARACTER SET or COLLATE is specified, the corresponding character set and collation will prevail. 6. Table character set and character sequenceThe syntax for creating and modifying a table is as follows. The character set and character sequence can be set through CHARACTER SET and COLLATE. CREATE TABLE tbl_name (column_list) [[DEFAULT] CHARACTER SET charset_name] [COLLATE collation_name]] ALTER TABLE tbl_name [[DEFAULT] CHARACTER SET charset_name] [COLLATE collation_name] 1. Create a table and specify the character set/character sequence The following example specifies the character set as utf8 and the character sequence as the default. CREATE TABLE `test_schema`.`test_table` ( `id` INT NOT NULL COMMENT '', PRIMARY KEY (`id`) COMMENT '') DEFAULT CHARACTER SET = utf8; 2. Check the character set/character sequence of the table Similarly, there are 3 ways to view the character set/character order of a table. Method 1: Use SHOW TABLE STATUS to view the table status. Note that the Collation is utf8_general_ci and the corresponding character set is utf8. MariaDB [blog]> SHOW TABLE STATUS FROM test_schema \G; *************************** 1. row *************************** Name: test_table Engine: InnoDB Version: 10 Row_format: Compact Rows: 0 Avg_row_length: 0 Data_length: 16384 Max_data_length: 0 Index_length: 0 Data_free: 11534336 Auto_increment: NULL Create_time: 2018-01-09 16:10:42 Update_time: NULL Check_time: NULL Collation: utf8_general_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) Method 2: View the information of information_schema.TABLES. mysql> USE test_schema; mysql> SELECT TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA = "test_schema" AND TABLE_NAME = "test_table"; +-----------------+ | TABLE_COLLATION | +-----------------+ |utf8_general_ci| +-----------------+ Method 3: Confirm through SHOW CREATE TABLE. mysql> SHOW CREATE TABLE test_table; +------------+----------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +------------+----------------------------------------------------------------------------------------------------------------+ | test_table | CREATE TABLE `test_table` ( `id` int(11) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +------------+----------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) 3. How to determine the table character set and character sequence Assume that the values of CHARACTER SET and COLLATE are charset_name and collation_name respectively. If you create a table: If charset_name and collation_name are specified, use them. If only collation_name is specified but charset_name is not, the character sequence uses collation_name and the character set uses the character set associated with collation_name. If charset_name and collation_name are not specified, the character set and collation settings of the database are used. 7. Column character set and sortingFor columns of type CHAR, VARCHAR, or TEXT, you can specify a character set or character sequence. The syntax is as follows: col_name {CHAR | VARCHAR | TEXT} (col_length) [CHARACTER SET charset_name] [COLLATE collation_name] 1. Add a new column and specify the character set/collation rules The example is as follows: (similar to creating a table) mysql> ALTER TABLE test_table ADD COLUMN char_column VARCHAR(25) CHARACTER SET utf8; 2. Check the character set/character sequence of the column Here are some examples: mysql> SELECT CHARACTER_SET_NAME, COLLATION_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA="test_schema" AND TABLE_NAME="test_table" AND COLUMN_NAME="char_column"; +--------------------+-----------------+ | CHARACTER_SET_NAME | COLLATION_NAME | +--------------------+-----------------+ | utf8 | utf8_general_ci | +--------------------+-----------------+ 1 row in set (0.00 sec) 3. Determine the column character set/collation rules Assume that the values of CHARACTER SET and COLLATE are charset_name and collation_name respectively: If charset_name and collation_name are both clear, the character set and character order are based on charset_name and collation_name. If only charset_name is specified and collation_name is not specified, the character set is charset_name and the character sequence is the default character sequence of charset_name. If only collation_name is specified and charset_name is not specified, the character sequence is the collation_name and the character set is the character set associated with collation_name. If charset_name and collation_name are not specified, the character set and character sequence of the table shall prevail. 8. Choice: When to set character set and character sequenceGenerally speaking, it can be configured in three places: Configured when creating a database. The mysql server is configured when it is started. When compiling MySQL from source code, configure it through compilation parameters 1. Method 1: Configure when creating a database This method is more flexible and safer, and it does not depend on the default character set/character sequence. When you create a database, you specify the character set/character order. When you create tables and columns later, if you do not specify otherwise, the character set/character order of the corresponding database will be inherited. CREATE DATABASE mydb DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; 2. Method 2: Configure when MySQL server starts You can add the following configuration so that when MySQL server starts, character-set-server and collation-server will be configured. When you create a database/table/column through the mysql client and do not explicitly declare the character set/character order, the character-set-server/collation-server will be used as the default character set/character order. In addition, the character set/character order when the client and server connect still needs to be set through SET NAMES. [mysqld] character-set-server=utf8 collation-server=utf8_general_ci 3. Method 3: When compiling MySQL from source code, set it through compilation parameters If -DDEFAULT_CHARSET and -DDEFAULT_COLLATION are specified during compilation, then: When creating a database or table, it will be used as the default character set/character sequence. When the client connects to the server, it uses it as the default character set/character order. (No need to set NAMES separately) shell> cmake . -DDEFAULT_CHARSET=utf8 \ -DDEFAULT_COLLATION=utf8_general_ci 9. Write at the endThis article introduces the character set and character sequence related content in MySQL in detail. This part mainly focuses on the storage and comparison of data. In fact, there is still a very important part that has not been covered: the character set and character sequence settings for the connection. There are also many garbled code problems caused by improper setting of the connection character set and character order. There is a lot of content to expand on this part, so it will be explained in the next article. 10. Related links10.1 Character Set Support https://dev.mysql.com/doc/refman/5.7/en/charset.html Summarize The above is what I introduced to you. I hope it will be helpful to you. If you have any questions, please leave me a message and I will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website! You may also be interested in:
|
<<: How to use Nginx proxy to surf the Internet
>>: In-depth understanding of Worker threads in Node.js
Where is my hometown when I look northwest? How m...
Copy code The code is as follows: <thead> &...
Table of contents 1. Discover the problem 2. Dele...
Table of contents Docker container data volume Us...
introduce Vue Router is the official routing mana...
MySQL 8.0: MVCC for Large Objects in InnoDB In th...
Sometimes when requesting certain interfaces, you...
It is troublesome to install the db2 database dir...
Preface Components are something we use very ofte...
Mainly from two aspects: 1. Highlight/Line Break ...
I have installed various images under virtual mac...
Table of contents 1. Open the project directory o...
Preface Since I needed to install Zookeeper durin...
In the vertical direction, you can set the row al...
/******************** * Virtual File System VFS *...