Preface

I always thought UTF-8 was a universal solution to character set problems, until I ran into this issue recently. I was working on a crawler for Sina Weibo, and when saving data I found that whenever the text contained emoji, the following exception was thrown:

    Incorrect string value: '\xF0\x90\x8D\x83\xF0\x90...'

MySQL's utf8 character set stores at most 3 bytes per character, which covers most of the characters we see in daily life. But 3 bytes are far from enough to hold every character, so utf8mb4 was created. utf8mb4 is a superset of utf8 that uses up to 4 bytes per character and is backward compatible with utf8. The emoji we use every day take 4 bytes, so inserting them into a utf8 table raises the error above.

A quick Google search turned up the solution. The specific steps are as follows:

1. Change the character set of the data table to utf8mb4

This is simple, and plenty of modification statements can be found online. However, it is recommended to rebuild the table instead.

2. MySQL version must be 5.5.3 or later

All the articles on the Internet state that only MySQL 5.5.3 or later supports utf8mb4. The database I used was version 5.5.18, and I was able to solve the problem in the end, so don't rush to ask the operations team to upgrade the database. Try to solve the problem yourself first.

3. Modify the database configuration file /etc/my.cnf and restart the mysql service

The main purpose is to change the database's default character set, as well as the character sets used for connections and queries. [MySQL supports emoji emoticons and upgrades the encoding to UTF8MB4][1] describes the settings in detail, and [In-depth MySQL character set settings][2] explains what each character set setting does. You can read them for yourself.

4. Upgrade MySQL Connector/J to 5.1.21 or later

Of all the steps above, the most critical is step 3, modifying the database configuration file, which roughly amounts to the following changes:

    [client]
    # Default character set for client programs
    default-character-set = utf8mb4

    [mysqld]
    # Default character set of the server
    character-set-server = utf8mb4
    # Default collation of the server
    collation-server = utf8mb4_unicode_ci

    [mysql]
    # Default character set for the mysql command-line client
    default-character-set = utf8mb4

These settings specify the character set used at each stage the data passes through on its way from the client to the server. A mismatch at any stage may cause the insert to fail or produce garbled characters.

In many cases, though, the configuration of a production database cannot be changed at will, and our operations colleagues decisively rejected my request to modify the configuration file (T_T). So the only option left was to solve it in code.

At first, I planned to start with the character set specified in the JDBC connection string:

    jdbc:mysql://localhost:3306/ding?characterEncoding=UTF-8

Surely changing the Java-style charset string from UTF-8 to utf8mb4 would solve the problem, right? Unfortunately, Java has no charset name corresponding to utf8mb4. With UTF-8, the driver is already compatible with utf8mb4 and converts the character set automatically:

"For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting."
Source: [MySQL: Using Character Sets and Unicode][3]

Later, I learned that the character set can be specified explicitly for each query request.
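Before applying any of these workarounds, it can help to confirm that the failing text really contains 4-byte characters. The following is a minimal sketch for that check. The class name `Utf8Width` and its two methods are hypothetical helpers written for illustration; they are not part of MySQL, Connector/J, or the original project. It relies only on the fact that code points outside the Basic Multilingual Plane take 4 bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any MySQL or JDBC library.
final class Utf8Width {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if any code point lies outside the Basic Multilingual Plane.
    // Such code points need 4 bytes in UTF-8, so a 3-byte utf8 column rejects them.
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // U+10343 encodes to F0 90 8D 83, the bytes shown in the error message.
        String gothic = new String(Character.toChars(0x10343));
        // U+1F600, a common emoji.
        String emoji = new String(Character.toChars(0x1F600));
        // Two ordinary Chinese characters (U+5FAE U+535A), 3 bytes each.
        String chinese = "\u5fae\u535a";

        System.out.println(utf8Length(gothic) + " " + needsUtf8mb4(gothic));   // 4 true
        System.out.println(utf8Length(emoji) + " " + needsUtf8mb4(emoji));     // 4 true
        System.out.println(utf8Length(chinese) + " " + needsUtf8mb4(chinese)); // 6 false
    }
}
```

A crawler could run a check like `needsUtf8mb4` on each message before saving it, for example to log which records would be rejected by a utf8 table.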
The current solution is to explicitly execute a set names statement before the actual query:

    jdbcTemplate.execute("set names utf8mb4");
    jdbcTemplate.execute("...");

Note that when an ORM framework is used, the framework delays commits for performance reasons, so the statement does not take effect until the transaction ends or the user explicitly forces a commit. Here I use MyBatis, taking MessageDao as an example:

    // MessageDao
    import org.apache.ibatis.annotations.Insert;
    import org.apache.ibatis.annotations.Update;

    public interface MessageDao {
        // Switch the character set of the current connection to utf8mb4
        @Update("set names utf8mb4")
        public void setCharsetToUtf8mb4();

        @Insert("insert into tb_message ......")
        public void insert(Message msg);
    }

    // test code
    SqlSession sqlSession = sqlSessionFactory.openSession();
    MessageDao messageDao = sqlSession.getMapper(MessageDao.class);
    messageDao.setCharsetToUtf8mb4();
    // Force commit
    sqlSession.commit();
    messageDao.insert(message);

At this point, the problem is solved. Alas, if only things went that smoothly. In the actual project, the MyBatis instance is managed by Spring, which means I cannot get hold of the sqlSession and therefore cannot force a commit. On top of that, the Spring transaction framework does not allow users to call a forced commit explicitly. I'm still struggling with this issue. There are two solutions:
Summary

That is the entire content of this article. I will experiment with the two solutions above when I have time. I hope this article is of some help to your study or work. If you have any questions, you can leave a message to discuss.