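In HTML, you need to declare the character encoding used by the web page. The general way to declare it is a meta element in the page head of the form <meta http-equiv="Content-Type" content="text/html; charset=...">, where the charset value names the encoding.
In HTML5, you can also use a shorter form, a meta element that carries only a charset attribute: <meta charset="...">.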
Because many languages and scripts are in use around the world, the Unicode standard was developed to meet the needs of cross-language, cross-platform text interchange and processing. It was first published in 1991 and has been revised continuously since then. Its code space contains 1,114,112 code points (U+0000 to U+10FFFF) and is intended to cover every writing system, including historic scripts. Storing every code point at a fixed width, however, takes 32 bits (4 bytes) per character, as in UTF-32, which uses a relatively large amount of storage: even common characters such as those in ASCII occupy four bytes, so memory efficiency is low.
UTF-8, a variable-width encoding format built on 8-bit code units, was defined to address this. In UTF-8, commonly used characters take fewer bytes and less commonly used characters take more, which uses space more efficiently. ASCII characters are still encoded in a single byte. The high bits of each byte mark whether it is a single-byte character, the lead byte of a multi-byte sequence, or a continuation byte, which makes UTF-8 a bridge between ASCII and Unicode. The encoding scheme is:
U+0000 to U+007F: 0xxxxxxx
U+0080 to U+07FF: 110xxxxx 10xxxxxx
U+0800 to U+FFFF: 1110xxxx 10xxxxxx 10xxxxxx
U+10000 to U+10FFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
In the past, the most commonly used encoding for Chinese characters in computers was GB2312, released in 1980. Its full name is "Chinese Character Coded Character Set for Information Interchange - Basic Set". It uses two bytes per Chinese character, contains 6763 Chinese characters and 682 non-Chinese graphic characters, and is compatible with the ASCII character set. However, it covers relatively few Chinese characters: it cannot represent the traditional characters used in Hong Kong and Taiwan, nor many uncommon characters and characters found in ancient books, which causes considerable inconvenience in practice.
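The byte counts discussed above can be checked with a short Python sketch. It relies only on Python's standard codecs ("utf-32-be", "utf-8", "gb2312"); the helper names encoded_size and utf8_bits and the sample characters are chosen here purely for illustration.

```python
def encoded_size(ch, codec):
    """Return how many bytes `ch` occupies in `codec`, or None if it cannot be encoded."""
    try:
        return len(ch.encode(codec))
    except UnicodeEncodeError:
        return None


def utf8_bits(ch):
    """Return the UTF-8 bytes of `ch` as space-separated 8-bit binary strings."""
    return " ".join(f"{b:08b}" for b in ch.encode("utf-8"))


# "A" is ASCII, "中" is a common simplified character,
# "龍" is a traditional character that GB2312 does not cover.
for ch in ["A", "中", "龍"]:
    sizes = {c: encoded_size(ch, c) for c in ("utf-32-be", "utf-8", "gb2312")}
    print(ch, sizes, utf8_bits(ch))
```

The "utf-32-be" codec is used instead of "utf-32" so that no byte order mark is counted. A size of None for "gb2312" marks a character outside that character set.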
Later, GB2312 was extended to form the GBK encoding, which can also represent traditional Chinese characters and some variant characters, widening its range of use. To suit an even wider range of applications, the GB18030 encoding standard was released. GB18030-2000 includes 27,533 Chinese characters, and GB18030-2005 includes 70,244 Chinese characters as well as the scripts of minority languages such as Tibetan, Mongolian, Dai, Yi, Korean and Uyghur. The total encoding space of GB18030 exceeds 1.5 million code positions.
GB18030 encodes characters in one, two or four bytes. The single-byte part follows the encoding structure and rules of GB/T 11383 and uses code positions 0x00 to 0x7F, which correspond to the same positions in ASCII. In the double-byte part, the first byte ranges from 0x81 to 0xFE and the second byte ranges from 0x40 to 0x7E or from 0x80 to 0xFE. The four-byte part extends the double-byte encoding by using the values 0x30 to 0x39, which GB/T 11383 does not use, as the second and fourth bytes, giving a four-byte range of 0x81308130 to 0xFE39FE39.
GB18030 is still being extended. To represent more Chinese characters and special symbols, and for better compatibility in the future, it is best to use GB18030 for newly created web pages, that is, to declare the encoding in one of the following two ways: <meta http-equiv="Content-Type" content="text/html; charset=GB18030"> or, in HTML5, <meta charset="GB18030">.
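The GB18030 byte ranges given above are enough to tell where each character starts and ends in a byte stream. The following is a minimal Python sketch of that idea, not a full validator; the function name split_gb18030 and the sample text are assumptions made for illustration, and Python's built-in "gb18030" codec is used only to produce and check the bytes.

```python
def split_gb18030(data: bytes) -> list:
    """Split GB18030 bytes into one byte sequence per character, using only the byte ranges."""
    parts = []
    i = 0
    while i < len(data):
        first = data[i]
        if first <= 0x7F:
            length = 1                      # single-byte (ASCII) part
        elif 0x81 <= first <= 0xFE and i + 1 < len(data):
            second = data[i + 1]
            if 0x30 <= second <= 0x39:
                length = 4                  # four-byte part
            elif 0x40 <= second <= 0xFE and second != 0x7F:
                length = 2                  # double-byte part
            else:
                raise ValueError(f"invalid second byte 0x{second:02X} at offset {i + 1}")
        else:
            raise ValueError(f"invalid or truncated sequence at offset {i}")
        if i + length > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        parts.append(data[i:i + length])
        i += length
    return parts


# One ASCII letter, one common Chinese character, and one character
# outside the Basic Multilingual Plane (U+20000).
text = "A中\U00020000"
for ch, seq in zip(text, split_gb18030(text.encode("gb18030"))):
    print(f"U+{ord(ch):04X}", len(seq), seq.hex())
```

Only the first two bytes are inspected to decide the length, which mirrors how the standard separates the double-byte and four-byte parts by the value of the second byte.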
Of course, to make it easier to display text in other languages, you can also use the internationally accepted UTF-8 encoding.