Introduction to using Unicode characters in web pages (&#,\u, etc.)

The earliest computers could only handle ASCII characters. As computing spread, many countries designed their own character sets so that their scripts could be displayed and processed on computers, such as China's GB2312. Once the Internet connected the whole world, it became a practical need to display text from many countries and languages on a single computer, or even in a single interface. To meet that need, the Unicode Consortium developed a character encoding scheme intended to accommodate all of the world's text and symbols. It is called Unicode, and its character repertoire is kept synchronized with the Universal Character Set (UCS) defined in ISO/IEC 10646. It supports cross-language and cross-platform text exchange and processing. Since its first release in 1991 it has been continuously expanded, and it had reached Version 10 at the time of writing.

You can visit https://www.unicode.org/ for detailed information, including downloading the latest version of the code table.

When designing web pages, you can use the Unicode character set. There are different ways to use it depending on whether you are using it in HTML, CSS, or JavaScript.

1) Use in HTML: &#dddd; or &#xhhhh;

Here dddd is the character's code in decimal and hhhh is the same code in hexadecimal. Four digits are typical, but the number of digits is not fixed. The prefixes &# and &#x mark a decimal or a hexadecimal Unicode code respectively, and both forms must end with a semicolon. Characters with 4-digit hexadecimal codes are generally well supported and most of them display correctly on web pages, but many other Unicode characters often fail to display. This usually happens because no font installed on the platform contains a glyph for the character. Example:
<p>Display Unicode character --&#x2230;</p>
This displays the mathematical symbol ∰ (volume integral), whose Unicode code is 2230 in hexadecimal. Either “&#x2230;” or “&#8752;” outputs this character on the page.
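The relationship between the two forms can be checked with a few lines of JavaScript. This is only a minimal sketch: toNumericRefs is a hypothetical helper name introduced for illustration, not a built-in function.

```javascript
// toNumericRefs is a hypothetical helper (not a built-in): it returns both
// numeric character reference forms for the first character of a string.
function toNumericRefs(ch) {
  const codePoint = ch.codePointAt(0); // numeric Unicode code of the character
  return {
    decimal: "&#" + codePoint + ";",
    hex: "&#x" + codePoint.toString(16).toUpperCase() + ";"
  };
}

const refs = toNumericRefs("\u2230"); // the volume integral symbol
console.log(refs.decimal); // &#8752;
console.log(refs.hex);     // &#x2230;
```

Both strings refer to the same character, so either one can be pasted into HTML source.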

2) Use in CSS: \hhhh

Unicode escapes are less common in CSS, but they do appear, typically in the content property of generated content. A character is written as a backslash followed by its hexadecimal Unicode code, usually 4 digits (CSS accepts 1 to 6).
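To stay with one language, the sketch below builds such a CSS rule as a string in JavaScript. The helper cssEscape and the class name .note are assumptions made up for this example. The trailing space ends the escape, so a following character that happens to be a hex digit is not absorbed into it.

```javascript
// cssEscape is a hypothetical helper: it formats a code point as a CSS escape.
function cssEscape(codePoint) {
  // The trailing space terminates the escape sequence in CSS.
  return "\\" + codePoint.toString(16).toUpperCase() + " ";
}

// ".note" is an illustrative class name, not something from the article.
const rule = '.note::before { content: "' + cssEscape(0x2230) + '"; }';
console.log(rule); // .note::before { content: "\2230 "; }
```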

3) Use in JavaScript: \uhhhh

Special characters are often needed in JavaScript code, for example to output temperature or angle symbols, Greek letters, or Roman numerals in an element. Just add the prefix "\u" in front of the 4-digit hexadecimal Unicode code. Example:

document.body.innerHTML="\u25D0";

The code used here is 25D0. In the Geometric Shapes block it is a circle whose left half is black and whose right half is white, like a half moon.
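A few more escapes of the kinds mentioned above, as a small sketch that runs without a page. The variable names are illustrative. The last line uses the \u{...} form from ES2015, which is needed for codes longer than 4 hexadecimal digits.

```javascript
const degree = "\u00B0";     // DEGREE SIGN
const alpha = "\u03B1";      // GREEK SMALL LETTER ALPHA
const romanFour = "\u2163";  // ROMAN NUMERAL FOUR
const halfCircle = "\u25D0"; // CIRCLE WITH LEFT HALF BLACK

const label = "25" + degree + "C, angle " + alpha + " = 90" + degree;
console.log(label); // 25°C, angle α = 90°

// Codes above FFFF do not fit in \uhhhh; ES2015 adds the \u{...} form.
const grinning = "\u{1F600}";
console.log(grinning.length); // 2 (stored as two UTF-16 code units)
```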

For Chinese users, the most common use of Unicode is of course Chinese characters. To cover more of them, the Chinese national character set was expanded from GB2312 to GBK and then to GB18030. The latest version of GB18030 includes more than 70,000 Chinese characters, the scripts of various minority languages, and some special characters. The standard can represent every Unicode character, although it uses its own byte encoding. Some computers may lack fonts or software that fully support the newer version, so they can display only part of these characters.

To get the Unicode code of a Chinese character, you can use the JavaScript function charCodeAt(), for example:

var ucode="赵".charCodeAt();

In this way, the Unicode code of the Chinese character "赵" is stored in the variable ucode, and the Unicode code obtained is 36213, which is a decimal Unicode code. You can use the toString(16) method to convert this decimal code to hexadecimal code:

var ucode="赵".charCodeAt().toString(16);

This gives us the hexadecimal Unicode code for the Chinese character "赵", and the value is 8d75.
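One caveat: charCodeAt() returns a 16-bit UTF-16 code unit, so for rarer Chinese characters whose code is above FFFF it returns only the first half of a surrogate pair. The sketch below compares it with codePointAt(), using the character with code 20BB7 as an assumed example of such a character.

```javascript
const zhao = "\u8D75"; // the character "赵"
console.log(zhao.charCodeAt(0));              // 36213
console.log(zhao.charCodeAt(0).toString(16)); // 8d75

// A character outside the 4-digit range (chosen here only as an example).
const rare = "\u{20BB7}";
console.log(rare.charCodeAt(0).toString(16));  // d842 (first surrogate only)
console.log(rare.codePointAt(0).toString(16)); // 20bb7 (the full code)
```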

Strings that contain Chinese characters can normally be output directly. You can also produce a Chinese character, or any other character, from its Unicode code:

String.fromCharCode(36213);

This returns a one-character string for the decimal Unicode code 36213, and outputting that string displays the Chinese character "赵". Since Chinese characters can be typed directly with an input method, this approach is mostly useful for special characters that are hard to type.
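String.fromCharCode() accepts several codes at once, but each one is truncated to 16 bits. For codes above FFFF, String.fromCodePoint() is the safer choice. A short sketch, with variable names chosen for illustration:

```javascript
const zhaoFromCode = String.fromCharCode(36213);
console.log(zhaoFromCode === "\u8D75"); // true

// Several codes can be passed in one call.
const symbols = String.fromCharCode(0x2230, 0x25D0);
console.log(symbols.length); // 2

// fromCharCode keeps only the low 16 bits, so 0x20BB7 becomes 0x0BB7.
const truncated = String.fromCharCode(0x20BB7);
console.log(truncated === "\u0BB7"); // true

// fromCodePoint handles the full range.
const rareChar = String.fromCodePoint(0x20BB7);
console.log(rareChar.codePointAt(0).toString(16)); // 20bb7
```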

Finding the &# code of a character

The &# notation uses the character's Unicode code. One way to look it up is as follows:

For example, to encode "杨", create a new file in Notepad, type "杨", save it with the "Unicode" encoding (UTF-16 little-endian), and then view the file in binary. The first two bytes, FF FE, are the byte order mark that identifies the file as Unicode. The next two bytes, 68 67, are the code of "杨" with the low byte first, so the code is 6768 in hexadecimal. Converting it with a calculator gives 26472 in decimal. Now you can write "&#26472;" in an HTML file, and when IE opens it, it will display the character "杨".

For ordinary ASCII characters, the Unicode code is the same as the ASCII code, so &#65; displays the uppercase letter "A".
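The byte arithmetic from the Notepad example can be reproduced in JavaScript. This is a sketch that assumes the four bytes described above, typed in by hand rather than read from a real file:

```javascript
// Bytes of the Notepad file: FF FE (byte order mark), then 68 67.
const bytes = [0xFF, 0xFE, 0x68, 0x67];

// FF FE at the start marks the file as little-endian.
const hasBom = bytes[0] === 0xFF && bytes[1] === 0xFE;

// Little-endian: the low byte comes first, so 68 67 means 0x6768.
const codeUnit = bytes[2] | (bytes[3] << 8);

console.log(hasBom);                // true
console.log(codeUnit.toString(16)); // 6768
console.log(codeUnit);              // 26472
console.log("&#" + codeUnit + ";"); // &#26472;
```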

Convert the &# encoding into characters

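function uncode(str) {
    // Match hexadecimal (&#xhhhh;) or decimal (&#dddd;) numeric character references.
    return str.replace(/&#(?:x([0-9a-f]{1,6})|([0-9]{1,7}));/gi, function (match, hex, dec) {
        var code = hex ? parseInt(hex, 16) : parseInt(dec, 10);
        // Leave out-of-range values untouched; fromCodePoint also handles codes above FFFF.
        return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    });
}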

Convert characters to &# encoding

function encode(str) {
    // Array.from splits the string by code point, so characters above FFFF stay whole.
    return Array.from(str, function (ch) {
        return "&#" + ch.codePointAt(0) + ";";
    }).join("");
}
