How to output Chinese characters in Linux kernel

You can easily type Chinese and see Chinese output in an SSH terminal on Windows or macOS that is logged in to a Linux machine, as in the following:

[Image: Chinese input and output in an SSH terminal]

But it is almost impossible to display Chinese on Linux's own virtual terminal:

[root@localhost font]# echo 皮鞋 >/dev/tty2

[Image: tty2 showing two question marks]

Two question marks are displayed instead of the two characters, so the Linux kernel evidently cannot recognize Chinese.

Why blame the Linux kernel? There is a distinction that needs to be made clear here:

  • Input and display output on a remote SSH terminal are handled entirely by the machine hosting the terminal program, such as Windows or macOS, and have nothing to do with Linux.
  • Input and display output on a local Linux virtual terminal, such as /dev/tty1, are handled by the Linux kernel itself.

For example, I use iTerm on macOS to SSH into a remote CentOS Linux machine. All keyboard input and display output in iTerm are handled by macOS, the host of iTerm.

By contrast, if you type directly on the virtual terminal of that CentOS machine and expect output there, the input and output must be handled by the Linux kernel itself.

That's basically it. To understand why the Linux kernel does not support Chinese, you would need to follow how the kernel treats Unicode when processing virtual terminal input and output. That involves a lot of theory and is tedious.

In any case, Chinese simply cannot be output here, and I am not going to dig into why. This is not a work task that has to be completed, so I'm just playing around.

The goal of this article is to make a Linux virtual terminal output Chinese.

Any Chinese will do, even a single character. Specifically, when I press the 'A' key on the keyboard, a Chinese character appears on the monitor.

This article therefore does not try to make the Linux kernel fully support Chinese. Many people and communities have already done that, and it is not much fun: it is the kind of work you take on as a paid side job, and paid work has to be done fast, which leaves little room for play.

You don't need to understand lengthy and boring Unicode encodings or font formats. Just watch how it is done.

Let's look at the result first. Here is an 8×16 dot matrix example:

[Image: Chinese characters rendered in an 8×16 dot matrix]

It didn't look good, so I made the following 28×16 dot matrix:

[Image: Chinese characters rendered in a 28×16 dot matrix]

Here's how this is accomplished.

From the moment you press a key to the moment a character appears on the virtual terminal's monitor, two mappings are applied:

Keyboard to character set mapping

Converts a key event to a code in a character set. For example, when the 'A' key is pressed, it is mapped to 0x41.

Character set to font mapping

Maps a code in the character set to a dot matrix for display. For example, 0x41 is mapped to an 8×16 dot matrix that looks like an 'A'.

The Linux console cannot recognize character set codes above 0x00ff, so it cannot process Unicode code points above 0x00ff. If you want it to, you have to change the kernel code.

As I said earlier, modifying the kernel to fully support Chinese is paid work: it is boring, and nobody shares the result.

So I tried to solve the problem by modifying one of the two mappings above. Since the goal is only display, I will not touch the keyboard to character set mapping, because that would still run into the problem of handling codes above 0x00ff.

This leaves only one way to display Chinese: modify the character set to font mapping!

This mapping must be stored somewhere in kernel memory or in the file system. The current kernel's config file contains the following:

[root@localhost font]# cat /boot/config-3.10.0-862.11.6.el7.x86_64 |grep FONT
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y

Let's see what's in /proc/kallsyms:

[root@localhost font]# cat /proc/kallsyms |grep font.*8x
ffffffffb006a3e0 R font_vga_8x8
ffffffffb006a420 r fontdata_8x8
ffffffffb006ac20 R font_vga_8x16
ffffffffb006ac60 r fontdata_8x16
ffffffffb0307a10 r __ksymtab_font_vga_8x16
ffffffffb03234b8 r __kcrctab_font_vga_8x16
ffffffffb034246e r __kstrtab_font_vga_8x16

These are the fonts built into the kernel. Their source files are:

[root@localhost rh]# ll ./drivers/video/console/font_8x*
-rw-r--r--. 1 root root 95976 Sep 17 2018 ./drivers/video/console/font_8x16.c
-rw-r--r--. 1 root root 50858 Sep 17 2018 ./drivers/video/console/font_8x8.c

These two files will not be analyzed here. They just confirm that the kernel uses its own built-in font when it initializes. After all, at that point nothing exists except the kernel itself.

However, once user space is up, the font can be changed to something fancier, and those fonts go well beyond the two built-in 8x8 and 8x16 ones.

So we need to find the font file installed by the distribution, change the shape of some glyph inside it, and turn that glyph into a Chinese character. It's that simple.

You don't have to hunt for where the font file is installed. Running setfont under strace reveals it:

[root@localhost ~]# strace -F -e trace=open setfont
...
strace: Process 6276 attached
[pid 6276] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
...
[pid 6276] open("/lib/kbd/consolefonts/default8x16.psfu.gz", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 4
[pid 6276] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6276, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++

There it is: /lib/kbd/consolefonts/default8x16.psfu.gz

There is no need to look up the psfu file format either; specific characters can be located by pattern matching.

I plan to find 'A' first, and then change the 'B' and 'C' that follow it into the characters of my name, "赵" (Zhao) and "亚" (Ya).

First, I need to draw "赵" and "亚" as dot matrices. Here is my "赵":

00000000
00000000
00100000
11111000
00100101 
00100101
11111010
00100011 
00111010 
01100101 
01100000
10011000
10000111
00000000
00000000
00000000 

[Image: the 8×16 dot matrix for "赵"]

Next, we will use this dot matrix to replace the dot matrix of 'B', and draw a "亚" to replace the dot matrix of 'C'.

The glyph chart of the default font can be found at:
https://www.zap.org.au/software/fonts/console-fonts-distributed/psftx-centos-7.5/default8x16.psfu.large.pdf

[Image: glyph chart of default8x16.psfu]

From it we can read off the dot matrix of 'A' and then search for those bytes in the decompressed default8x16.psfu file. The code is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

/* Two 8x16 glyphs, one byte per row, 16 rows each */
unsigned char zhaoya[32] = {
			// The first row is "赵"
			0x00, 0x00, 0x20, 0xf8, 0x25, 0x25, 0xfa, 0x23, 0x3a, 0x65, 0x60, 0x98, 0x87, 0x00, 0x00, 0x00,
			// The second row is "亚"
			0x00, 0x00, 0x00, 0x7e, 0x24, 0x24, 0x24, 0xa5, 0xa5, 0x66, 0x24, 0x24, 0x7e, 0x00, 0x00, 0x00
};


int main(int argc, char **argv)
{
	unsigned char buf[16];
	off_t offset = 0;
	int s = 0;	// 0: looking for 'A', 1: next glyph is 'B', 2: next glyph is 'C'

	int fd = open("default8x16.psfu", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	// Walk the file in 16-byte steps and stop at EOF instead of looping forever
	while (pread(fd, buf, 16, offset) == 16) {
		if (s == 2) { // Replace 'C'
			pwrite(fd, &zhaoya[16], 16, offset);
			break;
		}
		if (s == 1) { // Replace 'B'
			pwrite(fd, &zhaoya[0], 16, offset);
			s = 2;
		} else if (buf[0] == 0x00 && buf[1] == 0x00 &&
			   buf[2] == 0x10 && buf[3] == 0x38) {
			// Simple method to identify 'A': match its first four rows
			printf("A found at %lld !\n", (long long)offset);
			s = 1;
		}
		offset += 16;
	}
	close(fd);
	return 0;
}

Compile and run it, then load the modified default8x16.psfu into the console with setfont:

[root@localhost font]# setfont ./default8x16.psfu

Now switch to the Linux virtual terminal tty2. When you type the capital letter 'B' on the keyboard, the character "赵" appears.

Although 8×16 or even 8×8 dot matrices can render complex Chinese characters, the result is too ugly.

So I went looking for a higher-resolution font and found a 28×16 one on Ubuntu, Lat7-VGA28x16.psf.gz. It is modified in exactly the same way as before, and its glyph chart is here:
https://www.zap.org.au/software/fonts/console-fonts-distributed/psftx-debian-9.4/Lat7-VGA28x16.psf.pdf

I don't need to draw the larger glyphs myself; I just take the ready-made ones from GNU Unifont. The dot matrices can be looked up directly in unifont_sample-12.1.01.hex by the Unicode code points of "赵" and "亚". Unifont's glyphs for these characters are 16×16 (32 bytes each), so each one is copied into a 28-row cell with blank rows above and below. To find the Unicode code point of any character, see:
https://graphemica.com/

The code to replace the font is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
/* "zhao" is expected to define the array code[]: three glyphs, 32 bytes each */
#include "zhao"

/* Bytes per glyph: 28 rows, 2 bytes (16 pixels) per row */
#define L (28 * 2)
int fd;

int main(int argc, char **argv)
{
	unsigned char buf[L];
	// This 0x0e60 is the offset obtained by pattern matching.
	off_t offset = 0x0e60;
	int i;

	fd = open("Lat7-VGA28x16.psf", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	// Overwrite three consecutive L-byte slots, one per replacement glyph
	for (i = 0; i < 3; i++) {
		// Blank the slot, then place the 32-byte glyph 8 bytes (4 rows) in
		memset(buf, 0, L);
		memcpy(buf + 8, &code[i * 32], 32);
		pwrite(fd, buf, L, offset);
		offset += L;
	}
	close(fd);
	return 0;
}

The result:

[Image: the 28×16 result on the virtual terminal]

Not bad.

In fact, this article covers just four things:

  1. Drawing a crude dot matrix;
  2. The mapping between keyboard, ASCII/Unicode, and font;
  3. How to locate and analyze a problem without knowing any of the details;
  4. The simpler the better, the more complicated the worse.

Actually, the third and fourth points are the most important.

Finally, to see the glyphs of the font currently loaded on your virtual terminal, type:

[root@localhost font]# showconsolefont

It will display:

[Image: showconsolefont output]
