Detailed View of Hidden Columns in MySQL

While introducing MySQL's multi-version concurrency control (MVCC), we mentioned that MySQL rows carry some hidden columns, such as the row identifier, the transaction ID, and the rollback pointer. If you are as curious as I am, you may have wondered how to actually see the values of these hidden columns.

In this article, we focus on one of these hidden columns, the row identifier DB_ROW_ID. Strictly speaking, the row identifier that you can query in SQL, _rowid, is often not a hidden column at all, because it is not a real column: it is an alias for an existing unique NOT NULL column. Before we unveil its mystery, let's take a look at the official documentation:

If a table has a PRIMARY KEY or UNIQUE NOT NULL index that consists of a single column that has an integer type, you can use _rowid to refer to the indexed column in SELECT statements

Put simply, if a table has a primary key or a unique NOT NULL index that consists of a single integer column, you can select _rowid directly in a SELECT statement, and its value refers to the value of the indexed column.

Pay special attention to the key terms in the documentation: primary key, unique index, NOT NULL, single column, and integer type. We will explore the mysterious hidden field _rowid from each of these angles.

1. Primary key exists

First, let's look at the case where a primary key is defined on a single integer column. Create the table with the following statement:

CREATE TABLE `table1` (
  `id` bigint(20) NOT NULL PRIMARY KEY ,
  `name` varchar(32) DEFAULT NULL
)ENGINE=InnoDB;

After inserting three test rows, query _rowid directly in a SELECT statement:

SELECT *, _rowid FROM table1;

The query runs normally and returns _rowid. When the primary key is a single integer column, _rowid refers directly to the value of the primary key column. A _rowid that can be queried with a SELECT statement like this is called an explicit _rowid.
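
The test rows themselves are not shown above. As a minimal sketch for reproducing the result, something like the following should do; the inserted values are purely illustrative and not the author's original data:

```sql
-- Illustrative test data; any three rows with distinct ids will do.
INSERT INTO table1 (id, name) VALUES
  (1, 'Hydra'),
  (2, 'Trunks'),
  (3, 'Susan');

-- _rowid is an alias for the primary key column here,
-- so the _rowid column should match id on every row.
SELECT id, name, _rowid FROM table1;

-- Because it is an alias, _rowid can also be used as a filter.
SELECT * FROM table1 WHERE _rowid = 2;
```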

Let's go back to the key terms from the documentation and test them one by one. A primary key column is always NOT NULL, so the next case to check is a primary key on a non-integer column. Create the table as follows:

CREATE TABLE `table2` (
  `id` varchar(20) NOT NULL PRIMARY KEY ,
  `name` varchar(32) DEFAULT NULL
)ENGINE=InnoDB;

Running the same query against table2 fails with an error because _rowid cannot be resolved. This shows that when the primary key is not an integer column, _rowid cannot be queried directly.

2. No primary key, but a unique index exists

After testing the two kinds of primary key above, let's look at a table that has no primary key but does have a unique index. First, test a unique index on an integer column that is declared NOT NULL. Create the table as follows:

CREATE TABLE `table3` (
  `id` bigint(20) NOT NULL UNIQUE KEY,
  `name` varchar(32)
)ENGINE=InnoDB;

The query executes fine, and _rowid refers to the value of the column covered by the unique index.

A unique index differs from a primary key in that its column may be NULL. In table3 above, the column covered by the unique index has a NOT NULL constraint. If we drop this constraint, can _rowid still be queried explicitly? Let's create another table that is identical except that the indexed column has no NOT NULL constraint:

CREATE TABLE `table4` (
  `id` bigint(20) UNIQUE KEY,
  `name` varchar(32)
)ENGINE=InnoDB;

Running the query shows that in this case _rowid cannot be queried explicitly.

As with the primary key, we also test a unique index on a non-integer column. The next table puts a unique index on a character column and declares that column NOT NULL:

CREATE TABLE `table5` (
  `id` bigint(20),
  `name` varchar(32) NOT NULL UNIQUE KEY
)ENGINE=InnoDB;

The same query fails here as well; _rowid is not available.

From these three tests we can conclude that when there is no primary key but a unique index exists, _rowid can be queried explicitly only if the unique index is on an integer column that is declared NOT NULL. In that case, _rowid refers to the value of the indexed column.

3. Composite primary key or composite unique index

In the above tests, we applied the primary key or unique index to a single column. What will happen if a composite primary key or composite unique index is used? Let’s take a look at the instructions in the official documentation first:

_rowid refers to the PRIMARY KEY column if there is a PRIMARY KEY consisting of a single integer column. If there is a PRIMARY KEY but it does not consist of a single integer column, _rowid cannot be used.

In simple terms, if a primary key exists and consists of a single integer column, _rowid refers to the primary key column. If the primary key does not consist of a single integer column, for example because it spans multiple columns, _rowid is not available.

Based on this description, let's test a composite primary key. The following table uses two integer columns as a composite primary key:

CREATE TABLE `table6` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  PRIMARY KEY(`id`,`no`)
)ENGINE=InnoDB;

Running the query fails; _rowid is not available.

The same applies to unique indexes: if a unique NOT NULL index does not consist of a single column, _rowid cannot be queried directly. This test is omitted here; interested readers can try it themselves.
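
For readers who want to try the omitted test, a minimal sketch might look like the following. The table name table7 and the inserted row are assumptions made for this example and do not come from the original test:

```sql
-- Hypothetical table for the omitted test: the only unique index
-- spans two NOT NULL integer columns.
CREATE TABLE `table7` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  UNIQUE KEY (`id`, `no`)
) ENGINE=InnoDB;

INSERT INTO table7 (id, no, name) VALUES (1, 10, 'Hydra');

-- Expected to fail: the first UNIQUE NOT NULL index is not a single
-- column, so _rowid is not available for this table.
SELECT *, _rowid FROM table7;
```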

4. There are multiple unique indexes

In MySQL, each table can have only one primary key, but can have multiple unique indexes. So if there are multiple unique indexes that meet the rules at the same time, which one will be referenced as the value of _rowid ? As usual, let’s look at the official documentation for answers:

Otherwise, _rowid refers to the column in the first UNIQUE NOT NULL index if that index consists of a single integer column. If the first UNIQUE NOT NULL index does not consist of a single integer column, _rowid cannot be used.

Put simply, if the first unique NOT NULL index in the table consists of a single integer column, _rowid refers to the value of that column. If the first unique NOT NULL index does not satisfy this condition, _rowid is not available.

In the following table, create two unique indexes that both meet the rules:

CREATE TABLE `table8_2` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  UNIQUE KEY(no),
  UNIQUE KEY(id)
)ENGINE=InnoDB;

Querying the table shows that the value of _rowid is the same as the value of the no column, which confirms that _rowid uses the first unique index defined in the table as its reference.

What if the first unique index defined in the table does not meet the rules for _rowid, but the second one does? Can _rowid still be queried explicitly? To test this, we create the following table, in which the first index is a composite unique index and the second is a single-column unique index:

CREATE TABLE `table9` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  UNIQUE KEY `index1`(`id`,`no`),
  UNIQUE KEY `index2`(`id`)
)ENGINE=InnoDB;

Querying the table shows that although a single-column unique NOT NULL index exists, the first unique index in definition order does not meet the requirements, so _rowid cannot be queried directly.

If the order of the statements for creating the unique index above is reversed, _rowid can be queried explicitly.
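
As a sketch of the reversed case, the single-column index is declared first below. The table name table9_reversed and the inserted row are made up for this example:

```sql
-- Hypothetical variant of table9 with the two unique indexes swapped:
-- the single-column index now comes first.
CREATE TABLE `table9_reversed` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  UNIQUE KEY `index2`(`id`),
  UNIQUE KEY `index1`(`id`,`no`)
) ENGINE=InnoDB;

INSERT INTO table9_reversed (id, no, name) VALUES (1, 10, 'Hydra');

-- The first UNIQUE NOT NULL index is now a single integer column,
-- so _rowid should resolve and match the id column.
SELECT id, no, name, _rowid FROM table9_reversed;
```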

5. Primary key and unique index exist at the same time

From the example above, we can see that the definition order of unique indexes determines which index _rowid refers to. So when a table has both a primary key and a unique index, does the definition order affect which one is referenced?

Create two tables using the following statements. The only difference is the order in which the primary key and unique index are created:

CREATE TABLE `table11` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  PRIMARY KEY(id),
  UNIQUE KEY (no)
)ENGINE=InnoDB;

CREATE TABLE `table12` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  UNIQUE KEY(id),
  PRIMARY KEY (no)
)ENGINE=InnoDB;

Querying both tables leads to the same conclusion: when a qualifying primary key and a qualifying unique index both exist, _rowid refers to the primary key column, regardless of the order in which they are defined.
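
To see this for yourself, one option is to insert the same row into both tables and compare the results. The values below are illustrative only:

```sql
-- Same illustrative row in both tables, with id and no deliberately different.
INSERT INTO table11 (id, no) VALUES (1, 100);
INSERT INTO table12 (id, no) VALUES (1, 100);

-- table11 declares PRIMARY KEY(id), so _rowid should equal id.
SELECT id, no, _rowid FROM table11;

-- table12 declares PRIMARY KEY(no), so _rowid should equal no,
-- even though UNIQUE KEY(id) is declared first.
SELECT id, no, _rowid FROM table12;
```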

6. No primary key or unique index that meets the requirements

Above, we called a _rowid that can be queried directly through a SELECT statement an explicit _rowid. When a table has no primary key and no unique NOT NULL index at all, _rowid cannot be queried, but InnoDB still maintains a hidden row ID for every row internally. We can call this an implicit _rowid.

In fact, when a table has no primary key and no unique NOT NULL index that can serve as the clustered index, InnoDB generates a 6-byte unsigned number as an auto-incrementing row ID. Its maximum value is therefore 2^48-1. After reaching the maximum value, it wraps around and starts counting from 0 again. Next, we create a table without a primary key or a unique index and use it to explore the implicit _rowid.
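
The arithmetic behind the 6-byte limit can be checked in MySQL itself. This sketch assumes that wrapping is equivalent to keeping only the low 48 bits of the counter, which matches the "starts from 0 again" behavior described above:

```sql
SELECT
  (1 << 48) - 1                     AS max_row_id,           -- 281474976710655
  (1 << 48) & ((1 << 48) - 1)       AS stored_for_2_pow_48,  -- 0
  ((1 << 48) + 1) & ((1 << 48) - 1) AS stored_for_next;      -- 1
```

In other words, once the counter reaches 2^48, the value that fits into 6 bytes is 0 again, and the following values repeat 1, 2, 3, and so on.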

CREATE TABLE `table10` (
  `id` bigint(20),
  `name` varchar(32)
)ENGINE=InnoDB;
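
To confirm that a table such as table10 really falls back to a hidden clustered index, one option is to look for InnoDB's generated index, GEN_CLUST_INDEX, in the data dictionary views. This is a sketch for MySQL 5.7; it assumes the account has the PROCESS privilege:

```sql
-- MySQL 5.7 view names; in MySQL 8.0 the equivalent views are
-- INNODB_INDEXES and INNODB_TABLES.
SELECT t.NAME AS table_name, i.NAME AS index_name
FROM information_schema.INNODB_SYS_INDEXES AS i
JOIN information_schema.INNODB_SYS_TABLES AS t
  ON t.TABLE_ID = i.TABLE_ID
WHERE i.NAME = 'GEN_CLUST_INDEX';
```

Tables listed here are clustered on the hidden row ID; table_name is reported in the form schema/table.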

First, we need to find the PID of the mysqld process:

ps -ef | grep mysqld

In this example, the PID of the mysqld process is 2068.

Before we get started, some background is needed. InnoDB maintains a global variable, dict_sys->row_id, which is shared by all tables that do not have a primary key defined. When a row is inserted into such a table, the current value of this global row_id is used as the row's primary key, and the global variable is then incremented by 1.

Next, we need gdb, a Linux debugging tool that can attach to a running process. On the server, first install it with yum install gdb. After the installation is complete, set row_id to 1 with the following gdb command:

gdb -p 2068 -ex 'p dict_sys->row_id=1' -batch

Insert three rows into the empty table:

INSERT INTO table10 VALUES (100000001, 'Hydra');
INSERT INTO table10 VALUES (100000002, 'Trunks');
INSERT INTO table10 VALUES (100000003, 'Susan');

Viewing the data in the table shows the three rows. Their implicit _rowid values cannot be selected, but in theory they are 1, 2, and 3.

Then use gdb to set row_id to 2^48, which is one more than the maximum value that the 6-byte row ID can hold:

gdb -p 2068 -ex 'p dict_sys->row_id=281474976710656' -batch

Insert three more records into the table:

INSERT INTO table10 VALUES (100000004, 'King');
INSERT INTO table10 VALUES (100000005, 'Queen');
INSERT INTO table10 VALUES (100000006, 'Jack');

Viewing all the data in the table, we can see that two of the three rows inserted the first time have been overwritten.

Why does this happen? Before the first batch was inserted, row_id was 1, so the three rows received the _rowid values 1, 2, and 3.

After row_id was manually set to 2^48, the next inserted _rowid starts again from 0, so the three rows of the second batch receive the _rowid values 0, 1, and 2.

When a new row has the same _rowid as an existing row, the new row overwrites the old one. The rows with _rowid 1 and 2 from the first batch ('Hydra' and 'Trunks') are therefore replaced by 'Queen' and 'Jack', while 'King' is stored under _rowid 0 and 'Susan' remains under _rowid 3.

Therefore, when a table has no primary key and no usable unique index, the implicit _rowid that InnoDB falls back on carries a risk. Although 2^48 is a large number, it can still be used up, and once it wraps around, earlier records are overwritten. This is one more reason to always define a primary key when creating a table.
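
As a practical follow-up, a query along these lines can list InnoDB tables that have no primary key. It is only a sketch: it does not check whether such a table still has a unique NOT NULL index that InnoDB can use instead, and the excluded schema names are the usual system schemas:

```sql
-- List user InnoDB tables that have no PRIMARY KEY constraint.
SELECT t.TABLE_SCHEMA, t.TABLE_NAME
FROM information_schema.TABLES AS t
LEFT JOIN information_schema.TABLE_CONSTRAINTS AS tc
  ON tc.TABLE_SCHEMA = t.TABLE_SCHEMA
 AND tc.TABLE_NAME = t.TABLE_NAME
 AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND t.ENGINE = 'InnoDB'
  AND t.TABLE_SCHEMA NOT IN
      ('mysql', 'sys', 'information_schema', 'performance_schema')
  AND tc.CONSTRAINT_NAME IS NULL;
```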

This article was tested on MySQL 5.7.31.

Official documentation: https://dev.mysql.com/doc/refman/5.7/en/create-index.html
