The reason why MySQL manually registers the binlog file and causes master-slave abnormalities

The reason why MySQL manually registers the binlog file and causes master-slave abnormalities

1. Source of the problem

A friend @水米田 asked me about the master-slave based on POSITION. He did the following

Add some backed up binlog files to the directory, modify the index file, and add these binlog files
flush binary logs
Then the entire master-slave environment is delayed a lot.

2. Friend's Test

Here is another test by my friend @徐晨亮:

On the master:
(root:db1@xucl:10:30:22)[(none)]> show binary logs;
+---------------------+-----------+
| Log_name | File_size |
+---------------------+-----------+
|mysql-binlog.000031 | 1019 |
|mysql-binlog.000032 | 424 |
|mysql-binlog.000033 | 244 |
|mysql-binlog.000034 | 2332 |
|mysql-binlog.000035 | 2134 |
|mysql-binlog.000036 | 845915 |
|mysql-binlog.000037 | 11735 |
|mysql-binlog.000038 | 284 |
|mysql-binlog.000039 | 284 |
|mysql-binlog.000040 | 284 |
|mysql-binlog.000041 | 284 |
|mysql-binlog.000042 | 234 |
+---------------------+-----------+
12 rows in set (0.00 sec)
(root:db1@xucl:10:30:34)[(none)]> purge binary logs to 'mysql-binlog.000039';
Query OK, 0 rows affected (0.00 sec)
(root:db1@xucl:10:30:49)[(none)]> show binary logs;
+---------------------+-----------+
| Log_name | File_size |
+---------------------+-----------+
|mysql-binlog.000039 | 284 |
|mysql-binlog.000040 | 284 |
|mysql-binlog.000041 | 284 |
|mysql-binlog.000042 | 234 |
+---------------------+-----------+
4 rows in set (0.00 sec)
Execute insert data (root:db1@xucl:10:31:23)[xucl]> insert into t values(9,9);
Copy the backed up binlog back to the original directory and modify the index file to register [root@izbp12nspj47ypto9t6vyez logs]# ll
Total usage 884
-rw-r----- 1 mysql mysql 1019 May 20 22:03 mysql-binlog.000031
-rw-r----- 1 mysql mysql 424 May 20 22:03 mysql-binlog.000032
-rw-r----- 1 mysql mysql 244 May 20 22:03 mysql-binlog.000033
-rw-r----- 1 mysql mysql 2332 May 20 22:03 mysql-binlog.000034
-rw-r----- 1 mysql mysql 2134 May 20 22:03 mysql-binlog.000035
-rw-r----- 1 mysql mysql 845915 May 20 22:03 mysql-binlog.000036
-rw-r----- 1 mysql mysql 11735 May 20 22:05 mysql-binlog.000037
-rw-r----- 1 mysql mysql 284 May 20 22:06 mysql-binlog.000038
-rw-r----- 1 mysql mysql 284 May 21 10:28 mysql-binlog.000039
-rw-r----- 1 mysql mysql 284 May 21 10:28 mysql-binlog.000040
-rw-r----- 1 mysql mysql 284 May 21 10:28 mysql-binlog.000041
-rw-r----- 1 mysql mysql 491 May 21 10:31 mysql-binlog.000042
-rw-r----- 1 mysql mysql 204 May 21 10:30 mysql-binlog.index
Main library flush binary logs
(root:db1@xucl:10:32:51)[(none)]> flush binary logs;
Query OK, 0 rows affected (0.01 sec)
(root:db1@xucl:10:32:57)[(none)]> show binary logs;
+---------------------+-----------+
| Log_name | File_size |
+---------------------+-----------+
|mysql-binlog.000031 | 1019 |
|mysql-binlog.000032 | 424 |
|mysql-binlog.000033 | 244 |
|mysql-binlog.000034 | 2332 |
|mysql-binlog.000035 | 2134 |
|mysql-binlog.000036 | 845915 |
|mysql-binlog.000037 | 11735 |
|mysql-binlog.000038 | 284 |
|mysql-binlog.000039 | 284 |
|mysql-binlog.000040 | 284 |
|mysql-binlog.000041 | 284 |
|mysql-binlog.000042 | 541 |
|mysql-binlog.000043 | 234 |
+---------------------+-----------+
13 rows in set (0.00 sec)
At this point, the slave reports the following error:
(root:db1@xucl:10:31:05)[(none)]> show slave status\G
*************************** 1. row ***************************
        Slave_IO_State:
         Master_Host: 127.0.0.1
         Master_User: repl
         Master_Port: 3306
        Connect_Retry: 60
       Master_Log_File: mysql-binlog.000035
     Read_Master_Log_Pos: 194
        Relay_Log_File: izbp12nspj47ypto9t6vyez-relay-bin.000011
        Relay_Log_Pos: 373
    Relay_Master_Log_File: mysql-binlog.000035
       Slave_IO_Running: No
      Slave_SQL_Running: Yes
       Replicate_Do_DB:
     Replicate_Ignore_DB:
      Replicate_Do_Table:
    Replicate_Ignore_Table:
   Replicate_Wild_Do_Table:
 Replicate_Wild_Ignore_Table:
          Last_Errno: 0
          Last_Error:
         Skip_Counter: 0
     Exec_Master_Log_Pos: 194
       Relay_Log_Space: 648
       Until_Condition: None
        Until_Log_File:
        Until_Log_Pos: 0
      Master_SSL_Allowed: No
      Master_SSL_CA_File:
      Master_SSL_CA_Path:
       Master_SSL_Cert:
      Master_SSL_Cipher:
        Master_SSL_Key:
    Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
        Last_IO_Errno: 1236
        Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Cannot replicate GTID-transaction when @@GLOBAL.GTID_MODE = OFF, at file /storage/single/mysql3306/logs/mysql-binlog.000035, position 194.; the first event 'mysql-binlog.000039' at 234, the last event read from '/storage/single/mysql3306/logs/mysql-binlog.000035' at 259, the last byte read from '/storage/single/mysql33'
        Last_SQL_Errno: 0
        Last_SQL_Error:
 Replicate_Ignore_Server_Ids:
       Master_Server_Id: 3306
         Master_UUID: e8bdf01a-c79b-11e8-82b3-00163e088352
       Master_Info_File: mysql.slave_master_info
          SQL_Delay: 0
     SQL_Remaining_Delay: NULL
   Slave_SQL_Running_State: Slave has read all relay logs; waiting for more updates
      Master_Retry_Count: 86400
         Master_Bind:
   Last_IO_Error_Timestamp: 190521 10:32:57
   Last_SQL_Error_Timestamp:
        Master_SSL_Crl:
      Master_SSL_Crlpath:
      Retrieved_Gtid_Set:
      Executed_Gtid_Set: 4c423515-6661-11e9-b767-00163e088352:1-7,
e8bdf01a-c79b-11e8-82b3-00163e088352:1-57192
        Auto_Position: 0
     Replicate_Rewrite_DB:
         Channel_Name:
      Master_TLS_Version:
1 row in set (0.00 sec)
From the error on the slave, it seems that after the master flushes binary logs, the slave reads the registered binlog and applies it again.

3. Phenomenon explanation

From the entire test, MySQL seems to transfer and apply the manually registered files. The error is because the library used to have GITD_MODE=ON, but during the test, GTID_MODE was turned off and changed to POSITION mode. The error is because the DUMP thread will detect:

This picture comes from a new series I wrote (it has not been published yet and will probably be completed by the end of the year). In any case, the DUMP thread has started transferring old binlog files. So what is the reason? We will explain below.

4. Flush binary logs operations on binlog

Flushing binary logs will include the following operations:

  • Get the new binlog file name
  • Close the old binlog
  • Close the index file
  • Open the index file
  • Open a new binlog
  • Add new binlog to indexfile

For details, please refer to the function MYSQL_BIN_LOG::new_file_impl. It should be noted that obtaining the new binlog file name is implemented through the find_uniq_filename function, which contains the following code:

 file_info= dir_info->dir_entry;
 for (i = dir_info->number_off_files ; i-- ; file_info++)
 {
  if (strncmp(file_info->name, start, length) == 0 &&
  is_number(file_info->name+length, &number,0))
  {
   set_if_bigger(max_found, number);
  }
 }
...
 *next = (need_next || max_found == 0) ? max_found + 1 : max_found;

The general meaning is to scan the binlog files in the index file, get the one with the highest sequence number, and then add 1. The stack frame is as follows:

#0 find_uniq_filename (name=0x7fffec5ec6d0 "/mysqldata/mysql3340/log/binlog", next=0x7fffec5ec678, need_next=true)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:3679
#1 0x000000000187d208 in generate_new_log_name (new_name=0x7fffec5ec6d0 "/mysqldata/mysql3340/log/binlog", new_ext=0x0, 
  log_name=0x7ffedc011520 "/mysqldata/mysql3340/log/binlog", is_binlog=true) at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:3767
#2 0x0000000001884fdb in MYSQL_BIN_LOG::new_file_impl (this=0x2e83640, need_lock_log=false, extra_description_event=0x0)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:7175
#3 0x0000000001884cbb in MYSQL_BIN_LOG::new_file_without_locking (this=0x2e83640, extra_description_event=0x0)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:7099
#4 0x0000000001886b6b in MYSQL_BIN_LOG::rotate (this=0x2e83640, force_rotate=true, check_purge=0x7fffec5ecbfb)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:7775
#5 0x0000000001886d53 in MYSQL_BIN_LOG::rotate_and_purge (this=0x2e83640, thd=0x7ffedc000b90, force_rotate=true)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:7846

Therefore, even if we manually modify the index file, there will be no problem with flushing binary logs because it actually scans the index file.
At the same time, we can see that flush binary logs reloads the index file. At this time, the manually modified index file takes effect. Use show binary logs to view the files you added.

5. The emergence of the master-slave problem

After the binlog is switched, the DUMP thread also needs to read the next binlog file. Let's see how it determines which file to read.

The code to loop through each binlog file can be found in the function sender.run(). The following sentence is to find the next binlog file:

int error = mysql_bin_log.find_next_log(&m_linfo, 0);

mysql_bin_log.find_next_log contains the following code:

 my_b_seek(&index_file, linfo->index_file_offset);//Offset linfo->index_file_start_offset = linfo->index_file_offset;
 length=my_b_gets(&index_file, fname, FN_REFLEN)); //Read the file name...
  if (normalize_binlog_name (full_fname, fname, is_relay_log))
...

 linfo->index_file_offset = my_b_tell(&index_file); //The offset is newly recorded for next use

We can see that the DUMP thread does not actually scan the entire index file, but reads through an index file offset. If you modify the index file manually, the offset will be messed up. Therefore the next file sent by DUMP will be undefined. Therefore, the phenomenon of sending manually registered binlog files to the slave library occurs. In this case, the following situations may occur:

  1. If GTID_MODE is off and POSITION is used, an error will be reported in this case, such as duplicate rows.
  2. If GTID_MODE and AUTO_POSITION=1, the DUMP thread will filter the GTID and not send it. Because the Event is not sent, the delay will remain at 0 for a period of time.
  3. If GTID_MODE and AUTO_POSITION=0, the SQL thread will filter when applying GITD_EVENT, and the delay may be very large.

Although GTID may be able to shield this problem, the DUMP thread is already considering sending old binlog files, which is inappropriate.

6. Purge binary logs can maintain this offset

Why does purge binary logs not cause problems? Because under the purge binary logs statement, the offset is maintained as follows:

 virtual void operator()(THD *thd)
 {
  LOG_INFO* linfo;
  mysql_mutex_lock(&thd->LOCK_thd_data);
  if ((linfo = thd->current_linfo)) //b binlog.cc:2829
  {
   /*
    Index file offset can be less that purge offset only if
    we just started reading the index file.
    we have nothing to adjust.
   */
   if (linfo->index_file_offset < m_purge_offset)
    linfo->fatal = (linfo->index_file_offset != 0);
   else
    linfo->index_file_offset -= m_purge_offset;
  }
  mysql_mutex_unlock(&thd->LOCK_thd_data);

We can see a statement like linfo->index_file_offset -= m_purge_offset;. Here is the stack frame:

#0 Adjust_offset::operator() (this=0x7fffec5ec720, thd=0x7ffedc000be0) at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:2831
#1 0x0000000000eef320 in Do_THD::operator() (this=0x7fffec5ec6a0, thd=0x7ffedc000be0) at /mysqldata/percona-server-locks-detail-5.7.22/sql/mysqld_thd_manager.cc:46
#2 0x0000000000eefa0f in std::for_each<THD**, Do_THD> (__first=0x3003358, __last=0x3003368, __f=...)
  at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algo.h:4200
#3 0x0000000000eeefc0 in Global_THD_manager::do_for_all_thd (this=0x3003340, func=0x7fffec5ec720)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/mysqld_thd_manager.cc:273
#4 0x000000000187ae0a in adjust_linfo_offsets (purge_offset=0) at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:2873
#5 0x0000000001883239 in MYSQL_BIN_LOG::remove_logs_from_index (this=0x2e83640, log_info=0x7fffec5ec7d0, need_update_threads=true)
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:6352
#6 0x0000000001883676 in MYSQL_BIN_LOG::purge_logs (this=0x2e83640, to_log=0x7fffec5eca80 "/mysqldata/mysql3340/log/binlog.000001", included=false, 
  need_lock_index=true, need_update_threads=true, decrease_log_space=0x0, auto_purge=false) at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:6469
#7 0x000000000187b4b5 in purge_master_logs (thd=0x7ffee0000c00, to_log=0x7ffee0006600 "binlog.000001")
  at /mysqldata/percona-server-locks-detail-5.7.22/sql/binlog.cc:3127

7. Test the error in POSITION mode

1. Environment

mysql> show binary logs;
+---------------+-----------+
| Log_name | File_size |
+---------------+-----------+
| binlog.000001 | 198 |
| binlog.000002 | 154 |
+---------------+-----------+
2 rows in set (0.00 sec)

mysql> show create table testcp \G;
*************************** 1. row ***************************
    Table: testcp
Create Table: CREATE TABLE `testcp` (
 `id` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

ERROR: 
No query specified

2. Execute a statement

Main library:

mysql> insert into testcp values(20);
Query OK, 1 row affected (0.43 sec)

mysql> select * from testcp;
+----+
|id|
+----+
| 10 |
| 20 |
+----+
2 rows in set (0.01 sec)

From the library:

mysql> show slave status \G;
*************************** 1. row ***************************
        Slave_IO_State: Waiting for master to send event
         Master_Host: 192.168.99.41
         Master_User: repl
         Master_Port: 3340
        Connect_Retry: 60
       Master_Log_File: binlog.000002
     Read_Master_Log_Pos: 417
        Relay_Log_File: relay.000004
        Relay_Log_Pos: 624
    Relay_Master_Log_File: binlog.000002
       Slave_IO_Running: Yes
      Slave_SQL_Running: Yes
...
mysql> select * from testcp;
+----+
|id|
+----+
| 10 |
| 20 |
+----+

At this time, the DUMP thread index file offset pointer is as follows:

3. The main database executes purge binary logs

Before doing this, copy the original binlog.000001 to binlog.000001bak because you will need to copy it back later.

mysql> purge binary logs to 'binlog.000002';
Query OK, 0 rows affected (3.14 sec)

mysql> show binary logs;
+---------------+-----------+
| Log_name | File_size |
+---------------+-----------+
| binlog.000002 | 417 |
+---------------+-----------+
1 row in set (0.00 sec)

Because the purge binary logs command maintains the offset pointer, the index file offset pointer of the DUMP thread is as follows:

4. Manual changes

Copy binlog.000001bak to binlog.000001, then modify the index file to add binlog.000001 back, and then flush binary logs as follows:

mysql> flush binary logs;
Query OK, 0 rows affected (0.15 sec)

mysql> show binary logs;
+---------------+-----------+
| Log_name | File_size |
+---------------+-----------+
| binlog.000001 | 198 |
| binlog.000002 | 461 |
+---------------+-----------+
2 rows in set (0.00 sec)

Doing this manually does not maintain the index file offset pointer of the DUMP thread, so it is as follows:

In this way, the DUMP thread resends the content of binlog.000002, and the slave library of POSITION can only report an error as follows:

mysql> select * from replication_applier_status_by_worker \G
*************************** 1. row ***************************
     CHANNEL_NAME: 
      WORKER_ID: 1
      THREAD_ID: NULL
    SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: ANONYMOUS
  LAST_ERROR_NUMBER: 1062
  LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log binlog.000002, end_log_pos 386; Could not execute Write_rows event on table testmts.testcp; Duplicate entry '20' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log binlog.000002, end_log_pos 386
 LAST_ERROR_TIMESTAMP: 2019-05-21 14:46:26

8. Conclusion

This operation is very irregular. If you really want to do this, consider the following steps:

  • Close all slave libraries
  • Manually register binlog files and modify index files
  • flush binary logs
  • Start the slave library

In this case, the DUMP offset pointer will be reinitialized, so there should be no problem. Try not to even think about doing that.

The above is the detailed content of the reasons why the master-slave abnormality is caused by manual registration of MySQL binlog files. For more information about MySQL master-slave abnormalities, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • How to view MySQL links and kill abnormal links
  • MySQL database connection exception summary (worth collecting)
  • How to fix abnormal startup of mysql5.7.21
  • Sharing of experience on repairing MySQL innodb exceptions
  • MySQL definition and exception handling details
  • Some basic exception handling tutorials in MySQL stored procedures
  • Analyzing a MySQL abnormal query case
  • A Brief Analysis of MySQL Exception Handling
  • Analyze several common solutions to MySQL exceptions

<<:  WeChat applet realizes linkage menu

>>:  Solution for FileZilla 425 Unable to connect to FTP (Alibaba Cloud Server)

Recommend

How to configure anti-hotlinking for nginx website service (recommended)

1. Principle of Hotlinking 1.1 Web page preparati...

Suggestions on creating business HTML emails

Through permission-based email marketing, not onl...

JavaScript Basics: Immediate Execution Function

Table of contents Immediately execute function fo...

Detailed explanation of Linux copy and paste in VMware virtual machine

1. Linux under VMware Workstation: 1. Update sour...

MYSQL string forced conversion method example

Preface Since the types of the same fields in the...

Description of the hr tag in various browsers

Generally, we rarely meet HR, but once we do, it c...

Detailed tutorial on installing Nginx 1.16.0 under Linux

Because I have been tinkering with Linux recently...

Detailed explanation of identifying files with the same content on Linux

Preface Sometimes file copies amount to a huge wa...

Summary of Ubuntu backup methods (four types)

Method 1: To use respin, follow these steps: sudo...

Simple implementation method of vue3 source code analysis

Table of contents Preface 🍹Preparation 🍲vue3 usag...

HTML tutorial, understanding the optgroup element

Select the category selection. After testing, IE ...

A brief analysis of Linux resolv.conf

1. Introduction resolv.conf is the configuration ...

How to install and modify the initial password of mysql5.7.18 under Centos7.3

This article shares with you the installation of ...