Case scenario

A problem was discovered online today. Because of a gap in monitoring coverage, the disk of one machine filled up, which broke the online MySQL master-slave replication. The symptom looked like this:

```
localhost.(none)>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 10.xx.xx.xx
                  Master_User: replica
                  Master_Port: 5511
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: relay-bin.001605
                Relay_Log_Pos: 9489761
        Relay_Master_Log_File:
             Slave_IO_Running: No
            Slave_SQL_Running: No
                   Last_Errno: 13121
                   Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, the server was unable to fetch a keyring key required to open an encrypted relay log file, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
```

So I checked the error log and found the following content:

```
2021-03-31T11:34:39.367173+08:00 11 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2021-03-31T11:34:39.368161+08:00 12 [ERROR] [MY-010596] [Repl] Error reading relay log event for channel '': binlog truncated in the middle of event; consider out of disk space
2021-03-31T11:34:39.368191+08:00 12 [ERROR] [MY-013121] [Repl] Slave SQL for channel '': Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, the server was unable to fetch a keyring key required to open an encrypted relay log file, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or the slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: MY-013121
2021-03-31T11:34:39.368205+08:00 12 [ERROR] [MY-010586] [Repl] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000446' position 9489626
```

As you can see, the error log is quite helpful here: it detected the disk problem and explicitly hints "consider out of disk space".

Solving the problem

After logging into the server, I quickly confirmed that disk usage on the machine hosting MySQL had reached 100%, which matched the error log. Time to fix it. The basic idea is to clean up files on the disk and then rebuild the replication relationship.
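Before touching anything, it is worth confirming the diagnosis. A minimal sketch follows; the data directory path /data/mysql is an assumption, and the relay log name should be taken from the SHOW SLAVE STATUS output above:

```bash
# Confirm which filesystem is at 100% and where the space went.
df -h
du -sh /data/mysql/* 2>/dev/null | sort -rh | head   # assumed datadir path

# The error text itself suggests checking the relay log with mysqlbinlog;
# a truncated event shows up as a parse/read error near the end of the output.
mysqlbinlog /data/mysql/relay-bin.001605 > /dev/null
```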
This process looks simple, but in practice the following error appeared while rebuilding the replication relationship:

```
### Replication is GTID-based, so I want to rebuild the replication relationship
localhost.(none)>reset slave;
ERROR 1371 (HY000): Failed purging old relay logs: Failed during log reset

localhost.(none)>reset slave all;
ERROR 1371 (HY000): Failed purging old relay logs: Failed during log reset
```

Step 1: Since replication is GTID-based, the plan was simply to record the output of show slave status, then reset the slave and use a change master statement to rebuild the replication relationship. Instead, the error above appeared. Judging from the message, MySQL could not finish purging the relay logs, which seemed odd. Well, since MySQL could not purge the relay logs by itself, I decided to help it along.

Step 2: I manually deleted all the relay logs with rm -f, after which the error changed to:

```
localhost.(none)>reset slave all;
ERROR 1374 (HY000): I/O error reading log index file
```

So the problem was still not solved. Then I thought: since reset slave could not clean up the relay logs, what if I skipped it entirely, stopped the slave, and ran change master directly?

Step 3: Stop the slave directly, then run change master without executing reset slave all. The result:

```
localhost.(none)>change master to master_host='10.13.224.31',
    -> master_user='replica',
    -> master_password = 'eHnNCaQE3ND',
    -> master_port=5510,
    -> master_auto_position=1;
ERROR 1371 (HY000): Failed purging old relay logs: Failed during log reset
```

The problem remained.

Step 4: Replication was already broken with an error anyway, so I executed start slave just to see what would happen. A dramatic scene followed:

```
localhost.(none)>start slave;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    262
Current database: *** NONE ***

Query OK, 0 rows affected (0.01 sec)

localhost.(none)>
[root@ ~]#
```

After executing start slave, the instance crashed outright. At this point replication was completely broken and the slave instance was down.

Step 5: Check whether the instance can be restarted. I tried restarting it and it came back up. After the restart, I checked the replication relationship again:

```
localhost.(none)>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: 10.xx.xx.xx
                  Master_User: replica
                  Master_Port: 5511
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: relay-bin.001605
                Relay_Log_Pos: 9489761
        Relay_Master_Log_File:
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                   Last_Errno: 13121
                   Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, the server was unable to fetch a keyring key required to open an encrypted relay log file, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                 Skip_Counter: 0
```

The replication relationship still reported an error.
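Looking back, one plausible reading of the ERROR 1374 in step 2 is that deleting relay log files by hand leaves the relay log index out of sync with what is actually on disk. A minimal sketch for spotting that mismatch follows; the credentials are assumptions, and since relay_log_index entries are often paths relative to the data directory, it should be run from the datadir:

```bash
# Locate the relay log index file via the relay_log_index system variable.
INDEX_FILE=$(mysql -uroot -p -N -e "SELECT @@GLOBAL.relay_log_index;")
echo "relay log index: ${INDEX_FILE}"

# Each line of the index names one relay log file; after an rm -f these
# entries can point at files that no longer exist.
while read -r f; do
    [ -e "$f" ] || echo "listed in index but missing on disk: $f"
done < "${INDEX_FILE}"
```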
Step 6: Try reset slave all again and see if it succeeds this time.

```
localhost.(none)>stop slave;
Query OK, 0 rows affected (0.00 sec)

localhost.(none)>reset slave all;
Query OK, 0 rows affected (0.03 sec)
```

Step 7: Re-establish the replication relationship and start replication.

```
localhost.(none)>change master to master_host='10.xx.xx.xx',
    -> master_user='replica',
    -> master_password='xxxxx',
    -> master_port=5511,
    -> master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

localhost.(none)>start slave;
Query OK, 0 rows affected (0.00 sec)

localhost.(none)>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.xx.xx.xx
                  Master_User: replica
                  Master_Port: 5511
                Connect_Retry: 60
                          ...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
```

This time the replication relationship could be established.

Summary

When the disk is full, the MySQL service cannot write to its meta-information tables, and the relay log may be left incomplete. If you simply clean up the disk on the server and then run change master again to rebuild the master-slave replication relationship, it may fail and cannot be repaired directly, because this is not a normal replication-break scenario. The correct approach is therefore (a consolidated sketch follows the list):

1. Clean up the server's disk.
2. Restart the slave instance whose replication is broken.
3. Run reset slave all and change master to rebuild the master-slave replication relationship.

If there is a better way, please let me know.
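Putting the summary together, a minimal end-to-end sketch of the recovery order might look like the following. The service name, credentials, and the df target are placeholders rather than values from this incident; the SQL mirrors steps 6 and 7 above:

```bash
# 1. Free up disk space first (what is safe to delete depends on the environment),
#    then confirm the filesystem is no longer full.
df -h

# 2. Restart the broken replica so it reinitialises its relay log state.
systemctl restart mysqld

# 3. Tear down and rebuild the GTID-based replication relationship.
mysql -uroot -p <<'SQL'
STOP SLAVE;
RESET SLAVE ALL;
CHANGE MASTER TO
  MASTER_HOST='10.xx.xx.xx',
  MASTER_USER='replica',
  MASTER_PASSWORD='xxxxx',
  MASTER_PORT=5511,
  MASTER_AUTO_POSITION=1;
START SLAVE;
SHOW SLAVE STATUS\G
SQL
```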