Summary of online MYSQL synchronization error troubleshooting methods (must read)

Summary of online MYSQL synchronization error troubleshooting methods (must read)

Preface

After a failover occurs, a common problem is synchronization errors. When the database is small, it can be easily handled by dumping and then importing it. However, online databases are 150G-200G. If this method is used alone, the cost is too high. Therefore, after a period of exploration, several processing methods have been summarized.

Production environment architecture diagram

The current network architecture stores two copies of data and uses asynchronous replication to create a high-availability cluster, with two machines providing external services. When a failure occurs, switch to the slave and turn it into a master. The broken machine reversely synchronizes with the new master. When handling failures, the most common problem encountered is master-slave errors. The following is the error message I collected.

Common Mistakes

The three most common situations

These three situations occur during HA switching. Due to asynchronous replication and sync_binlog=0, a small part of binlog will not be received, resulting in synchronization errors.

The first one: a record is deleted on the master, but it cannot be found on the slave.

Last_SQL_Error: Could not execute Delete_rows event on table hcy.t1;
Can't find record in 't1',
Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
the event's master log mysql-bin.000006, end_log_pos 254

The second type: duplicate primary key. The record already exists on the slave, and the same record is inserted on the master.

Last_SQL_Error: Could not execute Write_rows event on table hcy.t1;
Duplicate entry '2' for key 'PRIMARY',
Error_code: 1062;
handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000006, end_log_pos 924

The third type: A record is updated on the master, but it cannot be found on the slave, resulting in data loss.

Last_SQL_Error: Could not execute Update_rows event on table hcy.t1;
Can't find record in 't1',
Error_code: 1032;
handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 263

Asynchronous semi-synchronous difference

Asynchronous replication <br /> Simply put, the master sends the binlog, and the action ends regardless of whether the slave has received or executed it.

Semi-synchronous replication <br /> Simply put, the master sends the binlog, and the slave confirms that it has received it. Regardless of whether it has been executed, the master sends a signal that it has received it, and the action is completed. (Code written by Google, officially applied in 5.5.)

Disadvantages of asynchrony <br /> When the master is busy with write operations, the current POS point is, for example, 10, and the IO_THREAD thread on the slave receives 3. At this time, the master crashes, resulting in a difference of 7 points not being transmitted to the slave and data loss.

Special circumstances

The slave's relay log relay-bin is damaged.
Last_SQL_Error: Error initializing relay log position: I/O error reading the header from the binary log
Last_SQL_Error: Error initializing relay log position: Binlog has bad magic number;
It's not a binary log file that can be used by this version of MySQL

In this case, the SLAVE is down or shut down illegally, such as power failure, motherboard burnout, etc., causing the relay log to be damaged and synchronization to stop.

Be careful of human errors: multiple slaves have duplicate server-ids
In this case, the synchronization will be delayed and will never be completed. The above two lines of information will always appear in the error log. The solution is to change the server-id to be inconsistent.

Slave: received end packet from server, apparent master shutdown:
Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000012' at postion 106

Problem Solving

Deletion failed

Delete a record on the master, but it cannot be found on the slave.

Last_SQL_Error: Could not execute Delete_rows event on table hcy.t1;
Can't find record in 't1',
Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
the event's master log mysql-bin.000006, end_log_pos 254

Solution:

Because the master wants to delete a record, but the slave cannot find it and reports an error, in this case, the master has deleted it, so the slave can skip it directly. Available commands:

stop slave;
set global sql_slave_skip_counter=1;
start slave;

If this happens frequently, you can use a script I wrote, skip_error_replcation.sh, which skips 10 errors by default (it only skips this situation, and outputs error results in other situations and waits for processing). This script is written in shell with reference to the mk-slave-restart principle of the maakit toolkit. It defines some of its own functions, and does not skip all errors. )

Duplicate primary key

The record already exists on the slave, and the same record is inserted on the master.

Last_SQL_Error: Could not execute Write_rows event on table hcy.t1; 
Duplicate entry '2' for key 'PRIMARY', 
Error_code: 1062; 
handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000006, end_log_pos 924

Solution:

Use desc hcy.t1 on the slave; first look at the following table structure:

mysql> desc hcy.t1;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | NO | PRI | 0 | | 
| name | char(4) | YES | | NULL | | 
+-------+---------+------+-----+---------+-------+

Remove duplicate primary keys

mysql> delete from t1 where id=2;
Query OK, 1 row affected (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G;
…
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
…
mysql> select * from t1 where id=2;

Confirm again on the master and slave.

Update lost

A record is updated on the master, but it cannot be found on the slave, resulting in data loss.

Last_SQL_Error: Could not execute Update_rows event on table hcy.t1; 
Can't find record in 't1', 
Error_code: 1032; 
handler error HA_ERR_KEY_NOT_FOUND; 
the event's master log mysql-bin.000010, end_log_pos 794

Solution:

On the master, use mysqlbinlog to analyze what the wrong binlog log is doing.

/usr/local/mysql/bin/mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS mysql-bin.000010 | grep -A '10' 794

#120302 12:08:36 server id 22 end_log_pos 794 Update_rows: table id 33 flags: STMT_END_F
### UPDATE hcy.t1
###WHERE
### @1=2 /* INT meta=0 nullable=0 is_null=0 */
### @2='bbc' /* STRING(4) meta=65028 nullable=1 is_null=0 */
### SET
### @1=2 /* INT meta=0 nullable=0 is_null=0 */
### @2='BTV' /* STRING(4) meta=65028 nullable=1 is_null=0 */
# at 794
#120302 12:08:36 server id 22 end_log_pos 821 Xid = 60
COMMIT /*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

On the slave, find the updated record, which should not exist.

mysql> select * from t1 where id=2;
Empty set (0.00 sec)

Then go to the master to check

mysql> select * from t1 where id=2;
+----+------+
| id | name |
+----+------+
| 2 | BTV | 
+----+------+
1 row in set (0.00 sec)

Fill in the missing data on the slave and skip the error.

mysql> insert into t1 values ​​(2,'BTV');
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1 where id=2;  
+----+------+
| id | name |
+----+------+
| 2 | BTV | 
+----+------+
1 row in set (0.00 sec)

mysql> stop slave ;set global sql_slave_skip_counter=1;start slave;
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G;
…
 Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
…

Relay log corruption

The slave's relay log relay-bin is damaged.

Last_SQL_Error: Error initializing relay log position: I/O error reading the header from the binary log
Last_SQL_Error: Error initializing relay log position: Binlog has bad magic number; 
It's not a binary log file that can be used by this version of MySQL

Manual repair

Solution: Find the synchronized binlog and POS points, and then re-synchronize them so that you can have a new relay day value.

example:

mysql> show slave status\G;
*************************** 1. row ***************************
       Master_Log_File:mysql-bin.000010
     Read_Master_Log_Pos: 1191
        Relay_Log_File: vm02-relay-bin.000005
        Relay_Log_Pos: 253
    Relay_Master_Log_File: mysql-bin.000010
       Slave_IO_Running: Yes
      Slave_SQL_Running: No
       Replicate_Do_DB: 
     Replicate_Ignore_DB: 
      Replicate_Do_Table: 
    Replicate_Ignore_Table: 
   Replicate_Wild_Do_Table: 
 Replicate_Wild_Ignore_Table: 
          Last_Errno: 1593
          Last_Error: Error initializing relay log position: I/O error reading the header from the binary log
         Skip_Counter: 1
     Exec_Master_Log_Pos: 821

Slave_IO_Running: Receive binlog information from the master

Master_Log_File
Read_Master_Log_Pos

Slave_SQL_Running: Execute write operation

Relay_Master_Log_File
Exec_Master_Log_Pos

The binlog and POS point of execution shall prevail.

Relay_Master_Log_File: mysql-bin.000010
Exec_Master_Log_Pos: 821
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000010',MASTER_LOG_POS=821;
Query OK, 0 rows affected (0.01 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)


mysql> show slave status\G;
*************************** 1. row ***************************
        Slave_IO_State: Waiting for master to send event
         Master_Host: 192.168.8.22
         Master_User: repl
         Master_Port: 3306
        Connect_Retry: 10
       Master_Log_File:mysql-bin.000010
     Read_Master_Log_Pos: 1191
        Relay_Log_File: vm02-relay-bin.000002
        Relay_Log_Pos: 623
    Relay_Master_Log_File: mysql-bin.000010
       Slave_IO_Running: Yes
      Slave_SQL_Running: Yes
       Replicate_Do_DB: 
     Replicate_Ignore_DB: 
      Replicate_Do_Table: 
    Replicate_Ignore_Table: 
   Replicate_Wild_Do_Table: 
 Replicate_Wild_Ignore_Table: 
          Last_Errno: 0
          Last_Error: 
         Skip_Counter: 0
     Exec_Master_Log_Pos: 1191
       Relay_Log_Space: 778
       Until_Condition: None
        Until_Log_File: 
        Until_Log_Pos: 0
      Master_SSL_Allowed: No
      Master_SSL_CA_File: 
      Master_SSL_CA_Path: 
       Master_SSL_Cert: 
      Master_SSL_Cipher: 
        Master_SSL_Key: 
    Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
        Last_IO_Errno: 0
        Last_IO_Error: 
        Last_SQL_Errno: 0
        Last_SQL_Error: 
Ibbackup

All the tricks have been used, but the slave data has been lost too much. It's time for you to use ibbackup (which requires money).

Ibbackup hot backup tool is paid. Xtrabackup is free and has the same functionality.

Ibbackup does not lock the table during backup. A transaction is started during backup (equivalent to making a snapshot), and then a point is recorded. The subsequent data changes are saved in the ibbackup_logfile file. During recovery, the changed data in the ibbackup_logfile is written back to ibdata.

Ibbackup only backs up data (ibdata, .ibd), and does not back up the table structure .frm.

Here is a demonstration example:

Backup: ibbackup /bak/etc/my_local.cnf /bak/etc/my_bak.cnf

Restore: ibbackup --apply-log /bak/etc/my_bak.cnf

[root@vm01 etc]# more my_local.cnf 

datadir = /usr/local/mysql/data
innodb_data_home_dir = /usr/local/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /usr/local/mysql/data
innodb_buffer_pool_size = 100M
innodb_log_file_size = 5M
innodb_log_files_in_group=2


[root@vm01 etc]# ibbackup /bak/etc/my_local.cnf /bak/etc/my_bak.cnf 

InnoDB Hot Backup version 3.0.0; Copyright 2002-2005 Innobase Oy
License A21488 is granted to vm01 ([email protected])
(--apply-log works in any computer regardless of the hostname)
Licensed for use in a computer whose hostname is 'vm01'
Expires 2012-5-1 (year-month-day) at 00:00
See http://www.innodb.com for further information
Type ibbackup --license for detailed license terms, --help for help

Contents of /bak/etc/my_local.cnf:
innodb_data_home_dir got value /usr/local/mysql/data
innodb_data_file_path got value ibdata1:10M:autoextend
datadir got value /usr/local/mysql/data
innodb_log_group_home_dir got value /usr/local/mysql/data
innodb_log_files_in_group got value 2
innodb_log_file_size got value 5242880

Contents of /bak/etc/my_bak.cnf:
innodb_data_home_dir got value /bak/data
innodb_data_file_path got value ibdata1:10M:autoextend

datadir got value /bak/data
innodb_log_group_home_dir got value /bak/data
innodb_log_files_in_group got value 2
innodb_log_file_size got value 5242880

ibbackup: Found checkpoint at lsn 0 1636898
ibbackup: Starting log scan from lsn 0 1636864
120302 16:47:43 ibbackup: Copying log...
120302 16:47:43 ibbackup: Log copied, lsn 0 1636898
ibbackup: We wait 1 second before starting copying the data files...
120302 16:47:44 ibbackup: Copying /usr/local/mysql/data/ibdata1
ibbackup: A copied database page was modified at 0 1636898
ibbackup: Scanned log up to lsn 0 1636898
ibbackup: Was able to parse the log up to lsn 0 1636898
ibbackup: Maximum page number for a log record 0
120302 16:47:46 ibbackup: Full backup completed!
[root@vm01 etc]#
[root@vm01 etc]# cd /bak/data/
[root@vm01 data]# ls
ibbackup_logfile ibdata1

[root@vm01 data]# ibbackup --apply-log /bak/etc/my_bak.cnf 

InnoDB Hot Backup version 3.0.0; Copyright 2002-2005 Innobase Oy
License A21488 is granted to vm01 ([email protected])
(--apply-log works in any computer regardless of the hostname)
Licensed for use in a computer whose hostname is 'vm01'
Expires 2012-5-1 (year-month-day) at 00:00
See http://www.innodb.com for further information
Type ibbackup --license for detailed license terms, --help for help

Contents of /bak/etc/my_bak.cnf:
innodb_data_home_dir got value /bak/data
innodb_data_file_path got value ibdata1:10M:autoextend
datadir got value /bak/data
innodb_log_group_home_dir got value /bak/data
innodb_log_files_in_group got value 2
innodb_log_file_size got value 5242880

120302 16:48:38 ibbackup: ibbackup_logfile's creation parameters:
ibbackup: start lsn 0 1636864, end lsn 0 1636898,
ibbackup: start checkpoint 0 1636898


ibbackup: start checkpoint 0 1636898
InnoDB: Doing recovery: scanned up to log sequence number 0 1636898
InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 .....99
Setting log file size to 0 5242880
ibbackup: We were able to parse ibbackup_logfile up to
ibbackup: lsn 0 1636898
ibbackup: Last MySQL binlog file position 0 1191, file name ./mysql-bin.000010
ibbackup: The first data file is '/bak/data/ibdata1'
ibbackup: and the newly created log files are at '/bak/data/'
120302 16:48:38 ibbackup: Full backup prepared for recovery successfully!

[root@vm01 data]# ls
ibbackup_logfile ibdata1 ib_logfile0 ib_logfile1

Copy ibdata1 ib_logfile0 ib_logfile1 to the slave, and also copy .frm to it. After starting MySQL, synchronize it. That point is the output above:

ibbackup: Last MySQL binlog file position 0 1191, file name ./mysql-bin.000010
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000010',MASTER_LOG_POS=1191;

Maatkit Toolkit
http://www.maatkit.org/

Introduction

MaatKit is an open source toolkit that provides assistance for daily management of MySQL. Currently, it has been acquired and maintained by Percona. in:

mk-table-checksum is used to check whether the table structure and data on the master and slave are consistent.

mk-table-sync is used to repair inconsistencies between master and slave data.

There is no actual experience of operating these two toolkits in the live network. This is just a discussion of new technologies and academic exchanges. The following shows how to use them.

http://www.actionsky.com/products/mysql-others/maatkit.jsp

[root@vm02]# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy -t t1
Cannot connect to MySQL because the Perl DBI module is not installed or not found. 
Run 'perl -MDBI' to see the directories that Perl searches for DBI.
If DBI is not installed, try:
 Debian/Ubuntu apt-get install libdbi-perl
 RHEL/CentOS yum install perl-DBI
 OpenSolaris pgk install pkg:/SUNWpmdbi

If it prompts that the perl-DBI module is missing, then directly yum install perl-DBI.

[root@vm02 bin]# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy -t t1
DATABASE TABLE CHUNK HOST ENGINE COUNT CHECKSUM TIME WAIT STAT LAG
hcy t1 0 vm02 InnoDB NULL 1957752020 0 0 NULL NULL
hcy t1 0 vm01 InnoDB NULL 1957752020 0 0 NULL NULL

If the table data is inconsistent, the CHECKSUM value will be unequal.

Explain what the output means:

DATABASE: database name
TABLE: table name
CHUNK: approximate value of checksum
HOST: MYSQL address
ENGINE: table engine
COUNT: The number of rows in the table
CHECKSUM: Checksum value
TIME: time taken
WAIT: Waiting time
STAT: MASTER_POS_WAIT() return value
LAG: slave delay time

If you want to filter out which tables are not equal, you can use the tool mk-checksum-filter. Just add a pipe character at the end.

[root@vm02 ~]# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy | mk-checksum-filter    
hcy t2 0 vm01 InnoDB NULL 1957752020 0 0 NULL NULL
hcy t2 0 vm02 InnoDB NULL 1068689114 0 0 NULL NULL

Knowing which tables are inconsistent, you can use the mk-table-sync tool to handle them.

Note: The table will be locked when executing mk-table-checksum. The size of the table depends on the speed of execution.

Table t2 data on MASTER:

Table t2 data on SLAVE:

mysql> select * from t2; mysql> select * from t2;  
+----+------+ +----+------+
| id | name | | id | name |
+----+------+ +----+------+
| 1 | a | | 1 | a | 
| 2 | b | | 2 | b | 
| 3 | ss | | 3 | ss | 
| 4 | asd | | 4 | asd | 
| 5 | ss | +----+------+
+----+------+ 4 rows in set (0.00 sec)
5 rows in set (0.00 sec) 
                     mysql> \!hostname; 
mysql> \! hostname; vm02    
vm01 
[root@vm02 ~]# mk-table-sync --execute --print --no-check-slave --transaction --databases hcy h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 
INSERT INTO `hcy`.`t2`(`id`, `name`) VALUES ('5', 'ss') /*maatkit src_db:hcy src_tbl:t2 src_dsn:h=vm01,p=...,u=admin dst_db:hcy dst_tbl:t2 
dst_dsn:h=vm02,p=...,u=admin lock:0 transaction:1 changing_src:0 replicate:0 bidirectional:0 pid:3246 user:root host:vm02*/;

Its working principle is: first check whether the tables of the master and slave libraries are the same row by row. If there are any differences, perform delete, update, insert and other operations to make them consistent. The size of the table determines how fast the execution will be.

If C<--transaction> is specified, C<LOCK TABLES> is not used. Instead, lock
and unlock are implemented by beginning and committing transactions.
The exception is if L<"--lock"> is 3.
If C<--no-transaction> is specified, then C<LOCK TABLES> is used for any
value of L<"--lock">. See L<"--[no]transaction">.
When enabled, either explicitly or implicitly, the transaction isolation level
is set C<REPEATABLE READ> and transactions are started C<WITH CONSISTENT
SNAPSHOT>

MySQL replication monitoring

Common MySQL error types

1005: Failed to create table
1006: Failed to create database
1007: The database already exists. Failed to create the database.
1008: The database does not exist. Failed to delete the database.
1009: Unable to delete database file, resulting in failure to delete database
1010: Failed to delete the data directory, resulting in failure to delete the database
1011: Failed to delete database file
1012: Unable to read records from system table
1020: The record has been modified by another user
1021: Insufficient free space on the hard disk. Please increase the free space on the hard disk.
1022: Keyword duplicated, record change failed
1023: Error occurred during shutdown
1024: Error reading file
1025: An error occurred while changing the name
1026: Error writing file
1032: Record does not exist
1036: The data table is read-only and cannot be modified
1037: Insufficient system memory. Please restart the database or the server.
1038: Insufficient memory for sorting, please increase the sort buffer
1040: The maximum number of database connections has been reached. Please increase the number of available database connections.
1041: Insufficient system memory
1042: Invalid hostname
1043: Invalid connection
1044: The current user does not have permission to access the database
1045: Unable to connect to the database, incorrect username or password
1048: Field cannot be empty
1049: Database does not exist
1050: The data table already exists
1051: The data table does not exist
1054: Field does not exist
1065: Invalid SQL statement, SQL statement is empty
1081: Unable to establish socket connection
1114: The data table is full and cannot accommodate any records.
1116: Too many open tables
1129: The database is abnormal. Please restart the database.
1130: Failed to connect to the database. No permission to connect to the database
1133: Database user does not exist
1141: The current user does not have permission to access the database
1142: The current user does not have permission to access the data table
1143: The current user does not have permission to access the fields in the data table
1146: The data table does not exist
1147: User access rights to the table are not defined
1149: SQL statement syntax error
1158: Network error, read error, please check the network connection status
1159: Network error, read timeout, please check the network connection status
1160: Network error, write error occurred, please check the network connection status
1161: Network error, write timeout, please check the network connection status
1062: Field value is repeated, storage failed
1169: Field value is duplicated, record update failed
1177: Failed to open the data table
1180: Transaction commit failed
1181: Rollback transaction failed
1203: The number of connections between the current user and the database has reached the maximum number of connections for the database. Please increase the number of available database connections or restart the database.
1205: Lock timeout
1211: The current user does not have permission to create a user
1216: Foreign key constraint check failed, update of child table record failed
1217: Foreign key constraint check failed, deletion or modification of primary table records failed
1226: The resources used by the current user have exceeded the allowed resources. Please restart the database or restart the server.
1227: Insufficient permissions. You do not have permission to perform this operation.
1235: The MySQL version is too low and does not have this function.

Copy monitoring script

Modified based on the original text.

Original script

#!/bin/bash
#
#check_mysql_slave_replication_status
#
#
#
parasum=2
help_msg(){
 
cat <<
 help
+---------------------+
+Error
 Cause:
+you
 must input $parasum parameters!
+1st
 : Host_IP
+2st
 : Host_Port
help
exit
}
 
[
 $#
 -ne ${parasum} ] && help_msg #If the parameters are not enough, print the help information and exit export HOST_IP=$1
export HOST_PORt=$2
MYUSER="root"     
MYPASS="123456"
 
MYSQL_CMD="mysql
 -u$MYUSER -p$MYPASS"
MailTitle="" #Mail subject Mail_Address_MysqlStatus="[email protected]" #Recipient mailbox time1=$(date +"%Y%m%d%H%M%S")
time2=$(date +"%Y-%m-%d
 %H:%M:%S")
 
SlaveStatusFile=/tmp/salve_status_${HOST_PORT}.${time1} 
#The file where the email content is located echo "--------------------Begin
 at: "$time2
 > $SlaveStatusFile
echo "" >>
 $SlaveStatusFile
 
#get
 slave status
${MYSQL_CMD}
 -e "show
 slave status\G" >>
 $SlaveStatusFile #Get the status of the salve process#get
 io_thread_status,sql_thread_status,last_errno Get the following status value IOStatus=$(cat $SlaveStatusFile|grep Slave_IO_Running|awk '{print
 $2}')
SQLStatus=$(cat $SlaveStatusFile|grep Slave_SQL_Running
 |awk '{print
 $2}')
  Errno=$(cat $SlaveStatusFile|grep Last_Errno
 | awk '{print
 $2}')
  Behind=$(cat $SlaveStatusFile|grep Seconds_Behind_Master
 | awk '{print
 $2}')
 
echo "" >>
 $SlaveStatusFile
 
if [
"$IOStatus" ==
"No" ]
 || [ "$SQLStatus" ==
"No" ]; then #Judge the error type if [
"$Errno" -eq 0
 ]; then #Maybe the salve thread is not started $MYSQL_CMD
 -e "start
 slave io_thread;start slave sql_thread;"
      echo "Cause
 slave threads does not run, trying to start slsave io_thread; start slave sql_thread;" >>
 $SlaveStatusFile
      MailTitle="[Warning]
 Slave threads stopped on $HOST_IP $HOST_PORT"
    elif [
"$Errno" -eq 1007
 ] || [ "$Errno" -eq 1053
 ] || [ "$Errno" -eq 1062
 ] || [ "$Errno" -eq 1213
 ] || [ "$Errno" -eq 1032
 ]\
      ||
 [ "Errno" -eq 1158
 ] || [ "$Errno" -eq 1159
 ] || [ "$Errno" -eq 1008
 ]; then #Ignore these errors $MYSQL_CMD
 -e "stop
 slave;set global sql_slave_skip_counter=1;start slave;"
      echo "Cause
 slave replication catch errors,trying skip counter and restart slave;stop slave ;set global sql_slave_skip_counter=1;slave start;" >>
 $SlaveStatusFile
      MailTitle="[Warning]
 Slave error on $HOST_IP $HOST_PORT! ErrNum: $Errno"
    else
      echo "Slave
 $HOST_IP $HOST_PORT is down!" >>
 $SlaveStatusFile
      MailTitle="[ERROR]Slave
 replication is down on $HOST_IP $HOST_PORT ! ErrNum:$Errno"
    fi
fi
if [
 -n "$Behind" ];then
    Behind=0
fi
echo "$Behind" >>
 $SlaveStatusFile
 
#delay
 behind master determines the delay time if [
 $Behind -gt 300 ];then
  echo `date +"%Y-%m%d
 %H:%M:%S"`
"slave
 is behind master $Bebind seconds!" >>
 $SlaveStatusFile
  MailTitle="[Warning]Slave
 delay $Behind seconds,from $HOST_IP $HOST_PORT"
fi
 
if [
 -n "$MailTitle" ]; then #If an error occurs or the delay is greater than 300s, send an email cat ${SlaveStatusFile}
 | /bin/mail -s
"$MailTitle" $Mail_Address_MysqlStatus
fi
 
#del
 tmpfile:SlaveStatusFile
>
 $SlaveStatusFile

Modified script

Only simple tidying was done, and the judgment of Behind being NULL was corrected, but no test was done;

Consider adding:

Judgment on the results of repair execution; cyclic repair, detection, and repair of multiple errors?

Cancel the SlaveStatusFile temporary file.

The Errno and Behind alarms are sent to emails separately, and the alarm body is added with the original text of the show slave result.

Increase PATH so that it can be added to crontab.

Consider periodic execution in crontab (locking to avoid execution conflicts, execution cycle selection)

Add execution log?

#!/bin/sh
#
 check_mysql_slave_replication_status
#
 Reference: http://www.tianfeiyu.com/?p=2062
 
Usage(){
  echo Usage:
  echo "$0
 HOST PORT USER PASS"
}
 
[
 -z "$1" -o
 -z "$2" -o
 -z "$3" -o
 -z "$4" ]
 && Usage && exit 1
HOST=$1
PORT=$2
USER=$3
PASS=$4
 
MYSQL_CMD="mysql
 -h$HOST -P$PORT -u$USER -p$PASS"
 
MailTitle="" #Mail subject Mail_Address_MysqlStatus="[email protected]" #Recipient mailbox time1=$(date +"%Y%m%d%H%M%S")
time2=$(date +"%Y-%m-%d
 %H:%M:%S")
 
SlaveStatusFile=/tmp/salve_status_${HOST_PORT}.${time1} 
#The file where the email content is located echo "--------------------Begin
 at: "$time2
 > $SlaveStatusFile
echo "" >>
 $SlaveStatusFile
 
#get
 slave status
${MYSQL_CMD}
 -e "show
 slave status\G" >>
 $SlaveStatusFile #Get the status of the salve process#get
 io_thread_status,sql_thread_status,last_errno Get the following status value IOStatus=$(cat $SlaveStatusFile|grep Slave_IO_Running|awk '{print
 $2}')
SQLStatus=$(cat $SlaveStatusFile|grep Slave_SQL_Running
 |awk '{print
 $2}')
  Errno=$(cat $SlaveStatusFile|grep Last_Errno
 | awk '{print
 $2}')
  Behind=$(cat $SlaveStatusFile|grep Seconds_Behind_Master
 | awk '{print
 $2}')
 
echo "" >>
 $SlaveStatusFile
 
if [
"$IOStatus" =
"No" -o
"$SQLStatus" =
"No" ]; then
  case "$Errno" in
  0)
    #
 It may be that the slave has not started $MYSQL_CMD
 -e "start
 slave io_thread;start slave sql_thread;"
    echo "Cause
 slave threads does not run, trying to start slsave io_thread; start slave sql_thread;" >>
 $SlaveStatusFile
    ;;
  1007|1053|1062|1213|1032|1158|1159|1008)
    #
 Ignore these errors $MYSQL_CMD
 -e "stop
 slave;set global sql_slave_skip_counter=1;start slave;"
    echo "Cause
 slave replication catch errors,trying skip counter and restart slave;stop slave ;set global sql_slave_skip_counter=1;slave start;" >>
 $SlaveStatusFile
    MailTitle="[Warning]
 Slave error on $HOST:$PORT! ErrNum: $Errno"
    ;;
  *)
    echo "Slave
 $HOST:$PORT is down!" >>
 $SlaveStatusFile
    MailTitle="[ERROR]Slave
 replication is down on $HOST:$PORT! Errno:$Errno"
    ;;
  esac
fi
 
if [
"$Behind" =
"NULL" -o
 -z "$Behind" ];then
  Behind=0
fi
echo "Behind:$Behind" >>
 $SlaveStatusFile
 
#delay
 behind master determines the delay time if [
 $Behind -gt 300 ];then
  echo `date +"%Y-%m%d
 %H:%M:%S"`
"slave
 is behind master $Bebind seconds!" >>
 $SlaveStatusFile
  MailTitle="[Warning]Slave
 delay $Behind seconds,from $HOST $PORT"
fi
 
if [
 -n "$MailTitle" ]; then #If an error occurs or the delay is greater than 300s, send an email cat ${SlaveStatusFile}
 | /bin/mail -s
"$MailTitle" $Mail_Address_MysqlStatus
fi
 
#del
 tmpfile:SlaveStatusFile
>
 $SlaveStatusFile

The above summary of online MYSQL synchronization error troubleshooting methods (must-read) is all the content that the editor shares with you. I hope it can give you a reference, and I also hope that you will support 123WORDPRESS.COM.

You may also be interested in:
  • How to check mysql locks through mysql show processlist command
  • mysql show processlist displays the mysql query process
  • A solution to the MYSQL master-slave out-of-sync problem
  • A MySQL slow query caused a failure
  • Super deployment tutorial of MHA high availability failover solution under MySQL
  • Overview, installation, troubleshooting, tips, and tools of MySQL replication (shared by Huo Ding)
  • Methods for detecting MySQL table failures
  • Common failures and reasons for mysql connection failure
  • MySQL SHOW PROCESSLIST assists in the entire process of troubleshooting

<<:  JS gets the position of the nth occurrence of a specified string in a string

>>:  Detailed explanation of FTP server configuration and 425 error and TLS warning solutions for FileZilla Server

Recommend

MYSQL slow query and log example explanation

1. Introduction By enabling the slow query log, M...

Summary of four ways to loop through an array in JS

This article compares and summarizes four ways of...

The difference between mysql outer join and inner join query

The syntax for an outer join is as follows: SELEC...

Tutorial on installing MySQL 5.6 on CentOS 6.5

1. Download the RPM package corresponding to Linu...

IE8 provides a good experience: Activities

Today I had a sneak peek at IE8 beta 1 (hereafter...

Detailed explanation of MySQL combined index method

For any DBMS, indexes are the most important fact...

A brief discussion on the Linux kernel's support for floating-point operations

Currently, most CPUs support floating-point units...

How to uninstall MySQL 5.7 on CentOS7

Check what is installed in mysql rpm -qa | grep -...

Vue implements image dragging and sorting

This article example shares the specific code of ...

Detailed explanation of the relationship between Linux and GNU systems

Table of contents What is the Linux system that w...

js implements a simple calculator

Use native js to implement a simple calculator (w...