How to solve the synchronization delay caused by MySQL DDL

How to solve the synchronization delay caused by MySQL DDL

Preface

Writing case analysis, mainly tool introduction and recommendation. The synchronization mechanism of MySQL is relatively simple. The DML and DDL executed on the master database will be executed again on the slave database. Therefore, the DDL that takes 10 minutes to execute on the master database will theoretically take at least 10 minutes to execute on the slave database. This means that the synchronization of the slave database will be delayed by more than 10 minutes, and the synchronization will continue only after the DDL is executed.

Solution

From the perspective of MySQL synchronization principles, the main reason is that the DDL operation alone takes too long, causing the slave database to be stuck in the master state. Then it is easy to think of a solution to this problem: "disassemble" the DDL operation, split a large operation (the same applies to large transactions) into multiple small operations, and reduce the time of a single operation.

To "disassemble" DDL operations, MySQL Online DDL tools are generally used, such as pt-osc, facebook-osc, oak-online-alter-table, gh-ost, etc. The ideas of these tools are similar. They create a mirror table of the source table, execute the table structure change first, and then synchronize the full data and incremental data of the source table. This can avoid the synchronization delay caused by a single DDL operation.

Tool Introduction

This article will introduce gh-ost, a MySQL online DDL tool maintained by Github. It also uses the mirror table format, but abandons the use of inefficient triggers and instead extracts the required incremental data from the binlog to maintain data consistency between the mirror table and the source table. The entire Online DDL operation will only block reading and writing for a few seconds when the source table and mirror table are finally renamed.

How it works

The operation process of go-ost ​​is as follows:

  • Create a mirror table (_tablename_gho) and a heartbeat table (_tablename_ghc) in the Master.
  • Write the progress and time of Online-DDL to the heartbeat table.
  • Perform an ALTER operation on the mirror table.
  • Disguise as a slave and connect to a slave instance of the Master to obtain binlog information (connect to the Slave by default, but can also connect to the Master).
  • Complete the data synchronization of the mirror table in the Master:
    • Copy data from the source table to the mirror table;
    • Complete incremental data changes based on Binlog information;
  • Lock the source table;
  • Confirm the time in the heartbeat table to ensure that the data is fully synchronized;
  • Replace the source table with the mirror table.
  • Online DDL completed.
  • Functions or features that will be supported in the future:
    • Supports foreign keys.
    • If a gh-ost process is unexpectedly terminated, you can start a new process to continue the Online DDL.

The contents of _tablename_ghc are as follows:

Limitation of Use

  • The binlog format must use row, and binlog_row_image must be FULL.
  • The required permissions are SUPER, REPLICATION CLIENT, REPLICATION SLAVE on *.* and ALL on dbname.*
    • If you confirm that the binlog format is row, you can add -assume-rbr, and the super permission is no longer required.
    • TiDB cannot be used because it does not support REPLICATION-related permissions.
  • Foreign keys are not supported.
    • Whether the source table is a primary table or a child table, it cannot be used.
  • Triggers are not supported.
  • Primary keys containing JSON columns are not supported.
  • The migration table needs to have an explicitly defined primary key or a non-empty unique index.
  • The migration tool is not case sensitive. If a table with the same name but different capitalization exists, it cannot be migrated.
  • When the primary key or non-empty unique index of the migration table contains an enumeration type, the migration efficiency will be greatly reduced.

Usage Notes

  • If the source table has a lot of data, try to delete it in batches.
    • delete from table tablename_old limit 5000;
    • Alternatively, during business idle time, clear the table data using truncate table tablename_old and then drop the table.
  • When starting multiple gh-ost servers on a single MySQL instance to perform online DDL operations on multiple tables, you need to specify the -replica-server-id parameter.
  • Always be aware of available disk space, especially when working with large tables.
    • gh-ost's mirror table contains all the data of the source table and will take up twice as much disk space.
    • gh-ost generates a large amount of binlogs during operation, and binlog_row_image must be FULL, which will take up a lot of disk space.
  • The rename column operation may cause problems. Consider combining the drop and add operations.

Usage Examples

The installation package is available for download on the github official website, refer to the release note.

The actual command can be referred to as follows (row mode is enabled):

gh-ost --max-load=Threads_running=50 \
            --critical-load=Threads_running=100 \
            --chunk-size=3000 --user="temp" --password="test" --host=10.10.1.10 \
            --allow-on-master --database="sbtest" --table="sbtest1" \
            --alter="engine=innodb" --cut-over=default \
            --exact-rowcount --concurrent-rowcount --default-retries=120 \
            --timestamp-old-table -assume-rbr --panic-flag-file=/tmp/ghost.panic.flag \
            --execute

Some parameter descriptions

The above order content shall prevail:

max-load=Threads_running=50 When more than 50 clients are executing SQL queries, Online DDL operations are suspended. critical-load=Threads_running=100 When more than 100 clients are executing SQL queries, Online DDL operations are interrupted. chunk-size=3000 Each synchronization operation processes 3000 rows of data. allow-on-master Allows all operations related to Online DDL to be executed on the master database. alter Online DDL operations only require part of the alter statement (the part in square brackets).
                                     Example: alter table sbtest.sbtest1 [add column t int not NULL]
cut-over=default Automatically switch the mirror table and the source table after data synchronization is completed exact-rowcount Accurately calculate the number of rows to provide more accurate progress timestamp-old-table Use timestamp to name the old table assume-rbr Skip restarting the slave thread and row format check, no super permission is required after setting panic-flag-file After creating this file, the Online DDL operation will be forced to be interrupted

In addition to these parameters, gh-ost also provides many ways to externally pause or force abort Online DDL operations. Detailed information can be viewed using the gh-ost --help command.

Output example

# Migrating `sbtest`.`sbtest1`; Ghost table is `sbtest`.`_sbtest1_gho`
# Migrating 10.10.1.10:3306; inspecting10.10.1.10:3306; executing on localhost-debian
# Migration started at Thu Jul 30 11:30:17 +0800 2020
# chunk-size: 3000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=50; critical-load: Threads_running=100; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.sbtest.sbtest1.sock
Copy: 0/9863066 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000050:31635038; Lag: 0.03s, State: migrating; ETA: N/A
Copy: 0/9863066 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql-bin.000050:31639503; Lag: 0.03s, State: migrating; ETA: N/A
Copy: 69000/9999998 0.7%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 2s(copy); streamer: mysql-bin.000050:44815698; Lag: 0.03s, State: migrating; ETA: 4m49s
Copy: 135000/9999998 1.4%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 3s(copy); streamer: mysql-bin.000050:57419220; Lag: 0.03s, State: migrating; ETA: 3m39s
Copy: 195000/9999998 2.0%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 4s(copy); streamer: mysql-bin.000050:68877374; Lag: 0.03s, State: migrating; ETA: 3m21s
......(omitted)
Copy: 9729000/9999998 97.3%; Applied: 0; Backlog: 0/1000; Time: 3m16s(total), 3m16s(copy); streamer: mysql-bin.000057:8595335; Lag: 0.04s, State: migrating; ETA: 5s
[2020/07/30 11:33:32] [info] binlogsyncer.go:723 rotate to (mysql-bin.000057, 4)
Copy: 9774000/9999998 97.7%; Applied: 0; Backlog: 0/1000; Time: 3m17s(total), 3m17s(copy); streamer: mysql-bin.000057:17190073; Lag: 0.03s, State: migrating; ETA: 4s
[2020/07/30 11:33:32] [info] binlogsyncer.go:723 rotate to (mysql-bin.000057, 4)
Copy: 9822000/9999998 98.2%; Applied: 0; Backlog: 0/1000; Time: 3m18s(total), 3m18s(copy); streamer: mysql-bin.000057:26357495; Lag: 0.04s, State: migrating; ETA: 3s
Copy: 9861000/9999998 98.6%; Applied: 0; Backlog: 0/1000; Time: 3m19s(total), 3m19s(copy); streamer: mysql-bin.000057:33806865; Lag: 0.03s, State: migrating; ETA: 2s
Copy: 9903000/9999998 99.0%; Applied: 0; Backlog: 0/1000; Time: 3m20s(total), 3m20s(copy); streamer: mysql-bin.000057:41828922; Lag: 0.03s, State: migrating; ETA: 1s
Copy: 9951000/9999998 99.5%; Applied: 0; Backlog: 0/1000; Time: 3m21s(total), 3m21s(copy); streamer: mysql-bin.000057:50996347; Lag: 0.03s, State: migrating; ETA: 0s
Copy: 9999998/9999998 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m22s(total), 3m21s(copy); streamer: mysql-bin.000057:60354465; Lag: 0.03s, State: migrating; ETA: due
# Migrating `sbtest`.`sbtest1`; Ghost table is `sbtest`.`_sbtest1_gho`
# Migrating 10.10.1.10:3306; inspecting 10.10.1.10:3306; executing onlocalhost-debian
# Migration started at Thu Jul 30 11:30:17 +0800 2020
# chunk-size: 3000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=50; critical-load: Threads_running=100; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.sbtest.sbtest1.sock
Copy: 9999998/9999998 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m23s(total), 3m21s(copy); streamer: mysql-bin.000057:60359997; Lag: 0.03s, State: migrating; ETA: due
[2020/07/30 11:33:41] [info] binlogsyncer.go:164 syncer is closing...
[2020/07/30 11:33:41] [error] binlogstreamer.go:77 close sync with err: sync is being closed...
[2020/07/30 11:33:41] [info] binlogsyncer.go:179 syncer is closed

You can see that the log content outputs a detailed progress percentage and the remaining time of the migration, which is very convenient for estimating the end time of maintenance and checking the progress of DDL execution.

Tencent Cloud Database MySQL Usage Notes

  • The default binlog_row_image of Tencent Cloud Database MySQL is MINIMAL. You need to actively adjust it to FULL in the control before use (online change, effective immediately).
  • Tencent Cloud Database, Alibaba Cloud Database, MySQL in the container, etc. may encounter port problems. Just add the --aliyun-rds parameter.
    • The error message is similar to FATAL Unexpected database port reported.
    • For related discussions, see issues.

To sum up

gh-ost outputs better information, migrates data more efficiently, and supports more features than pt-osc and other tools. However, the problems with gh-ost (such as disk space) are also encountered with other tools. Therefore, when you want to avoid latency issues during DDL operations, it is recommended to give priority to gh-ost.

The above is the details on how to solve the synchronization delay caused by MySQL DDL. For more information about the synchronization delay caused by MySQL DDL, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Detailed explanation of MySQL's Seconds_Behind_Master
  • Causes and solutions for MySQL master-slave synchronization delay
  • Analysis and solution of MYSQL master-slave asynchrony delay principle
  • Detailed explanation of reducing MySQL master-slave data synchronization delay
  • Why Seconds_Behind_Master is still 0 when MySQL synchronization delay occurs

<<:  Tomcat source code analysis of Web requests and processing

>>:  Learn about JavaScript closure functions in one article

Recommend

JavaScript custom calendar effect

This article shares the specific code of JavaScri...

ul list tag design web page multi-column layout

I suddenly thought of this method when I was writi...

Talk about the 8 user instincts behind user experience in design

Editor's note: This article is contributed by...

MySQL complete collapse: detailed explanation of query filter conditions

Overview In actual business scenario applications...

Flex layout achieves fixed number of rows per line + adaptive layout

This article introduces the flex layout to achiev...

Some summary of MySQL's fuzzy query like

1. Common usage: (1) Use with % % represents a wi...

Vue implements drag and drop or click to upload pictures

This article shares the specific code of Vue to a...

Detailed explanation of MySQL slow queries

Query mysql operation information show status -- ...

MySQL full backup and quick recovery methods

A simple MySQL full backup script that backs up t...

Solution to forgetting the password of the pagoda panel in Linux 3.X/4.x/5.x

Enter ssh and enter the following command to rese...

MySQL 5.7.17 Compressed Version Installation Notes

This article shares the installation steps of MyS...

Summary of pitfalls encountered in installing mysql and mysqlclient on centos7

1. Add MySQL Yum repository MySQL official websit...

How to use mysqldump to backup MySQL data

1. Introduction to mysqldump mysqldump is a logic...