MySQL online DDL tool gh-ost principle analysis

MySQL online DDL tool gh-ost principle analysis

1. Introduction

gh-ost is based on the golang language and is an open source DDL tool from GitHub. It is the abbreviation of GitHub's Online Schema Transmogrifier/Transfigurator/Transformer/Thingy, which means GitHub's online table definition converter.

1.1 Principle

The main implementation principle is to first create two tables, one is the shadow table _gho, gh-ost will apply the original table data and incremental data to this table, and finally switch the table name between this table and the original table, and the other is the _ghc table, which stores the changelog data, including signal markers, heartbeats, etc. Secondly, gh-ost will start two goroutines, one for copying the original table data and the other for applying the incremental binlog to the _gho table. The two goroutines run in parallel, which means you don’t need to worry about whether the data is copied first or the binlog is applied first. Because the insert statement will be adjusted here, first the insert into we copied will be rewritten as insert ignore into, and the insert into in the binlog will be rewritten as replace into, which can well support the parallelization of the two goroutines. But can such adjustments be applied to all DDLs? The answer is no. Finally, once all the original table data has been copied, gh-ost will enter the table swap phase, using a more secure atomic swap.

1.2 Process

1. Check for foreign keys and triggers.
2. Check the primary key information of the table.
3. Check whether it is a master or slave database, whether log_slave_updates is enabled, and binlog information
4. Check whether the temporary table ending with gho and del exists
5. Create a table ending with ghc to store data migration information and binlog information, etc.
---The above verification stage
6. Initialize the stream connection and add binlog monitoring
--- The following migration phases
7. Create a temporary table ending with gho and execute DDL on the temporary table ending with gho
8. Start a transaction, write the source table data to the table ending with gho according to the primary key ID, submit it, and apply binlog.
---The following cut-over stage
9. Lock the source table, rename the table: rename the source table to the source_del table, gho table to the source table.
10. Clean up the ghc tables.

1.3 Features

1. No trigger: Monitor data changes in the table by analyzing binlog logs.

2. Lightweight: Since no triggers are used, the impact on the main database during the operation is minimal, and there is no need to worry about concurrency and locking.

3. Pausable: All write operations are controlled by gh-ost. When rate-limited, gh-ost can pause writing data to the master, create an internal tracking table, and write heartbeat events to this table with minimal system overhead.

4. Dynamic control: gh-ost can listen for requests via a unix socket file or a TCP port (configurable), and the operator can change the corresponding parameters after the command is run.

5. Auditable: The program interface can be used to obtain the status of gh-ost, report the current progress, the configuration of key parameters, the current server identification, etc.

6. Testable: gh-ost has built-in support for testing, which can be specified using the --test-on-replica flag: it will perform mutations on the replica, at the end of which gh-ost will stop replication, swap the table, reverse the swap, keep both tables in sync, and stop replication. You can test and compare the data in the two tables at your leisure.

1.4 github address

https://github.com/github/gh-ost/

2. Test environment:

2.1 Test Server

Main database: 110.119.120.231

From the library: 110.119.120.230

2.2 Installation

cd /usr/local/src/

wget https://github.com/github/gh-ost/releases/download/v1.0.48/gh-ost-binary-linux20190214020851.tar.gz

tar xzvf gh-ost-binary-linux-20190214020851.tar.gz -C /usr/local/
ln -s /usr/local/gh-ost /usr/bin/gh-ost

2.3 Create a User

create user ghost@'110.%' identified by 'ghost';

grant ALL PRIVILEGES on *.* to ghost@'110.%';

flush privileges;

2.4 Command Parameters

Usage of gh-ost:
 --aliyun-rds: Whether to execute on Alibaba Cloud Database. true
 --allow-master-master: Whether to allow gh-ost to run in a dual-master replication architecture. Generally used together with the -assume-master-host parameter. --allow-nullable-unique-key: Allow gh-ost to allow the unique key that the data migration depends on to be NULL. The default is to not allow NULL unique keys. If the unique key that the data migration relies on allows NULL values, incorrect data may result. Please use it with caution.
 --allow-on-master: Allow gh-ost to run directly on the master. The default slave that gh-ost connects to. In addition, for DDL on a single instance, a single instance is equivalent to a master database, and the --allow-on-master parameter and ROW mode need to be enabled.
 --alter string:DDL statement --approve-renamed-columns ALTER: If you change the name of a column, gh-ost will recognize this and ask for a reason for the column rename. By default, gh-ost will not proceed unless you provide --approve-renamed-columns ALTER.
 --ask-pass:MySQL password --assume-master-host string:Specify a master database for gh-ost, in the format of "ip:port" or "hostname:port". This is useful in a master-master architecture, or when gh-ost cannot find the master.
 --assume-rbr: If binlog_format=ROW is set for the database instance to which gh-ost is connected, you can specify -assume-rbr. This will prevent stop slave and start slave from running on the slave, and the gh-ost user does not need the SUPER privilege to execute.
 --check-flag
 --chunk-size int: Number of rows to process in each iteration (allowed range: 100-100000), default value is 1000.
 --concurrent-rowcount: If this parameter is True (default), after row-copying, gh-ost estimates the number of rows (using explain select count(*)) and adjusts the ETA. Otherwise, gh-ost first estimates the number of rows and then starts row-copying.
 --conf string: gh-ost configuration file path.
 --critical-load string: A comma-separated list of status-name=values. When the MySQL status exceeds the corresponding values, gh-ost will exit. -critical-load Threads_connected=20,Connections=1500 means that when the status values ​​in MySQL Threads_connected>20,Connections>1500, gh-ost will stop and exit due to the critical load of the database.
 Comma delimited status-name=threshold, same format as --max-load. When status exceeds threshold, app panics and quits
 --critical-load-hibernate-seconds int : When the critical load is reached, gh-ost will go into hibernation for the specified time. It will not read/write anything from any server.
 --critical-load-interval-millis int: When the value is 0, gh-ost will exit immediately when the critical load is reached. When the value is not 0, gh-ost will check again after -critical-load-interval-millis seconds when the -critical-load is reached. If the -critical-load is still reached after the second check, gh-ost will exit.
 --cut-over string: Select the cut-over type: atomic/two-step. The atomic (default) type of cut-over uses the github algorithm, and the two-step type uses the facebook-OSC algorithm.
 --cut-over-exponential-backoff
 --cut-over-lock-timeout-seconds int: The maximum lock waiting time during the cut-over phase of gh-ost. When the lock times out, gh-ost will retry the cut-over. (Default: 3)
 --database string: database name.
 --debug: debug mode.
 --default-retries int: The number of times various operations are retried before panicking. (Default is 60)
 --discard-foreign-keys: This parameter is for a table with foreign keys. When gh-ost creates a ghost table, it will not create foreign keys for the ghost table. This parameter is useful for deleting foreign keys, but otherwise use it with caution.
 --dml-batch-size int: Batch size for applying DML events in a single transaction (range 1-100) (default 10)
 --exact-rowcount: Accurately count the number of table rows (using select count(*)) to get a more accurate estimate of the time.
 --execute: actually execute alter & migrate table. The default is noop, which means no execution. Just test and exit. If you want the ALTER TABLE statement to be actually implemented in the database, you need to explicitly specify -execute
 --exponential-backoff-max-interval int
 --force-named-cut-over: If true, the 'unpostpone | cut-over' interactive command must name the migrated table --force-table-names string: Table name prefix to use on temporary tables --heartbeat-interval-millis int: gh-ost heartbeat frequency value, default is 500
 --help
 --hooks-hint string: arbitrary messages injected into hooks via GH_OST_HOOKS_HINT --hooks-path string: directory where hook files are stored (default is empty, i.e. hooks are disabled). The hook will search for a hook file with the same naming convention in this directory to execute.
 --host string :MySQL IP/hostname
 --initially-drop-ghost-table: Check and drop existing ghost tables before gh-ost operation. This parameter is not recommended. Please manually process the existing ghost table. By default, this parameter is not enabled and gh-ost will exit the operation.
 --initially-drop-old-table: Before gh-ost operation, check and drop the existing old table. This parameter is not recommended. Please manually process the existing ghost table. By default, this parameter is not enabled and gh-ost will exit the operation.
 --initially-drop-socket-file:gh-ost forcefully deletes an existing socket file. This parameter is not recommended and may delete a running gh-ost program, causing DDL failures.
 --master-password string :MySQL master password --master-user string:MySQL master account --max-lag-millis int:Maximum delay time of master-slave replication. When the delay time of master-slave replication exceeds this value, gh-ost will take throttle measures. Default value: 1500s.
 --max-load string: comma separated status name = threshold, such as: 'Threads_running=100,Threads_connected=500'. When status exceeds threshold, app throttles writes
 --migrate-on-replica: gh-ost runs migration on the replica, not the master. 
 --nice-ratio float: The sleep time for each chunk period, in the range [0.0…100.0]. 0: No sleep in each chunk period, that is, one chunk is executed after another; 1: Sleep for an additional 1 millisecond for each row-copy; 0.7: Sleep for an additional 7 milliseconds for each row-copy of 10 milliseconds.
 --ok-to-drop-table: After the gh-ost operation is completed, the old table is deleted. By default, the old table is not deleted and the _tablename_del table will exist.
 --panic-flag-file string: When this file is created, gh-ost will exit immediately.
 --password string : MySQL password --port int : MySQL port, preferably from the slave --postpone-cut-over-flag-file string: When this file exists, the cut-over phase of gh-ost will be postponed and data will still be replicated until the file is deleted.
 --quiet: Quiet mode.
 --replica-server-id uint : gh-ost's server_id
 --replication-lag-query string: Deprecated --serve-socket-file string: Absolute path to gh-ost's socket file.
 --serve-tcp-port int: Port to use for gh-ost. Default is closed port.
 --skip-foreign-key-checks: Set to 'true' when you are sure there are no foreign keys on your table, and want to skip gh-ost validation.
 --skip-renamed-columns ALTER: If you change the name of a column (e.g., change column), gh-ost will recognize this and ask for a reason for the column rename. By default, gh-ost will not proceed with the command. This parameter tells gh-ost to skip the migration of the column and treat the renamed column as a don’t care column. This operation is dangerous, you will lose all values ​​in the column.
 --stack: Add error stack trace.
 --switch-to-rbr: Tell gh-ost to automatically switch the binlog_format of the slave repository to ROW format.
 --table string: table name --test-on-replica: Test gh-ost on the slave, including data migration on the slave. After the data migration is complete, stop the slave, and immediately swap the original table with the ghost table and then immediately swap back. Continue to stop the slave so that you can compare the two tables.
 --test-on-replica-skip-replica-stop: When -test-on-replica is executed, this parameter means that the slave does not need to be stopped during the process.
 --throttle-additional-flag-file string: When this file is created, gh-ost operation will stop immediately. This parameter can be used when multiple gh-ost operations are running at the same time to create a file and stop all gh-ost operations, or delete the file and resume all gh-ost operations.
 --throttle-control-replicas string: Lists all slave libraries that need to be checked for master-slave replication delay.
 --throttle-flag-file string: When this file is created, gh-ost operation will stop immediately. This parameter is suitable for controlling a single gh-ost operation. -throttle-additional-flag-file string is suitable for controlling multiple gh-ost operations.
 --throttle-http string
 --throttle-query string: Throttle query. Executes once every second. When the return value = 0, no throttling is required. When the return value > 0, throttling needs to be performed. This query will be run on the migrated server, so please make sure it is lightweight.
 --timestamp-old-table: Use timestamp in old table names. This will make cross-migration of old table names unique and conflict-free --tungsten: Tells gh-ost that you are running a tungsten-replication topology.
 --user string :MYSQL user --verbose
 --version

3. Operation Mode

Mode 1: Connect to the slave database and modify it on the master database

This is the default mode in which gh-ost will look for slaves, find the master of the cluster and connect to it. The specific steps for the modification operation are:

1. Read and write row data on the master database;

2. Read binary log events on the slave database and apply the changes to the master database;

3. Check the table format, fields, primary key, total number of rows, etc. on the slave database;

4. Read gh-ost internal event logs (such as heartbeats) on the slave.

5. Complete table switching on the main database.

If your main database log format is SBR, the tool can also work normally. But the slave must be configured with binary logging enabled (log_bin, log_slave_updates) and binlog_format=ROW (gh-ost is a binary file that reads from the slave).

Example of use:

# gh-ost --initially-drop-old-table --initially-drop-ghost-table --user="ghost" --password="ghost" --host=110.119.120.230 --port=3306 --database="test" --table="t1" --verbose --alter="ADD COLUMN y1 varchar(10),add column y2 int not null default 0 comment 'test' " --assume-rbr --execute

Parameter meaning:

--initially-drop-old-table: Check and drop existing old tables before gh-ost operation.

--initially-drop-ghost-table: Check and drop existing ghost tables before gh-ost operation.

--verbose: Output log of execution process

--assume-rbr: If binlog_format=ROW is set for the database instance to which gh-ost is connected, you can specify -assume-rbr. This will avoid running stop slave and start slave on the slave. The user who executes gh-ost does not need the SUPER privilege.

Mode 2: Modify directly on the main database

If there is no slave library, or you do not want to operate on the slave library, you can use the master library directly. gh-ost will do all operations directly on the master. You can still view the master-slave replication delay above.

1) The master database must generate binary logs in Row format

2) You must use the --allow-on-master option when starting gh-ost to enable this mode.

# gh-ost --initially-drop-old-table --initially-drop-ghost-table --user="ghost" --password="ghost" --host="110.119.120.231" --port=3306 --database="test" --table="t2" --verbose --alter="add column test_field varchar(256) default '';" --exact-rowcount --serve-socket-file=/tmp/gh-ost.t2.sock --panic-flag-file=/tmp/ghost.panic.t2.flag --postpone-cut-over-flag-file=/tmp/ghost.postpone.t2.flag --allow-on-master --execute

Parameter meaning:

--exact-rowcount: Accurately count the number of table rows (using select count(*)) to get a more accurate estimate of the time.

--serve-socket-file: The absolute path to the gh-ost socket file. For example: --serve-socket-file=/tmp/gh-ost.t1.sock creates a socket file for listening and adjusts parameters through the interface. When the load and latency increase during the operation, the operation has to be terminated and parameters such as chunk-size have to be reconfigured. Then the operation command can be re-executed. Dynamic adjustments can be made through the socket interface.

#pause

echo throttle | socat - /tmp/gh-ost.t1.sock

#recover

echo no-throttle | socat - /tmp/gh-ost.t1.sock

Modify the speed limit parameters:

echo chunk-size=1500 | socat - /tmp/gh-ost.t1.sock
echo max-lag-millis=2000 | socat - /tmp/gh-ost.t1.sock

echo max-load=Thread_running=30 | socat - /tmp/gh-ost.t1.sock


--panic-flag-file: This file is created to immediately terminate the running gh-ost. Temporary file cleanup needs to be done manually.

--postpone-cut-over-flag-file: When this file exists, the cut-over phase of gh-ost will be postponed. Data will still be copied, but the table will not be switched until this file is deleted.

--allow-on-master: Allow gh-ost to run directly on the master.

Mode 3: Modify and test on the slave

This mode will make changes on the slave database. All operations are performed on the slave database and will not have any impact on the master database. During the operation, gh-ost will also pause from time to time so that the replica data can be kept up to date.

--test-on-replica indicates that the operation is for testing purposes only. Replication is stopped before the final switchover operation. The original table and temporary table will switch back and forth, and ultimately the original table will remain unchanged. When the master-slave replication is paused, you can check and compare the data in the two tables (if you do not want to stop the slave, you can add the parameter --test-on-replica-skip-replica-stop).

# gh-ost --initially-drop-old-table --initially-drop-ghost-table --user="ghost" --password="ghost" --host=110.119.120.230 --port=3306 --database="test" --table="t3" --verbose --alter="ADD COLUMN abc1 varchar(10),add column abc2 int not null default 0 comment 'test' " --test-on-replica --assume-rbr --execute

RDS limitations:

1. The user does not have the Super privilege, so the --assume-rbr command must be added during use. gh-ost will assume that the binlog itself is in row mode and will not modify it. The binlog on Alibaba Cloud RDS is also in row mode by default, so there is no problem.

2. Other permissions, mainly REPLICATION SLAVE, REPLICATION CLIENT can pull binlog and also obtain it.

3. Unable to connect to the standby database to pull binlog. The standby database is usually transparent to the user, so gh-ost needs to connect directly to the primary database, which may increase the load on the primary database. When using it, you need to add --allow-on-master and --assume-master-host. The officially recommended method is to connect to one of the backup databases, because there will be some high-pressure SELECT operations, and it is best to put them in the backup database.

4. To execute on Alibaba Cloud Database, you need to add a parameter --aliyun-rds. If you are using it now, remember to add the following parameters: --allow-on-master --assume-rbr --assume-master-host --aliyun-rds

4. Comparison between gh-ost and pt-osc

4.1 A brief introduction to pt-osc

pt-osc workflow
1. Check whether the change table has a primary key or unique index, and whether there is a trigger
2. Check and modify the table structure, create a temporary table, and execute the ALTER TABLE statement on the new table
3. Create three triggers on the source table for INSERT, UPDATE, and DELETE operations respectively.
4. Copy data from the source table to the temporary table. During the copying process, updates to the source table will be written to the newly created table.
5. Rename the temporary table and the source table (metadata modification lock is required, and the table needs to be locked for a short time)
6. Delete the source table and trigger to complete the modification of the table structure.

pt-osc tool limitations
1. The source table must have a primary key or unique index. If not, the tool will stop working.
2. If the online replication environment filter operation is too complex, the tool will not work
3. If the replication delay check is turned on, but the master-slave delay occurs, the tool will suspend the data copy work
4. If the main server load check is turned on, but the main server load is high, the tool will suspend operation
5. When the table uses foreign keys, the tool will not execute if the --alter-foreign-keys-method parameter is not used.
6. Only Innodb storage engine tables are supported, and more than 1 times the free space of the table is required on the server.

So what are the specific advantages of gh-ost over pt-osc? The following is a brief introduction to its two most core features.

4.2 Triggerless

Before the emergence of gh-ost, third-party MySQL DDL tools were all implemented using triggers, including Percona's pt-osc, Facebook's OSC, and so on. The mechanism used by gh-ost is completely different from them: it synchronizes data through MySQL binlog. gh-ost itself is registered as a fake slave, which can pull binlog from the master or slave in the cluster, parse it in real time, and reapply all DML operations of the change table to the shadow table. Therefore, for DML operations that occur on the change table during the release, the performance overhead caused by triggers and lock contention can be completely avoided.

In addition, we usually choose the slave node in the cluster as the target publishing machine, and the slave generally does not carry any business. In this way, the binlog parsing overhead will not fall on the master providing the business, but is only an asynchronous DML statement replay.

4.3 Dynamically controllable

Another most important feature is dynamic control, which was not previously available in other third-party open source tools.

When publishing through pt-osc before, the parameters cannot be modified after the command is executed unless it is stopped and restarted. Assume that the release is 90% complete and suddenly the server load increases due to various other reasons. In order not to affect the business, you can only choose to stop the release and wait for performance to recover before restarting.

The tables published through pt-osc are all large tables and take a long time, so it is very embarrassing to encounter such scenarios. Therefore, it is very important if the parameters in the release can be dynamically adjusted. gh-ost also implements a socket server. We can interact with the publishing process in real time through the socket. It can support real-time pause, resume, and dynamic adjustment of many parameters to adapt to external changes.

5. References

1. gh-ost principle

https://www.cnblogs.com/mysql-dba/p/9901589.html

2. Technology sharing | gh-ost online ddl change tool

https://zhuanlan.zhihu.com/p/83770402

3. Practical skills | The evolution of Ctrip’s database publishing system

https://blog.csdn.net/ctrip_tech/article/details/108395676

4.MySQL Online DDL gh-ost Usage Instructions

https://www.cnblogs.com/zhoujinyi/p/9187421.html

5.MySQL--pt-osc tool learning

https://www.cnblogs.com/TeyGao/p/7160421.html

This is the end of this article about the MySQL online DDL tool gh-ost. For more information about MySQL online DDL gh-ost, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • How to quickly add columns in MySQL 8.0
  • Detailed explanation of the use of MySQL Online DDL
  • How to solve the synchronization delay caused by MySQL DDL
  • Detailed explanation of MySQL 8.0 atomic DDL syntax
  • Use of MySQL DDL statements
  • Summary of common Mysql DDL operations
  • Analysis of the new features of MySQL 8.0 - transactional data dictionary and atomic DDL
  • Basic statements of MySQL data definition language DDL
  • MySQL 8.0 DDL atomicity feature and implementation principle
  • Summary of using MySQL online DDL gh-ost
  • Solve the problem of blocking positioning DDL in MySQL 5.7
  • MySQL 8.0 new features: support for atomic DDL statements
  • MySQL exposes Riddle vulnerability that can cause username and password leakage
  • Summary of MySQL 8.0 Online DDL Quick Column Addition

<<:  Vue realizes simple effect of running light

>>:  A brief discussion on CSS height collapse problem

Recommend

MySQL slow query: Enable slow query

1. What is the use of slow query? It can record a...

Some CSS questions you may be asked during an interview

This article is just to commemorate those CSS que...

CSS to achieve dynamic secondary menu

Dynamically implement a simple secondary menu Whe...

Detailed explanation of the command mode in Javascript practice

Table of contents definition structure Examples C...

Standard summary for analyzing the performance of a SQL statement

This article will introduce how to use explain to...

The whole process of upgrading Angular single project to multiple projects

Table of contents Preface Development Environment...

Several ways to solve the 1px border problem on mobile devices (5 methods)

This article introduces 5 ways to solve the 1px b...

In-depth understanding of Worker threads in Node.js

Table of contents Overview The history of CPU-bou...

A brief discussion on which fields in Mysql are suitable for indexing

Table of contents 1 The common rules for creating...

CSS style does not work (the most complete solution summary in history)

When we write pages, we sometimes find that the C...

Similar to HTML tags: strong and em, q, cite, blockquote

There are some tags in XHTML that have similar fu...