MySQL data archiving tool mysql_archiver detailed explanation

MySQL data archiving tool mysql_archiver detailed explanation

I. Overview

There are three main ways to archive historical data in MySQL database: 1. Create and write SP and set Event; 2. Import and export through dump; 3. Archive through pt-archiver tool. The first method is often limited by the same instance requirements and is often abandoned. The second type has relatively better performance, but operation and maintenance is also a headache when there are many archive tables. Therefore, many DBAs often use the third method - pt-archiver.

pt-archiver is a component of the Percona-Toolkit toolset, and is a tool mainly used to archive and clear MySQL table data. It can archive the data into another table or a file. pt-archiver does not affect the query performance of OLTP transactions during the process of clearing table data. For archiving of data, it can be archived to another table on another server or to a file.

A netizen developed pt-archiver into a small tool - mysql_archiver through Python. It basically realizes the configurable deployment of archiving jobs and is very good to use. "MySQL_archiver basically realizes the automatic operation of data archiving, unified archiving task scheduling management, automatic monitoring and early warning, and automatic report generation. To a certain extent, it saves productivity and improves operation and maintenance efficiency."

GitHub address: https://github.com/dbarun/mysql_archiver

2. pt-archiver main parameters

parameter default value Parameter Explanation
--check-slave-lag Specifies to pause archiving operations after the master-slave replication lag exceeds the value specified by option '--max-lag'. By default, the tool checks all slave libraries, but this option only works on the specified slave library (connected via DSN).
--check-interval 1s If the option '--check-slave-lag' is specified at the same time, the time specified by this option is the time the tool pauses when it finds a master-slave replication delay. This check is performed every 100 rows.
--[no]check-charset yes Specifies to check to ensure that the database character set at the time of the connection is the same as the table character set.
--commit-each Specifies to commit as many rows as possible per fetch and archive. This option disables the '--txn-size' option. After each acquisition and archiving of table data, before the next acquisition of data and the sleep time specified by option '--sleep', the transaction is committed and the file specified by option '--file' is refreshed. The size of the transaction is controlled by option '--limit'.
--host, -h Specify the database IP address to connect to.
--port, -P Specify the database port to connect to.
--user, -u Specify the database user for the connection.
--password, -p Specify the database user password for the connection.
--socket, -S Specifies to use a SOCKET file to connect.
--databases,-d Specify the database to connect to
--source Specifies the table that needs to be archived. This option is required and is expressed in DSN format.
--dest Specify the target table to be archived, using DSN. If this option is not specified, the default is the same source table as specified by option '--source'.
--where Specifies the data to be archived through the WHERE conditional statement. This option is a required option. There is no need to add the 'WHERE' keyword. If you do not need a WHERE condition to restrict, specify '--where 1=1'.
--file Specify the file to which the table data needs to be archived. Use a format similar to MySQL DATE_FORMAT(). The file contents use the same format as the SELECT INTO OUTFILE statement in MySQL. The file naming options are as follows: ' %Y: Year, numeric, four digits %m: Month, numeric (01..12) %d: Day of the month, numeric (01..31) %H: Hour (00..23) %i: Minutes, numeric (00..59) %s: Seconds (00..59) %D: Database name %t: Table name For example: --file '/var/log/archive/%Y-%m-%d-%D.%t' '
--output-format Specifies the format of the file content output by option '--file'. By default, if this option is not specified, the tab character is used as the field separator. If this option is specified, ',' (comma) is used as the field separator and '"' (double quotes) are used to enclose the fields. Usage example: '--output-format=dump'.
--limit 1 Specifies the number of rows to retrieve from tables and archive tables per statement.
--max-lag 1s Specifies the maximum allowed delay time for master-slave replication, in seconds. If the master-slave delay exceeds the specified value after each row data is obtained, the archiving operation will be suspended, and the suspension sleep time is the value specified by the option '--check-interval'. After the sleep time is over, check the master-slave delay again. The checking method is to determine the 'Seconds_Behind_Master' value queried from the database. If the master-slave replication delay is always greater than the value specified by this parameter or the slave stops replicating, the operation will wait until the slave is restarted and the delay is less than the value specified by this parameter.
--no-delete Specifies not to delete archived table data.
--progress Specify how many lines to print progress information, the current time, the elapsed time, and how many lines to archive.
--purge Specifies that a purge operation be performed instead of an archive operation. It allows to ignore options '--dest' and '--file'. If it is just a clearing operation, it will be more efficient to combine it with option '--primary-key-only'.
--replace When the write option '--dest' is specified to specify the destination table, the INSERT statement is rewritten as a REPLACE statement.
--retries 1 Specifies the number of retries to be performed if an archive operation encounters a deadlock or timeout. When the number of retries exceeds the value specified by this option, the tool will exit with an error.
--run-time Specifies how long the tool archive operation needs to run before exiting. The allowed time suffixes are s=seconds, m=minutes, h=hours, d=days. If not specified, the default is s.
--[no]safe-auto-increment yes Specifies not to archive rows corresponding to the maximum value of the AUTO_INCREMENT column. This option adds an additional WHERE clause when performing archive and purge to prevent the tool from deleting data rows with the maximum value of the AUTO_INCREMENT attribute for a single-column ascending field, so that the corresponding value of AUTO_INCREMENT can still be used after the database is restarted, but this will cause the row corresponding to the maximum value of the field to be unable to be archived or purged.
--sleep Specifies the time the tool needs to sleep when obtaining archived data through the SELECT statement. The default value is no sleep. The transaction will not be committed before sleeping, and the file specified by option '--file' will not be flushed. If option '--commit-each' is specified, transaction commit and file flush will be done before sleeping.
--statistics Specifies that the tool collect and print timing statistics for operations.
--txn-size 1 Specifies the number of rows to process per transaction. If it is 0, transactions are disabled.
--why-quit Specifies that the tool prints the reason when it exits due to a failure to complete the number of rows to archive. This option is very convenient to use together with option '--run-time' when executing an automatic archiving task, so that you can determine whether the archiving task is completed within the specified time. If option '--statistics' is also specified, the reasons for all exits are printed.
--skip-foreign-key-checks Specify to disable foreign key checks using the statement SET FOREIGN_KEY_CHECKS = 0.

3. Common errors in installing and deploying mysql_archiver

1. Execute python db_archive_exec.py 127.0.0.1 db123 and an error is reported

Traceback (most recent call last):
File "/data/mysql_archiver/mysql_archiver-master/db_archive_exec.py", line 7, in <module>
import db_conn
File "/data/mysql_archiver/mysql_archiver-master/db_conn.py", line 4, in <module>
import MySQLdb
ImportError: No module named MySQLdb

Solution:

Step 1 Basic Environment

rm -rf /etc/yum.repos.d/epel*
yum install postgresql-devel python3-devel postgresql-devel python-devel

Step 2: Install setuptools

(1) Download setuptools-39.2.0.zip
(2) Unzip and install python setup.py build
    python setup.py install

Step 3 Install pip

(1) Download pip-1.5.4.tar.gz
(2) Install python setup.py install

Step 4 Install MySQL-python

yum install MySQL-python

2. Report an error about missing pt-archiver tool

sh: pt-archiver: command not found

Solution:

Download percona-toolkit-3.2.1_x86_64.tar.gz, install it, and set up a soft link.

For example:

ln -s /data/mysql_archiver/percona-toolkit-3.2.1/bin/pt-archiver /usr/local/bin/pt-archiver

3. Lack of documentation

<type 'exceptions.Exception'>
[Errno 2] No such file or directory: u'/software/python_script/db_archive_XXXXdb_XXXtablename.log'

Solution:

Create a file: /software/python_script

4. Character set problem

Character set mismatch: --source DSN uses utf8, table uses utf8mb4. You can disable this check by specifying --no-check-charset.

Solution:

Modify the db_archive_exec.py executable file and add the parameter --no-check-charset for pt-archiver

5. Parameter setting error

Usage: pt-archiver [OPTIONS] --source DSN --where WHERE

Errors in command-line arguments:
 * --txn-size and --commit-each are mutually exclusive.

pt-archiver nibbles records from a MySQL table. The --source and --dest
arguments use DSN syntax; if COPY is yes, --dest defaults to the key's value
from --source. For more details, please use the --help option, or try 'perldoc
/usr/local/bin/pt-archiver' for complete documentation.

Solution:

pt-archiver removes the parameter --commit-each

6.Cannot find encoding "utf8mb4"

Error message:

Cannot find encoding "utf8mb4" at /usr/local/bin/pt-archiver line 6711.

Solution:

pt-archiver removes the parameters --bulk-insert --bulk-delete

IV. Main reference materials

1. MySQL data archiving practice - the powerful combination of Python and pt-archiver

https://blog.csdn.net/n88lpo/article/details/78905528?utm_medium=distribute.pc_relevant.none-task-blog-baidulandingword-14&spm=1001.2101.3001.4242

2.MySQL Data Archiving Tool

https://github.com/dbarun/mysql_archiver#readme

3. Install MySQLdb and basic operations under Linux

https://www.cnblogs.com/blogsme/archive/2012/12/12/2814588.html

4.pt tool pt-archiver

https://www.cnblogs.com/yhq1314/p/10601801.html

5. Installation steps and detailed usage of pip under Linux

https://www.jb51.net/article/118035.htm

6. Summary of Percona-Toolkit's pt-archiver

https://www.cnblogs.com/dbabd/p/10721857.html

This is the end of this article about the detailed explanation of the MySQL data archiving tool mysql_archiver. For more relevant MySQL data archiving content, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Steps to install MySQL on Windows using a compressed archive file
  • Solve the problem of Oracle database archive logs occupying disk space
  • Database ORA-01196 failure - detailed explanation of archive log loss recovery
  • Detailed explanation of SQL Server script for automatic circular archiving of partition data
  • Oracle's archive mode Common commands for ORACLE database archive logs

<<:  Docker connects to a container through a port

>>:  Markup language - CSS layout

Recommend

Solution to the Docker container cannot be stopped and deleted

Find the running container id docker ps Find the ...

Steps for docker container exit error code

Sometimes some docker containers exit after a per...

Vue sample code for implementing two-column horizontal timeline

Table of contents 1. Implement the component time...

Detailed explanation of pid and socket in MySQL

Table of contents 1. Introduction to pid-file 2.S...

MySQL merge and split by specified characters example tutorial

Preface Merging or splitting by specified charact...

A brief discussion on how to write beautiful conditional expressions in JS

Table of contents Multiple conditional statements...

Linux uses suid vim.basic file to achieve privilege escalation

Reproduce on Kali First set suid permissions for ...

Docker port mapping and external inaccessibility issues

The Docker container provides services and listen...

HTML table markup tutorial (48): CSS modified table

<br />Now let's take a look at how to cl...

v-for directive in vue completes list rendering

Table of contents 1. List traversal 2. The role o...

Docker container exits after running (how to keep running)

Phenomenon Start the Docker container docker run ...

Mysql NULL caused the pit

Using NULL in comparison operators mysql> sele...