Table of contents- I. Overview
- 2. pt-archiver main parameters
- 3. Common errors in installing and deploying mysql_archiver
- IV. Main reference materials
I. Overview There are three main ways to archive historical data in MySQL database: 1. Create and write SP and set Event; 2. Import and export through dump; 3. Archive through pt-archiver tool. The first method is often limited by the same instance requirements and is often abandoned. The second type has relatively better performance, but operation and maintenance is also a headache when there are many archive tables. Therefore, many DBAs often use the third method - pt-archiver. pt-archiver is a component of the Percona-Toolkit toolset, and is a tool mainly used to archive and clear MySQL table data. It can archive the data into another table or a file. pt-archiver does not affect the query performance of OLTP transactions during the process of clearing table data. For archiving of data, it can be archived to another table on another server or to a file. A netizen developed pt-archiver into a small tool - mysql_archiver through Python. It basically realizes the configurable deployment of archiving jobs and is very good to use. "MySQL_archiver basically realizes the automatic operation of data archiving, unified archiving task scheduling management, automatic monitoring and early warning, and automatic report generation. To a certain extent, it saves productivity and improves operation and maintenance efficiency." GitHub address: https://github.com/dbarun/mysql_archiver 2. pt-archiver main parameters parameter | default value | Parameter Explanation | --check-slave-lag | | Specifies to pause archiving operations after the master-slave replication lag exceeds the value specified by option '--max-lag'. By default, the tool checks all slave libraries, but this option only works on the specified slave library (connected via DSN). | --check-interval | 1s | If the option '--check-slave-lag' is specified at the same time, the time specified by this option is the time the tool pauses when it finds a master-slave replication delay. This check is performed every 100 rows. | --[no]check-charset | yes | Specifies to check to ensure that the database character set at the time of the connection is the same as the table character set. | --commit-each | | Specifies to commit as many rows as possible per fetch and archive. This option disables the '--txn-size' option. After each acquisition and archiving of table data, before the next acquisition of data and the sleep time specified by option '--sleep', the transaction is committed and the file specified by option '--file' is refreshed. The size of the transaction is controlled by option '--limit'. | --host, -h | | Specify the database IP address to connect to. | --port, -P | | Specify the database port to connect to. | --user, -u | | Specify the database user for the connection. | --password, -p | | Specify the database user password for the connection. | --socket, -S | | Specifies to use a SOCKET file to connect. | --databases,-d | | Specify the database to connect to | --source | | Specifies the table that needs to be archived. This option is required and is expressed in DSN format. | --dest | | Specify the target table to be archived, using DSN. If this option is not specified, the default is the same source table as specified by option '--source'. | --where | | Specifies the data to be archived through the WHERE conditional statement. This option is a required option. There is no need to add the 'WHERE' keyword. If you do not need a WHERE condition to restrict, specify '--where 1=1'. | --file | | Specify the file to which the table data needs to be archived. Use a format similar to MySQL DATE_FORMAT(). The file contents use the same format as the SELECT INTO OUTFILE statement in MySQL. The file naming options are as follows: ' %Y: Year, numeric, four digits %m: Month, numeric (01..12) %d: Day of the month, numeric (01..31) %H: Hour (00..23) %i: Minutes, numeric (00..59) %s: Seconds (00..59) %D: Database name %t: Table name For example: --file '/var/log/archive/%Y-%m-%d-%D.%t' ' | --output-format | | Specifies the format of the file content output by option '--file'. By default, if this option is not specified, the tab character is used as the field separator. If this option is specified, ',' (comma) is used as the field separator and '"' (double quotes) are used to enclose the fields. Usage example: '--output-format=dump'. | --limit | 1 | Specifies the number of rows to retrieve from tables and archive tables per statement. | --max-lag | 1s | Specifies the maximum allowed delay time for master-slave replication, in seconds. If the master-slave delay exceeds the specified value after each row data is obtained, the archiving operation will be suspended, and the suspension sleep time is the value specified by the option '--check-interval'. After the sleep time is over, check the master-slave delay again. The checking method is to determine the 'Seconds_Behind_Master' value queried from the database. If the master-slave replication delay is always greater than the value specified by this parameter or the slave stops replicating, the operation will wait until the slave is restarted and the delay is less than the value specified by this parameter. | --no-delete | | Specifies not to delete archived table data. | --progress | | Specify how many lines to print progress information, the current time, the elapsed time, and how many lines to archive. | --purge | | Specifies that a purge operation be performed instead of an archive operation. It allows to ignore options '--dest' and '--file'. If it is just a clearing operation, it will be more efficient to combine it with option '--primary-key-only'. | --replace | | When the write option '--dest' is specified to specify the destination table, the INSERT statement is rewritten as a REPLACE statement. | --retries | 1 | Specifies the number of retries to be performed if an archive operation encounters a deadlock or timeout. When the number of retries exceeds the value specified by this option, the tool will exit with an error. | --run-time | | Specifies how long the tool archive operation needs to run before exiting. The allowed time suffixes are s=seconds, m=minutes, h=hours, d=days. If not specified, the default is s. | --[no]safe-auto-increment | yes | Specifies not to archive rows corresponding to the maximum value of the AUTO_INCREMENT column. This option adds an additional WHERE clause when performing archive and purge to prevent the tool from deleting data rows with the maximum value of the AUTO_INCREMENT attribute for a single-column ascending field, so that the corresponding value of AUTO_INCREMENT can still be used after the database is restarted, but this will cause the row corresponding to the maximum value of the field to be unable to be archived or purged. | --sleep | | Specifies the time the tool needs to sleep when obtaining archived data through the SELECT statement. The default value is no sleep. The transaction will not be committed before sleeping, and the file specified by option '--file' will not be flushed. If option '--commit-each' is specified, transaction commit and file flush will be done before sleeping. | --statistics | | Specifies that the tool collect and print timing statistics for operations. | --txn-size | 1 | Specifies the number of rows to process per transaction. If it is 0, transactions are disabled. | --why-quit | | Specifies that the tool prints the reason when it exits due to a failure to complete the number of rows to archive. This option is very convenient to use together with option '--run-time' when executing an automatic archiving task, so that you can determine whether the archiving task is completed within the specified time. If option '--statistics' is also specified, the reasons for all exits are printed. | --skip-foreign-key-checks | | Specify to disable foreign key checks using the statement SET FOREIGN_KEY_CHECKS = 0. |
3. Common errors in installing and deploying mysql_archiver 1. Execute python db_archive_exec.py 127.0.0.1 db123 and an error is reported Traceback (most recent call last): File "/data/mysql_archiver/mysql_archiver-master/db_archive_exec.py", line 7, in <module> import db_conn File "/data/mysql_archiver/mysql_archiver-master/db_conn.py", line 4, in <module> import MySQLdb ImportError: No module named MySQLdb
Solution: Step 1 Basic Environment
rm -rf /etc/yum.repos.d/epel*
yum install postgresql-devel python3-devel postgresql-devel python-devel Step 2: Install setuptools
(1) Download setuptools-39.2.0.zip
(2) Unzip and install python setup.py build
python setup.py install Step 3 Install pip
(1) Download pip-1.5.4.tar.gz
(2) Install python setup.py install Step 4 Install MySQL-python 2. Report an error about missing pt-archiver tool
sh: pt-archiver: command not found Solution: Download percona-toolkit-3.2.1_x86_64.tar.gz, install it, and set up a soft link. For example:
ln -s /data/mysql_archiver/percona-toolkit-3.2.1/bin/pt-archiver /usr/local/bin/pt-archiver 3. Lack of documentation
<type 'exceptions.Exception'>
[Errno 2] No such file or directory: u'/software/python_script/db_archive_XXXXdb_XXXtablename.log' Solution: Create a file: /software/python_script 4. Character set problem Character set mismatch: --source DSN uses utf8, table uses utf8mb4. You can disable this check by specifying --no-check-charset.
Solution: Modify the db_archive_exec.py executable file and add the parameter --no-check-charset for pt-archiver 5. Parameter setting error
Usage: pt-archiver [OPTIONS] --source DSN --where WHERE
Errors in command-line arguments:
* --txn-size and --commit-each are mutually exclusive.
pt-archiver nibbles records from a MySQL table. The --source and --dest
arguments use DSN syntax; if COPY is yes, --dest defaults to the key's value
from --source. For more details, please use the --help option, or try 'perldoc
/usr/local/bin/pt-archiver' for complete documentation. Solution: pt-archiver removes the parameter --commit-each 6.Cannot find encoding "utf8mb4" Error message: Cannot find encoding "utf8mb4" at /usr/local/bin/pt-archiver line 6711.

Solution: pt-archiver removes the parameters --bulk-insert --bulk-delete IV. Main reference materials 1. MySQL data archiving practice - the powerful combination of Python and pt-archiver https://blog.csdn.net/n88lpo/article/details/78905528?utm_medium=distribute.pc_relevant.none-task-blog-baidulandingword-14&spm=1001.2101.3001.4242 2.MySQL Data Archiving Tool https://github.com/dbarun/mysql_archiver#readme 3. Install MySQLdb and basic operations under Linux https://www.cnblogs.com/blogsme/archive/2012/12/12/2814588.html 4.pt tool pt-archiver https://www.cnblogs.com/yhq1314/p/10601801.html 5. Installation steps and detailed usage of pip under Linux https://www.jb51.net/article/118035.htm 6. Summary of Percona-Toolkit's pt-archiver https://www.cnblogs.com/dbabd/p/10721857.html This is the end of this article about the detailed explanation of the MySQL data archiving tool mysql_archiver. For more relevant MySQL data archiving content, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future! You may also be interested in:- Steps to install MySQL on Windows using a compressed archive file
- Solve the problem of Oracle database archive logs occupying disk space
- Database ORA-01196 failure - detailed explanation of archive log loss recovery
- Detailed explanation of SQL Server script for automatic circular archiving of partition data
- Oracle's archive mode Common commands for ORACLE database archive logs
|