MySQL Quick Data Comparison Techniques

MySQL Quick Data Comparison Techniques

In MySQL operation and maintenance, a R&D colleague wants to compare the data on two different instances and find the differences. In addition to the primary key, every field needs to be compared. How to do it?

The first solution is to write a program to extract each row of data from the two instances for comparison. This is theoretically feasible, but the comparison time is long.

The second solution is to merge all the fields in each row of data, take the checksum value, and then compare according to the checksum value. This seems feasible, so give it a try.

First, we need to merge the values ​​of all fields and use the CONCAT function provided by MySQL. If the CONCAT function contains NULL values, the final result will be NULL. Therefore, we need to use the IFNULL function to replace the NULL values, such as:

CONCAT(IFNULL(C1,''),IFNULL(C2,''))

The table you are joining has many rows, and it is tiring to manually compose a script. Don't worry, you can use information_schema.COLUMNS to handle it:

## Get the concatenated string of column names SELECT
GROUP_CONCAT('IFNULL(',COLUMN_NAME,','''')')
FROM information_schema.COLUMNS 
WHERE TABLE_NAME='table_name';

Assume we have test table:

CREATE TABLE t_test01
(
 id INT AUTO_INCREMENT PRIMARY KEY,
 C1 INT,
 C2 INT
)

We can then splice out the following SQL:

SELECT
id,
MD5(CONCAT(
IFNULL(id,''),
IFNULL(c1,''),
IFNULL(c2,''),
)) AS md5_value
FROM t_test01

Execute it on two instances, and then compare the results using Beyond Compare. It is easy to find the different rows and primary key IDs.

For tables with large amounts of data, the result set is also large and difficult to compare. So first try to reduce the result set. You can combine the md5 values ​​of multiple rows to calculate the MD5 value. If the final MD5 values ​​are the same, then these rows are the same. If they are different, it proves that there are differences. Then compare these rows line by line.

Assume that we compare in groups of 1,000 rows. If we need to merge the grouped results, we need to use the GROUP_CONCAT function. Note that we need to add sorting in the GROUP_CONCAT function to ensure the order of the merged data. The SQL is as follows:

SELECT
min(id) as min_id,
max(id) as max_id,
count(1) as row_count,
MD5(GROUP_CONCAT(
MD5(CONCAT(
IFNULL(id,''),
IFNULL(c1,''),
IFNULL(c2,''),
)) ORDER BY id
))AS md5_value
FROM t_test01
GROUP BY (id div 1000)

The execution results are:

min_id max_id row_count md5_value
0 999 1000 7d49def23611f610849ef559677fec0c
1000 1999 1000 95d61931aa5d3b48f1e38b3550daee08
2000 2999 1000 b02612548fae8a4455418365b3ae611a
3000 3999 1000 fe798602ab9dd1c69b36a0da568b6dbb

When there are fewer different data, even if we need to compare tens of millions of data, we can easily locate the 1,000 data with differences based on min_id and max_id, and then compare the MD5 values ​​line by line to finally find the different rows.

Final comparison chart:

PS:

When using GROUP_CONCAT, you need to configure the MySQL variable group_concat_max_len. The default value is 1024, and the excess will be staged.

You may also be interested in:
  • MySQL 5.7.20 common download, installation and configuration methods and simple operation skills (decompression version free installation)
  • How to use Java Web to connect to MySQL database
  • How to use tcpdump to capture packets for mysql
  • Detailed explanation of 30 SQL query optimization techniques for MySQL tens of millions of large data
  • Mysql optimization techniques for querying dates based on time
  • 10 SQL statement optimization techniques to improve MYSQL query efficiency
  • Summary of common problems and application skills in MySQL
  • 5 Tips for Protecting Your MySQL Data Warehouse
  • Share 101 MySQL debugging and optimization tips
  • MySql Sql optimization tips sharing
  • Summary of MySQL injection bypass filtering techniques
  • Summary of common operation skills of MySQL database

<<:  How to implement nginx smooth restart

>>:  JS Decorator Pattern and TypeScript Decorators

Recommend

Analysis of the methods of visual structure layout design for children's websites

1. Warm and gentle Related address: http://www.web...

Getting the creation time of a file under Linux and a practical tutorial

background Sometimes we need to get the creation ...

MySQL uses aggregate functions to query a single table

Aggregate functions Acts on a set of data and ret...

Detailed explanation of Vue filters

<body> <div id="root"> <...

How to maintain MySQL indexes and data tables

Table of contents Find and fix table conflicts Up...

Markup Language - Phrase Elements

Click here to return to the 123WORDPRESS.COM HTML ...

MySQL helps you understand index pushdown in seconds

Table of contents 1. The principle of index push-...

A brief talk about React Router's history

If you want to understand React Router, you shoul...

Solution to garbled display of Linux SecureCRT

Let's take a look at the situation where Secu...

Vue implements image dragging and sorting

This article example shares the specific code of ...

JavaScript to achieve JD.com flash sale effect

This article shares the specific code of JavaScri...

An article to deal with Mysql date and time functions

Table of contents Preface 1. Get the current time...

How to use ElementUI pagination component Pagination in Vue

The use of ElementUI paging component Pagination ...

How to implement digital paging effect code and steps in CSS

A considerable number of websites use digital pagi...