In MySQL operation and maintenance, a R&D colleague wants to compare the data on two different instances and find the differences. In addition to the primary key, every field needs to be compared. How to do it? The first solution is to write a program to extract each row of data from the two instances for comparison. This is theoretically feasible, but the comparison time is long. The second solution is to merge all the fields in each row of data, take the checksum value, and then compare according to the checksum value. This seems feasible, so give it a try. First, we need to merge the values of all fields and use the CONCAT function provided by MySQL. If the CONCAT function contains NULL values, the final result will be NULL. Therefore, we need to use the IFNULL function to replace the NULL values, such as: CONCAT(IFNULL(C1,''),IFNULL(C2,'')) The table you are joining has many rows, and it is tiring to manually compose a script. Don't worry, you can use information_schema.COLUMNS to handle it: ## Get the concatenated string of column names SELECT GROUP_CONCAT('IFNULL(',COLUMN_NAME,','''')') FROM information_schema.COLUMNS WHERE TABLE_NAME='table_name'; Assume we have test table: CREATE TABLE t_test01 ( id INT AUTO_INCREMENT PRIMARY KEY, C1 INT, C2 INT ) We can then splice out the following SQL: SELECT id, MD5(CONCAT( IFNULL(id,''), IFNULL(c1,''), IFNULL(c2,''), )) AS md5_value FROM t_test01 Execute it on two instances, and then compare the results using Beyond Compare. It is easy to find the different rows and primary key IDs. For tables with large amounts of data, the result set is also large and difficult to compare. So first try to reduce the result set. You can combine the md5 values of multiple rows to calculate the MD5 value. If the final MD5 values are the same, then these rows are the same. If they are different, it proves that there are differences. Then compare these rows line by line. Assume that we compare in groups of 1,000 rows. If we need to merge the grouped results, we need to use the GROUP_CONCAT function. Note that we need to add sorting in the GROUP_CONCAT function to ensure the order of the merged data. The SQL is as follows: SELECT min(id) as min_id, max(id) as max_id, count(1) as row_count, MD5(GROUP_CONCAT( MD5(CONCAT( IFNULL(id,''), IFNULL(c1,''), IFNULL(c2,''), )) ORDER BY id ))AS md5_value FROM t_test01 GROUP BY (id div 1000) The execution results are: min_id max_id row_count md5_value 0 999 1000 7d49def23611f610849ef559677fec0c 1000 1999 1000 95d61931aa5d3b48f1e38b3550daee08 2000 2999 1000 b02612548fae8a4455418365b3ae611a 3000 3999 1000 fe798602ab9dd1c69b36a0da568b6dbb When there are fewer different data, even if we need to compare tens of millions of data, we can easily locate the 1,000 data with differences based on min_id and max_id, and then compare the MD5 values line by line to finally find the different rows. Final comparison chart: PS: When using GROUP_CONCAT, you need to configure the MySQL variable group_concat_max_len. The default value is 1024, and the excess will be staged. You may also be interested in:
|
<<: How to implement nginx smooth restart
>>: JS Decorator Pattern and TypeScript Decorators
1. Warm and gentle Related address: http://www.web...
When insert into employee values(null,'張三'...
background Sometimes we need to get the creation ...
Aggregate functions Acts on a set of data and ret...
<body> <div id="root"> <...
Table of contents Find and fix table conflicts Up...
Click here to return to the 123WORDPRESS.COM HTML ...
Table of contents 1. The principle of index push-...
If you want to understand React Router, you shoul...
Let's take a look at the situation where Secu...
This article example shares the specific code of ...
This article shares the specific code of JavaScri...
Table of contents Preface 1. Get the current time...
The use of ElementUI paging component Pagination ...
A considerable number of websites use digital pagi...