A practical record of checking and processing duplicate MySQL records on site

A practical record of checking and processing duplicate MySQL records on site

Preface

I was on a business trip at a customer site to deploy and debug software. One day, the customer suddenly came to me and said that there was an abnormality in the data produced on site. The most direct manifestation was that the same label appeared on multiple materials. They needed my cooperation to see how to troubleshoot the problem.

analyze

The customer put a pile of boxes with duplicate labels in front of me. I was so scared that I was afraid I was in big trouble.

After a little thought, what we need to do now is to query the database for duplicate tags, that is, count a tag and determine the ones with count > 1.

Emmm, grammatical error, I remember there was a "Having", let's try it instead

Wow, there are more than 1,500 duplicate tags. Let's count the total number of questions and group them to see the percentage of duplicate tags.

By the way, first throw these duplicate label data to the customer to trace the product (fortunately Navicat supports copying data)

Data Total

The result of the above query record is a temporary table. Based on this, sum() is used to sum

There are a lot of duplicate records, which is a bit of a big problem.

Repetition rate

Change the query method of the previous query table, that is, group the count data again

From the results, most of the problem data were repeated twice.

The difference between where and having

Where is a constraint statement that constrains the query conditions in the database before the query results are returned. That is, it takes effect before the results are returned, and aggregate functions cannot be used after where.

Having is a filtering statement. The so-called filtering is performed after the results of the database query are returned, that is, it takes effect after the results are returned, and the aggregation function can be used after having.

The so-called aggregate function is a function that calculates a set of values ​​and returns a single value: sum---sum, count---count, max---maximum value, avg---average value, etc.

Summarize

In the query operation of MySQL, we usually use multi-table joint query, use the query results as the intermediate table for multiple queries, group the query results, perform statistical aggregation and other operations

If function calculation or grouping is used in the query operation, the typical feature is to merge multiple records in the original table into one, and having is required to filter the results of these operations. In addition, use where to conditionally filter the table records.

This is the end of this article about the troubleshooting and processing of duplicate MySQL records on site. For more information about the troubleshooting and processing of duplicate MySQL records, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MySQL's method of dealing with duplicate data (preventing and deleting)
  • MySQL study notes on handling duplicate data
  • How to handle concurrent updates of MySQL data
  • Detailed explanation of MySQL execution principle, logical layering, and changing database processing engine
  • Some methods to optimize query speed when MySQL processes massive data
  • MySQL data processing sorting and explaining the operations of adding, deleting and modifying

<<:  CSS HACK for IE6/IE7/IE8/IE9/FF (summary)

>>:  Detailed explanation of CSS3 elastic expansion box

Recommend

Introduction to query commands for MySQL stored procedures

As shown below: select name from mysql.proc where...

js implements the pop-up login box by clicking the pop-up window

This article shares the specific code of js to re...

Design Theory: Text Legibility and Readability

<br />Not long ago, due to business needs, I...

ES6 loop and iterable object examples

This article will examine the ES6 for ... of loop...

How to use docker to deploy front-end applications

Docker is becoming more and more popular. It can ...

How to use the Marquee tag in XHTML code

In the forum, I saw netizen jeanjean20 mentioned h...

Specific use of MySQL window functions

Table of contents 1. What is a window function? 1...

MySQL sharding details

1. Business scenario introduction Suppose there i...

Recommended plugins and usage examples for vue unit testing

Table of contents frame First-class error reporti...

Docker win ping fails container avoidance guide

Using win docker-desktop, I want to connect to co...

Upgrade MySQL 5.1 to 5.5.36 in CentOS

This article records the process of upgrading MyS...