How to query duplicate data in a MySQL table

INSERT INTO hk_test(username, passwd) VALUES
('qmf1', 'qmf1'),('qmf2', 'qmf11')
 
delete from hk_test where username='qmf1' and passwd='qmf1'

Query duplicate data records in the table in MySQL:

First view the repeated raw data:

Scenario 1: List the username values that occur more than once

select username,count(*) as count from hk_test group by username having count>1;
 
SELECT username,count(username) as count FROM hk_test GROUP BY username HAVING count(username) >1 ORDER BY count DESC;

This method returns only each duplicated value and the number of times it occurs, not the rows themselves.
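As a minimal sketch of Scenario 1, the following builds a small table and runs the count query against it. The table name hk_test_demo, the id column, the column types, and the sample rows are assumptions for illustration; the article does not show the real definition of hk_test.

```sql
-- Hypothetical stand-in for hk_test; the real definition is not shown.
CREATE TABLE hk_test_demo (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(32) NOT NULL,
    passwd   VARCHAR(32) NOT NULL
);

-- Illustrative rows: 'qmf1' appears three times, 'qmf2' twice, 'qmf3' once.
INSERT INTO hk_test_demo (username, passwd) VALUES
    ('qmf1', 'qmf1'), ('qmf1', 'qmf1'), ('qmf1', 'other'),
    ('qmf2', 'qmf11'), ('qmf2', 'qmf11'),
    ('qmf3', 'qmf3');

-- One row per duplicated username with its number of occurrences.
SELECT username, COUNT(*) AS cnt
FROM hk_test_demo
GROUP BY username
HAVING COUNT(*) > 1
ORDER BY cnt DESC;
```

With these rows the query returns qmf1 with a count of 3 and qmf2 with a count of 2; qmf3 is filtered out by the HAVING clause.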

Scenario 2: List the full rows whose username is duplicated:

select * from hk_test where username in (select username from hk_test group by username having count(username) > 1)
 
SELECT username,passwd FROM hk_test WHERE username in ( SELECT username FROM hk_test GROUP BY username HAVING count(username)>1)
 
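However, this statement can be very slow in MySQL. It appears that MySQL does not materialize the subquery into a temporary table but re-evaluates it for each outer row, so the query takes a long time on large tables. Newer versions, from MySQL 5.6 on, optimize IN subqueries considerably better.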

Solution:

First create a temporary table that holds the duplicated names:

create table `tmptable` as (
SELECT `name`
FROM `table`
GROUP BY `name` HAVING count(`name`) > 1
);

Then join it against the original table:

SELECT a.`id`, a.`name`
FROM `table` a, `tmptable` t
WHERE a.`name` = t.`name`;

This time the results came back very quickly.

Use distinct to remove duplicates from the result:

SELECT distinct a.`id`, a.`name`
FROM `table` a, `tmptable` t
WHERE a.`name` = t.`name`;

Scenario 3: List the rows that are duplicated across several fields, for example rows that share both username and passwd:

select * from hk_test a
where (a.username,a.passwd) in (select username,passwd from hk_test group by username,passwd having count(*) > 1)

Scenario 4: Count the combinations of several fields that are repeated together:

select username,passwd,count(*) from hk_test group by username,passwd having count(*) > 1 
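The count query in Scenario 4 shows which combinations are repeated but not which rows hold them. As a hedged sketch, GROUP_CONCAT can list the ids of the rows in each duplicated group. The table hk_pairs_demo, its id column, and the sample rows are assumptions for illustration.

```sql
-- Hypothetical table; `id` is an assumed AUTO_INCREMENT primary key.
CREATE TABLE hk_pairs_demo (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(32) NOT NULL,
    passwd   VARCHAR(32) NOT NULL
);

-- Only the pair ('qmf1', 'qmf1') occurs more than once.
INSERT INTO hk_pairs_demo (username, passwd) VALUES
    ('qmf1', 'qmf1'), ('qmf1', 'qmf1'), ('qmf1', 'other'), ('qmf2', 'qmf11');

-- One row per duplicated (username, passwd) pair, with the ids that share it.
SELECT username, passwd,
       COUNT(*) AS cnt,
       GROUP_CONCAT(id ORDER BY id) AS ids
FROM hk_pairs_demo
GROUP BY username, passwd
HAVING COUNT(*) > 1;
```

With these rows the query returns a single row for ('qmf1', 'qmf1') with a count of 2 and the ids of its two rows.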

How to query and delete duplicate records in a MySQL table

(one)
1. Find redundant duplicate records in the table. Duplicates are determined by a single field (peopleId):
select *
from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId)>1)
 
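2. Delete the redundant duplicate records in the table, keeping only the row with the smallest id in each group. Duplicates are determined by a single field (peopleId). Each subquery is wrapped in a derived table because MySQL does not let a DELETE select directly from its own target table:
delete from people
where peopleId in (select peopleId from
(select peopleId from people group by peopleId having count(peopleId)>1) as dup)
and id not in (select min_id from
(select min(id) as min_id from people group by peopleId having count(peopleId)>1) as kept)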
 
3. Find redundant duplicate records in the table (multiple fields)
select * from vitae a
where (a.peopleId,a.seq) in 
(select peopleId,seq from vitae group by peopleId,seq having count(*)>1)
 
4. Delete redundant duplicate records (multiple fields) in the table, keeping only the record with the smallest rowid. Note that rowid is an Oracle pseudocolumn and this statement is written in Oracle style; on MySQL, substitute the table's primary key (for example id):
delete from vitae a
where (a.peopleId,a.seq) in
(select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in
(select min(rowid) from vitae group by peopleId,seq having count(*)>1)
 
5. Find redundant duplicate records (multiple fields) in the table, excluding the record with the smallest rowid:
select * from vitae a
where (a.peopleId,a.seq) in
(select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in
(select min(rowid) from vitae group by peopleId,seq having count(*)>1)
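On MySQL, one alternative to the rowid-based statements above is a multi-table DELETE with a self-join. The sketch below is not the article's method; the table vitae_demo, its id column (an assumed AUTO_INCREMENT primary key standing in for rowid), and the sample rows are hypothetical.

```sql
-- Hypothetical stand-in for the vitae table.
CREATE TABLE vitae_demo (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    peopleId INT NOT NULL,
    seq      INT NOT NULL
);

INSERT INTO vitae_demo (peopleId, seq) VALUES
    (1, 1), (1, 1), (1, 1),   -- three copies of (1, 1)
    (2, 1), (2, 1),           -- two copies of (2, 1)
    (3, 1);                   -- unique

-- Delete every row for which an older row (smaller id) with the same
-- (peopleId, seq) exists; the smallest id in each group survives.
DELETE newer
FROM vitae_demo AS newer
JOIN vitae_demo AS older
  ON  newer.peopleId = older.peopleId
  AND newer.seq      = older.seq
  AND newer.id       > older.id;

-- Check the result: one row per (peopleId, seq) pair should remain.
SELECT id, peopleId, seq FROM vitae_demo ORDER BY id;
```

Because the join names the table twice under different aliases rather than in a subquery, this form avoids the restriction on selecting from the DELETE target. Run it on a copy of the data first.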
 
(two)
For example, table A has a field "name", and different records may share the same "name" value. To find the "name" values that occur in more than one record:
Select Name,Count(*) From A Group By Name Having Count(*) > 1
To require that the sex also matches, group by both columns:
Select Name,sex,Count(*) From A Group By Name,sex Having Count(*) > 1
 
(three)
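Method 1: Delete the extra rows with a cursor. This script is SQL Server T-SQL (@@fetch_status, set rowcount) and does not run on MySQL; "primary field" and "table name" are placeholders:
declare @max integer, @id integer
declare cur_rows cursor local for select primary field, count(*) from table name group by primary field having count(*) > 1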
open cur_rows
fetch cur_rows into @id,@max
while @@fetch_status=0
begin
select @max = @max -1
set rowcount @max
delete from table name where primary field = @id
fetch cur_rows into @id,@max
end
close cur_rows
set rowcount 0 

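In MySQL, rewriting the IN subquery as a join against a derived table can be much faster. The timings in the comments are from the author's own test:

SELECT * from tab1 where CompanyName in( SELECT companyname from tab1 GROUP BY CompanyName HAVING COUNT(*)>1);
-- 129.433ms

SELECT * from tab1 INNER join ( SELECT companyname from tab1 GROUP BY CompanyName HAVING COUNT(*)>1) as tab2 USING(CompanyName);
-- 0.482ms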
 
Method 2: Duplicate records come in two kinds. The first is fully duplicated records, where every field is repeated. The second is records in which only some key fields are repeated, for example the Name field, while the other fields may or may not match and can be ignored.
 
  1. The first kind is easier to handle. The following returns a result set without duplicate records:

select distinct * from tableName

  To delete the duplicates from the table itself (keeping one copy of each record), proceed as follows. The select ... into #Tmp form is SQL Server syntax:

select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp
 
  This duplication occurs due to poor table design and can be resolved by adding a unique index column.
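As a hedged illustration of that fix in MySQL, a unique index on the columns that must not repeat makes the server reject new duplicates. The table contacts_demo, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table with no uniqueness constraint yet.
CREATE TABLE contacts_demo (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    name    VARCHAR(50)  NOT NULL,
    address VARCHAR(100) NOT NULL
);

-- Once existing duplicates have been removed, forbid new ones.
-- This statement fails if duplicate (name, address) pairs are still present.
ALTER TABLE contacts_demo
    ADD UNIQUE INDEX uq_contacts_name_address (name, address);

INSERT INTO contacts_demo (name, address) VALUES ('Ann', '1 Main St');

-- A plain INSERT of the same pair would now fail with a duplicate-key error.
-- INSERT IGNORE skips the row and raises a warning instead.
INSERT IGNORE INTO contacts_demo (name, address) VALUES ('Ann', '1 Main St');
```

The index has to be added after the cleanup, because MySQL cannot build a unique index over data that already contains duplicates.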
 
2. The second kind usually requires keeping the first record of each group of duplicates. Assume the duplicated fields are Name and Address and you need a result set that is unique on these two fields:

select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,Address
select * from #Tmp where autoID in(select autoID from #tmp2)
 
The last select returns a result set with unique Name and Address (but with an additional autoID field, which can be omitted in the select clause when writing).
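The statements above use SQL Server temporary tables. On MySQL 8.0 or later, a similar result can be sketched with the ROW_NUMBER() window function. The table people_demo, its id column (playing the role of autoID), and the sample rows are assumptions for illustration.

```sql
-- Hypothetical table; `id` is an assumed AUTO_INCREMENT primary key.
CREATE TABLE people_demo (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    Name    VARCHAR(50)  NOT NULL,
    Address VARCHAR(100) NOT NULL,
    Phone   VARCHAR(20)
);

INSERT INTO people_demo (Name, Address, Phone) VALUES
    ('Ann', '1 Main St', '555-0100'),
    ('Ann', '1 Main St', '555-0199'),  -- same Name and Address as the first row
    ('Bob', '2 Oak Ave', '555-0111');

-- Requires MySQL 8.0 or later (window functions).
-- Number the rows within each (Name, Address) group by ascending id,
-- then keep only the first row of every group.
SELECT id, Name, Address, Phone
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (PARTITION BY Name, Address ORDER BY id) AS rn
    FROM people_demo AS p
) AS ranked
WHERE rn = 1
ORDER BY id;
```

With these rows the query returns the first Ann row and the Bob row; the second Ann row is dropped because it shares Name and Address with an earlier one.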
 
(IV) Query repeated rows:
select * from tablename where id in (
select id from tablename group by id having count(id) > 1)

Commonly used statements

1. Find redundant duplicate records in the table. Duplicates are determined by a single field (mail_id):
SELECT * FROM table WHERE mail_id IN (SELECT mail_id FROM table GROUP BY mail_id HAVING COUNT(mail_id) > 1);
 
 
2. Delete redundant duplicate records in the table. Duplicates are determined by a single field (mail_id), and only the record with the smallest rowid is kept:
DELETE FROM table WHERE mail_id IN (SELECT mail_id FROM table GROUP BY mail_id HAVING COUNT(mail_id) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table GROUP BY mail_id HAVING COUNT(mail_id)>1);
 
 
3. Find redundant duplicate records in the table (multiple fields):
SELECT * FROM table WHERE (mail_id,phone) IN (SELECT mail_id,phone FROM table GROUP BY mail_id,phone HAVING COUNT(*) > 1);
 
 
4. Delete redundant duplicate records (multiple fields) in the table, keeping only the record with the smallest rowid:
DELETE FROM table WHERE (mail_id,phone) IN (SELECT mail_id,phone FROM table GROUP BY mail_id,phone HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table GROUP BY mail_id,phone HAVING COUNT(*)>1);
 
 
5. Find redundant duplicate records (multiple fields) in the table, excluding the record with the smallest rowid:
SELECT * FROM table WHERE (mail_id,phone) IN (SELECT mail_id,phone FROM table GROUP BY mail_id,phone HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table GROUP BY mail_id,phone HAVING COUNT(*)>1);
 
 
 
 
 
(I) Single field

1. Find redundant duplicate records in the table, judged by the question_title field:
select * from questions where question_title in (select question_title from questions group by question_title having count(question_title) > 1)
 
 
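2. Delete the redundant duplicate records in the table, judged by the question_title field, keeping only the row with the smallest question_id. Each subquery is wrapped in a derived table because MySQL does not let a DELETE select directly from its own target table:
delete from questions
where question_title in (select question_title from
(select question_title from questions group by question_title having count(question_title) > 1) as dup)
and question_id not in (select min_id from
(select min(question_id) as min_id from questions group by question_title having count(question_title) > 1) as kept)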
 
(II) Multiple fields

Delete redundant duplicate records (multiple fields) in the table, keeping only the record with the smallest question_id:
DELETE FROM questions WHERE (questions_title,questions_scope) IN (SELECT questions_title,questions_scope FROM questions GROUP BY questions_title,questions_scope HAVING COUNT(*) > 1) AND question_id NOT IN (SELECT MIN(question_id) FROM questions GROUP BY questions_scope,questions_title HAVING COUNT(*)>1)
 
 
MySQL rejects the statement above with error 1093, because a DELETE cannot select from its own target table in a subquery. Store the ids to delete in a temporary table first, then delete by id:
 
CREATE TABLE tmp AS SELECT question_id FROM questions WHERE (questions_title,questions_scope) IN (SELECT questions_title,questions_scope FROM questions GROUP BY questions_title,questions_scope HAVING COUNT(*) > 1) AND question_id NOT IN (SELECT MIN(question_id) FROM questions GROUP BY questions_scope,questions_title HAVING COUNT(*)>1);
 
DELETE FROM questions WHERE question_id IN (SELECT question_id FROM tmp);
 
DROP TABLE tmp;

Find duplicate records in a MySQL table
As a MySQL database grows, duplicate data inevitably creeps in. While maintaining the data, it is worth deleting the redundant rows and keeping only the useful ones.

The following SQL statement can find all duplicate records in a table.
select user_name,count(*) as count from user_table group by user_name having count>1;

Parameter Description:

user_name is the field to check for duplicates.

count is the number of rows that share each user_name value; a count greater than one means the value is duplicated.

user_table is the name of the table to search.

group by groups the rows by user_name.

having filters the groups, keeping only those whose count is greater than one.

Replace these names with the field and table of your own database. Run the query first in phpMyAdmin or Navicat to see which values are duplicated, then delete the duplicates. You can also embed the SQL statement in an admin page so that it lists the duplicated data, and delete the duplicates from there.


Disadvantage: this query becomes slow when the table holds a large amount of data. In my test with Navicat the table was small, so the query ran quickly. There are other SQL statements for finding duplicate data; compare them and choose the one that suits your site.
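One common mitigation, not covered in the original, is an index on the grouped column, which often lets MySQL group by reading the index instead of sorting the whole table. The sketch below is an assumption-laden illustration: user_table_demo and idx_user_name are hypothetical names, and whether the index is actually used depends on the data and the server version.

```sql
-- Hypothetical table; only user_name matters for the query.
CREATE TABLE user_table_demo (
    id        INT AUTO_INCREMENT PRIMARY KEY,
    user_name VARCHAR(50) NOT NULL
);

-- Secondary index on the column used in GROUP BY.
CREATE INDEX idx_user_name ON user_table_demo (user_name);

-- EXPLAIN reports whether the optimizer picks idx_user_name for the grouping.
EXPLAIN
SELECT user_name, COUNT(*) AS cnt
FROM user_table_demo
GROUP BY user_name
HAVING COUNT(*) > 1;
```

Compare the EXPLAIN output before and after creating the index on a realistic data volume before relying on it.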
