MySQL deduplication methods 【Beginner】There are very few repeated lines Use distinct to find them out, and then delete them manually one by one. [Intermediate] Remove duplicates based on a single field For example: remove duplicates from the id field How to use: Get the value of the repeated field of id, compare the fields with different data in the rows with the same id field, and delete all repeated rows except the row with the smallest (or largest) field. Generally, the primary key is used for comparison because the value of the primary key must be unique and absolutely different. id name 1 a 1 b 2 c 2 a 3 c result: id name 1 a 2 a operate: delete from a_tmp where id in (select * from (select b.id from a_tmp b group by b.id having count(b.id) >1) bb) and name not in (select * from (select min(a.name) from a_tmp a GROUP BY a.id having count(a.id) >1) aa); Notice: The bold and green words above must be aliased and must use the select * from (……) format, otherwise an error will be reported: [Err] 1093 - You can't specify target table 'a_tmp' for update in FROM clause [Advanced] Remove duplicates by repetition of multiple fields For example, to remove duplicates of the same id and name, that is, to count the same id and name as duplicate rows, and to count the same id but different names as non-duplicate rows Usage: Similar to a single field, the primary key is generally used for comparison because the value of the primary key must be unique. id name rowid 1a1 1 a 2 1 b 3 2 b 4 2 b 5 3 c 6 3d7 result: id name rowid 1a1 1 b 3 2 b 4 3 c 6 3d7 operate: The first one: delete from a_tmp where (id,name) in (select * from (select b.id,b.name from a_tmp b group by b.id,b.name having count(b.id) >1) bb) and rowid not in (select * from (select min(a.rowid) from a_tmp a group by a.id,a.name having count(a.id) >1) aa); Second type: Connect the values of the id and name fields and insert them into the temporary table b_tmp. Then you can use the [Intermediate] single field judgment and deletion method. #Insert the value of the two fields connected and the field with the unique value in the a_tmp table into the b_tmp table insert into b_tmp select concat(id,name),rowid from a_tmp; #Find out the rows that need to be kept select id_name,max(rowid) from b_tmp group by id_name having count(id_name)>1; #Use the [Intermediate] method or stored procedure to complete the deduplication work [Ultimate] Each row has two copies of the same data For example: How to use: If the data in the entire row is the same, it is impossible to use SQL statements to delete it, because there is no conditional restriction that can be used to keep one row and delete all the identical rows. There are no different fields that can be created by yourself, that is: add a field, set it to auto-increment, and set it as the primary key, it will automatically add values. id name 1 a 1 a 1 b 1 b 2 c 2 c 3 c 3 c result: id name rowid 1a1 1 b 3 2 c 5 3 c 7 operate: Add an auto-increment field and temporarily set it as the primary key. Use the [Intermediate] and [Advanced] methods above. Thank you for reading, I hope it can help you, thank you for your support of this site! You may also be interested in:
|
<<: CentOS7 deployment Flask (Apache, mod_wsgi, Python36, venv)
>>: Detailed explanation of Docker+Jenkins+Gitlab+Django application deployment practice
[LeetCode] 177.Nth Highest Salary Write a SQL que...
In actual project development, if we have a lot o...
I saw that Taobao’s webpage uses import, while man...
This article shares the specific code for the WeC...
GitHub address, you can star it if you like it Pl...
Data backup and restore part 3, details are as fo...
Table of contents Preface Benefits of axios encap...
Table of contents Overview What is Image Compress...
MySQL installation is relatively simple, usually ...
I encountered a sql problem at work today, about ...
1. nohup Run the program in a way that ignores th...
The difference between run and start in docker Do...
Tag type (display mode) HTML tags are generally d...
Update: Recently, it was discovered that the serv...
Preface A character set is a set of symbols and e...