MySQL database deletes duplicate data and only retains one method instance

MySQL database deletes duplicate data and only retains one method instance

1. Problem introduction

Assume a scenario where a user table contains 3 fields. id, identity_id, name. Now there are a lot of duplicate data for ID number identity_id and name name, which need to be deleted and only one valid data is retained.

2. Simulation environment

1. Log in to the MySQL database and create a separate test database mysql_exercise

create database mysql_exercise charset utf8;

2. Create user table users

create table users(
					id int auto_increment primary key,
					identity_id varchar(20),
					name varchar(20) not null
     );

3. Insert test data

insert into users values(0,'620616199409206512','张三'),
						(0,'620616199409206512','张三'),
						(0,'62062619930920651X','Li Si'),
						(0,'62062619930920651X','Li Si'),
						(0,'620622199101206211','王五'),
						(0,'620622199101206211','王五'),
						(0,'322235199909116233','赵六');

You can execute it several times to generate more duplicate data.

4. Solution

(1) Group by ID number and name;

(2) Get the maximum id (or minimum id) after grouping;

(3) Delete all fields except the maximum (or minimum) id;

5. First attempt (failed!!!)

delete from users where id not in (select max(id) from users group by identity_id,name);

Error:

1093 (HY000): You can't specify target table 'users' for update in FROM clause

Because in MYSQL, you cannot select the records of a table first, and then update or delete the records of the same table according to the same conditions.

The solution is to select the result obtained by select again through the intermediate table, so as to avoid the error.

This problem only occurs in MySQL, not in MSSQL and Oracle.

So we can take out the SQL statement in the brackets first and find the maximum (or minimum) ID first.

select max_id from (select max(id) as max_id from users group by identity_id,name);

Then, another error was reported! ! !

ERROR 1248 (42000): Every derived table must have its own alias

This means: the prompt says that each derived table must have its own alias!

When executing a subquery, the outer query will treat the inner query as a table, so we need to add an alias to the inner query

Continue to correct:

Treat the maximum (or minimum id) result found in the query as a new table, give it the alias t, and query t.mix_id.

select t.max_id from (select max(id) as max_id from users group by identity_id,name) as t;

The maximum (or minimum) id can be successfully found, as shown below:

6. Second attempt (success!!!)

delete from users where id not in (
		select t.max_id from 
		(select max(id) as max_id from users group by identity_id,name) as t
		);

Execution Result:

The duplicate data is deleted successfully, and only the last added record is retained. Similarly, you can also keep the first added record (that is, delete all records except the smallest id in each group)

3. Knowledge expansion 1: Update data

Other scenarios: To change the status of users whose name is an empty string ("") in the user table user_info to "0"

update user_info set status='0' where user_id in (select user_id from user_info where name='')

The following error was also reported:

You can't specify target table 'user_info' for update in FROM clause

Because in MYSQL, you cannot select the records of a table first, and then update or delete the records of the same table according to the same conditions. The solution is to select the results obtained through the intermediate table again, thus avoiding errors.
Both of the following are acceptable! ! !

update user_info set status='0' where user_id in 
	 (select user_id from (select user_id from user_info where name = '') t1);

The following example is also acceptable, with a slight difference: the alias can have as or not, and t1.user_id can directly correspond to the inner user_id.

update user_info set status='0' where user_id in 
	(select t1.user_id from (select user_id from user_info where name='') as t1);

3.1 Step-by-step analysis

(1) Use the following query results as the intermediate table:

select user_id from user_info where name='';

(2) Query the intermediate table again as the result set:

select user_id from (select user_id from user_info where name='') as t;

(3) Update data

update user_info set status='0' where user_id in 
	(select user_id from (select user_id from user_info where name='') as t1);

4. Extension Exercise: Deleting Duplicate Data

Write a SQL query to delete all duplicate email addresses in the Person table, and keep only the one with the smallest ID.

+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
| 3 | [email protected] |
+----+------------------+

Id is the primary key of this table.

For example, after running your query, the Person table above should return the following rows:

+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
+----+------------------+

Answer 1:

delete from Person where Id not in (
	select t.min_id from (
		select min(Id) as min_id from Person group by Email
		) as t
	);

Answer 2:

delete p1 from 
	Person as p1, Person as p2 
		where p1.Email=p2.Email and p1.Id > p2.Id;

Summarize

This concludes the article on how to delete duplicate data in MySQL database and keep only one. For more information on deleting duplicate data in MySQL, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MyBatis batch insert/modify/delete MySql data
  • Implementation of mysql using mysqlbinlog command to restore accidentally deleted data
  • How to recover data after accidentally deleting ibdata files in mysql5.7.33
  • How to implement batch deletion of large amounts of data in MySQL large tables
  • Why MySQL does not recommend deleting data
  • Python script to batch delete tens of millions of data in MySQL
  • Mysql delete data and data table method example
  • Why the table file size remains unchanged after deleting data in MySQL
  • Detailed explanation of several practical solutions for quickly deleting large amounts of data (tens of millions) in MySQL
  • Why the disk space is not released after deleting data in MySQL

<<:  Achieve 3D flip effect with pure CSS3 in a few simple steps

>>:  Solution to the docker command exception "permission denied"

Recommend

Summarize how to optimize Nginx performance under high concurrency

Table of contents Features Advantages Installatio...

W3C Tutorial (16): Other W3C Activities

This section provides an overview of some other i...

Summary of events that browsers can register

Html event list General Events: onClick HTML: Mous...

Detailed tutorial on installing Docker on Windows

Since my local MySQL version is relatively low, I...

Install CentOS 7 on VMware14 Graphic Tutorial

Introduction to CentOS CentOS is an enterprise-cl...

Several CSS3 tag shorthands (recommended)

border-radius: CSS3 rounded corners Syntax: borde...

A brief discussion on MySQL count of rows

We are all familiar with the MySQL count() functi...

How to install pyenv under Linux

Prerequisites Need to install git Installation St...

Detailed explanation of screen command usage in Linux

GUN Screen: Official website: http://www.gnu.org/...

Detailed explanation of Vue filter implementation and application scenarios

1. Brief Introduction Vue.js allows you to define...

How to use Vue3 mixin

Table of contents 1. How to use mixin? 2. Notes o...

Native JS implementation of loading progress bar

This article shares a dynamic loading progress ba...