MySQL database deletes duplicate data and only retains one method instance

MySQL database deletes duplicate data and only retains one method instance

1. Problem introduction

Assume a scenario where a user table contains 3 fields. id, identity_id, name. Now there are a lot of duplicate data for ID number identity_id and name name, which need to be deleted and only one valid data is retained.

2. Simulation environment

1. Log in to the MySQL database and create a separate test database mysql_exercise

create database mysql_exercise charset utf8;

2. Create user table users

create table users(
					id int auto_increment primary key,
					identity_id varchar(20),
					name varchar(20) not null
     );

3. Insert test data

insert into users values(0,'620616199409206512','张三'),
						(0,'620616199409206512','张三'),
						(0,'62062619930920651X','Li Si'),
						(0,'62062619930920651X','Li Si'),
						(0,'620622199101206211','王五'),
						(0,'620622199101206211','王五'),
						(0,'322235199909116233','赵六');

You can execute it several times to generate more duplicate data.

4. Solution

(1) Group by ID number and name;

(2) Get the maximum id (or minimum id) after grouping;

(3) Delete all fields except the maximum (or minimum) id;

5. First attempt (failed!!!)

delete from users where id not in (select max(id) from users group by identity_id,name);

Error:

1093 (HY000): You can't specify target table 'users' for update in FROM clause

Because in MYSQL, you cannot select the records of a table first, and then update or delete the records of the same table according to the same conditions.

The solution is to select the result obtained by select again through the intermediate table, so as to avoid the error.

This problem only occurs in MySQL, not in MSSQL and Oracle.

So we can take out the SQL statement in the brackets first and find the maximum (or minimum) ID first.

select max_id from (select max(id) as max_id from users group by identity_id,name);

Then, another error was reported! ! !

ERROR 1248 (42000): Every derived table must have its own alias

This means: the prompt says that each derived table must have its own alias!

When executing a subquery, the outer query will treat the inner query as a table, so we need to add an alias to the inner query

Continue to correct:

Treat the maximum (or minimum id) result found in the query as a new table, give it the alias t, and query t.mix_id.

select t.max_id from (select max(id) as max_id from users group by identity_id,name) as t;

The maximum (or minimum) id can be successfully found, as shown below:

6. Second attempt (success!!!)

delete from users where id not in (
		select t.max_id from 
		(select max(id) as max_id from users group by identity_id,name) as t
		);

Execution Result:

The duplicate data is deleted successfully, and only the last added record is retained. Similarly, you can also keep the first added record (that is, delete all records except the smallest id in each group)

3. Knowledge expansion 1: Update data

Other scenarios: To change the status of users whose name is an empty string ("") in the user table user_info to "0"

update user_info set status='0' where user_id in (select user_id from user_info where name='')

The following error was also reported:

You can't specify target table 'user_info' for update in FROM clause

Because in MYSQL, you cannot select the records of a table first, and then update or delete the records of the same table according to the same conditions. The solution is to select the results obtained through the intermediate table again, thus avoiding errors.
Both of the following are acceptable! ! !

update user_info set status='0' where user_id in 
	 (select user_id from (select user_id from user_info where name = '') t1);

The following example is also acceptable, with a slight difference: the alias can have as or not, and t1.user_id can directly correspond to the inner user_id.

update user_info set status='0' where user_id in 
	(select t1.user_id from (select user_id from user_info where name='') as t1);

3.1 Step-by-step analysis

(1) Use the following query results as the intermediate table:

select user_id from user_info where name='';

(2) Query the intermediate table again as the result set:

select user_id from (select user_id from user_info where name='') as t;

(3) Update data

update user_info set status='0' where user_id in 
	(select user_id from (select user_id from user_info where name='') as t1);

4. Extension Exercise: Deleting Duplicate Data

Write a SQL query to delete all duplicate email addresses in the Person table, and keep only the one with the smallest ID.

+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
| 3 | [email protected] |
+----+------------------+

Id is the primary key of this table.

For example, after running your query, the Person table above should return the following rows:

+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
+----+------------------+

Answer 1:

delete from Person where Id not in (
	select t.min_id from (
		select min(Id) as min_id from Person group by Email
		) as t
	);

Answer 2:

delete p1 from 
	Person as p1, Person as p2 
		where p1.Email=p2.Email and p1.Id > p2.Id;

Summarize

This concludes the article on how to delete duplicate data in MySQL database and keep only one. For more information on deleting duplicate data in MySQL, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MyBatis batch insert/modify/delete MySql data
  • Implementation of mysql using mysqlbinlog command to restore accidentally deleted data
  • How to recover data after accidentally deleting ibdata files in mysql5.7.33
  • How to implement batch deletion of large amounts of data in MySQL large tables
  • Why MySQL does not recommend deleting data
  • Python script to batch delete tens of millions of data in MySQL
  • Mysql delete data and data table method example
  • Why the table file size remains unchanged after deleting data in MySQL
  • Detailed explanation of several practical solutions for quickly deleting large amounts of data (tens of millions) in MySQL
  • Why the disk space is not released after deleting data in MySQL

<<:  Achieve 3D flip effect with pure CSS3 in a few simple steps

>>:  Solution to the docker command exception "permission denied"

Recommend

How to use css overflow: hidden (overflow hiding and clearing floats)

Overflow Hide It means hiding text or image infor...

Introduction to Semantic HTML Tags

In the past few years, DIV+CSS was very popular in...

Detailed explanation of Vue's TodoList case

<template> <div id="root"> ...

Application nesting of HTML ul unordered tables

Application nesting of unordered lists Copy code T...

Zabbix uses PSK shared key to encrypt communication between Server and Agent

Since Zabbix version 3.0, it has supported encryp...

HTML tags list and usage instructions

List of HTML tags mark type Name or meaning effec...

Web front-end development course What are the web front-end development tools

With the development of Internet technology, user...

Introduction to JavaScript array deduplication and flattening functions

Table of contents 1. Array flattening (also known...

Drop-down menu implemented by HTML+CSS3+JS

Achieve results html <div class="containe...

Detailed tutorial on compiling and installing MySQL 5.7.24 on CentOS7

Table of contents Install Dependencies Install bo...

Vue2.x - Example of using anti-shake and throttling

Table of contents utils: Use in vue: explain: Ima...

How to develop Java 8 Spring Boot applications in Docker

In this article, I will show you how to develop a...