Detailed example of removing duplicate data in MySQL

Detailed example of removing duplicate data in MySQL

Detailed example of removing duplicate data in MySQL

There are two meanings of duplicate records: one is a completely duplicate record, that is, all fields are repeated; the other is a record with some fields repeated. The first type of duplication is relatively easy to solve. You only need to use the distinct keyword in the query statement to remove duplicates. Almost all database systems support the distinct operation. The main reason for this duplication is poor table design, which can be avoided by adding a primary key or unique index column to the table.

select distinct * from t;

For the second type of duplicate problem, it is usually required to query any one of the duplicate records. Assume that table t has three fields: id, name, and address. id is the primary key. The repeated fields are name and address. It is required to get a unique result set for these two fields.

-- Oracle, MySQL, using correlated subqueries

select * from t t1

 where t1.id =

 (select min(t2.id)

  from t t2

  where t1.name = t2.name and t1.address = t2.address);

-- Hive only supports subqueries in the FROM clause. Subqueries must have names and columns must be unique.

select t1.*

 from t t1,

   (select name, address, min(id) id from t group by name, address) t2

 where t1.id = t2.id;

 

-- You can also use hive's row_number() analysis function

select t.id, t.name, t.address

 from (select id, name, address,

row_number() over (distribute by name, address sort by id) as rn 

     from t) t 

 where t.rn=1;

Thank you for reading, I hope it can help you, thank you for your support of this site!

You may also be interested in:
  • Summary of methods for finding and deleting duplicate data in MySQL tables
  • MySQL query duplicate data (delete duplicate data and keep the one with the smallest id as the only data)
  • MySQL SQL statement to find duplicate data based on one or more fields
  • MySQL study notes on handling duplicate data

<<:  Sample code for implementing login and registration template in Vue

>>:  Detailed steps for quick installation of openshift

Recommend

MySQL Series 4 SQL Syntax

Table of contents Tutorial Series 1. Introduction...

mysql trigger creation and usage examples

Table of contents What is a trigger Create a trig...

Example of how to set up a third-level domain name in nginx

Problem Description By configuring nginx, you can...

How to set MySQL foreign keys for beginners

Table of contents The role of foreign keys mysql ...

How to build a complete samba server in Linux (centos version)

Preface smb is the name of a protocol that can be...

Use CSS's clip-path property to display irregular graphics

clip-path CSS properties use clipping to create t...

Detailed explanation of XML syntax

1. Documentation Rules 1. Case sensitive. 2. The a...

MySQL 8.0.11 installation and configuration method graphic tutorial (win10)

This article records the installation and configu...

Using zabbix to monitor the ogg process (Linux platform)

The ogg process of a database produced some time ...

How to install openjdk in docker and run the jar package

Download image docker pull openjdk Creating a Dat...

Docker5 full-featured harbor warehouse construction process

Harbor is an enterprise-level registry server for...

How to design MySQL statistical data tables

Table of contents Is real-time update required? M...

How to use JS WebSocket to implement simple chat

Table of contents Short Polling Long-Polling WebS...

Detailed steps to install xml extension in php under linux

Installing XML extension in PHP Linux 1. Enter th...