MySQL's method of dealing with duplicate data (preventing and deleting)

Some MySQL tables may contain duplicate records. Sometimes duplicates are acceptable, but in other cases they need to be prevented or removed.

In this chapter, we will introduce how to prevent duplicate data from appearing in data tables and how to delete duplicate data in data tables.

Preventing duplicate data from appearing in the table

You can define a PRIMARY KEY or a UNIQUE index on one or more columns of a MySQL table to guarantee that the values in those columns are unique.
Let's try an example. The following table has no index or primary key, so it accepts any number of duplicate records.

CREATE TABLE person_tbl
(
 first_name CHAR(20),
 last_name CHAR(20),
 sex CHAR(10)
);

If you want the combination of first_name and last_name to be unique, you can define a composite primary key on the two columns. Columns that belong to a primary key cannot contain NULL, so they are declared NOT NULL, as shown below:

CREATE TABLE person_tbl
(
 first_name CHAR(20) NOT NULL,
 last_name CHAR(20) NOT NULL,
 sex CHAR(10),
 PRIMARY KEY (last_name, first_name)
);

Once a primary key or unique index is in place, a plain INSERT of a duplicate row fails and MySQL returns an error.
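A minimal sketch of that failure, assuming the person_tbl definition with the composite primary key shown above. The name values are made up for illustration, and the exact wording of the error message varies between MySQL versions, so it is not reproduced here.

```sql
-- The first insert succeeds because no row with this key exists yet.
INSERT INTO person_tbl (last_name, first_name)
VALUES ('Doe', 'John');

-- The second insert uses the same (last_name, first_name) pair.
-- MySQL rejects it with error 1062 (ER_DUP_ENTRY) and the table is unchanged.
INSERT INTO person_tbl (last_name, first_name)
VALUES ('Doe', 'John');
```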

The difference between INSERT IGNORE INTO and INSERT INTO is that INSERT IGNORE skips rows that would duplicate an existing primary key or unique index value. If no matching row exists, the new row is inserted. If a matching row exists, the new row is discarded and the existing row is kept unchanged, so you can fill in missing rows without disturbing the data that is already there.

The following example uses INSERT IGNORE INTO. No error occurs after execution, and no duplicate data is inserted into the data table:

mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
 -> VALUES('Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
 -> VALUES('Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)

With INSERT IGNORE INTO, once uniqueness has been defined on the table, inserting a duplicate row produces a warning instead of an error. REPLACE INTO behaves differently: if a row with the same primary key or unique index value already exists, that row is deleted first and the new row is then inserted.
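The contrast between the two statements can be sketched as follows, assuming the person_tbl definition with the composite primary key and the ('Jay', 'Thomas') row inserted in the example above. The sex value 'M' is made up for illustration.

```sql
-- Duplicate key: the row is skipped and a warning is recorded instead of an error.
INSERT IGNORE INTO person_tbl (last_name, first_name)
VALUES ('Jay', 'Thomas');

-- Inspect the warning raised by the previous statement.
SHOW WARNINGS;

-- Same key again, but REPLACE deletes the existing row and inserts this one.
-- MySQL counts this as 2 rows affected: one delete plus one insert.
REPLACE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- The stored row now carries the new sex value.
SELECT last_name, first_name, sex
FROM person_tbl
WHERE last_name = 'Jay' AND first_name = 'Thomas';
```

Because REPLACE removes the old row entirely, any column not listed in the statement falls back to its default value rather than keeping what was stored before.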

Another way to set uniqueness on your data is to add a UNIQUE index, as shown below:

CREATE TABLE person_tbl
(
 first_name CHAR(20) NOT NULL,
 last_name CHAR(20) NOT NULL,
 sex CHAR(10),
 UNIQUE (last_name, first_name)
);

Counting duplicate data

Below we will count the number of duplicate records of first_name and last_name in the table:

mysql> SELECT COUNT(*) as repetitions, last_name, first_name
 -> FROM person_tbl
 -> GROUP BY last_name, first_name
 -> HAVING repetitions > 1;

The query above returns every last_name and first_name combination that appears more than once in person_tbl, together with the number of times it appears. In general, to query for duplicate values, do the following:

  • Determine which columns may contain repeated values.
  • List those columns in the column selection list, together with COUNT(*).
  • List the same columns in the GROUP BY clause.
  • Add a HAVING clause that keeps only the groups whose count is greater than 1.
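Building on the steps above, one possible follow-up is to count how many surplus rows a cleanup would remove. This is a sketch, not part of the original recipe; the aliases dup and surplus_rows are names chosen for illustration.

```sql
-- For each duplicated name pair, (repetitions - 1) rows are surplus.
-- COALESCE turns the NULL that SUM returns on an empty result into 0.
SELECT COALESCE(SUM(repetitions - 1), 0) AS surplus_rows
FROM (
    SELECT COUNT(*) AS repetitions
    FROM person_tbl
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
) AS dup;
```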

Filtering duplicate data

If you need to read non-duplicate data, you can use the DISTINCT keyword in the SELECT statement to filter out duplicate data.

mysql> SELECT DISTINCT last_name, first_name
 -> FROM person_tbl;

You can also use GROUP BY to read unique data in the table:

mysql> SELECT last_name, first_name
 -> FROM person_tbl
 -> GROUP BY last_name, first_name;

Deduplication

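If you want to delete duplicate data in a table, you can use the following SQL statements, which copy one row per distinct combination into a new table and then put that table in place of the original:

mysql> CREATE TABLE tmp SELECT last_name, first_name, sex FROM person_tbl GROUP BY last_name, first_name, sex;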
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;
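A slightly more cautious variant of the same idea is sketched below. It relies on two facts: CREATE TABLE ... SELECT does not copy indexes, while CREATE TABLE ... LIKE does, and RENAME TABLE can swap several tables in one atomic statement. The name person_tbl_old is a hypothetical name used for the backup copy.

```sql
-- Create an empty copy that keeps the column definitions and indexes.
CREATE TABLE tmp LIKE person_tbl;

-- Copy one row per distinct (last_name, first_name, sex) combination.
-- The explicit column list keeps the mapping correct regardless of column order.
INSERT INTO tmp (last_name, first_name, sex)
SELECT DISTINCT last_name, first_name, sex
FROM person_tbl;

-- Swap the tables in a single atomic step and keep the original as a backup.
RENAME TABLE person_tbl TO person_tbl_old,
             tmp        TO person_tbl;

-- Drop the backup only after the new table has been checked.
-- DROP TABLE person_tbl_old;
```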

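Of course, you can also add an INDEX or a PRIMARY KEY to the table to remove duplicate records. Note that the IGNORE clause of ALTER TABLE used here was deprecated in MySQL 5.6.17 and removed in MySQL 5.7.4, so this statement only works on older versions. Here's how: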

mysql> ALTER IGNORE TABLE person_tbl
 -> ADD PRIMARY KEY (last_name, first_name);
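On MySQL versions where ALTER IGNORE TABLE is no longer available (it was removed in 5.7.4), a similar effect can be approximated with INSERT IGNORE into a keyed copy of the table. This is a sketch under two assumptions: person_tbl has no NULL values in the name columns, and it does not matter which of the duplicate rows survives. The names person_tbl_new and person_tbl_old are hypothetical.

```sql
-- New table with the desired primary key.
CREATE TABLE person_tbl_new
(
  first_name CHAR(20) NOT NULL,
  last_name  CHAR(20) NOT NULL,
  sex        CHAR(10),
  PRIMARY KEY (last_name, first_name)
);

-- Copy the rows; IGNORE silently skips any row whose key is already present.
INSERT IGNORE INTO person_tbl_new (first_name, last_name, sex)
SELECT first_name, last_name, sex
FROM person_tbl;

-- Swap the tables atomically and keep the original until the result is verified.
RENAME TABLE person_tbl     TO person_tbl_old,
             person_tbl_new TO person_tbl;
```

Unlike the GROUP BY approach, this keeps only one row per name pair even when the sex values differ, which matches what the primary key on (last_name, first_name) requires.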
