MySQL study notes on handling duplicate data

Handling duplicate data in MySQL

Some MySQL tables may contain duplicate records. Sometimes duplicates are acceptable, but at other times they need to be removed.

This chapter covers how to prevent duplicate data from appearing in a table and how to delete duplicates that already exist.

Preventing duplicate data from appearing in the table

You can set a specified field in a MySQL table as a PRIMARY KEY or UNIQUE index to ensure the uniqueness of the data.

Let's try an example: There is no index or primary key in the following table, so multiple duplicate records are allowed in the table.

CREATE TABLE person_tbl
(
  first_name CHAR(20),
  last_name CHAR(20),
  sex CHAR(10)
);

To prevent the combination of first_name and last_name from repeating, you can define a composite primary key on the two fields. Columns that are part of a primary key cannot hold NULL, so they are declared NOT NULL, as shown below:

CREATE TABLE person_tbl
(
  first_name CHAR(20) NOT NULL,
  last_name CHAR(20) NOT NULL,
  sex CHAR(10),
  PRIMARY KEY (last_name, first_name)
);

If we set a unique index, the SQL statement will not be executed successfully when inserting duplicate data and an error will be thrown.
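
As a minimal sketch of that failure, assuming the person_tbl definition with the composite primary key shown above and an initially empty table, a plain INSERT of the same name pair twice behaves roughly as follows (the sex values are illustrative and not part of the original example):

```sql
-- First insert succeeds: no row with this (last_name, first_name) pair exists yet.
INSERT INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Second insert repeats the same key, so MySQL rejects it with
-- error 1062 (duplicate entry) and the table still holds a single row.
INSERT INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');
```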

INSERT IGNORE INTO differs from INSERT INTO in that it skips rows that would duplicate an existing primary key or unique index value. If no matching row exists, the new row is inserted; if one exists, the new row is skipped and the existing data is kept. This makes it possible to insert only the rows that are missing.

The following example uses INSERT IGNORE INTO. No error occurs after execution, and no duplicate data is inserted into the data table:

mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
  -> VALUES('Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
  -> VALUES('Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)

With INSERT IGNORE INTO, once uniqueness has been set on the records, inserting duplicate data returns only a warning instead of an error. REPLACE INTO behaves differently: if a record with the same primary key or unique key already exists, it deletes that record first and then inserts the new one.
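
The contrast between the two statements can be sketched as follows. This assumes person_tbl has the composite primary key shown earlier and already contains the ('Jay', 'Thomas') row; the 'M' value is illustrative.

```sql
-- Duplicate key: the row is skipped and a warning is recorded instead of an error.
INSERT IGNORE INTO person_tbl (last_name, first_name)
VALUES ('Jay', 'Thomas');

-- Run immediately afterwards to inspect the duplicate-key warning.
SHOW WARNINGS;

-- REPLACE deletes the existing ('Jay', 'Thomas') row and inserts this one,
-- so the sex column ends up holding the new value.
REPLACE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Check the result of the replacement.
SELECT last_name, first_name, sex
FROM person_tbl
WHERE last_name = 'Jay' AND first_name = 'Thomas';
```

When a row is replaced this way, MySQL counts both the delete and the insert in the affected-rows total. Because the old row is deleted, any column not listed in the REPLACE statement falls back to its default value rather than keeping its previous content.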

Another way to set uniqueness on your data is to add a UNIQUE index, as shown below:

CREATE TABLE person_tbl
(
  first_name CHAR(20) NOT NULL,
  last_name CHAR(20) NOT NULL,
  sex CHAR(10),
  UNIQUE (last_name, first_name)
);
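
For a table that was created without the constraint, the same uniqueness rule can be added afterwards. The following is a sketch only; uniq_person_name is a hypothetical index name chosen for illustration.

```sql
-- Add a UNIQUE index on the name pair to an existing table.
-- uniq_person_name is a hypothetical name, not part of the original example.
ALTER TABLE person_tbl
  ADD UNIQUE INDEX uniq_person_name (last_name, first_name);

-- List the indexes on the table to confirm the new one exists.
SHOW INDEX FROM person_tbl;
```

The ALTER TABLE statement fails with a duplicate-entry error if the table already contains repeated name pairs, so existing duplicates have to be removed first.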

Counting duplicate data

Below we will count the number of duplicate records of first_name and last_name in the table:

mysql> SELECT COUNT(*) as repetitions, last_name, first_name
  -> FROM person_tbl
  -> GROUP BY last_name, first_name
  -> HAVING repetitions > 1;

The above query returns each last_name and first_name combination that occurs more than once in person_tbl, together with its number of occurrences. In general, to query for duplicate values, do the following:

  • Determine which columns contain values that may be repeated.
  • List those columns in the column selection list, together with COUNT(*).
  • List the same columns in the GROUP BY clause.
  • Add a HAVING clause that keeps only groups whose count is greater than 1.
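
Building on the same pattern, the following sketch estimates how many surplus rows a cleanup would remove, that is, every occurrence beyond the first in each duplicate group. The aliases dup and surplus_rows are illustrative names.

```sql
-- For each duplicated name pair, (repetitions - 1) rows are surplus.
-- COALESCE turns the NULL produced by an empty result into 0.
SELECT COALESCE(SUM(repetitions - 1), 0) AS surplus_rows
FROM (
  SELECT COUNT(*) AS repetitions
  FROM person_tbl
  GROUP BY last_name, first_name
  HAVING COUNT(*) > 1
) AS dup;
```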

Filtering duplicate data

If you need to read non-duplicate data, you can use the DISTINCT keyword in the SELECT statement to filter out duplicate data.

mysql> SELECT DISTINCT last_name, first_name
  -> FROM person_tbl;

You can also use GROUP BY to read unique data in the table:

mysql> SELECT last_name, first_name
  -> FROM person_tbl
  -> GROUP BY last_name, first_name;
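
To check quickly whether a table has any duplicates at all, the total row count can be compared with the number of distinct name pairs. This is a small sketch; the aliases total_rows and distinct_names are illustrative.

```sql
-- If the two numbers differ, some name pairs occur more than once
-- (or some rows contain NULL names, see the note below).
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT last_name, first_name) AS distinct_names
FROM person_tbl;
```

COUNT(DISTINCT ...) skips rows in which any of the listed columns is NULL, so the comparison is only reliable when both name columns are declared NOT NULL.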

Deduplication

If you want to delete duplicate data in a table, you can use the following SQL statement:

mysql> CREATE TABLE tmp SELECT last_name, first_name, sex FROM person_tbl GROUP BY last_name, first_name, sex;
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;
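
A table created with CREATE TABLE ... SELECT does not inherit the indexes of the original, and dropping person_tbl before renaming tmp leaves a short window in which no person_tbl exists. A possible variant that avoids both issues is sketched below; person_tbl_old is a hypothetical name for the backup copy.

```sql
-- Copy the structure, including indexes, of the original table.
CREATE TABLE tmp LIKE person_tbl;

-- Copy each distinct row once.
INSERT INTO tmp (last_name, first_name, sex)
SELECT DISTINCT last_name, first_name, sex
FROM person_tbl;

-- Swap the tables in a single statement; the old data is kept as a backup.
RENAME TABLE person_tbl TO person_tbl_old,
             tmp TO person_tbl;

-- Once the result has been verified, the backup can be removed:
-- DROP TABLE person_tbl_old;
```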

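Of course, you can also add an INDEX or a PRIMARY KEY to the table to delete duplicate records. Note that the statement below relies on the IGNORE clause of ALTER TABLE, which was removed in MySQL 5.7.4, so it only works on older versions. Here's how: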

mysql> ALTER IGNORE TABLE person_tbl
  -> ADD PRIMARY KEY (last_name, first_name);
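
On servers that reject ALTER IGNORE TABLE, a similar result can be sketched by copying the rows into a new table that already has the primary key. The table names person_tbl_new and person_tbl_old are hypothetical, and the sketch assumes the name columns in person_tbl contain no NULL values.

```sql
-- New table with the desired primary key in place.
CREATE TABLE person_tbl_new
(
  first_name CHAR(20) NOT NULL,
  last_name CHAR(20) NOT NULL,
  sex CHAR(10),
  PRIMARY KEY (last_name, first_name)
);

-- INSERT IGNORE keeps one row per name pair and skips the later duplicates.
INSERT IGNORE INTO person_tbl_new (first_name, last_name, sex)
SELECT first_name, last_name, sex
FROM person_tbl;

-- Swap the tables; the original data remains available as person_tbl_old.
RENAME TABLE person_tbl TO person_tbl_old,
             person_tbl_new TO person_tbl;
```

Which of several duplicate rows survives depends on the order in which the SELECT returns them, so this approach fits best when the duplicates are identical or the differences between them do not matter.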
