How to implement batch deletion of large amounts of data in MySQL large tables

This question comes from https://www.zhihu.com/question/440066129/answer/1685329456. A MySQL table holds 300 million rows and has not been split into multiple tables. One of its fields is the enterprise type, which is either general enterprise or self-employed individual, and self-employed individuals account for about 50% of the rows. How can all the self-employed rows be deleted? The answer below is original.

Assume the table engine is InnoDB and the server is MySQL 5.7+.

To delete a record, InnoDB first locks it and then changes the record header, mainly by setting the delete mark, instead of removing the data. In other words, the record's delete flag becomes 1 to indicate that the row is deleted, but the data is not cleared right away. The space may later be reused by a new row that is no larger than the deleted one. Until then it is dead space, which is storage fragmentation.

The indexes that contain the row must then be updated as well, and the corresponding binlog and redo log entries are generated.
If a large amount of data is deleted at once, the following problems appear:

  • Without a limit, the statement has to touch a very large number of rows. The optimizer is then likely to skip the index and scan the whole table, locking the rows it scans, which in effect locks the table. Modifying this many index entries also generates a large volume of logs, so the statement runs for a long time and holds its locks for just as long. During this period the table cannot serve online business.
  • The large volume of binlog generated puts more pressure on master-slave replication.
  • Mark deletion leaves a lot of storage fragmentation. Since MySQL loads data by page, these fragments not only greatly increase the number of random reads but also lower the page hit rate, so more pages have to be swapped in and out.
  • Because of the large volume of logs generated, the space occupied by this table grows significantly instead of shrinking.

Solution

An obvious first step is to add LIMIT to the DELETE statement to cap the number of rows deleted per statement. With a small enough batch, the statement can use the index and does not lock the entire table.
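The batched deletion described above might look like the following minimal sketch. The table name `enterprise`, the column `enterprise_type`, the value 'self_employed', the procedure name, the batch size of 5000, and the half-second pause are all assumptions made for illustration; none of them come from the original question. With autocommit enabled, each DELETE commits on its own, so locks are held only for one batch at a time.

```sql
-- Hypothetical schema: table `enterprise` with a column `enterprise_type`.
DELIMITER $$
CREATE PROCEDURE purge_self_employed()
BEGIN
  DECLARE affected INT DEFAULT 1;
  WHILE affected > 0 DO
    -- Delete one small batch; the batch size (5000) is an assumed value.
    DELETE FROM enterprise
    WHERE enterprise_type = 'self_employed'
    LIMIT 5000;
    -- ROW_COUNT() must be read immediately after the DELETE.
    SET affected = ROW_COUNT();
    -- Short pause so replicas and online traffic can keep up (assumed value).
    DO SLEEP(0.5);
  END WHILE;
END$$
DELIMITER ;

CALL purge_self_employed();
```

The loop stops once a DELETE removes no rows. An index on the filter column, or batching by primary key range instead, would probably be needed to keep each batch cheap on a table of this size.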

However, batching alone does not solve storage fragmentation, master-slave replication pressure, or space usage. After the deletion is complete, you can rebuild the table with the following statement:

ALTER TABLE your_table ENGINE=InnoDB, ALGORITHM=INPLACE, LOCK=NONE;

Note that this statement actually rebuilds the table. Although the table's engine is already InnoDB, specifying ENGINE=InnoDB forces a rebuild, and adding ALGORITHM=INPLACE, LOCK=NONE at the end allows the rebuild to run without blocking reads and writes on the table.
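To judge whether a rebuild is worthwhile, one option is to compare the table's size figures before and after it. The query below is a sketch that reads them from `information_schema.TABLES`; the table name `enterprise` is a hypothetical placeholder, and for InnoDB the reported values are approximate.

```sql
-- Approximate size and free space of one table, in MiB.
-- 'enterprise' is a placeholder table name.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mib,
       ROUND(index_length / 1024 / 1024) AS index_mib,
       ROUND(data_free    / 1024 / 1024) AS free_mib
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'enterprise';
```

A large `free_mib` relative to `data_mib` after the deletion suggests that much of the space is fragmented and could be reclaimed by the rebuild.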

Another solution is to create a new table with the same structure and add a trigger to the original table:

DELIMITER $$
CREATE TRIGGER person_trigger_update AFTER UPDATE ON original_table
FOR EACH ROW
BEGIN
  -- Copy the updated row into the new table, replacing any earlier copy
  REPLACE INTO new_table SELECT * FROM original_table WHERE id = NEW.id;
END$$
DELIMITER ;

This ensures that updates made by online business are synchronized to the new table. After that, insert all general-enterprise rows into the new table. A row that already exists there because an update was synchronized is not inserted again. Because the business no longer updates self-employed rows in this table, those rows never reach the new table, so the large table is cleaned through synchronization without locking it.
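The copy step could be sketched as follows. The names `original_table`, `new_table`, and `id` follow the trigger above, while the column `enterprise_type`, the value 'general', and the id range are assumptions for illustration. The final RENAME is also an assumption: the original answer does not say how the new table is put into service.

```sql
-- Create the empty copy before adding the trigger.
CREATE TABLE new_table LIKE original_table;

-- Backfill one primary-key range at a time (the range size is an assumed value).
-- INSERT IGNORE skips rows the trigger has already copied, keeping the newer version.
INSERT IGNORE INTO new_table
SELECT * FROM original_table
WHERE enterprise_type = 'general'
  AND id BETWEEN 1 AND 100000;
-- ...repeat for the following id ranges until the whole table is covered...

-- Assumed final step: swap the two tables in one atomic statement.
RENAME TABLE original_table TO original_table_old,
             new_table TO original_table;
```

Note that the trigger shown earlier covers only updates. If the business also inserts or deletes rows during the copy, matching AFTER INSERT and AFTER DELETE triggers would presumably be needed.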
