A few days ago, when I was working on a requirement, I needed to clean up duplicate records in MySQL. At that time, I thought of writing it out through code traversal, but then I felt it was too complicated, and I thought that the problem should be solved through a SQL statement. After checking the information and consulting the experts, I came up with a very convenient SQL statement. Here I share this SQL statement and ideas. Demand Analysis There are duplicate records in the database, delete one and keep one (the basis for determining whether it is a duplicate is multiple fields) Solution When I encountered this requirement, I probably had some ideas in mind. The first thing that comes to my mind is that it can be solved with a SQL statement, but I am not very experienced in complex SQL statements, so I want to ask an expert for help. Find someone to help Because this requirement is a bit urgent, the first thing I thought of was to find a colleague in this field to solve it, and then share this problem with my colleague. As a result, this guy just Baidu and gave me a SQL statement that I had never used before, and asked me to try it myself. A lot of things rushed through my heart... Baidu Found a sql statement: DELETE FROM vitamin a WHERE (a.peopleId, a.seq) IN ( SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING count(*) > 1 ) AND rowid NOT IN ( SELECT min(rowid) FROM vitae GROUP BY peopleId, seq HAVING count(*) > 1 ) This statement is found in the article [Delete duplicate data in MySQL and keep only one]. The idea of this SQL statement is very clear, and there are three steps: Finally, according to the above two conditions, delete the remaining duplicate records except the smallest ID in the duplicate records But unfortunately, an error occurred when running this statement. The general meaning of the error is that the table cannot be updated at the same time as the query. Code Solution Based on the above SQL statement, it is possible to achieve the same purpose in two steps through code: Remove the duplicate data sets first According to the queried data set, loop to delete the remaining duplicate data I had the idea and wrote it quickly, but I was shocked when I ran it. It took about 116 seconds . Then I thought I must find a SQL statement that can be used. I posted the code and the running results: Perfect [Remove Duplicates and Keep One] SQL Finally, I got the perfect answer in a technical group. Look at this SQL statement: DELETE consum_record FROM consum_record, ( SELECT min(id) id, user_id, monetary, consume_time FROM consum_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1 ) t2 WHERE consum_record.user_id = t2.user_id and consum_record.monetary = t2.monetary and consum_record.consume_time = t2.consume_time AND consum_record.id > t2.id; If you look carefully at the above SQL statement, it is not difficult to figure out the idea. It can be understood in three steps: According to the conditions, delete the records in the original table whose id is greater than the id in t2 When I saw this sentence, I thought it was amazing. Such a simple SQL statement can actually solve such a complex problem. It's really interesting! It also runs super fast. The original code loop execution takes about 116 seconds , but here it only takes 0.3 seconds . Amazing~ Summarize As a PHP programmer, it stands to reason that SQL should not lag behind. However, in reality, there are too many things to do and my current SQL level is only at an average level. I will find an opportunity to improve my knowledge in this area in the future. That’s all for today. You may also be interested in:
|
<<: js implements simple provincial, municipal and district three-level selection cascade
>>: Detailed analysis of when tomcat writes back the response datagram
Preface I feel like my mind is empty lately, as I...
CSS Clear Default Styles The usual clear default ...
This is a cheating scheme for voting websites wit...
1. Principle of Hotlinking 1.1 Web page preparati...
Why do we need to build a nexus private server? T...
HTML forms are used to collect different types of...
Preface nginx uses a multi-process model. When a ...
Table of contents Install Tomcat with Docker Use ...
Table of contents Nginx proxies two socket.io ser...
There is no solution for Chinese input method und...
Table of contents Problem Description Historical ...
When we write some UI components, if we don't...
Because the project needs to use https service, I...
Disable right-click menu <body oncontextmenu=s...
1. The Linux server configures /etc/hosts.deny to...