1. ProblemThe story originated from a report that queries the error and omission rate: there are two query results, one for the items that have been added to the report and the other for the items that should be added to the report. What does it mean to be without omission? That is, all items that should be added have been added The reporting completeness rate is the ratio of the number of complete reports to the total number of reports. Here are two examples of reports (one with all added and one with missing parts) First, find the first result - the items that should be added to the report SELECT r.id AS report ID, m.project_id should be added to the project FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id RIGHT JOIN application_sample_item si ON s.id=si.sample_id RIGHT JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id; Then, find the second result - report the items that have been added SELECT r.id AS report_id,i.project_id AS added_project FROM report r RIGHT JOIN report_item i ON r.id=i.report_id WHERE r.id IN ('44930','44927'); The above is the result set we want to compare. It is not difficult to see that report 44927 is complete, while 44930 has the same number of items, but actually adds 758 items and lacks 112 items, so it is a missing report. 2. SolutionJudging from the question, it is obviously a question of judging whether it is a subset. You can traverse the added items and the items that should be added separately. If the items that should be added can be matched in the added items, it means that the items that should be added are a subset of the added items, that is, there are no omissions. This problem can indeed be solved by looping and comparing, but the cross join of Cartesian products in SQL often means huge overhead and slow query speed. So is there any way to avoid this problem? Option 1:With the help of the functions FIND_IN_SET and GROUP_CONCAT, first understand the following two functions FIND_IN_SET(str,strlist)
The FIND_IN_SET function returns the position of the string to be queried in the target string. GROUP_CONCAT( [distinct] Field to be connected [order by sort field asc/desc ] [separator 'separator'] ) The GROUP_CONCAT() function can concatenate the values of the same field of multiple records into one record and return it. The default separator is English ',' . However, the default length of GROUP_CONCAT() is 1024 Therefore, if the length of the splicing needs to exceed 1024, it will cause incomplete truncation and the length needs to be modified. SET GLOBAL group_concat_max_len=102400; SET SESSION group_concat_max_len=102400; From the above two function introductions, we find that FIND_IN_SET and GROUP_CONCAT are separated by English ',' (in bold) Therefore, we can use GROUP_CONCAT to concatenate the items that have been added into a string, and then use FIND_IN_SET to query one by one whether the items to be added exist in the string. 1. Modify the SQL in the description of the problem and use GROUP_CONCAT to concatenate the items of the added items into a string SELECT r.id,GROUP_CONCAT(i.project_id ORDER BY i.project_id,'') AS added project list FROM report r LEFT JOIN report_item i ON r.id=i.report_id WHERE r.id IN ('44930','44927') GROUP BY r.id; 2. Use FIND_IN_SET to check one by one whether the items to be added exist in the string SELECT Q.id,FIND_IN_SET(W.List of items to be added, Q.List of items already added) AS Is FROM missing ( -- Report the added projects SELECT r.id,GROUP_CONCAT(i.project_id ORDER BY i.project_id,'') AS added project list FROM report r LEFT JOIN report_item i ON r.id=i.report_id WHERE r.id IN ('44930','44927') GROUP BY r.id )Q, ( -- Report items that should be addedSELECT r.id,s.app_id,m.project_id should add project list FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id )W WHERE Q.id=W.id; 3. Filter out missed reports SELECT Q.id,CASE WHEN FIND_IN_SET(W.List of items to be added, Q.List of items already added)>0 THEN 1 ELSE 0 END AS Is FROM missing ( -- Report the added projects SELECT r.id,GROUP_CONCAT(i.project_id ORDER BY i.project_id,'') AS added project list FROM report r LEFT JOIN report_item i ON r.id=i.report_id WHERE r.id IN ('44930','44927') GROUP BY r.id )Q, ( -- Report items that should be addedSELECT r.id,s.app_id,m.project_id should add project list FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id )W WHERE Q.id=W.id GROUP BY Q.id HAVING COUNT(`Is it missing`)=SUM(`Is it missing`); 4. Our ultimate goal is to find the zero omission rate SELECT COUNT(X.id) number of reports without missing items, Y.total total number of reports, CONCAT(FORMAT(COUNT(X.id)/Y.total*100,2),'%') AS project without missing items rate FROM ( SELECT Q.id,CASE WHEN FIND_IN_SET(W.List of items to be added, Q.List of items already added)>0 THEN 1 ELSE 0 END AS Is FROM missing ( -- Report the added projects SELECT r.id,GROUP_CONCAT(i.project_id ORDER BY i.project_id,'') AS added project list FROM report r LEFT JOIN report_item i ON r.id=i.report_id WHERE r.id IN ('44930','44927') GROUP BY r.id )Q, ( -- Report items that should be addedSELECT r.id,s.app_id,m.project_id should add project list FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id )W WHERE Q.id=W.id GROUP BY Q.id HAVING COUNT(`Is it missing`)=SUM(`Is it missing`) )X, ( -- Total number of reports SELECT COUNT(E.nums) AS total FROM ( SELECT COUNT(r.id) AS nums FROM report r WHERE r.id IN ('44930','44927') GROUP BY r.id )E )Y ; Option 2:Although the above solution 1 avoids line-by-line traversal and comparison, it is essentially a one-by-one comparison of items. So is there any way to avoid comparison? Of course the answer is yes. We can determine whether it is fully included based on the statistical quantity. 1. Use union all to join the added items with the items to be added without removing duplicates. ( -- Items that should be addedSELECT r.id,m.project_id FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id ) UNION ALL ( -- Added project select r.id,i.project_id from report r,report_item i where r.id = i.report_id and r.id IN ('44930','44927') group by r.app_id,i.project_id ) From the results, we can see that there are duplicate items under the same report, which represent the items that should be added and the items that have been added. 2. According to the joint table results, the number of overlapping items in the statistical report # Should add the number of items that overlap with those already added select tt.id,count(*) count from ( select t.id,t.project_id,count(*) from ( ( -- Items that should be addedSELECT r.id,m.project_id FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id ) UNION ALL ( -- Added project select r.id,i.project_id from report r,report_item i where r.id = i.report_id and r.id IN ('44930','44927') group by r.app_id,i.project_id ) ) GROUP BY t.id,t.project_id HAVING count(*) >1 ) tt group by tt.id 3. Compare the amount in the second step with the amount that should be added. If they are equal, it means there is no omission. select bb.id, aa.count has been added, bb.count needs to be added, CASE WHEN aa.count/bb.count=1 THEN 1 ELSE 0 END AS 'Is it missing' from ( # Should add the number of items that overlap with those already added select tt.id,count(*) count from ( select t.id,t.project_id,count(*) from ( ( -- Items that should be addedSELECT r.id,m.project_id FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id ) UNION ALL ( -- Added project select r.id,i.project_id from report r,report_item i where r.id = i.report_id and r.id IN ('44930','44927') group by r.app_id,i.project_id ) ) GROUP BY t.id,t.project_id HAVING count(*) >1 ) tt group by tt.id ) aa RIGHT JOIN ( -- The number of items that should be addedSELECT r.id,s.app_id,COUNT(m.project_id) count FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') GROUP BY r.id ORDER BY r.id,m.project_id ) bb ON aa.id = bb.id ORDER BY aa.id 4. Find the no-missing rate select SUM(asr.`Is it missing?) AS No missing number, COUNT(asr.id) AS Total number, CONCAT(FORMAT(SUM(asr.`Is it missing?)/COUNT(asr.id)*100,5),'%') AS Report no missing rate from ( select bb.id, aa.count has been added, bb.count needs to be added, CASE WHEN aa.count/bb.count=1 THEN 1 ELSE 0 END AS 'Is it missing' from ( # Should add the number of items that overlap with those already added select tt.id,count(*) count from ( select t.id,t.project_id,count(*) from ( ( -- Items that should be addedSELECT r.id,m.project_id FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') ORDER BY r.id,m.project_id ) UNION ALL ( -- Added project select r.id,i.project_id from report r,report_item i where r.id = i.report_id and r.id IN ('44930','44927') group by r.app_id,i.project_id ) ) GROUP BY t.id,t.project_id HAVING count(*) >1 ) tt group by tt.id ) aa RIGHT JOIN ( -- The number of items that should be addedSELECT r.id,s.app_id,COUNT(m.project_id) count FROM report INNER JOIN application a ON r.app_id=a.id INNER JOIN application_sample s ON a.id=s.app_id INNER JOIN application_sample_item si ON s.id=si.sample_id INNER JOIN set_project_mapping m ON si.set_id=m.set_id WHERE r.id IN ('44930','44927') GROUP BY r.id ORDER BY r.id,m.project_id ) bb ON aa.id = bb.id ORDER BY aa.id ) asr; This is the end of this article about the steps to determine whether MySQL is a subset. For more information about how to determine whether MySQL is a subset, please search 123WORDPRESS.COM's previous articles or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: W3C Tutorial (9): W3C XPath Activities
>>: What is Nginx load balancing and how to configure it
1. Which three formats? They are: gif, jpg, and pn...
Table of contents 1. Create a table 1.1 Create te...
Click here to return to the 123WORDPRESS.COM HTML ...
BMP is an image file format that is independent o...
<br />The content is reproduced from the Int...
Table of contents Overview Code Implementation Pa...
dig - DNS lookup utility When a domain name acces...
The following problem occurred when installing my...
<br />Original URL: http://www.lxdong.com/po...
1 What is MVCC The full name of MVCC is: Multiver...
Table of contents Solution 1 Solution 2 When crea...
export default ({ url, method = 'GET', da...
Let’s build the data table first. use test; creat...
Table of contents 1. What is a directive? Some co...
Configure tomcat 1. Click run configuration 2. Se...