A possible bug when MySQL executes the sum function on the window function

A possible bug when MySQL executes the sum function on the window function

When using MySql's window function to collect statistical data, I found a small problem and would like to discuss it with you.

Environment configuration:

  • mysql-installer-community-8.0.20.0

Problem: When sum is executed on the window function, if there is duplicate data, the same data will be added directly instead of step by step.

Problem Description

Data: In a grade table, there are three fields: student s_id, course c_id, grade s_score.

The query conditions query the student score ranking and score summary for each course.

Query results: It was found that if the same course has the same score, the summary score is not accumulated, but added all at once.

Create a data table

CREATE TABLE `Score`(
`s_id` VARCHAR(20),
`c_id` VARCHAR(20),
`s_score` INT(3),
PRIMARY KEY(`s_id`,`c_id`)
)

Inserting Data

-- insert into Score values('01', '01', 80);
insert into Score values('01' , '02' , 90);
insert into Score values('01' , '03' , 99);
insert into Score values('02' , '01' , 70);
insert into Score values('02' , '02' , 60);
insert into Score values('02' , '03' , 80);
insert into Score values('03' , '01' , 80);
insert into Score values('03' , '02' , 80);
insert into Score values('03' , '03' , 80);
insert into Score values('04' , '01' , 50);
insert into Score values('04' , '02' , 30);
insert into Score values('04' , '03' , 20);
insert into Score values('05' , '01' , 76);
insert into Score values('05' , '02' , 87);
insert into Score values('06' , '01' , 31);
insert into Score values('06' , '03' , 34);
insert into Score values('07' , '02' , 89);
insert into Score values('07' , '03' , 98);

Query data

select c_id,s_id,s_score,
first_value(s_score) over w as first_v,
last_value(s_score) over w as last_v,
sum(s_score) over w as sum_v,
max(s_score) over w as max_v,
min(s_score) over w as min_v,
count(s_id) over w as count_v,
row_number() over w as row_id,
rank() over w as rank_id,
dense_rank() over w as dense_id
from score window was (partition by c_id order by s_score desc);

Query results

Looking at the statistical results of course number 01, in the sum_v column of the first row of data, the first two data are both 160. According to the function principle, the data should be 80 and 160.

Looking at the statistical results of course number 02, we find that the results are correct. The first sum_v is 90 and the second is 179.

The actual display is inconsistent with the expected result, something went wrong.

c_id s_id s_score first_v last_v sum_v max_v min_v count_v row_id rank_id dense_id
01 01 80 80 80 160 80 80 2 1 1 1
01 03 80 80 80 160 80 80 2 2 1 1
01 05 76 80 76 236 80 76 3 3 3 2
01 02 70 80 70 306 80 70 4 4 4 3
01 04 50 80 50 356 80 50 5 5 5 4
01 06 31 80 31 387 80 31 6 6 6 5
02 01 90 90 90 90 90 90 1 1 1 1
02 07 89 90 89 179 90 89 2 2 2 2
02 05 87 90 87 266 90 87 3 3 3 3
02 03 80 90 80 346 90 80 4 4 4 4
02 02 60 90 60 406 90 60 5 5 5 5
02 04 30 90 30 436 90 30 6 6 6 6
03 01 99 99 99 99 99 99 1 1 1 1
03 07 98 99 98 197 99 98 2 2 2 2
03 02 80 99 80 357 99 80 4 3 3 3
03 03 80 99 80 357 99 80 4 4 3 3
03 06 34 99 34 391 99 34 5 5 5 4
03 04 20 99 20 411 99 20 6 6 6 5

Thinking Verification

The data for course number 02 is correct, but that for 01 is incorrect. The difference between 01 and 02 is that the first two students in course 01 have the same score of 80.

Could it be that the scores were the same, which led to an error in the sum?

To verify this problem, change the score of course number 01 and student number 01 to 82, and then execute the query. The result is as follows

The sum_v column shows 82 and 162, which is consistent with the expected results.

This shows that when sum is executed on the window function, if there is duplicate data, the same data will be added directly instead of step by step.


c_id s_id s_score first_v last_v sum_v max_v min_v count_v row_id rank_id dense_id
01 01 80 80 82 82 82 82 2 1 1 1
01 03 80 80 80 162 82 80 2 2 1 1
01 05 76 80 76 236 82 76 3 3 3 2
01 02 70 80 70 306 82 70 4 4 4 3
01 04 50 80 50 356 82 50 5 5 5 4
01 06 31 80 31 387 82 31 6 6 6 5
02 01 90 90 90 90 90 90 1 1 1 1
02 07 89 90 89 179 90 89 2 2 2 2
02 05 87 90 87 266 90 87 3 3 3 3
02 03 80 90 80 346 90 80 4 4 4 4
02 02 60 90 60 406 90 60 5 5 5 5
02 04 30 90 30 436 90 30 6 6 6 6
03 01 99 99 99 99 99 99 1 1 1 1
03 07 98 99 98 197 99 98 2 2 2 2
03 02 80 99 80 357 99 80 4 3 3 3
03 03 80 99 80 357 99 80 4 4 3 3
03 06 34 99 34 391 99 34 5 5 5 4
03 04 20 99 20 411 99 20 6 6 6 5

Other Sql verification and comparison

After the above verification, Mysql did have an error when summing, and it was not accumulated step by step.

I checked whether the same problem exists on other platforms in Sqlite Expert 5.3 and found the same results.

This is strange. If an error occurs in the implementation of Mysql, the probability of Sqlite making the same error is much smaller.

Could it be due to the characteristics of the sum and window functions when used together? Welcome to discuss and research together.

Summarize

This is the end of this article about a possible bug in MySQL's execution of the sum function on the window function. For more information about MySQL's execution of the sum function on the window function, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Detailed explanation of the difference between Mysql temporary table and partition table
  • MySQL Best Practices: Basic Types of Partition Tables
  • Basic introductory tutorial on MySQL partition tables
  • MySQL optimization partition table
  • Solve the mysql user deletion bug
  • A bug in MySQL about exists
  • CentOS installation PHP5.5+Redis+XDebug+Nginx+MySQL full record
  • A performance bug about MySQL partition tables

<<:  nginx solves the problem of slow image display and incomplete download

>>:  How to elegantly implement WeChat authorized login in Vue3 project

Recommend

Detailed explanation of loop usage in javascript examples

I was bored and sorted out some simple exercises ...

Getting Started: A brief introduction to HTML's basic tags and attributes

HTML is made up of tags and attributes, which are...

W3C Tutorial (2): W3C Programs

The W3C standardization process is divided into 7...

Common operation commands of MySQL in Linux system

Serve: # chkconfig --list List all system service...

Detailed explanation of TIMESTAMPDIFF case in MySQL

1. Syntax TIMESTAMPDIFF(unit,begin,end); Returns ...

Two box models in web pages (W3C box model, IE box model)

There are two types of web page box models: 1: Sta...

Analyze Tomcat architecture principles to architecture design

Table of contents 1. Learning Objectives 1.1. Mas...

Binary Type Operations in MySQL

This article mainly introduces the binary type op...

MySQL 8.0.12 installation steps and basic usage tutorial under Windows

This article shares the installation steps and us...

How to limit the value range of object keys in TypeScript

When we use TypeScript, we want to use the type s...