Details
- Task
- Status: Closed
- Major
- Resolution: Fixed
- 6.4.2
- None
- 2021-17
Description
This affects regr_r2 and regr_slope specifically, but other aggregate functions may behave similarly.
When we calculate the final result, the rounding error built up in the accumulators is often large enough that the result depends on the order in which the data is read. This makes our test suite fail spuriously.
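As a rough illustration of the order dependence, here is a minimal Python sketch. It assumes a textbook sum-of-products formulation for the slope, which may not match the actual accumulator code. The function name `naive_regr_slope` and the sample data are hypothetical.

```python
import random


def naive_regr_slope(pairs):
    """Slope of y on x from raw running sums (textbook formula).

    `pairs` is an iterable of (x, y) tuples. The final subtraction of two
    large, nearly equal terms is where accumulated rounding error shows up.
    """
    n = 0
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for x, y in pairs:
        n += 1
        sum_x += x
        sum_y += y
        sum_xx += x * x
        sum_xy += x * y
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: x has a large offset relative to its spread.
    data = [(1.0e6 + i, 3.0 * (1.0e6 + i) + rng.uniform(-1.0, 1.0))
            for i in range(1000)]
    shuffled = data[:]
    rng.shuffle(shuffled)
    # Same rows, different read order; the two values may differ
    # in their trailing digits.
    print(repr(naive_regr_slope(data)))
    print(repr(naive_regr_slope(shuffled)))
```

A test that compares such results for exact equality can pass or fail depending on which order the rows happen to arrive in.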
There are techniques found in the literature that can help compensate for these numerical issues.
See the following for ideas:
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
https://www.johndcook.com/blog/standard_deviation/
I'm sure there are other references that can be found.
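One technique described in the first reference is a Welford-style online update of the means and centered co-moments, together with a pairwise merge for combining partial accumulators. The Python sketch below shows the idea under stated assumptions. The class name `RegrAccumulator` and its methods are hypothetical, the edge-case handling is simplified, and this is not necessarily the approach taken in the actual fix.

```python
class RegrAccumulator:
    """Online accumulator for regr_slope / regr_r2 style results.

    Keeps running means and centered sums (sxx, syy, sxy) instead of raw
    sums of squares and products, so the final result does not rely on
    subtracting two large, nearly equal numbers.
    """

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.sxx = 0.0  # sum of (x - mean_x)^2
        self.syy = 0.0  # sum of (y - mean_y)^2
        self.sxy = 0.0  # sum of (x - mean_x) * (y - mean_y)

    def add(self, x, y):
        """Fold in one (x, y) row using the Welford-style update."""
        self.n += 1
        dx = x - self.mean_x          # deviation from the old mean
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.sxx += dx * (x - self.mean_x)   # old deviation * new deviation
        self.syy += dy * (y - self.mean_y)
        self.sxy += dx * (y - self.mean_y)

    def merge(self, other):
        """Combine with another partial accumulator (pairwise update)."""
        if other.n == 0:
            return
        if self.n == 0:
            self.__dict__.update(other.__dict__)
            return
        n = self.n + other.n
        dx = other.mean_x - self.mean_x
        dy = other.mean_y - self.mean_y
        weight = self.n * other.n / n
        self.sxx += other.sxx + dx * dx * weight
        self.syy += other.syy + dy * dy * weight
        self.sxy += other.sxy + dx * dy * weight
        self.mean_x += dx * other.n / n
        self.mean_y += dy * other.n / n
        self.n = n

    def slope(self):
        """Slope of y on x, or None when it is undefined (simplified)."""
        return self.sxy / self.sxx if self.sxx != 0.0 else None

    def r2(self):
        """Squared correlation, or None when undefined (simplified)."""
        if self.sxx == 0.0 or self.syy == 0.0:
            return None
        return (self.sxy * self.sxy) / (self.sxx * self.syy)
```

This does not make the result bit-for-bit identical across read orders, but the remaining differences should be much smaller than with raw sums, so tests would likely still want to compare against a tolerance rather than for exact equality.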
Issue Links
- includes: MCOL-3655 regr_sxy() returns different results across different runs of the same query (Closed)