[MDEV-33340] fix for MDEV-24670 causes a performance regression Created: 2024-01-31 Updated: 2024-02-01 Resolved: 2024-02-01 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | N/A |
| Affects Version/s: | N/A |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Axel Schwenke | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 1 |
| Labels: | regression | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
The regression test suite found a regression for the t_threadpool* tests. It turned out to be a regression in sorting rows. Further analysis in TODO-4510 traced it to commit a057a6e41f2. The regression grows as more rows have to be sorted and is on the order of 3% for 1000 rows. Additionally, the execution times for queries fluctuate more than normal. |
| Comments |
| Comment by Axel Schwenke [ 2024-01-31 ] | ||
|
The performance test results for the OLTP order ranges test: | ||
| Comment by Marko Mäkelä [ 2024-01-31 ] | ||
|
Because there are no such messages in the server error log, the problem should not be this code (which would likely cause a more severe regression later on during any performance test, by forcing pages to be read back into the buffer pool). Instead, the problem should be in some code that danblack wrote, such as mem_pressure::setup() or mem_pressure::pressure_routine(). | ||
| Comment by Daniel Black [ 2024-02-01 ] | ||
|
[a057a6e41f2](https://github.com/MariaDB/server/commit/a057a6e41f2) is a simple condition in the buffer pool init so that it doesn't get activated for MariaDB-backup. As both commits show the same "Failed to initialize memory pressure: No such file or directory" error, they both took the same path through this. (Note this error message was removed later). Because of this error, there isn't even a background thread running (confirmed by PMP). There would have been a little extra processing in init, but no extra CPU time during the run. Even if it had been Ubuntu 22.04, or an OS with cgroups2, the background thread is waiting on a poll for an event that shouldn't happen without memory pressure. Looking at the flame graphs, there's a 0.08 difference in percentage at start_thread. By the time it gets up to JOIN::exec, the difference in percentage is 0.25. Going up further to join_init_read_record, the difference in percentage is 0.83. Obviously we haven't touched any code created by memory pressure. Obviously, with 7f11fad85a885d148254ca05f508125e3b94339c showing the same performance, there's still a regression there. Did reverting a057a6e41f2 show the improvement come back? Yes, there's a regression, but with nothing showing in the CPU profile related to | ||
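To illustrate why a thread blocked in a poll on a pressure trigger costs no CPU time, here is a minimal Python sketch. It is not the MariaDB `mem_pressure` code. It assumes the Linux pressure stall information (PSI) interface at `/proc/pressure/memory`; the function names and the trigger thresholds are hypothetical, chosen only for illustration.

```python
import os
import select


def open_psi_trigger(path="/proc/pressure/memory",
                     trigger="some 150000 1000000"):
    """Register a PSI trigger and return its file descriptor.

    The trigger string is "<some|full> <stall us> <window us>"; the values
    here are illustrative assumptions. Returns None when the interface is
    unavailable (for example, the file does not exist on this kernel).
    """
    try:
        fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    except OSError as e:
        # Same failure class as the message quoted in the comment above.
        print(f"Failed to initialize memory pressure: {os.strerror(e.errno)}")
        return None
    try:
        os.write(fd, trigger.encode() + b"\0")
    except OSError as e:
        os.close(fd)
        print(f"Failed to register trigger: {os.strerror(e.errno)}")
        return None
    return fd


def wait_for_pressure(fd, timeout_ms):
    """Block in poll() until the kernel signals POLLPRI or the timeout ends.

    While blocked, the thread is asleep in the kernel and uses no CPU.
    Returns True only if a pressure event (POLLPRI) was reported.
    """
    poller = select.poll()
    poller.register(fd, select.POLLPRI)
    events = poller.poll(timeout_ms)
    return any(ev & select.POLLPRI for _, ev in events)
```

If `open_psi_trigger()` returns `None`, no waiter thread would be started at all, which matches the situation described above where setup failed and no background thread was running.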
| Comment by Axel Schwenke [ 2024-02-01 ] | ||
|
I retested commit 7f11fad85a8 (the original bad release candidate) with the supposed bad commit a057a6e41f2 reverted. Result: for threadpool=off it looks like a057a6e41f2 could be the culprit, but for threadpool=on it does not. I also noticed high fluctuations in throughput, meaning the test used for bisecting could have returned bogus numbers. I will close this ticket and reopen TODO-4510, then bisect again in a different branch and maybe with a better (more stable) test case. | ||
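A rough way to check whether a difference of a few percent stands out from run-to-run fluctuation is to compare repeated runs of both builds. The following is a sketch only: `compare_runs` is a hypothetical helper, not part of the regression test suite, and it uses a plain Welch t statistic on throughput samples supplied by the caller.

```python
import math
import statistics


def compare_runs(baseline, candidate):
    """Compare two lists of throughput samples (at least 2 values each).

    Returns the relative change of the candidate mean versus the baseline
    mean, the coefficient of variation of each sample, and Welch's t
    statistic. A small |t| suggests the difference may just be noise.
    """
    mean_b = statistics.mean(baseline)
    mean_c = statistics.mean(candidate)
    var_b = statistics.variance(baseline)   # sample variance
    var_c = statistics.variance(candidate)

    # Standard error of the difference between the two means.
    std_err = math.sqrt(var_b / len(baseline) + var_c / len(candidate))
    diff = mean_c - mean_b
    if std_err > 0:
        t_stat = diff / std_err
    else:
        t_stat = 0.0 if diff == 0 else math.copysign(math.inf, diff)

    return {
        "rel_change": diff / mean_b,
        "cv_baseline": math.sqrt(var_b) / mean_b,
        "cv_candidate": math.sqrt(var_c) / mean_c,
        "t_stat": t_stat,
    }
```

When the coefficient of variation of either sample is close to the size of the suspected regression, a bisect step based on a single run per commit can easily point at the wrong commit; repeating each step several times and comparing with a helper like this is one way to make the bisect more robust.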
| Comment by Axel Schwenke [ 2024-02-01 ] | ||
|
Looks like this was a false alarm. |