[MDEV-23862] High contention on last_value in my_timer_microseconds Created: 2020-10-01 Updated: 2020-10-07 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | Georgy Kirichenko | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Description: After investigating the Linux kernel source, I found that the only way gettimeofday can fail when tz=NULL is passed is if the first argument is an invalid pointer, which is definitely not our case. Even if we assume that intermittent gettimeofday errors are possible, these failures are handled quite inconsistently throughout the MySQL source. In some cases we loop endlessly until the call succeeds, in some cases we just ignore failures, and in some cases we return the "last seen" value. The last of these is the most expensive method, because it involves updating a global variable. The price is visible on modern Linux, where gettimeofday with tz=NULL may never fail. Below are sysbench results on X86 and AARCH64 for the vanilla version of my_timer_microseconds and for a patched version that assumes gettimeofday is reliable:
AARCH64:
X86:
The code in question is used only in Performance Schema. Even if gettimeofday() can fail, the expensive "last seen value" method is no different from any other way of handling the error: it only results in skewed statistics, which is not that important given how unlikely the event is. |
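For illustration, here is a simplified sketch of the two approaches compared in the description. It is not the actual server source: the function names timer_us_last_seen and timer_us_direct are hypothetical, and returning 0 on failure in the second variant is an assumption about how a patched version might behave.

```c
#include <stddef.h>
#include <sys/time.h>

/*
  Variant 1: "last seen" value.
  The shared variable is written on every successful call, so all
  threads keep modifying the same cache line. The unsynchronized
  access mirrors the pattern under discussion; it is not a
  recommendation.
*/
static unsigned long long last_value = 0;

unsigned long long timer_us_last_seen(void)
{
  struct timeval tv;
  if (gettimeofday(&tv, NULL) == 0)
    last_value = (unsigned long long) tv.tv_sec * 1000000ULL +
                 (unsigned long long) tv.tv_usec;
  else
    last_value++;              /* failure: nudge the previous value */
  return last_value;
}

/*
  Variant 2: assume gettimeofday() is reliable.
  No shared state is touched. Returning 0 on failure is an assumption
  made for this sketch.
*/
unsigned long long timer_us_direct(void)
{
  struct timeval tv;
  if (gettimeofday(&tv, NULL) != 0)
    return 0;
  return (unsigned long long) tv.tv_sec * 1000000ULL +
         (unsigned long long) tv.tv_usec;
}
```

The second variant keeps everything on the caller's stack, which is presumably where the difference in the benchmark comes from.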
| Comments |
| Comment by Sergei Golubchik [ 2020-10-03 ] |
|
I don't see that my_timer_milliseconds() uses last_value in MariaDB. Returning 0 when gettimeofday() fails could significantly corrupt the statistics, I'm afraid. If the caller doesn't reject zero values, the sudden huge duration will make averages pretty much meaningless for quite a while. Because we don't know why gettimeofday() might fail, I'd say that looping is not an option. But we can fall back to some other existing counter, scaled accordingly. |
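A minimal sketch of what such a fallback could look like. The comment does not say which counter is meant, so time() is used here purely as a stand-in for "some other existing counter", and the names seconds_to_us and timer_us_with_fallback are hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Scale a whole-second counter to microseconds. */
unsigned long long seconds_to_us(unsigned long long seconds)
{
  return seconds * 1000000ULL;
}

/*
  Microsecond timer that falls back to a coarser counter instead of
  returning 0 or looping. time() is an assumed stand-in for the
  fallback counter.
*/
unsigned long long timer_us_with_fallback(void)
{
  struct timeval tv;
  time_t now;

  if (gettimeofday(&tv, NULL) == 0)
    return seconds_to_us((unsigned long long) tv.tv_sec) +
           (unsigned long long) tv.tv_usec;

  /* Fallback: one-second resolution, scaled to the same unit. */
  now = time(NULL);
  if (now != (time_t) -1)
    return seconds_to_us((unsigned long long) now);

  return 0;                    /* last resort if both calls fail */
}
```

With this shape the fallback value stays in the same range as the normal one, so a failure costs resolution rather than producing a sudden huge duration.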
| Comment by Georgy Kirichenko [ 2020-10-07 ] |
|
You are right - my_timer_milliseconds does not use gettimeofday, excuse me. Although the statistics corruption could be significant, this function does not seem to be able to fail (according to my investigation of the Linux code). |