[MDEV-21954] mysqld got signal 11 Fatal signal 6 while backtracing on parallel show global status Created: 2020-03-16 Updated: 2020-10-23 Resolved: 2020-10-23 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - TokuDB |
| Affects Version/s: | 10.4.12, 10.4 |
| Fix Version/s: | 10.4.16 |
| Type: | Bug | Priority: | Major |
| Reporter: | Reinis Rozitis | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | upstream |
| Environment: | Linux |
| Description |
|
After upgrading from 10.3.22 to 10.4.12 I have encountered a strange issue: if two or more 'show global status where ...' queries are issued in parallel, mysql crashes. Something like:
crashes the server reliably:
If you execute each query sequentially there are no problems. The coredump looks like:
(full backtrace in attachment) |
| Comments |
| Comment by Reinis Rozitis [ 2020-03-16 ] |
|
It looks somewhat similar to
| Comment by Reinis Rozitis [ 2020-03-16 ] |
|
Initially I thought/wrote that the issue doesn't manifest on an empty server, but just by increasing the parallel query count (for example to 6) it crashes there as well. At least mysqld now generates a stacktrace (bt2.txt)
| Comment by Alice Sherepa [ 2020-03-17 ] |
|
Thanks! I repeated on 10.4, when TokuDB engine is installed, could not repeat on 5.5-10.3
| Comment by Reinis Rozitis [ 2020-09-16 ] |
|
Any status or ETA on this? It still affects 10.4.14 but not the 10.3.x branch. The only way to upgrade is to turn off the monitoring system and hope nobody runs 'show global status' too often
| Comment by Sergei Golubchik [ 2020-09-17 ] |
|
Does it happen without TokuDB? TokuDB is not supported even by its owner, so there's little chance we'll be able to fix a bug if it's inside TokuDB
| Comment by Reinis Rozitis [ 2020-09-17 ] |
|
Without tokudb there is no crash, but it also doesn't happen on the 10.3.x branch with tokudb. Isn't the source base for the engine generally the same, so that the difference must be in how 10.4 fetches the engine stats? Also I assumed that at least till EOL of 10.4 tokudb will be somewhat supported. Atm I have to downgrade back to 10.3 because it's somewhat scary to have such an easy way to invoke a segfault
| Comment by Sergei Golubchik [ 2020-09-18 ] |
|
This is a TokuDB bug after all. In show_tokudb_vars() (storage/tokudb/hatoku_hton.cc:1961) TokuDB reads the status into a shared global status array and freely modifies it, with the comment
See, there is a shared global toku_global_status_rows: the status is read into it, and later a status row is modified under the assumption that "it belongs to us". When this function is called concurrently by two threads, the global array of status rows gets corrupted. In 10.3 this did not cause a crash because the server protected access to status variables with a mutex. In 10.4 |
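The pattern described in this comment (a shared global array that each caller fills and then modifies in place) can be sketched in isolation. This is only an illustration and not TokuDB's code: `StatusRow`, `g_rows`, `g_rows_mutex`, `collect_status_locked` and `run_parallel` are hypothetical names. Guarding the shared array with a mutex is one possible remedy (a per-call local copy would be another); it is not necessarily what the actual fix does.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a status row; not TokuDB's real structure.
struct StatusRow {
    std::string name;
    long value = 0;
};

constexpr std::size_t kRows = 64;

// Shared global buffer, analogous in spirit to toku_global_status_rows.
static std::array<StatusRow, kRows> g_rows;
// Assumption: a single mutex serializes every reader/writer of g_rows.
static std::mutex g_rows_mutex;

// Fills the shared buffer, then rewrites each row as if it were private to
// the caller. Holding the lock for the whole call is what makes the
// "it belongs to us" assumption true.
long collect_status_locked(long seed) {
    std::lock_guard<std::mutex> guard(g_rows_mutex);
    for (std::size_t i = 0; i < kRows; ++i) {
        g_rows[i].name = "row_" + std::to_string(i);
        g_rows[i].value = seed;
    }
    long sum = 0;
    for (auto &row : g_rows) {
        row.name += "_seen";  // in-place modification of shared state
        sum += row.value;
    }
    return sum;
}

// Runs `threads` collectors concurrently and reports whether every thread
// observed only the values it wrote itself.
bool run_parallel(int threads) {
    std::vector<long> results(threads, -1);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([t, &results] { results[t] = collect_status_locked(t + 1); });
    for (auto &th : pool) th.join();
    for (int t = 0; t < threads; ++t)
        if (results[t] != static_cast<long>(kRows) * (t + 1)) return false;
    return true;
}
```

Calling `run_parallel(6)` mirrors the six parallel queries mentioned earlier in the thread. Removing the `std::lock_guard` line turns the function into a data race on `g_rows`, which is undefined behaviour in C++ and may show up as corrupted strings or a crash.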