[MDEV-32598] crash in maria/ma_rt_split.c:210(_pcre_xclass) with Aria geospatial index page split Created: 2023-10-27 Updated: 2023-10-31 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Data Manipulation - Insert |
| Affects Version/s: | 10.4.31 |
| Fix Version/s: | 10.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Jukka Santala | Assignee: | Alexey Botchkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | crash |
| Environment: | RHEL 8.8 server |
| Description |
|
I first attempted CREATE TABLE locations LIKE source.locations; INSERT INTO locations ( SELECT * FROM source.locations ); and the server crashed. I then tried REPAIR TABLE source.locations, which crashed again, then crashed in recovery, and finally hung indefinitely. Moving source.locations out of the way and killing the process got the server to start again.
This table is read and written extensively and holds about 300k rows of about 160 bytes each, but beyond that I have no idea how to reproduce the issue. MDEV-31766 looks similar, though it involves a different function in the code. In our case it looks like the "square" of every SplitStruct being handled is INFINITY (I don't know why). The d in pick_seeds() then becomes negative infinity, so no combination passes, and pick_seeds() hands a and b (which are declared with UNINIT_VAR) back unassigned. |
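The arithmetic described above can be checked in isolation. The following is only a minimal sketch, not the server code: the helper names seed_gain() and beats_initial_max() are hypothetical, and it assumes that d is computed as the joined area minus the two individual areas and that the running maximum starts at -DBL_MAX.

```c
#include <float.h>
#include <math.h>

/*
  Hypothetical helper: the value d for one candidate pair, assuming
  d = joined area - area of first entry - area of second entry.
*/
static double seed_gain(double join_square, double square_a, double square_b)
{
  return join_square - square_a - square_b;
}

/*
  Hypothetical helper: would this d replace the running maximum?
  Assumes the maximum is initialised to -DBL_MAX.
  Both -INFINITY and NaN compare false here.
*/
static int beats_initial_max(double d)
{
  return d > -DBL_MAX;
}
```

With both individual areas at INFINITY, seed_gain() yields negative infinity when the joined area is finite and NaN when the joined area is itself INFINITY. Neither value satisfies the comparison, so under these assumptions no pair would ever be stored.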
| Comments |
| Comment by Sergei Golubchik [ 2023-10-27 ] |
|
we need a way to reproduce the issue to be able to fix it.
and so on. |
| Comment by Jukka Santala [ 2023-10-28 ] |
|
Bad news regarding that. The other MDEV looks superficially similar, but under ASAN it turns out to be a heap-buffer-overflow in /mariadb-10.4.31/storage/maria/ma_rt_index.c:859, at root= &share->state.key_root[key->keyinfo->key_nr]; when accessing key->keyinfo->key_nr. That is almost certainly unrelated to our case.
The crash also happens on other instances of the same cluster that received the same table writes (not a binary copy), at around 250k rows. However, it doesn't happen if the rows are inserted in a different order or from a known good copy of the table. The tables still compare as identical; only the spatial index is broken. So finding the error in the logic, or instrumenting the code to that effect, seems like the best bet here.
Additionally, pick_seeds() could check whether it failed to find a pair and return merge failed instead of crashing. That would negate any possible performance gain from not assigning them, though I doubt the missing assignment is there for performance at all.
An EXTENDED check of the table does seem to find a clue to the problem: "Record at: 368:23 Can't find key for index: 2". Is there an easy way to peek at what we have there? |
| Comment by Jukka Santala [ 2023-10-28 ] |
|
Got a reproducer now; not that mysterious after all:
|