[MDEV-27798] SIGSEGV in dict_index_t::reconstruct_fields() Created: 2022-02-10 Updated: 2022-02-23 Resolved: 2022-02-23 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Storage Engine - InnoDB |
| Affects Version/s: | 10.4, 10.5, 10.6, 10.7, 10.8 |
| Fix Version/s: | 10.4.25, 10.5.16, 10.6.8, 10.7.4, 10.8.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | crash, instant |
| Attachments: |
|
| Description |
|
The following crash was observed on 10.8:
One column has been instantly dropped, and some columns have been reordered. I tried to guess a test case, but I failed.
The following should fix the failure by preventing the out-of-bounds access:
I hope that mleich can come up with a test case for this. |
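The patch referred to above is not reproduced in this export. As a rough illustration only of what "preventing the out-of-bounds access" can look like when searching a field array, here is a minimal C++ sketch. Everything in it (`Field`, `find_field`, the `n_valid` parameter) is hypothetical and does not come from the InnoDB source; it assumes the failure mode is a search that starts at or beyond the end of the valid range.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for an index field; NOT InnoDB's dict_field_t.
struct Field {
    int col_id;    // identifier of the column this field refers to
    bool dropped;  // true if the column was instantly dropped
};

// Search fields[start, n_valid) for a field with the given col_id.
// Returns its position, or std::nullopt when the search range is empty
// (start is at or past the end) or no field matches.
std::optional<std::size_t> find_field(const std::vector<Field>& fields,
                                      std::size_t start,
                                      std::size_t n_valid,
                                      int col_id) {
    // Never trust the claimed length beyond the real container size.
    const std::size_t end = std::min(n_valid, fields.size());
    if (start >= end) {
        return std::nullopt;  // nothing valid to search; do not form iterators
    }
    const auto first = fields.begin() + static_cast<std::ptrdiff_t>(start);
    const auto last = fields.begin() + static_cast<std::ptrdiff_t>(end);
    const auto it = std::find_if(first, last, [col_id](const Field& f) {
        return f.col_id == col_id;
    });
    if (it == last) {
        return std::nullopt;  // no match inside the valid range
    }
    return static_cast<std::size_t>(it - fields.begin());
}
```

The design choice in this sketch is to report "not found" through `std::optional` rather than returning an end position, so a caller cannot index with an out-of-range value by accident. It requires C++17.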
| Comments |
| Comment by Matthias Leich [ 2022-02-10 ] |
|
| Comment by Matthias Leich [ 2022-02-16 ] |
|
The grammar |
| Comment by Matthias Leich [ 2022-02-22 ] |
|
| Comment by Marko Mäkelä [ 2022-02-23 ] |
|
I tried to create the following test to reproduce the failure, based on an rr replay trace of an RQG-based test, but it fails to crash on my system.
The rr replay trace that I analyzed ended in a SIGKILL during the commit of DROP COLUMN (before any redo log was written), followed by the failed recovery. The ALTER TABLE statements were not always run in such an optimal order, and there were about 500 such statements in total (so some of them must have failed). In the end, the total number of instantly dropped columns would be 175, matching the above test. The DEBUG_SYNC and INSERT trick is there to ensure that the last DROP COLUMN gets stuck before it writes any log. Possibly the INSERT and the auxiliary table should be removed, because in the rr replay trace that I checked, the server did not write any redo log after the commit for DROP COLUMN was invoked (and quickly interrupted by the SIGKILL). I tried removing all references to the table t1, and it did not change the outcome for me. If we are unable to create a regression test for this, I think that we can do without one, given that the fix is so simple and it behaved well in broadband testing. |
| Comment by Marko Mäkelä [ 2022-02-23 ] |
|
I pushed the fix without a test case. |
| Comment by Matthias Leich [ 2022-02-23 ] |
|
|