[MDEV-16131] Assertion `is_instant() || id == DICT_INDEXES_ID' failed in dict_index_t::instant_field_value
| Created: | 2018-05-09 | Updated: | 2021-11-17 | Resolved: | 2018-07-26 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.3.2 |
| Fix Version/s: | 10.3.9 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Description |
|
Very poorly reproducible: so far it has failed once in Travis tests and once locally out of ~40 attempts.
|
| Comments |
| Comment by Elena Stepanova [ 2018-05-15 ] |
|
New occurrence: https://api.travis-ci.org/v3/job/378823604/log.txt
| Comment by Matthias Leich [ 2018-07-19 ] |
|
I have an RQG grammar that replays the problem quite quickly.
My backtrace:
| Comment by Matthias Leich [ 2018-07-19 ] |
|
No replay on a debug build of 10.2 within 8 attempts. I have uploaded the files required for the replay.
| Comment by Marko Mäkelä [ 2018-07-26 ] |
|
There is no debug injection point ib_rename_indexes_too_many_concurrent_trxs in MariaDB, so the SET DEBUG_DBUG should have no effect. I can repeat this with a slightly simpler test:
Then, after starting the test, start two clients:
This will result in the assertion failure during the execution of DROP COLUMN.
| Comment by Marko Mäkelä [ 2018-07-26 ] |
|
I think that the following is happening:
We could try 2 fixes:
I believe that the first fix is easier and less risky. Converting the table to the canonical format is merely an optimization for performance and file format compatibility. If the table-rebuilding ALTER succeeds, the table would be in the canonical format again. If the ALTER fails, then the table would remain in the less efficient "instant" format.
| Comment by Marko Mäkelä [ 2018-07-26 ] |
|
Deterministic test case:
| Comment by Marko Mäkelä [ 2018-07-26 ] |
|
The test case needs 2 auxiliary connections to be deterministic. We must keep the read view open until after the ALTER TABLE has been sent. Also, adding a column with a default value of NULL is a special case. When adding a column with a NOT NULL DEFAULT value, we would have a similar problem in row_log_table_apply_convert_mrec().
It turns out to be simplest to introduce row_log_t::non_core_fields[] for storing the default values, and to rely on those during row_log_table_apply(). That way, we are free to invoke dict_index_t::remove_instant() on the source table at any time.
I implemented this fix, but it needs some adjustment, because my test is randomly crashing due to row_log_table_apply_op() not consuming all data for instantly added columns.
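To illustrate the idea of keeping the default values in the log itself, here is a minimal toy model in C++. It is only a sketch and not the InnoDB code: the types `Field`, `ToyIndex` and `ToyRowLog` and the function `rebuild_survives_remove_instant()` are hypothetical names made up for this example; only the member names `non_core_fields`, `remove_instant` and `is_instant` echo the identifiers mentioned above. The assumption is that a logged record may be shorter than the full row, and that the missing trailing fields must be filled from the defaults of the instantly added columns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these are NOT the InnoDB structures or signatures.
struct Field {
    bool is_null;
    std::string value;
};

// Hypothetical stand-in for the source table's clustered index.
struct ToyIndex {
    std::size_t n_core_fields;            // fields stored in every record
    std::vector<Field> instant_defaults;  // defaults of instantly added columns

    bool is_instant() const { return !instant_defaults.empty(); }
    // Models the return to the canonical format: the defaults are discarded.
    void remove_instant() { instant_defaults.clear(); }
};

// Hypothetical stand-in for the online rebuild log.
struct ToyRowLog {
    std::size_t n_core_fields;
    std::vector<Field> non_core_fields;  // snapshot taken when logging starts

    explicit ToyRowLog(const ToyIndex& index)
        : n_core_fields(index.n_core_fields),
          non_core_fields(index.instant_defaults) {}

    // Pad a short logged record to full width using only the snapshot,
    // never the (possibly already converted) source index.
    std::vector<Field> apply(const std::vector<Field>& logged) const {
        assert(logged.size() >= n_core_fields);
        std::vector<Field> row(logged);
        const std::size_t n_fields = n_core_fields + non_core_fields.size();
        for (std::size_t i = row.size(); i < n_fields; ++i) {
            row.push_back(non_core_fields[i - n_core_fields]);
        }
        return row;
    }
};

// Scenario: the source index loses its instant metadata while the log
// still has records to apply.
bool rebuild_survives_remove_instant() {
    ToyIndex index;
    index.n_core_fields = 2;
    index.instant_defaults.push_back(Field{false, "x"});  // NOT NULL DEFAULT 'x'
    index.instant_defaults.push_back(Field{true, ""});    // DEFAULT NULL

    ToyRowLog log(index);    // defaults are copied into the log here
    index.remove_instant();  // source table goes back to canonical format

    std::vector<Field> logged;
    logged.push_back(Field{false, "1"});
    logged.push_back(Field{false, "a"});
    const std::vector<Field> row = log.apply(logged);

    return !index.is_instant() && row.size() == 4 &&
           !row[2].is_null && row[2].value == "x" && row[3].is_null;
}
```

Because the toy log copies the defaults when it is created, applying a record never has to consult the source index, which is why the order of `remove_instant()` and `apply()` does not matter in this sketch.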
| Comment by Marko Mäkelä [ 2018-07-26 ] |
|
My fix ensures that the online_log is written and parsed in a consistent way. It was easier to change how online table rebuild works than to try to avoid calls to dict_index_t::remove_instant().