[MDEV-19660] wsrep_rec_get_foreign_key() is dereferencing a stale pointer to a page that was previously latched Created: 2019-05-31 Updated: 2020-08-28 Resolved: 2019-07-09 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Storage Engine - InnoDB, Storage Engine - XtraDB |
| Affects Version/s: | 5.5.33a-galera, 10.0.19-galera, 10.1.6, 10.2.0, 10.3.0, 10.4.0 |
| Fix Version/s: | 10.2.26, 10.1.41, 10.3.17, 10.4.7 |
| Type: | Bug | Priority: | Major |
| Reporter: | Thirunarayanan Balathandayuthapani | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | corruption, foreign-keys, race | ||
| Description |
|
In row_ins_foreign_check_on_constraint(), clustered index record is being passed to wsrep_append_foreign_key() after releasing the latch. If a record has been changed by other thread in the meantime then it could lead to a crash when The following is the problematic code :
|
| Comments |
| Comment by Marko Mäkelä [ 2019-05-31 ] |
|
As far as I can tell, this was introduced in 5.5.25-galera, 10.0.19-galera, 10.1.6. |
| Comment by Jan Lindström (Inactive) [ 2019-06-04 ] |
|
In row_ins_check_foreign_constraint same function is called inside a active mtr. |
| Comment by Jan Lindström (Inactive) [ 2019-06-04 ] |
|
https://github.com/MariaDB/server/commit/42a1ad314700b705077333f42393250c978c92d7 |
| Comment by Marko Mäkelä [ 2019-06-10 ] |
|
I think that we need a test case that exercises the error handling code. Note: I am not asking for a test that reproduces the race condition. Also, please address my review comments regarding the code changes. |
| Comment by Marko Mäkelä [ 2019-06-10 ] |
|
I wonder if we could simply replace clust_rec with cascade->pcur->old_rec in the call. |
| Comment by Jan Lindström (Inactive) [ 2019-06-18 ] |
|
https://github.com/MariaDB/server/commit/859052dfcc3da3d61e3023c7401ef009cea50bcd |
| Comment by Marko Mäkelä [ 2019-06-25 ] |
|
Can we merely replace the clust_rec with cascade->pcur->old_rec and omit all other changes? I am concerned about adding so much code for error handling or reporting, especially when that code is not being backed by an ‘organic’ test case that does not resort to fault injection. |
| Comment by Jan Lindström (Inactive) [ 2019-07-01 ] |
|
Sure, I will do that for 10.1-10.4. |