[MDEV-15199] Triangular FKs - Cascade delete causes broken referential integrity Created: 2018-02-03 Updated: 2021-03-09 Resolved: 2018-02-07 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Manipulation - Delete |
| Affects Version/s: | 10.2.2, 10.3.0 |
| Fix Version/s: | 10.2.13, 10.3.5 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Oliver Schonrock | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | delete, foreign-keys, innodb, upstream-fixed | ||
| Environment: |
FreeBSD ports |
||
| Attachments: |
|
||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||
| Description |
|
See SQL test case below for detail. See this thread for problem discovery: For a "triangular" table FK structure (see attached png visual) deleting parent record, should delete both children, but the ON DELETE CASCADE does not delete record in 2nd child table when 2nd child has ON DELETE SET NULL constraint to 1st child Referential integrity is broken after parent is deleted. LEFT JOINs do not show broken FK. MySQL 5.7.21 and MariaDB 10.1.30 do not exhibit this broken behaviour.
|
| Comments |
| Comment by Elena Stepanova [ 2018-02-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thanks for the report and the test case. Reproducible as described on 10.2, including older 10.2 releases (lowest that I tried was 10.2.4). Not reproducible on 10.1 and MySQL 5.7. Debug builds abort with the assertion failure upon DELETE:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
@Elena Stepanova Thank you for testing and confirming that I am not going insane. Happy to compile / test any patches if that helps. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
On a related note, | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I can repeat the debug assertion failure with MySQL 5.7.20 but not MySQL 5.7.21. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
@marko
Tested update for the test case above like this:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
@marko
Thanks. Great find of the upstream change from 5.7.20 => 5.7.21 That heap based FK cascade code has been there since MySQL 5.7.2. That might suggest that this broken FK behvaiour has existed in MySQL since 5.7.2: We noticed the broken Referential Integrity within 2 weeks of using an affected version. Surely we are not the only MariaDB or MySQL user who had used FKs like this since 2013? More likely some later commit, closer to 5.7.20, interacted with the heap based, iterative FK cascade code to cause this bug and when they reverted to stack based recursion, that removed this problem also. As you say "it was a side effect"? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
oschonrock, I believe that the MySQL 5.7.2 change may have introduced this corruption, but nobody noticed until now. For what it is worth, if you look at the commit, it is changing the file mysql-test/suite/innodb/r/innodb-index-online-fk.result. This result change looks acceptable to me, as there are multiple FOREIGN KEY constraints between the tables, and the order of evaluating SET NULL and CASCADE is affected based on the code change. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
marko, you're right. Notwithstanding my limited understanding of that code, that does indeed look like it might have been the trigger for the cascade delete break. Wow, so that means potentially many users since GA of MySQL 5.7 and MariaDB 10.2, who are using such FK structures have these "failed cascade delete" orphan records lying around. Perhaps what we are doing is more unusual than I thought, or perhaps it is down to "not being noticed". As stated in the original report, finding the "orphaned" record is not obvious. By FK doesn't work:
and a LEFT JOIN won't find it either as shown in original report. As I explained, the only way I found it, is by mysqldump followed by inserting that dump and then running the LEFT JOIN. If this gets fixed by merging the upstream fix/revert to stack based recursion for FK cascade, would it make sense to alert MariaDB users (and MySQL?) that they may have these orphaned records silently lying around in their DB? BTW: I also double confirmed that MySQL 5.7.20 exhibits the broken behaviour (while 5.7.21 does not). Thanks for your help. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
marko For what it's worth, once the broken CASCADE DELETE has occurred, a server restart does not restore sanity.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
marko I had a quick try to see what other methods might detect the problem, so users can see if they have been affected. And then what options exist to fix broken DBs without dump/reload. CHECK TABLE reports the problem:
and obviously, so does mysqlcheck
For InnoDB, the only suggested option for repair is:
What I did on our production DB (not wanting to take it down, dump & insert) is a I manually deleted the orphaned record by its PK (which is hard to know in a real dataset). Having done Some trial and error shows that this manual DELETE is easier if you do a "null alteration" first (which makes CHECK TABLE happy), then run the LEFT JOIN, which now works, and then you can easily clean up the orphans.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oliver Schonrock [ 2018-02-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
marko Thank for fixing this so quickly. We have now rolled this out on production via 10.2.13. Works great! By the time we rolled it out we had dozens of broken FKs on our production DB which we cleaned up by technique I described above. I still find it astonishing that no-one since 10.2.4 (GA since > 2 years!) had come across this. I suspect there must be quite a few users out there with broken FKs without realising it (since the broken LEFT JOIN hides them). |