[MDEV-11423] Crashes on startup with Assert failure (6) on load with missing encryption key, Created: 2016-11-29 Updated: 2017-12-13 Resolved: 2017-12-13 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Encryption, Storage Engine - InnoDB |
| Affects Version/s: | 10.1.19 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Richard Oakham | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Debian Jessie, all current patches, 24GB mem (approx 50% in use), 1TB db storage (90% free) |
||
| Description |
|
Trying to restart with a deliberately missing key: for a normal reboot, key 1 is available but key 19 is not. The failure mode blocks upgrades from working correctly again. On the main server, MariaDB starts correctly when both keys are available, but fails when one key is missing. There is no assert on the development server (running the same config, acting as a replication slave). Attached is a syslog showing successful shutdowns, failed starts, and working starts (the latter have both key 1 and key 19 in the file; note that the failed starts report key 1 as not found, although it is in the file at all times). The error persists after a full reboot: the MariaDB instance fails to start until both keys are provided. |
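For anyone trying to reproduce this, it may help to confirm which key IDs the key file actually contains before each start. The sketch below is only an illustration, not part of the original report: it assumes the keys are stored in an unencrypted key file in the `id;hexkey` line format used by the file_key_management plugin (the report does not say which key management plugin is in use), and the function names `parse_key_ids` and `missing_keys` are hypothetical.

```python
def parse_key_ids(text):
    """Return the set of key IDs found in key-file text.

    Assumption: each key line has the form "<id>;<hex key>", and lines
    starting with "#" are comments (file_key_management style).
    """
    ids = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key_id, sep, _hex_key = line.partition(";")
        if sep and key_id.strip().isdigit():
            ids.add(int(key_id.strip()))
    return ids


def missing_keys(text, required):
    """Return the required key IDs that are absent from the key file, sorted."""
    return sorted(set(required) - parse_key_ids(text))


if __name__ == "__main__":
    # Sample key file content with key 1 only; the hex value is a dummy placeholder.
    sample = "# test keys\n1;00112233445566778899aabbccddeeff\n"
    print(missing_keys(sample, [1, 19]))
```

Running it against the contents of the real key file with the IDs the tables use (1 and 19 in this report) would show whether the missing-key scenario is set up as intended.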
| Comments |
| Comment by Jan Lindström (Inactive) [ 2016-12-13 ] |
|
Can you provide us with the database where this happens, or a test case that reproduces your problem? We already have a number of tests where we try to start an encrypted database with missing keys, and I have not yet seen this assertion. |
| Comment by Richard Oakham [ 2016-12-13 ] |
|
I can't provide the full database (it includes personal and restricted information), but I'll try to produce a "mini" version that has the same issue. Unfortunately this is occurring on the live server instance, so any testing is hard. Is there any indication of what this was trying to assert, so I can focus my attempts? The other option is a debug version of that Jessie build, if there is one, so I can try to grab a more complete stack trace as I did for the last one. |
| Comment by Jan Lindström (Inactive) [ 2016-12-13 ] |
|
I can see this: Assertion failure in thread 140230388594432 in file btr0pcur.cc line 119. I can guess why it would assert there; what I do not know is why it reached that place, as we should have detected earlier that the record in the persistent cursor is most likely NULL, i.e. we failed to read the page from the tablespace because the page is encrypted and we have noted that we do not have the key to decrypt it. So please get me a more understandable stack trace if you can, e.g. using https://dev.mysql.com/doc/refman/5.5/en/using-stack-trace.html |
| Comment by Richard Oakham [ 2016-12-13 ] |
|
Working on it now (better stack trace) |
| Comment by Richard Oakham [ 2016-12-13 ] |
|
Unfortunately, a restart gets me a SIGSEGV (11) again, like it did with 10.1.18. It starts successfully with both keys. See the newly attached syslog (not yet converted for an additional stack trace). |
| Comment by Jan Lindström (Inactive) [ 2016-12-13 ] |
|
Can you convert both the original and this new stack trace based on the instructions? I will then try to investigate. |
| Comment by Richard Oakham [ 2016-12-13 ] |
|
I tried with those instructions but didn't get anything recognisable (even with demangling). A debug build of 10.1.19 is compiling now, and I have a one-hour maintenance window tomorrow, so I'll try with that one to get a better trace. I did wonder if that latest crash was caused by an attempt to run a query whilst the server was starting up again (it got a connection but wasn't ready for it). During maintenance no one will be connecting, so I hope to see the Assert again instead. |
| Comment by Jan Lindström (Inactive) [ 2017-12-13 ] |
|
In my understanding, a number of crashes that occur when you start the server with an incorrect key have already been fixed, e.g. |