MariaDB Server / MDEV-11423

Crashes on startup with Assert failure (6) on load with missing encryption key

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.1.19
    • N/A
    • None
    • Debian Jessie, all current patches, 24GB mem (approx 50% in use), 1TB db storage (90% free)

    Description

      Trying to restart with a deliberately missing key: key 1 is available, but key 19 is not, as for a normal reboot. This failure mode again blocks upgrades from working correctly.

      On the main server, MariaDB starts correctly when both keys are available, but fails when one key is missing. There is no assert on the development server (running the same config, acting as a replication slave).

      Attached is a syslog showing successful shutdowns, failed starts, and working starts. The working starts have both key 1 and key 19 in the file. Note that the failed starts report key 1 as not found, even though it is in the file at all times.

      The error persists after a full reboot: the MariaDB instance fails to start until both keys are provided.
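To rule out a malformed key file before a restart, it can help to list which key IDs the file actually contains. The following is a minimal sketch, assuming the keys are kept in the plain-text `id;hex-key` format used by the file_key_management plugin, with `#` comment lines. The report does not state which key management plugin is in use, and `missing_key_ids` is a hypothetical helper name, not part of MariaDB.

```python
def missing_key_ids(key_file_text, required_ids):
    """Return the required key IDs that do not appear in the key file text.

    Assumes one 'id;hex-key' entry per line (an assumption about the
    key file format; adjust if a different plugin or format is used).
    """
    present = set()
    for line in key_file_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key_id, sep, _hex_key = line.partition(";")
        # Only count lines that have a separator and a numeric ID.
        if sep and key_id.strip().isdigit():
            present.add(int(key_id))
    return sorted(set(required_ids) - present)


# Made-up key material, for illustration only.
sample = "1;" + "a" * 64 + "\n"
print(missing_key_ids(sample, [1, 19]))
```

With the sample above, only key 19 is reported as missing, which matches the deliberately incomplete setup described in this report.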

          Activity

            jplindst Jan Lindström (Inactive) added a comment - Can you provide us with the database where this happens, or a test case that reproduces your problem? We already have a number of tests where we try to start an encrypted database with missing keys, and I have not yet seen this assertion.

            Oakham Richard Oakham added a comment - I can't provide the full database (it includes personal and restricted information), but I'll try to produce a "mini" version that has the same issue. Unfortunately this is occurring on the live server instance, so any testing is hard.

            Is there any indication of what this was trying to assert, so I can focus my attempts? The other option is a debug version of that Jessie build, if there is one, so I can try to grab a more complete stack trace as I did for the last one.

            jplindst Jan Lindström (Inactive) added a comment - I can see this assertion failure in thread 140230388594432 in file btr0pcur.cc line 119, and I can guess why it would assert there. What I do not know is why execution reached that place, as we should have detected earlier that the record in the persistent cursor is most likely NULL, i.e. we failed to read the page from the tablespace because the page is encrypted and we have noted that we do not have the key to decrypt it. It would help if you could get me a more understandable stack trace, e.g. using https://dev.mysql.com/doc/refman/5.5/en/using-stack-trace.html
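The reasoning in the comment above can be illustrated with a small model. This is only a sketch of the general pattern (detect the failed read early and report it, rather than reaching a later assertion on a missing record). It is not InnoDB code, and every name in it (`PageDecryptError`, `read_record`, `position_cursor`, the dict-based page) is hypothetical.

```python
class PageDecryptError(Exception):
    """Raised when a page cannot be decrypted because its key is unavailable."""


def read_record(page, keys):
    """Return the page's record, or None if the page's key is missing."""
    if page["key_id"] not in keys:
        return None  # the read failed; there is no valid record to return
    return page["record"]


def position_cursor(page, keys):
    """Position a cursor on the record, failing cleanly if nothing was read."""
    record = read_record(page, keys)
    if record is None:
        # Early detection: report the missing key instead of continuing.
        raise PageDecryptError(f"no key {page['key_id']} to decrypt page")
    # Analogue of the later assertion: by now the record must exist.
    assert record is not None
    return record
```

In this model, a caller that skipped the `None` check would be the one to trip the assertion, which is the situation the comment describes as unexpected.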

            Oakham Richard Oakham added a comment - Working on it now (better stack trace).

            Oakham Richard Oakham added a comment - Unfortunately a restart gets me a SIGSEGV (11) again, as it did with 10.1.18. It starts successfully with both keys. See the new attached syslog (not yet converted for the additional stack trace).

            jplindst Jan Lindström (Inactive) added a comment - Can you convert both the original and this new stack trace following the instructions? I will then try to investigate.

            Oakham Richard Oakham added a comment - I tried those instructions but didn't get anything recognisable (even with demangling). A debug build of 10.1.19 is compiling now, and I have a one-hour maintenance window tomorrow, so I'll try with that build to get a better trace.

            I did wonder whether that latest crash was caused by an attempt to run a query while the server was starting up again (it got a connection but wasn't ready for it). During maintenance no one will be connecting, so I hope to see the Assert again instead.

            jplindst Jan Lindström (Inactive) added a comment - As I understand it, a number of crashes that occur when the server is started with an incorrect key have already been fixed, e.g. MDEV-12253. Please try a more recent release, and if you can still reproduce this issue, reopen it with detailed information.

            People

              jplindst Jan Lindström (Inactive)
              Oakham Richard Oakham
              Votes: 0
              Watchers: 3

