  1. MariaDB Server
  2. MDEV-21160

RocksDB gets corrupted and stops the server from running

Details

    Description

      I get this error message very often, on several machines.
      The server stops running and waits until a human being erases that file manually.
      This is a disaster. Mission-critical machines need to be converted back to InnoDB, which is very inefficient in terms of storage.
      The right design is that RocksDB fixes itself, erases any corrupt file and continues.

      [ERROR] RocksDB: The server will exit normally and stop restart attempts. Remove ./#rocksdb/ROCKSDB_CORRUPTED file from data directory and start mysqld manually.

        Activity

          > The right design is that Rocksdb fixes itself, erases any corrupt file and continues.

          I am looking at the code, and the logic around the ROCKSDB_CORRUPTED file was put there intentionally. Maybe there is some kind of error that requires user intervention?

          I see that above that line, you should have got this text:

                  "RocksDB: There was a corruption detected in RockDB files. "
                  "Check error log emitted earlier for more details.");
          

          Do you have it? Is there anything above that text that would give a clue about why RocksDB stopped?
          Another place to check is the $datadir/#rocksdb/LOG file

          psergei Sergei Petrunia added a comment
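As a rough illustration of the log check suggested above, here is a minimal Python sketch that lists lines mentioning corruption in a log file. The `/var/lib/mysql` data directory, the function name `find_matching_lines`, and the `corruption` keyword are assumptions made for the example; they are not taken from the server code, so adjust them to your setup.

```python
from pathlib import Path


def find_matching_lines(log_path, keywords=("corruption",)):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is case-insensitive. The default keyword is an assumption
    about how corruption errors are worded in the log.
    """
    needles = [k.lower() for k in keywords]
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(log_path, "r", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            lowered = line.lower()
            if any(n in lowered for n in needles):
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    datadir = Path("/var/lib/mysql")  # assumed data directory
    rocksdb_log = datadir / "#rocksdb" / "LOG"
    if rocksdb_log.exists():
        for number, text in find_matching_lines(rocksdb_log):
            print(f"{rocksdb_log}:{number}: {text}")
```

The same function can be pointed at the server error log; the lines just before the ROCKSDB_CORRUPTED message are the ones most likely to be of interest.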

          Looking at the code, I see that the ROCKSDB_CORRUPTED file is created when an operation on RocksDB returns a data corruption error. This should not normally happen (e.g. a server process crash or a machine power-off is not expected to cause this).

          We need to figure out what is causing the data corruption error.

          • The first step is to check the error log and RocksDB's LOG file.
          • Then, one could use the sst_dump utility to check the data directory for errors.
          psergei Sergei Petrunia added a comment
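The sst_dump step could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes `sst_dump` is on the PATH, that the SST files sit directly in `/var/lib/mysql/#rocksdb`, and that the `--file`, `--command=check` and `--verify_checksum` options behave as described in the RocksDB tool documentation. The helper names are hypothetical. It does not interpret exit codes and simply collects the output for a human to read. Run it only while the server is stopped or against a copy of the data directory.

```python
import shutil
import subprocess
from pathlib import Path


def build_sst_check_commands(rocksdb_dir, sst_dump="sst_dump"):
    """Build one sst_dump command line per *.sst file in rocksdb_dir."""
    commands = []
    for sst in sorted(Path(rocksdb_dir).glob("*.sst")):
        commands.append(
            [sst_dump, f"--file={sst}", "--command=check", "--verify_checksum"]
        )
    return commands


def run_checks(commands):
    """Run each command and return (command, return code, combined output)."""
    results = []
    for cmd in commands:
        done = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, done.returncode, done.stdout + done.stderr))
    return results


if __name__ == "__main__":
    rocksdb_dir = Path("/var/lib/mysql/#rocksdb")  # assumed location
    # Do nothing unless both the directory and the tool are present.
    if rocksdb_dir.is_dir() and shutil.which("sst_dump"):
        for cmd, code, output in run_checks(build_sst_check_commands(rocksdb_dir)):
            print(" ".join(cmd), "-> exit code", code)
            print(output)
```

Any file whose output mentions a checksum or corruption problem is a candidate for closer inspection, together with the matching entries in the error log and the LOG file.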

          I just saw this. The issue is that under no circumstances should the server stop and wait
          for a user to manually delete the file. If the fix is merely deleting a
          file, please have the software or the watchdog process do it. We use
          MariaDB for business.

          On Tue, Dec 3, 2019 at 8:39 AM Sergei Petrunia (Jira) <jira@mariadb.org>

          philip_38 Philip orleans added a comment

          People

            psergei Sergei Petrunia
            philip_38 Philip orleans