Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18606

innodb crashes on large update and it gets corrupted

Details

    Description

      CREATE TABLE npadata (
        state varchar(2) NOT NULL DEFAULT '??',
        company varchar(30) DEFAULT NULL,
        ocn varchar(4) NOT NULL DEFAULT 'N/A',
        prefix_type varchar(10) DEFAULT NULL,
        ratecenter varchar(40) DEFAULT NULL,
        clli varchar(20) DEFAULT NULL,
        lata varchar(5) DEFAULT NULL,
        country varchar(2) DEFAULT NULL,
        wireless tinyint(1) UNSIGNED NOT NULL DEFAULT 0,
        did bigint(20) UNSIGNED NOT NULL,
        lrn bigint(20) UNSIGNED DEFAULT NULL,
        port_type tinyint(4) UNSIGNED DEFAULT NULL,
        dnc tinyint(4) UNSIGNED DEFAULT NULL,
        reachable tinyint(4) UNSIGNED DEFAULT NULL,
        npa varchar(3) DEFAULT NULL,
        nxx varchar(3) DEFAULT NULL,
        xxxx varchar(4) DEFAULT NULL,
        tested tinyint(1) DEFAULT 0,
        lastchange date NOT NULL DEFAULT 'curdate()',
        PRIMARY KEY (did)
      )
      ALTER TABLE npadata
      ADD INDEX IDX_npadata3 (reachable, tested);
      

      I was executing:

      update npadata set tested=0, reachable=0 where reachable=1 or tested=1;


      and MariaDB crashed. When it restared this was in the log. I have no idea how to get my data working again. The table has 1.97BN records.

      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_CFSTATS' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_DBSTATS' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_PERF_CONTEXT' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_PERF_CONTEXT_GLOBAL' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_CF_OPTIONS' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_COMPACTION_STATS' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_GLOBAL_INFO' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_DDL' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_SST_PROPS' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_INDEX_FILE_MAP' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_LOCKS' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_TRX' already installed
      2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_DEADLOCK' already installed
      2019-02-16 18:47:42 0 [Note] RocksDB: 2 column families found
      2019-02-16 18:47:42 0 [Note] RocksDB: Column Families at start:
      2019-02-16 18:47:42 0 [Note]   cf=default
      2019-02-16 18:47:42 0 [Note]     write_buffer_size=268435456
      2019-02-16 18:47:42 0 [Note]     target_file_size_base=33554432
      2019-02-16 18:47:42 0 [Note]   cf=__system__
      2019-02-16 18:47:42 0 [Note]     write_buffer_size=268435456
      2019-02-16 18:47:42 0 [Note]     target_file_size_base=33554432
      2019-02-16 18:47:42 0 [Note] RocksDB: Table_store: loaded DDL data for 1 tables
      2019-02-16 18:47:42 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
      2019-02-16 18:47:42 0 [Note] MyRocks storage engine plugin has been successfully initialized.
      2019-02-16 18:47:42 0 [Note] InnoDB: Using Linux native AIO
      2019-02-16 18:47:42 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2019-02-16 18:47:42 0 [Note] InnoDB: Uses event mutexes
      2019-02-16 18:47:42 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2019-02-16 18:47:42 0 [Note] InnoDB: Number of pools: 1
      2019-02-16 18:47:42 0 [Note] InnoDB: Using SSE2 crc32 instructions
      2019-02-16 18:47:42 0 [Note] InnoDB: Initializing buffer pool, total size = 15G, instances = 8, chunk size = 128M
      2019-02-16 18:48:13 0 [Note] InnoDB: Completed initialization of buffer pool
      2019-02-16 18:48:13 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2019-02-16 18:48:13 0 [ERROR] InnoDB: Invalid redo log header checksum.
      2019-02-16 18:48:13 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
      2019-02-16 18:48:13 0 [Note] InnoDB: Starting shutdown...
      2019-02-16 18:48:14 0 [Warning] InnoDB: Failed to set memory to DODUMP: Invalid argument ptr 0x7fc2c0000000 size 134217728
      2019-02-16 18:48:14 0 [Warning] InnoDB: Failed to set memory to DODUMP: Invalid argument ptr 0x7fc280000000 size 2097152
      2019-02-16 18:48:16 0 [ERROR] Plugin 'InnoDB' init function returned error.
      2019-02-16 18:48:16 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
      2019-02-16 18:48:16 0 [Note] Plugin 'FEEDBACK' is disabled.
      2019-02-16 18:48:16 0 [ERROR] Unknown/unsupported storage engine: innodb
      2019-02-16 18:48:16 0 [ERROR] Aborting
      

      Attachments

        Issue Links

          Activity

            This is the issue.
            https://bugs.mysql.com/bug.php?id=84191
            I run out of space and Innodb got terminally corrupted.
            This is a big bug. The engine should never get corrupted to the point that we need to export the data for ALL databases, which can take days, and reimport it a new installation, which can also take days. Innodb is absolutely useless. I will stick with RocksDB

            philip_38 Philip orleans added a comment - This is the issue. https://bugs.mysql.com/bug.php?id=84191 I run out of space and Innodb got terminally corrupted. This is a big bug. The engine should never get corrupted to the point that we need to export the data for ALL databases, which can take days, and reimport it a new installation, which can also take days. Innodb is absolutely useless. I will stick with RocksDB

            I'll leave it to marko to decide what can be done about it.

            elenst Elena Stepanova added a comment - I'll leave it to marko to decide what can be done about it.

            If InnoDB runs out of space or encounters an error when writing redo log, then I think that the only option would be to switch to read-only mode instantly.

            Alternatively, InnoDB should commit suicide immediately, and not ignore any errors when writing the redo log. Also in that way, crash recovery should succeed (at least after enough space has been made available in the file system).

            The upstream bug suggests that the problem occurred when extending the InnoDB system tablespace. In MariaDB, the file system extension was improved in MDEV-11520, and the logic could be slightly different. But, ultimately I am afraid that InnoDB would still crash if no space is available for data pages. This would be fixed in MDEV-13542.

            philip_38, can you please include the error log from before the failed restart. What exactly caused the crash?

            If you still have the dataset, I would like to diagnose what caused this message:

            2019-02-16 18:48:13 0 [ERROR] InnoDB: Invalid redo log header checksum.

            Would it be possible to start a debug version of MariaDB on the files, and add --debug=d,ib_log to the startup options? That would give some hints where the redo log read went bad. Alternatively, if you started mysqld under strace, we could try to figure it out from that output.

            marko Marko Mäkelä added a comment - If InnoDB runs out of space or encounters an error when writing redo log, then I think that the only option would be to switch to read-only mode instantly. Alternatively, InnoDB should commit suicide immediately, and not ignore any errors when writing the redo log. Also in that way, crash recovery should succeed (at least after enough space has been made available in the file system). The upstream bug suggests that the problem occurred when extending the InnoDB system tablespace. In MariaDB, the file system extension was improved in MDEV-11520 , and the logic could be slightly different. But, ultimately I am afraid that InnoDB would still crash if no space is available for data pages. This would be fixed in MDEV-13542 . philip_38 , can you please include the error log from before the failed restart. What exactly caused the crash? If you still have the dataset, I would like to diagnose what caused this message: 2019-02-16 18:48:13 0 [ERROR] InnoDB: Invalid redo log header checksum. Would it be possible to start a debug version of MariaDB on the files, and add --debug=d,ib_log to the startup options? That would give some hints where the redo log read went bad. Alternatively, if you started mysqld under strace , we could try to figure it out from that output.

            I have since erased the whole directory and are still inserting the data into RocksDB. It is only at 30% as of this point.
            But I will simulate the crash and let you know, It is pretty easy to do.

            philip_38 Philip orleans added a comment - I have since erased the whole directory and are still inserting the data into RocksDB. It is only at 30% as of this point. But I will simulate the crash and let you know, It is pretty easy to do.

            The issue with btrfs is this
            https://askubuntu.com/questions/170044/btrfs-and-missing-free-space

            you can run out of space but it's not your fault or a miscalculation. This evil file system will run of space on its own.
            Never use it, the best option is XFS.

            philip_38 Philip orleans added a comment - The issue with btrfs is this https://askubuntu.com/questions/170044/btrfs-and-missing-free-space you can run out of space but it's not your fault or a miscalculation. This evil file system will run of space on its own. Never use it, the best option is XFS.

            This ticket is linked to MDEV-13542, a fix of which is currently being tested, along with a fix of MDEV-13680.

            But, in the included log, I do not see anything that could be addressed by those fixes, or possibly in InnoDB at all.

            When it comes to allocating redo log or data pages (be it undo log, BLOB, or index pages), InnoDB should always extend the file upfront. If extending the file or internally allocating a page in an already extended file fails, InnoDB should crash or error out at that point. (This part would be improved in MDEV-13680.)

            Based on the included log, the InnoDB redo log was apparently corrupted, likely due to an error in btrfs. There is MDEV-27449 and other tickets that indicate problems with btrfs.

            philip_38, have you experienced this problem with any other file system than btrfs?

            marko Marko Mäkelä added a comment - This ticket is linked to MDEV-13542 , a fix of which is currently being tested, along with a fix of MDEV-13680 . But, in the included log, I do not see anything that could be addressed by those fixes, or possibly in InnoDB at all. When it comes to allocating redo log or data pages (be it undo log, BLOB, or index pages), InnoDB should always extend the file upfront. If extending the file or internally allocating a page in an already extended file fails, InnoDB should crash or error out at that point. (This part would be improved in MDEV-13680 .) Based on the included log, the InnoDB redo log was apparently corrupted, likely due to an error in btrfs. There is MDEV-27449 and other tickets that indicate problems with btrfs . philip_38 , have you experienced this problem with any other file system than btrfs ?

            philip_38, I might interpret your comments as "I moved away from btrfs and never looked back." Did you ever encounter this problem while not using btrfs?

            marko Marko Mäkelä added a comment - philip_38 , I might interpret your comments as "I moved away from btrfs and never looked back." Did you ever encounter this problem while not using btrfs ?

            I stopped using Btrfs long ago. Not suitable for production.

            philip_38 Philip orleans added a comment - I stopped using Btrfs long ago. Not suitable for production.

            People

              marko Marko Mäkelä
              philip_38 Philip orleans
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.