[MDEV-18606] innodb crashes on large update and it gets corrupted - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Incomplete
Affects Version/s: 10.3.12
Fix Version/s: N/A
Component/s: Storage Engine - InnoDB
Labels:
- innodb
- upstream-wontfix
Environment:
Ubuntu

Description

CREATE TABLE npadata (

  state varchar(2) NOT NULL DEFAULT '??',

  company varchar(30) DEFAULT NULL,

  ocn varchar(4) NOT NULL DEFAULT 'N/A',

  prefix_type varchar(10) DEFAULT NULL,

  ratecenter varchar(40) DEFAULT NULL,

  clli varchar(20) DEFAULT NULL,

  lata varchar(5) DEFAULT NULL,

  country varchar(2) DEFAULT NULL,

  wireless tinyint(1) UNSIGNED NOT NULL DEFAULT 0,

  did bigint(20) UNSIGNED NOT NULL,

  lrn bigint(20) UNSIGNED DEFAULT NULL,

  port_type tinyint(4) UNSIGNED DEFAULT NULL,

  dnc tinyint(4) UNSIGNED DEFAULT NULL,

  reachable tinyint(4) UNSIGNED DEFAULT NULL,

  npa varchar(3) DEFAULT NULL,

  nxx varchar(3) DEFAULT NULL,

  xxxx varchar(4) DEFAULT NULL,

  tested tinyint(1) DEFAULT 0,

  lastchange date NOT NULL DEFAULT 'curdate()',

  PRIMARY KEY (did)

ALTER TABLE npadata

ADD INDEX IDX_npadata3 (reachable, tested);

I was executing:

update npadata set tested=0, reachable=0 where reachable=1 or tested=1;

and MariaDB crashed. When it restared this was in the log. I have no idea how to get my data working again. The table has 1.97BN records.

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_CFSTATS' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_DBSTATS' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_PERF_CONTEXT' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_PERF_CONTEXT_GLOBAL' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_CF_OPTIONS' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_COMPACTION_STATS' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_GLOBAL_INFO' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_DDL' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_SST_PROPS' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_INDEX_FILE_MAP' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_LOCKS' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_TRX' already installed

2019-02-16 18:47:42 0 [ERROR] mysqld: Plugin 'ROCKSDB_DEADLOCK' already installed

2019-02-16 18:47:42 0 [Note] RocksDB: 2 column families found

2019-02-16 18:47:42 0 [Note] RocksDB: Column Families at start:

2019-02-16 18:47:42 0 [Note]   cf=default

2019-02-16 18:47:42 0 [Note]     write_buffer_size=268435456

2019-02-16 18:47:42 0 [Note]     target_file_size_base=33554432

2019-02-16 18:47:42 0 [Note]   cf=__system__

2019-02-16 18:47:42 0 [Note]     write_buffer_size=268435456

2019-02-16 18:47:42 0 [Note]     target_file_size_base=33554432

2019-02-16 18:47:42 0 [Note] RocksDB: Table_store: loaded DDL data for 1 tables

2019-02-16 18:47:42 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer

2019-02-16 18:47:42 0 [Note] MyRocks storage engine plugin has been successfully initialized.

2019-02-16 18:47:42 0 [Note] InnoDB: Using Linux native AIO

2019-02-16 18:47:42 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins

2019-02-16 18:47:42 0 [Note] InnoDB: Uses event mutexes

2019-02-16 18:47:42 0 [Note] InnoDB: Compressed tables use zlib 1.2.11

2019-02-16 18:47:42 0 [Note] InnoDB: Number of pools: 1

2019-02-16 18:47:42 0 [Note] InnoDB: Using SSE2 crc32 instructions

2019-02-16 18:47:42 0 [Note] InnoDB: Initializing buffer pool, total size = 15G, instances = 8, chunk size = 128M

2019-02-16 18:48:13 0 [Note] InnoDB: Completed initialization of buffer pool

2019-02-16 18:48:13 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().

2019-02-16 18:48:13 0 [ERROR] InnoDB: Invalid redo log header checksum.

2019-02-16 18:48:13 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption

2019-02-16 18:48:13 0 [Note] InnoDB: Starting shutdown...

2019-02-16 18:48:14 0 [Warning] InnoDB: Failed to set memory to DODUMP: Invalid argument ptr 0x7fc2c0000000 size 134217728

2019-02-16 18:48:14 0 [Warning] InnoDB: Failed to set memory to DODUMP: Invalid argument ptr 0x7fc280000000 size 2097152

2019-02-16 18:48:16 0 [ERROR] Plugin 'InnoDB' init function returned error.

2019-02-16 18:48:16 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

2019-02-16 18:48:16 0 [Note] Plugin 'FEEDBACK' is disabled.

2019-02-16 18:48:16 0 [ERROR] Unknown/unsupported storage engine: innodb

2019-02-16 18:48:16 0 [ERROR] Aborting

Attachments

Issue Links

relates to

MDEV-13542 Crashing on a corrupted page is unhelpful

Closed

MDEV-14425 Change the InnoDB redo log format to reduce write amplification

Closed

MDEV-27449 Database crashes on startup InnoDB: Assertion failure in file ./storage/innobase/btr/btr0cur.cc line 297

Closed

Activity

Ascending order - Click to sort in descending order

Philip orleans added a comment - 2019-02-17 15:45

This is the issue.
https://bugs.mysql.com/bug.php?id=84191
I run out of space and Innodb got terminally corrupted.
This is a big bug. The engine should never get corrupted to the point that we need to export the data for ALL databases, which can take days, and reimport it a new installation, which can also take days. Innodb is absolutely useless. I will stick with RocksDB

Philip orleans added a comment - 2019-02-17 15:45 This is the issue. https://bugs.mysql.com/bug.php?id=84191 I run out of space and Innodb got terminally corrupted. This is a big bug. The engine should never get corrupted to the point that we need to export the data for ALL databases, which can take days, and reimport it a new installation, which can also take days. Innodb is absolutely useless. I will stick with RocksDB

Elena Stepanova added a comment - 2019-02-19 01:07

I'll leave it to marko to decide what can be done about it.

Elena Stepanova added a comment - 2019-02-19 01:07 I'll leave it to marko to decide what can be done about it.

Marko Mäkelä added a comment - 2019-02-19 07:56

If InnoDB runs out of space or encounters an error when writing redo log, then I think that the only option would be to switch to read-only mode instantly.

Alternatively, InnoDB should commit suicide immediately, and not ignore any errors when writing the redo log. Also in that way, crash recovery should succeed (at least after enough space has been made available in the file system).

The upstream bug suggests that the problem occurred when extending the InnoDB system tablespace. In MariaDB, the file system extension was improved in ~~MDEV-11520~~, and the logic could be slightly different. But, ultimately I am afraid that InnoDB would still crash if no space is available for data pages. This would be fixed in ~~MDEV-13542~~.

philip_38, can you please include the error log from before the failed restart. What exactly caused the crash?

If you still have the dataset, I would like to diagnose what caused this message:

2019-02-16 18:48:13 0 [ERROR] InnoDB: Invalid redo log header checksum.

Would it be possible to start a debug version of MariaDB on the files, and add --debug=d,ib_log to the startup options? That would give some hints where the redo log read went bad. Alternatively, if you started mysqld under strace, we could try to figure it out from that output.

Marko Mäkelä added a comment - 2019-02-19 07:56 If InnoDB runs out of space or encounters an error when writing redo log, then I think that the only option would be to switch to read-only mode instantly. Alternatively, InnoDB should commit suicide immediately, and not ignore any errors when writing the redo log. Also in that way, crash recovery should succeed (at least after enough space has been made available in the file system). The upstream bug suggests that the problem occurred when extending the InnoDB system tablespace. In MariaDB, the file system extension was improved in MDEV-11520 , and the logic could be slightly different. But, ultimately I am afraid that InnoDB would still crash if no space is available for data pages. This would be fixed in MDEV-13542 . philip_38 , can you please include the error log from before the failed restart. What exactly caused the crash? If you still have the dataset, I would like to diagnose what caused this message: 2019-02-16 18:48:13 0 [ERROR] InnoDB: Invalid redo log header checksum. Would it be possible to start a debug version of MariaDB on the files, and add --debug=d,ib_log to the startup options? That would give some hints where the redo log read went bad. Alternatively, if you started mysqld under strace , we could try to figure it out from that output.

Philip orleans added a comment - 2019-02-19 15:21

I have since erased the whole directory and are still inserting the data into RocksDB. It is only at 30% as of this point.
But I will simulate the crash and let you know, It is pretty easy to do.

Philip orleans added a comment - 2019-02-19 15:21 I have since erased the whole directory and are still inserting the data into RocksDB. It is only at 30% as of this point. But I will simulate the crash and let you know, It is pretty easy to do.

Philip orleans added a comment - 2019-02-24 21:32

The issue with btrfs is this
https://askubuntu.com/questions/170044/btrfs-and-missing-free-space

you can run out of space but it's not your fault or a miscalculation. This evil file system will run of space on its own.
Never use it, the best option is XFS.

Philip orleans added a comment - 2019-02-24 21:32 The issue with btrfs is this https://askubuntu.com/questions/170044/btrfs-and-missing-free-space you can run out of space but it's not your fault or a miscalculation. This evil file system will run of space on its own. Never use it, the best option is XFS.

Marko Mäkelä added a comment - 2022-05-16 14:47

This ticket is linked to ~~MDEV-13542~~, a fix of which is currently being tested, along with a fix of ~~MDEV-13680~~.

But, in the included log, I do not see anything that could be addressed by those fixes, or possibly in InnoDB at all.

When it comes to allocating redo log or data pages (be it undo log, BLOB, or index pages), InnoDB should always extend the file upfront. If extending the file or internally allocating a page in an already extended file fails, InnoDB should crash or error out at that point. (This part would be improved in ~~MDEV-13680~~.)

Based on the included log, the InnoDB redo log was apparently corrupted, likely due to an error in btrfs. There is ~~MDEV-27449~~ and other tickets that indicate problems with btrfs.

philip_38, have you experienced this problem with any other file system than btrfs?

Marko Mäkelä added a comment - 2022-05-16 14:47 This ticket is linked to MDEV-13542 , a fix of which is currently being tested, along with a fix of MDEV-13680 . But, in the included log, I do not see anything that could be addressed by those fixes, or possibly in InnoDB at all. When it comes to allocating redo log or data pages (be it undo log, BLOB, or index pages), InnoDB should always extend the file upfront. If extending the file or internally allocating a page in an already extended file fails, InnoDB should crash or error out at that point. (This part would be improved in MDEV-13680 .) Based on the included log, the InnoDB redo log was apparently corrupted, likely due to an error in btrfs. There is MDEV-27449 and other tickets that indicate problems with btrfs . philip_38 , have you experienced this problem with any other file system than btrfs ?

Marko Mäkelä added a comment - 2022-09-01 06:11

philip_38, I might interpret your comments as "I moved away from btrfs and never looked back." Did you ever encounter this problem while not using btrfs?

Marko Mäkelä added a comment - 2022-09-01 06:11 philip_38 , I might interpret your comments as "I moved away from btrfs and never looked back." Did you ever encounter this problem while not using btrfs ?

Philip orleans added a comment - 2022-09-01 11:15

I stopped using Btrfs long ago. Not suitable for production.

Philip orleans added a comment - 2022-09-01 11:15 I stopped using Btrfs long ago. Not suitable for production.

People

Assignee:: Marko Mäkelä

Reporter:: Philip orleans

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2019-02-17 03:03

Updated:: 2022-10-02 19:02

Resolved:: 2022-10-02 19:02

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server