Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Description
Upon next startup after an intentional crash (during DML flow), Aria recovery does not return any errors:
10.2 5ec9b88e1111 |
2018-12-05 21:46:24 140049823393600 [Note] mysqld: Aria engine: starting recovery
|
recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 89% 99% 100% (17.7 seconds); transactions to roll back: 2 1 0 (0.0 seconds); tables to flush: 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 (1.8 seconds);
|
2018-12-05 21:46:44 140049823393600 [Note] mysqld: Aria engine: recovery done
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Uses event mutexes
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Compressed tables use zlib 1.2.11
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Using Linux native AIO
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Number of pools: 1
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Using SSE2 crc32 instructions
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Completed initialization of buffer pool
|
2018-12-05 21:46:44 140049095739136 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Highest supported file format is Barracuda.
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1620006
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: 128 out of 128 rollback segments are active.
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Creating shared tablespace for temporary tables
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: Waiting for purge to start
|
2018-12-05 21:46:44 140049823393600 [Note] InnoDB: 5.7.24 started; log sequence number 1620015
|
2018-12-05 21:46:44 140048867514112 [Note] InnoDB: Loading buffer pool(s) from /data/bld/10.2/data/ib_buffer_pool
|
2018-12-05 21:46:44 140049823393600 [Note] Plugin 'FEEDBACK' is disabled.
|
2018-12-05 21:46:44 140049823393600 [Note] Recovering after a crash using tc.log
|
2018-12-05 21:46:44 140049823393600 [Note] Starting crash recovery...
|
2018-12-05 21:46:44 140048867514112 [Note] InnoDB: Buffer pool(s) load completed at 181205 21:46:44
|
2018-12-05 21:46:44 140049823393600 [Note] Crash recovery finished.
|
2018-12-05 21:46:44 140049823393600 [Note] Server socket created on IP: '::'.
|
2018-12-05 21:46:44 140049823393600 [Note] Reading of all Master_info entries succeded
|
2018-12-05 21:46:44 140049823393600 [Note] Added new Master_info '' to hash table
|
2018-12-05 21:46:44 140049823393600 [Note] /data/bld/10.2/bin/mysqld: ready for connections.
|
Version: '10.2.20-MariaDB-debug' socket: '/data/bld/10.2/data/tmp/mysql.sock' port: 3306 Source distribution
|
However, further CHECK on a table shows a problem:
MariaDB [test]> check table oltp16; |
+-------------+-------+----------+-------------------------+ |
| Table | Op | Msg_type | Msg_text | |
+-------------+-------+----------+-------------------------+ |
| test.oltp16 | check | error | Found 2315 keys of 2319 | |
| test.oltp16 | check | error | Corrupt | |
+-------------+-------+----------+-------------------------+ |
2 rows in set (0.02 sec) |
Manual repair works:
MariaDB [test]> repair table oltp16; |
+-------------+--------+----------+-----------------------------------------------------------+ |
| Table | Op | Msg_type | Msg_text | |
+-------------+--------+----------+-----------------------------------------------------------+ |
| test.oltp16 | repair | warning | Duplicate key 1 for record at 3111 against record at 361 | |
| test.oltp16 | repair | warning | 1 records have been removed |
|
| test.oltp16 | repair | warning | Number of rows changed from 2319 to 2317 | |
| test.oltp16 | repair | status | OK |
|
+-------------+--------+----------+-----------------------------------------------------------+ |
4 rows in set (0.18 sec) |
|
MariaDB [test]> check table oltp16; |
+-------------+-------+----------+-----------------------------------------------------------+ |
| Table | Op | Msg_type | Msg_text | |
+-------------+-------+----------+-----------------------------------------------------------+ |
| test.oltp16 | check | warning | Found row where the auto_increment column has the value 0 | |
| test.oltp16 | check | status | OK | |
+-------------+-------+----------+-----------------------------------------------------------+ |
2 rows in set (0.00 sec) |
|
MariaDB [test]> show create table oltp16 \G |
*************************** 1. row ***************************
|
Table: oltp16 |
Create Table: CREATE TABLE `oltp16` ( |
`k` int(10) unsigned NOT NULL DEFAULT 0, |
`id` int(10) unsigned NOT NULL AUTO_INCREMENT, |
`pad` char(60) NOT NULL DEFAULT '', |
`c` char(120) NOT NULL DEFAULT '', |
PRIMARY KEY (`id`), |
KEY `k` (`k`) |
) ENGINE=Aria AUTO_INCREMENT=1891959922 DEFAULT CHARSET=latin1 CHECKSUM=1 PAGE_CHECKSUM=0 ROW_FORMAT=PAGE TRANSACTIONAL=1 |
1 row in set (0.00 sec) |
Setting --aria-recover-options to non-default NORMAL or FORCE does not change anything.
The datadir after the initial crash, before any attempt to recover, is here:
ftp://ftp.askmonty.org/public/mdev17913-data.tar.gz
Unpack and start the server with
--aria-encrypt-tables=1 --plugin-load-add=file_key_management --file-key-management-filename=`pwd`/mysql-test/std_data/keys.txt
|
Adjust the path to keys.txt if needed.
You can also add aria-recover-options of your choice.
After recovery, run
check table oltp16; |
The general log for the test flow prior to the initial crash is in mysql.log.
The test to run the complete flow (might require several trials):
https://github.com/MariaDB/randgen --branch elenst-jira-refs b1facb0360 |
perl ./runall-new.pl --basedir=/data/bld/10.2 --vardir=/dev/shm/vardir --duration=250 --threads=6 --scenario=CrashUpgrade --redefine=conf/mariadb/bulk_insert.yy --engine=Aria --grammar=conf/mariadb/generic-dml.yy --gendata=conf/mariadb/oltp-aria.zz --mysqld=--aria-encrypt-tables=1 --grammar=conf/mariadb/generic-dml.yy --gendata=conf/mariadb/oltp-aria.zz --mysqld=--plugin-load-add=file_key_management --mysqld=--file-key-management-filename=/data/bld/keys.txt
|
RQG branch and revision are important here, you might not have some of the required files in other branches.
Adjust the paths to basedir and keys file on the command line.
Couldn't reproduce on 10.1 with the given test, but possibly there is some difference in the test flow.
Couldn't reproduce so far without encryption.
Attachments
Issue Links
- causes
-
MDEV-18187 Aria engine: Redo phase failed with "error 192 when executing record redo_index_new_page" upon startup on a restored datadir
-
- Closed
-
- relates to
-
MDEV-17864 Run regular crash recovery tests for Aria tables (system and user tables)
-
- Closed
-
-
MDEV-17912 Aria with encryption fails upon crash recovery: Redo phase failed, "Got error 192"
-
- Stalled
-
-
MDEV-20313 Transactional Aria table stays corrupt after crash-recovery: error : Found 2275 keys of 2274
-
- Open
-
-
MDEV-25507 CHECK on encrypted Aria table complains about "Wrong LSN"
-
- Closed
-
-
MDEV-25866 Upgrade from pre-10.5.10 to 10.5.10 causes CHECK errors on encrypted Aria tables
-
- Closed
-
Other errors happening in similar circumstances (failed automatic recovery of encrypted transactional tables):
MariaDB [(none)]> check table test.oltp105;
+--------------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+--------------+-------+----------+-----------------------------------------------------------+
| test.oltp105 | check | warning | Found row where the auto_increment column has the value 0 |
| test.oltp105 | check | error | Checksum for key: 2 doesn't match checksum for records |
| test.oltp105 | check | error | Corrupt |
+--------------+-------+----------+-----------------------------------------------------------+
3 rows in set (0.048 sec)
MariaDB [(none)]> check table test.oltp105;
+--------------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+--------------+-------+----------+-----------------------------------------------------------+
| test.oltp105 | check | warning | Table is marked as crashed |
| test.oltp105 | check | warning | Found row where the auto_increment column has the value 0 |
| test.oltp105 | check | error | Checksum for key: 2 doesn't match checksum for records |
| test.oltp105 | check | error | Corrupt |
+--------------+-------+----------+-----------------------------------------------------------+
4 rows in set (0.005 sec)
MariaDB [(none)]> repair table test.oltp105;
+--------------+--------+----------+----------+age done
| Table | Op | Msg_type | Msg_text |
+--------------+--------+----------+----------+
| test.oltp105 | repair | status | OK |
+--------------+--------+----------+----------+
1 row in set (0.152 sec)
MariaDB [(none)]> check table test.oltp105;
+--------------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+--------------+-------+----------+-----------------------------------------------------------+
| test.oltp105 | check | warning | Found row where the auto_increment column has the value 0 |
| test.oltp105 | check | status | OK |
+--------------+-------+----------+-----------------------------------------------------------+
2 rows in set (0.002 sec)
or
MariaDB [(none)]> check table test.oltp90 extended;
+-------------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-----------------------------------------------------------+
| test.oltp90 | check | warning | Table is marked as crashed |
| test.oltp90 | check | warning | Found row where the auto_increment column has the value 0 |
| test.oltp90 | check | error | Record at: 1:171 Can't find key for index: 2 |
| test.oltp90 | check | error | Corrupt |
+-------------+-------+----------+-----------------------------------------------------------+
4 rows in set (0.042 sec)
MariaDB [(none)]> repair table test.oltp90;
+-------------+--------+----------+----------+tage done
| Table | Op | Msg_type | Msg_text |
+-------------+--------+----------+----------+
| test.oltp90 | repair | status | OK |
+-------------+--------+----------+----------+
1 row in set (0.158 sec)
MariaDB [(none)]> check table test.oltp90 extended;
+-------------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-----------------------------------------------------------+
| test.oltp90 | check | warning | Found row where the auto_increment column has the value 0 |
| test.oltp90 | check | status | OK |
+-------------+-------+----------+-----------------------------------------------------------+
2 rows in set (0.032 sec)
MariaDB [(none)]> check table test.oltp35 extended;
+-------------+-------+----------+-------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-------------------------------------------------+
| test.oltp35 | check | error | Record-count is not ok; found 18 Should be: 19 |
| test.oltp35 | check | error | Key pointers and record positions doesn't match |
| test.oltp35 | check | error | Corrupt |
+-------------+-------+----------+-------------------------------------------------+
3 rows in set (0.011 sec)
MariaDB [(none)]> check table test.oltp35 extended;
+-------------+-------+----------+-------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-------------------------------------------------+
| test.oltp35 | check | warning | Table is marked as crashed |
| test.oltp35 | check | error | Record-count is not ok; found 18 Should be: 19 |
| test.oltp35 | check | error | Key pointers and record positions doesn't match |
| test.oltp35 | check | error | Corrupt |
+-------------+-------+----------+-------------------------------------------------+
4 rows in set (0.010 sec)
MariaDB [(none)]> repair table test.oltp35;
+-------------+--------+----------+--------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+--------+----------+--------------------------------------+
| test.oltp35 | repair | warning | Number of rows changed from 19 to 18 |
| test.oltp35 | repair | status | OK |
+-------------+--------+----------+--------------------------------------+
2 rows in set (0.132 sec)
MariaDB [(none)]> check table test.oltp35 extended;
+-------------+-------+----------+----------+e done
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+----------+
| test.oltp35 | check | status | OK |
+-------------+-------+----------+----------+
1 row in set (0.001 sec)
MariaDB [(none)]> check table test.oltp97 extended;
+-------------+-------+----------+-------------------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-------------------------------------------------------------------------------------------+
| test.oltp97 | check | error | Page at 40960 is not delete marked |
| test.oltp97 | check | error | Invalid key block position: 4854317448351973378 key block size: 8192 file_length: 65536 |
| test.oltp97 | check | error | key delete-link-chain corrupted |
| test.oltp97 | check | error | Corrupt |
+-------------+-------+----------+-------------------------------------------------------------------------------------------+
4 rows in set (0.014 sec)
MariaDB [(none)]> check table test.oltp97 extended;
+-------------+-------+----------+-------------------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-------------------------------------------------------------------------------------------+
| test.oltp97 | check | warning | Table is marked as crashed |
| test.oltp97 | check | error | Page at 40960 is not delete marked |
| test.oltp97 | check | error | Invalid key block position: 4854317448351973378 key block size: 8192 file_length: 65536 |
| test.oltp97 | check | error | key delete-link-chain corrupted |
| test.oltp97 | check | error | Corrupt |
+-------------+-------+----------+-------------------------------------------------------------------------------------------+
5 rows in set (0.040 sec)
MariaDB [(none)]> repair table test.oltp97;
+-------------+--------+----------+----------+tage done
| Table | Op | Msg_type | Msg_text |
+-------------+--------+----------+----------+
| test.oltp97 | repair | status | OK |
+-------------+--------+----------+----------+
1 row in set (0.141 sec)
MariaDB [(none)]> check table test.oltp97 extended;
+-------------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-----------------------------------------------------------+
| test.oltp97 | check | warning | Found row where the auto_increment column has the value 0 |
| test.oltp97 | check | status | OK |
+-------------+-------+----------+-----------------------------------------------------------+
2 rows in set (0.046 sec)
... and many more variations, I won't list them all. After this bug is fixed, workarounds from my tests (forced explicit REPAIR) will need to be removed, and tests to be run again, to see if any variations remain.