Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19813 Aria crash recovery failures
  3. MDEV-18461

Aria crash recovery failures on the same data: 2 different assertions, error 176, refusal to start

    XMLWordPrintable

Details

    • Technical task
    • Status: Stalled (View Workflow)
    • Critical
    • Resolution: Unresolved
    • 10.4(EOL), 10.5, 10.6
    • 10.5, 10.6
    • Storage Engine - Aria
    • None

    Description

      The attached datadir data.tar.gz was created by intentionally crashing 10.4 server (4b3656a) during concurrent DML.

      The server was run with

      --aria-encrypt-tables=1 --plugin-load-add=file_key_management --file-key-management-filename=/data/keys.txt
      

      (note that the path to the encryption key file might need to be modified).

      While trying to start the server again with the same options on this data directory, various failures have been observed.

      Most often this one:

      10.4 74eb4fc9

      mysqld: /data/src/10.4/storage/maria/ma_loghandler.c:3862: translog_init_with_table: Assertion `sure_page <= last_page' failed.
      190203 18:47:13 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007fa473202ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x000055679969ce16 in translog_init_with_table (directory=0x55679a983cc0 <mysql_real_data_home> "/data/bld/10.4/data/", log_file_max_size=1073741824, server_version=100403, server_id=0, pagecache=0x55679b2289c0 <maria_log_pagecache_var>, flags=0, readonly=0 '\000', init_table_func=0x556799693718 <translog_table_init>, no_errors=0 '\000') at /data/src/10.4/storage/maria/ma_loghandler.c:3862
      #9  0x000055679968c828 in ha_maria_init (p=0x55679b321700) at /data/src/10.4/storage/maria/ha_maria.cc:3686
      #10 0x000055679945e612 in ha_initialize_handlerton (plugin=0x55679b313fb0) at /data/src/10.4/sql/handler.cc:530
      #11 0x000055679914881b in plugin_initialize (tmp_root=0x7ffcb6b7a790, plugin=0x55679b313fb0, argc=0x55679a989eb8 <remaining_argc>, argv=0x55679b2c1968, options_only=false) at /data/src/10.4/sql/sql_plugin.cc:1437
      #12 0x00005567991494b0 in plugin_init (argc=0x55679a989eb8 <remaining_argc>, argv=0x55679b2c1968, flags=0) at /data/src/10.4/sql/sql_plugin.cc:1720
      #13 0x0000556799001b3f in init_server_components () at /data/src/10.4/sql/mysqld.cc:5153
      #14 0x0000556799002d25 in mysqld_main (argc=14, argv=0x55679b2c1968) at /data/src/10.4/sql/mysqld.cc:5680
      #15 0x0000556798ff74d0 in main (argc=14, argv=0x7ffcb6b7b1d8) at /data/src/10.4/sql/main.cc:25
      

      This failure was previously mentioned in MDEV-727 (description) and MDEV-583 (comments).

      Further attempts to start the server after it end the same way.


      Another one has been observed once so far:

      10.4 4b3656a

      recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (7.6 seconds); transactions to roll back: 2mysqld: /home/travis/src/storage/maria/ma_blockrec.c:2879: write_block_record: Assertion `undo_lsn == ((LSN)1) || head_length == row_pos->length' failed.
      190130  3:32:01 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007fc88c48ec82 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x000055aa69ad035d in write_block_record (info=0x55aa6c7de8b0, old_record=0x55aa6c7eaea0 "\341\017", record=0x55aa6c7eaec5 "\377\017", row=0x55aa6c7de9e0, bitmap_blocks=0x55aa6c7de920, head_block_is_read=1 '\001', row_pos=0x7ffd72ad8bc0, undo_lsn=4326780525, old_record_checksum=0) at /home/travis/src/storage/maria/ma_blockrec.c:2879
      #9  0x000055aa69ad3314 in _ma_update_at_original_place (info=0x55aa6c7de8b0, page=1, rownr=50, length_on_head_page=36, extent_count=0, extent_info=0x55aa6c7182a3 "", oldrec=0x55aa6c7eaea0 "\341\017", record=0x55aa6c7eaec5 "\377\017", undo_lsn=4326780525) at /home/travis/src/storage/maria/ma_blockrec.c:4002
      #10 0x000055aa69adc100 in _ma_apply_undo_row_update (info=0x55aa6c7de8b0, undo_lsn=4326780525, header=0x55aa6c7182a5 "\002", header_length=11) at /home/travis/src/storage/maria/ma_blockrec.c:7449
      #11 0x000055aa69aa199f in exec_UNDO_LOGREC_UNDO_ROW_UPDATE (rec=0x7ffd72ad8dc0, trn=0x55aa6c71b260) at /home/travis/src/storage/maria/ma_recovery.c:2387
      #12 0x000055aa69aa3257 in run_undo_phase (uncommitted=1) at /home/travis/src/storage/maria/ma_recovery.c:2964
      #13 0x000055aa69a9cf72 in maria_apply_log (from_lsn=4317903329, end_lsn=0, apply=MARIA_LOG_APPLY, trace_file=0x0, should_run_undo_phase=1 '\001', skip_DDLs_arg=1 '\001', take_checkpoints=1 '\001', warnings_count=0x7ffd72ad92d0) at /home/travis/src/storage/maria/ma_recovery.c:422
      #14 0x000055aa69a9ca46 in maria_recovery_from_log () at /home/travis/src/storage/maria/ma_recovery.c:240
      #15 0x000055aa69a6eadf in ha_maria_init (p=0x55aa6c6fbe20) at /home/travis/src/storage/maria/ha_maria.cc:3689
      #16 0x000055aa69466d0b in ha_initialize_handlerton (plugin=0x55aa6c6eca28) at /home/travis/src/sql/handler.cc:530
      #17 0x000055aa69150208 in plugin_initialize (tmp_root=0x7ffd72ad9700, plugin=0x55aa6c6eca28, argc=0x55aa6a974eb8 <remaining_argc>, argv=0x55aa6c69a638, options_only=false) at /home/travis/src/sql/sql_plugin.cc:1437
      #18 0x000055aa69150ebc in plugin_init (argc=0x55aa6a974eb8 <remaining_argc>, argv=0x55aa6c69a638, flags=0) at /home/travis/src/sql/sql_plugin.cc:1719
      #19 0x000055aa69009015 in init_server_components () at /home/travis/src/sql/mysqld.cc:5151
      #20 0x000055aa6900a1f9 in mysqld_main (argc=22, argv=0x55aa6c69a638) at /home/travis/src/sql/mysqld.cc:5680
      #21 0x000055aa68ffe950 in main (argc=22, argv=0x7ffd72ada148) at /home/travis/src/sql/main.cc:25
      

      This failure was mentioned in comments to MDEV-15256.

      The datadir after this crash is attached as data2.tar.gz

      Further attempts to start the server after it end with

      10.4

      2019-02-03 18:53:47 0 [Note] mysqld: Aria engine: starting recovery
      recovered pages: 0%
      Got error 176 when executing record redo_insert_row_head
      2019-02-03 18:53:47 0 [ERROR] mysqld: Aria engine: Redo phase failed
      tables to flush: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
      2019-02-03 18:53:49 0 [ERROR] mysqld: Aria recovery failed. Please run aria_chk -r on all Aria tables and delete all aria_log.######## files
      2019-02-03 18:53:49 0 [ERROR] Plugin 'Aria' init function returned error.
      2019-02-03 18:53:49 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
      


      Finally, non-debug server fails to start on data.tar.gz without any meaningful explanation:

      10.4 74eb4fc9

      2019-02-03 18:59:16 0 [ERROR] mysqld: Aria engine: log initialization failed
      2019-02-03 18:59:16 0 [ERROR] Plugin 'Aria' init function returned error.
      2019-02-03 18:59:16 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
      

      Given that the assertion failures have been seen long time ago, it's very likely that other versions are also affected. But since the attached data directories were created on 10.4, I can't check other versions with it, and thus filing it as a 10.4 bug. Please feel free to adjust affected/fix versions accordingly.

      Attachments

        1. data2.tar.gz
          9.45 MB
        2. data.tar.gz
          1.02 MB

        Issue Links

          Activity

            People

              monty Michael Widenius
              elenst Elena Stepanova
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.