Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24695

Encryption is modifying a freed page

    XMLWordPrintable

    Details

      Description

      10.5 927a8823

      2021-01-26 19:18:12 0 [Note] InnoDB: Opened 127 undo tablespaces
      2021-01-26 19:18:12 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=103948,103948
      2021-01-26 19:18:12 0 [Note] InnoDB: Opened 127 undo tablespaces
      2021-01-26 19:18:12 0 [Note] InnoDB: Starting final batch to recover 563 pages from redo log.
      mysqld: /data/src/10.5/storage/innobase/log/log0recv.cc:274: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.
      210126 19:18:12 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007fbcc52c9f36 in __GI___assert_fail (assertion=0x55c97e4131a0 "mach_read_from_4(frame + 4U) == block.page.id().page_no()", file=0x55c97e412b40 "/data/src/10.5/storage/innobase/log/log0recv.cc", line=274, function=0x55c97e412f20 "log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const") at assert.c:101
      #8  0x000055c97cdb5855 in log_phys_t::apply (this=0x7fbca8eeb500, block=..., last_offset=@0x6080004b81e4: 0) at /data/src/10.5/storage/innobase/log/log0recv.cc:274
      #9  0x000055c97cda2d68 in recv_recover_page (block=0x7fbca8a0f090, mtr=..., p=Python Exception <class 'AttributeError'> 'NoneType' object has no attribute 'pointer': 
      ..., space=0x615000002398, init=0x60700000c9a0) at /data/src/10.5/storage/innobase/log/log0recv.cc:2312
      #10 0x000055c97cdbce31 in recv_sys_t::recover_low (this=0x55c97f6c99c0 <recv_sys>, page_id=..., p=Python Exception <class 'AttributeError'> 'NoneType' object has no attribute 'pointer': 
      ..., mtr=..., b=0x7fbca8a0f090) at /data/src/10.5/storage/innobase/log/log0recv.cc:2559
      #11 0x000055c97cda5a74 in recv_sys_t::apply (this=0x55c97f6c99c0 <recv_sys>, last_batch=true) at /data/src/10.5/storage/innobase/log/log0recv.cc:2662
      #12 0x000055c97d040150 in srv_start (create_new_db=false) at /data/src/10.5/storage/innobase/srv/srv0start.cc:1491
      #13 0x000055c97cb9a526 in innodb_init (p=0x616000021708) at /data/src/10.5/storage/innobase/handler/ha_innodb.cc:3909
      #14 0x000055c97c0754ab in ha_initialize_handlerton (plugin=0x62100002cc80) at /data/src/10.5/sql/handler.cc:645
      #15 0x000055c97b8ad406 in plugin_initialize (tmp_root=0x7ffe64870080, plugin=0x62100002cc80, argc=0x55c97fc31da0 <remaining_argc>, argv=0x6160000004f8, options_only=false) at /data/src/10.5/sql/sql_plugin.cc:1459
      #16 0x000055c97b8af394 in plugin_init (argc=0x55c97fc31da0 <remaining_argc>, argv=0x6160000004f8, flags=0) at /data/src/10.5/sql/sql_plugin.cc:1751
      #17 0x000055c97b5683ac in init_server_components () at /data/src/10.5/sql/mysqld.cc:4913
      #18 0x000055c97b56a39f in mysqld_main (argc=26, argv=0x6160000004f8) at /data/src/10.5/sql/mysqld.cc:5497
      #19 0x000055c97b553d9d in main (argc=25, argv=0x7ffe64871e18) at /data/src/10.5/sql/main.cc:25
      

      rr profile is available.
      To reproduce the problem, I was using the following test:

      git clone https://github.com/MariaDB/randgen --branch mdev24695 rqg-mdev24695
      cd rqg-mdev24695
      . ./mdev24695.cmd <basedir>
      

      It runs a short DML flow in one thread, kills the server and restarts it with --innodb-force-recovery=3, at which point the failure happens. If it doesn't, the test will restart the server again, without forced recovery, and run some more DML.

      Non-default server options used (also can be found in mdev24695.cmd):

      --mysqld=--innodb-use-native-aio=OFF \
      --mysqld=--innodb-encrypt-tables \
      --mysqld=--innodb-encrypt-log \
      --mysqld=--innodb-encryption-threads=4 \
      --mysqld=--file-key-management --mysqld=--file-key-management-filename=`pwd`/data/file_key_management_keys.txt \
      --mysqld=--plugin-load-add=file_key_management \
      --mysqld=--innodb-undo-tablespaces=127
      

      Reproducibility seems to depend heavily on the machine and build; for example, on my local machine it happens much more readily (almost every time) on an ASAN build, while not so much on a non-ASA debug build; on another machine it's vice versa.
      Couldn't reproduce without any of the options listed above, but it's not an absolute guarantee, maybe they just change the dynamics.
      Couldn't reproduce on 10.4, the same disclaimer applies.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              thiru Thirunarayanan Balathandayuthapani
              Reporter:
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: