[MDEV-32746] SIGSEGV on recovery when using innodb_encrypt_log and PMEM
Created: 2023-11-09  Updated: 2023-11-14  Resolved: 2023-11-14

Status: Closed
Project: MariaDB Server
Component/s: Backup, Storage Engine - InnoDB
Affects Version/s: 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2, 11.3
Fix Version/s: 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: rr-profile-analyzed

Issue Links:
Problem/Incident
is caused by MDEV-14425 Change the InnoDB redo log format to ... Closed

 Description   

mleich provided an rr replay trace where encryption_crypt() hits SIGSEGV because it is being invoked with *dlen==0:

#0  0x000055d2158ee6b8 in log_decrypt_buf (iv=iv@entry=0x7ffcb3d01710 "", 
    buf=buf@entry=0x7ffcb3cfd645 "\037\002\347", ' ' <repeats 110 times>, "qul", ' ' <repeats 57 times>, "\220\b\\\b(\a\364\a\300\a\214\aX\a$\006\360\006\274\006\210\006T\006 ", <incomplete sequence \354>..., 
    data=data@entry=0x7f9c1bffc5db "\241\350G\200Y\202\005\341,~\263\321%\371\071\200Y\202\005\341$\201\214\365\262\277>\260\200@v\214P;\302\373\n!.>\270\207\345]\344\301(\333\327NE\032\360\227z\027\215\256}4\005\236F\362\217\220\036\311?\272", len=len@entry=0) at /data/Server/bb-11.2-MDEV-32452/storage/innobase/log/log0crypt.cc:473
#1  0x000055d2158d8ec3 in recv_ring::copy_if_needed (this=this@entry=0x7ffcb3d01908, iv=iv@entry=0x7ffcb3d01710 "", 
    tmp=tmp@entry=0x7ffcb3cfd640 "\024\200Y\202\005\037\002\347", ' ' <repeats 110 times>, "qul", ' ' <repeats 57 times>, "\220\b\\\b(\a\364\a\300\a\214\aX\a$\006\360\006\274\006\210\006"..., start=..., 
    start@entry=..., len=len@entry=0) at /data/Server/bb-11.2-MDEV-32452/storage/innobase/log/log0recv.cc:2334
#2  0x000055d2158ebcc0 in recv_sys_t::parse<recv_ring, true> (this=this@entry=0x55d2167df780 <recv_sys>, l=..., if_exists=if_exists@entry=false) at /usr/include/c++/9/bits/stl_tree.h:348
#3  0x000055d2158ed421 in recv_sys_t::parse_pmem<true> (if_exists=if_exists@entry=false) at /data/Server/bb-11.2-MDEV-32452/storage/innobase/log/log0recv.cc:2211
#4  0x000055d2158d5f36 in recv_scan_log (last_phase=last_phase@entry=false) at /data/Server/bb-11.2-MDEV-32452/storage/innobase/log/log0recv.cc:4060
#5  0x000055d2158d6c88 in recv_recovery_from_checkpoint_start () at /data/Server/bb-11.2-MDEV-32452/storage/innobase/log/log0recv.cc:4489
#6  0x000055d215a30590 in srv_start (create_new_db=<optimized out>) at /data/Server/bb-11.2-MDEV-32452/storage/innobase/srv/srv0start.cc:1523
#7  0x000055d215836ccf in innodb_init (p=<optimized out>) at /data/Server/bb-11.2-MDEV-32452/storage/innobase/handler/ha_innodb.cc:4166

I was not able to reproduce this crash with the 2 copies of a data directory that I found in the environment. However, I think that the following patch should fix this:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index f479428d987..ac4a68a0569 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2409,7 +2409,7 @@ struct recv_ring : public recv_buf
   {
     const size_t s(*this - start);
     ut_ad(s + len <= srv_page_size);
-    if (!log_sys.is_encrypted())
+    if (!len || !log_sys.is_encrypted())
     {
       if (start.ptr + s == ptr && ptr + len <= end())
         return ptr;

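A corresponding condition already exists in recv_buf::copy_if_needed(); the patch adds it to recv_ring::copy_if_needed(), which is the variant used on the PMEM code path (recv_sys_t::parse_pmem) shown in the stack trace. That is, if there is no actual payload in an MDEV-14425 redo log record, nothing was encrypted when the record was written, and so nothing needs to be decrypted during recovery. The page numbers and file names are never encrypted. For an INIT_PAGE or FREE_PAGE record, we only need to know the page identifier, nothing else.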
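
The guard can be illustrated outside InnoDB with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in rather than the server code: log_encrypted plays the role of log_sys.is_encrypted(), toy_decrypt() is an XOR placeholder for log_decrypt_buf(), and this copy_if_needed() is a simplified free function, not the recv_ring member. It only shows the idea that a zero-length payload is returned as-is and never reaches the cipher.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; none of these are InnoDB APIs.
static bool log_encrypted = true;   // plays the role of log_sys.is_encrypted()
static unsigned decrypt_calls = 0;  // counts how often the toy cipher ran

// Toy "cipher" (XOR with a constant), standing in for log_decrypt_buf().
// Like the real decryption routine, it must not be asked to process 0 bytes.
static void toy_decrypt(unsigned char *dst, const unsigned char *src,
                        size_t len)
{
  assert(len != 0);
  ++decrypt_calls;
  for (size_t i = 0; i < len; i++)
    dst[i] = static_cast<unsigned char>(src[i] ^ 0x5a);
}

// Return a pointer to len bytes of record payload that starts at src.
// tmp is scratch space that is written only when decryption is needed.
static const unsigned char *copy_if_needed(unsigned char *tmp,
                                           const unsigned char *src,
                                           size_t len)
{
  // The guard from the patch: an empty payload (for example, a record that
  // carries only a page identifier) is never passed to the cipher.
  if (!len || !log_encrypted)
    return src;
  toy_decrypt(tmp, src, len);
  return tmp;
}
```

With the `!len` check removed, the zero-length call would reach toy_decrypt() and trip its assertion, which loosely mirrors how the unpatched code reached log_decrypt_buf() with len=0.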



 Comments   
Comment by Marko Mäkelä [ 2023-11-09 ]

mleich, can you please try to reproduce this bug (on the same branch that you were using so far) and test the fix?

Comment by Matthias Leich [ 2023-11-09 ]

Replay of the problem with an optimized test battery:

  • original tree: replayed 4 times in 218 finished RQG tests
  • original tree + patch: never replayed in 1134 finished tests

Hence I assume the problem is fixed.