[MDEV-20475] Assertion `flushed_lsn == log_get_lsn()' failed in srv_prepare_to_delete_redo_log_files upon server startup Created: 2019-09-03  Updated: 2021-09-01  Resolved: 2021-09-01

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2, 10.3, 10.4
Fix Version/s: 10.5.2, 10.6.0

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Won't Fix Votes: 0
Labels: need_rr

Issue Links:
Relates
relates to MDEV-18370 InnoDB: Failing assertion: lsn % OS_F... Closed
relates to MDEV-20474 Assertion `!recv_no_log_write' failed... Closed
relates to MDEV-14425 Change the InnoDB redo log format to ... Closed

 Description   

10.2 9bf424bc

2019-08-30  1:12:28 140084597409600 [Note] InnoDB: Highest supported file format is Barracuda.
2019-08-30  1:12:29 140084597409600 [Note] InnoDB: Encrypting redo log: 1*524288000 bytes; LSN=1656361
mysqld: /data/src/10.2/storage/innobase/srv/srv0start.cc:1432: lsn_t srv_prepare_to_delete_redo_log_files(ulint): Assertion `flushed_lsn == log_get_lsn()' failed.
190830  1:12:29 [ERROR] mysqld got signal 6 ;
 
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#5  0x00007f67fab2742a in __GI_abort () at abort.c:89
#6  0x00007f67fab1ee67 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x55e58b010ff4 "flushed_lsn == log_get_lsn()", file=file@entry=0x55e58b010520 "/data/src/10.2/storage/innobase/srv/srv0start.cc", line=line@entry=1432, function=function@entry=0x55e58b012ca0 <srv_prepare_to_delete_redo_log_files(unsigned long)::__PRETTY_FUNCTION__> "lsn_t srv_prepare_to_delete_redo_log_files(ulint)") at assert.c:92
#7  0x00007f67fab1ef12 in __GI___assert_fail (assertion=0x55e58b010ff4 "flushed_lsn == log_get_lsn()", file=0x55e58b010520 "/data/src/10.2/storage/innobase/srv/srv0start.cc", line=1432, function=0x55e58b012ca0 <srv_prepare_to_delete_redo_log_files(unsigned long)::__PRETTY_FUNCTION__> "lsn_t srv_prepare_to_delete_redo_log_files(ulint)") at assert.c:101
#8  0x000055e58aa0a0e4 in srv_prepare_to_delete_redo_log_files (n_files=1) at /data/src/10.2/storage/innobase/srv/srv0start.cc:1432
#9  0x000055e58aa0c999 in innobase_start_or_create_for_mysql () at /data/src/10.2/storage/innobase/srv/srv0start.cc:2381
#10 0x000055e58a853675 in innobase_init (p=0x55e58cfe2940) at /data/src/10.2/storage/innobase/handler/ha_innodb.cc:4371
#11 0x000055e58a64cc1b in ha_initialize_handlerton (plugin=0x55e58cfb6678) at /data/src/10.2/sql/handler.cc:555
#12 0x000055e58a3e5d64 in plugin_initialize (tmp_root=0x7ffddebb4450, plugin=0x55e58cfb6678, argc=0x55e58b93e928 <remaining_argc>, argv=0x55e58cf51b90, options_only=false) at /data/src/10.2/sql/sql_plugin.cc:1417
#13 0x000055e58a3e6986 in plugin_init (argc=0x55e58b93e928 <remaining_argc>, argv=0x55e58cf51b90, flags=2) at /data/src/10.2/sql/sql_plugin.cc:1698
#14 0x000055e58a2ef7bb in init_server_components () at /data/src/10.2/sql/mysqld.cc:5335
#15 0x000055e58a2f08a5 in mysqld_main (argc=36, argv=0x55e58cf51b90) at /data/src/10.2/sql/mysqld.cc:5931
#16 0x000055e58a2e57b0 in main (argc=30, argv=0x7ffddebb51e8) at /data/src/10.2/sql/main.cc:25

Reproducing doesn't require any inflow on the server. It happens randomly, quite rarely, upon server startup right after bootstrap, it's enough to run bootstrap-startup in a loop with given options for a long enough time (hours). I use this crude script for that, but it can really be anything, as long as the server started with the right options. Scrubbing is important, but also on some reason innodb_log_files_in_group (or possibly some other InnoDB log options) – maybe resizing redo log is essential for the scenario.

Unfortunately, I don't know how to make it any faster / more probable.

if [ -n "$1" ] ; then
    port=$1
else
    port=8307
fi
 
while [ true ] ; do
    rm -rf data data.bootstrap tmp
    mkdir data
    mkdir tmp
 
    echo `date` "Running bootstrap..."
    scripts/mysql_install_db --no-defaults --basedir=`pwd` --lc-messages-dir=`pwd`/share --character-sets-dir=`pwd`/share/charsets --tmpdir=`pwd`/tmp --datadir=`pwd`/data --innodb_log_files_in_group=1 > `pwd`/boot.log 2>&1
 
    cp -r data data.bootstrap
 
    echo `date` "Starting server on port $port..."
    `pwd`/bin/mysqld --no-defaults --basedir=`pwd` --lc-messages-dir=`pwd`/share --character-sets-dir=`pwd`/share/charsets --tmpdir=`pwd`/tmp --core-file --datadir=`pwd`/data --port=$port --socket=`pwd`/mysql.sock --pid-file=`pwd`/mysql.pid --innodb-encryption-threads=4 --loose-innodb_scrub_log=ON --log-error=`pwd`/mysql.err &
 
    res=1;
    echo `date` "Waiting for the server to start..."
    for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ; do
        sleep 1
        `pwd`/bin/mysqladmin -uroot --socket=`pwd`/mysql.sock ping 2>/dev/null
#        `pwd`/bin/mysqladmin --socket=`pwd`/mysql.sock ping 2>/dev/null
        if [ $? == 0 ] ; then
            echo "Success"
            res=0
            break
        fi
    done
    if [ $res == 1 ] ; then
        echo `date` "Failed to start the server"
        break
    fi
    echo `date` "Trying to shut down the server..."
    `pwd`/bin/mysql -uroot --socket=`pwd`/mysql.sock -e shutdown 2>/dev/null
#    `pwd`/bin/mysql --socket=`pwd`/mysql.sock -e shutdown 2>/dev/null
    if [ $? != 0 ] ; then
        echo `date` "Shut down failed on some reason"
        break
    else
        echo `date` "Success"
    fi
    fg
done

The problem happens interchangeably with MDEV-18370 and MDEV-20474, they're likely to be related.



 Comments   
Comment by Marko Mäkelä [ 2019-09-03 ]

My main suspects are innodb_scrub_log=ON and much less likely innodb_stats_traditional=OFF. The other options should not have any direct impact on what is being written to the redo log.
Side note: I would recommend to stay away from innodb_lock_schedule_algorithm=VATS until MDEV-16664 has been fixed. I do not think that that a correctness issue with transactional locks would lead to an issue with the redo log, though. It might cause higher level corruption of the MDEV-9663 kind.

Comment by Elena Stepanova [ 2019-09-03 ]

innodb_scrub_log is most certainly a part of the deal, but not the only part. I am still trying to figure out which other options make a difference.

Comment by Marko Mäkelä [ 2020-04-16 ]

Can you reproduce this at all without innodb_scrub_log, which we disabled in MDEV-21870? If yes, I would like to analyze a rr replay.

Comment by Elena Stepanova [ 2020-04-16 ]

If the description says "scrubbing is important", it means I tried really hard to reproduce it without scrubbing and didn't succeed. If you want to close it as "won't fix" in 10.2-10.4 because it was removed in 10.5 anyway, please feel free to do so, I have no objections.

Comment by Marko Mäkelä [ 2021-09-01 ]

Starting with MDEV-21870, the parameter innodb_scrub_log is being ignored.

Generated at Thu Feb 08 08:59:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.