Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25284

Assertion `info->type == READ_CACHE || info->type == WRITE_CACHE' failed

Details

    Description

      START SLAVE SQL_THREAD;
      SET @@debug_dbug="d,simulate_find_log_pos_error";
      CHANGE MASTER TO IGNORE_DOMAIN_IDS=(1), MASTER_USE_GTID=SLAVE_POS;
      FLUSH LOGS;
      CHANGE MASTER TO master_use_gtid=current_pos;
      

      Leads to:

      10.6.0 8dd35a2507f8d63ca8df9335d2c6072d5c0e3b86 (Debug)

      mysqld: /test/10.6_dbg/mysys/mf_iocache.c:428: reinit_io_cache: Assertion `info->type == READ_CACHE || info->type == WRITE_CACHE' failed.
      

      10.6.0 8dd35a2507f8d63ca8df9335d2c6072d5c0e3b86 (Debug)

      Core was generated by `/test/MD160321-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
          at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      [Current thread is 1 (Thread 0x1512b5967700 (LWP 916521))]
      (gdb) bt
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      #1  0x0000561cb5a29d0b in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
      #2  0x0000561cb51c1313 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:331
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #5  0x00001512cc7cc859 in __GI_abort () at abort.c:79
      #6  0x00001512cc7cc729 in __assert_fail_base (fmt=0x1512cc962588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x561cb5e7c708 "info->type == READ_CACHE || info->type == WRITE_CACHE", file=0x561cb5e7c3d8 "/test/10.6_dbg/mysys/mf_iocache.c", line=428, function=<optimized out>) at assert.c:92
      #7  0x00001512cc7ddf36 in __GI___assert_fail (assertion=assertion@entry=0x561cb5e7c708 "info->type == READ_CACHE || info->type == WRITE_CACHE", file=file@entry=0x561cb5e7c3d8 "/test/10.6_dbg/mysys/mf_iocache.c", line=line@entry=428, function=function@entry=0x561cb5e7c880 <__PRETTY_FUNCTION__.18006> "reinit_io_cache") at assert.c:101
      #8  0x0000561cb5a0b84a in reinit_io_cache (info=info@entry=0x561cb87d31a0, type=type@entry=READ_CACHE, seek_offset=seek_offset@entry=0, use_async_io=use_async_io@entry=0 '\000', clear_cache=clear_cache@entry=0 '\000') at /test/10.6_dbg/mysys/mf_iocache.c:428
      #9  0x0000561cb531a8f1 in MYSQL_BIN_LOG::find_log_pos (this=this@entry=0x561cb87d2990, linfo=linfo@entry=0x1512b5965040, log_name=log_name@entry=0x0, need_lock=need_lock@entry=false) at /test/10.6_dbg/sql/log.cc:4131
      #10 0x0000561cb532d884 in MYSQL_BIN_LOG::reset_logs (this=this@entry=0x561cb87d2990, thd=thd@entry=0x151270000db8, create_new_log=<optimized out>, init_state=init_state@entry=0x0, init_state_len=init_state_len@entry=0, next_log_number=next_log_number@entry=0) at /test/10.6_dbg/sql/log.cc:4371
      #11 0x0000561cb504ae98 in purge_relay_logs (rli=rli@entry=0x561cb87d24b8, thd=thd@entry=0x151270000db8, just_reset=just_reset@entry=false, errmsg=errmsg@entry=0x1512b59655c0) at /test/10.6_dbg/sql/rpl_rli.cc:1219
      #12 0x0000561cb4f2d864 in change_master (thd=thd@entry=0x151270000db8, mi=mi@entry=0x561cb87d07f0, master_info_added=master_info_added@entry=0x1512b5966000) at /test/10.6_dbg/sql/sql_repl.cc:3852
      #13 0x0000561cb4f047bb in mysql_execute_command (thd=thd@entry=0x151270000db8) at /test/10.6_dbg/sql/sql_parse.cc:4110
      #14 0x0000561cb4ef0876 in mysql_parse (thd=thd@entry=0x151270000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1512b5966410) at /test/10.6_dbg/sql/sql_parse.cc:7998
      #15 0x0000561cb4eff1e7 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x151270000db8, packet=packet@entry=0x15127000b359 "CHANGE MASTER TO master_use_gtid=current_pos", packet_length=packet_length@entry=44, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_class.h:1318
      #16 0x0000561cb4f025c1 in do_command (thd=0x151270000db8, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_parse.cc:1397
      #17 0x0000561cb505a178 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x561cb875b298, put_in_cache=put_in_cache@entry=true) at /test/10.6_dbg/sql/sql_connect.cc:1410
      #18 0x0000561cb505a77d in handle_one_connection (arg=arg@entry=0x561cb875b298) at /test/10.6_dbg/sql/sql_connect.cc:1312
      #19 0x0000561cb5505a5b in pfs_spawn_thread (arg=0x561cb86a3a68) at /test/10.6_dbg/storage/perfschema/pfs.cc:2201
      #20 0x00001512cccda609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #21 0x00001512cc8c9293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      10.5.10 031b3dfc22c3d37769d49da902401b26a24f12b4 (Debug)

      mysqld: /test/10.5_dbg/mysys/mf_iocache.c:428: reinit_io_cache: Assertion `info->type == READ_CACHE || info->type == WRITE_CACHE' failed.
      

      10.5.10 031b3dfc22c3d37769d49da902401b26a24f12b4 (Debug)

      Core was generated by `/test/MD160321-mariadb-10.5.10-linux-x86_64-dbg/bin/mysqld --no-defaults --core'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
          at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      [Current thread is 1 (Thread 0x147e986b6700 (LWP 926046))]
      (gdb) bt
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      #1  0x000055fd9453ff45 in my_write_core (sig=sig@entry=6) at /test/10.5_dbg/mysys/stacktrace.c:424
      #2  0x000055fd93c82adf in handle_fatal_signal (sig=6) at /test/10.5_dbg/sql/signal_handler.cc:331
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #5  0x0000147e9c263859 in __GI_abort () at abort.c:79
      #6  0x0000147e9c263729 in __assert_fail_base (fmt=0x147e9c3f9588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55fd94997488 "info->type == READ_CACHE || info->type == WRITE_CACHE", file=0x55fd94997158 "/test/10.5_dbg/mysys/mf_iocache.c", line=428, function=<optimized out>) at assert.c:92
      #7  0x0000147e9c274f36 in __GI___assert_fail (assertion=assertion@entry=0x55fd94997488 "info->type == READ_CACHE || info->type == WRITE_CACHE", file=file@entry=0x55fd94997158 "/test/10.5_dbg/mysys/mf_iocache.c", line=line@entry=428, function=function@entry=0x55fd94997600 <__PRETTY_FUNCTION__.17993> "reinit_io_cache") at assert.c:101
      #8  0x000055fd945210c4 in reinit_io_cache (info=info@entry=0x55fd9772c9f0, type=type@entry=READ_CACHE, seek_offset=seek_offset@entry=0, use_async_io=use_async_io@entry=0 '\000', clear_cache=clear_cache@entry=0 '\000') at /test/10.5_dbg/mysys/mf_iocache.c:428
      #9  0x000055fd93ddd2b0 in MYSQL_BIN_LOG::find_log_pos (this=this@entry=0x55fd9772c1e0, linfo=linfo@entry=0x147e986b3fa0, log_name=log_name@entry=0x0, need_lock=need_lock@entry=false) at /test/10.5_dbg/sql/log.cc:4131
      #10 0x000055fd93df167e in MYSQL_BIN_LOG::reset_logs (this=this@entry=0x55fd9772c1e0, thd=thd@entry=0x147e50000db8, create_new_log=<optimized out>, init_state=init_state@entry=0x0, init_state_len=init_state_len@entry=0, next_log_number=next_log_number@entry=0) at /test/10.5_dbg/sql/log.cc:4371
      #11 0x000055fd93b08349 in purge_relay_logs (rli=rli@entry=0x55fd9772bd08, thd=thd@entry=0x147e50000db8, just_reset=just_reset@entry=false, errmsg=errmsg@entry=0x147e986b4518) at /test/10.5_dbg/sql/rpl_rli.cc:1219
      #12 0x000055fd939e9f5c in change_master (thd=thd@entry=0x147e50000db8, mi=mi@entry=0x55fd9772a040, master_info_added=master_info_added@entry=0x147e986b4f60) at /test/10.5_dbg/sql/sql_repl.cc:3852
      #13 0x000055fd939bfdcc in mysql_execute_command (thd=thd@entry=0x147e50000db8) at /test/10.5_dbg/sql/sql_parse.cc:4188
      #14 0x000055fd939ab611 in mysql_parse (thd=thd@entry=0x147e50000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x147e986b53d0, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.5_dbg/sql/sql_parse.cc:8089
      #15 0x000055fd939ba4d0 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x147e50000db8, packet=packet@entry=0x147e5000b499 "CHANGE MASTER TO master_use_gtid=current_pos", packet_length=packet_length@entry=44, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.5_dbg/sql/sql_class.h:1257
      #16 0x000055fd939bdc33 in do_command (thd=0x147e50000db8) at /test/10.5_dbg/sql/sql_parse.cc:1370
      #17 0x000055fd93b181dd in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55fd976d57a8, put_in_cache=put_in_cache@entry=true) at /test/10.5_dbg/sql/sql_connect.cc:1410
      #18 0x000055fd93b188e1 in handle_one_connection (arg=arg@entry=0x55fd976d57a8) at /test/10.5_dbg/sql/sql_connect.cc:1312
      #19 0x000055fd93fd01b9 in pfs_spawn_thread (arg=0x55fd975fef88) at /test/10.5_dbg/storage/perfschema/pfs.cc:2201
      #20 0x0000147e9c771609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #21 0x0000147e9c360293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      Bug confirmed present in:
      MariaDB: 10.2.38 (dbg), 10.3.29 (dbg), 10.4.19 (dbg), 10.5.10 (dbg), 10.6.0 (dbg)

      Bug (or feature/syntax) confirmed not present in:
      MariaDB: 10.2.38 (opt), 10.3.29 (opt), 10.4.19 (opt), 10.5.10 (opt), 10.6.0 (opt)

      Attachments

        Activity

          It may be that this is an expected result due to the debug setting. The following also crashes:

          START SLAVE SQL_THREAD;
          SET @@debug_dbug="d,simulate_find_log_pos_error";
          CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS;
          FLUSH LOGS;
          CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS;
          

          10.5.10 dd07cfcecd4aabb0aeae9c4f5087f82b4080c1bd (Debug)

          10.5.10-dbg>START SLAVE SQL_THREAD;
          ERROR 1200 (HY000): Misconfigured slave: MASTER_HOST was not set; Fix in config file or with CHANGE MASTER TO
          10.5.10-dbg>SET @@debug_dbug="d,simulate_find_log_pos_error";
          Query OK, 0 rows affected (0.000 sec)
          10.5.10-dbg>CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS;
          ERROR 1373 (HY000): Target log not found in binlog index
          10.5.10-dbg>FLUSH LOGS;
          ERROR 1026 (HY000): Can't open file: 'qa-roel-2-relay-bin' (errno: 27 "File too large")
          10.5.10-dbg>CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS;
          ERROR 2013 (HY000): Lost connection to MySQL server during query
          10.5.10-dbg>
          

          Idem for changing 'SLAVE_POS' to 'NO'. All with the same assert. At first glance does not look like a bug, though it is interesting that it only happens after the FLUSH. A quick look by Elkin is likely all that is needed to see if there is any real bug here.

          Roel Roel Van de Paar added a comment - It may be that this is an expected result due to the debug setting. The following also crashes: START SLAVE SQL_THREAD; SET @@debug_dbug="d,simulate_find_log_pos_error"; CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS; FLUSH LOGS; CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS; 10.5.10 dd07cfcecd4aabb0aeae9c4f5087f82b4080c1bd (Debug) 10.5.10-dbg>START SLAVE SQL_THREAD; ERROR 1200 (HY000): Misconfigured slave: MASTER_HOST was not set; Fix in config file or with CHANGE MASTER TO 10.5.10-dbg>SET @@debug_dbug="d,simulate_find_log_pos_error"; Query OK, 0 rows affected (0.000 sec) 10.5.10-dbg>CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS; ERROR 1373 (HY000): Target log not found in binlog index 10.5.10-dbg>FLUSH LOGS; ERROR 1026 (HY000): Can't open file: 'qa-roel-2-relay-bin' (errno: 27 "File too large") 10.5.10-dbg>CHANGE MASTER TO MASTER_USE_GTID=SLAVE_POS; ERROR 2013 (HY000): Lost connection to MySQL server during query 10.5.10-dbg> Idem for changing 'SLAVE_POS' to 'NO'. All with the same assert. At first glance does not look like a bug, though it is interesting that it only happens after the FLUSH. A quick look by Elkin is likely all that is needed to see if there is any real bug here.

          Hi Andrei, this is ready for review.

          Commit:
          7b56c4d

          Buildbot:
          bb-10.2-MDEV-25284

          Assessment:

          This patch addresses two issues.

          First, if a CHANGE MASTER command is issued and an error happens while locating the replica’s relay logs, the logs can be put into an invalid state where future updates fail and future CHANGE MASTER calls crash the server. More specifically, right before a replica purges the relay logs (part of the `CHANGE MASTER TO` logic), the relay log is temporarily closed with state LOG_TO_BE_OPENED. If the server errors in-between the temporary log closure and purge, i.e. during the function find_log_pos, the log should be closed. The assertion issue reported by this JIRA is due to the log state not properly closing on the simulated error in find_log_pos.

          Second, upon issuing a RESET SLAVE ALL command, a slave’s GTID filters are not cleared (DO_DOMAIN_IDS, IGNORE_DOMIAN_IDS, IGNORE_SERVER_IDS). MySQL had a similar bug report, Bug #18816897, which fixed this issue to clear IGNORE_SERVER_IDS after issuing RESET SLAVE ALL in version 5.7.

          To fix the first problem, the CHANGE MASTER error handling logic was extended to transition the relay log state to LOG_CLOSED from LOG_TO_BE_OPENED. To fix the second problem, the RESET SLAVE ALL logic is extended to clear the domain_id filter and ignore_server_ids.

          bnestere Brandon Nesterenko added a comment - Hi Andrei, this is ready for review. Commit : 7b56c4d Buildbot : bb-10.2-MDEV-25284 Assessment : This patch addresses two issues. First, if a CHANGE MASTER command is issued and an error happens while locating the replica’s relay logs, the logs can be put into an invalid state where future updates fail and future CHANGE MASTER calls crash the server. More specifically, right before a replica purges the relay logs (part of the `CHANGE MASTER TO` logic), the relay log is temporarily closed with state LOG_TO_BE_OPENED. If the server errors in-between the temporary log closure and purge, i.e. during the function find_log_pos, the log should be closed. The assertion issue reported by this JIRA is due to the log state not properly closing on the simulated error in find_log_pos. Second, upon issuing a RESET SLAVE ALL command, a slave’s GTID filters are not cleared (DO_DOMAIN_IDS, IGNORE_DOMIAN_IDS, IGNORE_SERVER_IDS). MySQL had a similar bug report, Bug #18816897, which fixed this issue to clear IGNORE_SERVER_IDS after issuing RESET SLAVE ALL in version 5.7. To fix the first problem, the CHANGE MASTER error handling logic was extended to transition the relay log state to LOG_CLOSED from LOG_TO_BE_OPENED. To fix the second problem, the RESET SLAVE ALL logic is extended to clear the domain_id filter and ignore_server_ids.
          Elkin Andrei Elkin added a comment -

          c850e71 is good enough. Thanks for the polishing part as well.

          Elkin Andrei Elkin added a comment - c850e71 is good enough. Thanks for the polishing part as well.

          Closing as fixed. Pushed to 10.2 as 2291f8ef73489fb8ed79768484df1ee4db3583a7.

          No merge-conflicts or build/test issues from 10.3-10.7 (tested via manual git cherry-pick into each version).

          bnestere Brandon Nesterenko added a comment - Closing as fixed. Pushed to 10.2 as 2291f8ef73489fb8ed79768484df1ee4db3583a7 . No merge-conflicts or build/test issues from 10.3-10.7 (tested via manual git cherry-pick into each version).

          People

            bnestere Brandon Nesterenko
            ramesh Ramesh Sivaraman
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.