[MDEV-24302] RESET MASTER hangs as Innodb does not report on binlog checkpoint - Jira

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.5, 10.6
Fix Version/s: 10.5.10
Component/s: Replication, Storage Engine - InnoDB
Labels:
- affects-tests

Description

The failure was seen in http://buildbot.askmonty.org/buildbot/builders/winx64-debug/builds/22164
at commit 657fcdf430f39a3103dff51a6a2b2bd3, with the following stack trace:

server!inline_mysql_cond_wait(struct st_mysql_cond * that = 0x00007ffb`ec653b68, struct st_mysql_mutex * mutex = 0x00007ffb`ec653b38, char * src_file = 0x00007ffb`ea98b140 "D:\winx64-debug\build\src\sql\log.cc", unsigned int src_line = 0x10f8) [D:\winx64-debug\build\src\include\mysql\psi\mysql_thread.h @ 1222]

server!MYSQL_BIN_LOG::reset_logs(class THD * thd = 0x00000165`d43d15c8, bool create_new_log = true, struct rpl_gtid * init_state = 0x00000000`00000000, unsigned int init_state_len = 0, unsigned long next_log_number = 0) [D:\winx64-debug\build\src\sql\log.cc @ 4345]

server!reset_master(class THD * thd = 0x00000165`d43d15c8, struct rpl_gtid * init_state = 0x00000000`00000000, unsigned int init_state_len = 0, unsigned long next_log_number = 0) [D:\winx64-debug\build\src\sql\sql_repl.cc @ 3966]

server!reload_acl_and_cache(class THD * thd = 0x00000165`d43d15c8, unsigned int64 options = 0x80, struct TABLE_LIST * tables = 0x00000000`00000000, int * write_to_binlog = 0x0000002a`c42fdc44) [D:\winx64-debug\build\src\sql\sql_reload.cc @ 362]

server!mysql_execute_command(class THD * thd = 0x00000165`d43d15c8) [D:\winx64-debug\build\src\sql\sql_parse.cc @ 5480]

server!mysql_parse(class THD * thd = 0x00000165`d43d15c8, char * rawbuf = 0x00000165`d44da280 "--- memory read error at address 0x00000165`d44da280 ---", unsigned int length = 0xc, class Parser_state * parser_state =

As this was never observed before, the recent commits to 10.5 need to be examined
to rule out a missed commit_checkpoint_notify_ha() invocation from innobase_mysql_log_notify(), which is one of the possibilities.
657fcdf430f and 7b1252c03d7 are rated as potentially relevant to the failure.
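
To illustrate why a missed notification would show up as a hang inside a condition wait, here is a minimal C++ sketch. It is not the server code: `CheckpointWaiter`, `request()`, `notify_done()`, `wait_all_done()` and `reset_completes()` are hypothetical names, and the wait uses a timeout only so that the sketch terminates.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the binlog side: counts checkpoint requests
// that the storage engine has not yet acknowledged.
class CheckpointWaiter {
public:
  // Called when a binlog checkpoint is requested from the engine.
  void request() {
    std::lock_guard<std::mutex> guard(mutex_);
    ++pending_;
  }

  // Called by the engine once the checkpoint is durable. In this sketch it
  // plays the role that commit_checkpoint_notify_ha() plays in the report.
  void notify_done() {
    std::lock_guard<std::mutex> guard(mutex_);
    assert(pending_ > 0);
    --pending_;
    cond_.notify_all();
  }

  // Models the wait in RESET MASTER, with a timeout added for the sketch.
  // Returns true if every request was acknowledged before the timeout.
  bool wait_all_done(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cond_.wait_for(lock, timeout, [this] { return pending_ == 0; });
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  unsigned pending_ = 0;
};

// Issues one request; the engine thread acknowledges it only when
// engine_notifies is true. Returns whether the wait completed.
bool reset_completes(bool engine_notifies) {
  CheckpointWaiter waiter;
  waiter.request();
  std::thread engine([&waiter, engine_notifies] {
    if (engine_notifies) waiter.notify_done();
  });
  const bool done = waiter.wait_all_done(std::chrono::milliseconds(200));
  engine.join();
  return done;
}
```

With `engine_notifies` set to false the wait only ends because of the timeout. A wait without a timeout, as in the stack trace above, would block indefinitely in that case.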

A similar failure

main.mysqldump-max 'innodb'              w2 [ fail ]  timeout after 900 seconds

exists in http://buildbot.askmonty.org/buildbot/builders/win32-debug/builds/18570/steps/test/logs/stdio

Upon further analysis, the most probable suspect is identified as the unlocked NULL check that the following diff removes:

--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -4444,12 +4444,6 @@ innobase_mysql_log_notify(
 	struct pending_checkpoint *	entry;
 	struct pending_checkpoint *	last_ready;
-	/* It is safe to do a quick check for NULL first without lock.
-	Even if we should race, we will at most skip one checkpoint and
-	take the next one, which is harmless. */
-	if (!pending_checkpoint_list)
-		return;

~~To be eliminated (the pasted block is from fixes being tested). Removing it won't hurt performance in normal cases, when binlog rotation is not a frequent event (say, not several per second).~~
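
As a rough illustration of the pattern that remains once the unlocked early return is gone, the following C++ sketch inspects the pending list only while holding the mutex. It is a simplified model, not the InnoDB code: `PendingCheckpoint`, `PendingCheckpointList`, the `cookie` field and the method names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical model of one queued checkpoint request.
struct PendingCheckpoint {
  std::uint64_t lsn;  // log position that must be durable first
  int cookie;         // opaque token handed back on completion
};

class PendingCheckpointList {
public:
  // Registers a request that waits for `lsn` to become durable.
  // Requests are expected in non-decreasing LSN order.
  void add(std::uint64_t lsn, int cookie) {
    std::lock_guard<std::mutex> guard(mutex_);
    assert(pending_.empty() || pending_.back().lsn <= lsn);
    pending_.push_back({lsn, cookie});
  }

  // Called after the log is flushed up to `flushed_lsn`. The emptiness
  // test happens only while holding the mutex; there is no unlocked
  // early return. Returns the cookies of all requests that are now ready.
  std::vector<int> notify(std::uint64_t flushed_lsn) {
    std::vector<int> ready;
    std::lock_guard<std::mutex> guard(mutex_);
    while (!pending_.empty() && pending_.front().lsn <= flushed_lsn) {
      ready.push_back(pending_.front().cookie);
      pending_.pop_front();
    }
    return ready;
  }

  // Number of requests still waiting.
  std::size_t size() {
    std::lock_guard<std::mutex> guard(mutex_);
    return pending_.size();
  }

private:
  std::mutex mutex_;
  std::deque<PendingCheckpoint> pending_;
};
```

The cost of this variant is one uncontended mutex acquisition per call even when the list is empty, which is the overhead the removed quick check was avoiding.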

Issue Links

blocks
- MDEV-25611 RESET MASTER still causes the server to hang (Closed)

causes
- MDEV-25313 Assertion `pending == log_requests.start.load(std::memory_order_relaxed)' failed in log_flush_notify_and_unlock or crash in MYSQL_BIN_LOG::mark_xid_done (Closed)

is caused by
- MDEV-232 Remove one fsync() inside engine's commit() method (Closed)
- MDEV-532 Implement async commit checkpoint in InnoDB and XtraDB (Closed)

relates to
- MDEV-24526 binlog rotate via FLUSH LOGS may obsolate binlog file for recovery too early (Closed)
- MDEV-24886 main.sp_trans_log fails in buildbot with timeout (Open)

People

Assignee: Marko Mäkelä
Reporter: Andrei Elkin
Votes: 0
Watchers: 4

Dates

Created: 2020-11-27 20:22
Updated: 2021-05-06 17:01
Resolved: 2021-03-29 12:50

MariaDB Server