[MDEV-3530] LP:912290 - Assertion mi->io_thd == 0 fails in handle_slave_io() (slave.cc: 2501) Created: 2012-01-05  Updated: 2015-02-02  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Vladislav Vaintroub Assignee: Kristian Nielsen
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug912290.xml    

 Description   

Can be seen here for example:
http://buildbot.askmonty.org/buildbot/builders/win2008r2-vs2010-amd64-debug/builds/988/steps/test_3/logs/stdio

This assertion fails intermittently in debug builds on Windows, crashing the server.



 Comments   
Comment by Vladislav Vaintroub [ 2012-01-05 ]

The callstack dump of all threads in the buildbot log shows 2 threads inside handle_slave_io()

  • the crashing one

mysqld!my_sigabrt_handler [c:\buildbot\win2008r2-vs2010-amd64-debug\build\sql\mysqld.cc @ 2206]
mysqld!raise [f:\dd\vctools\crt_bld\self_64_amd64\crt\src\winsig.c @ 593]
mysqld!abort [f:\dd\vctools\crt_bld\self_64_amd64\crt\src\abort.c @ 81]
mysqld!_wassert [f:\dd\vctools\crt_bld\self_64_amd64\crt\src\assert.c @ 336]
mysqld!handle_slave_io [c:\buildbot\win2008r2-vs2010-amd64-debug\build\sql\slave.cc @ 2501]

  • the one waiting for critical section

. 26 Id: 5710.54e4 Suspend: 0 Teb: 000007ff`fff8a000 Unfrozen
Priority: -1 Priority class: 32

ntdll!NtWaitForSingleObject
ntdll!RtlpWaitOnCriticalSection
ntdll!RtlEnterCriticalSection
mysqld!handle_slave_io(void * arg = 0x00000000`0b9d1760) [slave.cc @ 2781]
mysqld!pthread_start(void * param = 0x00000000`0b86c860) [my_winthread.c @ 90]

Comment by Kristian Nielsen [ 2012-02-13 ]

The problem was a race in an error path during slave IO thread startup.

If init_slave_thread() fails, the code would release mi->run_lock without ever having set mi->slave_running=1. This allows a subsequent START SLAVE to proceed even though the old slave thread is actually still running (although it will do nothing but clean up and shut down).

The assertion fires in the rare case where the test case has time to run START SLAVE after the old slave thread fails initialisation, but before that thread has re-acquired the lock and done its cleanup.

Fixed in 5.3.4 and 5.5.1.
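
The interleaving described above can be sketched as a small deterministic model. This is an illustration only, not the server code or the actual patch: MasterInfo, StartResult, and the three functions are hypothetical stand-ins, a plain bool stands in for mi->run_lock, and the keep_lock option shows just one possible way to close the window (the comment above does not say how the fix was implemented).

```cpp
// Hypothetical, simplified stand-in for the server's master info object.
// Only the fields named in the comment above are modelled.
struct MasterInfo {
    bool run_lock_held = false;    // stands in for mi->run_lock
    bool slave_running = false;    // stands in for mi->slave_running
    const void* io_thd = nullptr;  // stands in for mi->io_thd
};

enum class StartResult { Started, Blocked, AssertWouldFire };

// Models START SLAVE followed by the new IO thread's startup check.
StartResult try_start_slave(const MasterInfo& mi) {
    if (mi.run_lock_held || mi.slave_running)
        return StartResult::Blocked;          // START SLAVE cannot proceed yet
    if (mi.io_thd != nullptr)
        return StartResult::AssertWouldFire;  // mi->io_thd == 0 would fail
    return StartResult::Started;
}

// Models the old IO thread: it claims the slot, then initialisation fails
// before slave_running is ever set. With keep_lock == false the lock is
// released before cleanup, which is the racy path described in the ticket.
void old_thread_init_fails(MasterInfo& mi, const void* thd, bool keep_lock) {
    mi.run_lock_held = true;
    mi.io_thd = thd;
    mi.run_lock_held = keep_lock;
}

// Models the old IO thread's cleanup: re-acquire the lock (a no-op if it
// was kept), reset the shared state, then release the lock.
void old_thread_cleanup(MasterInfo& mi) {
    mi.run_lock_held = true;
    mi.io_thd = nullptr;
    mi.slave_running = false;
    mi.run_lock_held = false;
}
```

In this model, calling try_start_slave() between old_thread_init_fails(mi, thd, false) and old_thread_cleanup(mi) yields AssertWouldFire, which corresponds to the failure in the buildbot log. With keep_lock set to true the same call yields Blocked until cleanup has finished.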

Comment by Rasmus Johansson (Inactive) [ 2012-02-13 ]

Launchpad bug id: 912290
