[MDEV-31101] spider/bugfix.mdev_29904 fails with "Server shutdown in progress" in CI Created: 2023-04-21  Updated: 2024-01-11  Resolved: 2024-01-11

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.4, 10.5, 10.6
Fix Version/s: 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3

Type: Bug Priority: Major
Reporter: Yuchen Pei Assignee: Yuchen Pei
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MDEV-22979 "mysqld --bootstrap" / mysql_install_... Closed
is blocked by MDEV-29870 Backport fixes to spider init bugs to... Closed
Problem/Incident
is caused by MDEV-30581 Add a testcase for MDEV-29904 Closed

 Description   

E.g. https://buildbot.mariadb.net/buildbot/builders/kvm-asan/builds/10208/steps/mtr_nm/logs/stdio

I am not sure why this happens and I cannot reproduce it at the same commit. Here is the complete test:

--echo #
--echo # MDEV-29904 SPIDER plugin initialization fails upon startup
--echo #
 
--let $restart_parameters=--plugin-load-add=ha_spider
--source include/restart_mysqld.inc

According to [1], the error message means that some query is still running while the server is shutting down. Maybe mtr tries to shut down the server after --source include/restart_mysqld.inc while the spider init queries are still running in another thread? In that case, perhaps we need to introduce a wait at the end of the test.

[1] https://www.percona.com/blog/error-mysqld-sort-aborted-server-shutdown-in-progress/

To add a wait condition, perhaps we could use include/wait_condition.inc, but that requires an SQL statement that determines completion. Looking at spd_init_query.h, I don't see which statement could achieve this, so maybe we'll need to add an init query that signals completion.

On the other hand, if the spider init queries no longer run in a background thread, then this bug may be fixed automatically, so let's block this ticket on the init bug ticket MDEV-22979, where we are considering such a change.
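
For reference, the wait-condition idea could look roughly like the sketch below. This is only an illustration, not a tested patch: the marker table mysql.spider_init_done is hypothetical and stands in for whatever the extra "signal completion" init query would create. Nothing in spd_init_query.h creates such a table today.

```
--let $restart_parameters=--plugin-load-add=ha_spider
--source include/restart_mysqld.inc

# HYPOTHETICAL: assumes a new init query, executed last, creates the
# marker table mysql.spider_init_done. This table does not exist in
# the current spd_init_query.h; it is a placeholder for illustration.
--let $wait_condition=select count(*) = 1 from information_schema.tables where table_schema = 'mysql' and table_name = 'spider_init_done'
--source include/wait_condition.inc
```

include/wait_condition.inc re-runs the statement in $wait_condition until it returns a true value or the wait times out, so the test would not reach its end (and trigger the shutdown) while the init queries are still in flight.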



 Comments   
Comment by Yuchen Pei [ 2023-06-07 ]

Happened again today locally on a 10.5 patched with the commit for MDEV-30435, while running multiple tests with parallel=auto (mtr chose 8):

./mysql-test/mtr --parallel=auto --force --max-test-fail=0 spider/handler.direct_aggregate_part spider/handler.spider_fixes spider.direct_aggregate_part spider/bugfix.mdev_29904 spider.direct_aggregate spider.direct_aggregate_part spider.checksum_table_with_quick_mode_3 spider.basic_sql spider.basic_sql_part spider.auto_increment

Comment by Yuchen Pei [ 2023-06-28 ]

I'm going to disable mdev_29904.test until the present issue is no longer blocked by MDEV-22979.

Comment by Yuchen Pei [ 2023-11-28 ]

I have a patch on top of my current patches for MDEV-29870 which is
under review. It works locally - neither spider/bugfix.mdev_29904 nor
spider/bugfix.mdev_27575 fails. Will need to check CI.

e9ea9353d53 upstream/bb-10.4-mdev-31101 MDEV-31101 Re-enable spider/bugfix.mdev_29904

Comment by Yuchen Pei [ 2023-12-19 ]

Hi holyfoot, ptal this trivial patch, thanks

bb-10.4-mdev-31101 5710179d323e58e8f549fb48635be0600574b211
MDEV-31101 Re-enable spider/bugfix.mdev_29904
 
The spider init bug fixes remove any race conditions during spider
init.
 
Also remove the add_suppressions in spider/bugfix.mdev_27575 which is
a similar issue.

Comment by Alexey Botchkov [ 2024-01-06 ]

ok to push.

Comment by Yuchen Pei [ 2024-01-11 ]

pushed d277a63c749bd49c9025da4efdd3008d22180dc0 to 10.4
