MTR prints detailed stack trace unconditionally, etc (MDEV-28931)

[MDEV-29023] MTR hangs after multiple failures Created: 2022-07-04  Updated: 2023-03-13  Resolved: 2022-07-18

Status: Closed
Project: MariaDB Server
Component/s: Scripts & Clients
Affects Version/s: 10.3, 10.9
Fix Version/s: 10.3.36, 10.4.26

Type: Technical task Priority: Major
Reporter: Aleksey Midenkov Assignee: Aleksey Midenkov
Resolution: Fixed Votes: 0
Labels: None

Attachments: Zip Archive a.zip     Zip Archive b.zip    
Issue Links:
Relates
relates to MDEV-30836 MTR hangs after tests have completed Closed

 Description   

Reproduce

1. Patch for INSERT to crash always:

--- a/sql/sql_insert.cc
+++ b/sql/sql_insert.cc
@@ -715,6 +715,8 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
   List_item *values;
   Name_resolution_context *context;
   Name_resolution_context_state ctx_state;
+  //DBUG_ASSERT(!opt_bootstrap);
+  DBUG_ASSERT(opt_bootstrap);
 #ifndef EMBEDDED_LIBRARY
   char *query= thd->query();
   /*

2. Unpack a.zip into mysql-test/suites and run

mtr --fast --reorder --parallel=10 --suite=a

Usually reproducible with less tests from b.zip



 Comments   
Comment by Marko Mäkelä [ 2023-03-08 ]

Unfortunately, while a test of 10.3 1848804840f5595f982c4cd502ba2112f6dd7911 did not hang, anything else where this was supposedly fixed (including 10.5, 10.8, 10.9) would still hang.

Here is an example of 10.9 f53f64b7b9edaef8e413add322225dc33ebc8131 (the first 10.9 revision that includes the fix):

diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
index 283dd463fa8..872a1643e5c 100644
--- a/sql/sql_insert.cc
+++ b/sql/sql_insert.cc
@@ -713,6 +713,7 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
   SELECT_LEX *returning= thd->lex->has_returning() ? thd->lex->returning() : 0;
   unsigned char *readbuff= NULL;
 
+  DBUG_ASSERT(opt_bootstrap);
 #ifndef EMBEDDED_LIBRARY
   char *query= thd->query();
   /*

With this patch applied and an invocation of

nice ./mtr --parallel=60 --big-test --force --max-test-fail=0 --skip-core-file

the test would hang, with a few mariadbd processes still active but sleeping. After

killall -QUIT mariadbd

the server processes would terminate, but the mtr test driver would not react to them going away.

I got interested in this because there was a bogus failure on buildbot.mariadb.org that looked like a hang:

bb-10.6-MDEV-26055-MDEV-26827 632e745f906630fc5d2682d837a87275d5ce2868

sysschema.v_innodb_lock_waits 'innodb'   w3 [ pass ]     23
unit.conc_cursor                         w1 [ pass ]      2
innodb.skip_locked_nowait 'innodb'       w11 [ pass ]   3014
unit.conc_performance                    w8 [ pass ]      9
sys_vars.auto_increment_offset_basic     w7 [ pass ]      4
unit.conc_ps                             w12 [ pass ]    604
unit.conc_ps_bugs                        w9 [ pass ]      5
innodb.instant_alter_bugs 'innodb'       w4 [ pass ]  11250
unit.conc_misc                           w6 [ pass ]      7
innodb.log_file_name 'innodb'            w13 [ pass ]  11419
unit.conc_connection                     w5 [ pass ]      3
stress.ddl_myisam                        w10 [ pass ]  31795
innodb.log_file 'innodb'                 w14 [ pass ]  23531
innodb.log_corruption 'innodb'           w2 [ pass ]  39422
 
command timed out: 600 seconds without output running [b'sh', b'-c', b'cd mysql-test && exec perl mysql-test-run.pl --verbose-restart --force --retry=3 --max-save-core=1 --max-save-datadir=1 --max-test-fail=20 --mem --parallel=$(expr 7 \\* 2) '], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1214.928726

In the file mysql-test/var/3/log/mysqld.1.err we do have a shutdown message as well as a message about a test that is not listed in the mtr output log above:

CURRENT_TEST: sysschema.v_innodb_lock_waits
CURRENT_TEST: sysschema.v_schema_table_lock_waits
2023-03-07 16:32:47 0 [Note] /home/buildbot/amd64-rhel-7/build/sql/mariadbd (initiated by: unknown): Normal shutdown
2023-03-07 16:32:47 0 [Note] InnoDB: FTS optimize thread exiting.
2023-03-07 16:32:47 0 [Note] InnoDB: Starting shutdown...
2023-03-07 16:32:47 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/var_auto_zzmE/3/mysqld.1/data/ib_buffer_pool
2023-03-07 16:32:47 0 [Note] InnoDB: Restricted to 126 pages due to innodb_buf_pool_dump_pct=25
2023-03-07 16:32:47 0 [Note] InnoDB: Buffer pool(s) dump completed at 230307 16:32:47
2023-03-07 16:32:48 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-03-07 16:32:48 0 [Note] InnoDB: Shutdown completed; log sequence number 21530679; transaction id 116
2023-03-07 16:32:48 0 [Note] /home/buildbot/amd64-rhel-7/build/sql/mariadbd: Shutdown complete

I do not know if this anomaly is related to the hang, but I am willing to believe so.

Generated at Thu Feb 08 10:05:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.