MariaDB Server / MDEV-31591

Assertion `!error' fails in ha_partition::delete_row upon query interruption



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.4, 10.5, 10.6, 10.9, 10.10, 10.11, 11.0, 11.1
    • Fix Version/s: 10.4, 10.5, 10.6, 10.11, 11.1
    • Component/s: Partitioning
    • Labels: None



      The basic test case here is very fragile. The failure happens when the query hits the max_statement_time timeout at a particular stage of execution (a plain KILL with the right timing would probably have the same effect). The timing window is narrow, though. If the query is interrupted too early, it simply fails with the normal ER_STATEMENT_TIMEOUT. If the timeout is too long, the query succeeds. So any fixed value of max_statement_time in the test case would be wrong for some builds and machines.

      The reproducer below tries to work around this by running the query in a loop with increasing max_statement_time values, from 0.000001 to 10, in powers of 10. If the query fails with ER_STATEMENT_TIMEOUT, max_statement_time is too low and needs to be increased. If the query succeeds, the required value of max_statement_time has already been exceeded, so there is no point in continuing: the test execution fails with "succeeded - should have failed with error ...", which means that the reported failure was not reproduced.
      I expect this to be sufficient for most builds and environments, but you may need finer granularity. If the test case fails for you with "succeeded - should have failed with error ...", check the last max_statement_time it used and try intermediate values between that one and the previous one. For example, if the query ends with ER_STATEMENT_TIMEOUT with max_statement_time=0.1 but succeeds with max_statement_time=1, try different values between 0.1 and 1.
      If the test passes (this should never happen, because the query is quick while the last iteration expects it to run for at least 10 seconds), try values above 10 seconds.
      Do not put this test case into the regression test suite! Hopefully, once the cause of the problem is known, it will be easy enough to synchronize the test properly.

      --source include/have_partition.inc
      --source include/have_sequence.inc
      CREATE TABLE t1 (a DATE, b INT)
       PARTITION BY HASH (to_seconds(a));
      INSERT INTO t1 SELECT '0000-00-00', seq from seq_1_to_5000;
      CREATE TABLE t2 (c INT, d DATE);
      INSERT INTO t2 VALUES (199,'0000-00-00'),(200,'0000-00-00');
      CREATE TABLE t3 (e INT);
      INSERT INTO t3 SELECT seq FROM seq_1_to_50;
      CREATE TABLE t4 (f int);
      INSERT INTO t4 SELECT seq FROM seq_1_to_1500;
      --let $iterations= 8
      while ($iterations)
      {
        # max_statement_time goes 0.000001, 0.00001, ... 10 as $iterations counts down
        --eval SET max_statement_time= POW(10,2-$iterations)
        SELECT @@max_statement_time;
        --error ER_STATEMENT_TIMEOUT
        DELETE t1 FROM t1 LEFT JOIN t2 ON ( t1.a <> t2.d ) WHERE t2.c <= ALL ( SELECT t4.f FROM t3, t4 WHERE t4.f <= t1.b );
        --dec $iterations
      }
      # Cleanup
      DROP TABLE t1, t2, t3, t4;
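
      For the finer-grained search described above, the loop in the test case could be swapped for something like the following. This is only a sketch in mysqltest syntax, not part of the original reproducer: the variable name $steps and the bounds 0.1 and 1 are assumptions taken from the example in the description, so adjust them to the last two values your own run printed. It reuses the tables t1..t4 created by the test case.

      ```sql
      # Hypothetical refinement: replaces the while loop of the test case.
      # Assumes the query timed out at 0.1 and succeeded at 1.
      --let $steps= 9
      while ($steps)
      {
        # Tries 0.1, 0.2, ... 0.9 as $steps counts down from 9 to 1
        --eval SET max_statement_time= 1 - 0.1 * $steps
        SELECT @@max_statement_time;
        --error ER_STATEMENT_TIMEOUT
        DELETE t1 FROM t1 LEFT JOIN t2 ON ( t1.a <> t2.d ) WHERE t2.c <= ALL ( SELECT t4.f FROM t3, t4 WHERE t4.f <= t1.b );
        --dec $steps
      }
      ```

      As with the original loop, a run that ends with "succeeded - should have failed with error ..." means the window lies between the last two values tried, and the bounds can be narrowed again.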

      10.4 f5dceafd

      mysqld: /data/src/10.4/sql/ha_partition.cc:4631: virtual int ha_partition::delete_row(const uchar*): Assertion `!error' failed.
      230630 14:51:27 [ERROR] mysqld got signal 6 ;
      #9  0x00007fcdd1253df2 in __GI___assert_fail (assertion=0x561a4f9bb940 "!error", file=0x561a4f9b7060 "/data/src/10.4/sql/ha_partition.cc", line=4631, function=0x561a4f9bbb40 "virtual int ha_partition::delete_row(const uchar*)") at ./assert/assert.c:101
      #10 0x0000561a4e3ec233 in ha_partition::delete_row (this=0x61d0002044a8, buf=0x6190000857d0 "\371") at /data/src/10.4/sql/ha_partition.cc:4631
      #11 0x0000561a4dc28f4c in handler::ha_delete_row (this=0x61d0002044a8, buf=0x6190000857d0 "\371") at /data/src/10.4/sql/handler.cc:6965
      #12 0x0000561a4e08035a in TABLE::delete_row (this=0x62000003c088) at /data/src/10.4/sql/sql_delete.cc:292
      #13 0x0000561a4e07cdf7 in multi_delete::send_data (this=0x62b000066ed8, values=...) at /data/src/10.4/sql/sql_delete.cc:1369
      #14 0x0000561a4d5a271e in end_send (join=0x62b000066f50, join_tab=0x629000269208, end_of_records=false) at /data/src/10.4/sql/sql_select.cc:22084
      #15 0x0000561a4d59b6c8 in evaluate_null_complemented_join_record (join=0x62b000066f50, join_tab=0x629000268e60) at /data/src/10.4/sql/sql_select.cc:21238
      #16 0x0000561a4d599bf5 in sub_select (join=0x62b000066f50, join_tab=0x629000268e60, end_of_records=false) at /data/src/10.4/sql/sql_select.cc:20933
      #17 0x0000561a4d59aba3 in evaluate_join_record (join=0x62b000066f50, join_tab=0x629000268ab8, error=0) at /data/src/10.4/sql/sql_select.cc:21116
      #18 0x0000561a4d599af8 in sub_select (join=0x62b000066f50, join_tab=0x629000268ab8, end_of_records=false) at /data/src/10.4/sql/sql_select.cc:20928
      #19 0x0000561a4d59732e in do_select (join=0x62b000066f50, procedure=0x0) at /data/src/10.4/sql/sql_select.cc:20412
      #20 0x0000561a4d5261bd in JOIN::exec_inner (this=0x62b000066f50) at /data/src/10.4/sql/sql_select.cc:4605
      #21 0x0000561a4d5237c4 in JOIN::exec (this=0x62b000066f50) at /data/src/10.4/sql/sql_select.cc:4387
      #22 0x0000561a4d527856 in mysql_select (thd=0x62b00005b208, tables=0x62b000062b40, wild_num=0, fields=..., conds=0x62b000066c60, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=3489926016, result=0x62b000066ed8, unit=0x62b00005f140, select_lex=0x62b00005f970) at /data/src/10.4/sql/sql_select.cc:4826
      #23 0x0000561a4d45bf46 in mysql_execute_command (thd=0x62b00005b208) at /data/src/10.4/sql/sql_parse.cc:4875
      #24 0x0000561a4d471463 in mysql_parse (thd=0x62b00005b208, rawbuf=0x62b000062228 "DELETE t1 FROM t1 LEFT JOIN t2 ON ( t1.a <> t2.d ) WHERE t2.c <= ALL ( SELECT t4.f FROM t3, t4 WHERE t4.f <= t1.b )", length=115, parser_state=0x7fcdc95b9860, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:8008
      #25 0x0000561a4d4477a6 in dispatch_command (command=COM_QUERY, thd=0x62b00005b208, packet=0x629000230209 "DELETE t1 FROM t1 LEFT JOIN t2 ON ( t1.a <> t2.d ) WHERE t2.c <= ALL ( SELECT t4.f FROM t3, t4 WHERE t4.f <= t1.b )", packet_length=115, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1857
      #26 0x0000561a4d444315 in do_command (thd=0x62b00005b208) at /data/src/10.4/sql/sql_parse.cc:1378
      #27 0x0000561a4d8430ba in do_handle_one_connection (connect=0x6080000009a8) at /data/src/10.4/sql/sql_connect.cc:1420
      #28 0x0000561a4d8429d1 in handle_one_connection (arg=0x6080000009a8) at /data/src/10.4/sql/sql_connect.cc:1324
      #29 0x0000561a4e4afaee in pfs_spawn_thread (arg=0x615000003508) at /data/src/10.4/storage/perfschema/pfs.cc:1869
      #30 0x00007fcdd12a7fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #31 0x00007fcdd13285bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

      Query (0x62b000062228): DELETE t1 FROM t1 LEFT JOIN t2 ON ( t1.a <> t2.d ) WHERE t2.c <= ALL ( SELECT t4.f FROM t3, t4 WHERE t4.f <= t1.b )
      Connection ID (thread ID): 4
      Status: KILL_TIMEOUT

      Reproducible with at least MyISAM, InnoDB and Aria, on all existing versions, including earlier minor releases.
      The same assertion fails in MDEV-30067; however, there the failure is apparently specific to Spider (or even to Spider not having a partition for a certain value in the underlying table). The two may be related in that in both cases the query encounters a problem in the middle of execution.




            Assignee: holyfoot Alexey Botchkov
            Reporter: elenst Elena Stepanova