[MDEV-30693] Assertion `dbl_records <= s->records' failed in apply_selectivity_for_table on SELECT Created: 2023-02-21  Updated: 2023-03-21  Resolved: 2023-03-17

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: N/A
Fix Version/s: 11.0.2

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: regression


 Description   

Likely caused by MDEV-26974

--source include/have_innodb.inc
SET optimizer_use_condition_selectivity=1;
CREATE TABLE t (c INT KEY) ENGINE=InnoDB;
SELECT * FROM (SELECT * FROM t) a JOIN (SELECT * FROM (SELECT * FROM t GROUP BY c) d WHERE c>1) b ON a.c=b.c;

Leads to:

11.0.1 f2dc4d4c10ac36a73b5c1eb765352d3aee808d66 (Debug)

mariadbd: /test/11.0_dbg/sql/sql_select.cc:7834: double apply_selectivity_for_table(JOIN_TAB*, uint): Assertion `dbl_records <= s->records' failed.

11.0.1 f2dc4d4c10ac36a73b5c1eb765352d3aee808d66 (Debug)

Core was generated by `/test/MD180223-mariadb-11.0.1-linux-x86_64-dbg/bin/mariadbd --no-defaults --cor'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=23214052087360)
    at ./nptl/pthread_kill.c:44
[Current thread is 1 (Thread 0x151cf1541640 (LWP 2410076))]
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=23214052087360) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=23214052087360) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=23214052087360, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x0000151d103d7476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x0000151d103bd7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000151d103bd71b in __assert_fail_base (fmt=0x151d10572150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x565411d7d84e "dbl_records <= s->records", file=0x565411d7ea68 "/test/11.0_dbg/sql/sql_select.cc", line=7834, function=<optimized out>) at ./assert/assert.c:92
#6  0x0000151d103cee96 in __GI___assert_fail (assertion=0x565411d7d84e "dbl_records <= s->records", file=0x565411d7ea68 "/test/11.0_dbg/sql/sql_select.cc", line=7834, function=0x565411d7eb38 "double apply_selectivity_for_table(JOIN_TAB*, uint)") at ./assert/assert.c:101
#7  0x00005654112a6afc in apply_selectivity_for_table (s=s@entry=0x151c30076a90, use_cond_selectivity=use_cond_selectivity@entry=1) at /test/11.0_dbg/sql/sql_select.cc:7834
#8  0x00005654112bbf5f in best_access_path (join=0x151c30027c98, s=s@entry=0x151c30076a90, remaining_tables=remaining_tables@entry=3, join_positions=0x151c300770b0, idx=idx@entry=0, disable_jbuf=true, disable_jbuf@entry=false, record_count=record_count@entry=1, pos=0x151c30077678, loose_scan_pos=0x151c300777a0) at /test/11.0_dbg/sql/sql_select.cc:9083
#9  0x00005654112bde5f in get_costs_for_tables (join=join@entry=0x151c30027c98, remaining_tables=remaining_tables@entry=3, idx=idx@entry=0, record_count=record_count@entry=1, trace_one_table=trace_one_table@entry=0x151cf153f060, pos=0x151c30077080, pos@entry=0x151c30077078, store_position=store_position@entry=0x151cf153f048, allowed_tables=<optimized out>, stop_on_eq_ref=<optimized out>) at /test/11.0_dbg/sql/sql_select.cc:10789
#10 0x00005654112c57ac in best_extension_by_limited_search (join=join@entry=0x151c30027c98, remaining_tables=remaining_tables@entry=3, idx=idx@entry=0, record_count=record_count@entry=1, read_time=read_time@entry=0, search_depth=search_depth@entry=62, use_cond_selectivity=use_cond_selectivity@entry=1, processed_eq_ref_tables=processed_eq_ref_tables@entry=0x151cf153f1a0) at /test/11.0_dbg/sql/sql_select.cc:11046
#11 0x00005654112c8cfa in greedy_search (use_cond_selectivity=1, search_depth=62, remaining_tables=3, join=0x151c30027c98) at /test/11.0_dbg/sql/sql_select.cc:10017
#12 choose_plan (join=join@entry=0x151c30027c98, join_tables=<optimized out>, emb_sjm_nest=emb_sjm_nest@entry=0x0) at /test/11.0_dbg/sql/sql_select.cc:9538
#13 0x00005654112ee007 in make_join_statistics (join=join@entry=0x151c30027c98, tables_list=@0x151c30013490: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x151c3006f3d0, last = 0x151c3006f3e0, elements = 2}, <No data fields>}, keyuse_array=keyuse_array@entry=0x151c30028000) at /test/11.0_dbg/sql/sql_select.cc:6073
#14 0x00005654112f54b9 in JOIN::optimize_inner (this=this@entry=0x151c30027c98) at /test/11.0_dbg/sql/sql_select.cc:2569
#15 0x00005654112f59bc in JOIN::optimize (this=this@entry=0x151c30027c98) at /test/11.0_dbg/sql/sql_select.cc:1897
#16 0x00005654112f5ac5 in mysql_select (thd=thd@entry=0x151c30000d58, tables=0x151c30014e20, fields=@0x151c30013530: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x151c30013848, last = 0x151c3006f670, elements = 2}, <No data fields>}, conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2164525824, result=0x151c30027c70, unit=0x151c30004fa0, select_lex=0x151c30013278) at /test/11.0_dbg/sql/sql_select.cc:5132
#17 0x00005654112f628b in handle_select (thd=thd@entry=0x151c30000d58, lex=lex@entry=0x151c30004ec8, result=result@entry=0x151c30027c70, setup_tables_done_option=setup_tables_done_option@entry=0) at /test/11.0_dbg/sql/sql_select.cc:608
#18 0x000056541125be8d in execute_sqlcom_select (thd=thd@entry=0x151c30000d58, all_tables=0x151c30014e20) at /test/11.0_dbg/sql/sql_parse.cc:6267
#19 0x00005654112674af in mysql_execute_command (thd=thd@entry=0x151c30000d58, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/11.0_dbg/sql/sql_parse.cc:3949
#20 0x000056541126e7cf in mysql_parse (thd=thd@entry=0x151c30000d58, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x151cf15402c0) at /test/11.0_dbg/sql/sql_parse.cc:8002
#21 0x0000565411270963 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x151c30000d58, packet=packet@entry=0x151c3000ae19 "SELECT * FROM (SELECT * FROM t) a JOIN (SELECT * FROM (SELECT * FROM t GROUP BY c) d WHERE c>1) b ON a.c=b.c", packet_length=packet_length@entry=108, blocking=blocking@entry=true) at /test/11.0_dbg/sql/sql_class.h:242
#22 0x00005654112727bc in do_command (thd=0x151c30000d58, blocking=blocking@entry=true) at /test/11.0_dbg/sql/sql_parse.cc:1407
#23 0x00005654113c36e2 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x565413ae41b8, put_in_cache=put_in_cache@entry=true) at /test/11.0_dbg/sql/sql_connect.cc:1416
#24 0x00005654113c3941 in handle_one_connection (arg=0x565413ae41b8) at /test/11.0_dbg/sql/sql_connect.cc:1318
#25 0x0000151d10429b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#26 0x0000151d104bba00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Bug confirmed present in:
MariaDB: 11.0.1 (dbg)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.3.38 (dbg), 10.3.38 (opt), 10.4.29 (dbg), 10.4.29 (opt), 10.5.20 (dbg), 10.5.20 (opt), 10.6.13 (dbg), 10.6.13 (opt), 10.7.8 (dbg), 10.7.8 (opt), 10.8.8 (dbg), 10.8.8 (opt), 10.9.6 (dbg), 10.9.6 (opt), 10.10.4 (dbg), 10.10.4 (opt), 10.11.2 (dbg), 10.11.2 (opt), 11.0.1 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.40 (dbg), 5.7.40 (opt), 8.0.31 (dbg), 8.0.31 (opt)



 Comments   
Comment by Sergei Petrunia [ 2023-02-21 ]

We hit this assert in apply_selectivity_for_table():

=>  DBUG_ASSERT(dbl_records <= s->records);

we have

(gdb) print dbl_records
  $138 = 2
(gdb) p s->records
  $140 = 1

also

(gdb) p s->table->opt_range_condition_rows
  $142 = 2
(gdb) p s->found_records
  $144 = 2

The table is :

(gdb) p s->table->alias.Ptr
  $136 = 0x7fff5c01b4b0 "d" 

It is a derived table:

... (SELECT * FROM t GROUP BY c) d ...

Where did we get s->records=1 from ?

Comment by Sergei Petrunia [ 2023-02-21 ]

Ok, the failure scenario

Source of the estimate of records=2

Initially, the number of records is obtained in TABLE_LIST::fetch_number_of_rows():

    if (is_materialized_derived() && !fill_me)
    { 
=>    table->file->stats.records= get_unit()->result->est_records; 
      set_if_bigger(table->file->stats.records, 2);

Here

  • est_records=1 (this is correct),
  • stats.records is set to 2.

Then, JOIN_TAB::estimate_scan_time() sets JOIN_TAB::records based on that:

      records= table->stat_records();

This sets records=2.

Source of the estimate of records=1:

It comes from here:
First, splitting code sets spl_opt_info->unsplit_card

  #0  JOIN::add_keyuses_for_splitting (this=0x7fff6895b3d0) at /home/psergey/dev-git2/11.0/sql/opt_split.cc:744
  #1  0x0000555556228cbd in st_join_table::add_keyuses_for_splitting (this=0x7fff68967390) at /home/psergey/dev-git2/11.0/sql/opt_split.cc:810
  #2  0x0000555555fb85f3 in make_join_statistics (join=0x7fff680467f8, tables_list=@0x7fff68017f50: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x7fff6895c5c0,
  #3  0x0000555555fac5f8 in JOIN::optimize_inner (this=0x7fff680467f8) at /home/psergey/dev-git2/11.0/sql/sql_select.cc:2569

It sets:

=>  spl_opt_info->unsplit_card= join_record_count;

join_record_count=1.

Eventually, the st_join_table::choose_best_splitting sets JOIN_TAB::records to unsplit_card (when it figures we won't use splitting):

      /* Restore original values */
      startup_cost=      spl_opt_info->unsplit_cost;
      records= (ha_rows) spl_opt_info->unsplit_card;
=>    spl_plan= 0; 
    }
    return spl_plan;

  #0  st_join_table::choose_best_splitting (this=0x7fff68967390, record_count=1, remaining_tables=3) at /home/psergey/dev-git2/11.0/sql/opt_split.cc:1082
  #1  0x0000555555fbf0cc in best_access_path (join=0x7fff680467f8, s=0x7fff68967390, remaining_tables=3, join_positions=0x7fff689679b0, idx=0, disable_jbuf=true, record_count
  #2  0x0000555555fc68e6 in get_costs_for_tables (join=0x7fff680467f8, remaining_tables=3, idx=0, record_count=1, trace_one_table=0x7ffff005fef0, pos=0x7fff68967980, store_po
  #3  0x0000555555fc71e3 in best_extension_by_limited_search (join=0x7fff680467f8, remaining_tables=3, idx=0, record_count=1, read_time=0, search_depth=62, use_cond_selectivi
  #4  0x0000555555fc48cd in greedy_search (join=0x7fff680467f8, remaining_tables=3, search_depth=62, use_cond_selectivity=1) at /home/psergey/dev-git2/11.0/sql/sql_select.cc:
  #5  0x0000555555fc38c0 in choose_plan (join=0x7fff680467f8, join_tables=3, emb_sjm_nest=0x0) at /home/psergey/dev-git2/11.0/sql/sql_select.cc:9539
  #6  0x0000555555fb93e0 in make_join_statistics (join=0x7fff680467f8, tables_list=@0x7fff68017f50: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x7fff6895c5c0, 
  #7  0x0000555555fac5f8 in JOIN::optimize_inner (this=0x7fff680467f8) at /home/psergey/dev-git2/11.0/sql/sql_select.cc:2569

Conflict of estimates

Then, apply_selectivity_for_table computes

dbl_records= rows2double(s->table->opt_range_condition_rows);

There was no quick select, so s->table->opt_range_condition_rows was computed as equal to #records in the table, 2.

On the other hand, splitting code has just set s->records (JOIN_TAB::records) to 1.
So, we hit the assert

=>  DBUG_ASSERT(dbl_records <= s->records);  

.

The fix.

It doesn't seem like a good thing that splitting code sets JOIN_TAB::records depending on the current query plan. But changing that might require a lot of changes.

For now, lets just use the same logic when computing #records in the derived table.

Comment by Sergei Petrunia [ 2023-02-22 ]

bb-11.0-mdev30693. Oleg, please review.

Comment by Oleg Smirnov [ 2023-03-16 ]

Left a couple of comments at https://github.com/MariaDB/server/commit/301fd687f790a4ebfd1dc6b275926a6071895e6d.

Regarding the actual fix:

-  spl_opt_info->unsplit_card= join_record_count;
+  /*
+   Use the number of rows that was computed by
+    TABLE_LIST::fetch_number_of_rows():
+ */
+  spl_opt_info->unsplit_card=
+    select_lex->master_unit()->derived->table->stat_records();

I'm not sure I'm the right person to approve it, maybe the author of that code (Igor Babaev) can do it better?

The usage of the magic number "2" looks suspicious to me:

set_if_bigger(table->file->stats.records, 2);

I don't understand that logic when stats.records=1 becomes 2. Maybe we should target estimated records count to be 1 instead of 2?

One more observation: the issue isn't reproduced on bb-11.0-MDEV-8320 which was branched off 11.0 about a month or two ago. So I guess this can be a regression introduced with some of recent commits to 11.0. I can help investigate this version if it's needed.

Comment by Oleg Smirnov [ 2023-03-16 ]

One more comment regarding Windows build: https://buildbot.mariadb.net/buildbot/builders/winx64-debug/builds/36086/steps/build/logs/stdio.

    83>D:\winx64-debug\build\src\sql\opt_split.cc(749,60): error C2220: the following warning is treated as an error [D:\winx64-debug\build\sql\sql.vcxproj]
D:\winx64-debug\build\src\sql\opt_split.cc(749,60): error C2220:     select_lex->master_unit()->derived->table->stat_records(); [D:\winx64-debug\build\sql\sql.vcxproj]
D:\winx64-debug\build\src\sql\opt_split.cc(749,60): error C2220:                                                            ^ [D:\winx64-debug\build\sql\sql.vcxproj]
    83>D:\winx64-debug\build\src\sql\opt_split.cc(749,60): warning C4244: '=': conversion from 'ha_rows' to 'double', possible loss of data [D:\winx64-debug\build\sql\sql.vcxproj]
         proxy_protocol.cc
         backup.cc
         xa.cc
         threadpool_win.cc
         threadpool_winsockets.cc
         threadpool_generic.cc
         threadpool_common.cc
         handle_connections_win.cc
         winmain.cc

Comment by Sergei Petrunia [ 2023-03-20 ]

The

set_if_bigger(table->file->stats.records, 2);

was introduced in

commit 709a0a131021135e9fb7a2095fcfcbc223dfb126
Author: Igor Babaev <igor@askmonty.org>
Date:   Wed May 26 13:18:18 2010 -0700
 
    MWL#106: Backport optimizations for derived tables and views.
    The main consolidated patch.

I think the intent of this is to avoid the temporary table to be treated as a constant table. (HEAP or MyISAM table with one row
may be treated as a constant)

I don't think this should be fixed as part of this MDEV.

Comment by Sergei Petrunia [ 2023-03-20 ]

(Pushed a commit addressing the input in github)

Generated at Thu Feb 08 10:18:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.