[MDEV-33314] Crash inside calculate_cond_selectivity_for_table() with many columns Created: 2024-01-25  Updated: 2024-02-07  Resolved: 2024-02-03

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.6
Fix Version/s: 10.5.25, 10.6.18, 10.11.8, 11.0.6, 11.1.5, 11.2.4, 11.3.3, 11.4.2

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 2
Labels: None

Issue Links:
Blocks

 Description   

I will post a reduced testcase once I have it.

The symptom of the bug is

sql/sql_bitmap.h:77: ulonglong Bitmap<width>::bit_mask(uint) const [with
        unsigned int width = 64; ulonglong = long long unsigned int; uint = unsigned int]: Assertion `n < width' failed.

The stack trace:

  #3  0x00007ffff5221472 in __GI___assert_fail (assertion=0x555556c6cb4a "n < width", file=0x555556c6cb08 "/home/psergey/dev-git2/         MariaDBEnterprise-10.6.14/sql/sql_bitmap.h", line=77, function=0x555556c6d380 <Bitmap<64u>::bit_mask(unsigned int) const::                 __PRETTY_FUNCTION__> "ulonglong Bitmap<width>::bit_mask(uint) const [with unsigned int width = 64; ulonglong = long long unsigned int;     uint = unsigned int]") at assert.c:101
  #4  0x0000555555d94d37 in Bitmap<64u>::bit_mask (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/     sql_bitmap.h:77
  #5  0x0000555555e41d69 in Bitmap<64u>::set_bit (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/      sql_bitmap.h:109
  #6  0x00005555564541d2 in Item_bool_func::get_mm_parts (this=0x7fff3419e690, param=0x7ffff40f3c50, field=0x7fff3407a2d0, type=Item_func::EQ_FUNC, value=0x7fff3418edd8) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8659
  #7  0x0000555556453d11 in Item_equal::get_mm_tree (this=0x7fff3419e690, param=0x7ffff40f3c50, cond_ptr=0x7fff3419e7c0) at /home/psergey/ dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8568
  #8  0x00005555564530e5 in Item_cond_and::get_mm_tree (this=0x7fff3402a9c0, param=0x7ffff40f3c50, cond_ptr=0x7fff34185438) at /home/      psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8346
  #9  0x0000555556446c01 in calculate_cond_selectivity_for_table (thd=0x7fff34000d78, table=0x7fff340578a8, cond=0x7fff34185438) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3517
  #10 0x00007fff341c0d70 in ?? ()
  #11 0x00007fff341c0d90 in ?? ()
  #12 0x00007fff341c0db0 in ?? ()
  #13 0x00007fff341c0dd0 in ?? ()

In the testcase I'm trying, calculate_cond_selectivity_for_table() is trying to get estimates for 168 columns:

  Thread 14 "mysqld" hit Breakpoint 4, create_key_parts_for_pseudo_indexes (param=0x7ffff40f3c50, used_fields=0x7fff28057a78) at /home/    psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3183
(gdb) p keys
  $54 = 168

the range optimizer can handle at most 64 indexes.
(This is all I see so far. There may be other issues)



 Comments   
Comment by Sergei Petrunia [ 2024-01-26 ]

bb-10.6-MDEV-33314, please review.

Comment by Sergei Petrunia [ 2024-01-27 ]

ralf.gebhardt, the problem and the fix will be still valid for the ES.

There is only one thing: I might need to adjust the patch's testcase to use > 128 indexes (currently it uses 100). The rest of the patch is valid.

Comment by Michael Widenius [ 2024-01-28 ]

@spetruna, how do you plan to choose the fields that we should create a pseudo index for?
I assume we should first use fields that are used with an operator we can use with the range optimizer ?
It looks like table->cond_set that is used is already setting the bit's for 'field op constant'
If there are more than 128 of them, which ones to choose?
The first 128 that we find in create_key_parts_for_pseudo_index() ?

Another option would be to change SEL_TREE::keys_map to have an explicit key map allocated in get_mm_parts()
This would allows to have any number of pseudo keys.

Comment by Michael Widenius [ 2024-01-28 ]

Review comments:

In the test case, you should create MAX_INDEXES + 1 fields and conditions.

To do that, please add system variable, MAX_INDEXES. This can be genrally
usefull (simlar to version_malloc_library).

  • double rows;
    + double rows= table_records;

You can remove the rows variable and replace it with
a local variable in the only part where it is needed:

rows= records_in_column_ranges(&param, idx, key);
->
double rows= records_in_column_ranges(&param, idx, key);

I have one open question:

  • Each SEL_TREE has a key bitmap.
    When we do a tree_and() we merge the bitmaps (and thus assume each bit
    maps to the same index).

For pseudo trees, do we ever try to merge or combine the bitmaps?

I spent some time looking at the code and it is probably safe as we never
combine two SEL_ARG trees generated from different get_mm_trees.

Comment by Sergei Petrunia [ 2024-01-29 ]

igor, look:

create_key_parts_for_pseudo_indexes() uses this piece of code to check whether to create pseudo indexes for columns:

  for (field_ptr= table->field; *field_ptr; field_ptr++)
  {
    Field *field= *field_ptr;
    if (bitmap_is_set(used_fields, field->field_index) &&
        is_eits_usable(field))
      parts++;
  }

So are you suggesting that outside of create_key_parts_for_pseudo_indexes there should be identical code that walks through the columns, checks "if_eits_usable" and collects a bitmap of first 64 columns, then calls create_key_parts_for_pseudo_indexes().

Since you request not to touch create_key_parts_for_pseudo_indexes(), it will check "if_eits_usable" for each passed column again.

Is that what you would prefer to see implemented? Please clarify.

Comment by Sergei Petrunia [ 2024-01-30 ]

Slo we do need to change create_key_parts_for_pseudo_indexes after all? Ok let me try making a patch.

Comment by Sergei Petrunia [ 2024-02-01 ]

igor could you please review the latest patch: https://github.com/MariaDB/server/commit/2949128be1ffd7c340676618b83145f4f3f3929b
I believe it addressed all of your input. it's pushed into bb-10.6-MDEV-33314

Comment by Igor Babaev [ 2024-02-02 ]

I don't see any problems with the patch. Yet as it's a crashing bug the patch should be applied to 10.5 as well.

Generated at Thu Feb 08 10:37:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.