Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33314

Crash inside calculate_cond_selectivity_for_table() with many columns

Details

    Description

      I will post a reduced testcase once I have it.

      The symptom of the bug is

      sql/sql_bitmap.h:77: ulonglong Bitmap<width>::bit_mask(uint) const [with
              unsigned int width = 64; ulonglong = long long unsigned int; uint = unsigned int]: Assertion `n < width' failed.
      

      The stack trace:

        #3  0x00007ffff5221472 in __GI___assert_fail (assertion=0x555556c6cb4a "n < width", file=0x555556c6cb08 "/home/psergey/dev-git2/         MariaDBEnterprise-10.6.14/sql/sql_bitmap.h", line=77, function=0x555556c6d380 <Bitmap<64u>::bit_mask(unsigned int) const::                 __PRETTY_FUNCTION__> "ulonglong Bitmap<width>::bit_mask(uint) const [with unsigned int width = 64; ulonglong = long long unsigned int;     uint = unsigned int]") at assert.c:101
        #4  0x0000555555d94d37 in Bitmap<64u>::bit_mask (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/     sql_bitmap.h:77
        #5  0x0000555555e41d69 in Bitmap<64u>::set_bit (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/      sql_bitmap.h:109
        #6  0x00005555564541d2 in Item_bool_func::get_mm_parts (this=0x7fff3419e690, param=0x7ffff40f3c50, field=0x7fff3407a2d0, type=Item_func::EQ_FUNC, value=0x7fff3418edd8) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8659
        #7  0x0000555556453d11 in Item_equal::get_mm_tree (this=0x7fff3419e690, param=0x7ffff40f3c50, cond_ptr=0x7fff3419e7c0) at /home/psergey/ dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8568
        #8  0x00005555564530e5 in Item_cond_and::get_mm_tree (this=0x7fff3402a9c0, param=0x7ffff40f3c50, cond_ptr=0x7fff34185438) at /home/      psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8346
        #9  0x0000555556446c01 in calculate_cond_selectivity_for_table (thd=0x7fff34000d78, table=0x7fff340578a8, cond=0x7fff34185438) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3517
        #10 0x00007fff341c0d70 in ?? ()
        #11 0x00007fff341c0d90 in ?? ()
        #12 0x00007fff341c0db0 in ?? ()
        #13 0x00007fff341c0dd0 in ?? ()
      

      In the testcase I'm trying, calculate_cond_selectivity_for_table() is trying to get estimates for 168 columns:

        Thread 14 "mysqld" hit Breakpoint 4, create_key_parts_for_pseudo_indexes (param=0x7ffff40f3c50, used_fields=0x7fff28057a78) at /home/    psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3183
      (gdb) p keys
        $54 = 168
      

      the range optimizer can handle at most 64 indexes.
      (This is all I see so far. There may be other issues)

      Attachments

        Activity

          psergei Sergei Petrunia created issue -
          psergei Sergei Petrunia made changes -
          Field Original Value New Value
          Component/s Optimizer [ 10200 ]
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ]
          psergei Sergei Petrunia made changes -
          Affects Version/s 10.6 [ 24028 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 10.6 [ 24028 ]
          psergei Sergei Petrunia made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          psergei Sergei Petrunia made changes -
          Description I will post a reduced testcase once I have it.

          The symptom of the bug is
          {code}
          sql/sql_bitmap.h:77: ulonglong Bitmap<width>::bit_mask(uint) const [with unsigned int width = 64; ulonglong = long long unsigned int; uint = unsigned int]: Assertion `n < width' failed.

          The stack trace:
          {code}
            #3 0x00007ffff5221472 in __GI___assert_fail (assertion=0x555556c6cb4a "n < width", file=0x555556c6cb08 "/home/psergey/dev-git2/ MariaDBEnterprise-10.6.14/sql/sql_bitmap.h", line=77, function=0x555556c6d380 <Bitmap<64u>::bit_mask(unsigned int) const:: __PRETTY_FUNCTION__> "ulonglong Bitmap<width>::bit_mask(uint) const [with unsigned int width = 64; ulonglong = long long unsigned int; uint = unsigned int]") at assert.c:101
            #4 0x0000555555d94d37 in Bitmap<64u>::bit_mask (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/ sql_bitmap.h:77
            #5 0x0000555555e41d69 in Bitmap<64u>::set_bit (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/ sql_bitmap.h:109
            #6 0x00005555564541d2 in Item_bool_func::get_mm_parts (this=0x7fff3419e690, param=0x7ffff40f3c50, field=0x7fff3407a2d0, type=Item_func::EQ_FUNC, value=0x7fff3418edd8) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8659
            #7 0x0000555556453d11 in Item_equal::get_mm_tree (this=0x7fff3419e690, param=0x7ffff40f3c50, cond_ptr=0x7fff3419e7c0) at /home/psergey/ dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8568
            #8 0x00005555564530e5 in Item_cond_and::get_mm_tree (this=0x7fff3402a9c0, param=0x7ffff40f3c50, cond_ptr=0x7fff34185438) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8346
            #9 0x0000555556446c01 in calculate_cond_selectivity_for_table (thd=0x7fff34000d78, table=0x7fff340578a8, cond=0x7fff34185438) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3517
            #10 0x00007fff341c0d70 in ?? ()
            #11 0x00007fff341c0d90 in ?? ()
            #12 0x00007fff341c0db0 in ?? ()
            #13 0x00007fff341c0dd0 in ?? ()
          {code}

          In the testcase I'm trying, calculate_cond_selectivity_for_table() is trying to get estimates for 168 columns:

          {code}
            Thread 14 "mysqld" hit Breakpoint 4, create_key_parts_for_pseudo_indexes (param=0x7ffff40f3c50, used_fields=0x7fff28057a78) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3183
          (gdb) p keys
            $54 = 168
          {code}

          the range optimizer can handle at most 64 indexes.
          (This is all I see so far. There may be other issues)
          I will post a reduced testcase once I have it.

          The symptom of the bug is
          {code}
          sql/sql_bitmap.h:77: ulonglong Bitmap<width>::bit_mask(uint) const [with
                  unsigned int width = 64; ulonglong = long long unsigned int; uint = unsigned int]: Assertion `n < width' failed.
          {code}

          The stack trace:
          {code}
            #3 0x00007ffff5221472 in __GI___assert_fail (assertion=0x555556c6cb4a "n < width", file=0x555556c6cb08 "/home/psergey/dev-git2/ MariaDBEnterprise-10.6.14/sql/sql_bitmap.h", line=77, function=0x555556c6d380 <Bitmap<64u>::bit_mask(unsigned int) const:: __PRETTY_FUNCTION__> "ulonglong Bitmap<width>::bit_mask(uint) const [with unsigned int width = 64; ulonglong = long long unsigned int; uint = unsigned int]") at assert.c:101
            #4 0x0000555555d94d37 in Bitmap<64u>::bit_mask (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/ sql_bitmap.h:77
            #5 0x0000555555e41d69 in Bitmap<64u>::set_bit (this=0x7fff341d8918, n=64) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/ sql_bitmap.h:109
            #6 0x00005555564541d2 in Item_bool_func::get_mm_parts (this=0x7fff3419e690, param=0x7ffff40f3c50, field=0x7fff3407a2d0, type=Item_func::EQ_FUNC, value=0x7fff3418edd8) at /home/psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8659
            #7 0x0000555556453d11 in Item_equal::get_mm_tree (this=0x7fff3419e690, param=0x7ffff40f3c50, cond_ptr=0x7fff3419e7c0) at /home/psergey/ dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8568
            #8 0x00005555564530e5 in Item_cond_and::get_mm_tree (this=0x7fff3402a9c0, param=0x7ffff40f3c50, cond_ptr=0x7fff34185438) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:8346
            #9 0x0000555556446c01 in calculate_cond_selectivity_for_table (thd=0x7fff34000d78, table=0x7fff340578a8, cond=0x7fff34185438) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3517
            #10 0x00007fff341c0d70 in ?? ()
            #11 0x00007fff341c0d90 in ?? ()
            #12 0x00007fff341c0db0 in ?? ()
            #13 0x00007fff341c0dd0 in ?? ()
          {code}

          In the testcase I'm trying, calculate_cond_selectivity_for_table() is trying to get estimates for 168 columns:

          {code}
            Thread 14 "mysqld" hit Breakpoint 4, create_key_parts_for_pseudo_indexes (param=0x7ffff40f3c50, used_fields=0x7fff28057a78) at /home/ psergey/dev-git2/MariaDBEnterprise-10.6.14/sql/opt_range.cc:3183
          (gdb) p keys
            $54 = 168
          {code}

          the range optimizer can handle at most 64 indexes.
          (This is all I see so far. There may be other issues)

          bb-10.6-MDEV-33314, please review.

          psergei Sergei Petrunia added a comment - bb-10.6- MDEV-33314 , please review.
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Michael Widenius [ monty ]
          Status In Progress [ 3 ] In Review [ 10002 ]

          ralf.gebhardt, the problem and the fix will be still valid for the ES.

          There is only one thing: I might need to adjust the patch's testcase to use > 128 indexes (currently it uses 100). The rest of the patch is valid.

          psergei Sergei Petrunia added a comment - ralf.gebhardt , the problem and the fix will be still valid for the ES. There is only one thing: I might need to adjust the patch's testcase to use > 128 indexes (currently it uses 100). The rest of the patch is valid.

          @spetruna, how do you plan to choose the fields that we should create a pseudo index for?
          I assume we should first use fields that are used with an operator we can use with the range optimizer ?
          It looks like table->cond_set that is used is already setting the bit's for 'field op constant'
          If there are more than 128 of them, which ones to choose?
          The first 128 that we find in create_key_parts_for_pseudo_index() ?

          Another option would be to change SEL_TREE::keys_map to have an explicit key map allocated in get_mm_parts()
          This would allows to have any number of pseudo keys.

          monty Michael Widenius added a comment - @spetruna, how do you plan to choose the fields that we should create a pseudo index for? I assume we should first use fields that are used with an operator we can use with the range optimizer ? It looks like table->cond_set that is used is already setting the bit's for 'field op constant' If there are more than 128 of them, which ones to choose? The first 128 that we find in create_key_parts_for_pseudo_index() ? Another option would be to change SEL_TREE::keys_map to have an explicit key map allocated in get_mm_parts() This would allows to have any number of pseudo keys.

          Review comments:

          In the test case, you should create MAX_INDEXES + 1 fields and conditions.

          To do that, please add system variable, MAX_INDEXES. This can be genrally
          usefull (simlar to version_malloc_library).

          • double rows;
            + double rows= table_records;

          You can remove the rows variable and replace it with
          a local variable in the only part where it is needed:

          rows= records_in_column_ranges(&param, idx, key);
          ->
          double rows= records_in_column_ranges(&param, idx, key);

          I have one open question:

          • Each SEL_TREE has a key bitmap.
            When we do a tree_and() we merge the bitmaps (and thus assume each bit
            maps to the same index).

          For pseudo trees, do we ever try to merge or combine the bitmaps?

          I spent some time looking at the code and it is probably safe as we never
          combine two SEL_ARG trees generated from different get_mm_trees.

          monty Michael Widenius added a comment - Review comments: In the test case, you should create MAX_INDEXES + 1 fields and conditions. To do that, please add system variable, MAX_INDEXES. This can be genrally usefull (simlar to version_malloc_library). double rows; + double rows= table_records; You can remove the rows variable and replace it with a local variable in the only part where it is needed: rows= records_in_column_ranges(&param, idx, key); -> double rows= records_in_column_ranges(&param, idx, key); I have one open question: Each SEL_TREE has a key bitmap. When we do a tree_and() we merge the bitmaps (and thus assume each bit maps to the same index). For pseudo trees, do we ever try to merge or combine the bitmaps? I spent some time looking at the code and it is probably safe as we never combine two SEL_ARG trees generated from different get_mm_trees.

          igor, look:

          create_key_parts_for_pseudo_indexes() uses this piece of code to check whether to create pseudo indexes for columns:

            for (field_ptr= table->field; *field_ptr; field_ptr++)
            {
              Field *field= *field_ptr;
              if (bitmap_is_set(used_fields, field->field_index) &&
                  is_eits_usable(field))
                parts++;
            }
          

          So are you suggesting that outside of create_key_parts_for_pseudo_indexes there should be identical code that walks through the columns, checks "if_eits_usable" and collects a bitmap of first 64 columns, then calls create_key_parts_for_pseudo_indexes().

          Since you request not to touch create_key_parts_for_pseudo_indexes(), it will check "if_eits_usable" for each passed column again.

          Is that what you would prefer to see implemented? Please clarify.

          psergei Sergei Petrunia added a comment - igor , look: create_key_parts_for_pseudo_indexes() uses this piece of code to check whether to create pseudo indexes for columns: for (field_ptr= table->field; *field_ptr; field_ptr++) { Field *field= *field_ptr; if (bitmap_is_set(used_fields, field->field_index) && is_eits_usable(field)) parts++; } So are you suggesting that outside of create_key_parts_for_pseudo_indexes there should be identical code that walks through the columns, checks "if_eits_usable" and collects a bitmap of first 64 columns, then calls create_key_parts_for_pseudo_indexes(). Since you request not to touch create_key_parts_for_pseudo_indexes(), it will check "if_eits_usable" for each passed column again. Is that what you would prefer to see implemented? Please clarify.

          Slo we do need to change create_key_parts_for_pseudo_indexes after all? Ok let me try making a patch.

          psergei Sergei Petrunia added a comment - Slo we do need to change create_key_parts_for_pseudo_indexes after all? Ok let me try making a patch.
          psergei Sergei Petrunia made changes -
          Assignee Michael Widenius [ monty ] Igor Babaev [ igor ]

          igor could you please review the latest patch: https://github.com/MariaDB/server/commit/2949128be1ffd7c340676618b83145f4f3f3929b
          I believe it addressed all of your input. it's pushed into bb-10.6-MDEV-33314

          psergei Sergei Petrunia added a comment - igor could you please review the latest patch: https://github.com/MariaDB/server/commit/2949128be1ffd7c340676618b83145f4f3f3929b I believe it addressed all of your input. it's pushed into bb-10.6- MDEV-33314

          I don't see any problems with the patch. Yet as it's a crashing bug the patch should be applied to 10.5 as well.

          igor Igor Babaev (Inactive) added a comment - I don't see any problems with the patch. Yet as it's a crashing bug the patch should be applied to 10.5 as well.
          igor Igor Babaev (Inactive) made changes -
          Status In Review [ 10002 ] Stalled [ 10000 ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Igor Babaev [ igor ] Sergei Petrunia [ psergey ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 10.5.24 [ 29517 ]
          Fix Version/s 10.6.17 [ 29518 ]
          Fix Version/s 10.11.7 [ 29519 ]
          Fix Version/s 11.0.5 [ 29520 ]
          Fix Version/s 11.1.4 [ 29024 ]
          Fix Version/s 11.2.3 [ 29521 ]
          Fix Version/s 11.3.2 [ 29522 ]
          Fix Version/s 11.4.1 [ 29523 ]
          Fix Version/s 10.6 [ 24028 ]
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 10.5.25 [ 29626 ]
          Fix Version/s 10.6.18 [ 29627 ]
          Fix Version/s 10.11.8 [ 29630 ]
          Fix Version/s 11.0.6 [ 29628 ]
          Fix Version/s 11.1.5 [ 29629 ]
          Fix Version/s 11.2.4 [ 29631 ]
          Fix Version/s 11.3.3 [ 29632 ]
          Fix Version/s 11.4.2 [ 29633 ]
          Fix Version/s 11.1.4 [ 29024 ]
          Fix Version/s 10.5.24 [ 29517 ]
          Fix Version/s 10.6.17 [ 29518 ]
          Fix Version/s 10.11.7 [ 29519 ]
          Fix Version/s 11.0.5 [ 29520 ]
          Fix Version/s 11.2.3 [ 29521 ]
          Fix Version/s 11.3.2 [ 29522 ]
          Fix Version/s 11.4.1 [ 29523 ]
          ralf.gebhardt Ralf Gebhardt made changes -
          julien.fritsch Julien Fritsch made changes -
          Fix Version/s 11.3.3 [ 29632 ]

          Notes for the changelog:

          The server could crash if a query's WHERE clause has conditions over more than 128 (Enterprise Server) or 64 (Community Server) columns such that column statistics is available for them.

          psergei Sergei Petrunia added a comment - Notes for the changelog: The server could crash if a query's WHERE clause has conditions over more than 128 (Enterprise Server) or 64 (Community Server) columns such that column statistics is available for them.
          mariadb-jira-automation Jira Automation (IT) made changes -
          Zendesk Related Tickets 131807

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            2 Vote for this issue
            Watchers:
            10 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.