Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33665

main.pool_of_threads fails due to (spurious) uninitialized Item_func::not_null_tables_cache

Details

    Description

      This was found as part of MDEV-33478. See MDEV-33478 description for details.

      This is fairly old code, why did new MSAN find it while valgrind didn't? It seems the cause was that the value was copied between Item objects but was not used.

      If I add a printout:

      diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
      index f8fd28aebb5..7ec7a752875 100644
      --- a/sql/item_cmpfunc.cc
      +++ b/sql/item_cmpfunc.cc
      @@ -5076,6 +5076,7 @@ Item_cond::fix_fields(THD *thd, Item **ref)
           with_flags|= item->with_flags;
         }
         (void) eval_not_null_tables((void*) 0);
      +  fprintf(stderr, "AAQ: not_null_tables_cache=%llx\n", not_null_tables_cache);
       
         /*
           We have to set fixed as some other items will check it and fail if we
      

      the attached testcase (extract from include/common-tests.inc) starts to fail under valgrind:

       ==23601== Thread 6:
      ==23601== Use of uninitialised value of size 8
      ==23601==    at 0x747F7E1: _itoa_word (_itoa.c:180)
      ==23601==    by 0x7482EDD: vfprintf (vfprintf.c:1642)
      ==23601==    by 0x748563F: buffered_vfprintf (vfprintf.c:2329)
      ==23601==    by 0x74826F5: vfprintf (vfprintf.c:1301)
      ==23601==    by 0x748BE13: fprintf (fprintf.c:32)
      ==23601==    by 0xE9C878: Item_cond::fix_fields(THD*, Item**) (item_cmpfunc.cc:5079)
      ==23601==    by 0xB18FC8: make_cond_for_table_from_pred(THD*, Item*, Item*, unsigned long long, unsigned long long, int, bool, bool, bool) (sql_select.cc:23938)
      ==23601==    by 0xB18CBC: make_cond_for_table(THD*, Item*, unsigned long long, unsigned long long, int, bool, bool) (sql_select.cc:23869)
      ==23601==    by 0xAFA17A: make_join_select(JOIN*, SQL_SELECT*, Item*) (sql_select.cc:12543)
      ==23601==    by 0xADC64D: JOIN::optimize_stage2() (sql_select.cc:2855)
      ==23601==    by 0xADB459: JOIN::optimize_inner() (sql_select.cc:2590)
      ==23601==    by 0xAD8A6D: JOIN::optimize() (sql_select.cc:1888)
      ==23601==    by 0xAE4A58: mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:5127)
      ==23601==    by 0xAD3452: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:559)
      ==23601==    by 0xA91440: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6372)
      

      Attachments

        Issue Links

          Activity

            psergei Sergei Petrunia created issue -
            psergei Sergei Petrunia made changes -
            Field Original Value New Value
            psergei Sergei Petrunia made changes -
            Description This was found as part of MDEV-33478.
            This was found as part of MDEV-33478. See MDEV-33478 description for details.

            This is fairly old code, why did new MSAN find it while valgrind didn't? It seems the cause was that the value was copied between Item objects but was not used.

            If I add a printout:
            {code:diff}
            diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
            index f8fd28aebb5..7ec7a752875 100644
            --- a/sql/item_cmpfunc.cc
            +++ b/sql/item_cmpfunc.cc
            @@ -5076,6 +5076,7 @@ Item_cond::fix_fields(THD *thd, Item **ref)
                 with_flags|= item->with_flags;
               }
               (void) eval_not_null_tables((void*) 0);
            + fprintf(stderr, "AAQ: not_null_tables_cache=%llx\n", not_null_tables_cache);
            {code}

            the attached testcase starts to fail under valgrind:

            {code}
             ==23601== Thread 6:
            ==23601== Use of uninitialised value of size 8
            ==23601== at 0x747F7E1: _itoa_word (_itoa.c:180)
            ==23601== by 0x7482EDD: vfprintf (vfprintf.c:1642)
            ==23601== by 0x748563F: buffered_vfprintf (vfprintf.c:2329)
            ==23601== by 0x74826F5: vfprintf (vfprintf.c:1301)
            ==23601== by 0x748BE13: fprintf (fprintf.c:32)
            ==23601== by 0xE9C878: Item_cond::fix_fields(THD*, Item**) (item_cmpfunc.cc:5079)
            ==23601== by 0xB18FC8: make_cond_for_table_from_pred(THD*, Item*, Item*, unsigned long long, unsigned long long, int, bool, bool, bool) (sql_select.cc:23938)
            ==23601== by 0xB18CBC: make_cond_for_table(THD*, Item*, unsigned long long, unsigned long long, int, bool, bool) (sql_select.cc:23869)
            ==23601== by 0xAFA17A: make_join_select(JOIN*, SQL_SELECT*, Item*) (sql_select.cc:12543)
            ==23601== by 0xADC64D: JOIN::optimize_stage2() (sql_select.cc:2855)
            ==23601== by 0xADB459: JOIN::optimize_inner() (sql_select.cc:2590)
            ==23601== by 0xAD8A6D: JOIN::optimize() (sql_select.cc:1888)
            ==23601== by 0xAE4A58: mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:5127)
            ==23601== by 0xAD3452: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:559)
            ==23601== by 0xA91440: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6372)
            {code}
            psergei Sergei Petrunia made changes -
            Attachment _a1.test [ 73263 ]
            psergei Sergei Petrunia made changes -
            Description This was found as part of MDEV-33478. See MDEV-33478 description for details.

            This is fairly old code, why did new MSAN find it while valgrind didn't? It seems the cause was that the value was copied between Item objects but was not used.

            If I add a printout:
            {code:diff}
            diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
            index f8fd28aebb5..7ec7a752875 100644
            --- a/sql/item_cmpfunc.cc
            +++ b/sql/item_cmpfunc.cc
            @@ -5076,6 +5076,7 @@ Item_cond::fix_fields(THD *thd, Item **ref)
                 with_flags|= item->with_flags;
               }
               (void) eval_not_null_tables((void*) 0);
            + fprintf(stderr, "AAQ: not_null_tables_cache=%llx\n", not_null_tables_cache);
            {code}

            the attached testcase starts to fail under valgrind:

            {code}
             ==23601== Thread 6:
            ==23601== Use of uninitialised value of size 8
            ==23601== at 0x747F7E1: _itoa_word (_itoa.c:180)
            ==23601== by 0x7482EDD: vfprintf (vfprintf.c:1642)
            ==23601== by 0x748563F: buffered_vfprintf (vfprintf.c:2329)
            ==23601== by 0x74826F5: vfprintf (vfprintf.c:1301)
            ==23601== by 0x748BE13: fprintf (fprintf.c:32)
            ==23601== by 0xE9C878: Item_cond::fix_fields(THD*, Item**) (item_cmpfunc.cc:5079)
            ==23601== by 0xB18FC8: make_cond_for_table_from_pred(THD*, Item*, Item*, unsigned long long, unsigned long long, int, bool, bool, bool) (sql_select.cc:23938)
            ==23601== by 0xB18CBC: make_cond_for_table(THD*, Item*, unsigned long long, unsigned long long, int, bool, bool) (sql_select.cc:23869)
            ==23601== by 0xAFA17A: make_join_select(JOIN*, SQL_SELECT*, Item*) (sql_select.cc:12543)
            ==23601== by 0xADC64D: JOIN::optimize_stage2() (sql_select.cc:2855)
            ==23601== by 0xADB459: JOIN::optimize_inner() (sql_select.cc:2590)
            ==23601== by 0xAD8A6D: JOIN::optimize() (sql_select.cc:1888)
            ==23601== by 0xAE4A58: mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:5127)
            ==23601== by 0xAD3452: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:559)
            ==23601== by 0xA91440: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6372)
            {code}
            This was found as part of MDEV-33478. See MDEV-33478 description for details.

            This is fairly old code, why did new MSAN find it while valgrind didn't? It seems the cause was that the value was copied between Item objects but was not used.

            If I add a printout:
            {code:diff}
            diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
            index f8fd28aebb5..7ec7a752875 100644
            --- a/sql/item_cmpfunc.cc
            +++ b/sql/item_cmpfunc.cc
            @@ -5076,6 +5076,7 @@ Item_cond::fix_fields(THD *thd, Item **ref)
                 with_flags|= item->with_flags;
               }
               (void) eval_not_null_tables((void*) 0);
            + fprintf(stderr, "AAQ: not_null_tables_cache=%llx\n", not_null_tables_cache);
            {code}

            the attached testcase (extract from include/common-tests.inc) starts to fail under valgrind:

            {code}
             ==23601== Thread 6:
            ==23601== Use of uninitialised value of size 8
            ==23601== at 0x747F7E1: _itoa_word (_itoa.c:180)
            ==23601== by 0x7482EDD: vfprintf (vfprintf.c:1642)
            ==23601== by 0x748563F: buffered_vfprintf (vfprintf.c:2329)
            ==23601== by 0x74826F5: vfprintf (vfprintf.c:1301)
            ==23601== by 0x748BE13: fprintf (fprintf.c:32)
            ==23601== by 0xE9C878: Item_cond::fix_fields(THD*, Item**) (item_cmpfunc.cc:5079)
            ==23601== by 0xB18FC8: make_cond_for_table_from_pred(THD*, Item*, Item*, unsigned long long, unsigned long long, int, bool, bool, bool) (sql_select.cc:23938)
            ==23601== by 0xB18CBC: make_cond_for_table(THD*, Item*, unsigned long long, unsigned long long, int, bool, bool) (sql_select.cc:23869)
            ==23601== by 0xAFA17A: make_join_select(JOIN*, SQL_SELECT*, Item*) (sql_select.cc:12543)
            ==23601== by 0xADC64D: JOIN::optimize_stage2() (sql_select.cc:2855)
            ==23601== by 0xADB459: JOIN::optimize_inner() (sql_select.cc:2590)
            ==23601== by 0xAD8A6D: JOIN::optimize() (sql_select.cc:1888)
            ==23601== by 0xAE4A58: mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:5127)
            ==23601== by 0xAD3452: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:559)
            ==23601== by 0xA91440: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6372)
            {code}
            psergei Sergei Petrunia made changes -
            Description This was found as part of MDEV-33478. See MDEV-33478 description for details.

            This is fairly old code, why did new MSAN find it while valgrind didn't? It seems the cause was that the value was copied between Item objects but was not used.

            If I add a printout:
            {code:diff}
            diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
            index f8fd28aebb5..7ec7a752875 100644
            --- a/sql/item_cmpfunc.cc
            +++ b/sql/item_cmpfunc.cc
            @@ -5076,6 +5076,7 @@ Item_cond::fix_fields(THD *thd, Item **ref)
                 with_flags|= item->with_flags;
               }
               (void) eval_not_null_tables((void*) 0);
            + fprintf(stderr, "AAQ: not_null_tables_cache=%llx\n", not_null_tables_cache);
            {code}

            the attached testcase (extract from include/common-tests.inc) starts to fail under valgrind:

            {code}
             ==23601== Thread 6:
            ==23601== Use of uninitialised value of size 8
            ==23601== at 0x747F7E1: _itoa_word (_itoa.c:180)
            ==23601== by 0x7482EDD: vfprintf (vfprintf.c:1642)
            ==23601== by 0x748563F: buffered_vfprintf (vfprintf.c:2329)
            ==23601== by 0x74826F5: vfprintf (vfprintf.c:1301)
            ==23601== by 0x748BE13: fprintf (fprintf.c:32)
            ==23601== by 0xE9C878: Item_cond::fix_fields(THD*, Item**) (item_cmpfunc.cc:5079)
            ==23601== by 0xB18FC8: make_cond_for_table_from_pred(THD*, Item*, Item*, unsigned long long, unsigned long long, int, bool, bool, bool) (sql_select.cc:23938)
            ==23601== by 0xB18CBC: make_cond_for_table(THD*, Item*, unsigned long long, unsigned long long, int, bool, bool) (sql_select.cc:23869)
            ==23601== by 0xAFA17A: make_join_select(JOIN*, SQL_SELECT*, Item*) (sql_select.cc:12543)
            ==23601== by 0xADC64D: JOIN::optimize_stage2() (sql_select.cc:2855)
            ==23601== by 0xADB459: JOIN::optimize_inner() (sql_select.cc:2590)
            ==23601== by 0xAD8A6D: JOIN::optimize() (sql_select.cc:1888)
            ==23601== by 0xAE4A58: mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:5127)
            ==23601== by 0xAD3452: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:559)
            ==23601== by 0xA91440: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6372)
            {code}
            This was found as part of MDEV-33478. See MDEV-33478 description for details.

            This is fairly old code, why did new MSAN find it while valgrind didn't? It seems the cause was that the value was copied between Item objects but was not used.

            If I add a printout:
            {code:diff}
            diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
            index f8fd28aebb5..7ec7a752875 100644
            --- a/sql/item_cmpfunc.cc
            +++ b/sql/item_cmpfunc.cc
            @@ -5076,6 +5076,7 @@ Item_cond::fix_fields(THD *thd, Item **ref)
                 with_flags|= item->with_flags;
               }
               (void) eval_not_null_tables((void*) 0);
            + fprintf(stderr, "AAQ: not_null_tables_cache=%llx\n", not_null_tables_cache);
             
               /*
                 We have to set fixed as some other items will check it and fail if we
            {code}

            the attached testcase (extract from include/common-tests.inc) starts to fail under valgrind:

            {code}
             ==23601== Thread 6:
            ==23601== Use of uninitialised value of size 8
            ==23601== at 0x747F7E1: _itoa_word (_itoa.c:180)
            ==23601== by 0x7482EDD: vfprintf (vfprintf.c:1642)
            ==23601== by 0x748563F: buffered_vfprintf (vfprintf.c:2329)
            ==23601== by 0x74826F5: vfprintf (vfprintf.c:1301)
            ==23601== by 0x748BE13: fprintf (fprintf.c:32)
            ==23601== by 0xE9C878: Item_cond::fix_fields(THD*, Item**) (item_cmpfunc.cc:5079)
            ==23601== by 0xB18FC8: make_cond_for_table_from_pred(THD*, Item*, Item*, unsigned long long, unsigned long long, int, bool, bool, bool) (sql_select.cc:23938)
            ==23601== by 0xB18CBC: make_cond_for_table(THD*, Item*, unsigned long long, unsigned long long, int, bool, bool) (sql_select.cc:23869)
            ==23601== by 0xAFA17A: make_join_select(JOIN*, SQL_SELECT*, Item*) (sql_select.cc:12543)
            ==23601== by 0xADC64D: JOIN::optimize_stage2() (sql_select.cc:2855)
            ==23601== by 0xADB459: JOIN::optimize_inner() (sql_select.cc:2590)
            ==23601== by 0xAD8A6D: JOIN::optimize() (sql_select.cc:1888)
            ==23601== by 0xAE4A58: mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:5127)
            ==23601== by 0xAD3452: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:559)
            ==23601== by 0xA91440: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6372)
            {code}
            psergei Sergei Petrunia made changes -
            Assignee Sergei Petrunia [ psergey ]
            psergei Sergei Petrunia made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            psergei Sergei Petrunia added a comment - - edited

            Previously suggested fix

            diff --git a/sql/item_func.cc b/sql/item_func.cc
            index 0973d0c2c82..54f7eb222bf 100644
            --- a/sql/item_func.cc
            +++ b/sql/item_func.cc
            @@ -397,6 +397,7 @@ Item_func::quick_fix_field()
                 }
               }
               base_flags|= item_base_t::FIXED;
            +  eval_not_null_tables(NULL);
             }
             
            

            removes the error, but is it correct? quick_fix_field() just marks the item as fixed, without updating any fields (like e.g. used_tables). Why should we single out not_null_tables_cache and just update it?

            Some callers update some fields after quick_fix_field() call:

                    new_cond->quick_fix_field();
                    new_cond->used_tables_cache=
                      ((Item_cond_and*) cond)->used_tables_cache &
                      tables;
            

            psergei Sergei Petrunia added a comment - - edited Previously suggested fix diff --git a/sql/item_func.cc b/sql/item_func.cc index 0973d0c2c82..54f7eb222bf 100644 --- a/sql/item_func.cc +++ b/sql/item_func.cc @@ -397,6 +397,7 @@ Item_func::quick_fix_field() } } base_flags|= item_base_t::FIXED; + eval_not_null_tables(NULL); } removes the error, but is it correct? quick_fix_field() just marks the item as fixed, without updating any fields (like e.g. used_tables). Why should we single out not_null_tables_cache and just update it? Some callers update some fields after quick_fix_field() call: new_cond->quick_fix_field(); new_cond->used_tables_cache= ((Item_cond_and*) cond)->used_tables_cache & tables;
            psergei Sergei Petrunia made changes -
            Assignee Sergei Petrunia [ psergey ] Oleg Smirnov [ JIRAUSER50405 ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            oleg.smirnov Oleg Smirnov added a comment -

            Ok to push

            oleg.smirnov Oleg Smirnov added a comment - Ok to push
            oleg.smirnov Oleg Smirnov made changes -
            Assignee Oleg Smirnov [ JIRAUSER50405 ] Sergei Petrunia [ psergey ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            psergei Sergei Petrunia made changes -
            Summary main.pool_of_threads fails due to uninitialized Item_func::not_null_tables_cache main.pool_of_threads fails due to (spurious) uninitialized Item_func::not_null_tables_cache
            psergei Sergei Petrunia made changes -
            Fix Version/s 10.5.25 [ 29626 ]
            Fix Version/s 10.6.18 [ 29627 ]
            Fix Version/s 10.11.8 [ 29630 ]
            Fix Version/s 11.0.6 [ 29628 ]
            Fix Version/s 11.1.5 [ 29629 ]
            Fix Version/s 11.2.4 [ 29631 ]
            Fix Version/s 11.4.2 [ 29633 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.