MariaDB Server: MDEV-23510

Crash in my_strnncoll_binary while running point-select

Details

    Description

      • Running a CPU-bound workload (point-select) while scaling the thread count from 1 to 1024 (doubling at each step).
      • Hit the issue at 32 threads, but it is random and could occur at other thread counts as well.
      • The issue seems to suggest something wrong in memcmp (less likely a memory violation, more likely an alignment issue).
      • Observed with MDB-10.5-trunk (10.5.6) on ARM only. Not observed on x86 yet; if it is present there, it may need a different test case.
        Existing bug in this area: MDEV-20619

      ----------------

      (gdb) bt
      #0  0x0000ffffbe077f50 in memcmp () from /lib64/libc.so.6
      #1  0x0000aaaaab82747c in my_strnncoll_binary (cs=<optimized out>, s=<optimized out>, slen=14, t=<optimized out>, tlen=14, t_is_prefix=0 '\000')
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/strings/ctype-bin.c:87
      #2  0x0000aaaaab817898 in l_find (head=0xffe998006cf8, head@entry=0xaaab058aec20, cs=0xaaaaac02d3f0 <my_charset_bin>, hashnr=<optimized out>, 
          key=key@entry=0xffe924b3c668 "\002arm", keylen=keylen@entry=14, cursor=0xffeafc4c55d8, cursor@entry=0xffeafc4c5618, pins=pins@entry=0xaaab058a3c00, 
          callback=callback@entry=0x0) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/mysys/lf_hash.c:132
      #3  0x0000aaaaab818240 in l_search (pins=0xaaab058a3c00, keylen=14, key=0xffe924b3c668 "\002arm", hashnr=<optimized out>, cs=<optimized out>, 
          head=0xaaab058aec20) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/mysys/lf_hash.c:286
      #4  lf_hash_search_using_hash_value (hash=hash@entry=0xaaaaac0b9010 <mdl_locks>, pins=pins@entry=0xaaab058a3c00, hashnr=<optimized out>, 
          key=key@entry=0xffe924b3c668, keylen=14) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/mysys/lf_hash.c:488
      #5  0x0000aaaaab8183b4 in lf_hash_search (hash=hash@entry=0xaaaaac0b9010 <mdl_locks>, pins=pins@entry=0xaaab058a3c00, key=key@entry=0xffe924b3c668, 
          keylen=<optimized out>) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/mysys/lf_hash.c:528
      #6  0x0000aaaaab24acb4 in MDL_map::find_or_insert (this=0xaaaaac0b9010 <mdl_locks>, pins=0xaaab058a3c00, mdl_key=0xffe924b3c660)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/mdl.cc:825
      #7  0x0000aaaaab24c1e0 in MDL_context::try_acquire_lock_impl (this=this@entry=0xffe9240009f8, mdl_request=mdl_request@entry=0xffe924b3c640, 
          out_ticket=0xffeafc4c5768, out_ticket@entry=0xffeafc4c57e8) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/mdl.cc:2076
      #8  0x0000aaaaab24c860 in MDL_context::acquire_lock (this=this@entry=0xffe9240009f8, mdl_request=mdl_request@entry=0xffe924b3c640, 
          lock_wait_timeout=86400) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/mdl.cc:2251
      #9  0x0000aaaaab122f10 in open_table_get_mdl_lock (thd=thd@entry=0xffe9240008d8, ot_ctx=ot_ctx@entry=0xffeafc4c6118, 
          mdl_request=mdl_request@entry=0xffe924b3c640, flags=flags@entry=0, mdl_ticket=0xffeafc4c5ae0, mdl_ticket@entry=0xffeafc4c5b60)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_base.cc:1542
      #10 0x0000aaaaab126640 in open_table (thd=thd@entry=0xffe9240008d8, table_list=table_list@entry=0xffe924b3c1f8, ot_ctx=ot_ctx@entry=0xffeafc4c6118)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_base.cc:1803
      #11 0x0000aaaaab1291bc in open_and_process_table (ot_ctx=0xffeafc4c6118, has_prelocking_list=24, prelocking_strategy=0xffeafc4c6218, flags=65514, 
          counter=0xffeafc4c619c, tables=0xffe924b3c1f8, thd=0xffe9240008d8)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_base.cc:3784
      #12 open_tables (thd=thd@entry=0xffe9240008d8, options=..., start=0xffeafc4c6188, start@entry=0xffeafc4c61a8, counter=0xffeafc4c619c, 
          counter@entry=0xffeafc4c61bc, flags=65514, flags@entry=0, prelocking_strategy=prelocking_strategy@entry=0xffeafc4c6218)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_base.cc:4256
      #13 0x0000aaaaab12995c in open_and_lock_tables (thd=thd@entry=0xffe9240008d8, options=..., tables=<optimized out>, tables@entry=0xffe924b3c1f8, 
          derived=derived@entry=true, flags=flags@entry=0, prelocking_strategy=prelocking_strategy@entry=0xffeafc4c6218)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_base.cc:5160
      #14 0x0000aaaaab17d1c8 in open_and_lock_tables (flags=0, derived=true, tables=0xffe924b3c1f8, thd=0xffe9240008d8)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_base.h:509
      #15 execute_sqlcom_select (thd=thd@entry=0xffe9240008d8, all_tables=0xffe924b3c1f8)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_parse.cc:6131
      #16 0x0000aaaaab17a3a8 in mysql_execute_command (thd=0xffe9240008d8) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_parse.cc:3932
      #17 0x0000aaaaab1910c0 in Prepared_statement::execute (this=this@entry=0xffe924c04908, expanded_query=expanded_query@entry=0xffeafc4c6db0, 
          open_cursor=open_cursor@entry=false) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_prepare.cc:4736
      #18 0x0000aaaaab1911bc in Prepared_statement::execute_loop (this=0xffe924c04908, expanded_query=0xffeafc4c6db0, open_cursor=false, 
          packet=<optimized out>, packet_end=<optimized out>) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_prepare.cc:4225
      #19 0x0000aaaaab191a28 in mysql_stmt_execute_common (thd=thd@entry=0xffe9240008d8, stmt_id=42, packet=packet@entry=0xffe924c144e2 "", 
          packet_end=0xffe924c144ec "", packet_end@entry=0xffe924c144d9 "*", cursor_flags=<optimized out>, bulk_op=bulk_op@entry=false, 
          read_types=read_types@entry=false) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_prepare.cc:3226
      #20 0x0000aaaaab191ab4 in mysqld_stmt_execute (thd=thd@entry=0xffe9240008d8, packet_arg=packet_arg@entry=0xffe924c144d9 "*", 
          packet_length=packet_length@entry=0) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_prepare.cc:3121
      #21 0x0000aaaaab177558 in dispatch_command (command=command@entry=COM_STMT_EXECUTE, thd=thd@entry=0xffe9240008d8, 
          packet=packet@entry=0xffe924c144d9 "*", packet_length=0, packet_length@entry=19, is_com_multi=is_com_multi@entry=false, 
          is_next_command=is_next_command@entry=false) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_parse.cc:1791
      #22 0x0000aaaaab176914 in do_command (thd=0xffe9240008d8) at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_parse.cc:1348
      #23 0x0000aaaaab244870 in do_handle_one_connection (connect=<optimized out>, connect@entry=0xaaab059451b8, put_in_cache=put_in_cache@entry=true)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_connect.cc:1410
      #24 0x0000aaaaab244c70 in handle_one_connection (arg=arg@entry=0xaaab059451b8)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/sql/sql_connect.cc:1312
      #25 0x0000aaaaab5337e0 in pfs_spawn_thread (arg=0xaaab05876d48)
          at /opt/projects/mysql/105/non-forked-mdb/mdb/codebase/server/storage/perfschema/pfs.cc:2201
      #26 0x0000ffffbe767d38 in start_thread () from /lib64/libpthread.so.0
      #27 0x0000ffffbe0cf5f0 in thread_start () from /lib64/libc.so.6
      

          Activity

            danblack Daniel Black added a comment -

            I had a theory that a SEGV would be caused by a read of a memory address that was only partially written. Normally we assume writes are atomic, but they don't need to be. l_find uses the LF_SLIST structure for searching. It contains a number of 64-bit pointer types with a uint32 in the middle, which breaks up the alignment. The field order isn't significant, so the uint32 goes at the end.

            https://github.com/MariaDB/server/commit/d30c1331a18d875e553f3fcf544997e4f33fb943 - branch bb-10.5-danielblack-MDEV-23510-arm-lfhash
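
The reordering described above can be illustrated with a small sketch. The struct and field names below are hypothetical, modelled only on the description (pointer-sized fields with a 32-bit field in the middle versus at the end); they are not copied from LF_SLIST. The sketch only reports where the compiler on the build platform places each field. It does not by itself reproduce the crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical "before" ordering: a 32-bit field between pointer-sized fields. */
struct node_before {
  intptr_t link;
  uint32_t hashnr;
  const unsigned char *key;
  size_t keylen;
};

/* Hypothetical "after" ordering: the 32-bit field moved to the end. */
struct node_after {
  intptr_t link;
  const unsigned char *key;
  size_t keylen;
  uint32_t hashnr;
};

/* Print the field offsets of both orderings so they can be compared. */
void print_layouts(void)
{
  /* In the "after" ordering the pointer-sized fields are contiguous. */
  assert(offsetof(struct node_after, key) == sizeof(intptr_t));

  printf("before: hashnr@%zu key@%zu keylen@%zu size=%zu\n",
         offsetof(struct node_before, hashnr),
         offsetof(struct node_before, key),
         offsetof(struct node_before, keylen),
         sizeof(struct node_before));
  printf("after:  key@%zu keylen@%zu hashnr@%zu size=%zu\n",
         offsetof(struct node_after, key),
         offsetof(struct node_after, keylen),
         offsetof(struct node_after, hashnr),
         sizeof(struct node_after));
}
```

Calling print_layouts() on the affected platform shows whether padding is inserted after the 32-bit field and where the key pointer ends up in each ordering.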


            krunalbauskar Krunal Bauskar added a comment -

            I have tested the fix suggested by Daniel, and it works fine when run with the said test case.
            The test case used to crash intermittently but has sufficiently stabilized (no crash observed) with the fix applied.
            We can assume the issue is fixed by the proposed patch.

            danblack Daniel Black added a comment -

            Ok, ready. I do remember hitting a lf_hash thing in ppc64le but never got to the bottom of this.

            bb-10.2-danielblack-MDEV-23510-arm-lfhash

            • lf_hash -> same patch as above, with a few extra volatile qualifiers removed because they are wrong.
            • HAVE_ALIGNED_MALLOC / HAVE_POSIX_MEMALIGN configure tests were missing from the beginning of the 10.0 branch (where they were only used in performance schema).
            • To ensure that the lf_hash structure is aligned, my_[mc]alloc_aligned was created (using the HAVE_* tests above) and is used by lf_dynarray_lvalue. The uses of this are in lf_hash, maria/lockman.c, and lf_pinbox_get_pins/lf_alloc_get_pins (that only get used in unit tests). Aligned allocation on Windows requires a special free function, so these need to be mapped correctly. In all cases, lf_dynarray_destroy is the destructor in use, hence my_free_aligned in recursive_free.
            • As a minor cleanup, pfs and innodb allocations were changed to use the new mysys my_[mc]alloc_aligned.

            For 10.5 - bb-10.5-danielblack-MDEV-23510-arm-lfhash

            • lf_hash contains a move to C++ in order to use my_assume_aligned assertions but is otherwise the same.
            • pfs - uses klass->count_alloc for counting (presumably to fix the 10.2 bug that pfs_allocated_memory was never decreased).
            • tpool (added in 10.5) now uses my_{malloc,free}_aligned rather than its own implementation.
            • innodb/mariabackup 10.5 extended its allocation usage significantly, so this includes extra uses of my_[mc]alloc_aligned rather than its own implementation. In a slight case of going too far, added error handling for memory allocation where an obvious within-function solution existed.

            marko Marko Mäkelä added a comment -

            Thank you. I think that I am really only entitled to review the InnoDB change (using the common wrapper my_malloc_aligned()). That change looked OK to me.

            As far as I understand, the necessary change is the first one, which to my understanding passed tests on the affected platform.

            I did not completely understand the issue, though. If some data must or must not be in the same cache line with something else, then we probably would want to allocate aligned memory, passing the cache line size as the parameter. I did not see that happening. You passed an alignment constraint of only sizeof(void*), which I think should be trivially satisfied by any malloc() implementation.

            The implementation of my_malloc_aligned() looks wrong to me in one aspect: It allows a fall-back to unaligned malloc(). We do not want to reintroduce MDEV-21337 and break innodb_flush_method=O_DIRECT or encrypted or page_compressed tables, on any platform.

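
To make the fall-back concern concrete, here is a minimal sketch of an aligned allocation wrapper that fails instead of silently returning unaligned memory. The names sketch_malloc_aligned and sketch_free_aligned are hypothetical and are not MariaDB's my_malloc_aligned; the sketch assumes a POSIX platform where posix_memalign() is available, and the alignment value (for example a cache line size) is chosen by the caller.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns memory aligned to `alignment`, or NULL on
   failure. It never falls back to plain malloc(). */
void *sketch_malloc_aligned(size_t size, size_t alignment)
{
  void *ptr = NULL;
  /* posix_memalign() requires a power of two that is a multiple of sizeof(void *). */
  assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);
  if (posix_memalign(&ptr, alignment, size) != 0)
    return NULL;
  return ptr;
}

/* Memory obtained from posix_memalign() is released with free(). */
void sketch_free_aligned(void *ptr)
{
  free(ptr);
}
```

A caller would request, say, sketch_malloc_aligned(100, 64) and treat a NULL result as an allocation failure. On Windows a different allocate/free pair would be needed, as noted in the earlier comment.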
            danblack Daniel Black added a comment -

            Pushed uncontroversial first commit.

            The entire `uchar *key` must be within a single cache line for a CPU to be able to read it atomically. With 32 bits on either side of a cache line boundary, one CPU could be writing the upper half of the address while another CPU is reading the lower half. The writer CPU will eventually finish, but the reader, combining its half with the other half, ends up with an invalid address.

            Leaving the other commits for a separate MDEV later:

            https://github.com/MariaDB/server/commit/270fb4219a0e317cd68fa0d79ab013494a4135fc
            https://github.com/MariaDB/server/commit/2a72b642f7176cba611d6a08dbab269ac0726ffc
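
The condition described in this comment (a pointer with 32 bits on either side of a cache line boundary) can be checked with a short sketch. The function name straddles_cache_line is hypothetical, and the cache line size is a parameter because it depends on the CPU; 64 bytes is used in the usage note purely as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when the object occupying [addr, addr + size) spans more than
   one cache line of `line_size` bytes. */
bool straddles_cache_line(uintptr_t addr, size_t size, size_t line_size)
{
  assert(size > 0 && line_size > 0);
  return addr / line_size != (addr + size - 1) / line_size;
}
```

With an assumed 64-byte line, an 8-byte pointer stored at offset 60 has 4 bytes in each line, so straddles_cache_line(60, 8, 64) is true, while the same pointer at offset 56 stays within one line.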

            People

              danblack Daniel Black
              krunalbauskar Krunal Bauskar