Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34565

MariaDB crashes with SIGILL because the OS does not support AVX512

Details

    Description

      The mariadb service start up is failing on AlmaLinux 9.4.

      Env:
      AlmaLinux 9.4
      libgcc-11.4.1-3.el9.alma.1.x86_64

      Mariadb packages installed:
      MariaDB-shared-10.11.8-1.el9.x86_64
      MariaDB-common-10.11.8-1.el9.x86_64
      MariaDB-client-10.11.8-1.el9.x86_64
      galera-4-26.4.18-1.el9.x86_64
      MariaDB-server-10.11.8-1.el9.x86_64
      MariaDB-backup-10.11.8-1.el9.x86_64

      my.cnf : Default (almost empty)

      #
      # This group is read both by the client and the server
      # use it for options that affect everything
      #
      [client-server]
       
      #
      # include *.cnf from the config directory
      #
      !includedir /etc/my.cnf.d
      

      Hardware Info:
      -VM

      [root@node1]# lscpu
      Architecture:            x86_64
        CPU op-mode(s):        32-bit, 64-bit
        Address sizes:         45 bits physical, 57 bits virtual
        Byte Order:            Little Endian
      CPU(s):                  2
        On-line CPU(s) list:   0,1
      Vendor ID:               GenuineIntel
        BIOS Vendor ID:        GenuineIntel
        Model name:            Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
          BIOS Model name:     Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
          CPU family:          6
          Model:               106
          Thread(s) per core:  1
          Core(s) per socket:  1
          Socket(s):           2
          Stepping:            6
          BogoMIPS:            3990.62
          Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology
                               tsc_reliable nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes f16c rdrand hypervisor lahf_lm abm 3dn
                               owprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni wbnoinvd arat umip rdpid md_cle
                               ar flush_l1d arch_capabilities
      Virtualization features:
        Hypervisor vendor:     VMware
        Virtualization type:   full
      Caches (sum of all):
        L1d:                   96 KiB (2 instances)
        L1i:                   64 KiB (2 instances)
        L2:                    2.5 MiB (2 instances)
        L3:                    84 MiB (2 instances)
      NUMA:
        NUMA node(s):          1
        NUMA node0 CPU(s):     0,1
      Vulnerabilities:
        Gather data sampling:  Unknown: Dependent on hypervisor status
        Itlb multihit:         KVM: Mitigation: VMX unsupported
        L1tf:                  Not affected
        Mds:                   Not affected
        Meltdown:              Not affected
        Mmio stale data:       Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
        Retbleed:              Not affected
        Spec rstack overflow:  Not affected
        Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
        Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
        Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
        Srbds:                 Not affected
        Tsx async abort:       Not affected
      [root@node1]#
      

      Error message from /var/log/message:

      Jul 11 01:36:45 node1 mariadbd[79583]: 2024-07-11  1:36:45 0 [Note] Starting MariaDB 10.11.8-MariaDB source revision 3a069644682e336e445039e48baae9693f9a08ee as process 79583
      Jul 11 01:36:45 node1 mariadbd[79583]: 240711  1:36:45 [ERROR] mysqld got signal 4 ;
      Jul 11 01:36:45 node1 mariadbd[79583]: Sorry, we probably made a mistake, and this is a bug.
      Jul 11 01:36:45 node1 mariadbd[79583]: Your assistance in bug reporting will enable us to fix this for the next release.
      Jul 11 01:36:45 node1 mariadbd[79583]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      Jul 11 01:36:45 node1 mariadbd[79583]: We will try our best to scrape up some info that will hopefully help
      Jul 11 01:36:45 node1 mariadbd[79583]: diagnose the problem, but since we have already crashed,
      Jul 11 01:36:45 node1 mariadbd[79583]: something is definitely wrong and this may fail.
      Jul 11 01:36:45 node1 mariadbd[79583]: Server version: 10.11.8-MariaDB source revision: 3a069644682e336e445039e48baae9693f9a08ee
      Jul 11 01:36:45 node1 mariadbd[79583]: key_buffer_size=134217728
      Jul 11 01:36:45 node1 mariadbd[79583]: read_buffer_size=131072
      Jul 11 01:36:45 node1 mariadbd[79583]: max_used_connections=0
      Jul 11 01:36:45 node1 mariadbd[79583]: max_threads=153
      Jul 11 01:36:45 node1 mariadbd[79583]: thread_count=0
      Jul 11 01:36:45 node1 mariadbd[79583]: It is possible that mysqld could use up to
      Jul 11 01:36:45 node1 mariadbd[79583]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 468061 K  bytes of memory
      Jul 11 01:36:45 node1 mariadbd[79583]: Hope that's ok; if not, decrease some variables in the equation.
      Jul 11 01:36:45 node1 mariadbd[79583]: Thread pointer: 0x0
      Jul 11 01:36:45 node1 mariadbd[79583]: Attempting backtrace. You can use the following information to find out
      Jul 11 01:36:45 node1 mariadbd[79583]: where mysqld died. If you see no messages after this, something went
      Jul 11 01:36:45 node1 mariadbd[79583]: terribly wrong...
      Jul 11 01:36:45 node1 mariadbd[79583]: stack_bottom = 0x0 thread_stack 0x49000
      Jul 11 01:36:45 node1 mariadbd[79583]: Printing to addr2line failed
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x55fb1b527cee]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(handle_fatal_signal+0x468)[0x55fb1afffa28]
      Jul 11 01:36:45 node1 mariadbd[79583]: /lib64/libc.so.6(+0x3e6f0)[0x7fbf8803e6f0]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(+0xfcbff5)[0x55fb1b53eff5]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(+0xc286f3)[0x55fb1b19b6f3]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(+0xc2071c)[0x55fb1b19371c]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(_Z24ha_initialize_handlertonP13st_plugin_int+0x7e)[0x55fb1b002b1e]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(+0x84b523)[0x55fb1adbe523]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(+0x8519fc)[0x55fb1adc49fc]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(_Z11plugin_initPiPPci+0x8f1)[0x55fb1adc5a21]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(+0x7244ff)[0x55fb1ac974ff]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(_Z11mysqld_mainiPPc+0x415)[0x55fb1ac9ca55]
      Jul 11 01:36:45 node1 mariadbd[79583]: /lib64/libc.so.6(+0x29590)[0x7fbf88029590]
      Jul 11 01:36:45 node1 mariadbd[79583]: /lib64/libc.so.6(__libc_start_main+0x80)[0x7fbf88029640]
      Jul 11 01:36:45 node1 mariadbd[79583]: /usr/sbin/mariadbd(_start+0x25)[0x55fb1ac91595]
      Jul 11 01:36:45 node1 mariadbd[79583]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains
      Jul 11 01:36:45 node1 mariadbd[79583]: information that should help you find out what is causing the crash.
      Jul 11 01:36:45 node1 mariadbd[79583]: Writing a core file...
      Jul 11 01:36:45 node1 mariadbd[79583]: Working directory at /var/lib/mysql
      Jul 11 01:36:45 node1 mariadbd[79583]: Resource Limits:
      Jul 11 01:36:45 node1 mariadbd[79583]: Limit                     Soft Limit           Hard Limit           Units
      Jul 11 01:36:45 node1 mariadbd[79583]: Max cpu time              unlimited            unlimited            seconds
      Jul 11 01:36:45 node1 mariadbd[79583]: Max file size             unlimited            unlimited            bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max data size             unlimited            unlimited            bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max stack size            8388608              unlimited            bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max core file size        0                    unlimited            bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max resident set          unlimited            unlimited            bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max processes             30630                30630                processes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max open files            32768                32768                files
      Jul 11 01:36:45 node1 mariadbd[79583]: Max locked memory         8388608              8388608              bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max address space         unlimited            unlimited            bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max file locks            unlimited            unlimited            locks
      Jul 11 01:36:45 node1 mariadbd[79583]: Max pending signals       30630                30630                signals
      Jul 11 01:36:45 node1 mariadbd[79583]: Max msgqueue size         819200               819200               bytes
      Jul 11 01:36:45 node1 mariadbd[79583]: Max nice priority         0                    0
      Jul 11 01:36:45 node1 mariadbd[79583]: Max realtime priority     0                    0
      Jul 11 01:36:45 node1 mariadbd[79583]: Max realtime timeout      unlimited            unlimited            us
      Jul 11 01:36:45 node1 mariadbd[79583]: Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
      Jul 11 01:36:45 node1 mariadbd[79583]: Kernel version: Linux version 5.14.0-427.18.1.el9_4.x86_64 (mockbuild@x64-builder02.almalinux.org) (gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3), GNU ld version 2.35.2-43.el9) #1 SMP PREEMPT_DYNAMIC Tue May 28 06:27:02 EDT 2024
      Jul 11 01:36:45 node1 systemd-coredump[79592]: Process 79583 (mariadbd) of user 1001 dumped core.
      Jul 11 01:36:45 node1 systemd[1]: systemd-coredump@1493-79590-0.service: Deactivated successfully.
      Jul 11 01:36:45 node1 systemd[1]: mariadb.service: Main process exited, code=dumped, status=4/ILL
      Jul 11 01:36:45 node1 systemd[1]: mariadb.service: Failed with result 'core-dump'.
      Jul 11 01:36:45 node1 systemd[1]: Failed to start MariaDB 10.11.8 database server.
      

      Attachments

        Issue Links

          Activity

            danblack Daniel Black added a comment -

            So 4 == illegal instruction.

            Thanks for the cpu info.

            Can you provide from "coredumpctl debug"

            and then "bt full" on the gdb prompt.

            danblack Daniel Black added a comment - So 4 == illegal instruction. Thanks for the cpu info. Can you provide from "coredumpctl debug" and then "bt full" on the gdb prompt.
            bijjupatel Bijju Patel added a comment - - edited

            Thanks for the quick response. Appreciate it.

            [root@node1 ~]# coredumpctl debug
                       PID: 8000 (mariadbd)
                       UID: 1001 (twcadmin)
                       GID: 1001 (twcadmin)
                    Signal: 4 (ILL)
                 Timestamp: Thu 2024-07-11 02:58:31 GMT (2h 39min ago)
              Command Line: /usr/sbin/mariadbd
                Executable: /usr/sbin/mariadbd
             Control Group: /system.slice/mariadb.service
                      Unit: mariadb.service
                     Slice: system.slice
                   Boot ID: d0ce20541044481e93948e9bb22a0be0
                Machine ID: 58a3dcd525a74d3a9aad5e8b2ced5799
                  Hostname: node1
                   Storage: /var/lib/systemd/coredump/core.mariadbd.1001.d0ce20541044481e93948e9bb22a0be0.8000.1720666711000000.zst (present)
              Size on Disk: 422.6K
                   Message: Process 8000 (mariadbd) of user 1001 dumped core.
             
                            Stack trace of thread 8000:
                            #0  0x00007f1ef7e8b94c __pthread_kill_implementation (libc.so.6 + 0x8b94c)
                            #1  0x00005600fcbd6a48 handle_fatal_signal (mariadbd + 0xa8ca48)
                            #2  0x00007f1ef7e3e6f0 __restore_rt (libc.so.6 + 0x3e6f0)
                            #3  0x00005600fd115ff5 _Z22_mm512_broadcast_i32x4Dv2_x (mariadbd + 0xfcbff5)
                            #4  0x00005600fcd726f3 ma_control_file_open (mariadbd + 0xc286f3)
                            #5  0x00005600fcd6a71c ha_maria_init (mariadbd + 0xc2071c)
                            #6  0x00005600fcbd9b1e _Z24ha_initialize_handlertonP13st_plugin_int (mariadbd + 0xa8fb1e)
                            #7  0x00005600fc995523 plugin_do_initialize (mariadbd + 0x84b523)
                            #8  0x00005600fc99b9fc plugin_initialize (mariadbd + 0x8519fc)
                            #9  0x00005600fc99ca21 _Z11plugin_initPiPPci (mariadbd + 0x852a21)
                            #10 0x00005600fc86e4ff init_server_components (mariadbd + 0x7244ff)
                            #11 0x00005600fc873a55 _Z11mysqld_mainiPPc (mariadbd + 0x729a55)
                            #12 0x00007f1ef7e29590 __libc_start_call_main (libc.so.6 + 0x29590)
                            #13 0x00007f1ef7e29640 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x29640)
                            #14 0x00005600fc868595 _start (mariadbd + 0x71e595)
             
                            Stack trace of thread 8001:
                            #0  0x00007f1ef7e8679a __futex_abstimed_wait_common (libc.so.6 + 0x8679a)
                            #1  0x00007f1ef7e892a4 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x892a4)
                            #2  0x00005600fd1025ff inline_mysql_cond_timedwait (mariadbd + 0xfb85ff)
                            #3  0x00007f1ef7e89c02 start_thread (libc.so.6 + 0x89c02)
                            #4  0x00007f1ef7f0ec40 __clone3 (libc.so.6 + 0x10ec40)
                            ELF object binary architecture: AMD x86-64
             
            GNU gdb (GDB) Red Hat Enterprise Linux 10.2-13.el9
            Copyright (C) 2021 Free Software Foundation, Inc.
            License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
            This is free software: you are free to change and redistribute it.
            There is NO WARRANTY, to the extent permitted by law.
            Type "show copying" and "show warranty" for details.
            This GDB was configured as "x86_64-redhat-linux-gnu".
            Type "show configuration" for configuration details.
            For bug reporting instructions, please see:
            <https://www.gnu.org/software/gdb/bugs/>.
            Find the GDB manual and other documentation resources online at:
                <http://www.gnu.org/software/gdb/documentation/>.
             
            For help, type "help".
            Type "apropos word" to search for commands related to "word"...
            Reading symbols from /usr/sbin/mariadbd...
            Reading symbols from /usr/lib/debug/usr/sbin/mariadbd-10.11.8-1.el9.x86_64.debug...
            [New LWP 8000]
            [New LWP 8001]
            [Thread debugging using libthread_db enabled]
            Using host libthread_db library "/usr/lib64/libthread_db.so.1".
            Core was generated by `/usr/sbin/mariadbd'.
            Program terminated with signal SIGILL, Illegal instruction.
            #0  __pthread_kill_implementation (threadid=<optimized out>, signo=4, no_tid=<optimized out>) at pthread_kill.c:44
            44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
            [Current thread is 1 (Thread 0x7f1ef8a391c0 (LWP 8000))]
            Missing separate debuginfos, use: dnf debuginfo-install libaio-0.3.111-13.el9.x86_64 libcap-2.48-9.el9_2.x86_64 libgcc-11.4.1-3.el9.alma.1.x86_64 libgcrypt-1.10.0-10.el9_2.x86_64 libgpg-error-1.42-5.el9.x86_64 libstdc++-11.4.1-3.el9.alma.1.x86_64 libxcrypt-4.4.18-3.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 lz4-libs-1.9.3-5.el9.x86_64 openssl-libs-3.0.7-27.el9.x86_64 pcre2-10.40-5.el9.x86_64 systemd-libs-252-32.el9_4.alma.1.x86_64 xz-libs-5.2.5-8.el9_0.x86_64 zlib-1.2.11-40.el9.x86_64
            (gdb)
             
            (gdb) bt full
            #0  __pthread_kill_implementation (threadid=<optimized out>, signo=4, no_tid=<optimized out>) at pthread_kill.c:44
                    tid = <optimized out>
                    ret = 0
                    pd = <optimized out>
                    old_mask = {__val = {15, 0, 18446744073709551600, 94562245650864, 0 <repeats 11 times>, 94561288323072}}
                    ret = <optimized out>
            #1  0x00005600fcbd6a48 in handle_fatal_signal (sig=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/signal_handler.cc:357
                    curr_time = 1720666711
                    tm = {tm_sec = 31, tm_min = 58, tm_hour = 2, tm_mday = 11, tm_mon = 6, tm_year = 124, tm_wday = 4, tm_yday = 192, tm_isdst = 0, tm_gmtoff = 0, tm_zone = 0x5600fed9b200 "GMT"}
                    thd = 0x0
                    print_invalid_query_pointer = false
            #2  <signal handler called>
            No locals.
            #3  0x00005600fd115ff5 in _mm512_broadcast_i32x4(long long __vector(2)) (__A=...) at /usr/lib/gcc/x86_64-redhat-linux/11/include/avx512fintrin.h:4203
            No locals.
            #4  crc32_avx512 (crc=0, buf=0x7ffd26edcf40 "\376\376\f\001S\361]\033\346\260\021\356\233&", size=26, tab=...) at /usr/src/debug/MariaDB-/src_0/mysys/crc32/crc32c_x86.cc:216
                    crc_in = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>}
                    b512 = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>}
                    crc_out = {<optimized out>, <optimized out>}
                    lo = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>}
                    b = {<optimized out>, <optimized out>}
            #5  0x00005600fcd726f3 in ma_control_file_open (create_if_missing=<optimized out>, print_error=<optimized out>, wait_for_lock=<optimized out>, open_flags=<optimized out>)
                at /usr/src/debug/MariaDB-/src_0/storage/maria/ma_control_file.c:421
                    buffer = "\376\376\f\001S\361]\033\346\260\021\356\233&\000PV\242\006\354\036\000\026\000\000 \037\360\246M\024\225\304\371\001\000\000F\336M\005\001\000\000\000@\300\000\000\000\000\000\375\177\000\000\200\247\344\376\000V\000\000\240\317\355&\375\177\000\000en\017\375\000V\000\000d/4\375\000V\000\000\340\271\344\376\000V\000\000\240\322\355&\375\177\000\000\223v\017\375\000V", '\000' <repeats 82 times>, "\200\247\344\376\000V\000\000"...
                    name = "/var/lib/mysql/aria_log_control\000\020\020\000\000\000\000\000\000\334\321\355&\375\177\000\000\340\312\344\376\000V\000\000\210\321\355&\375\177\000\000\002\000\000\000\000\000\000\000#\000\000\000\000\000\000\000\000\322\355&\375\177\000\000 )\260\375\000V\000\000\210\027\344\376\000V\000\000\340\312\344\376\000V\000\000\000\000\000\000\000\000\000\000\371\341\337\374\000V\000\000\340\321\355&\375\177\000\000\070\322\355&\375\177\000\000\364\327Ä”\374\023\273\212\251\217\351\367\036\177\000\000\000\000\000\000\000\000\000\000 \000\000\000\000\000\000\000 \002\000\000\000\000\000\000@\242\243\375\000V\000\000\030\002\000\000\000\000\000\000\213"...
                    errmsg_buff = "\300\323\355&\375\177\000\000\v\000\000\000\000\000\000\000\313\006\374\367\036\177\000\000\240v\377\367\036\177\000\000\000\325\355&\375\177\000\000\360\315\346\367\036\177\000\000\300\006\374\367\036\177\000\000\003\000\000\000\000\000\000\000h\r\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000 \377\377\377\240\017\000\000\350\003\000\000\n\000\000\000\000\000\000\000\200Q\001\000\000\000\000\000X\002\000\000\000\000\000\000\000\000\000\000\300\006\374\367\036\177\000\000\000\000\000\000`\352\000\000\004\000\000\000\026\000\000\000\377\377\377\377\377\377\377\377\361", '\000' <repeats 23 times>, "\220\317\355&\375\177\000\000\313\006\374\367\036\177\000\000"...
                    errmsg = <optimized out>
                    lock_failed_errmsg = <optimized out>
                    new_cf_create_time_size = 30
                    new_cf_changeable_size = 22
                    new_block_size = <optimized out>
                    file_size = 52
                    error = <optimized out>
                    ok = <optimized out>
            #6  0x00005600fcd6a71c in ha_maria_init (p=0x5600fee4db98) at /usr/src/debug/MariaDB-/src_0/storage/maria/ha_maria.cc:3876
                    res = 0
                    tmp = <optimized out>
            --Type <RET> for more, q to quit, c to continue without paging--
                    log_dir = 0x5600fdb012c0 <mysql_real_data_home> "/var/lib/mysql/"
            #7  0x00005600fcbd9b1e in ha_initialize_handlerton (plugin=0x5600fee3fd20) at /usr/src/debug/MariaDB-/src_0/sql/handler.cc:648
                    hton = 0x5600fee4db98
                    ret = 0
                    tmp = <optimized out>
                    fslot = <optimized out>
            #8  0x00005600fc995523 in plugin_do_initialize (plugin=plugin@entry=0x5600fee3fd20, state=@0x7ffd26edd4fc: 4) at /usr/src/debug/MariaDB-/src_0/sql/sql_plugin.cc:1453
                    ret = <optimized out>
                    init = <optimized out>
            #9  0x00005600fc99b9fc in plugin_initialize (tmp_root=0x7ffd26eddac0, plugin=0x5600fee3fd20, argc=0x5600fdb024a0 <remaining_argc>, argv=0x5600fed9b058, options_only=<optimized out>)
                at /usr/src/debug/MariaDB-/src_0/sql/sql_plugin.cc:1506
                    ret = 1
                    state = 4
            #10 0x00005600fc99ca21 in plugin_init (argc=argc@entry=0x5600fdb024a0 <remaining_argc>, argv=<optimized out>, flags=0) at /usr/src/debug/MariaDB-/src_0/sql/sql_plugin.cc:1764
                    plugin_table_engine = <optimized out>
                    opts_only = <optimized out>
                    idx = 4
                    hash = 0x5600fdb05900 <plugin_hash+128>
                    error = <optimized out>
                    i = <optimized out>
                    builtins = <optimized out>
                    plugin = <optimized out>
                    tmp = {name = {str = 0x5600fd712712 "partition", length = 9}, plugin = 0x5600fda3aaa0 <builtin_maria_partition_plugin>, plugin_dl = 0x0, ptr_backup = 0x0, nbackups = 0, state = 4,
                      ref_count = 0, locks_total = 0, data = 0x0, mem_root = {free = 0x0, used = 0x0, pre_alloc = 0x0, min_malloc = 0, block_size = 0, block_num = 0, first_block_usage = 0, flags = 0,
                        error_handler = 0x0, psi_key = 0}, system_vars = 0x0, load_option = PLUGIN_ON}
                    plugin_ptr = 0x5600fee3fd20
                    reap = 0x7ffd26edd788
                    retry_end = 0x7ffd26edd540
                    retry_start = 0x7ffd26edd540
                    tmp_root = {free = 0x5600fee49178, used = 0x5600fee4a768, pre_alloc = 0x5600fee41a78, min_malloc = 32, block_size = 4088, block_num = 8, first_block_usage = 0, flags = 0, error_handler = 0x0,
                      psi_key = 0}
                    reaped_mandatory_plugin = false
                    mandatory = <optimized out>
                    opt_plugin_load_list_iter = {<base_ilist_iterator> = {list = 0x5600fdb06000 <opt_plugin_load_list>, el = <optimized out>, current = <optimized out>}, <No data fields>}
                    plugin_table_engine_name_buf = "Aria\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\244\311\343\376\000V\000\000\000\000\000\000\000\000\000\000"
                    plugin_table_engine_name = {str = 0x7ffd26eddc30 "Aria", length = 4}
                    MyISAM = {str = <optimized out>, length = <optimized out>}
            #11 0x00005600fc86e4ff in init_server_components () at /usr/src/debug/MariaDB-/src_0/sql/mysqld.cc:5247
            No locals.
            #12 0x00005600fc873a55 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/mysqld.cc:5873
                    please_close_stdin = true
                    ho_error = <optimized out>
                    new_thread_stack_size = <optimized out>
                    user = <optimized out>
            --Type <RET> for more, q to quit, c to continue without paging--
            #13 0x00007f1ef7e29590 in __libc_start_call_main (main=main@entry=0x5600fc825460 <main(int, char**)>, argc=argc@entry=1, argv=argv@entry=0x7ffd26ee0bd8) at ../sysdeps/nptl/libc_start_call_main.h:58
                    self = <optimized out>
                    result = <optimized out>
                    unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -8465895077201800942, 140725256588248, 94562236388448, 94562253736824, 139771001217024, 8464889969319252242, 8412432240417607954},
                          mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x5600fc825460 <main(int, char**)>, 0x5600fd8b0b78}, data = {prev = 0x0, cleanup = 0x0, canceltype = -58567584}}}
                    not_first_call = <optimized out>
            #14 0x00007f1ef7e29640 in __libc_start_main_impl (main=0x5600fc825460 <main(int, char**)>, argc=1, argv=0x7ffd26ee0bd8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
                stack_end=0x7ffd26ee0bc8) at ../csu/libc-start.c:389
            No locals.
            #15 0x00005600fc868595 in _start () at /usr/src/debug/MariaDB-/src_0/sql/my_decimal.h:147
            No symbol table info available.
            (gdb)
            

            bijjupatel Bijju Patel added a comment - - edited Thanks for the quick response. Appreciate it. [root@node1 ~]# coredumpctl debug PID: 8000 (mariadbd) UID: 1001 (twcadmin) GID: 1001 (twcadmin) Signal: 4 (ILL) Timestamp: Thu 2024-07-11 02:58:31 GMT (2h 39min ago) Command Line: /usr/sbin/mariadbd Executable: /usr/sbin/mariadbd Control Group: /system.slice/mariadb.service Unit: mariadb.service Slice: system.slice Boot ID: d0ce20541044481e93948e9bb22a0be0 Machine ID: 58a3dcd525a74d3a9aad5e8b2ced5799 Hostname: node1 Storage: /var/lib/systemd/coredump/core.mariadbd.1001.d0ce20541044481e93948e9bb22a0be0.8000.1720666711000000.zst (present) Size on Disk: 422.6K Message: Process 8000 (mariadbd) of user 1001 dumped core.   Stack trace of thread 8000: #0 0x00007f1ef7e8b94c __pthread_kill_implementation (libc.so.6 + 0x8b94c) #1 0x00005600fcbd6a48 handle_fatal_signal (mariadbd + 0xa8ca48) #2 0x00007f1ef7e3e6f0 __restore_rt (libc.so.6 + 0x3e6f0) #3 0x00005600fd115ff5 _Z22_mm512_broadcast_i32x4Dv2_x (mariadbd + 0xfcbff5) #4 0x00005600fcd726f3 ma_control_file_open (mariadbd + 0xc286f3) #5 0x00005600fcd6a71c ha_maria_init (mariadbd + 0xc2071c) #6 0x00005600fcbd9b1e _Z24ha_initialize_handlertonP13st_plugin_int (mariadbd + 0xa8fb1e) #7 0x00005600fc995523 plugin_do_initialize (mariadbd + 0x84b523) #8 0x00005600fc99b9fc plugin_initialize (mariadbd + 0x8519fc) #9 0x00005600fc99ca21 _Z11plugin_initPiPPci (mariadbd + 0x852a21) #10 0x00005600fc86e4ff init_server_components (mariadbd + 0x7244ff) #11 0x00005600fc873a55 _Z11mysqld_mainiPPc (mariadbd + 0x729a55) #12 0x00007f1ef7e29590 __libc_start_call_main (libc.so.6 + 0x29590) #13 0x00007f1ef7e29640 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x29640) #14 0x00005600fc868595 _start (mariadbd + 0x71e595)   Stack trace of thread 8001: #0 0x00007f1ef7e8679a __futex_abstimed_wait_common (libc.so.6 + 0x8679a) #1 0x00007f1ef7e892a4 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x892a4) #2 0x00005600fd1025ff inline_mysql_cond_timedwait (mariadbd + 0xfb85ff) #3 0x00007f1ef7e89c02 start_thread (libc.so.6 + 0x89c02) #4 0x00007f1ef7f0ec40 __clone3 (libc.so.6 + 0x10ec40) ELF object binary architecture: AMD x86-64   GNU gdb (GDB) Red Hat Enterprise Linux 10.2-13.el9 Copyright (C) 2021 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>.   For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/sbin/mariadbd... Reading symbols from /usr/lib/debug/usr/sbin/mariadbd-10.11.8-1.el9.x86_64.debug... [New LWP 8000] [New LWP 8001] [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib64/libthread_db.so.1". Core was generated by `/usr/sbin/mariadbd'. Program terminated with signal SIGILL, Illegal instruction. #0 __pthread_kill_implementation (threadid=<optimized out>, signo=4, no_tid=<optimized out>) at pthread_kill.c:44 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; [Current thread is 1 (Thread 0x7f1ef8a391c0 (LWP 8000))] Missing separate debuginfos, use: dnf debuginfo-install libaio-0.3.111-13.el9.x86_64 libcap-2.48-9.el9_2.x86_64 libgcc-11.4.1-3.el9.alma.1.x86_64 libgcrypt-1.10.0-10.el9_2.x86_64 libgpg-error-1.42-5.el9.x86_64 libstdc++-11.4.1-3.el9.alma.1.x86_64 libxcrypt-4.4.18-3.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 lz4-libs-1.9.3-5.el9.x86_64 openssl-libs-3.0.7-27.el9.x86_64 pcre2-10.40-5.el9.x86_64 systemd-libs-252-32.el9_4.alma.1.x86_64 xz-libs-5.2.5-8.el9_0.x86_64 zlib-1.2.11-40.el9.x86_64 (gdb)   (gdb) bt full #0 __pthread_kill_implementation (threadid=<optimized out>, signo=4, no_tid=<optimized out>) at pthread_kill.c:44 tid = <optimized out> ret = 0 pd = <optimized out> old_mask = {__val = {15, 0, 18446744073709551600, 94562245650864, 0 <repeats 11 times>, 94561288323072}} ret = <optimized out> #1 0x00005600fcbd6a48 in handle_fatal_signal (sig=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/signal_handler.cc:357 curr_time = 1720666711 tm = {tm_sec = 31, tm_min = 58, tm_hour = 2, tm_mday = 11, tm_mon = 6, tm_year = 124, tm_wday = 4, tm_yday = 192, tm_isdst = 0, tm_gmtoff = 0, tm_zone = 0x5600fed9b200 "GMT"} thd = 0x0 print_invalid_query_pointer = false #2 <signal handler called> No locals. #3 0x00005600fd115ff5 in _mm512_broadcast_i32x4(long long __vector(2)) (__A=...) at /usr/lib/gcc/x86_64-redhat-linux/11/include/avx512fintrin.h:4203 No locals. #4 crc32_avx512 (crc=0, buf=0x7ffd26edcf40 "\376\376\f\001S\361]\033\346\260\021\356\233&", size=26, tab=...) at /usr/src/debug/MariaDB-/src_0/mysys/crc32/crc32c_x86.cc:216 crc_in = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>} b512 = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>} crc_out = {<optimized out>, <optimized out>} lo = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>} b = {<optimized out>, <optimized out>} #5 0x00005600fcd726f3 in ma_control_file_open (create_if_missing=<optimized out>, print_error=<optimized out>, wait_for_lock=<optimized out>, open_flags=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/maria/ma_control_file.c:421 buffer = "\376\376\f\001S\361]\033\346\260\021\356\233&\000PV\242\006\354\036\000\026\000\000 \037\360\246M\024\225\304\371\001\000\000F\336M\005\001\000\000\000@\300\000\000\000\000\000\375\177\000\000\200\247\344\376\000V\000\000\240\317\355&\375\177\000\000en\017\375\000V\000\000d/4\375\000V\000\000\340\271\344\376\000V\000\000\240\322\355&\375\177\000\000\223v\017\375\000V", '\000' <repeats 82 times>, "\200\247\344\376\000V\000\000"... name = "/var/lib/mysql/aria_log_control\000\020\020\000\000\000\000\000\000\334\321\355&\375\177\000\000\340\312\344\376\000V\000\000\210\321\355&\375\177\000\000\002\000\000\000\000\000\000\000#\000\000\000\000\000\000\000\000\322\355&\375\177\000\000 )\260\375\000V\000\000\210\027\344\376\000V\000\000\340\312\344\376\000V\000\000\000\000\000\000\000\000\000\000\371\341\337\374\000V\000\000\340\321\355&\375\177\000\000\070\322\355&\375\177\000\000\364\327Ä”\374\023\273\212\251\217\351\367\036\177\000\000\000\000\000\000\000\000\000\000 \000\000\000\000\000\000\000 \002\000\000\000\000\000\000@\242\243\375\000V\000\000\030\002\000\000\000\000\000\000\213"... errmsg_buff = "\300\323\355&\375\177\000\000\v\000\000\000\000\000\000\000\313\006\374\367\036\177\000\000\240v\377\367\036\177\000\000\000\325\355&\375\177\000\000\360\315\346\367\036\177\000\000\300\006\374\367\036\177\000\000\003\000\000\000\000\000\000\000h\r\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000 \377\377\377\240\017\000\000\350\003\000\000\n\000\000\000\000\000\000\000\200Q\001\000\000\000\000\000X\002\000\000\000\000\000\000\000\000\000\000\300\006\374\367\036\177\000\000\000\000\000\000`\352\000\000\004\000\000\000\026\000\000\000\377\377\377\377\377\377\377\377\361", '\000' <repeats 23 times>, "\220\317\355&\375\177\000\000\313\006\374\367\036\177\000\000"... errmsg = <optimized out> lock_failed_errmsg = <optimized out> new_cf_create_time_size = 30 new_cf_changeable_size = 22 new_block_size = <optimized out> file_size = 52 error = <optimized out> ok = <optimized out> #6 0x00005600fcd6a71c in ha_maria_init (p=0x5600fee4db98) at /usr/src/debug/MariaDB-/src_0/storage/maria/ha_maria.cc:3876 res = 0 tmp = <optimized out> --Type <RET> for more, q to quit, c to continue without paging-- log_dir = 0x5600fdb012c0 <mysql_real_data_home> "/var/lib/mysql/" #7 0x00005600fcbd9b1e in ha_initialize_handlerton (plugin=0x5600fee3fd20) at /usr/src/debug/MariaDB-/src_0/sql/handler.cc:648 hton = 0x5600fee4db98 ret = 0 tmp = <optimized out> fslot = <optimized out> #8 0x00005600fc995523 in plugin_do_initialize (plugin=plugin@entry=0x5600fee3fd20, state=@0x7ffd26edd4fc: 4) at /usr/src/debug/MariaDB-/src_0/sql/sql_plugin.cc:1453 ret = <optimized out> init = <optimized out> #9 0x00005600fc99b9fc in plugin_initialize (tmp_root=0x7ffd26eddac0, plugin=0x5600fee3fd20, argc=0x5600fdb024a0 <remaining_argc>, argv=0x5600fed9b058, options_only=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/sql_plugin.cc:1506 ret = 1 state = 4 #10 0x00005600fc99ca21 in plugin_init (argc=argc@entry=0x5600fdb024a0 <remaining_argc>, argv=<optimized out>, flags=0) at /usr/src/debug/MariaDB-/src_0/sql/sql_plugin.cc:1764 plugin_table_engine = <optimized out> opts_only = <optimized out> idx = 4 hash = 0x5600fdb05900 <plugin_hash+128> error = <optimized out> i = <optimized out> builtins = <optimized out> plugin = <optimized out> tmp = {name = {str = 0x5600fd712712 "partition", length = 9}, plugin = 0x5600fda3aaa0 <builtin_maria_partition_plugin>, plugin_dl = 0x0, ptr_backup = 0x0, nbackups = 0, state = 4, ref_count = 0, locks_total = 0, data = 0x0, mem_root = {free = 0x0, used = 0x0, pre_alloc = 0x0, min_malloc = 0, block_size = 0, block_num = 0, first_block_usage = 0, flags = 0, error_handler = 0x0, psi_key = 0}, system_vars = 0x0, load_option = PLUGIN_ON} plugin_ptr = 0x5600fee3fd20 reap = 0x7ffd26edd788 retry_end = 0x7ffd26edd540 retry_start = 0x7ffd26edd540 tmp_root = {free = 0x5600fee49178, used = 0x5600fee4a768, pre_alloc = 0x5600fee41a78, min_malloc = 32, block_size = 4088, block_num = 8, first_block_usage = 0, flags = 0, error_handler = 0x0, psi_key = 0} reaped_mandatory_plugin = false mandatory = <optimized out> opt_plugin_load_list_iter = {<base_ilist_iterator> = {list = 0x5600fdb06000 <opt_plugin_load_list>, el = <optimized out>, current = <optimized out>}, <No data fields>} plugin_table_engine_name_buf = "Aria\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\242\311\343\376\000V\000\000\244\311\343\376\000V\000\000\000\000\000\000\000\000\000\000" plugin_table_engine_name = {str = 0x7ffd26eddc30 "Aria", length = 4} MyISAM = {str = <optimized out>, length = <optimized out>} #11 0x00005600fc86e4ff in init_server_components () at /usr/src/debug/MariaDB-/src_0/sql/mysqld.cc:5247 No locals. #12 0x00005600fc873a55 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/mysqld.cc:5873 please_close_stdin = true ho_error = <optimized out> new_thread_stack_size = <optimized out> user = <optimized out> --Type <RET> for more, q to quit, c to continue without paging-- #13 0x00007f1ef7e29590 in __libc_start_call_main (main=main@entry=0x5600fc825460 <main(int, char**)>, argc=argc@entry=1, argv=argv@entry=0x7ffd26ee0bd8) at ../sysdeps/nptl/libc_start_call_main.h:58 self = <optimized out> result = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -8465895077201800942, 140725256588248, 94562236388448, 94562253736824, 139771001217024, 8464889969319252242, 8412432240417607954}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x5600fc825460 <main(int, char**)>, 0x5600fd8b0b78}, data = {prev = 0x0, cleanup = 0x0, canceltype = -58567584}}} not_first_call = <optimized out> #14 0x00007f1ef7e29640 in __libc_start_main_impl (main=0x5600fc825460 <main(int, char**)>, argc=1, argv=0x7ffd26ee0bd8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd26ee0bc8) at ../csu/libc-start.c:389 No locals. #15 0x00005600fc868595 in _start () at /usr/src/debug/MariaDB-/src_0/sql/my_decimal.h:147 No symbol table info available. (gdb)
            danblack Daniel Black added a comment -

            decompile on mariadbd at offset fcbff5

            0000000000fcbfa0 <my_crc32c@@Base>:
              fcbfa0:       ff 25 a2 00 2b 01       jmpq   *0x12b00a2(%rip)        # 227c048 <my_may_have_atomic_write@@Base+0xce4>
              fcbfa6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
              fcbfad:       00 00 00
              fcbfb0:       55                      push   %rbp
              fcbfb1:       b8 07 00 00 00          mov    $0x7,%eax
              fcbfb6:       31 c9                   xor    %ecx,%ecx
              fcbfb8:       48 89 e5                mov    %rsp,%rbp
              fcbfbb:       53                      push   %rbx
              fcbfbc:       0f a2                   cpuid
              fcbfbe:       80 e5 04                and    $0x4,%ch
              fcbfc1:       74 1d                   je     fcbfe0 <my_crc32c@@Base+0x40>
              fcbfc3:       f7 d3                   not    %ebx
              fcbfc5:       31 c0                   xor    %eax,%eax
              fcbfc7:       81 e3 00 00 03 c0       and    $0xc0030000,%ebx
              fcbfcd:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
              fcbfd1:       c9                      leaveq
              fcbfd2:       0f 94 c0                sete   %al
              fcbfd5:       c3                      retq
              fcbfd6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
              fcbfdd:       00 00 00
              fcbfe0:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
              fcbfe4:       31 c0                   xor    %eax,%eax
              fcbfe6:       c9                      leaveq
              fcbfe7:       c3                      retq
              fcbfe8:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
              fcbfef:       00
              fcbff0:       48 89 d0                mov    %rdx,%rax
              fcbff3:       89 fa                   mov    %edi,%edx
              fcbff5:       62 72 7d 48 5a 41 07    vbroadcasti32x4 0x70(%rcx),%zmm8
            

            So vbroadcasti32x4 us a AVX2 instruction.

            danblack Daniel Black added a comment - decompile on mariadbd at offset fcbff5 0000000000fcbfa0 <my_crc32c@@Base>: fcbfa0: ff 25 a2 00 2b 01 jmpq *0x12b00a2(%rip) # 227c048 <my_may_have_atomic_write@@Base+0xce4> fcbfa6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) fcbfad: 00 00 00 fcbfb0: 55 push %rbp fcbfb1: b8 07 00 00 00 mov $0x7,%eax fcbfb6: 31 c9 xor %ecx,%ecx fcbfb8: 48 89 e5 mov %rsp,%rbp fcbfbb: 53 push %rbx fcbfbc: 0f a2 cpuid fcbfbe: 80 e5 04 and $0x4,%ch fcbfc1: 74 1d je fcbfe0 <my_crc32c@@Base+0x40> fcbfc3: f7 d3 not %ebx fcbfc5: 31 c0 xor %eax,%eax fcbfc7: 81 e3 00 00 03 c0 and $0xc0030000,%ebx fcbfcd: 48 8b 5d f8 mov -0x8(%rbp),%rbx fcbfd1: c9 leaveq fcbfd2: 0f 94 c0 sete %al fcbfd5: c3 retq fcbfd6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) fcbfdd: 00 00 00 fcbfe0: 48 8b 5d f8 mov -0x8(%rbp),%rbx fcbfe4: 31 c0 xor %eax,%eax fcbfe6: c9 leaveq fcbfe7: c3 retq fcbfe8: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1) fcbfef: 00 fcbff0: 48 89 d0 mov %rdx,%rax fcbff3: 89 fa mov %edi,%edx fcbff5: 62 72 7d 48 5a 41 07 vbroadcasti32x4 0x70(%rcx),%zmm8 So vbroadcasti32x4 us a AVX2 instruction.
            danblack Daniel Black added a comment -

            So it did a runtime check - detected vpclmulqdq as a CPU feature.

            crc32_avx512 was called, and despite its name, is compiled with target "pclmul,avx512f,avx512dq,avx512bw,avx512vl,vpclmulqdq" so should be portable on all those platforms.

            The _mm512_broadcast_i32x4 intrinsic at the top of this function is the line it occurs at.

            /usr/lib/gcc/x86_64-redhat-linux/11/include/avx512fintrin.h where its implemented

            This header starts with:

            #ifndef __AVX512F__
            #pragma GCC push_options
            #pragma GCC target("avx512f")
            #define __DISABLE_AVX512F__
            #endif /* __AVX512F__ */
            

            (does the logic on ifndef look inverted to anyone but me?)

            So it looks like crc32_avx512 because of the include headers is requiring avx512f instructions.

            danblack Daniel Black added a comment - So it did a runtime check - detected vpclmulqdq as a CPU feature. crc32_avx512 was called, and despite its name, is compiled with target "pclmul,avx512f,avx512dq,avx512bw,avx512vl,vpclmulqdq" so should be portable on all those platforms. The _mm512_broadcast_i32x4 intrinsic at the top of this function is the line it occurs at. /usr/lib/gcc/x86_64-redhat-linux/11/include/avx512fintrin.h where its implemented This header starts with: #ifndef __AVX512F__ #pragma GCC push_options #pragma GCC target("avx512f") #define __DISABLE_AVX512F__ #endif /* __AVX512F__ */ (does the logic on ifndef look inverted to anyone but me?) So it looks like crc32_avx512 because of the include headers is requiring avx512f instructions.

            danblack, have_vpclmulqdq detects more than just vpclmulqdq . It is supposed to detect avx512f, too (I think it would be better reflected in its name, but it is not yet
            marko, any thoughts?

            wlad Vladislav Vaintroub added a comment - danblack , have_vpclmulqdq detects more than just vpclmulqdq . It is supposed to detect avx512f, too (I think it would be better reflected in its name, but it is not yet marko , any thoughts?

            Actually , checking for "AVX512 supported" is a more complicated adventure.

            https://github.com/ispc/ispc/blob/v1.24.0/check_isa.cpp#L120 could be somewhat correct, because it is from intel.
            apart from cpuid bits specific to AVX512, they check for OSXSAVE, they check for AVX2, they check if it is supported by OS using xgetbv in https://github.com/ispc/ispc/blob/v1.24.0/check_isa.cpp#L48 .
            Not so easy.

            wlad Vladislav Vaintroub added a comment - Actually , checking for "AVX512 supported" is a more complicated adventure. https://github.com/ispc/ispc/blob/v1.24.0/check_isa.cpp#L120 could be somewhat correct, because it is from intel. apart from cpuid bits specific to AVX512, they check for OSXSAVE, they check for AVX2, they check if it is supported by OS using xgetbv in https://github.com/ispc/ispc/blob/v1.24.0/check_isa.cpp#L48 . Not so easy.

            Curiously, in the lscpu output from the virtual machine I do not see vpclmulqdq or any avx512 flags that the have_vpclmulqdq check is looking for. But, isn’t the microarchitecture of the Intel Xeon Gold 6330 called Ice Lake, that is, the first one to support these features? Is AVX512 support somehow missing from the VM hypervisor?

            On Linux, compared to the Intel library, our detection code is missing the xgetbv check. Maybe adding it could fix this.

            marko Marko Mäkelä added a comment - Curiously, in the lscpu output from the virtual machine I do not see vpclmulqdq or any avx512 flags that the have_vpclmulqdq check is looking for. But, isn’t the microarchitecture of the Intel Xeon Gold 6330 called Ice Lake, that is, the first one to support these features? Is AVX512 support somehow missing from the VM hypervisor? On Linux, compared to the Intel library, our detection code is missing the xgetbv check. Maybe adding it could fix this.

            I think that the disassembly that danblack posted shows several different functions. The function crc32_avx512, which is common for both CRC-32 and CRC-32C, appears to start at 0xfcbff0. The illegal instruction corresponds to the second line in the function body:

            static unsigned crc32_avx512(unsigned crc, const char *buf, size_t size,
                                         const crc32_tab &tab)
            {
              const __m512i crc_in = _mm512_castsi128_si512(_mm_cvtsi32_si128(~crc)),
                b512 = _mm512_broadcast_i32x4(_mm_load_epi32(tab.b512));
            

            marko Marko Mäkelä added a comment - I think that the disassembly that danblack posted shows several different functions. The function crc32_avx512 , which is common for both CRC-32 and CRC-32C, appears to start at 0xfcbff0 . The illegal instruction corresponds to the second line in the function body: static unsigned crc32_avx512(unsigned crc, const char *buf, size_t size, const crc32_tab &tab) { const __m512i crc_in = _mm512_castsi128_si512(_mm_cvtsi32_si128(~crc)), b512 = _mm512_broadcast_i32x4(_mm_load_epi32(tab.b512));

            marko, we already had seen VMs disabling native instructions, MDEV-34372, so VM hypervisor is likely the reason. compared to intel's code, ours is also missing checks for xgetbv, XSAVE and AVX2.

            wlad Vladislav Vaintroub added a comment - marko , we already had seen VMs disabling native instructions, MDEV-34372 , so VM hypervisor is likely the reason. compared to intel's code, ours is also missing checks for xgetbv, XSAVE and AVX2.
            wlad Vladislav Vaintroub added a comment - - edited

            marko, I guess it is yours. I can't really test it. But you can, on your AVX512-capable hardware.

            Allegedly, there is linux boot parameter that will make avx go away - https://stackoverflow.com/a/48764832/547065 , and you'd need to test with and without it. And for safety, I think all other conditions from intel code need to be applied.

            BTW, Intel states macOS-Intel does support AVX512, but in a funny way - it is first enabled after illegal instruction exception is raised and caught.

            wlad Vladislav Vaintroub added a comment - - edited marko , I guess it is yours. I can't really test it. But you can, on your AVX512-capable hardware. Allegedly, there is linux boot parameter that will make avx go away - https://stackoverflow.com/a/48764832/547065 , and you'd need to test with and without it. And for safety, I think all other conditions from intel code need to be applied. BTW, Intel states macOS-Intel does support AVX512, but in a funny way - it is first enabled after illegal instruction exception is raised and caught.

            I am back from vacation. I am afraid that I can’t really test this either, other than by ensuring that the AVX512 instructions will be used even after adding the xgetbv check.

            This would have to be tested by bijjupatel in an environment where the error was reproduced in the first place. The virtualization hypervisor and the guest operating system configuration is relevant. I do not know if the hypervisor can trap an xsetbv instruction that would be executed by the guest operating system kernel, and what the effect might be. Hopefully it will be such that xgetbv will report the registers as not usable.

            Some versions of the clang based compiler that Apple has forked for macOS disable the intrinsic functions for AVX512. My commit message for fixing up that does not mention the compilation error, but it points to https://discussions.apple.com/thread/8256853 which suggests that they do not save the 512-bit zmm registers on a context switch.

            marko Marko Mäkelä added a comment - I am back from vacation. I am afraid that I can’t really test this either, other than by ensuring that the AVX512 instructions will be used even after adding the xgetbv check. This would have to be tested by bijjupatel in an environment where the error was reproduced in the first place. The virtualization hypervisor and the guest operating system configuration is relevant. I do not know if the hypervisor can trap an xsetbv instruction that would be executed by the guest operating system kernel, and what the effect might be. Hopefully it will be such that xgetbv will report the registers as not usable. Some versions of the clang based compiler that Apple has forked for macOS disable the intrinsic functions for AVX512. My commit message for fixing up that does not mention the compilation error, but it points to https://discussions.apple.com/thread/8256853 which suggests that they do not save the 512-bit zmm registers on a context switch.

            @marko, is there a noxsave Linux boot parameter, as claimed in https://stackoverflow.com/a/48764832/547065 ? Does adding it makes AVX disappear?

            wlad Vladislav Vaintroub added a comment - @marko, is there a noxsave Linux boot parameter, as claimed in https://stackoverflow.com/a/48764832/547065 ? Does adding it makes AVX disappear?

            I do not have the ability to change the boot parameters of an environment that supports AVX512.

            marko Marko Mäkelä added a comment - I do not have the ability to change the boot parameters of an environment that supports AVX512.

            bijjupatel, MariaDB Server 10.11.9 (anything between 10.5.26 and 11.4.3) have finally been released. Can you confirm that this bug no longer occurs in your environment and a different CRC-32 implementation (such as Using crc32 + pclmulqdq instructions) will be used?

            marko Marko Mäkelä added a comment - bijjupatel , MariaDB Server 10.11.9 (anything between 10.5.26 and 11.4.3) have finally been released. Can you confirm that this bug no longer occurs in your environment and a different CRC-32 implementation (such as Using crc32 + pclmulqdq instructions ) will be used?

            I was finally able to test this myself using the xnosave Linux boot option. It turns out that the xgetbv instruction (for an attempt to check if the registers are being saved) would trigger SIGILL. I did not yet figure out a way to check from user space if the OSXSAVE flag in CR4 has been set. Well, we could read and parse /proc/cpuinfo (where the AVX512 features will not be advertised when the kernel was started with xnosave, but that would be Linux specific.

            marko Marko Mäkelä added a comment - I was finally able to test this myself using the xnosave Linux boot option. It turns out that the xgetbv instruction (for an attempt to check if the registers are being saved) would trigger SIGILL . I did not yet figure out a way to check from user space if the OSXSAVE flag in CR4 has been set. Well, we could read and parse /proc/cpuinfo (where the AVX512 features will not be advertised when the kernel was started with xnosave , but that would be Linux specific.
            wlad Vladislav Vaintroub added a comment - - edited

            would not copying everything literally from https://github.com/ispc/ispc/blob/v1.24.0/check_isa.cpp work? There are more tests that in your code, I think.

            wlad Vladislav Vaintroub added a comment - - edited would not copying everything literally from https://github.com/ispc/ispc/blob/v1.24.0/check_isa.cpp work? There are more tests that in your code, I think.

            Intel's code has

              if (osxsave && avx2 && avx512_f && __os_has_avx512_support())
            

            as the prerequisite to check other axv512 related stuff. OSXSAVE is checked , too.

            wlad Vladislav Vaintroub added a comment - Intel's code has if (osxsave && avx2 && avx512_f && __os_has_avx512_support()) as the prerequisite to check other axv512 related stuff. OSXSAVE is checked , too.

            wlad, thank you. I got confused with the forest of cpuid leaves and sub-leaves. I will try this one for the osxsave bit.

            marko Marko Mäkelä added a comment - wlad , thank you. I got confused with the forest of cpuid leaves and sub-leaves. I will try this one for the osxsave bit.

            I can confirm that MariaDB Server still crashes with SIGILL, this time on the xgetbv instruction (with the input ecx being 0).

            marko Marko Mäkelä added a comment - I can confirm that MariaDB Server still crashes with SIGILL, this time on the xgetbv instruction (with the input ecx being 0).

            I had gotten it almost right. The XSAVE bit was off by one, and while we are at it, we should also check for the AVX bit:

            diff --git a/mysys/crc32/crc32c_x86.cc b/mysys/crc32/crc32c_x86.cc
            index 3ddddf1303c..fb5dc19f7a5 100644
            --- a/mysys/crc32/crc32c_x86.cc
            +++ b/mysys/crc32/crc32c_x86.cc
            @@ -39,7 +39,7 @@ extern "C" unsigned crc32c_sse42(unsigned crc, const void* buf, size_t size);
             
             constexpr uint32_t cpuid_ecx_SSE42= 1U << 20;
             constexpr uint32_t cpuid_ecx_SSE42_AND_PCLMUL= cpuid_ecx_SSE42 | 1U << 1;
            -constexpr uint32_t cpuid_ecx_XSAVE= 1U << 26;
            +constexpr uint32_t cpuid_ecx_AVX_AND_XSAVE= 1U << 28 | 1U << 27;
             
             static uint32_t cpuid_ecx()
             {
            @@ -395,7 +395,7 @@ static bool os_have_avx512()
             
             static ATTRIBUTE_NOINLINE bool have_vpclmulqdq(uint32_t cpuid_ecx)
             {
            -  if (!(cpuid_ecx & cpuid_ecx_XSAVE) || !os_have_avx512())
            +  if ((~cpuid_ecx & cpuid_ecx_AVX_AND_XSAVE) || !os_have_avx512())
                 return false;
             # ifdef _MSC_VER
               int regs[4];
            

            With this, the test program unittest/mysys/crc32-t will report one of

            10.5 ae02999cdbea0ebb57cafc8c6b09878b8ea8a3be with patch

            1..36
            Using AVX512 instructions
            ok 1 - crc32(0,'')
            ok 2 - crc32(1,'')
            …
            

            or

            10.5 ae02999cdbea0ebb57cafc8c6b09878b8ea8a3be with patch

            1..36
            Using crc32 + pclmulqdq instructions
            ok 1 - crc32(0,'')
            ok 2 - crc32(1,'')
            …
            

            depending on whether Linux has been started with the noxsave option. Previously, I had no ability to test this by rebooting an affected system.

            marko Marko Mäkelä added a comment - I had gotten it almost right. The XSAVE bit was off by one, and while we are at it, we should also check for the AVX bit: diff --git a/mysys/crc32/crc32c_x86.cc b/mysys/crc32/crc32c_x86.cc index 3ddddf1303c..fb5dc19f7a5 100644 --- a/mysys/crc32/crc32c_x86.cc +++ b/mysys/crc32/crc32c_x86.cc @@ -39,7 +39,7 @@ extern "C" unsigned crc32c_sse42(unsigned crc, const void* buf, size_t size); constexpr uint32_t cpuid_ecx_SSE42= 1U << 20; constexpr uint32_t cpuid_ecx_SSE42_AND_PCLMUL= cpuid_ecx_SSE42 | 1U << 1; -constexpr uint32_t cpuid_ecx_XSAVE= 1U << 26; +constexpr uint32_t cpuid_ecx_AVX_AND_XSAVE= 1U << 28 | 1U << 27; static uint32_t cpuid_ecx() { @@ -395,7 +395,7 @@ static bool os_have_avx512() static ATTRIBUTE_NOINLINE bool have_vpclmulqdq(uint32_t cpuid_ecx) { - if (!(cpuid_ecx & cpuid_ecx_XSAVE) || !os_have_avx512()) + if ((~cpuid_ecx & cpuid_ecx_AVX_AND_XSAVE) || !os_have_avx512()) return false; # ifdef _MSC_VER int regs[4]; With this, the test program unittest/mysys/crc32-t will report one of 10.5 ae02999cdbea0ebb57cafc8c6b09878b8ea8a3be with patch 1..36 Using AVX512 instructions ok 1 - crc32(0,'') ok 2 - crc32(1,'') … or 10.5 ae02999cdbea0ebb57cafc8c6b09878b8ea8a3be with patch 1..36 Using crc32 + pclmulqdq instructions ok 1 - crc32(0,'') ok 2 - crc32(1,'') … depending on whether Linux has been started with the noxsave option. Previously, I had no ability to test this by rebooting an affected system.
            danblack Daniel Black added a comment -

            transactional_lock_enabled() has a SIGILL based pattern for testing CPU capabilties.

            danblack Daniel Black added a comment - transactional_lock_enabled() has a SIGILL based pattern for testing CPU capabilties.

            Right, on POWER we test a particular capability with the help of a SIGILL handler. On x86 it has been customary to rely on cpuid; on other ISA the getauxval() of the operating system covers many ISA extensions.

            marko Marko Mäkelä added a comment - Right, on POWER we test a particular capability with the help of a SIGILL handler. On x86 it has been customary to rely on cpuid ; on other ISA the getauxval() of the operating system covers many ISA extensions.

            People

              marko Marko Mäkelä
              bijjupatel Bijju Patel
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.