Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24745

Fallback CRC-32C computation wrongly uses SSE4.1 instructions

Details

    • Bug
    • Status: Closed (View Workflow)
    • Blocker
    • Resolution: Fixed
    • 10.5.7, 10.5.8, 10.5.9
    • 10.5.10
    • Slackware Linux 5.10.11-smp #1 SMP i686 Intel(R) Xeon(TM) CPU 2.80GHz GenuineIntel GNU/Linux

    Description

      Upgrade from glibc-2.30 to glibc-2.32 and package 10.5.8 compiled against 2.30 to 10.5.8 compiled against 2.32. Now get this error when trying to start mariadb server:

      2021-01-30 20:34:40 0 [Note] InnoDB: Using Linux native AIO
      2021-01-30 20:34:40 0 [Note] InnoDB: Uses event mutexes
      2021-01-30 20:34:40 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2021-01-30 20:34:40 0 [Note] InnoDB: Number of pools: 1
      2021-01-30 20:34:40 0 [Note] InnoDB: Using generic crc32 instructions
      2021-01-30 20:34:40 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
      2021-01-30 20:34:40 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
      2021-01-30 20:34:40 0 [Note] InnoDB: Completed initialization of buffer pool
      2021-01-30 20:34:40 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      210130 20:34:40 [ERROR] mysqld got signal 4 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.5.8-MariaDB-log
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=0
      max_threads=153
      thread_count=0
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 466473 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      ??:0(my_print_stacktrace)[0x12d69ae]
      ??:0(handle_fatal_signal)[0xc3b992]
      addr2line: 'linux-gate.so.1': No such file
      linux-gate.so.1(__kernel_sigreturn+0x0)[0xb7f87554]
      ??:0(my_dlerror)[0x12f1250]
      ??:0(std::unique_lock<std::mutex>::unlock())[0x1185e9c]
      ??:0(std::unique_lock<std::mutex>::unlock())[0x117b47f]
      ??:0(std::unique_lock<std::mutex>::unlock())[0x11fd753]
      ??:0(std::unique_lock<std::mutex>::unlock())[0x1201618]
      ??:0(std::unique_lock<std::mutex>::unlock())[0x1201c17]
      ??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x85df98]
      ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0xfeabdc]
      ??:0(ha_initialize_handlerton(st_plugin_int*))[0xc3eb2d]
      ??:0(sys_var_pluginvar::sys_var_pluginvar(sys_var_chain*, char const*, st_plugin_int*, st_mysql_sys_var*))[0x9e20cc]
      ??:0(plugin_init(int*, char**, int))[0x9e37da]
      ??:0(unireg_abort)[0x8df56c]
      ??:0(mysqld_main(int, char**))[0x8e5b35]
      ??:0(main)[0x8a2467]
      ??:0(__libc_start_main)[0xb760405a]
      ??:0(_start)[0x8d8811]
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /var/lib/mysql
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units
      Max cpu time              unlimited            unlimited            seconds
      Max file size             unlimited            unlimited            bytes
      Max data size             unlimited            unlimited            bytes
      Max stack size            8388608              unlimited            bytes
      Max core file size        0                    unlimited            bytes
      Max resident set          unlimited            unlimited            bytes
      Max processes             15973                15973                processes
      Max open files            32186                32186                files
      Max locked memory         65536                65536                bytes
      Max address space         unlimited            unlimited            bytes
      Max file locks            unlimited            unlimited            locks
      Max pending signals       15973                15973                signals
      Max msgqueue size         819200               819200               bytes
      Max nice priority         0                    0
      Max realtime priority     0                    0
      Max realtime timeout      unlimited            unlimited            us
      Core pattern: core
      

      Attachments

        1. compat_report.html
          354 kB
        2. cs215.mds-nh.org.err
          5 kB
        3. proc_cpuinfo
          2 kB
        4. screen-exchange
          18 kB

        Issue Links

          Activity

            This should have been broken in MariaDB Server 10.5.7 by MDEV-22749.

            marko Marko Mäkelä added a comment - This should have been broken in MariaDB Server 10.5.7 by MDEV-22749 .
            wlad Vladislav Vaintroub added a comment - - edited

            marko, could you please minimize the diff. It is foreign code, and there is no reason to fix the whitespace in it. Also, renaming files is not necessary, it does not even add to clarity. The code duplication instead of function template needs a good and large comment though.

            Alternatively the minimal fix could be just some GCC deoptimize pragma in ExtendImpl, such as

            #if defined (__GNUC__) && defined (__i386__)
            #define NEED_GCC_NO_SSE_WORKAROUND
            #endif
             
            #ifdef NEED_GCC_NO_SSE_WORKAROUND
            #pragma GCC push_options
            #pragma GCC target ("no-sse")
            #endif
             
            template<void (*CRC32)(uint64_t*, uint8_t const**)>
            uint32_t ExtendImpl(uint32_t crc, const char* buf, size_t size) {
            ....
            }
            #ifdef NEED_GCC_NO_SSE_WORKAROUND
            #pragma GCC pop_options
            #undef NEED_GCC_NO_SSE_WORKAROUND
            #endif
            

            This is not 100% proof against malice by GCC, but will work so far. You can also extend that to x86_64 if you think it could be necessary. In terms of lines of code, and clarity, that change would be hard to beat

            wlad Vladislav Vaintroub added a comment - - edited marko , could you please minimize the diff. It is foreign code, and there is no reason to fix the whitespace in it. Also, renaming files is not necessary, it does not even add to clarity. The code duplication instead of function template needs a good and large comment though. Alternatively the minimal fix could be just some GCC deoptimize pragma in ExtendImpl, such as #if defined (__GNUC__) && defined (__i386__) #define NEED_GCC_NO_SSE_WORKAROUND #endif   #ifdef NEED_GCC_NO_SSE_WORKAROUND #pragma GCC push_options #pragma GCC target ("no-sse") #endif   template<void (*CRC32)(uint64_t*, uint8_t const**)> uint32_t ExtendImpl(uint32_t crc, const char* buf, size_t size) { .... } #ifdef NEED_GCC_NO_SSE_WORKAROUND #pragma GCC pop_options #undef NEED_GCC_NO_SSE_WORKAROUND #endif This is not 100% proof against malice by GCC, but will work so far. You can also extend that to x86_64 if you think it could be necessary. In terms of lines of code, and clarity, that change would be hard to beat

            wlad, compiling generic code with target-specific flags is comparable to running with open scissors. I did try your suggested work-around, and it did not work reliably. Either the mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Slow_CRC32> would still contain SSE4.1 (or SSE4.2) instructions (without any warning being emitted by the compiler!), or mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Fast_CRC32> would no longer contain those instructions. We have past examples of that: undefined behavior gave the permission to the compiler to optimize away some code on some platforms, in MDEV-21977 (leading to wrong results) and MDEV-15587 (leading to SIGSEGV due to dereferencing a null pointer).

            There is another issue with isSSE42() and crc32_pclmul_enabled(). They were also being compiled with -msse4.2 -mpclmul, and theoretically some future compiler version could choose to make advantage of those instruction set extensions.

            I think that the cleanest solution is to avoid setting any target-specific flags for compiling crc32c.cc and to move the implementation of crc32_pclmul_enabled() check to that file.

            For the SSE4.2 but not PCLMUL accelerated CRC-32C, it is easiest to specify target attributes on the few functions that use SSE4.2 instructions. Because we support compiling MariaDB 10.5 with GCC 4.8.5, and because header files such as <nmmintrin.h> only work without file-level flags such as -msse4.2 starting with GCC 5, we need a work-around for the old GCC. To minimize the size of the work-around, it is easiest to move all -mpclmul code to a separate compilation unit.

            marko Marko Mäkelä added a comment - wlad , compiling generic code with target-specific flags is comparable to running with open scissors. I did try your suggested work-around, and it did not work reliably. Either the mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Slow_CRC32> would still contain SSE4.1 (or SSE4.2) instructions (without any warning being emitted by the compiler!), or mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Fast_CRC32> would no longer contain those instructions. We have past examples of that: undefined behavior gave the permission to the compiler to optimize away some code on some platforms, in MDEV-21977 (leading to wrong results) and MDEV-15587 (leading to SIGSEGV due to dereferencing a null pointer). There is another issue with isSSE42() and crc32_pclmul_enabled() . They were also being compiled with -msse4.2 -mpclmul , and theoretically some future compiler version could choose to make advantage of those instruction set extensions. I think that the cleanest solution is to avoid setting any target-specific flags for compiling crc32c.cc and to move the implementation of crc32_pclmul_enabled() check to that file. For the SSE4.2 but not PCLMUL accelerated CRC-32C, it is easiest to specify target attributes on the few functions that use SSE4.2 instructions. Because we support compiling MariaDB 10.5 with GCC 4.8.5, and because header files such as <nmmintrin.h> only work without file-level flags such as -msse4.2 starting with GCC 5, we need a work-around for the old GCC. To minimize the size of the work-around, it is easiest to move all -mpclmul code to a separate compilation unit.

            I got a mystery failure on Windows (only) when I converted mysys/crc32/crc32_x86.c from C to C++. mariabackup (but not the server) would crash somewhere in the pclmul accelerated my_checksum(). I tested that moving the -mpclmul dependent crc32c_3way() code to a new compilation unit mysys/crc32/crc32c_amd64.cc addressed that problem.

            While working on this, I noticed that mysys/crc32ieee.cc is not really needed on 64-bit POWER, because there is only one implementation of my_checksum() on that platform.

            marko Marko Mäkelä added a comment - I got a mystery failure on Windows (only) when I converted mysys/crc32/crc32_x86.c from C to C++. mariabackup (but not the server) would crash somewhere in the pclmul accelerated my_checksum() . I tested that moving the -mpclmul dependent crc32c_3way() code to a new compilation unit mysys/crc32/crc32c_amd64.cc addressed that problem. While working on this, I noticed that mysys/crc32ieee.cc is not really needed on 64-bit POWER, because there is only one implementation of my_checksum() on that platform.
            danblack Daniel Black added a comment -

            nobby6 confirms fix in MDEV-24922.

            danblack Daniel Black added a comment - nobby6 confirms fix in MDEV-24922 .

            People

              marko Marko Mäkelä
              cgw Charlie Wilder
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.