[MDEV-24745] Fallback CRC-32C computation wrongly uses SSE4.1 instructions
Created: 2021-01-31  Updated: 2021-04-19  Resolved: 2021-04-13

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5.7, 10.5.8, 10.5.9
Fix Version/s: 10.5.10

Type: Bug
Priority: Blocker
Reporter: Charlie Wilder
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: crash, regression
Environment: Slackware Linux 5.10.11-smp #1 SMP i686 Intel(R) Xeon(TM) CPU 2.80GHz GenuineIntel GNU/Linux

Attachments: compat_report.html, cs215.mds-nh.org.err, proc_cpuinfo, screen-exchange
Issue Links:
Duplicate
is duplicated by MDEV-24922 10.5.8 fails to run with GLIBC 2.32 a... Closed
Problem/Incident
is caused by MDEV-22749 Implement portable PCLMUL accelerated... Closed

 Description   

I upgraded from glibc-2.30 to glibc-2.32, and from the MariaDB 10.5.8 package compiled against glibc 2.30 to the 10.5.8 package compiled against glibc 2.32. I now get this error when trying to start the MariaDB server:

2021-01-30 20:34:40 0 [Note] InnoDB: Using Linux native AIO
2021-01-30 20:34:40 0 [Note] InnoDB: Uses event mutexes
2021-01-30 20:34:40 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-01-30 20:34:40 0 [Note] InnoDB: Number of pools: 1
2021-01-30 20:34:40 0 [Note] InnoDB: Using generic crc32 instructions
2021-01-30 20:34:40 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
2021-01-30 20:34:40 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
2021-01-30 20:34:40 0 [Note] InnoDB: Completed initialization of buffer pool
2021-01-30 20:34:40 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
210130 20:34:40 [ERROR] mysqld got signal 4 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.5.8-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 466473 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
??:0(my_print_stacktrace)[0x12d69ae]
??:0(handle_fatal_signal)[0xc3b992]
addr2line: 'linux-gate.so.1': No such file
linux-gate.so.1(__kernel_sigreturn+0x0)[0xb7f87554]
??:0(my_dlerror)[0x12f1250]
??:0(std::unique_lock<std::mutex>::unlock())[0x1185e9c]
??:0(std::unique_lock<std::mutex>::unlock())[0x117b47f]
??:0(std::unique_lock<std::mutex>::unlock())[0x11fd753]
??:0(std::unique_lock<std::mutex>::unlock())[0x1201618]
??:0(std::unique_lock<std::mutex>::unlock())[0x1201c17]
??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x85df98]
??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0xfeabdc]
??:0(ha_initialize_handlerton(st_plugin_int*))[0xc3eb2d]
??:0(sys_var_pluginvar::sys_var_pluginvar(sys_var_chain*, char const*, st_plugin_int*, st_mysql_sys_var*))[0x9e20cc]
??:0(plugin_init(int*, char**, int))[0x9e37da]
??:0(unireg_abort)[0x8df56c]
??:0(mysqld_main(int, char**))[0x8e5b35]
??:0(main)[0x8a2467]
??:0(__libc_start_main)[0xb760405a]
??:0(_start)[0x8d8811]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             15973                15973                processes
Max open files            32186                32186                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15973                15973                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: core



 Comments   
Comment by Sergei Golubchik [ 2021-01-31 ]

This stack doesn't make sense: these functions and methods don't call each other in that order. Still, it looks only wrong, not completely absurd, so it is not as if the stack was overwritten by garbage.

The likely cause of such an effect is a broken build. If you built MariaDB 10.5.8 for glibc-2.30 and later rebuilt it for 2.32 in the same build directory, and make didn't handle the change properly and didn't rebuild some of the files, this is exactly the kind of stack you would get from such a broken binary.

Try a clean build of 10.5.8 in a new, empty build directory.

The second possible (but much less likely) reason is that the stack is correct, but the binary was optimized so heavily that the stack is no longer recognizable. If you could build 10.5.8 without optimization (-g3 -O0 or -DCMAKE_BUILD_TYPE=Debug), it would show a readable stack that could help pinpoint the problem.

Comment by Charlie Wilder [ 2021-01-31 ]

The problem started when I installed updated packages from the distro site. This morning, I compiled on my system with the package buildfile and got what appeared to be the same error (to my untrained eye). While it was compiling, I got your message about -g3 -O0, and I have since recompiled and reinstalled. It looks to be the same error. All other software on this server appears to be running properly, so I do not believe there is memory corruption. Unfortunately, it takes about 2 hours to compile...

Here is the latest .err info:

{{2021-01-31 14:15:37 0 [Note] InnoDB: Using Linux native AIO
2021-01-31 14:15:37 0 [Note] InnoDB: Uses event mutexes
2021-01-31 14:15:37 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-01-31 14:15:37 0 [Note] InnoDB: Number of pools: 1
2021-01-31 14:15:37 0 [Note] InnoDB: Using generic crc32 instructions
2021-01-31 14:15:37 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
2021-01-31 14:15:37 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
2021-01-31 14:15:37 0 [Note] InnoDB: Completed initialization of buffer pool
2021-01-31 14:15:37 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
210131 14:15:37 [ERROR] mysqld got signal 4 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.5.8-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 466473 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
??:0(my_print_stacktrace)[0x12db9ae]
??:0(handle_fatal_signal)[0xc40992]
addr2line: 'linux-gate.so.1': No such file
linux-gate.so.1(__kernel_sigreturn+0x0)[0xb7f82554]
??:0(my_dlerror)[0x12f6250]
??:0(std::unique_lock<std::mutex>::unlock())[0x118ae9c]
??:0(std::unique_lock<std::mutex>::unlock())[0x118047f]
??:0(std::unique_lock<std::mutex>::unlock())[0x1202753]
??:0(std::unique_lock<std::mutex>::unlock())[0x1206618]
??:0(std::unique_lock<std::mutex>::unlock())[0x1206c17]
??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x862f98]
??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0xfefbdc]
??:0(ha_initialize_handlerton(st_plugin_int*))[0xc43b2d]
??:0(sys_var_pluginvar::sys_var_pluginvar(sys_var_chain*, char const*, st_plugin_int*, st_mysql_sys_var*))[0x9e70cc]
??:0(plugin_init(int*, char**, int))[0x9e87da]
??:0(unireg_abort)[0x8e456c]
??:0(mysqld_main(int, char**))[0x8eab35]
??:0(main)[0x8a7467]
??:0(__libc_start_main)[0xb75ff05a]
??:0(_start)[0x8dd811]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             15973                15973                processes
Max open files            32186                32186                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15973                15973                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
}}

Comment by Charlie Wilder [ 2021-01-31 ]

I have found the previous distro package (mariadb-10.5.8-i586-1.txz), installed it, and all seems well. My databases check out and seem to be up to date with no errors. Because the system has been down for so long, I won't do my next test until a weekend or when a new distro package is released. Weird that it works with glibc-2.30 and not with glibc-2.32, "all else" being equal...

Comment by Marko Mäkelä [ 2021-02-19 ]

cgw, the built-in stack trace reporter often produces inaccurate stack traces. Would it be possible to start the broken server under a debugger, to get more meaningful stack traces?

man 7 signal suggests to me that signal 4 is SIGILL, that is, illegal instruction. This would make it even more important to run the server in a debugger, so that we can find the problem. Some code might fail to detect the processor correctly before using an instruction that is not part of the original AMD64 ISA. How does cat /proc/cpuinfo identify the processor?
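
The two checks above (decoding the signal number and reading the processor flags) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sample flags line is made up for the example and is not the reporter's /proc/cpuinfo, and the flag names sse4_1, sse4_2 and pclmulqdq are the spellings that Linux uses in /proc/cpuinfo.

```python
import signal


def signal_name(signum):
    """Return the symbolic name of a signal number on this platform."""
    return signal.Signals(signum).name


def missing_cpu_flags(cpuinfo_text, required):
    """Return the flags from `required` that the first 'flags' line lacks."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in required if flag not in present]
    # No 'flags' line found: treat every required flag as missing.
    return list(required)


# Hypothetical excerpt in /proc/cpuinfo format (NOT taken from the attachment).
SAMPLE_CPUINFO = """\
model name\t: Example 32-bit CPU
flags\t\t: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht pni
"""

if __name__ == "__main__":
    print(signal_name(4))
    print(missing_cpu_flags(SAMPLE_CPUINFO, ["sse2", "sse4_1", "sse4_2", "pclmulqdq"]))
```

On a real system, the text passed to missing_cpu_flags() would be the contents of /proc/cpuinfo. Any flag reported as missing marks an instruction-set extension that the binary must not use unconditionally.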

If it is not feasible to start the mariadbd executable under GDB, given that you are able to compile yourself, you could add a delay right before or after the output of the message Completed initialization of buffer pool, such as os_thread_sleep(10000000); (10 seconds) that would allow you to execute gdb -p $(pgrep -x mariadbd) while the server is starting up.

In GDB, I would be interested in the backtrace of the crashing thread (I guess there are not many other interesting threads at this point), as well as the output of the disassemble command for the stack frame that triggered the signal.

Comment by Charlie Wilder [ 2021-02-19 ]

GDB is out of my depth, and I have had to do recent testing on a different system so that the production system remains available. The test system is older, lesser hardware, but it produces the same error.

proc_cpuinfo screen-exchange cs215.mds-nh.org.err

Comment by Marko Mäkelä [ 2021-02-24 ]

cgw, thank you. In screen-exchange you disassembled the call that is in mariadbd, but not the function that contains the stack frame:

   0x00b629ab <+107>:   mov    %esi,0x24(%eax)
   0x00b629ae <+110>:   mov    0x8(%eax),%eax
   0x00b629b1 <+113>:   mov    0x18(%eax),%eax
   0x00b629b6 <+118>:   je     0xb629c5 <_Z24ha_initialize_handlertonP13st_plugin_int+133>
   0x00b629b8 <+120>:   mov    %esi,(%esp)
   0x00b629bb <+123>:   call   *%eax
   0x00b629bd <+125>:   test   %eax,%eax

It seems to correspond to the following source code line in int ha_initialize_handlerton(st_plugin_int *plugin):

  if (plugin->plugin->init && plugin->plugin->init(hton))

This is probably calling innodb_init(). The called code is likely calling something in GNU libc and hitting the illegal instruction there.

I think that if you installed the debug symbols for GNU libc, then it should be possible to disassemble the interesting stack frame. On my Debian system, the package to install would be libc6-dbg. On a quick search, I did not find what it could be on Slackware. Possibly it is glibc-devel, but that one might also just correspond to Debian’s libc-dev (header files only).

There might also be a trick to force GDB to disassemble code even if the start and the end of the function are not known, but I do not know it. We really need the disassembly of the deepest stack frame (and must be able to identify that function).

Comment by Daniel Black [ 2021-02-24 ]

Note the address that failed:

Thread 1 "mariadbd" received signal SIGILL, Illegal instruction.
0x01216f30 in ?? ()

If you disassemble this location, you'll probably see an instruction that a Pentium 4 does not support.

Running "cat /proc/$(pidof mariadbd)/maps", even while the debugger is attached, will show you the mapped address ranges, from which you can identify the shared object that contains the illegal instruction.
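
The address lookup can also be done mechanically. The following Python sketch is an illustration with hypothetical function names; the three sample lines are copied from the /proc/13594/maps listing that the reporter posted later in this issue, and the address is the one GDB reported for the SIGILL.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int      # exclusive upper bound, as in /proc/<pid>/maps
    perms: str
    path: str     # empty for anonymous mappings


def parse_maps(text: str) -> List[Mapping]:
    """Parse the text of /proc/<pid>/maps into a list of mappings."""
    mappings = []
    for line in text.splitlines():
        # Fields: range, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(int(start_s, 16), int(end_s, 16), fields[1], path))
    return mappings


def find_mapping(mappings: List[Mapping], address: int) -> Optional[Mapping]:
    """Return the mapping that contains `address`, or None."""
    for m in mappings:
        if m.start <= address < m.end:
            return m
    return None


SAMPLE_MAPS = """\
00400000-00734000 r--p 00000000 08:01 3621348    /usr/libexec/mariadbd
00734000-012ac000 r-xp 00334000 08:01 3621348    /usr/libexec/mariadbd
b7652000-b77a9000 r-xp 0001a000 08:01 4341427    /lib/libc-2.33.so
"""

if __name__ == "__main__":
    hit = find_mapping(parse_maps(SAMPLE_MAPS), 0x01216F30)
    print(hit.path if hit else "address not mapped")
```

With these sample lines, the faulting address 0x01216f30 falls inside the executable (r-xp) segment of /usr/libexec/mariadbd rather than inside libc, which suggests the offending instruction is in the server binary itself.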

As Marko indicated, the libc debuginfo packages (usually separate from the devel packages) will help if they are installed before gdb is run. According to an unverified web post, https://www.linuxquestions.org/questions/linux-newbie-8/how-to-install-debuginfo-for-gdb-651865/, debuginfo-install glibc is needed.

Comment by Charlie Wilder [ 2021-02-25 ]

Here's a new stab at it. Thank you for the pointers!

The system has evolved somewhat: it is now on MariaDB 10.5.9 and glibc-2.33.

# rm -v /var/lib/mysql/* ; gdb -ex=r --args /usr/libexec/mariadbd --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --skip-networking --log-error=/var/lib/mysql/cs215.mds-nh.org.err --pid-file=/var/run/mysql/mysql.pid
removed '/var/lib/mysql/aria_log.00000001'
removed '/var/lib/mysql/aria_log_control'
removed '/var/lib/mysql/cs215.mds-nh.org.err'
removed '/var/lib/mysql/ib_logfile101'
removed '/var/lib/mysql/ibdata1'
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i586-slackware-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
 
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/mariadbd...
(No debugging symbols found in /usr/libexec/mariadbd)
Starting program: /usr/libexec/mariadbd --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --skip-networking --log-error=/var/lib/mysql/cs215.mds-nh.org.err --pid-file=/var/run/mysql/mysql.pid
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
2021-02-25  7:22:35 0 [Note] /usr/libexec/mariadbd (mysqld 10.5.9-MariaDB) starting as process 13594 ...
[New Thread 0xb7fc2b00 (LWP 13598)]
[New Thread 0xa7110b00 (LWP 13599)]
[New Thread 0xa63feb00 (LWP 13600)]
[New Thread 0x9b9fdb00 (LWP 13601)]
 
Thread 1 "mariadbd" received signal SIGILL, Illegal instruction.
0x01216f30 in ?? ()
(gdb) backtrace 
 
#0  0x01216f30 in ?? ()
#1  0x00f6c21a in ?? ()
#2  0x0077e1bf in ?? ()
#3  0x0077f151 in ?? ()
#4  0x00f1158c in ?? ()
#5  0x00b629bd in ha_initialize_handlerton(st_plugin_int*) ()
#6  0x0090446c in ?? ()
#7  0x00905b8a in plugin_init(int*, char**, int) ()
#8  0x0080101c in ?? ()
#9  0x00807665 in mysqld_main(int, char**) ()
#10 0x007c41e7 in main ()
(gdb) disassemble 0x01216f30,0x01216f30, +60
 
Dump of assembler code from 0x1216f30 to 0x1216f6c:
=> 0x01216f30:   pinsrd $0x1,-0x1c(%ebp),%xmm0
   0x01216f37:   jb     0x1216f89
   0x01216f39:   lea    -0x4cbfe4(%edi),%ebx
   0x01216f3f:   cmp    %esi,%eax
   0x01216f41:   mov    %ebx,%edi
   0x01216f43:   je     0x1216f89
   0x01216f45:   lea    0x0(%esi,%eiz,1),%esi
   0x01216f4c:   lea    0x0(%esi,%eiz,1),%esi
   0x01216f50:   movzbl -0x20(%ebp),%edx
   0x01216f54:   inc    %eax
   0x01216f55:   xor    %ebx,%ebx
   0x01216f57:   movzbl -0x1(%eax),%ecx
   0x01216f5b:   psrlq  $0x8,%xmm0
   0x01216f60:   xor    %cl,%dl
   0x01216f62:   cmp    %esi,%eax
   0x01216f64:   movzbl %dl,%edx
   0x01216f67:   mov    (%edi,%edx,4),%ecx
   0x01216f6a:   movd   %ecx,%xmm1
End of assembler dump.
(gdb) shell
 
# ps -C mariadbd
  PID TTY          TIME CMD
13594 pts/3    00:00:00 mariadbd
 
# cat /proc/13594/maps 
00400000-00734000 r--p 00000000 08:01 3621348    /usr/libexec/mariadbd
00734000-012ac000 r-xp 00334000 08:01 3621348    /usr/libexec/mariadbd
012ac000-018c1000 r--p 00eac000 08:01 3621348    /usr/libexec/mariadbd
018c1000-01962000 r--p 014c0000 08:01 3621348    /usr/libexec/mariadbd
01962000-019e5000 rw-p 01561000 08:01 3621348    /usr/libexec/mariadbd
019e5000-0266d000 rw-p 00000000 00:00 0          [heap]
9b000000-9b021000 rw-p 00000000 00:00 0 
9b021000-9b100000 ---p 00000000 00:00 0 
9b1fd000-9b1fe000 ---p 00000000 00:00 0 
9b1fe000-9b9fe000 rw-p 00000000 00:00 0 
9bbfe000-a5bfe000 rw-p 00000000 00:00 0 
a5bfe000-a5bff000 ---p 00000000 00:00 0 
a5bff000-a6800000 rw-p 00000000 00:00 0 
a6800000-a6821000 rw-p 00000000 00:00 0 
a6821000-a6900000 ---p 00000000 00:00 0 
a6910000-a6911000 ---p 00000000 00:00 0 
a6911000-b7300000 rw-p 00000000 00:00 0 
b7300000-b7321000 rw-p 00000000 00:00 0 
b7321000-b7400000 ---p 00000000 00:00 0 
b7469000-b748a000 rw-s 00000000 00:12 1565295    /[aio] (deleted)
b748a000-b75c9000 rw-p 00000000 00:00 0 
b75c9000-b75ca000 r--p 00000000 08:01 4341433    /lib/libnss_compat-2.33.so
b75ca000-b75d1000 r-xp 00001000 08:01 4341433    /lib/libnss_compat-2.33.so
b75d1000-b75d2000 r--p 00008000 08:01 4341433    /lib/libnss_compat-2.33.so
b75d2000-b75d3000 r--p 00008000 08:01 4341433    /lib/libnss_compat-2.33.so
b75d3000-b75d4000 rw-p 00009000 08:01 4341433    /lib/libnss_compat-2.33.so
b75d4000-b7617000 rw-p 00000000 00:00 0 
b7617000-b761a000 r--p 00000000 08:01 3559409    /usr/lib/libgcc_s.so.1
b761a000-b7632000 r-xp 00003000 08:01 3559409    /usr/lib/libgcc_s.so.1
b7632000-b7636000 r--p 0001b000 08:01 3559409    /usr/lib/libgcc_s.so.1
b7636000-b7637000 r--p 0001e000 08:01 3559409    /usr/lib/libgcc_s.so.1
b7637000-b7638000 rw-p 0001f000 08:01 3559409    /usr/lib/libgcc_s.so.1
b7638000-b7652000 r--p 00000000 08:01 4341427    /lib/libc-2.33.so
b7652000-b77a9000 r-xp 0001a000 08:01 4341427    /lib/libc-2.33.so
b77a9000-b77fa000 r--p 00171000 08:01 4341427    /lib/libc-2.33.so
b77fa000-b77fc000 r--p 001c1000 08:01 4341427    /lib/libc-2.33.so
b77fc000-b77fd000 rw-p 001c3000 08:01 4341427    /lib/libc-2.33.so
b77fd000-b7805000 rw-p 00000000 00:00 0 
b7805000-b780f000 r--p 00000000 08:01 4341430    /lib/libm-2.33.so
b780f000-b7918000 r-xp 0000a000 08:01 4341430    /lib/libm-2.33.so
b7918000-b7947000 r--p 00113000 08:01 4341430    /lib/libm-2.33.so
b7947000-b7948000 r--p 00141000 08:01 4341430    /lib/libm-2.33.so
b7948000-b7949000 rw-p 00142000 08:01 4341430    /lib/libm-2.33.so
b7949000-b79c6000 r--p 00000000 08:01 3552908    /usr/lib/libstdc++.so.6.0.28
b79c6000-b7acd000 r-xp 0007d000 08:01 3552908    /usr/lib/libstdc++.so.6.0.28
b7acd000-b7b15000 r--p 00184000 08:01 3552908    /usr/lib/libstdc++.so.6.0.28
b7b15000-b7b1b000 r--p 001cb000 08:01 3552908    /usr/lib/libstdc++.so.6.0.28
b7b1b000-b7b1d000 rw-p 001d1000 08:01 3552908    /usr/lib/libstdc++.so.6.0.28
b7b1d000-b7b21000 rw-p 00000000 00:00 0 
b7b21000-b7b22000 r--p 00000000 08:01 4341429    /lib/libdl-2.33.so
b7b22000-b7b24000 r-xp 00001000 08:01 4341429    /lib/libdl-2.33.so
b7b24000-b7b25000 r--p 00003000 08:01 4341429    /lib/libdl-2.33.so
b7b25000-b7b26000 r--p 00003000 08:01 4341429    /lib/libdl-2.33.so
b7b26000-b7b27000 rw-p 00004000 08:01 4341429    /lib/libdl-2.33.so
b7b27000-b7b76000 r--p 00000000 08:01 4348920    /lib/libcrypto.so.1.1
b7b76000-b7cfe000 r-xp 0004f000 08:01 4348920    /lib/libcrypto.so.1.1
b7cfe000-b7ddb000 r--p 001d7000 08:01 4348920    /lib/libcrypto.so.1.1
b7ddb000-b7df3000 r--p 002b3000 08:01 4348920    /lib/libcrypto.so.1.1
b7df3000-b7df5000 rw-p 002cb000 08:01 4348920    /lib/libcrypto.so.1.1
b7df5000-b7df8000 rw-p 00000000 00:00 0 
b7df8000-b7e09000 r--p 00000000 08:01 4348921    /lib/libssl.so.1.1
b7e09000-b7e5b000 r-xp 00011000 08:01 4348921    /lib/libssl.so.1.1
b7e5b000-b7e8c000 r--p 00063000 08:01 4348921    /lib/libssl.so.1.1
b7e8c000-b7e8d000 ---p 00094000 08:01 4348921    /lib/libssl.so.1.1
b7e8d000-b7e92000 r--p 00094000 08:01 4348921    /lib/libssl.so.1.1
b7e92000-b7e96000 rw-p 00099000 08:01 4348921    /lib/libssl.so.1.1
b7e96000-b7e98000 r--p 00000000 08:01 4349451    /lib/libz.so.1.2.11
b7e98000-b7ea8000 r-xp 00002000 08:01 4349451    /lib/libz.so.1.2.11
b7ea8000-b7eaf000 r--p 00012000 08:01 4349451    /lib/libz.so.1.2.11
b7eaf000-b7eb0000 r--p 00018000 08:01 4349451    /lib/libz.so.1.2.11
b7eb0000-b7eb1000 rw-p 00019000 08:01 4349451    /lib/libz.so.1.2.11
b7eb1000-b7eb2000 r--p 00000000 08:01 4325472    /lib/libaio.so.1.0.1
b7eb2000-b7eb3000 r-xp 00001000 08:01 4325472    /lib/libaio.so.1.0.1
b7eb3000-b7eb4000 r--p 00002000 08:01 4325472    /lib/libaio.so.1.0.1
b7eb4000-b7eb5000 r--p 00002000 08:01 4325472    /lib/libaio.so.1.0.1
b7eb5000-b7eb6000 rw-p 00003000 08:01 4325472    /lib/libaio.so.1.0.1
b7eb6000-b7eb7000 r--p 00000000 08:01 4328622    /lib/libbz2.so.1.0.8
b7eb7000-b7ec5000 r-xp 00001000 08:01 4328622    /lib/libbz2.so.1.0.8
b7ec5000-b7ec7000 r--p 0000f000 08:01 4328622    /lib/libbz2.so.1.0.8
b7ec7000-b7ec8000 ---p 00011000 08:01 4328622    /lib/libbz2.so.1.0.8
b7ec8000-b7ec9000 r--p 00011000 08:01 4328622    /lib/libbz2.so.1.0.8
b7ec9000-b7eca000 rw-p 00012000 08:01 4328622    /lib/libbz2.so.1.0.8
b7eca000-b7ecd000 r--p 00000000 08:01 4349447    /lib/liblzma.so.5.2.5
b7ecd000-b7ee9000 r-xp 00003000 08:01 4349447    /lib/liblzma.so.5.2.5
b7ee9000-b7ef4000 r--p 0001f000 08:01 4349447    /lib/liblzma.so.5.2.5
b7ef4000-b7ef5000 r--p 00029000 08:01 4349447    /lib/liblzma.so.5.2.5
b7ef5000-b7ef6000 rw-p 0002a000 08:01 4349447    /lib/liblzma.so.5.2.5
b7ef6000-b7ef8000 r--p 00000000 08:01 3553795    /usr/lib/liblzo2.so.2.0.0
b7ef8000-b7f16000 r-xp 00002000 08:01 3553795    /usr/lib/liblzo2.so.2.0.0
b7f16000-b7f19000 r--p 00020000 08:01 3553795    /usr/lib/liblzo2.so.2.0.0
b7f19000-b7f1a000 r--p 00022000 08:01 3553795    /usr/lib/liblzo2.so.2.0.0
b7f1a000-b7f1b000 rw-p 00023000 08:01 3553795    /usr/lib/liblzo2.so.2.0.0
b7f1b000-b7f1d000 r--p 00000000 08:01 3608360    /usr/lib/liblz4.so.1.9.3
b7f1d000-b7f3c000 r-xp 00002000 08:01 3608360    /usr/lib/liblz4.so.1.9.3
b7f3c000-b7f40000 r--p 00021000 08:01 3608360    /usr/lib/liblz4.so.1.9.3
b7f40000-b7f41000 r--p 00024000 08:01 3608360    /usr/lib/liblz4.so.1.9.3
b7f41000-b7f42000 rw-p 00025000 08:01 3608360    /usr/lib/liblz4.so.1.9.3
b7f42000-b7f43000 r--p 00000000 08:01 4341428    /lib/libcrypt-2.33.so
b7f43000-b7f4b000 r-xp 00001000 08:01 4341428    /lib/libcrypt-2.33.so
b7f4b000-b7f4d000 r--p 00009000 08:01 4341428    /lib/libcrypt-2.33.so
b7f4d000-b7f4e000 r--p 0000a000 08:01 4341428    /lib/libcrypt-2.33.so
b7f4e000-b7f4f000 rw-p 0000b000 08:01 4341428    /lib/libcrypt-2.33.so
b7f4f000-b7f76000 rw-p 00000000 00:00 0 
b7f76000-b7f7b000 r--p 00000000 08:01 4341438    /lib/libpthread-2.33.so
b7f7b000-b7f8d000 r-xp 00005000 08:01 4341438    /lib/libpthread-2.33.so
b7f8d000-b7f93000 r--p 00017000 08:01 4341438    /lib/libpthread-2.33.so
b7f93000-b7f94000 r--p 0001c000 08:01 4341438    /lib/libpthread-2.33.so
b7f94000-b7f95000 rw-p 0001d000 08:01 4341438    /lib/libpthread-2.33.so
b7f95000-b7f97000 rw-p 00000000 00:00 0 
b7fb2000-b7fb3000 ---p 00000000 00:00 0 
b7fb3000-b7fc5000 rw-p 00000000 00:00 0 
b7fc5000-b7fc9000 r--p 00000000 00:00 0          [vvar]
b7fc9000-b7fcb000 r-xp 00000000 00:00 0          [vdso]
b7fcb000-b7fcc000 r--p 00000000 08:01 4328896    /lib/ld-2.33.so
b7fcc000-b7ff3000 r-xp 00001000 08:01 4328896    /lib/ld-2.33.so
b7ff3000-b7ffd000 r--p 00028000 08:01 4328896    /lib/ld-2.33.so
b7ffd000-b7fff000 r--p 00031000 08:01 4328896    /lib/ld-2.33.so
b7fff000-b8000000 rw-p 00033000 08:01 4328896    /lib/ld-2.33.so
bffdf000-c0000000 rw-p 00000000 00:00 0          [stack]
 
# ls -l  /proc/13594/map_files/
total 0
lr-------- 1 root root 64 Feb 25 07:23 12ac000-18c1000 -> /usr/libexec/mariadbd*
lr-------- 1 root root 64 Feb 25 07:23 18c1000-1962000 -> /usr/libexec/mariadbd*
lr-------- 1 root root 64 Feb 25 07:23 1962000-19e5000 -> /usr/libexec/mariadbd*
lr-------- 1 root root 64 Feb 25 07:23 400000-734000 -> /usr/libexec/mariadbd*
lr-------- 1 root root 64 Feb 25 07:23 734000-12ac000 -> /usr/libexec/mariadbd*
lrw------- 1 root root 64 Feb 25 07:23 b7469000-b748a000 -> /[aio]\ (deleted)
lr-------- 1 root root 64 Feb 25 07:23 b75c9000-b75ca000 -> /lib/libnss_compat-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b75ca000-b75d1000 -> /lib/libnss_compat-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b75d1000-b75d2000 -> /lib/libnss_compat-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b75d2000-b75d3000 -> /lib/libnss_compat-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b75d3000-b75d4000 -> /lib/libnss_compat-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7617000-b761a000 -> /usr/lib/libgcc_s.so.1*
lr-------- 1 root root 64 Feb 25 07:23 b761a000-b7632000 -> /usr/lib/libgcc_s.so.1*
lr-------- 1 root root 64 Feb 25 07:23 b7632000-b7636000 -> /usr/lib/libgcc_s.so.1*
lr-------- 1 root root 64 Feb 25 07:23 b7636000-b7637000 -> /usr/lib/libgcc_s.so.1*
lr-------- 1 root root 64 Feb 25 07:23 b7637000-b7638000 -> /usr/lib/libgcc_s.so.1*
lr-------- 1 root root 64 Feb 25 07:23 b7638000-b7652000 -> /lib/libc-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7652000-b77a9000 -> /lib/libc-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b77a9000-b77fa000 -> /lib/libc-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b77fa000-b77fc000 -> /lib/libc-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b77fc000-b77fd000 -> /lib/libc-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7805000-b780f000 -> /lib/libm-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b780f000-b7918000 -> /lib/libm-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7918000-b7947000 -> /lib/libm-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7947000-b7948000 -> /lib/libm-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7948000-b7949000 -> /lib/libm-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7949000-b79c6000 -> /usr/lib/libstdc++.so.6.0.28*
lr-------- 1 root root 64 Feb 25 07:23 b79c6000-b7acd000 -> /usr/lib/libstdc++.so.6.0.28*
lr-------- 1 root root 64 Feb 25 07:23 b7acd000-b7b15000 -> /usr/lib/libstdc++.so.6.0.28*
lr-------- 1 root root 64 Feb 25 07:23 b7b15000-b7b1b000 -> /usr/lib/libstdc++.so.6.0.28*
lr-------- 1 root root 64 Feb 25 07:23 b7b1b000-b7b1d000 -> /usr/lib/libstdc++.so.6.0.28*
lr-------- 1 root root 64 Feb 25 07:23 b7b21000-b7b22000 -> /lib/libdl-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7b22000-b7b24000 -> /lib/libdl-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7b24000-b7b25000 -> /lib/libdl-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7b25000-b7b26000 -> /lib/libdl-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7b26000-b7b27000 -> /lib/libdl-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7b27000-b7b76000 -> /lib/libcrypto.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7b76000-b7cfe000 -> /lib/libcrypto.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7cfe000-b7ddb000 -> /lib/libcrypto.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7ddb000-b7df3000 -> /lib/libcrypto.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7df3000-b7df5000 -> /lib/libcrypto.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7df8000-b7e09000 -> /lib/libssl.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7e09000-b7e5b000 -> /lib/libssl.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7e5b000-b7e8c000 -> /lib/libssl.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7e8c000-b7e8d000 -> /lib/libssl.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7e8d000-b7e92000 -> /lib/libssl.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7e92000-b7e96000 -> /lib/libssl.so.1.1*
lr-------- 1 root root 64 Feb 25 07:23 b7e96000-b7e98000 -> /lib/libz.so.1.2.11*
lr-------- 1 root root 64 Feb 25 07:23 b7e98000-b7ea8000 -> /lib/libz.so.1.2.11*
lr-------- 1 root root 64 Feb 25 07:23 b7ea8000-b7eaf000 -> /lib/libz.so.1.2.11*
lr-------- 1 root root 64 Feb 25 07:23 b7eaf000-b7eb0000 -> /lib/libz.so.1.2.11*
lr-------- 1 root root 64 Feb 25 07:23 b7eb0000-b7eb1000 -> /lib/libz.so.1.2.11*
lr-------- 1 root root 64 Feb 25 07:23 b7eb1000-b7eb2000 -> /lib/libaio.so.1.0.1*
lr-------- 1 root root 64 Feb 25 07:23 b7eb2000-b7eb3000 -> /lib/libaio.so.1.0.1*
lr-------- 1 root root 64 Feb 25 07:23 b7eb3000-b7eb4000 -> /lib/libaio.so.1.0.1*
lr-------- 1 root root 64 Feb 25 07:23 b7eb4000-b7eb5000 -> /lib/libaio.so.1.0.1*
lr-------- 1 root root 64 Feb 25 07:23 b7eb5000-b7eb6000 -> /lib/libaio.so.1.0.1*
lr-------- 1 root root 64 Feb 25 07:23 b7eb6000-b7eb7000 -> /lib/libbz2.so.1.0.8*
lr-------- 1 root root 64 Feb 25 07:23 b7eb7000-b7ec5000 -> /lib/libbz2.so.1.0.8*
lr-------- 1 root root 64 Feb 25 07:23 b7ec5000-b7ec7000 -> /lib/libbz2.so.1.0.8*
lr-------- 1 root root 64 Feb 25 07:23 b7ec7000-b7ec8000 -> /lib/libbz2.so.1.0.8*
lr-------- 1 root root 64 Feb 25 07:23 b7ec8000-b7ec9000 -> /lib/libbz2.so.1.0.8*
lr-------- 1 root root 64 Feb 25 07:23 b7ec9000-b7eca000 -> /lib/libbz2.so.1.0.8*
lr-------- 1 root root 64 Feb 25 07:23 b7eca000-b7ecd000 -> /lib/liblzma.so.5.2.5*
lr-------- 1 root root 64 Feb 25 07:23 b7ecd000-b7ee9000 -> /lib/liblzma.so.5.2.5*
lr-------- 1 root root 64 Feb 25 07:23 b7ee9000-b7ef4000 -> /lib/liblzma.so.5.2.5*
lr-------- 1 root root 64 Feb 25 07:23 b7ef4000-b7ef5000 -> /lib/liblzma.so.5.2.5*
lr-------- 1 root root 64 Feb 25 07:23 b7ef5000-b7ef6000 -> /lib/liblzma.so.5.2.5*
lr-------- 1 root root 64 Feb 25 07:23 b7ef6000-b7ef8000 -> /usr/lib/liblzo2.so.2.0.0*
lr-------- 1 root root 64 Feb 25 07:23 b7ef8000-b7f16000 -> /usr/lib/liblzo2.so.2.0.0*
lr-------- 1 root root 64 Feb 25 07:23 b7f16000-b7f19000 -> /usr/lib/liblzo2.so.2.0.0*
lr-------- 1 root root 64 Feb 25 07:23 b7f19000-b7f1a000 -> /usr/lib/liblzo2.so.2.0.0*
lr-------- 1 root root 64 Feb 25 07:23 b7f1a000-b7f1b000 -> /usr/lib/liblzo2.so.2.0.0*
lr-------- 1 root root 64 Feb 25 07:23 b7f1b000-b7f1d000 -> /usr/lib/liblz4.so.1.9.3*
lr-------- 1 root root 64 Feb 25 07:23 b7f1d000-b7f3c000 -> /usr/lib/liblz4.so.1.9.3*
lr-------- 1 root root 64 Feb 25 07:23 b7f3c000-b7f40000 -> /usr/lib/liblz4.so.1.9.3*
lr-------- 1 root root 64 Feb 25 07:23 b7f40000-b7f41000 -> /usr/lib/liblz4.so.1.9.3*
lr-------- 1 root root 64 Feb 25 07:23 b7f41000-b7f42000 -> /usr/lib/liblz4.so.1.9.3*
lr-------- 1 root root 64 Feb 25 07:23 b7f42000-b7f43000 -> /lib/libcrypt-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f43000-b7f4b000 -> /lib/libcrypt-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f4b000-b7f4d000 -> /lib/libcrypt-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f4d000-b7f4e000 -> /lib/libcrypt-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f4e000-b7f4f000 -> /lib/libcrypt-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f76000-b7f7b000 -> /lib/libpthread-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f7b000-b7f8d000 -> /lib/libpthread-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f8d000-b7f93000 -> /lib/libpthread-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f93000-b7f94000 -> /lib/libpthread-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7f94000-b7f95000 -> /lib/libpthread-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7fcb000-b7fcc000 -> /lib/ld-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7fcc000-b7ff3000 -> /lib/ld-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7ff3000-b7ffd000 -> /lib/ld-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7ffd000-b7fff000 -> /lib/ld-2.33.so*
lr-------- 1 root root 64 Feb 25 07:23 b7fff000-b8000000 -> /lib/ld-2.33.so*
 
# exit
 
(gdb) quit
A debugging session is active.
 
Inferior 1 [process 13594] will be killed.
 
Quit anyway? (y or n) y
# 

Comment by Noel [ 2021-03-06 ]

Joining this bug, rather than re-opening mine (24922)

Daniel, I have now hit this on a third machine. Most things on it, including mariadb, are built from source, and all the machines have glibc2.33 in common. Even with a fresh initial install with no databases, the same errors occur.

My build for 32bit is:

cmake -DCMAKE_C_FLAGS="-O2 -fPIC" -DCMAKE_CXX_FLAGS="-O2 -fPIC" -DFEATURE_SET="community" -DCMAKE_INSTALL_PREFIX=/usr -DINSTALL_LIBDIR="lib64" -DINSTALL_SBINDIR=libexec -DINSTALL_INCLUDEDIR=include/mysql -DINSTALL_MYSQLSHAREDIR=share/mysql -DINSTALL_SQLBENCHDIR="" -DINSTALL_MYSQLTESTDIR=mysql-test -DINSTALL_MANDIR=man -DINSTALL_PLUGINDIR="lib/mysql/plugin" -DINSTALL_SCRIPTDIR=bin -DINSTALL_SUPPORTFILESDIR=share/mysql -DINSTALL_MYSQLDATADIR="/var/lib/mysql" -DMYSQL_DATADIR="/var/lib/mysql" -DMYSQL_UNIX_ADDR="/var/run/mysql/mysql.sock" -DWITH_EXTRA_CHARSETS=complex -DWITH_INNOBASE_STORAGE_ENGINE=1 -DENABLED_LOCAL_INFILE=ON -DWITH_LIBARCHIVE=ON -DWITH_READLINE=ON -DWITH_JEMALLOC=system -DWITH_ZLIB=system -DWITH_EXTERNAL_ZLIB=ON -DWITH_SSL=system -DCONC_WITH_SSL=ON -DUSE_ARIA_FOR_TMP_TABLES=ON -DAWS_SDK_EXTERNAL_PROJECT=OFF

Can you, or another team member pick a modern OS with glibc2.33 installed maybe in a VM, and try this yourselves?

Comment by Noel [ 2021-03-06 ]

abi-compliance-checker -test

Results are attached as compat_report.html, in case it helps those who know how to read it.

Comment by Marko Mäkelä [ 2021-03-08 ]

cgw, the SIMD instruction

=> 0x01216f30:pinsrd $0x1,-0x1c(%ebp),%xmm0

ought to be in the mariadbd executable, if I am reading the output right:

00734000-012ac000 r-xp 00334000 08:01 3621348    /usr/libexec/mariadbd

Another SIMD instruction, psrlq, appears to be for bit-shifting. Unfortunately, this does not give me enough clues to guess which function it belongs to. Can you try to find the code in the server executable or one of the libraries? Or, can you build the server with -g so that debug symbols will be included for it?

A brute-force approach without debugging symbols could be

objdump -d mariadbd > mariadbd.s

and then searching mariadbd.s for the code snippet.

nobby6, the problem is that we do not have suitably old CPUs to try this on. The instructions would be valid on newer CPUs. Maybe there is some emulator that would flag an illegal instruction, but I have no experience with those. The compat_report.html appears to be for some sample program libsample_cpp that probably has little to do with MariaDB.

Comment by Noel [ 2021-03-10 ]

vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping : 1
microcode : 0x12
cpu MHz : 3192.189
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pebs bts cpuid pni dtes64 monitor ds_cpl cid cx16 xtpr pti
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 6384.37
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual

Yes, I know it's an old box, but it is solid, usually. The thing is, it's used as a dev box, so if stuff fails there, you can see why I'm hesitant to use it on production boxes: it would be irresponsible of me to assume "it'll be ok" on that newer (and 64bit) platform. mariadb is the only thing that fails; all other daemons (web/mail/bind) run fine, and heck, even a copy of DNEWS which hasn't been updated in 15 years runs fine.

So we have to assume now that mariadb is only suitable for use on later systems, or on systems that don't go above glibc2.31.

Comment by Marko Mäkelä [ 2021-03-10 ]

nobby6, nobody has produced enough details to show where exactly the problem resides. Someone should debug a mariadbd that has the debugging symbols present. It looks like the problematic instruction is in the mariadbd executable and not in any dynamically linked library.

Comment by Noel [ 2021-03-12 ]

I don't know enough about it. I am not a developer: I can do basic scripting in bash and perl, and if I'm lucky I won't break basic php, but with real code like C I have not got the faintest idea.

I just report the bugs I find. Daniel asked me to run abi-compliance-checker -test, and I posted the results earlier. I have no idea whether he has even seen them or whether they gave any clues; I gather they didn't.

I am not sure whether I mentioned this here or in my original bug report, which was closed weeks ago: it has to be mariadbd/glibc related. If I accidentally allow a mariadb update with the glibc2.33 build, it dies. All I do is grab just the mariadbd binary from a backup of the glibc2.31 build, copy it in place, and voila! mariadbd works again.

The fact that everything else works fine points to mariadbd. The bit about the CPU may be very relevant, as another slackware user reports that mariadbd works, but on 64bit. When I asked, they did not confirm whether they are actively running a database or just started it and assumed it works, since the start script falsely reports that it started.

I once tried the instructions for running mariadb from the source build dir. I followed the bits about how to fake the database dir, but that didn't work and attacked the databases (luckily on my dev box), so I'm not game to try it on my 64bit server, which is a production server. The 32bit build dying on my dev box is an annoyance, but my production server dying would be a nightmare.

Comment by Marko Mäkelä [ 2021-03-19 ]

I could easily debug it if I got remote access to some environment where the problem is repeatable. That worked for MDEV-25121.

Comment by Matteo Bernardini [ 2021-04-10 ]

I was able to replicate the crash during "mysql_install_db --user=mysql" on a 32bit installation of slackware-current on a samsung nc10 netbook with an Atom N270 processor.

Building mariadb with -DCMAKE_BUILD_TYPE=Debug and without stripping libraries/binaries (the standard package builds with "-O2 -march=i586 -mtune=i686" and does the stripping) avoids the crash.

If useful, Marko, I could give you access to the environment, just contact me privately: I also installed glibc-debug libraries in /usr/lib/debug if necessary.

http://ponce.cc/slackware/testing/mariadb/mysql_install_db.txt
http://ponce.cc/slackware/testing/mariadb/cpuinfo.txt
http://ponce.cc/slackware/testing/mariadb/core.gz

I also tried to make a full backtrace, but I have only used gdb occasionally, so I'm not sure it's useful:

http://ponce.cc/slackware/testing/mariadb/gdb.txt

Comment by Marko Mäkelä [ 2021-04-12 ]

ponce, thank you. Your stack trace for Thread 1 clearly identifies the problem:

Thread 1 (Thread 0xb7613f40 (LWP 24147) "mariadbd"):
#0  0x01216f30 in mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Slow_CRC32> (crc=0, buf=0xbfff9a00 "PHYS", size=508) at /tmp/mariadb-10.5.9/mysys/crc32/crc32c.cc:387
        p = 0xbfff9a00 "PHYS"
        e = 0xbfff9bfc ""
        l = <optimized out>
        pval = <optimized out>
        x = <optimized out>
#1  0x00f6c21a in ut_crc32 (size=508, s=0xbfff9a00 "PHYS") at /tmp/mariadb-10.5.9/storage/innobase/include/ut0crc32.h:34

This code was last refactored by wlad.

ponce, can you please post the output of the following GDB command:

disassemble mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Slow_CRC32>

I suspect that the following compile time flags in mysys/CMakeLists.txt are biting us:

  IF(have_C__msse4.2 AND have_C__mpclmul AND HAVE_CPUID_H AND HAVE_X86INTRIN_H)
    SET(MYSYS_SOURCES ${MYSYS_SOURCES} crc32/crc32_x86.c)
    SET_SOURCE_FILES_PROPERTIES(crc32/crc32_x86.c crc32/crc32c.cc PROPERTIES COMPILE_FLAGS "-msse4.2 -mpclmul")
    ADD_DEFINITIONS(-DHAVE_SSE42 -DHAVE_PCLMUL)
  ENDIF()

This compiles the entire compilation unit with those flags. It could enable the use of instructions such as movbe, which definitely do not belong to the original Intel Pentium ISA.

ponce, could you try to build an earlier revision of the code? I think that in MDEV-20386 this should have worked correctly, although I had no way to test it:

git checkout 31e6c96b0449761dc15f548c28ded671d1b7219b

I think that the correct fix should be to remove those unsafe SET_SOURCE_FILES_PROPERTIES and instead add __attribute__((target("sse4.2"))) to those functions that actually need it.
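
A minimal sketch of that approach, assuming GCC 5 or later (or clang) on x86. The function names crc32c_portable, crc32c_sse42 and pick_crc32c are hypothetical and are not the names used in mysys/crc32/crc32c.cc. Only the one function that carries the target attribute may contain SSE4.2 instructions; everything else is compiled with the default flags of the build, and a run-time check decides which function gets called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not the code in mysys/crc32/crc32c.cc.

// Portable bitwise CRC-32C (reflected Castagnoli polynomial 0x82F63B78).
// Compiled with the default flags of the build, so it is safe on any CPU.
static uint32_t crc32c_portable(uint32_t crc, const unsigned char *buf,
                                size_t len)
{
  assert(buf != nullptr || len == 0);
  crc = ~crc;
  while (len--)
  {
    crc ^= *buf++;
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define HAVE_SSE42_CRC32C 1
// Only this function may use SSE4.2; the attribute takes the place of a
// file-level -msse4.2 flag.
__attribute__((target("sse4.2")))
static uint32_t crc32c_sse42(uint32_t crc, const unsigned char *buf,
                             size_t len)
{
  crc = ~crc;
  while (len--)
    crc = __builtin_ia32_crc32qi(crc, *buf++); // SSE4.2 CRC32 instruction
  return ~crc;
}
#endif

typedef uint32_t (*crc32c_fn)(uint32_t, const unsigned char *, size_t);

// Choose the implementation at run time, based on what the CPU reports.
static crc32c_fn pick_crc32c()
{
#ifdef HAVE_SSE42_CRC32C
  __builtin_cpu_init();
  if (__builtin_cpu_supports("sse4.2"))
    return crc32c_sse42;
#endif
  return crc32c_portable;
}
```

A caller would invoke pick_crc32c() once at startup and keep the returned pointer. On a CPU without SSE4.2 (such as the Pentium 4 and Atom N270 in this report) the pointer refers to crc32c_portable, which the compiler had no permission to vectorize with SSE4.x instructions.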

Comment by Matteo Bernardini [ 2021-04-12 ]

This is the gdb log with the disassemble output:

http://ponce.cc/slackware/testing/mariadb/disassemble.txt

I'll try to build the earlier revision and I'll post back.

Comment by Marko Mäkelä [ 2021-04-12 ]

I compiled it locally and verified that the code requires at least SSE 4.1, because it is using the pinsrd instruction:

objdump --disassemble=_ZN15mysys_namespace6crc32c10ExtendImplIXadL_ZNS0_L10Slow_CRC32EPyPPKhEEEEjjPKcj mysys/libmysys.a|grep pins

 1c0:	66 0f 3a 22 45 e4 01 	pinsrd $0x1,-0x1c(%ebp),%xmm0
 1fe:	66 0f 3a 22 cb 01    	pinsrd $0x1,%ebx,%xmm1
 32c:	66 0f 3a 22 c3 01    	pinsrd $0x1,%ebx,%xmm0
 3f4:	66 0f 3a 22 c3 01    	pinsrd $0x1,%ebx,%xmm0
 43e:	66 0f 3a 22 cb 01    	pinsrd $0x1,%ebx,%xmm1

The function could use even newer instructions; I only wanted some proof that the code is invalid.

Comment by Marko Mäkelä [ 2021-04-12 ]

ponce, thank you, your disassembly confirms that the invalid instruction is pinsrd. I think that we have all the necessary information now.

Comment by Marko Mäkelä [ 2021-04-12 ]

wlad, please review my fix on bb-10.5-release and also test it with clang-cl.exe.

Comment by Marko Mäkelä [ 2021-04-12 ]

This should have been broken in MariaDB Server 10.5.7 by MDEV-22749.

Comment by Vladislav Vaintroub [ 2021-04-12 ]

marko, could you please minimize the diff. It is foreign code, and there is no reason to fix the whitespace in it. Also, renaming files is not necessary, it does not even add to clarity. The code duplication instead of function template needs a good and large comment though.

Alternatively the minimal fix could be just some GCC deoptimize pragma in ExtendImpl, such as

#if defined (__GNUC__) && defined (__i386__)
#define NEED_GCC_NO_SSE_WORKAROUND
#endif
 
#ifdef NEED_GCC_NO_SSE_WORKAROUND
#pragma GCC push_options
#pragma GCC target ("no-sse")
#endif
 
template<void (*CRC32)(uint64_t*, uint8_t const**)>
uint32_t ExtendImpl(uint32_t crc, const char* buf, size_t size) {
....
}
#ifdef NEED_GCC_NO_SSE_WORKAROUND
#pragma GCC pop_options
#undef NEED_GCC_NO_SSE_WORKAROUND
#endif

This is not 100% proof against malice by GCC, but it will work so far. You can also extend that to x86_64 if you think it could be necessary. In terms of lines of code and clarity, that change would be hard to beat.

Comment by Marko Mäkelä [ 2021-04-13 ]

wlad, compiling generic code with target-specific flags is comparable to running with open scissors. I did try your suggested work-around, and it did not work reliably. Either the mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Slow_CRC32> would still contain SSE4.1 (or SSE4.2) instructions (without any warning being emitted by the compiler!), or mysys_namespace::crc32c::ExtendImpl<mysys_namespace::crc32c::Fast_CRC32> would no longer contain those instructions. We have past examples of that: undefined behavior gave the permission to the compiler to optimize away some code on some platforms, in MDEV-21977 (leading to wrong results) and MDEV-15587 (leading to SIGSEGV due to dereferencing a null pointer).

There is another issue with isSSE42() and crc32_pclmul_enabled(). They were also being compiled with -msse4.2 -mpclmul, and theoretically some future compiler version could choose to take advantage of those instruction set extensions.

I think that the cleanest solution is to avoid setting any target-specific flags for compiling crc32c.cc and to move the implementation of the crc32_pclmul_enabled() check to that file.

For the SSE4.2 but not PCLMUL accelerated CRC-32C, it is easiest to specify target attributes on the few functions that use SSE4.2 instructions. Because we support compiling MariaDB 10.5 with GCC 4.8.5, and because header files such as <nmmintrin.h> only work without file-level flags such as -msse4.2 starting with GCC 5, we need a work-around for the old GCC. To minimize the size of the work-around, it is easiest to move all -mpclmul code to a separate compilation unit.
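
The run-time checks themselves need no target-specific flags at all. A minimal sketch, assuming GCC or clang on x86: the struct and function names below are hypothetical (this thread only names isSSE42() and crc32_pclmul_enabled()), while the bit positions are those of CPUID leaf 1, register ECX (bit 1 PCLMULQDQ, bit 19 SSE4.1, bit 20 SSE4.2).

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# include <cpuid.h>
# define HAVE_X86_CPUID 1
#endif

// Hypothetical helper type; not part of the MariaDB source.
struct CpuCrcFeatures
{
  bool sse41;  // needed for pinsrd
  bool sse42;  // needed for the CRC32 instruction
  bool pclmul; // needed for PCLMULQDQ based folding
};

// Decode the relevant feature bits of CPUID leaf 1, register ECX.
constexpr CpuCrcFeatures decode_cpuid1_ecx(uint32_t ecx)
{
  return CpuCrcFeatures{(ecx & (1u << 19)) != 0,
                        (ecx & (1u << 20)) != 0,
                        (ecx & (1u << 1)) != 0};
}

// Query the CPU that the process is running on. This function contains no
// SIMD code, so it can be compiled with the default flags of the build.
static CpuCrcFeatures detect_cpu_crc_features()
{
#ifdef HAVE_X86_CPUID
  unsigned eax, ebx, ecx, edx;
  if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    return decode_cpuid1_ecx(ecx);
#endif
  return decode_cpuid1_ecx(0); // not x86, or CPUID leaf 1 unavailable
}
```

On the CPUs reported in this ticket, whose /proc/cpuinfo flags list pni (SSE3) but none of sse4_1, sse4_2 or pclmulqdq, all three members would be expected to be false.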

Comment by Marko Mäkelä [ 2021-04-13 ]

I got a mystery failure on Windows (only) when I converted mysys/crc32/crc32_x86.c from C to C++. mariabackup (but not the server) would crash somewhere in the pclmul accelerated my_checksum(). I tested that moving the -mpclmul dependent crc32c_3way() code to a new compilation unit mysys/crc32/crc32c_amd64.cc addressed that problem.

While working on this, I noticed that mysys/crc32ieee.cc is not really needed on 64-bit POWER, because there is only one implementation of my_checksum() on that platform.

Comment by Daniel Black [ 2021-04-19 ]

nobby6 confirms fix in MDEV-24922.
