[MDEV-25633] MariaDB crashes when compiled with link time optimizations Created: 2021-05-10 Updated: 2023-11-15 |
|
| Status: | In Review |
| Project: | MariaDB Server |
| Component/s: | Compiling, Replication |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6 |
| Fix Version/s: | 10.6, 10.11 |
| Type: | Bug | Priority: | Major |
| Reporter: | Vicențiu Ciorbaru | Assignee: | Vicențiu Ciorbaru |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
Following the release of MariaDB 10.5.10 on Hirsute, we have discovered that appending -flto and -ffat-lto-objects as compile flags causes MariaDB to crash with SIGABRT in pthread_exit when closing the replication slave thread. This happens regardless of whether MariaDB is compiled with PERFSCHEMA or not. Steps to reproduce: Set up an Ubuntu 21.04 docker container or VM. Run
or
from the base server directory. Notice the extra -flto and -ffat-lto-objects flags being passed during compilation (and linking). Run any replication test such as rpl_sp
|
| Comments |
| Comment by Sergei Golubchik [ 2021-05-10 ] | |||||||||||||||||||||||
|
just FYI (want to write it down somewhere before I forget):
| |||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-04-26 ] | |||||||||||||||||||||||
|
In case link-time optimization generates invalid code due to undefined behavior, then | |||||||||||||||||||||||
| Comment by Daniel Black [ 2022-04-29 ] | |||||||||||||||||||||||
|
LTO removing the exception catching code: https://bugs.launchpad.net/ubuntu/+source/mariadb-10.6/+bug/1970634 resulting in assertion ( | |||||||||||||||||||||||
| Comment by Daniel Black [ 2022-07-22 ] | |||||||||||||||||||||||
|
I ran mtr tests on the quay.io/mariadb-foundation/mariadb-devel:10.8 image, which is based on our Ubuntu jammy builds (like MDBF-453), and got consistent crashes when stopping the slave thread, as shown here. If we can't find a solution, maybe we could pull in https://git.launchpad.net/ubuntu/+source/mariadb-10.6/commit/?id=ae532f091e888f9302d2a5f3aad4c0b74521d158 before the release (10.6+, as Ubuntu packages 10.6 on 22.04). | |||||||||||||||||||||||
| Comment by Daniel Black [ 2022-07-29 ] | |||||||||||||||||||||||
|
Unable to reproduce with clang-14.0.0 / gcc-12.1.1 (fc36) with CMAKE_C{,XX}_FLAGS and CMAKE_LINKER_FLAGS containing -flto. Only gcc supported -ffat-lto-objects. | |||||||||||||||||||||||
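For reference, the flag combination described in the comment above can be spelled out as explicit cmake arguments. This is only a sketch: the helper name `lto_cmake_args` is hypothetical, and `CMAKE_LINKER_FLAGS` is used simply because it is the name given in the comment (stock CMake documents per-target variables such as `CMAKE_EXE_LINKER_FLAGS`), so it may need adjusting for a real build.

```python
import shlex


def lto_cmake_args(compiler):
    """Return cmake -D arguments that enable LTO for a compiler family.

    `compiler` is "gcc" or "clang". Hypothetical helper, for illustration only.
    """
    flags = ["-flto"]
    # Per the thread, only gcc accepted -ffat-lto-objects.
    if compiler == "gcc":
        flags.append("-ffat-lto-objects")
    joined = " ".join(flags)
    return [
        f"-DCMAKE_C_FLAGS={joined}",
        f"-DCMAKE_CXX_FLAGS={joined}",
        # Variable name taken verbatim from the comment; an assumption here.
        f"-DCMAKE_LINKER_FLAGS={joined}",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line to run from the base server directory.
    print(shlex.join(["cmake", "."] + lto_cmake_args("gcc")))
```

Keeping the compiler check in one place makes it easy to compare a gcc build against a clang build with otherwise identical flags.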
| Comment by Sergei Golubchik [ 2022-10-22 ] | |||||||||||||||||||||||
|
There are many issues related to lto. The fix for the replication crash could be
Unfortunately there're lots of places that throw exceptions (in oqgraph and columnstore) and crash with lto, and they cannot be fixed like above. | |||||||||||||||||||||||
| Comment by Otto Kekäläinen [ 2023-10-05 ] | |||||||||||||||||||||||
|
This issue still exists. Filed https://bugs.launchpad.net/ubuntu/+source/mariadb/+bug/2038500 to track this and to remember to remove the workaround eventually. | |||||||||||||||||||||||
| Comment by Kristian Nielsen [ 2023-10-28 ] | |||||||||||||||||||||||
|
From the stacktrace in the description, this looks similar to
We see that __pthread_exit() goes through dynamic libgcc_s.so, but the
And in cmake/build_configurations/mysql_release.cmake I see that it uses
So the code crashes exactly at the place where the dynamic libgcc code calls
Is there still a way to reproduce this? If so, try removing the 4 occurrences
It doesn't seem correct to use -static-libgcc, static linking has been | |||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2023-11-06 ] | |||||||||||||||||||||||
|
Thanks, knielsen. Without -static-libgcc it doesn't crash for me. There were a few compilation failures, though, and they were easy to fix. Otherwise it appears to work now. | |||||||||||||||||||||||
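To repeat the experiment of building without -static-libgcc, a small script can count and strip the flag from a build configuration file. This is a minimal sketch under assumptions: the function names are hypothetical, and the sample text is made up for illustration rather than copied from cmake/build_configurations/mysql_release.cmake.

```python
import re

FLAG = "-static-libgcc"
# Match the flag as a whole token, plus one preceding space if present,
# so that removing it does not leave a doubled space behind.
_FLAG_RE = re.compile(r" ?(?<![\w-])" + re.escape(FLAG) + r"(?![\w-])")


def count_flag(text):
    """Return how many times the flag appears as a standalone token."""
    return len(_FLAG_RE.findall(text))


def strip_flag(text):
    """Return the text with every standalone occurrence of the flag removed."""
    return _FLAG_RE.sub("", text)


if __name__ == "__main__":
    # Made-up sample lines, not the real file contents.
    sample = (
        'SET(COMMON_C_FLAGS "-g -static-libgcc -fno-omit-frame-pointer")\n'
        'SET(COMMON_CXX_FLAGS "-g -static-libgcc")\n'
    )
    print(count_flag(sample))
    print(strip_flag(sample), end="")
```

Counting first gives a quick check that the number of matches agrees with what is expected before the file is rewritten.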
| Comment by Sergei Golubchik [ 2023-11-06 ] | |||||||||||||||||||||||
|
cvicentiu, please, see commits 259233e2e94 don't disable lto in DEB builds (including commits inside 475c39cdbfc) |