[MDEV-27936] hardware lock elision on ppc64{,le} failing to compile Created: 2022-02-24  Updated: 2022-04-01  Resolved: 2022-03-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.8, 10.7.4, 10.8.3

Type: Bug Priority: Major
Reporter: Daniel Black Assignee: Daniel Black
Resolution: Fixed Votes: 0
Labels: None
Environment:

ppc64 / ppc64le


Issue Links:
Problem/Incident
is caused by MDEV-26769 InnoDB internal latches do not suppor... Closed

 Description   

From https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.6&arch=ppc64&ver=1%3A10.6.7-1&stamp=1645322566&raw=0

In file included from /<<PKGBUILDDIR>>/storage/innobase/include/transactional_lock_guard.h:73,
                 from /<<PKGBUILDDIR>>/storage/innobase/sync/srw_lock.cc:22:
/usr/lib/gcc/powerpc64-linux-gnu/11/include/htmxlintrin.h: In function ‘void __TM_abort()’:
/usr/lib/gcc/powerpc64-linux-gnu/11/include/htmxlintrin.h:94:3: error: ‘__builtin_tabort’ was not declared in this scope; did you mean ‘__builtin_abort’?
   94 |   __builtin_tabort (0);
      |   ^~~~~~~~~~~~~~~~

Per https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html, the builtins are only available when -mhtm or -mcpu=power8 (or later) is specified. Neither of these is used in the compile.
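
The dependency on the compile flag can be made explicit in source with a preprocessor guard. The following is a minimal sketch, assuming GCC predefines `__HTM__` whenever -mhtm (or a -mcpu that implies it) is in effect. `SKETCH_PPC_HTM`, `htm_builtins_compiled_in` and `try_empty_transaction` are hypothetical names for illustration and are not part of MariaDB.

```cpp
#include <cassert>

// Assumption: GCC predefines __HTM__ only when -mhtm (or a -mcpu that
// implies it) is active, so the builtins are referenced only when they exist.
#if defined __powerpc64__ && defined __HTM__
# define SKETCH_PPC_HTM 1
#else
# define SKETCH_PPC_HTM 0
#endif

// Hypothetical helper: reports whether this translation unit was compiled
// with the PowerPC HTM builtins available.
constexpr bool htm_builtins_compiled_in() { return SKETCH_PPC_HTM != 0; }

// Hypothetical helper: tries to start and immediately commit an empty
// transaction. Returns false if the builtins were not compiled in, if the
// caller says the CPU lacks HTM, or if the transaction did not start.
bool try_empty_transaction(bool cpu_supports_htm)
{
#if SKETCH_PPC_HTM
  if (cpu_supports_htm && __builtin_tbegin(0))
  {
    __builtin_tend(0);
    return true;
  }
#else
  (void) cpu_supports_htm;  // unused when the builtins are unavailable
#endif
  return false;
}
```

Without -mhtm the guard compiles the fallback branch, so the translation unit builds on any target instead of failing on an undeclared builtin.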

Pushed bb-10.6-danielblack-MDEV-26769-htm-flag-for-ppc64 to see how the buildbots handle it. A proper fix might be to make the hardware lock elision depend on compile flag detection. I'm also unsure whether AIX supports it.

HTM was introduced in POWER8, so I'm unsure whether older POWER will survive this change.
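
Whether a given CPU survives depends on a runtime check rather than on the compile flag. A minimal sketch of such a check on Linux, assuming the kernel reports HTM through the `PPC_FEATURE2_HTM` bit of `AT_HWCAP2`; the function name `cpu_reports_htm` is hypothetical, and the fallback bit value is an assumption taken from the Linux powerpc headers.

```cpp
#include <cassert>
#if defined __linux__
# include <sys/auxv.h>  // getauxval(), AT_HWCAP2
#endif

// Assumption: this mirrors the HTM bit of AT_HWCAP2 from the Linux powerpc
// <asm/cputable.h>; it is defined here only if the system headers did not.
#if defined __powerpc64__ && !defined PPC_FEATURE2_HTM
# define PPC_FEATURE2_HTM 0x40000000
#endif

// Hypothetical helper: true only if the running kernel advertises HTM.
// On any other platform it conservatively reports false.
bool cpu_reports_htm()
{
#if defined __linux__ && defined __powerpc64__
  return (getauxval(AT_HWCAP2) & PPC_FEATURE2_HTM) != 0;
#else
  return false;
#endif
}
```

A caller would consult this once at startup and skip the transactional path entirely when it returns false.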



 Comments   
Comment by Marko Mäkelä [ 2022-02-24 ]

Related to this, if you can figure out how to make that code compile on s390x, it would be great if you can do that. The programming interface should be very similar on POWER and s390x. I tried it a few times, but eventually I gave up on s390x because at that time, I only had indirect access to s390x via the CI system, with delays of several hours.

For experimenting, you might use https://github.com/dr-m/atomic_sync which is a stand-alone version of the code. Much faster to compile and test than the full MariaDB source code.

Comment by Daniel Black [ 2022-02-25 ]

s390x - apparently identical - https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html

Blind testing on bb-10.6-danielblack-MDEV-27936-htm-flag-for-ppc64.

Looking closer, I can't see why: on https://buildbot.mariadb.org/#/grid?branch=10.6, ppc64le-debian-{10,11,sid}-autobake are all running fine, and it fails when it gets to Debian.

Comment by Daniel Black [ 2022-02-28 ]

Comparing an actually supported architecture, ppc64le:

htmxlintrin.h is provided by libgcc-11-dev

deb

Toolchain package versions: binutils_2.38-1 dpkg-dev_1.21.1 g++-11_11.2.0-16 gcc-11_11.2.0-16 libc6-dev_2.33-6 libstdc++-11-dev_11.2.0-16 libstdc++6_11.2.0-16 linux-libc-dev_5.16.7-2
 
	cd builddir && cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON "-GUnix Makefiles" -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INSTALL_LIBDIR=lib/powerpc64le-linux-gnu -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_PMEM=yes "-DCOMPILATION_COMMENT=Debian buildd-unstable " -DMYSQL_SERVER_SUFFIX=-1 -DSYSTEM_TYPE=debian-linux-gnu -DCMAKE_SYSTEM_PROCESSOR=ppc64el -DBUILD_CONFIG=mysql_release -DCONC_DEFAULT_CHARSET=utf8mb4 -DPLUGIN_AWS_KEY_MANAGEMENT=NO -DPLUGIN_COLUMNSTORE=NO -DWITH_NUMA=auto -DIGNORE_AIO_CHECK=YES -DWITH_URING=YES -DWITH_INNODB_SNAPPY=ON -DDEB=Debian ..
 
 
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
 

bb.org - ppc64le-debian-sid-deb-autobake builder, worker=p9-rhel8-bbw1-docker-debian-sid

$ podman run --rm --arch ppc64le -ti quay.io/mariadb-foundation/bb-worker:debiansid bash
gcc-11                                   11.2.0-17
gcc-11-base:ppc64el                      11.2.0-17 
libgcc-11-dev:ppc64el                    11.2.0-17
 
include header - /usr/lib/gcc/powerpc64le-linux-gnu/11/include/htmxlintrin.h appears same #error on line 25
 
	cd builddir && cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON "-GUnix Makefiles" -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INSTALL_LIBDIR=lib/powerpc64le-linux-gnu -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_PMEM=yes "-DCOMPILATION_COMMENT=mariadb.org binary distribution" -DMYSQL_SERVER_SUFFIX=-1:10.6.8\+maria\~sid -DSYSTEM_TYPE=debian-linux-gnu -DCMAKE_SYSTEM_PROCESSOR=ppc64el -DBUILD_CONFIG=mysql_release -DCONC_DEFAULT_CHARSET=utf8mb4 -DPLUGIN_AWS_KEY_MANAGEMENT=NO -DIGNORE_AIO_CHECK=YES -DWITH_URING=yes -DDEB=Debian ..
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
[ 72%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o
cd /buildbot/ppc64le-debian-sid-deb-autobake/build/builddir/storage/innobase && /usr/lib/ccache/c++ -DBTR_CUR_ADAPT -DBTR_CUR_HASH_ADAPT -DCOMPILER_HINTS -DDBUG_TRACE -DHAVE_CONFIG_H -DHAVE_FALLOC_PUNCH_HOLE_AND_KEEP_SIZE=1 -DHAVE_LZ4=1 -DHAVE_LZ4_COMPRESS_DEFAULT=1 -DHAVE_OPENSSL -DHAVE_SCHED_GETCPU=1 -DHAVE_URING -DWITH_INNODB_DISALLOW_WRITES -D_FILE_OFFSET_BITS=64 -I/buildbot/ppc64le-debian-sid-deb-autobake/build/wsrep-lib/include -I/buildbot/ppc64le-debian-sid-deb-autobake/build/wsrep-lib/wsrep-API/v26 -I/buildbot/ppc64le-debian-sid-deb-autobake/build/builddir/include -I/buildbot/ppc64le-debian-sid-deb-autobake/build/storage/innobase/include -I/buildbot/ppc64le-debian-sid-deb-autobake/build/storage/innobase/handler -I/buildbot/ppc64le-debian-sid-deb-autobake/build/libbinlogevents/include -I/buildbot/ppc64le-debian-sid-deb-autobake/build/tpool -I/buildbot/ppc64le-debian-sid-deb-autobake/build/include -I/buildbot/ppc64le-debian-sid-deb-autobake/build/sql -g -O2 -ffile-prefix-map=/buildbot/ppc64le-debian-sid-deb-autobake/build=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -pie -fPIC -fstack-protector --param=ssp-buffer-size=4 -Wconversion -Wno-sign-conversion -O3 -g -static-libgcc -fno-omit-frame-pointer -fno-strict-aliasing -Wno-uninitialized -fno-omit-frame-pointer -D_FORTIFY_SOURCE=2 -DDBUG_OFF -Wall -Wextra -Wformat-security -Wno-format-truncation -Wno-init-self -Wno-nonnull-compare -Wno-unused-parameter -Woverloaded-virtual -Wnon-virtual-dtor -Wvla -Wwrite-strings   -Wdate-time -D_FORTIFY_SOURCE=2 -DUNIV_LINUX -D_GNU_SOURCE=1  -fvisibility=hidden -std=gnu++11 -MD -MT storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o -MF CMakeFiles/innobase.dir/btr/btr0btr.cc.o.d -o CMakeFiles/innobase.dir/btr/btr0btr.cc.o -c /buildbot/ppc64le-debian-sid-deb-autobake/build/storage/innobase/btr/btr0btr.cc

The only real difference is the minor gcc package version, which was released on 23 February.
There is a rather long list of PPC fixes on https://metadata.ftp-master.debian.org/changelogs//main/g/gcc-11/gcc-11_11.2.0-18_changelog for 11.2.0-17.

otto please rebuild to see if this reoccurs.

Comment by Daniel Black [ 2022-03-02 ]

Apparently our debian sid ppc64le started failing the same way

https://buildbot.mariadb.org/#/grid?branch=10.6

Rebuild unsuccessful

https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.6&arch=ppc64&ver=1%3A10.6.7-1&stamp=1646197548&raw=0

Probably need a fix like https://github.com/MariaDB/server/commit/ce895ffe7ccb2d46285c0ae2c14acd65009ba8ce for ppc64, moving the xbegin/xend into non-inline functions.
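
One way to read this: callers see only plain declarations, and the HTM builtins live in a single translation unit that can be compiled with -mhtm. The following single-file sketch illustrates that split under stated assumptions. It is not the code of the referenced commit; `htm_usable`, `SKETCH_PPC_HTM` and `elided_counter` are hypothetical, and `__HTM__` is assumed to be predefined by GCC when -mhtm is active.

```cpp
#include <atomic>
#include <cassert>

// What a shared header would expose: declarations only, so that including
// files need neither the HTM builtins nor the -mhtm flag.
bool xbegin();
void xabort();
void xend();

// What the single translation unit built with -mhtm would contain.
#if defined __powerpc64__ && defined __HTM__
# define SKETCH_PPC_HTM 1
#else
# define SKETCH_PPC_HTM 0
#endif

// Hypothetical switch: stays false unless a runtime probe enables it, so
// the sketch never issues HTM instructions on a CPU that lacks them.
bool htm_usable = false;

bool xbegin()
{
#if SKETCH_PPC_HTM
  return htm_usable && __builtin_tbegin(0);
#else
  return false;
#endif
}

void xabort()
{
#if SKETCH_PPC_HTM
  __builtin_tabort(0);
#endif
}

void xend()
{
#if SKETCH_PPC_HTM
  __builtin_tend(0);
#endif
}

// Hypothetical caller: elide a spin lock when a transaction can be started.
struct elided_counter
{
  std::atomic<bool> locked{false};
  long value = 0;

  void increment()
  {
    if (xbegin())
    {
      if (!locked.load(std::memory_order_relaxed))
      {
        ++value;  // runs transactionally; the lock is never written
        xend();
        return;
      }
      xabort();   // lock is held: roll back, xbegin() then returns false
    }
    // Fallback: take the lock for real.
    while (locked.exchange(true, std::memory_order_acquire)) {}
    ++value;
    locked.store(false, std::memory_order_release);
  }
};
```

The cost of this layout is a function call per transaction boundary; the benefit is that only one source file depends on the flag.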

Comment by Daniel Black [ 2022-03-02 ]

bb-10.6-danielblack-MDEV-27936-ppc64-htm-build-fail pushed as test.

Comment by Otto Kekäläinen [ 2022-03-02 ]

otto please rebuild to see if this reoccurs.

Yes, it is reproducible, as you noticed on buildbot now as well.

Downstream bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1006527

Comment by Daniel Black [ 2022-03-03 ]

/home/buildbot/ppc64le-debian-sid/build/storage/innobase/sync/srw_lock.cc: In function ‘bool xbegin()’:
/home/buildbot/ppc64le-debian-sid/build/storage/innobase/sync/srw_lock.cc:63:23: error: ‘__builtin_tbegin’ was not declared in this scope; did you mean ‘__builtin_tan’?
   63 |     __builtin_expect (__builtin_tbegin(0), 1);
      |                       ^~~~~~~~~~~~~~~~
      |                       __builtin_tan
/home/buildbot/ppc64le-debian-sid/build/storage/innobase/sync/srw_lock.cc: In function ‘void xabort()’:
/home/buildbot/ppc64le-debian-sid/build/storage/innobase/sync/srw_lock.cc:67:17: error: ‘__builtin_tabort’ was not declared in this scope; did you mean ‘__builtin_abort’?
   67 | void xabort() { __builtin_tabort(0); }
      |                 ^~~~~~~~~~~~~~~~
      |                 __builtin_abort
/home/buildbot/ppc64le-debian-sid/build/storage/innobase/sync/srw_lock.cc: In function ‘void xend()’:
/home/buildbot/ppc64le-debian-sid/build/storage/innobase/sync/srw_lock.cc:70:15: error: ‘__builtin_tend’ was not declared in this scope; did you mean ‘__builtin_tanl’?
   70 | void xend() { __builtin_tend(0); }
      |               ^~~~~~~~~~~~~~
      |               __builtin_tanl
make[2]: *** [storage/innobase/unittest/CMakeFiles/innodb_sync-t.dir/build.make:90: storage/innobase/unittest/CMakeFiles/innodb_sync-t.dir/__/sync/srw_lock.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:4903: storage/innobase/unittest/CMakeFiles/innodb_sync-t.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

Avoiding this just requires the -mhtm flag to be passed per:

podman run --arch ppc64le -ti debian:sid

root@0b49bb233a4d:/# gcc -o main main.c
main.c: In function 'f':
main.c:5:16: warning: implicit declaration of function '__builtin_tbegin'; did you mean '__builtin_asin'? [-Wimplicit-function-declaration]
    5 |         return __builtin_tbegin(0);
      |                ^~~~~~~~~~~~~~~~
      |                __builtin_asin
/usr/bin/ld: /tmp/ccaqLefJ.o: in function `f':
main.c:(.text+0x20): undefined reference to `__builtin_tbegin'
collect2: error: ld returned 1 exit status
root@0b49bb233a4d:/# gcc -mhtm -o main main.c
root@0b49bb233a4d:/# cat main.c
 
__attribute__((target("htm")))
int f()
{
	return __builtin_tbegin(0);
}
 
int main()
{
 
 
	return f();
}
root@0b49bb233a4d:/# uname -a
Linux 0b49bb233a4d 5.17.0-0.rc6.109.fc37.x86_64 #1 SMP PREEMPT Mon Feb 28 15:48:52 UTC 2022 ppc64le GNU/Linux

But cmake isn't doing it for me at the moment.

Comment by Daniel Black [ 2022-03-04 ]

otto Please try https://github.com/MariaDB/server/commit/5c8e9cacea2e913c0da74686bfbfeff0a33c2cee

It's failing on our bb on sid for unknown reasons; however, it looks like it's ignoring the addition of the -mhtm flag on the srw_lock.cc file. I built the entire innodb with quay.io/mariadb-foundation/bb-worker:debiansid under ppc64le qemu without a problem.

Comment by Daniel Black [ 2022-03-08 ]

the unit test compile of srw_lock.cc needed the cflag too.

marko can you please review bb-10.6-danielblack-MDEV-27936-ppc64-htm-build-fail

Comment by Otto Kekäläinen [ 2022-03-08 ]

Thanks! The commit above was applied on Debian packaging repo in https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/3478b9d9b2f0520be4f36b84aa444e4cdb90e9da and uploaded, but unfortunately build still fails: https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.6&arch=ppc64el&ver=1%3A10.6.7-3%7Eexp1&stamp=1646726091&raw=0

Full status of latest version in Debian experimental: https://buildd.debian.org/status/package.php?p=mariadb-10.6&suite=experimental

Comment by Daniel Black [ 2022-03-09 ]

otto wrong patch - it's bd5f7f0f8930fa343ee676fdffa1cf26b5e12e70 that includes the storage/innobase/unittest/CMakeLists.txt change adding the htm cflag to ../sync/srw_lock.cc

Succeeding on https://buildbot.mariadb.org/#/grid?branch=bb-10.6-danielblack-MDEV-27936-ppc64-htm-build-fail for the ppc64le sid builds. (test failures are something infrastructure related - MDBF-351)

Comment by Otto Kekäläinen [ 2022-03-09 ]

danblack Updated Debian patch in https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/be847f26c4b5d2ecd6defa9db267a85dc84745ae - does this look good, shall I upload?

Comment by Daniel Black [ 2022-03-09 ]

otto, marko,

I moved https://github.com/MariaDB/server/commit/11e68988d9698c3b1f79b8a3a41f81502b3e095c back to using the high level directives, checked the compile (locally under qemu), for less change and spelt elision right.

Are the hot attributes or anything else unacceptable?

Comment by Marko Mäkelä [ 2022-03-09 ]

Thank you, this looks OK to me. According to https://godbolt.org the hot function attribute is supported already by GCC 4.8.5 and the oldest version of clang that I tested.

Comment by Otto Kekäläinen [ 2022-03-10 ]

I have now changed the version in Debian to the latest version Daniel posted here: https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/e6f568bf63cbae53e25b7f10170d71c68a3c6e62

End result https://salsa.debian.org/mariadb-team/mariadb-server/-/blob/e6f568bf63cbae53e25b7f10170d71c68a3c6e62/debian/patches/1006527-fix-ppc64-ftbfs.patch equals https://github.com/MariaDB/server/commit/11e68988d9698c3b1f79b8a3a41f81502b3e095c.

OK?
