[MDEV-23249] __builtin_readcyclecounter in include/my_rdtsc.h causes SIGILL on ARM Created: 2020-07-22 Updated: 2020-09-05 Resolved: 2020-07-23 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Server |
| Affects Version/s: | 10.4.7, 10.4.8, 10.4.9, 10.4.10, 10.4.11, 10.4.12, 10.5.0, 10.5.1, 10.4.13, 10.5.4, 10.5.2, 10.5.3 |
| Fix Version/s: | 10.4.14, 10.5.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Naram Qashat | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | crash | ||
| Environment: |
FreeBSD, aarch64 |
||
| Issue Links: |
|
||||||||||||||||||||||||
| Description |
|
The commit that handles MDEV-19845 makes a mention in the commit message about how it may lead to runtime failures on ARM systems. In trying to use both the 10.4 line and 10.5 line of MariaDB on a Raspberry Pi 3 running FreeBSD, which uses an ARM Cortex-A53 processor, I am running into just that runtime failure. As it says in the subject line, I get a SIGILL when MariaDB tries to call __builtin_readcyclecounter. The easiest fix for this would be to add a preprocessor condition to the check for the builtin's availability for it to not be used when __arm__ is defined. And example of this can be found here. |
| Comments |
| Comment by Naram Qashat [ 2020-07-22 ] | ||||||||||||||||||||
|
I forgot to mention which commit did this: https://github.com/MariaDB/server/commit/5e929ee8a0487f8929386031e84b1884f503eb48 | ||||||||||||||||||||
| Comment by Marko Mäkelä [ 2020-07-22 ] | ||||||||||||||||||||
|
If I understand correctly, FreeBSD is using the clang compiler. The primitive __has_builtin should only be defined on clang or possibly GCC 10. How exactly are the executables being compiled? Can you provide a link to the compilation log? Is the -march switch compatible with the Raspberry Pi 3? In my experience so far, compiler builtin functions in gcc or clang should never generate code that would be incompatible with the specified ISA. Also, please show the generated code after your suggested patch. Would my_timer_cycles() return gethrtime(), or would it return the constant 0? There is a related pull request #1620, which provides an ARMv8 implementation for my_timer_cycles() on GCC. | ||||||||||||||||||||
| Comment by Naram Qashat [ 2020-07-22 ] | ||||||||||||||||||||
|
FreeBSD does include use clang by default nowadays. I use FreeBSD's poudriere to build within a jail. The latest compilation log I have is this one: https://poudriere.cyberbotx.com:8766/data/local_aarch64-default/2020-07-21_23h45m55s/logs/mariadb105-server-10.5.4_1.log (I was building with debug symbols to narrow down where the crash occurred.) The -march flag is not compatible with the Raspberry Pi 3, but the -mcpu flag is, although I was getting the crash even when it wasn't in use. (I am using it set to cortex-a53 as that is the processor that the Raspberry Pi 3 uses.) I believe the reason it gets included is that it is a valid language extension for the Cortex-A53, but from talking to others that work with FreeBSD, it does indeed require special permissions to be used from userspace. I believe by default it can only be used from the kernel, at least under FreeBSD and possibly under Linux. I have seen that the Android application Termux, which is a terminal emulator and Linux environment app, has a package for MariaDB and includes a similar patch to the one in the initial report, albeit using __ANDROID__ instead of __arm__. I don't have an easy way to get to the generated code within the build jail, but asking around, it seems that FreeBSD does not have the gethrtime() function, so it would return 0. If it also helps, this is a gdb run before the patch that I was asked to generate by another user: https://pastebin.com/u3GAKw8r This is what led us to seeing that the __builtin_readcyclecounter call was causing the SIGILL. | ||||||||||||||||||||
| Comment by Krunal Bauskar [ 2020-07-22 ] | ||||||||||||||||||||
|
I have Ubuntu 18.04 running on arm64 (armv8.2 kunpeng 920) bit and I was able to run the said function from user-space. So looks like it is limitation or issue with either free-bsd or free-bsd on cortex-53. bash$ cat timer.cc int main() { cout << __builtin_readcyclecounter() << endl; }bash$ clang++-8 timer.cc Value is always returned as 0 so not sure if it fetching the valid register. alternative is to use that will work on gnu too and being used by mysql also proposed in PR#1620. unsigned long long result; | ||||||||||||||||||||
| Comment by Marko Mäkelä [ 2020-07-22 ] | ||||||||||||||||||||
|
I see that my_timer_cycles() is frequently called by the function get_rnd_value():
This function is used in sync_array_get() and in ib_counter_t::add() (which could be related to Other callers of my_timer_cycles() outside server initialization are in:
It seems that returning 0 from my_timer_cycles() could cause reduced InnoDB scalability and useless results being returned by Exec_time_tracker. krunalbauskar, can you check if the InnoDB code actually ‘works’ on the ARMv8 hardware that you have access to, possibly after applying pull request #1620? | ||||||||||||||||||||
| Comment by Krunal Bauskar [ 2020-07-22 ] | ||||||||||||||||||||
|
The existing approach of __builtin_readcyclecounter always continues to return 0 on arm64 vs the proposed asm approach to use cntvct_el0 return non-zero values (counter). this is a good reason to replace __builtin_readcyclecounter with new proposed asm assembly. bash$ cat timer.cc int main() { unsigned long long result; __asm __volatile("mrs %0, CNTVCT_EL0" : "=&r" (result)); cout << result << endl; cout << __builtin_readcyclecounter() << endl; }bash$ clang++-8 timer.cc ----------- bash$ g++ timer.cc | ||||||||||||||||||||
| Comment by Marko Mäkelä [ 2020-07-22 ] | ||||||||||||||||||||
|
CyberBotX, can you please test the changes from pull request #1620 while also disabling the __builtin_readcyclecounter()? clang supports GCC-style inline assembler. | ||||||||||||||||||||
| Comment by Naram Qashat [ 2020-07-22 ] | ||||||||||||||||||||
|
I can confirm that the pull request's code, combined with adding to the preprocessor condition to ignore __builtin_readcyclecounter being available on aarch64, does allow MariaDB 10.5 to run successfully. A patch combining both can be found in the FreeBSD bug report I submitted to them as PR #248160. |