Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23249

__builtin_readcyclecounter in include/my_rdtsc.h causes SIGILL on ARM

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.4.7, 10.4.8, 10.4.9, 10.4.10, 10.4.11, 10.4.12, 10.5.0, 10.5.1, 10.4.13, 10.5.4, 10.5.2, 10.5.3
    • 10.4.14, 10.5.5
    • Server
    • FreeBSD, aarch64

    Description

      The commit that handles MDEV-19845 makes a mention in the commit message about how it may lead to runtime failures on ARM systems. In trying to use both the 10.4 line and 10.5 line of MariaDB on a Raspberry Pi 3 running FreeBSD, which uses an ARM Cortex-A53 processor, I am running into just that runtime failure. As it says in the subject line, I get a SIGILL when MariaDB tries to call __builtin_readcyclecounter. The easiest fix for this would be to add a preprocessor condition to the check for the builtin's availability for it to not be used when __arm__ is defined. And example of this can be found here.

      Attachments

        Issue Links

          Activity

            CyberBotX Naram Qashat created issue -
            CyberBotX Naram Qashat made changes -
            Field Original Value New Value
            CyberBotX Naram Qashat added a comment - I forgot to mention which commit did this: https://github.com/MariaDB/server/commit/5e929ee8a0487f8929386031e84b1884f503eb48

            If I understand correctly, FreeBSD is using the clang compiler. The primitive __has_builtin should only be defined on clang or possibly GCC 10.

            How exactly are the executables being compiled? Can you provide a link to the compilation log? Is the -march switch compatible with the Raspberry Pi 3? In my experience so far, compiler builtin functions in gcc or clang should never generate code that would be incompatible with the specified ISA.

            Also, please show the generated code after your suggested patch. Would my_timer_cycles() return gethrtime(), or would it return the constant 0?

            There is a related pull request #1620, which provides an ARMv8 implementation for my_timer_cycles() on GCC.

            marko Marko Mäkelä added a comment - If I understand correctly, FreeBSD is using the clang compiler. The primitive __has_builtin should only be defined on clang or possibly GCC 10. How exactly are the executables being compiled? Can you provide a link to the compilation log? Is the -march switch compatible with the Raspberry Pi 3? In my experience so far, compiler builtin functions in gcc or clang should never generate code that would be incompatible with the specified ISA. Also, please show the generated code after your suggested patch. Would my_timer_cycles() return gethrtime() , or would it return the constant 0 ? There is a related pull request #1620 , which provides an ARMv8 implementation for my_timer_cycles() on GCC.
            CyberBotX Naram Qashat added a comment - - edited

            FreeBSD does include use clang by default nowadays.

            I use FreeBSD's poudriere to build within a jail. The latest compilation log I have is this one: https://poudriere.cyberbotx.com:8766/data/local_aarch64-default/2020-07-21_23h45m55s/logs/mariadb105-server-10.5.4_1.log (I was building with debug symbols to narrow down where the crash occurred.) The -march flag is not compatible with the Raspberry Pi 3, but the -mcpu flag is, although I was getting the crash even when it wasn't in use. (I am using it set to cortex-a53 as that is the processor that the Raspberry Pi 3 uses.)

            I believe the reason it gets included is that it is a valid language extension for the Cortex-A53, but from talking to others that work with FreeBSD, it does indeed require special permissions to be used from userspace. I believe by default it can only be used from the kernel, at least under FreeBSD and possibly under Linux. I have seen that the Android application Termux, which is a terminal emulator and Linux environment app, has a package for MariaDB and includes a similar patch to the one in the initial report, albeit using __ANDROID__ instead of __arm__.

            I don't have an easy way to get to the generated code within the build jail, but asking around, it seems that FreeBSD does not have the gethrtime() function, so it would return 0.

            If it also helps, this is a gdb run before the patch that I was asked to generate by another user: https://pastebin.com/u3GAKw8r This is what led us to seeing that the __builtin_readcyclecounter call was causing the SIGILL.

            CyberBotX Naram Qashat added a comment - - edited FreeBSD does include use clang by default nowadays. I use FreeBSD's poudriere to build within a jail. The latest compilation log I have is this one: https://poudriere.cyberbotx.com:8766/data/local_aarch64-default/2020-07-21_23h45m55s/logs/mariadb105-server-10.5.4_1.log (I was building with debug symbols to narrow down where the crash occurred.) The -march flag is not compatible with the Raspberry Pi 3, but the -mcpu flag is, although I was getting the crash even when it wasn't in use. (I am using it set to cortex-a53 as that is the processor that the Raspberry Pi 3 uses.) I believe the reason it gets included is that it is a valid language extension for the Cortex-A53, but from talking to others that work with FreeBSD, it does indeed require special permissions to be used from userspace. I believe by default it can only be used from the kernel, at least under FreeBSD and possibly under Linux. I have seen that the Android application Termux, which is a terminal emulator and Linux environment app, has a package for MariaDB and includes a similar patch to the one in the initial report, albeit using __ANDROID__ instead of __arm__ . I don't have an easy way to get to the generated code within the build jail, but asking around, it seems that FreeBSD does not have the gethrtime() function, so it would return 0. If it also helps, this is a gdb run before the patch that I was asked to generate by another user: https://pastebin.com/u3GAKw8r This is what led us to seeing that the __builtin_readcyclecounter call was causing the SIGILL.
            CyberBotX Naram Qashat made changes -
            Description The commit that handles [MDEV-19845|https://jira.mariadb.org/browse/MDEV-19845] makes a mention in the commit message about how it may lead to runtime failures on ARM systems. In trying to use both the 10.4 line and 10.5 line of MariaDB on a Raspberry Pi 3 running FreeBSD, which uses an ARM Cortex-A53 processor, I am running into just that runtime failure. As it says in the subject line, I get a SIGILL when MariaDB tries to call `__builtin_readcyclecounter`. The easiest fix for this would be to add a preprocessor condition to the check for the builtin's availability for it to not be used when `__arm__` is defined. And example of this can be found [here|https://raw.githubusercontent.com/its-pointless/its-pointless.github.io/master/my_rdtsc.h.patch]. The commit that handles [MDEV-19845|https://jira.mariadb.org/browse/MDEV-19845] makes a mention in the commit message about how it may lead to runtime failures on ARM systems. In trying to use both the 10.4 line and 10.5 line of MariaDB on a Raspberry Pi 3 running FreeBSD, which uses an ARM Cortex-A53 processor, I am running into just that runtime failure. As it says in the subject line, I get a SIGILL when MariaDB tries to call {{__builtin_readcyclecounter}}. The easiest fix for this would be to add a preprocessor condition to the check for the builtin's availability for it to not be used when {{__arm__}} is defined. And example of this can be found [here|https://raw.githubusercontent.com/its-pointless/its-pointless.github.io/master/my_rdtsc.h.patch].
            CyberBotX Naram Qashat made changes -
            Description The commit that handles [MDEV-19845|https://jira.mariadb.org/browse/MDEV-19845] makes a mention in the commit message about how it may lead to runtime failures on ARM systems. In trying to use both the 10.4 line and 10.5 line of MariaDB on a Raspberry Pi 3 running FreeBSD, which uses an ARM Cortex-A53 processor, I am running into just that runtime failure. As it says in the subject line, I get a SIGILL when MariaDB tries to call {{__builtin_readcyclecounter}}. The easiest fix for this would be to add a preprocessor condition to the check for the builtin's availability for it to not be used when {{__arm__}} is defined. And example of this can be found [here|https://raw.githubusercontent.com/its-pointless/its-pointless.github.io/master/my_rdtsc.h.patch]. The commit that handles [MDEV-19845|https://jira.mariadb.org/browse/MDEV-19845] makes a mention in the commit message about how it may lead to runtime failures on ARM systems. In trying to use both the 10.4 line and 10.5 line of MariaDB on a Raspberry Pi 3 running FreeBSD, which uses an ARM Cortex-A53 processor, I am running into just that runtime failure. As it says in the subject line, I get a SIGILL when MariaDB tries to call {{\_\_builtin\_readcyclecounter}}. The easiest fix for this would be to add a preprocessor condition to the check for the builtin's availability for it to not be used when {{\_\_arm\_\_}} is defined. And example of this can be found [here|https://raw.githubusercontent.com/its-pointless/its-pointless.github.io/master/my_rdtsc.h.patch].
            krunalbauskar Krunal Bauskar added a comment - - edited

            I have Ubuntu 18.04 running on arm64 (armv8.2 kunpeng 920) bit and I was able to run the said function from user-space. So looks like it is limitation or issue with either free-bsd or free-bsd on cortex-53.

            bash$ cat timer.cc
            #include <iostream>
            using namespace std;

            int main()

            { cout << __builtin_readcyclecounter() << endl; }

            bash$ clang++-8 timer.cc
            bash$ ./a.out
            0
            bash$ vi atomic^C
            bash$ vi timer.s
            bash$ grep "mrs" timer.s
            mrs x1, PMCCNTR_EL0
            bash$


            Value is always returned as 0 so not sure if it fetching the valid register.

            alternative is to use that will work on gnu too and being used by mysql also proposed in PR#1620.

            unsigned long long result;
            __asm __volatile("mrs %0, CNTVCT_EL0" : "=&r" (result));

            krunalbauskar Krunal Bauskar added a comment - - edited I have Ubuntu 18.04 running on arm64 (armv8.2 kunpeng 920) bit and I was able to run the said function from user-space. So looks like it is limitation or issue with either free-bsd or free-bsd on cortex-53. bash$ cat timer.cc #include <iostream> using namespace std; int main() { cout << __builtin_readcyclecounter() << endl; } bash$ clang++-8 timer.cc bash$ ./a.out 0 bash$ vi atomic^C bash$ vi timer.s bash$ grep "mrs" timer.s mrs x1, PMCCNTR_EL0 bash$ Value is always returned as 0 so not sure if it fetching the valid register. alternative is to use that will work on gnu too and being used by mysql also proposed in PR#1620. unsigned long long result; __asm __volatile("mrs %0, CNTVCT_EL0" : "=&r" (result));

            I see that my_timer_cycles() is frequently called by the function get_rnd_value():

            static inline size_t
            get_rnd_value()
            {
            	size_t c = static_cast<size_t>(my_timer_cycles());
             
            	if (c != 0) {
            		return c;
            	}
             
            	/* We may go here if my_timer_cycles() returns 0,
            	so we have to have the plan B for the counter. */
            #if !defined(_WIN32)
            	return (size_t)os_thread_get_curr_id();
            #else
            	LARGE_INTEGER cnt;
            	QueryPerformanceCounter(&cnt);
             
            	return static_cast<size_t>(cnt.QuadPart);
            #endif /* !_WIN32 */
            }
            

            This function is used in sync_array_get() and in ib_counter_t::add() (which could be related to MDEV-21212).

            Other callers of my_timer_cycles() outside server initialization are in:

            • Exec_time_tracker::start_tracking() and Exec_time_tracker::cycles_stop_tracking()
            • Some code in storage/perfschema/pfs_timer.cc that might choose a ‘working’ timer at startup and then keep using it.

            It seems that returning 0 from my_timer_cycles() could cause reduced InnoDB scalability and useless results being returned by Exec_time_tracker.

            krunalbauskar, can you check if the InnoDB code actually ‘works’ on the ARMv8 hardware that you have access to, possibly after applying pull request #1620?

            marko Marko Mäkelä added a comment - I see that my_timer_cycles() is frequently called by the function get_rnd_value() : static inline size_t get_rnd_value() { size_t c = static_cast < size_t >(my_timer_cycles());   if (c != 0) { return c; }   /* We may go here if my_timer_cycles() returns 0, so we have to have the plan B for the counter. */ #if !defined(_WIN32) return ( size_t )os_thread_get_curr_id(); #else LARGE_INTEGER cnt; QueryPerformanceCounter(&cnt);   return static_cast < size_t >(cnt.QuadPart); #endif /* !_WIN32 */ } This function is used in sync_array_get() and in ib_counter_t::add() (which could be related to MDEV-21212 ). Other callers of my_timer_cycles() outside server initialization are in: Exec_time_tracker::start_tracking() and Exec_time_tracker::cycles_stop_tracking() Some code in storage/perfschema/pfs_timer.cc that might choose a ‘working’ timer at startup and then keep using it. It seems that returning 0 from my_timer_cycles() could cause reduced InnoDB scalability and useless results being returned by Exec_time_tracker . krunalbauskar , can you check if the InnoDB code actually ‘works’ on the ARMv8 hardware that you have access to, possibly after applying pull request #1620 ?
            marko Marko Mäkelä made changes -

            The existing approach of __builtin_readcyclecounter always continues to return 0 on arm64 vs the proposed asm approach to use cntvct_el0 return non-zero values (counter).

            this is a good reason to replace __builtin_readcyclecounter with new proposed asm assembly.

            bash$ cat timer.cc
            #include <iostream>
            using namespace std;

            int main()

            { unsigned long long result; __asm __volatile("mrs %0, CNTVCT_EL0" : "=&r" (result)); cout << result << endl; cout << __builtin_readcyclecounter() << endl; }

            bash$ clang++-8 timer.cc
            bash$ ./a.out
            829795946130603
            0
            bash$ ./a.out
            829796087735762
            0
            bash$ ./a.out
            829796206079031
            0
            bash$ ./a.out
            829796414853453
            0

            -----------
            with gcc (gcc doesn't support __builtin_readcyclecounter it is clang specific)

            bash$ g++ timer.cc
            bash$ ./a.out
            829809355536080
            bash$ ./a.out
            829809446827453
            bash$ ./a.out
            829809557542989

            krunalbauskar Krunal Bauskar added a comment - The existing approach of __builtin_readcyclecounter always continues to return 0 on arm64 vs the proposed asm approach to use cntvct_el0 return non-zero values (counter). this is a good reason to replace __builtin_readcyclecounter with new proposed asm assembly. bash$ cat timer.cc #include <iostream> using namespace std; int main() { unsigned long long result; __asm __volatile("mrs %0, CNTVCT_EL0" : "=&r" (result)); cout << result << endl; cout << __builtin_readcyclecounter() << endl; } bash$ clang++-8 timer.cc bash$ ./a.out 829795946130603 0 bash$ ./a.out 829796087735762 0 bash$ ./a.out 829796206079031 0 bash$ ./a.out 829796414853453 0 ----------- with gcc (gcc doesn't support __builtin_readcyclecounter it is clang specific) bash$ g++ timer.cc bash$ ./a.out 829809355536080 bash$ ./a.out 829809446827453 bash$ ./a.out 829809557542989

            CyberBotX, can you please test the changes from pull request #1620 while also disabling the __builtin_readcyclecounter()? clang supports GCC-style inline assembler.

            marko Marko Mäkelä added a comment - CyberBotX , can you please test the changes from pull request #1620 while also disabling the __builtin_readcyclecounter() ? clang supports GCC-style inline assembler.
            CyberBotX Naram Qashat added a comment -

            I can confirm that the pull request's code, combined with adding to the preprocessor condition to ignore __builtin_readcyclecounter being available on aarch64, does allow MariaDB 10.5 to run successfully. A patch combining both can be found in the FreeBSD bug report I submitted to them as PR #248160.

            CyberBotX Naram Qashat added a comment - I can confirm that the pull request's code, combined with adding to the preprocessor condition to ignore __builtin_readcyclecounter being available on aarch64, does allow MariaDB 10.5 to run successfully. A patch combining both can be found in the FreeBSD bug report I submitted to them as PR #248160 .
            marko Marko Mäkelä made changes -
            Fix Version/s 10.4.14 [ 24305 ]
            Fix Version/s 10.5.5 [ 24423 ]
            Assignee Marko Mäkelä [ marko ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]
            Sam Sahil Mamgain made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 111543 ] MariaDB v4 [ 158129 ]
            danblack Daniel Black added a comment -

            update on the clang compilation of __buildin_readcyclecounter

            Clang Version Mapping
            16.0 .0- 20.0.0 (current trunk) CNTVCT_EL0
            15.0.0 xzr (0 value register - glad missed that one)
            < 14.0.0 PMCCNTR_EL0
            danblack Daniel Black added a comment - update on the clang compilation of __buildin_readcyclecounter Clang Version Mapping 16.0 .0- 20.0.0 (current trunk) CNTVCT_EL0 15.0.0 xzr (0 value register - glad missed that one) < 14.0.0 PMCCNTR_EL0

            People

              marko Marko Mäkelä
              CyberBotX Naram Qashat
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.