Details

    • 10.1.8-1, 10.1.8-3

    Description

      See http://bugs.mysql.com/bug.php?id=74832

      Patch has languished since Nov 2014.

      So I'm just pasting in my original description here:

      The code here has seemingly changed in 5.7.5 compared to previous versions. I'm talking about 5.7.5-m15 here.

      The purpose of ut_delay() is to spin for a while before InnoDB attempts to again acquire a mutex. Optimizations include (on x86) calling the pause instruction inside the spin loop and (on POWER) setting the thread priority to low for the duration of ut_delay.

      Here is the current (MySQL 5.7.5) implementation of ut_delay:
      ulint
      ut_delay(
      /*=====*/
              ulint   delay)  /*!< in: delay in microseconds on 100 MHz Pentium */
      {
              ulint   i, j;

              UT_LOW_PRIORITY_CPU();

              j = 0;

              for (i = 0; i < delay * 50; i++) {
                      j += i;
                      UT_RELAX_CPU();
              }

              if (ut_always_false) {
                      ut_always_false = (ibool) j;
              }

              UT_RESUME_PRIORITY_CPU();

              return(j);
      }

      There are a couple of problems with this code:
      1) ut_always_false could quite legitimately be compiled away by the compiler
      2) j is actually unneeded and if UT_RELAX_CPU() was not implemented, then the compiler could legitimately completely optimize away the loop

      But there's another problem that's a bit hidden....

      In ut0ut.h we have the following:
      #ifndef UNIV_HOTBACKUP
      # if defined(HAVE_PAUSE_INSTRUCTION)
         /* According to the gcc info page, asm volatile means that the
         instruction has important side-effects and must not be removed.
         Also asm volatile may trigger a memory barrier (spilling all registers
         to memory). */
      #  ifdef __SUNPRO_CC
      #   define UT_RELAX_CPU() asm ("pause" )
      #  else
      #   define UT_RELAX_CPU() __asm__ __volatile__ ("pause")
      #  endif /* __SUNPRO_CC */
      # elif defined(HAVE_FAKE_PAUSE_INSTRUCTION)
      #  define UT_RELAX_CPU() __asm__ __volatile__ ("rep; nop")
      # elif defined(HAVE_ATOMIC_BUILTINS)
      #  define UT_RELAX_CPU() do { \
           volatile lint volatile_var; \
           os_compare_and_swap_lint(&volatile_var, 0, 1); \
         } while (0)
      # elif defined(HAVE_WINDOWS_ATOMICS)
         /* In the Win32 API, the x86 PAUSE instruction is executed by calling
         the YieldProcessor macro defined in WinNT.h. It is a CPU architecture-
         independent way by using YieldProcessor. */
      #  define UT_RELAX_CPU() YieldProcessor()
      # else
      #  define UT_RELAX_CPU() ((void)0) /* avoid warning for an empty statement */
      # endif

      With HAVE_PAUSE_INSTRUCTION or HAVE_FAKE_PAUSE_INSTRUCTION defined (i.e. recent x86), you get the desired effect: there will be a pause instruction.

      However, if you only HAVE_ATOMIC_BUILTINS, then you get this:
      do { volatile lint volatile_var; os_compare_and_swap_lint(&volatile_var, 0, 1); } while (0)

      Which is anything but relaxing. So, on POWER, where we have atomics but no pause instruction, we get that instead of an empty statement.

      This likely affects other platforms too (e.g. SPARC, MIPS, ARM, ia64, m68k... basically everything that isn't x86).

      What we really want here is instead of that, just a compiler barrier, so that it knows that it cannot optimize away the loop.
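
For illustration, a minimal sketch of such a compiler barrier, assuming GCC or Clang inline-asm syntax. This is not the attached patch; the names COMPILER_BARRIER and spin_iterations are hypothetical. The empty asm statement emits no machine instruction, but the "memory" clobber and the volatile qualifier stop the compiler from deleting the loop or collapsing its iterations.

```c
/* Hypothetical macro name. GCC/Clang syntax: no instruction is emitted, but
   the compiler must assume memory may have changed, so it has to keep every
   trip through a loop that contains this statement. */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

/* Spin for `iterations` loop trips and return how many were executed.
   (Hypothetical helper, only here to show the barrier inside a spin loop.) */
unsigned long spin_iterations(unsigned long iterations)
{
	unsigned long i;

	for (i = 0; i < iterations; i++) {
		COMPILER_BARRIER();
	}

	return i;
}
```

Unlike the compare-and-swap variant, nothing in this loop touches memory or issues a hardware barrier, so on POWER there would be no sync/isync pair per iteration.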

      Back to ut_delay, if we look at the original PowerPC assembler for this, it's rather larger than it needs to be:

      0000000000000380 <._Z8ut_delaym>:
      380: fb e1 ff f8 std r31,-8(r1)
      384: f8 21 ff b1 stdu r1,-80(r1)
      388: 7c 3f 0b 78 mr r31,r1
      38c: 7c 21 0b 78 mr r1,r1
      390: 1d 03 00 32 mulli r8,r3,50
      394: 38 60 00 00 li r3,0
      398: 2f a8 00 00 cmpdi cr7,r8,0
      39c: 41 9e 00 44 beq cr7,3e0 <._Z8ut_delaym+0x60>
      3a0: 39 20 00 00 li r9,0
      3a4: 38 e0 00 01 li r7,1
      3a8: 60 00 00 00 nop
      3ac: 60 00 00 00 nop
      3b0: 7c 00 04 ac sync
      3b4: 7c 63 4a 14 add r3,r3,r9
      3b8: 38 df 00 30 addi r6,r31,48
      3bc: 7d 40 30 a8 ldarx r10,0,r6
      3c0: 2c 2a 00 00 cmpdi r10,0
      3c4: 40 82 00 0c bne 3d0 <._Z8ut_delaym+0x50>
      3c8: 7c e0 31 ad stdcx. r7,0,r6
      3cc: 40 a2 ff ec bne 3b8 <._Z8ut_delaym+0x38>
      3d0: 4c 00 01 2c isync
      3d4: 39 29 00 01 addi r9,r9,1
      3d8: 7f a9 40 40 cmpld cr7,r9,r8
      3dc: 40 9e ff d4 bne cr7,3b0 <._Z8ut_delaym+0x30>
      3e0: 3c c2 00 00 addis r6,r2,0
      3e4: e9 26 00 00 ld r9,0(r6)
      3e8: 2f a9 00 00 cmpdi cr7,r9,0
      3ec: 41 9e 00 08 beq cr7,3f4 <._Z8ut_delaym+0x74>
      3f0: f8 66 00 00 std r3,0(r6)
      3f4: 7c 42 13 78 mr r2,r2
      3f8: 38 3f 00 50 addi r1,r31,80
      3fc: eb e1 ff f8 ld r31,-8(r1)
      400: 4e 80 00 20 blr

      The bits that stare at me are the sync and isync instructions. We're executing memory barriers in there! In a loop! When we're meant to be relaxing!

      So, once I remove the buggy UT_RELAX_CPU() implementation and simplify ut_delay (patch attached), I end up with:

      0000000000000380 <._Z8ut_delaym>:
      380: fb e1 ff f8 std r31,-8(r1)
      384: f8 21 ff c1 stdu r1,-64(r1)
      388: 7c 3f 0b 78 mr r31,r1
      38c: 7c 21 0b 78 mr r1,r1
      390: 1c 63 00 32 mulli r3,r3,50
      394: 7c 69 03 a6 mtctr r3
      398: 2f a3 00 00 cmpdi cr7,r3,0
      39c: 41 9e 00 08 beq cr7,3a4 <._Z8ut_delaym+0x24>
      3a0: 42 00 00 00 bdnz 3a0 <._Z8ut_delaym+0x20>
      3a4: 7c 42 13 78 mr r2,r2
      3a8: 38 3f 00 40 addi r1,r31,64
      3ac: eb e1 ff f8 ld r31,-8(r1)
      3b0: 4e 80 00 20 blr
      3b4: 00 00 00 00 .long 0x0
      3b8: 00 09 00 00 .long 0x90000
      3bc: 80 01 00 00 lwz r0,0(r1)

      Which is exactly what we should be doing - we go into low priority (mr r1,r1), spin for a while, then resume normal priority (mr r2, r2) and return. We also avoid doing unnecessary work (which is good).
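
A minimal sketch of the shape of function that would produce a loop like the one above, assuming GCC or Clang. It is not the attached patch: all SKETCH_/sketch_ names are hypothetical stand-ins for UT_LOW_PRIORITY_CPU(), UT_RESUME_PRIORITY_CPU(), UT_RELAX_CPU() and ut_delay(), and returning the iteration count is only a convenience for checking the sketch.

```c
#if defined(__powerpc__) || defined(__powerpc64__)
/* "or 1,1,1" is what the disassembly shows as "mr r1,r1": it lowers the
   priority of this hardware thread so sibling SMT threads get more cycles. */
# define SKETCH_LOW_PRIORITY_CPU()    __asm__ __volatile__ ("or 1,1,1")
/* "or 2,2,2" is shown as "mr r2,r2": it restores normal priority. */
# define SKETCH_RESUME_PRIORITY_CPU() __asm__ __volatile__ ("or 2,2,2")
#else
/* Assumption: other platforms have no equivalent hint, so do nothing. */
# define SKETCH_LOW_PRIORITY_CPU()    ((void)0)
# define SKETCH_RESUME_PRIORITY_CPU() ((void)0)
#endif

/* Compiler-only barrier: keeps the loop alive without any instruction. */
#define SKETCH_RELAX_CPU() __asm__ __volatile__ ("" ::: "memory")

/* Spin for delay * 50 iterations at low priority; returns the trip count. */
unsigned long sketch_ut_delay(unsigned long delay)
{
	unsigned long i;

	SKETCH_LOW_PRIORITY_CPU();

	for (i = 0; i < delay * 50; i++) {
		SKETCH_RELAX_CPU();
	}

	SKETCH_RESUME_PRIORITY_CPU();

	return i;
}
```

On x86 one would presumably keep the pause instruction inside the loop rather than an empty barrier; the sketch only covers the case with no pause instruction.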

      This may also have a positive performance impact on x86, as the extra math and the work around it would no longer have to be done, and IIRC modern KVM on x86 will trap the pause instruction and attempt to schedule a vcpu that may hold the lock that we're spinning for.

      How to repeat:
      look at profiles, or disassemble code and examine it (like I've done above)

      Suggested fix:
      merge my patch (attached) that fixes this.

      Attachments

        1. psdoit
          0.6 kB
        2. sql
          314 kB

        Issue Links

          Activity

            svoj Sergey Vojtovich added a comment - jplindst, could you also share your thoughts on this?

            jplindst Jan Lindström (Inactive) added a comment - Very interesting to see that the different implementations spend very different amounts of time in that single function. However, the results are not clear: why does the middle one, which spends the least time there, have significantly lower performance? Does it hit the real mutex wait sooner? When you compare the first one and the last one, there is no significant difference, and I do not see any real reason to change from the original to the last one. Do you see similar results when the number of threads is varied?
            stewart-ibm Stewart Smith added a comment -

            (Sorry I've been a bit behind on this - a million other things going on).

            I'm totally in favour of instead just using pthread mutexes and fixing the problem there! I'm doubly in support of doing this upstream too.

            I think what's going on with the first patch and decreased performance is that we poke the cacheline with the mutex in it a lot more and we end up sleeping a lot sooner.

            I don't think anyone has looked at what's the best way to spin like this for a while, I'll poke some people and see if we can come up with something better, as we may want to poke into various in-depth simulators/modelling tools and even just ask the chip designers.

            danblack Daniel Black added a comment - - edited

            Ok. Have just added https://github.com/MariaDB/server/pull/168 containing two patches from MySQL-5.7, Svoj's patch and an implementation of a UT_RELAX_CPU that isn't harmful outside of the CPU.

            Running sysbench (sysbench --test=sysbench/tests/db/select.lua --oltp_tables_count=64 --oltp-table-size=500000 --mysql-socket=/tmp/mysql.sock --mysql-user=root --max-time=600 --max-requests=2000000000 --report-interval=20
            --db-driver=mysql --mysql-table-engine=innodb --num-threads=300) shows the following.

            Overall the patches change the perf split between ut_delay, mutex_spin_wait and _raw_spin_lock

            This is running on ppc64 with SMT=8.

            Patch point                                                  ut_delay %  mutex_spin_wait %  _raw_spin_lock %  Total %  reads/second
            10.1 HEAD - commit 9f5b285662ed8c13d6e87d8baf2f0ad4484d4a85       45.87              14.91             13.67    74.45     101276.09
            After two Oracle upstream patches                                  6.20              25.24             30.36    62.80      93680.53
            After Svoj's patch                                                 6.23              26.44             30.24    62.91       94697.7
            After mfspr patch                                                 16.28              23.98             23.14     64.4     112055.13

            I haven't done an x86 comparison yet, as the only real impact there is the removal of the maths, which the microbenchmarks show to be insignificant.

            So we've got a roughly 10.6% increase in reads/second (101276.09 to 112055.13) while using about 10% less CPU time.

            breakdown of perf on the HEAD:

            -   45.87%  mysqld   mysqld               [.] ut_delay
               - ut_delay
                  + 37.93% trx_start_low
                  + 32.58% trx_commit_low
                  + 29.25% read_view_open_now
            -   14.91%  mysqld   mysqld               [.] mutex_spin_wait
               - mutex_spin_wait
                  + 37.30% trx_start_low
                  + 32.69% trx_commit_low
                  + 29.87% read_view_open_now
            -   13.67%  mysqld   [kernel.kallsyms]    [k] _raw_spin_lock
               - _raw_spin_lock
                  - 55.96% futex_wait_setup
                     - 38.83% _raw_spin_lock
                        - 97.77% pthread_mutex_lock
                           - 74.38% os_mutex_enter
                              - 62.70% sync_array_wait_event
                                 - mutex_spin_wait
                                    + 40.84% trx_start_low
                                    + 30.21% trx_commit_low
                                    + 28.95% read_view_open_now
                              + 34.79% sync_array_reserve_cell
                              + 2.51% sync_array_free_cell
                           + 11.26% os_event_wait_low
                           + 11.06% os_event_reset
                           + 3.31% os_event_set
                        + 2.19% 0xe77c
                     + 30.96% 0x15f04
                     + 30.07% 0x15ea8
                  - 40.07% futex_wake
                     - 59.74% pthread_mutex_unlock
                        - 64.16% os_mutex_exit
                           - os_mutex_exit
                              + 69.87% sync_array_reserve_cell
                              + 28.20% sync_array_wait_event
                              + 1.93% sync_array_free_cell
                        - 16.41% os_event_wait_low
                             os_event_wait_low
                             sync_array_wait_event
                           + mutex_spin_wait
                        + 14.91% os_event_reset
                        + 4.52% os_event_set
                     + 39.84% _raw_spin_lock
                  + 1.78% default_wake_function
            

            After the patch series applied:

            -   23.98%  mysqld   mysqld               [.] mutex_spin_wait
               - mutex_spin_wait
                  + 41.04% trx_start_low
                  + 32.71% trx_commit_low
                  + 26.11% read_view_open_now
            -   23.14%  mysqld   [kernel.kallsyms]    [k] _raw_spin_lock
               - _raw_spin_lock
                  - 56.24% futex_wait_setup
                     - 35.00% _raw_spin_lock
                        - 96.43% pthread_mutex_lock
                           - 63.70% os_mutex_enter
                              + 64.97% sync_array_wait_event
                              + 32.22% sync_array_reserve_cell
                              + 2.82% sync_array_free_cell
                           + 15.79% os_event_wait_low
                           + 14.72% os_event_reset
                           + 5.80% os_event_set
                        + 3.47% 0xe77c
                     - 33.65% 0x15ea8
                        - 95.97% pthread_mutex_lock
                           - 42.58% os_mutex_enter
                              + 58.61% sync_array_reserve_cell
                              + 37.62% sync_array_wait_event
                              + 3.77% sync_array_free_cell
                           + 35.37% os_event_reset
                           + 15.94% os_event_wait_low
                           + 6.08% os_event_set
                        + 4.03% 0xe77c
                     - 31.28% 0x15f04
                        - 93.65% pthread_mutex_lock
                           + 47.48% os_event_reset
                           + 33.85% os_mutex_enter
                           + 10.71% os_event_wait_low
                           + 7.97% os_event_set
                        + 6.30% 0xe77c
                  - 39.93% futex_wake
                     + 61.60% pthread_mutex_unlock
                     + 37.78% _raw_spin_lock
                     + 0.57% 0xd990
                  + 1.34% default_wake_function
                  + 0.52% futex_requeue
            -   16.28%  mysqld   mysqld               [.] ut_delay
               - ut_delay
                  + 41.17% trx_start_low
                  + 32.44% trx_commit_low
                  + 26.17% read_view_open_now
            

            Using microbenchmarks modified from Stewart's (https://github.com/grooverdan/microbenchmarks/tree/master/ut_delay):

            The time difference of the MATHS component was minimal on both x86 and Power.

            Running on x86_64 Intel(R) Xeon(R) CPU L5640 @ 2.27GHz

            time ./mysql-5.7
            tb change (avg over 1000000): 1393
            0.61user 0.00system 0:00.61elapsed 100%CPU (0avgtext+0avgdata 1288maxresident)k
            0inputs+0outputs (0major+56minor)pagefaults 0swaps
            time ./nomath
            tb change (avg over 1000000): 1381
            0.60user 0.00system 0:00.61elapsed 100%CPU (0avgtext+0avgdata 1356maxresident)k
            0inputs+0outputs (0major+56minor)pagefaults 0swaps
            

            On Power:

            [root@zoom2par ut_delay]#         time ./mysql-5.7
            tb change (avg over 1000000): 581
             
            real    0m1.136s
            user    0m1.136s
            sys     0m0.000s
            [root@zoom2par ut_delay]#         time ./nomath
            tb change (avg over 1000000): 581
             
            real    0m1.137s
            user    0m1.136s
            sys     0m0.000s
            

            So ut_delay takes about twice as long on Power as on x86 (about 1.14s versus 0.61s).

            Removing the Power UT_DELAY implementation:

            [root@zoom2par ut_delay]#         time ./mysql-5.7
            tb change (avg over 1000000): 68
             
            real    0m0.135s
            user    0m0.134s
            sys     0m0.000s
            [root@zoom2par ut_delay]#         time ./nomath
            tb change (avg over 1000000): 68
             
            real    0m0.134s
            user    0m0.134s
            sys     0m0.000s
            

            Suggestions welcome.

            danblack Daniel Black added a comment - closed as per https://github.com/MariaDB/server/commit/9794cf2311c8fe86f05e046f0b96b46862219e03

            People

              jplindst Jan Lindström (Inactive)
              stewart-ibm Stewart Smith