Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10813

Clean-up InnoDB atomics, memory barriers and mutexes

Details

    • 10.2.3-1, 10.2.3-2

    Description

      InnoDB atomics implementation is getting worse: 5.7 implementation is not maintainable. We should remove it and use atomics provided by server instead.

      InnoDB is now full of meaningless memory barriers, these should be removed along with memory barriers implementation.

      We should restore fixes for InnoDB rwlocks.

      Mutexes should be cleaned up so that their code is readable. Also we should ensure that mutexes issue proper memory barriers.

      NOTE: there're 4 mutex implementations in InnoDB: futex base, spin lock (never used), pthread/critical_section based, event based (default). We should consider removing 3 implementations and leaving only pthread/critical_section based solution.

      Attachments

        Issue Links

          Activity

            This failed on OSX, too http://buildbot.askmonty.org/buildbot/builders/labrador/builds/8036/steps/compile/logs/stdio

            /private/var/lib/buildslave/maria-slave/labrador/build/storage/innobase/include/sync0rw.ic:341: error: invalid conversion from '_opaque_pthread_t*' to 'int64'
            /private/var/lib/buildslave/maria-slave/labrador/build/storage/innobase/include/sync0rw.ic:341: error: initializing argument 2 of 'void my_atomic_store64(volatile int64*, int64)

            wlad Vladislav Vaintroub added a comment - This failed on OSX, too http://buildbot.askmonty.org/buildbot/builders/labrador/builds/8036/steps/compile/logs/stdio /private/var/lib/buildslave/maria-slave/labrador/build/storage/innobase/include/sync0rw.ic:341: error: invalid conversion from '_opaque_pthread_t*' to 'int64' /private/var/lib/buildslave/maria-slave/labrador/build/storage/innobase/include/sync0rw.ic:341: error: initializing argument 2 of 'void my_atomic_store64(volatile int64*, int64)
            serg Sergei Golubchik added a comment - - edited

            I suppose bb-10.2-mdev10813 branch (at commit d2a4536be9) is good for review now.

            Looking at the diff

            serg Sergei Golubchik added a comment - - edited I suppose bb-10.2-mdev10813 branch (at commit d2a4536be9 ) is good for review now. Looking at the diff
            axel Axel Schwenke added a comment -

            I did a round of sysbench OLTP. On our 16 core / 32 thread benchmark machine the latest changes in bb-10.2-mdev10813 make nearly no difference. Here are numbers (queries per second - bigger is better)

            # data set 01 -> mdev-10813 head
            # data set 02 -> before (commit 5058ced)
             
            # read only
            #thd    01      02
            1       9676.6  9977.2
            2       21338   21649
            4       39786   40760
            8       71194   75239
            16      137518  136853
            32      195938  196105
            64      191771  191567
            128     188573  188105
            256     191704  191259
             
            # read/write (10% writes)
            #thd    01      02
            1       6530.6  6663.4
            2       13575   13533
            4       24298   24414
            8       45236   45446
            16      86590   87654
            32      148783  154582
            64      194451  196712
            128     206457  206319
            256     203748  203821
             
            # read/write (22% writes)
            #thd    01      02
            1       4238.5  4221.1
            2       8492.5  8608.8
            4       14292   14354
            8       26833   27170
            16      51832   52092
            32      104019  104240
            64      151609  149283
            128     168852  169022
            256     170582  170851
             
            # read/write (80% writes)
            #thd    01      02
            1       8890.4  9261.9
            2       17155   17560
            4       32284   32438
            8       60511   60207
            16      112681  115927
            32      183503  186966
            64      219418  221771
            128     227306  229264
            256     225811  215256
            

            On the 64 core / 128 thread machine it looks a bit different. Benchmark is still running, I'll add numbers when they arrive.

            axel Axel Schwenke added a comment - I did a round of sysbench OLTP. On our 16 core / 32 thread benchmark machine the latest changes in bb-10.2-mdev10813 make nearly no difference. Here are numbers (queries per second - bigger is better) # data set 01 -> mdev-10813 head # data set 02 -> before (commit 5058ced)   # read only #thd 01 02 1 9676.6 9977.2 2 21338 21649 4 39786 40760 8 71194 75239 16 137518 136853 32 195938 196105 64 191771 191567 128 188573 188105 256 191704 191259   # read/write (10% writes) #thd 01 02 1 6530.6 6663.4 2 13575 13533 4 24298 24414 8 45236 45446 16 86590 87654 32 148783 154582 64 194451 196712 128 206457 206319 256 203748 203821   # read/write (22% writes) #thd 01 02 1 4238.5 4221.1 2 8492.5 8608.8 4 14292 14354 8 26833 27170 16 51832 52092 32 104019 104240 64 151609 149283 128 168852 169022 256 170582 170851   # read/write (80% writes) #thd 01 02 1 8890.4 9261.9 2 17155 17560 4 32284 32438 8 60511 60207 16 112681 115927 32 183503 186966 64 219418 221771 128 227306 229264 256 225811 215256 On the 64 core / 128 thread machine it looks a bit different. Benchmark is still running, I'll add numbers when they arrive.
            axel Axel Schwenke added a comment -

            Here the numbers (again queries per second) from the 64 core / 128 thread machine. Read-only is a bit faster for the head of the tree. All in all performance on this hardware is rather poor, probably caused by contention in the server layers above InnoDB.

            # data set 01 -> mdev-10813 head
            # data set 02 -> before (commit 5058ced)
             
            # read only
            #thd    01      02
            1       8543.9  8550.4
            2       16886   16951
            4       32710   31471
            8       60611   59759
            16      84002   82922
            32      61391   60169
            64      38369   36371
            128     39930   39849
            256     20656   20974
            512     23212   23343
            1024    78925   73784
             
            # read/write (10% writes)
            #thd    01      02
            1       5987.6  5849.4
            2       12810   12402
            4       24669   24228
            8       45750   45463
            16      67585   67701
            32      67475   66547
            64      52460   50705
            128     28924   30246
            256     17840   17841
            512     17094   17108
            1024    22048   20821
             
            # read/write (22% writes)
            #thd    01      02
            1       4166.1  4110.3
            2       9475.2  9261.5
            4       16945   16598
            8       31238   30851
            16      53908   54055
            32      64890   63974
            64      49891   50204
            128     23642   23546
            256     16068   15895
            512     15360   15484
            1024    15319   15389
             
            # read/write (80% writes)
            #thd    01      02
            1       8007.6  7786.9
            2       15966   15724
            4       25405   25198
            8       43806   43682
            16      69070   67875
            32      81656   81641
            64      77996   78498
            128     80387   77480
            256     64941   63605
            512     57655   58633
            1024    56908   56282
            

            axel Axel Schwenke added a comment - Here the numbers (again queries per second) from the 64 core / 128 thread machine. Read-only is a bit faster for the head of the tree. All in all performance on this hardware is rather poor, probably caused by contention in the server layers above InnoDB. # data set 01 -> mdev-10813 head # data set 02 -> before (commit 5058ced)   # read only #thd 01 02 1 8543.9 8550.4 2 16886 16951 4 32710 31471 8 60611 59759 16 84002 82922 32 61391 60169 64 38369 36371 128 39930 39849 256 20656 20974 512 23212 23343 1024 78925 73784   # read/write (10% writes) #thd 01 02 1 5987.6 5849.4 2 12810 12402 4 24669 24228 8 45750 45463 16 67585 67701 32 67475 66547 64 52460 50705 128 28924 30246 256 17840 17841 512 17094 17108 1024 22048 20821   # read/write (22% writes) #thd 01 02 1 4166.1 4110.3 2 9475.2 9261.5 4 16945 16598 8 31238 30851 16 53908 54055 32 64890 63974 64 49891 50204 128 23642 23546 256 16068 15895 512 15360 15484 1024 15319 15389   # read/write (80% writes) #thd 01 02 1 8007.6 7786.9 2 15966 15724 4 25405 25198 8 43806 43682 16 69070 67875 32 81656 81641 64 77996 78498 128 80387 77480 256 64941 63605 512 57655 58633 1024 56908 56282

            According to the above there seem to be no performance/scalability regression introduced by relevant patches.

            svoj Sergey Vojtovich added a comment - According to the above there seem to be no performance/scalability regression introduced by relevant patches.

            People

              svoj Sergey Vojtovich
              svoj Sergey Vojtovich
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.