  1. MariaDB Server
  2. MDEV-23475

InnoDB performance regression for write-heavy workloads

Details

    Description

      Write-heavy benchmarks show a regression in both latency and throughput in the latest releases of the 10.2, 10.3 and 10.4 series.

      Example:

      ----------------------------------------------------------------------------
      Test 't_oltp-innodb-multi' - sysbench OLTP read/write
      32 InnoDB tables with 10 million rows total
      numbers are queries per second
       
      #thread count           1       8       16      32      64      128     256
      mariadb-10.2.32         5262.8  28886   53535   95712   158049  189909  192668
      mariadb-10.2.33         4866.7  22651   42352   78750   133580  173461  191125
       
      mariadb-10.3.23         5126.4  28716   52785   95466   156936  185473  187136
      mariadb-10.3.24         4772.0  22959   42713   80442   133492  171252  186200
       
      mariadb-10.4.13         5015.7  28263   52033   93771   154690  178179  178076
      mariadb-10.4.14         4980.9  23496   43145   79795   128977  166470  179285
      
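      To put the table in relative terms, the following is a minimal sketch that computes the percentage change from 10.2.32 to 10.2.33 for each thread count. The figures are copied from the 10.2 rows above; the names `percent_change`, `print_regression`, `kThreads`, `kGood` and `kBad` are made up for this illustration and are not part of the benchmark tooling. By this arithmetic the drop is roughly 15 to 22 percent between 8 and 64 threads and under 1 percent at 256 threads.

      ```cpp
      #include <cstddef>
      #include <cstdio>

      // Relative change in percent from a baseline value to a new value.
      // A negative result means the newer release is slower.
      double percent_change(double baseline, double current) {
          return (current - baseline) / baseline * 100.0;
      }

      // Queries per second, copied from the 10.2 rows of the table above.
      constexpr int    kThreads[] = {1, 8, 16, 32, 64, 128, 256};
      constexpr double kGood[]    = {5262.8, 28886, 53535, 95712, 158049, 189909, 192668};  // 10.2.32
      constexpr double kBad[]     = {4866.7, 22651, 42352, 78750, 133580, 173461, 191125};  // 10.2.33

      // Print one line per thread count with the signed percentage change.
      void print_regression() {
          const std::size_t n = sizeof(kThreads) / sizeof(kThreads[0]);
          for (std::size_t i = 0; i < n; ++i) {
              std::printf("%3d threads: %+.1f%%\n",
                          kThreads[i], percent_change(kGood[i], kBad[i]));
          }
      }
      ```

      Calling `print_regression()` from a `main` function prints the change for each thread count; swapping in the 10.3 or 10.4 rows gives the same comparison for those series.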

      I examined this more closely for 10.2 by bisecting the commits between 10.2.32 and 10.2.33. The first bad commit is:

      commit fe39d02f51b96536dccca7ff89faf05e13548877
      Author: Thirunarayanan Balathandayuthapani <thiru@mariadb.com>
      Date:   Thu Jul 23 16:23:20 2020 +0530
       
       MDEV-20638 Remove the deadcode from srv_master_thread() and srv_active_wake_master_thread_low()
      

      Attachments

        1. 10.2.bad.svg (700 kB)
        2. 10.2.good.svg (684 kB)
        3. 10.2.pdf (21 kB)
        4. 10.2-2nd.pdf (22 kB)
        5. 10.2-binlog.png (11 kB)
        6. 10.2-binlog4.png (9 kB)
        7. 10.2-binlog-revert.png (11 kB)
        8. 10.2-revert.pdf (23 kB)
        9. 10.3.pdf (20 kB)
        10. 10.4.pdf (21 kB)
        11. 10.5-2nd.pdf (20 kB)
        12. 10.5-4th.pdf (21 kB)
        13. MDEV-23475.png (6 kB)
        14. MDEV-23475-final.pdf (86 kB)

          Activity

            axel Axel Schwenke added a comment -

            For 10.3 the situation is very much like for 10.2. Patch 2 restores performance for all tests except t_writes-binlog-multi. If we want to fix performance for that scenario, we would have to roll back MDEV-20638.

            10.3.pdf

            axel Axel Schwenke added a comment -

            10.4 is again very similar to 10.2 and 10.3. In addition, patches 1 and 2 both fix the 90:10 performance only partly. I suggest rolling back MDEV-20638 for 10.4 as well.

            10.4.pdf


            marko Marko Mäkelä added a comment -

            I reverted MDEV-20638 in 10.2 and merged that up to 10.4. In 10.5, instead of reverting the changes, I applied a simple refinement that was called Patch 2:

            diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc
            index 7c61ad9b45b..7544f120284 100644
            --- a/storage/innobase/row/row0mysql.cc
            +++ b/storage/innobase/row/row0mysql.cc
            @@ -3820,8 +3820,6 @@ row_drop_table_for_mysql(
             
             	trx->op_info = "";
             
            -	srv_inc_activity_count();
            -
             	DBUG_RETURN(err);
             }
             
            diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
            index 0dbd985b6c3..117ff64761c 100644
            --- a/storage/innobase/trx/trx0trx.cc
            +++ b/storage/innobase/trx/trx0trx.cc
            @@ -1537,6 +1537,7 @@ trx_flush_log_if_needed_low(
             	case 1:
             		/* Write the log and optionally flush it to disk */
             		log_write_up_to(lsn, flush);
            +		srv_inc_activity_count();
             		return;
             	case 0:
             		/* Do nothing */

            axel Axel Schwenke added a comment -

            I tested the final fix for this issue on 10.2 through 10.5. The fix looks very good for most test cases. In 10.2 there is still a slight regression for t_writes-binlog-multi, but all in all the problem is solved. Details: MDEV-23475-final.pdf


            jaehyun1148.lee jaehyun1148.lee added a comment -

            Could that bug occur in 10.3.23?

            People

              marko Marko Mäkelä
              axel Axel Schwenke