Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14455

rocksdb.2pc_group_commit failed in buildbot

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/kvm-deb-xenial-amd64/builds/2392/steps/mtr/logs/stdio

      rocksdb.2pc_group_commit                 w1 [ fail ]
              Test ended at 2017-11-20 11:53:01
       
      CURRENT_TEST: rocksdb.2pc_group_commit
      --- /usr/share/mysql/mysql-test/plugin/rocksdb/rocksdb/r/2pc_group_commit.result	2017-11-20 09:40:39.000000000 -0500
      +++ /dev/shm/var/1/log/2pc_group_commit.reject	2017-11-20 11:53:00.808592240 -0500
      @@ -40,7 +40,7 @@
       select IF(variable_value - @b2 between 100 and 5000, 'OK', 'FAIL') as Binlog_group_commits 
       from information_schema.global_status where variable_name='Binlog_group_commits';
       Binlog_group_commits
      -OK
      +FAIL
       select IF(variable_value - @b3 between 1 and 9000, 'OK', 'FAIL')
       from information_schema.global_status where variable_name='Rocksdb_wal_synced';
       IF(variable_value - @b3 between 1 and 9000, 'OK', 'FAIL')
       
      mysqltest: Result length mismatch
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - And still fails http://buildbot.askmonty.org/buildbot/builders/kvm-deb-trusty-amd64/builds/10324/steps/mtr/logs/stdio

            https://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/999/steps/mtr/logs/stdio

            Now with numbers:

            rocksdb.2pc_group_commit 'write_committed' w2 [ fail ]
                    Test ended at 2018-10-13 23:20:34
             
            CURRENT_TEST: rocksdb.2pc_group_commit
            --- /usr/share/mysql-test/plugin/rocksdb/rocksdb/r/2pc_group_commit.result	2018-10-13 21:56:11.000000000 +0000
            +++ /dev/shm/var/2/log/2pc_group_commit.reject	2018-10-13 23:20:34.235742808 +0000
            @@ -44,7 +44,7 @@
             select IF(variable_value - @b3 between 1 and 9000, 'OK', variable_value - @b3)
             from information_schema.global_status where variable_name='Rocksdb_wal_synced';
             IF(variable_value - @b3 between 1 and 9000, 'OK', variable_value - @b3)
            -OK
            +9434
             ## 
             # 2PC enabled, MyRocks durability disabled, single thread
             ## 
            

            psergei Sergei Petrunia added a comment - https://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/999/steps/mtr/logs/stdio Now with numbers: rocksdb.2pc_group_commit 'write_committed' w2 [ fail ] Test ended at 2018-10-13 23:20:34   CURRENT_TEST: rocksdb.2pc_group_commit --- /usr/share/mysql-test/plugin/rocksdb/rocksdb/r/2pc_group_commit.result 2018-10-13 21:56:11.000000000 +0000 +++ /dev/shm/var/2/log/2pc_group_commit.reject 2018-10-13 23:20:34.235742808 +0000 @@ -44,7 +44,7 @@ select IF(variable_value - @b3 between 1 and 9000, 'OK', variable_value - @b3) from information_schema.global_status where variable_name='Rocksdb_wal_synced'; IF(variable_value - @b3 between 1 and 9000, 'OK', variable_value - @b3) -OK +9434 ## # 2PC enabled, MyRocks durability disabled, single thread ##

            One can use settings like

            binlog_commit_wait_count=30
            binlog_commit_wait_usec=500K
            

            and this will produce more SQL-level commit groups. That is, Binlog_group_commits will show less groups (=more grouping), even when running on a RAM disk.

            But these settings have no effect on Rocksdb_wal_synced.

            psergei Sergei Petrunia added a comment - One can use settings like binlog_commit_wait_count=30 binlog_commit_wait_usec=500K and this will produce more SQL-level commit groups. That is, Binlog_group_commits will show less groups (=more grouping), even when running on a RAM disk. But these settings have no effect on Rocksdb_wal_synced .
            psergei Sergei Petrunia added a comment - - edited

            Added a code to hook to test sync point
            "DBImpl::WriteImpl:BeforeLeaderEnters" with my_sleep(500) (this is 500
            milliseconds).

            With rocksdb_write_policy=write_prepared, this achieves the desired effect:
            For 10K queries, 50 threads, and ram disk I get ~750 Binlog_group_commits
            and 1850 Rocksdb_wal_syncs.

            Doesn't have much effect for rocksdb_write_policy=write_committed, though.

            psergei Sergei Petrunia added a comment - - edited Added a code to hook to test sync point "DBImpl::WriteImpl:BeforeLeaderEnters" with my_sleep(500) (this is 500 milliseconds). With rocksdb_write_policy=write_prepared, this achieves the desired effect: For 10K queries, 50 threads, and ram disk I get ~750 Binlog_group_commits and 1850 Rocksdb_wal_syncs. Doesn't have much effect for rocksdb_write_policy=write_committed, though.

            The upstream doesn't watch rocksdb_wal_syncs. They check rocksdb_wal_group_syncs instead.

            rocksdb_wal_group_syncs is a MyRocks-level (not RocksDB's) counter, it counts rocksdb_flush_wal() calls that the SQL layer's group commit makes.

            psergei Sergei Petrunia added a comment - The upstream doesn't watch rocksdb_wal_syncs. They check rocksdb_wal_group_syncs instead. rocksdb_wal_group_syncs is a MyRocks-level (not RocksDB's) counter, it counts rocksdb_flush_wal() calls that the SQL layer's group commit makes.
            psergei Sergei Petrunia added a comment - - edited

            fbmysql-2pc_group_commit-xpl.diff
            Examining counter increments in the current upstream:

             ## 2PC + durability + single thread
            +#  --concurrency=1 --number-of-queries=1000 
            +#      WAL_SYNCED      WAL_GROUP_SYNCED
            +#      1000    1000
             
             
             ## 2PC + durability + group commit
            +#  --concurrency=50 --number-of-queries=10000
            +#      WAL_SYNCED      WAL_GROUP_SYNCED
            +#      1101    1101
            

            # 2PC enabled, MyRocks durability disabled
            +#  --concurrency=1 --number-of-queries=1000
            +#      WAL_SYNCED      WAL_GROUP_SYNCED
            +#      1       0
             
            +#  --concurrency=50 --number-of-queries=10000
            +#      WAL_SYNCED      WAL_GROUP_SYNCED
            +#      0       0
            

             # 2PC disabled, MyRocks durability enabled
            +#  --concurrency=1 --number-of-queries=1000
            +#      WAL_SYNCED      WAL_GROUP_SYNCED
            +#      1000    0
             
             
            +#  --concurrency=50 --number-of-queries=10000
            +#      WAL_SYNCED      WAL_GROUP_SYNCED
            +#      10000   0
            

            (both write_committed and write_prepared produce similar results)

            psergei Sergei Petrunia added a comment - - edited fbmysql-2pc_group_commit-xpl.diff Examining counter increments in the current upstream: ## 2PC + durability + single thread +# --concurrency=1 --number-of-queries=1000 +# WAL_SYNCED WAL_GROUP_SYNCED +# 1000 1000     ## 2PC + durability + group commit +# --concurrency=50 --number-of-queries=10000 +# WAL_SYNCED WAL_GROUP_SYNCED +# 1101 1101 # 2PC enabled, MyRocks durability disabled +# --concurrency=1 --number-of-queries=1000 +# WAL_SYNCED WAL_GROUP_SYNCED +# 1 0   +# --concurrency=50 --number-of-queries=10000 +# WAL_SYNCED WAL_GROUP_SYNCED +# 0 0 # 2PC disabled, MyRocks durability enabled +# --concurrency=1 --number-of-queries=1000 +# WAL_SYNCED WAL_GROUP_SYNCED +# 1000 0     +# --concurrency=50 --number-of-queries=10000 +# WAL_SYNCED WAL_GROUP_SYNCED +# 10000 0 (both write_committed and write_prepared produce similar results)

            The upstream has smaller WAL_SYNCED with binlog ON (because the SQL layer calls FlushWAL() once per group).

            with 2PC disabled, they also got 10K WAL_SYNCED even for concurrent workload.

            psergei Sergei Petrunia added a comment - The upstream has smaller WAL_SYNCED with binlog ON (because the SQL layer calls FlushWAL() once per group). with 2PC disabled, they also got 10K WAL_SYNCED even for concurrent workload.

            Re-running test on the current MariaDB. Using the workload from 2pc_group_commit test: 10K queries, concurrency=50

            log_bin=OFF

            write_committed

            • the workload takes 1.2 sec, Rocksdb_wal_synced=8K
            • enabled rocksdb_enable_delay_commits (500 ms delay):
              the workload takes same 1.2 sec, Rocksdb_wal_synced=1.2K

            with write_prepared it's similar:

            • workload time is about the same
            • With rocksdb_enable_delay_commits, Rocksdb_wal_synced goes down from
              8K to 750.

            log_bin=ON

            write_committed:

            • the workload takes 1.6 sec, Rocksdb_wal_synced=9,600
            • enabled rocksdb_enable_delay_commits (500 ms delay): 12 seconds.
              Rocksdb_wal_synced=9,900

            write_prepared:

            • the workload takes 1.6 sec, Rocksdb_wal_synced=8,300
            • enabled rocksdb_enable_delay_commits (500 ms delay): 1.7 seconds,
              Rocksdb_wal_synced=1,700
            psergei Sergei Petrunia added a comment - Re-running test on the current MariaDB. Using the workload from 2pc_group_commit test: 10K queries, concurrency=50 log_bin=OFF write_committed the workload takes 1.2 sec, Rocksdb_wal_synced=8K enabled rocksdb_enable_delay_commits (500 ms delay): the workload takes same 1.2 sec, Rocksdb_wal_synced=1.2K with write_prepared it's similar: workload time is about the same With rocksdb_enable_delay_commits, Rocksdb_wal_synced goes down from 8K to 750. log_bin=ON write_committed: the workload takes 1.6 sec, Rocksdb_wal_synced=9,600 enabled rocksdb_enable_delay_commits (500 ms delay): 12 seconds. Rocksdb_wal_synced=9,900 write_prepared: the workload takes 1.6 sec, Rocksdb_wal_synced=8,300 enabled rocksdb_enable_delay_commits (500 ms delay): 1.7 seconds, Rocksdb_wal_synced=1,700

            People

              psergei Sergei Petrunia
              alice Alice Sherepa
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.