[MDEV-14455] rocksdb.2pc_group_commit failed in buildbot Created: 2017-11-21  Updated: 2023-11-29  Resolved: 2023-11-29

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - RocksDB, Tests
Affects Version/s: 10.2
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Alice Sherepa Assignee: Sergei Petrunia
Resolution: Won't Fix Votes: 0
Labels: None

Attachments: File fbmysql-2pc_group_commit-xpl.diff    
Issue Links:
Duplicate
duplicates MDEV-17188 rocksdb.2pc_group_commit fails interm... Closed

 Description   

http://buildbot.askmonty.org/buildbot/builders/kvm-deb-xenial-amd64/builds/2392/steps/mtr/logs/stdio

rocksdb.2pc_group_commit                 w1 [ fail ]
        Test ended at 2017-11-20 11:53:01
 
CURRENT_TEST: rocksdb.2pc_group_commit
--- /usr/share/mysql/mysql-test/plugin/rocksdb/rocksdb/r/2pc_group_commit.result	2017-11-20 09:40:39.000000000 -0500
+++ /dev/shm/var/1/log/2pc_group_commit.reject	2017-11-20 11:53:00.808592240 -0500
@@ -40,7 +40,7 @@
 select IF(variable_value - @b2 between 100 and 5000, 'OK', 'FAIL') as Binlog_group_commits 
 from information_schema.global_status where variable_name='Binlog_group_commits';
 Binlog_group_commits
-OK
+FAIL
 select IF(variable_value - @b3 between 1 and 9000, 'OK', 'FAIL')
 from information_schema.global_status where variable_name='Rocksdb_wal_synced';
 IF(variable_value - @b3 between 1 and 9000, 'OK', 'FAIL')
 
mysqltest: Result length mismatch
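For context, the check that produced the FAIL above works on counter deltas: a snapshot (@b2) is taken before the workload, and the counter's increase must then fall in an expected range. A minimal sketch of the pattern, using the same range as the .result file (requires a running server; not runnable standalone):

```sql
-- Snapshot the counter before the concurrent workload:
set @b2 = (select variable_value from information_schema.global_status
           where variable_name = 'Binlog_group_commits');
-- ... run the concurrent workload here ...
-- Then require the delta to be in a plausible range:
select IF(variable_value - @b2 between 100 and 5000, 'OK', 'FAIL')
       as Binlog_group_commits
from information_schema.global_status
where variable_name = 'Binlog_group_commits';
```

The test fails whenever timing on the build host pushes the delta outside that hard-coded range.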



 Comments   
Comment by Elena Stepanova [ 2018-04-25 ]

It still fails:
http://buildbot.askmonty.org/buildbot/builders/kvm-deb-trusty-amd64/builds/10324/steps/mtr/logs/stdio

Comment by Sergei Petrunia [ 2018-10-16 ]

https://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/999/steps/mtr/logs/stdio

Now with numbers:

rocksdb.2pc_group_commit 'write_committed' w2 [ fail ]
        Test ended at 2018-10-13 23:20:34
 
CURRENT_TEST: rocksdb.2pc_group_commit
--- /usr/share/mysql-test/plugin/rocksdb/rocksdb/r/2pc_group_commit.result	2018-10-13 21:56:11.000000000 +0000
+++ /dev/shm/var/2/log/2pc_group_commit.reject	2018-10-13 23:20:34.235742808 +0000
@@ -44,7 +44,7 @@
 select IF(variable_value - @b3 between 1 and 9000, 'OK', variable_value - @b3)
 from information_schema.global_status where variable_name='Rocksdb_wal_synced';
 IF(variable_value - @b3 between 1 and 9000, 'OK', variable_value - @b3)
-OK
+9434
 ## 
 # 2PC enabled, MyRocks durability disabled, single thread
 ## 

Comment by Sergei Petrunia [ 2019-07-11 ]

One can use settings like

binlog_commit_wait_count=30
binlog_commit_wait_usec=500K

and this will produce larger SQL-level commit groups. That is, Binlog_group_commits will show fewer groups (= more grouping), even when running on a RAM disk.

But these settings have no effect on Rocksdb_wal_synced.
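Both variables are dynamic in MariaDB, so the same grouping can be tried at runtime; a minimal sketch (assuming the "500K" above means 500000 microseconds, i.e. 0.5 s; requires a running server):

```sql
-- Force larger binlog commit groups: the leader waits for up to 30
-- queued transactions, or up to 500 ms, whichever comes first.
set global binlog_commit_wait_count = 30;
set global binlog_commit_wait_usec  = 500000;

-- Observe the effect on SQL-level group commits over a workload:
select variable_value from information_schema.global_status
where variable_name = 'Binlog_group_commits';
```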

Comment by Sergei Petrunia [ 2019-07-11 ]

Added code that hooks into the test sync point
"DBImpl::WriteImpl:BeforeLeaderEnters" and calls my_sleep(500) there (a
500-millisecond delay).

With rocksdb_write_policy=write_prepared, this achieves the desired effect:
For 10K queries, 50 threads, and a RAM disk I get ~750 Binlog_group_commits
and 1850 Rocksdb_wal_syncs.

Doesn't have much effect for rocksdb_write_policy=write_committed, though.

Comment by Sergei Petrunia [ 2019-07-11 ]

The upstream doesn't watch rocksdb_wal_syncs. They check rocksdb_wal_group_syncs instead.

rocksdb_wal_group_syncs is a MyRocks-level (not RocksDB-level) counter: it counts the rocksdb_flush_wal() calls that the SQL layer's group commit makes.
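The two counters can be compared with the same delta pattern the test uses; a sketch, assuming the counter names as spelled in this thread (requires a running server):

```sql
-- Snapshot both sync counters before the workload:
set @ws = (select variable_value from information_schema.global_status
           where variable_name = 'Rocksdb_wal_synced');
set @gs = (select variable_value from information_schema.global_status
           where variable_name = 'Rocksdb_wal_group_syncs');
-- ... run the workload, then report the increments:
select variable_value - @ws as wal_synced_delta
from information_schema.global_status
where variable_name = 'Rocksdb_wal_synced';
select variable_value - @gs as wal_group_syncs_delta
from information_schema.global_status
where variable_name = 'Rocksdb_wal_group_syncs';
```

With group commit working as upstream expects, wal_group_syncs_delta should roughly track the number of binlog commit groups, not the number of transactions.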

Comment by Sergei Petrunia [ 2019-07-11 ]

fbmysql-2pc_group_commit-xpl.diff
Examining counter increments in the current upstream:

 ## 2PC + durability + single thread
+#  --concurrency=1 --number-of-queries=1000 
+#      WAL_SYNCED      WAL_GROUP_SYNCED
+#      1000    1000
 
 
 ## 2PC + durability + group commit
+#  --concurrency=50 --number-of-queries=10000
+#      WAL_SYNCED      WAL_GROUP_SYNCED
+#      1101    1101

# 2PC enabled, MyRocks durability disabled
+#  --concurrency=1 --number-of-queries=1000
+#      WAL_SYNCED      WAL_GROUP_SYNCED
+#      1       0
 
+#  --concurrency=50 --number-of-queries=10000
+#      WAL_SYNCED      WAL_GROUP_SYNCED
+#      0       0

 # 2PC disabled, MyRocks durability enabled
+#  --concurrency=1 --number-of-queries=1000
+#      WAL_SYNCED      WAL_GROUP_SYNCED
+#      1000    0
 
 
+#  --concurrency=50 --number-of-queries=10000
+#      WAL_SYNCED      WAL_GROUP_SYNCED
+#      10000   0

(both write_committed and write_prepared produce similar results)

Comment by Sergei Petrunia [ 2019-07-11 ]

The upstream has a smaller WAL_SYNCED with the binlog ON (because the SQL layer calls FlushWAL() once per group).

With 2PC disabled, they also got 10K WAL_SYNCED even for a concurrent workload.

Comment by Sergei Petrunia [ 2019-07-11 ]

Re-running the test on current MariaDB, using the workload from the 2pc_group_commit test: 10K queries, concurrency=50.

log_bin=OFF

write_committed

  • the workload takes 1.2 sec, Rocksdb_wal_synced=8K
  • with rocksdb_enable_delay_commits enabled (500 ms delay):
    the workload takes the same 1.2 sec, Rocksdb_wal_synced=1.2K

With write_prepared it's similar:

  • workload time is about the same
  • With rocksdb_enable_delay_commits, Rocksdb_wal_synced goes down from
    8K to 750.

log_bin=ON

write_committed:

  • the workload takes 1.6 sec, Rocksdb_wal_synced=9,600
  • with rocksdb_enable_delay_commits enabled (500 ms delay): 12 seconds,
    Rocksdb_wal_synced=9,900

write_prepared:

  • the workload takes 1.6 sec, Rocksdb_wal_synced=8,300
  • with rocksdb_enable_delay_commits enabled (500 ms delay): 1.7 seconds,
    Rocksdb_wal_synced=1,700
Generated at Thu Feb 08 08:13:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.