Run1: default configuration, binlog=ON, 1M-row table, oltp_simple_rw, varying numbers of client threads:
Performance numbers look more or less similar; that is, there is no indication that anything is broken.
Results from a replication test
(the test setup looks weird, and it was; this was the first attempt):
- MariaDB 10.2, ubuntu-xenial package
- single-threaded slave
- the master is run for two minutes with threads = 1, 10, 20, 40
- then we wait for the slave to catch up.
Columns: threads, master_pos_wait time (sec)

InnoDB, gtid_slave_pos is InnoDB
1, 0.074385
10, 359.942474
20, 613.465163
40, 757.761746

InnoDB, gtid_slave_pos is MyRocks
1, 305.385057

MyRocks, gtid_slave_pos is InnoDB
1, 320.517346

MyRocks, gtid_slave_pos is MyRocks
1, 0.011072
10, 531.862626
20, 938.513108
40, 1197.352052

Take-aways:
- Single-threaded slave performance depends a lot on whether mysql.gtid_slave_pos uses the same storage engine as the replicated tables.
- Taking that into account, slave performance is within the same order of magnitude.
(NOTE: the difference is greater than the chart suggests, because the master ran for 2 minutes, and according to the previous test InnoDB is faster, so it generated more events.)
Parallel slave test.
- On the master (t2.2xlarge), run the oltp_write_only workload with --threads=32.
- Then measure the time it takes the slave (t2.xlarge) to catch up.
Slave settings:
- slave_parallel_workers=N
- slave_parallel_mode=conservative
MariaDB 10.2, RocksDB
Sysbench run as workload generator (oltp_write_only.lua --time=30 --table-size=1000000 --threads=32)
transactions: 220570 (7350.21 per sec.)
queries: 1330661 (44342.59 per sec.)

slave_parallel_workers, catch_up_time
1, 543.821963
10, 492.044147
20, 480.922534
40, 479.111308

MariaDB 10.2, InnoDB
Sysbench run as workload generator (oltp_write_only.lua --time=30 --table-size=1000000 --threads=32)
transactions: 152287 (5073.76 per sec.)
queries: 913907 (30448.74 per sec.)

slave_parallel_workers, catch_up_time
1, 276.695094
10, 100.695637
20, 68.44454
40, 57.729703

So, with InnoDB, the slave can catch up while running only about 2x slower than the master. This is actually good, because the master has more CPU power (t2.2xlarge; /proc/cpuinfo reports 8 CPUs) than the slave (t2.large; cpuinfo shows 2 CPUs).
With MyRocks, the catch-up time is much bigger. The difference is 2x for the single-threaded slave (note: this is partially explained by the fact that the MyRocks master did more transactions, 220K/152K = 1.44).
The real issue is that for MyRocks, the speedup of the parallel slave over the single-threaded slave is much lower than for InnoDB.
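As a sanity check, the per-transaction apply cost and the parallel speedup can be computed directly from the tables above (a quick sketch; all figures are copied from these notes, nothing is re-measured):

```python
# Figures copied from the conservative-mode tables above.
myrocks_trx, myrocks_t1, myrocks_t40 = 220570, 543.821963, 479.111308
innodb_trx,  innodb_t1,  innodb_t40  = 152287, 276.695094, 57.729703

# Per-transaction apply cost on the single-threaded slave (seconds/trx):
myrocks_per_trx = myrocks_t1 / myrocks_trx   # ~2.5 ms
innodb_per_trx  = innodb_t1  / innodb_trx    # ~1.8 ms

# Speedup of 40 parallel workers over a single worker:
myrocks_speedup = myrocks_t1 / myrocks_t40   # ~1.14x: almost no speedup
innodb_speedup  = innodb_t1  / innodb_t40    # ~4.8x
```

So after normalizing for the bigger MyRocks load, the single-threaded gap shrinks to roughly 1.4x, and the real difference is indeed the parallel speedup.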
Didn't note the binlog group commit counters on the previous run. Re-ran the load step and got this:

InnoDB
sysbench: transactions: 151869 (5059.10 per sec.)
Binlog_commits 151869
Binlog_group_commits 98083 (1.54 commits per group)

MyRocks
sysbench: transactions: 214206 (7138.58 per sec.)
Binlog_commits 216944
Binlog_group_commits 44895 (4.8 commits per group)
So the master part seems to be ok.
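The commits-per-group figures quoted above follow directly from the counters (a quick sketch from the numbers in these notes):

```python
# Binlog counters from the re-run above: commits divided by group commits.
innodb_per_group  = 151869 / 98083   # ~1.55 commits per group
myrocks_per_group = 216944 / 44895   # ~4.83 commits per group
```

That is, binlog group commit is working on the master, and noticeably better for MyRocks than for InnoDB.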
MariaDB 10.2, slave_parallel_mode=aggressive

RocksDB
slave_parallel_workers, catch_up_time
1, 3816.330531
10, 6605.029636
20, 6602.966884
40, 1227.60888

InnoDB
slave_parallel_workers, catch_up_time
1, 284.314864
10, 106.254725
20, 79.08769
40, 59.92061
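For comparison with the conservative run, these are the speedups implied by the aggressive-mode numbers above (a sketch; note the RocksDB column is not even monotonic in the worker count):

```python
# Aggressive-mode catch-up times (workers -> seconds) from the tables above.
rocks = {1: 3816.330531, 10: 6605.029636, 20: 6602.966884, 40: 1227.60888}
inno  = {1: 284.314864, 10: 106.254725, 20: 79.08769, 40: 59.92061}

inno_speedup  = inno[1] / inno[40]    # ~4.7x, improves monotonically
rocks_speedup = rocks[1] / rocks[40]  # ~3.1x, but 10 and 20 workers are SLOWER than 1
```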
Percona 5.7 bintar
slave_parallel_type=logical_clock

InnoDB
sysbench ... --time=30 /usr/share/sysbench/oltp_write_only.lua --table-size=1000000 --mysql_storage_engine=RocksDB --threads=32 run
transactions: 114771 (3822.82 per sec.)

slave_parallel_workers, time_to_catchup
1, 205.707733
10, 48.771367
20, 35.452498
40, 33.482409

MyRocks
sysbench ... --time=30 /usr/share/sysbench/oltp_write_only.lua --table-size=1000000 --mysql_storage_engine=RocksDB --threads=32 run
transactions: 144864 (4826.62 per sec.)

slave_parallel_workers, time_to_catchup
1, 328.305218
10, 70.682672
20, 40.449479
40, 34.089818
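Computed from the Percona 5.7 tables above, the parallel speedup there is healthy for both engines (a quick sketch):

```python
# Percona 5.7: 1-worker time divided by 40-worker time, from the tables above.
percona_inno_speedup  = 205.707733 / 33.482409  # ~6.1x with 40 workers
percona_rocks_speedup = 328.305218 / 34.089818  # ~9.6x with 40 workers
```

That is, the poor MyRocks scaling seen in the MariaDB runs does not reproduce under Percona's logical_clock scheduler.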
Attempt to debug the slave while it is catching up, with slave_parallel_workers=20, slave_parallel_mode=conservative:
https://gist.github.com/spetrunia/3f77615a9e33df51ffe54a1d07fc1fef
This is current MariaDB 10.2, debug build.
MariaDB binary from the original 10.2 package:
https://gist.github.com/spetrunia/acbc36c50e8c198fbc0b87e6ecf82fa1
Both results have this as the second most common trace (the first being InnoDB's AIO sleep):

207 at,safe_cond_wait,inline_mysql_cond_wait,wait_for_commit::wait_for_prior_commit2,
wait_for_commit::wait_for_prior_commit,THD::wait_for_prior_commit,ha_commit_one_phase,
ha_commit_trans,trans_commit,Xid_log_event::do_apply_event,Log_event::apply_event,
apply_event_and_update_pos_apply,apply_event_and_update_pos_for_parallel,
rpt_handle_event,handle_rpl_parallel_thread,start_thread,clone
Have been playing with gdb attached to the slave and caught this:
[Switching to thread 45 (Thread 0x7f7088178700 (LWP 16939))]
#0 0x00007f709700681d in fdatasync () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) where
#0 0x00007f709700681d in fdatasync () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007f708915c1c2 in rocksdb::PosixWritableFile::Sync (this=0x7f706c15be90) at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/env/io_posix.cc:844
#2 0x00007f70890beb55 in rocksdb::WritableFileWriter::SyncInternal (this=0x7f706c15c070, use_fsync=false)
at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/util/file_reader_writer.cc:350
#3 0x00007f70890be924 in rocksdb::WritableFileWriter::SyncWithoutFlush (this=0x7f706c15c070, use_fsync=false)
at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/util/file_reader_writer.cc:338
#4 0x00007f7088e6afde in rocksdb::DBImpl::SyncWAL (this=0x557f60ae59f0) at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/db_impl.cc:683
#5 0x00007f7088e6aa8d in rocksdb::DBImpl::FlushWAL (this=0x557f60ae59f0, sync=true) at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/db_impl.cc:638
#6 0x00007f7088ef7a08 in rocksdb::DBImpl::WriteImpl (this=0x557f60ae59f0, write_options=..., my_batch=0x7f703c01a070, callback=0x0, log_used=0x0, log_ref=0,
disable_memtable=false, seq_used=0x0) at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/db_impl_write.cc:304
#7 0x00007f7088ef6718 in rocksdb::DBImpl::Write (this=0x557f60ae59f0, write_options=..., my_batch=0x7f703c01a070)
at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/db_impl_write.cc:49
#8 0x00007f7089213ccc in rocksdb::WriteCommittedTxn::CommitWithoutPrepareInternal (this=0x7f703c019e50)
at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:304
#9 0x00007f708921382a in rocksdb::PessimisticTransaction::Commit (this=0x7f703c019e50)
at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:264
#10 0x00007f7088dc08fc in myrocks::Rdb_transaction_impl::commit_no_binlog (this=0x7f703c019cb0) at /home/ubuntu/mariadb-10.2/storage/rocksdb/ha_rocksdb.cc:2283
#11 0x00007f7088dbfc24 in myrocks::Rdb_transaction::commit (this=0x7f703c019cb0) at /home/ubuntu/mariadb-10.2/storage/rocksdb/ha_rocksdb.cc:2014
#12 0x00007f7088d93d03 in myrocks::rocksdb_commit (hton=0x557f608ac920, thd=0x7f703c008eb0, commit_tx=true)
at /home/ubuntu/mariadb-10.2/storage/rocksdb/ha_rocksdb.cc:3157
#13 0x0000557f5d4e207e in commit_one_phase_2 (thd=0x7f703c008eb0, all=true, trans=0x7f703c00c160, is_real_trans=true)
at /home/ubuntu/mariadb-10.2/sql/handler.cc:1567
#14 0x0000557f5d4e1f81 in ha_commit_one_phase (thd=0x7f703c008eb0, all=true) at /home/ubuntu/mariadb-10.2/sql/handler.cc:1548
#15 0x0000557f5d4e184b in ha_commit_trans (thd=0x7f703c008eb0, all=true) at /home/ubuntu/mariadb-10.2/sql/handler.cc:1415
#16 0x0000557f5d3cbf5f in trans_commit (thd=0x7f703c008eb0) at /home/ubuntu/mariadb-10.2/sql/transaction.cc:307
#17 0x0000557f5d60b059 in Xid_log_event::do_apply_event (this=0x7f7092035860, rgi=0x7f7092081750) at /home/ubuntu/mariadb-10.2/sql/log_event.cc:8447
#18 0x0000557f5d1cbc23 in Log_event::apply_event (this=0x7f7092035860, rgi=0x7f7092081750) at /home/ubuntu/mariadb-10.2/sql/log_event.h:1452
#19 0x0000557f5d1c0bad in apply_event_and_update_pos_apply (ev=0x7f7092035860, thd=0x7f703c008eb0, rgi=0x7f7092081750, reason=0)
at /home/ubuntu/mariadb-10.2/sql/slave.cc:3570
#20 0x0000557f5d1c11b0 in apply_event_and_update_pos_for_parallel (ev=0x7f7092035860, thd=0x7f703c008eb0, rgi=0x7f7092081750)
at /home/ubuntu/mariadb-10.2/sql/slave.cc:3745
#21 0x0000557f5d425a24 in rpt_handle_event (qev=0x7f70905a57f0, rpt=0x7f709000b110) at /home/ubuntu/mariadb-10.2/sql/rpl_parallel.cc:50
#22 0x0000557f5d42886d in handle_rpl_parallel_thread (arg=0x7f709000b110) at /home/ubuntu/mariadb-10.2/sql/rpl_parallel.cc:1265
#23 0x00007f7097b7b6ba in start_thread (arg=0x7f7088178700) at pthread_create.c:333
#24 0x00007f709701041d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
A lot of other threads are in the wait_for_commit::wait_for_prior_commit call.
As for this thread:

#9  0x00007f708921382a in rocksdb::PessimisticTransaction::Commit (this=0x7f703c019e50)
    at /home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:264
264         s = CommitWithoutPrepareInternal();
(gdb) p commit_without_prepare
$6 = true
(gdb) p commit_prepared
$7 = false
(gdb) up
#10 0x00007f7088dc08fc in myrocks::Rdb_transaction_impl::commit_no_binlog (this=0x7f703c019cb0) at /home/ubuntu/mariadb-10.2/storage/rocksdb/ha_rocksdb.cc:2283
2283        const rocksdb::Status s = m_rocksdb_tx->Commit();
(gdb) p m_is_two_phase
$8 = true

Questions:
- Is it intentional that we have not done a prepare call?
- If we are doing a commit for a replication slave transaction, why are we doing a disk sync?

Attempted to check how upstream resolves this, but found that they are based on mysql-5.6 and so do not support slave_parallel_type=logical_clock yet.
Stack traces from Percona Server:
https://gist.github.com/spetrunia/ac453401977fa620a423568e7ab5f619
There are multiple threads with a stack trace like this:
Thread 62 (Thread 0x7f35207c6700 (LWP 3540)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f35667f691c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007f353130d72b in std::condition_variable::wait<rocksdb::WriteThread::BlockingAwaitState(rocksdb::WriteThread::Writer*, uint8_t)::<lambda()> > (__p=..., __lock=..., this=0x7f35207c49c0) at /usr/include/c++/4.9/condition_variable:98
#3 rocksdb::WriteThread::BlockingAwaitState (this=<optimized out>, w=0x7f35207c4930, goal_mask=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/write_thread.cc:44
#4 0x00007f353130d7ff in rocksdb::WriteThread::AwaitState (this=this@entry=0x34990b8, w=w@entry=0x7f35207c4930, goal_mask=goal_mask@entry=30 '\036', ctx=ctx@entry=0x7f35317f37f0 <rocksdb::jbg_ctx>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/write_thread.cc:176
#5 0x00007f353130dc65 in rocksdb::WriteThread::JoinBatchGroup (this=this@entry=0x34990b8, w=w@entry=0x7f35207c4930) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/write_thread.cc:318
#6 0x00007f353125f0d2 in rocksdb::DBImpl::WriteImpl (this=0x34989d0, write_options=..., my_batch=<optimized out>, callback=callback@entry=0x0, log_used=log_used@entry=0x0, log_ref=0, disable_memtable=false, seq_used=0x0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/db_impl_write.cc:104
#7 0x00007f35312609cd in rocksdb::DBImpl::Write (this=<optimized out>, write_options=..., my_batch=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/db_impl_write.cc:49
#8 0x00007f353149061a in rocksdb::WriteCommittedTxn::CommitWithoutPrepareInternal (this=0x7f34e1cefc30) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:304
#9 0x00007f3531493c13 in rocksdb::PessimisticTransaction::Commit (this=0x7f34e1cefc30) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:264
#10 0x00007f35311b71cb in myrocks::Rdb_transaction_impl::commit_no_binlog (this=0x7f34e270eb30) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/ha_rocksdb.cc:2147
#11 0x00007f3531199f76 in myrocks::Rdb_transaction::commit (this=0x7f34e270eb30) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/ha_rocksdb.cc:1880
#12 myrocks::rocksdb_commit (hton=<optimized out>, thd=0x7f34e0a54c80, commit_tx=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/ha_rocksdb.cc:2879
#13 0x00000000007f35f7 in ha_commit_low (thd=thd@entry=0x7f34e0a54c80, all=all@entry=true, run_after_commit=run_after_commit@entry=true) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/handler.cc:1933
#14 0x0000000000e26f99 in MYSQL_BIN_LOG::commit (this=0x1d42780 <mysql_bin_log>, thd=0x7f34e0a54c80, all=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:8903
#15 0x00000000007f4032 in ha_commit_trans (thd=thd@entry=0x7f34e0a54c80, all=all@entry=true, ignore_global_read_lock=ignore_global_read_lock@entry=false) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/handler.cc:1829
#16 0x0000000000d0ce19 in trans_commit (thd=thd@entry=0x7f34e0a54c80) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/transaction.cc:239
#17 0x0000000000df28eb in Xid_log_event::do_commit (this=0x7f35288e97e0, thd_arg=0x7f34e0a54c80) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/log_event.cc:7164
#18 0x0000000000df29f2 in Xid_apply_log_event::do_apply_event_worker (this=0x7f35288e9828, w=0x7f3529d1c400) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/log_event.cc:7259
#19 0x0000000000e5edb5 in slave_worker_exec_job_group (worker=worker@entry=0x7f3529d1c400, rli=rli@entry=0x35ae780) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/rpl_rli_pdb.cc:2671
#20 0x0000000000e40cc3 in handle_slave_worker (arg=arg@entry=0x7f3529d1c400) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/rpl_slave.cc:6231
#21 0x0000000000ead034 in pfs_spawn_thread (arg=0x7f352825c7e0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/perfschema/pfs.cc:2190
#22 0x00007f3567de46ba in start_thread (arg=0x7f35207c6700) at pthread_create.c:333
#23 0x00007f3565f6141d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
- They are relying on MyRocks' internal group commit.
- It is then not clear if/how they make sure the commit order on the slave is the same as on the master (because of GTIDs? A: no, because in my tests gtid_mode=OFF).
- Note frame #8: rocksdb::WriteCommittedTxn::CommitWithoutPrepareInternal
If one runs Percona Server with log-slave-updates, the stack traces look different:
https://gist.github.com/spetrunia/634e67b743c3783aa4d102b1ebb870ae
There are a lot of threads waiting to commit:
Thread 45 (Thread 0x7fcd3c336700 (LWP 3743)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000e1610a in native_cond_wait (mutex=0x1d438f8 <mysql_bin_log+4472>, cond=0x1d438c0 <mysql_bin_log+4416>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/include/thr_cond.h:140
#2 my_cond_wait (mp=0x1d438f8 <mysql_bin_log+4472>, cond=0x1d438c0 <mysql_bin_log+4416>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/include/thr_cond.h:195
#3 inline_mysql_cond_wait (src_file=0x14a4140 "/mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc", src_line=2191, mutex=<optimized out>, that=0x1d438c0 <mysql_bin_log+4416>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/include/mysql/psi/mysql_thread.h:1184
#4 Stage_manager::enroll_for (this=this@entry=0x1d43740 <mysql_bin_log+4032>, stage=<optimized out>, thd=0x7fcd4c00a310, stage_mutex=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:2191
#5 0x0000000000e1b6a5 in MYSQL_BIN_LOG::change_stage (this=0x1d42780 <mysql_bin_log>, thd=<optimized out>, stage=<optimized out>, queue=<optimized out>, leave_mutex=<optimized out>, enter_mutex=0x1d42788 <mysql_bin_log+8>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:9216
#6 0x0000000000e25a22 in MYSQL_BIN_LOG::ordered_commit (this=this@entry=0x1d42780 <mysql_bin_log>, thd=thd@entry=0x7fcd4c00a310, all=all@entry=true, skip_commit=skip_commit@entry=false) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:9630
#7 0x0000000000e27460 in MYSQL_BIN_LOG::commit (this=0x1d42780 <mysql_bin_log>, thd=0x7fcd4c00a310, all=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:8883
#8 0x00000000007f4032 in ha_commit_trans (thd=thd@entry=0x7fcd4c00a310, all=all@entry=true, ignore_global_read_lock=ignore_global_read_lock@entry=false) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/handler.cc:1829
#9 0x0000000000d0ce19 in trans_commit (thd=thd@entry=0x7fcd4c00a310) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/transaction.cc:239
#10 0x0000000000df28eb in Xid_log_event::do_commit (this=0x7fcd2403d910, thd_arg=0x7fcd4c00a310) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/log_event.cc:7164
#11 0x0000000000df29f2 in Xid_apply_log_event::do_apply_event_worker (this=0x7fcd2403d958, w=0x7fcd240a6c80) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/log_event.cc:7259
#12 0x0000000000e5edb5 in slave_worker_exec_job_group (worker=worker@entry=0x7fcd240a6c80, rli=rli@entry=0x32c79c0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/rpl_rli_pdb.cc:2671
#13 0x0000000000e40cc3 in handle_slave_worker (arg=arg@entry=0x7fcd240a6c80) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/rpl_slave.cc:6231
#14 0x0000000000ead034 in pfs_spawn_thread (arg=0x7fcd2403c9e0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/perfschema/pfs.cc:2190
#15 0x00007fcd76cf96ba in start_thread (arg=0x7fcd3c336700) at pthread_create.c:333
#16 0x00007fcd74e7641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
And there is one thread doing the syncing, in myrocks::rocksdb_flush_wal:
Thread 41 (Thread 0x7fcd3c43a700 (LWP 3739)):
#0 0x00007fcd74e6c81d in fdatasync () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fcd3fb36d3d in rocksdb::PosixWritableFile::Sync (this=0x7fcd34027420) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/env/io_posix.cc:844
#2 0x00007fcd3fbd6289 in rocksdb::WritableFileWriter::SyncInternal (this=this@entry=0x7fcd34027720, use_fsync=use_fsync@entry=false) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/util/file_reader_writer.cc:350
#3 0x00007fcd3fbd63b8 in rocksdb::WritableFileWriter::SyncWithoutFlush (this=0x7fcd34027720, use_fsync=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/util/file_reader_writer.cc:338
#4 0x00007fcd3fa51b1e in rocksdb::DBImpl::SyncWAL (this=0x316a990) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/db_impl.cc:701
#5 0x00007fcd3fa46a88 in rocksdb::DBImpl::FlushWAL (this=0x316a990, sync=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/db/db_impl.cc:656
#6 0x00007fcd3f99d4d8 in rocksdb::StackableDB::FlushWAL (sync=<optimized out>, this=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/rocksdb/include/rocksdb/utilities/stackable_db.h:273
#7 myrocks::rocksdb_flush_wal (hton=<optimized out>, binlog_group_flush=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/rocksdb/ha_rocksdb.cc:2699
#8 0x00000000007ee0af in flush_handlerton (thd=<optimized out>, plugin=<optimized out>, arg=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/handler.cc:2640
#9 0x0000000000c8578e in plugin_foreach_with_mask (thd=0x0, funcs=funcs@entry=0x7fcd3c438b30, type=type@entry=1, state_mask=4294967287, state_mask@entry=8, arg=arg@entry=0x7fcd3c438b5c) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/sql_plugin.cc:2377
#10 0x0000000000c8591d in plugin_foreach_with_mask (thd=<optimized out>, func=func@entry=0x7ee090 <flush_handlerton(THD*, plugin_ref, void*)>, type=type@entry=1, state_mask=state_mask@entry=8, arg=arg@entry=0x7fcd3c438b5c) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/sql_plugin.cc:2392
#11 0x00000000007f4d4a in ha_flush_logs (db_type=db_type@entry=0x0, binlog_group_flush=binlog_group_flush@entry=true) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/handler.cc:2650
#12 0x0000000000e258ab in MYSQL_BIN_LOG::process_flush_stage_queue (this=this@entry=0x1d42780 <mysql_bin_log>, total_bytes_var=total_bytes_var@entry=0x7fcd3c438bf8, rotate_var=rotate_var@entry=0x7fcd3c438bf7, out_queue_var=out_queue_var@entry=0x7fcd3c438c00) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:8991
#13 0x0000000000e25a69 in MYSQL_BIN_LOG::ordered_commit (this=this@entry=0x1d42780 <mysql_bin_log>, thd=thd@entry=0x7fcd380008c0, all=all@entry=true, skip_commit=skip_commit@entry=false) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:9655
#14 0x0000000000e27460 in MYSQL_BIN_LOG::commit (this=0x1d42780 <mysql_bin_log>, thd=0x7fcd380008c0, all=<optimized out>) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/binlog.cc:8883
#15 0x00000000007f4032 in ha_commit_trans (thd=thd@entry=0x7fcd380008c0, all=all@entry=true, ignore_global_read_lock=ignore_global_read_lock@entry=false) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/handler.cc:1829
#16 0x0000000000d0ce19 in trans_commit (thd=thd@entry=0x7fcd380008c0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/transaction.cc:239
#17 0x0000000000df28eb in Xid_log_event::do_commit (this=0x7fcd240bb4d0, thd_arg=0x7fcd380008c0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/log_event.cc:7164
#18 0x0000000000df29f2 in Xid_apply_log_event::do_apply_event_worker (this=0x7fcd240bb518, w=0x7fcd24021f40) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/log_event.cc:7259
#19 0x0000000000e5edb5 in slave_worker_exec_job_group (worker=worker@entry=0x7fcd24021f40, rli=rli@entry=0x32c79c0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/rpl_rli_pdb.cc:2671
#20 0x0000000000e40cc3 in handle_slave_worker (arg=arg@entry=0x7fcd24021f40) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/sql/rpl_slave.cc:6231
#21 0x0000000000ead034 in pfs_spawn_thread (arg=0x7fcd240283c0) at /mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/debian-jessie-64bit/percona-server-5.7.20-19/storage/perfschema/pfs.cc:2190
#22 0x00007fcd76cf96ba in start_thread (arg=0x7fcd3c43a700) at pthread_create.c:333
#23 0x00007fcd74e7641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
The problem seems to be similar to the one we've had with group commit:
- The SQL layer makes commits in order (the call to commit trx$N is only made after trx${N-1} has committed).
- MyRocks is instructed to do flushing on each commit().
- Hence:
  - only one thread is committing at a time, and
  - each commit does a flush.
Need to do:
- Don't flush when committing from the slave SQL thread (do we ever need to flush on the slave, then?)
- There's no way to do commit() calls in parallel, is there?
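The "one committer at a time, one flush per commit" picture is roughly consistent with the observed catch-up times; a back-of-envelope sketch (the ~2 ms per-sync latency is an assumed figure, not measured here):

```python
# Back-of-envelope model of serialized commits with a WAL sync on each one.
n_trx = 214206        # transactions in the MyRocks load step above
sync_latency = 0.002  # ASSUMED fdatasync() cost per WAL sync, in seconds

# Commits serialized by wait_for_prior_commit, each doing its own sync:
serialized = n_trx * sync_latency       # ~428 s, same order as the observed ~480-540 s
# If syncs were batched like the master's binlog group commit (4.8 commits/group):
grouped = n_trx / 4.8 * sync_latency    # ~89 s
```

The model is crude, but it shows per-commit syncing alone is enough to account for most of the catch-up time.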
Run with log_slave_updates=ON, slave_parallel_mode=aggressive
Still MariaDB 10.2, official package
MyRocks
Sysbench workload generator (30 sec):
transactions: 207851 (6926.66 per sec.)
Binlog_commits += 210508
Binlog_group_commits += 42878 (4.9 commits/group)

threads, catchup_time
1, 529.671741
10, 114.643005
20, 51.314241
40, 19.150348

InnoDB
Sysbench workload generator (30 sec):
transactions: 148075 (4932.10 per sec.)
Binlog_commits += 148075
Binlog_group_commits += 94289 (1.57 commits/group)

threads, catchup_time
1, 315.305727
10, 90.130341
20, 69.594089
40, 59.796552

Run with log_slave_updates=ON, slave_parallel_mode=conservative
MyRocks
Sysbench workload generator (30 sec):
Binlog_commits += 215594
Binlog_group_commits += 43450
transactions: 212861 (7093.72 per sec.)

threads, catchup_time
1, 542.23636
10, 111.546783
20, 75.555415
40, 61.089441

InnoDB
Sysbench workload generator (30 sec):
Binlog_commits += 150773
Binlog_group_commits += 96759 (1.55 commits/group)
transactions: 150773 (5024.59 per sec.)

threads, catchup_time
1, 322.434903
10, 88.847347
20, 66.544515
40, 58.500228
Summarizing two previous tests:
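A numeric summary (sketch) of the two log_slave_updates=ON runs above, as the speedup of 40 workers over 1 worker:

```python
# 1-worker catch-up time divided by 40-worker catch-up time, per run above.
aggressive   = {"MyRocks": 529.671741 / 19.150348,   # ~27.7x
                "InnoDB":  315.305727 / 59.796552}   # ~5.3x
conservative = {"MyRocks": 542.23636  / 61.089441,   # ~8.9x
                "InnoDB":  322.434903 / 58.500228}   # ~5.5x
```

With log_slave_updates=ON the MyRocks slave finally scales with the worker count, dramatically so in aggressive mode.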
Running the MariaDB parallel slave with log_slave_updates=ON, slave_parallel_mode=conservative. Stack traces: https://gist.github.com/spetrunia/0fb73ad83308bd51ab8ba98c8c676a44
A typical stack looks like this:
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f02fe1be91c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007f02f5e93e2b in std::condition_variable::wait<rocksdb::WriteThread::BlockingAwaitState(rocksdb::WriteThread::Writer*, uint8_t)::<lambda()> > (__p=..., __lock=..., this=0x7f02f40fddd0) at /usr/include/c++/5/condition_variable:98
#3 rocksdb::WriteThread::BlockingAwaitState (this=this@entry=0x55bfd4fbc338, w=0x7f02f40fdd40, goal_mask=goal_mask@entry=30 '\036') at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/db/write_thread.cc:44
#4 0x00007f02f5e93f99 in rocksdb::WriteThread::AwaitState (this=this@entry=0x55bfd4fbc338, w=w@entry=0x7f02f40fdd40, goal_mask=goal_mask@entry=30 '\036', ctx=ctx@entry=0x7f02f62d03c0 <rocksdb::jbg_ctx>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/db/write_thread.cc:176
#5 0x00007f02f5e94777 in rocksdb::WriteThread::JoinBatchGroup (this=this@entry=0x55bfd4fbc338, w=w@entry=0x7f02f40fdd40) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/db/write_thread.cc:318
#6 0x00007f02f5e15ea9 in rocksdb::DBImpl::WriteImplWALOnly (this=this@entry=0x55bfd4fbbb40, write_options=..., my_batch=my_batch@entry=0x7f02e079d800, callback=callback@entry=0x0, log_used=log_used@entry=0x7f02e0a6f678, log_ref=0, seq_used=0x0) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/db/db_impl_write.cc:472
#7 0x00007f02f5e16c76 in rocksdb::DBImpl::WriteImpl (this=0x55bfd4fbbb40, write_options=..., my_batch=0x7f02e079d800, callback=callback@entry=0x0, log_used=log_used@entry=0x7f02e0a6f678, log_ref=0, disable_memtable=true, seq_used=0x0) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/db/db_impl_write.cc:82
#8 0x00007f02f5fca477 in rocksdb::WriteCommittedTxn::PrepareInternal (this=0x7f02e0a6f670) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:223
#9 0x00007f02f5fcabc1 in rocksdb::PessimisticTransaction::Prepare (this=0x7f02e0a6f670) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction.cc:195
#10 0x00007f02f5d86342 in myrocks::Rdb_transaction_impl::prepare (this=0x7f02e1342550, name=...) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/ha_rocksdb.cc:2272
#11 0x00007f02f5d67edf in myrocks::rocksdb_prepare (hton=<optimized out>, thd=<optimized out>, prepare_tx=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/rocksdb/ha_rocksdb.cc:2894
#12 0x000055bfd19ab64a in prepare_or_error (all=true, thd=0x7f02e1d6f1a8, ht=0x55bfd4d8f348) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1158
#13 ha_commit_trans (thd=thd@entry=0x7f02e1d6f1a8, all=all@entry=true) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1437
#14 0x000055bfd18ecf9b in trans_commit (thd=0x7f02e1d6f1a8) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/transaction.cc:307
#15 0x000055bfd1a80caf in Xid_log_event::do_apply_event (this=0x7f02cfa93078, rgi=0x7f02cfc8b960) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/log_event.cc:8441
#16 0x000055bfd179271b in Log_event::apply_event (rgi=0x7f02cfc8b960, this=0x7f02cfa93078) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/log_event.h:1452
#17 apply_event_and_update_pos_apply (ev=ev@entry=0x7f02cfa93078, thd=thd@entry=0x7f02e1d6f1a8, rgi=rgi@entry=0x7f02cfc8b960, reason=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/slave.cc:3570
|
#18 0x000055bfd1798a03 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x7f02cfa93078, thd=thd@entry=0x7f02e1d6f1a8, rgi=rgi@entry=0x7f02cfc8b960) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/slave.cc:3745
|
#19 0x000055bfd19253a5 in rpt_handle_event (qev=qev@entry=0x7f02ce1a7858, rpt=0x7f02cd44c900) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/rpl_parallel.cc:50
|
#20 0x000055bfd192a97c in handle_rpl_parallel_thread (arg=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/rpl_parallel.cc:1265
|
#21 0x00007f02fe4946ba in start_thread (arg=0x7f02f40ff700) at pthread_create.c:333
|
#22 0x00007f02fdb3f41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
|
That is, the SQL layer is using two-phase commit. The Prepare() calls are done in parallel (and they do sync, right?)
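The protocol visible in these stacks, with engine prepare() calls running concurrently while the commit step is serialized in binlog order (via wait_for_prior_commit), can be sketched as a toy model. This is an illustration only; the class and method names below are invented, not MariaDB's actual API:

```python
import threading

class OrderedCommitter:
    """Toy model: prepare() may run in parallel, commit() is forced
    into a fixed order, like the parallel slave committing in binlog
    order. Invented names, not MariaDB's real interface."""
    def __init__(self):
        self.cond = threading.Condition()
        self.next_to_commit = 0
        self.log = []          # records the order of prepare/commit events

    def run_trx(self, seq):
        # Phase 1: prepare -- no ordering constraint, runs concurrently.
        self.log.append(("prepare", seq))
        # Phase 2: commit -- wait until all earlier transactions have
        # committed (analogous to wait_for_prior_commit()).
        with self.cond:
            while self.next_to_commit != seq:
                self.cond.wait()
            self.log.append(("commit", seq))
            self.next_to_commit += 1
            self.cond.notify_all()

c = OrderedCommitter()
threads = [threading.Thread(target=c.run_trx, args=(i,)) for i in range(8)]
for t in threads: t.start()
for t in threads: t.join()

commits = [s for (ev, s) in c.log if ev == "commit"]
print(commits)   # always [0, 1, ..., 7]: the commit order is fixed
```

The prepares may interleave arbitrarily, but the condition variable pins the commit sequence, which is exactly the property the stacks above show the slave enforcing.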
|
|
Turned log_slave_updates back off and re-ran with InnoDB. Stacks: https://gist.github.com/spetrunia/39f22a276ad7e4df95884ef1a2b996f2
The commits are done in order:
Thread 46 (Thread 0x7f6311b78700 (LWP 11626)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x000055c5c598c23b in inline_mysql_cond_wait (src_file=0x55c5c6141978 "/home/buildbot/buildbot/build/mariadb-10.2.12/sql/sql_class.cc", src_line=7212, mutex=0x7f631c4fa3a8, that=0x7f631c4fa3d8) at /home/buildbot/buildbot/build/mariadb-10.2.12/include/mysql/psi/mysql_thread.h:1149
#2 wait_for_commit::wait_for_prior_commit2 (this=this@entry=0x7f631c4fa3a8, thd=thd@entry=0x7f632c000ad8) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/sql_class.cc:7212
#3 0x000055c5c5b57f97 in wait_for_commit::wait_for_prior_commit (thd=0x7f632c000ad8, this=0x7f631c4fa3a8) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/sql_class.h:1965
#4 THD::wait_for_prior_commit (this=0x7f632c000ad8) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/sql_class.h:4122
#5 ha_commit_one_phase (thd=thd@entry=0x7f632c000ad8, all=all@entry=true) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1545
#6 0x000055c5c5b5880e in ha_commit_trans (thd=thd@entry=0x7f632c000ad8, all=all@entry=true) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1415
#7 0x000055c5c5a99f9b in trans_commit (thd=0x7f632c000ad8) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/transaction.cc:307
#8 0x000055c5c5c2dcaf in Xid_log_event::do_apply_event (this=0x7f631c128778, rgi=0x7f631c4fa370) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/log_event.cc:8441
#9 0x000055c5c593f71b in Log_event::apply_event (rgi=0x7f631c4fa370, this=0x7f631c128778) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/log_event.h:1452
#10 apply_event_and_update_pos_apply (ev=ev@entry=0x7f631c128778, thd=thd@entry=0x7f632c000ad8, rgi=rgi@entry=0x7f631c4fa370, reason=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/slave.cc:3570
#11 0x000055c5c5945a03 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x7f631c128778, thd=thd@entry=0x7f632c000ad8, rgi=rgi@entry=0x7f631c4fa370) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/slave.cc:3745
#12 0x000055c5c5ad23a5 in rpt_handle_event (qev=qev@entry=0x7f631cbda6b8, rpt=0x7f631c008318) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/rpl_parallel.cc:50
#13 0x000055c5c5ad797c in handle_rpl_parallel_thread (arg=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/rpl_parallel.cc:1265
#14 0x00007f6336b7d6ba in start_thread (arg=0x7f6311b78700) at pthread_create.c:333
#15 0x00007f633622841d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
|
There is one committing thread:
Thread 43 (Thread 0x7f6311b2e700 (LWP 11623)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x000055c5c5e1b826 in os_event::wait (this=0x55c5c93f6660) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/os/os0event.cc:166
#2 os_event::wait_low (reset_sig_count=8536, this=0x55c5c93f6660) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/os/os0event.cc:336
#3 os_event_wait_low (event=0x55c5c93f6660, reset_sig_count=reset_sig_count@entry=0) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/os/os0event.cc:535
#4 0x000055c5c5dee708 in log_write_up_to (lsn=<optimized out>, flush_to_disk=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/log/log0log.cc:1175
#5 0x000055c5c5df400a in log_write_up_to (lsn=<optimized out>, flush_to_disk=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/log/log0log.cc:1299
#6 0x000055c5c5eebc57 in trx_flush_log_if_needed_low (lsn=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/trx/trx0trx.cc:1585
#7 trx_flush_log_if_needed (trx=0x7f63213856f0, lsn=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/trx/trx0trx.cc:1607
#8 trx_commit_complete_for_mysql (trx=trx@entry=0x7f63213856f0) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/trx/trx0trx.cc:2259
#9 0x000055c5c5db2305 in innobase_commit (hton=<optimized out>, thd=0x7f62d4006f58, commit_trx=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/storage/innobase/handler/ha_innodb.cc:4792
#10 0x000055c5c5b5669e in commit_one_phase_2 (thd=thd@entry=0x7f62d4006f58, all=<optimized out>, trans=0x7f62d400a030, is_real_trans=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1567
#11 0x000055c5c5b57f67 in ha_commit_one_phase (thd=thd@entry=0x7f62d4006f58, all=all@entry=true) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1548
#12 0x000055c5c5b5880e in ha_commit_trans (thd=thd@entry=0x7f62d4006f58, all=all@entry=true) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/handler.cc:1415
#13 0x000055c5c5a99f9b in trans_commit (thd=0x7f62d4006f58) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/transaction.cc:307
#14 0x000055c5c5c2dcaf in Xid_log_event::do_apply_event (this=0x7f631c7fd0b8, rgi=0x7f631c7d0520) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/log_event.cc:8441
#15 0x000055c5c593f71b in Log_event::apply_event (rgi=0x7f631c7d0520, this=0x7f631c7fd0b8) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/log_event.h:1452
#16 apply_event_and_update_pos_apply (ev=ev@entry=0x7f631c7fd0b8, thd=thd@entry=0x7f62d4006f58, rgi=rgi@entry=0x7f631c7d0520, reason=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/slave.cc:3570
#17 0x000055c5c5945a03 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x7f631c7fd0b8, thd=thd@entry=0x7f62d4006f58, rgi=rgi@entry=0x7f631c7d0520) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/slave.cc:3745
#18 0x000055c5c5ad23a5 in rpt_handle_event (qev=qev@entry=0x7f631c7d4a68, rpt=0x7f631c007e80) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/rpl_parallel.cc:50
#19 0x000055c5c5ad797c in handle_rpl_parallel_thread (arg=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.2.12/sql/rpl_parallel.cc:1265
#20 0x00007f6336b7d6ba in start_thread (arg=0x7f6311b2e700) at pthread_create.c:333
#21 0x00007f633622841d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
|
.. and it is doing a sync. One would expect this to slow the parallel slave down to the speed of a single-threaded one, but that is apparently not happening.
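A likely explanation is group flushing inside log_write_up_to(): any number of committers can wait on the same redo flush, and one fsync that advances the flushed LSN past all of their commit LSNs releases them all at once, so the number of syncs scales with flush groups rather than with transactions. The sketch below is a toy model of that idea; all names in it are invented for illustration, not InnoDB's actual code:

```python
import threading
import time

class GroupFlushLog:
    """Toy group-flush model: one simulated fsync can satisfy every
    waiter whose LSN it covers. Invented names, not InnoDB's code."""
    def __init__(self):
        self.cond = threading.Condition()
        self.written_lsn = 0   # highest LSN appended to the log buffer
        self.flushed_lsn = 0   # highest LSN made durable
        self.flushing = False
        self.fsync_calls = 0

    def write_record(self):
        with self.cond:
            self.written_lsn += 1
            return self.written_lsn

    def flush_up_to(self, lsn):
        with self.cond:
            while self.flushed_lsn < lsn:
                if self.flushing:
                    # Someone else is flushing; their flush may cover
                    # our LSN too -- this is the group effect.
                    self.cond.wait()
                    continue
                self.flushing = True
                target = self.written_lsn   # flush all records so far
                self.cond.release()
                time.sleep(0.002)           # simulate a slow fsync
                self.cond.acquire()
                self.fsync_calls += 1
                self.flushed_lsn = max(self.flushed_lsn, target)
                self.flushing = False
                self.cond.notify_all()

log = GroupFlushLog()

def commit_one():
    log.flush_up_to(log.write_record())

threads = [threading.Thread(target=commit_one) for _ in range(50)]
for t in threads: t.start()
for t in threads: t.join()

print(log.fsync_calls, "simulated fsyncs for 50 commits")
```

While one thread sleeps in the simulated fsync, the other committers queue up behind it, and the next flush covers all of them at once, so the fsync count typically stays far below the commit count.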
|
|
Debugging InnoDB on 10.2.
Master:
- sysbench: 151 676 transactions
- Binlog_commits += 151 676
- Binlog_commit_groups += 95 717
Slave (debug build), threads=20:
- log_write_up_to calls = 375 744
- catch-up time: 624.609522 sec
Slave without the tracing code (for comparison):
- sysbench on master: 145 089 transactions
- catch-up time: 592.940083 sec
There are many more log_write_up_to calls than transactions.
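As a quick cross-check of that observation, the ratio from the debug-build run above works out to roughly 2.5 log_write_up_to calls per applied transaction:

```python
transactions = 151_676   # sysbench transactions on the master (debug run)
log_writes = 375_744     # log_write_up_to calls counted on the slave

ratio = log_writes / transactions
print(f"{ratio:.2f} log_write_up_to calls per transaction")  # -> 2.48
```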
|
|
Now the slave catch-up takes ~595 sec, the number of commits matches the expected one, and we can see that commits do happen in parallel:
counter_value = 145,158
|
n_active histogram = {
0, 0
1, 36464
2, 29865
3, 21621
4, 15590
5, 11351
6, 8336
7, 6121
8, 4456
9, 3220
10, 2371
11, 1685
12, 1231
13, 852
14, 624
15, 436
16, 308
17, 231
18, 177
19, 143
20, 76
21..31, 0
}
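To put one number on the amount of parallelism in this histogram, the mean n_active can be computed from the counts above (values transcribed from the run):

```python
# n_active -> number of observations, transcribed from the histogram above
hist = {1: 36464, 2: 29865, 3: 21621, 4: 15590, 5: 11351, 6: 8336,
        7: 6121, 8: 4456, 9: 3220, 10: 2371, 11: 1685, 12: 1231,
        13: 852, 14: 624, 15: 436, 16: 308, 17: 231, 18: 177,
        19: 143, 20: 76}

total = sum(hist.values())
mean = sum(k * c for k, c in hist.items()) / total

print(total)          # -> 145158, matching counter_value above
print(f"{mean:.2f}")  # -> 3.70 concurrently committing threads on average
```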
|
|
|
Using the same test setup as before (145K transactions; slave with slave_parallel_workers=20, slave_parallel_mode=conservative, no log_slave_updates) to look at which code in innobase_commit executes in parallel:
This piece of code is executed by one thread at a time:

	if (commit_trx
	    || (!thd_test_options(thd, OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))) {

		DBUG_EXECUTE_IF("crash_innodb_before_commit",
				DBUG_SUICIDE(););

		/* Run the fast part of commit if we did not already. */
		if (!trx_is_active_commit_ordered(trx)) {
			innobase_commit_ordered_2(trx, thd);
		}

		/* We were instructed to commit the whole transaction, or
		this is an SQL statement end and autocommit is on */

		/* At this point commit order is fixed and transaction is
		visible to others. So we can wakeup other commits waiting for
		this one, to allow then to group commit with us. */
histogram:
n_active histogram = {
0, 0
1, 145158
2, 0
3, 0
...
}
while this part of the code runs in parallel, most likely due to the first call it makes:

		thd_wakeup_subsequent_commits(thd, 0);

		/* Now do a write + flush of logs. */
		trx_commit_complete_for_mysql(trx);

		trx_deregister_from_2pc(trx);
	} else {
|
histogram:
n_active histogram = {
0, 0
1, 39488
2, 29460
3, 20922
4, 14941
5, 10823
6, 7922
7, 5884
8, 4360
9, 3136
10, 2266
11, 1659
12, 1234
13, 880
14, 632
15, 479
16, 360
17, 275
18, 212
19, 150
20, 75
21, 0
22, 0
23, 0
...
}
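For completeness: this kind of n_active histogram can be collected with a small entry/exit counter around the section of interest. The sketch below is a generic reconstruction of the idea in Python (the actual tracing code added to the server for these runs is not shown in these notes):

```python
import threading

class ConcurrencyHistogram:
    """On each entry to the instrumented section, record how many
    threads (including this one) are currently inside it."""
    def __init__(self, max_threads=32):
        self.lock = threading.Lock()
        self.n_active = 0
        self.hist = [0] * (max_threads + 1)

    def __enter__(self):
        with self.lock:
            self.n_active += 1
            self.hist[self.n_active] += 1

    def __exit__(self, *exc):
        with self.lock:
            self.n_active -= 1

h = ConcurrencyHistogram()

def worker():
    for _ in range(1000):
        with h:        # the instrumented section, e.g. around the commit code
            pass

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()

print(sum(h.hist))     # 4000 samples, one per section entry
```

Each entry increments exactly one bucket, so the bucket counts sum to the total number of entries, matching how counter_value relates to the histograms above.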
|
|
|
Filed MDEV-15372 (Parallel slave speedup very limited when log_slave_updates=OFF); now testing a fix for it.
|
|
The RC release was made, so closing this task.
|