[MDEV-13739] Huge slowness at inserting rows, CPU-bound Created: 2017-09-05  Updated: 2018-07-06

Status: Confirmed
Project: MariaDB Server
Component/s: Storage Engine - RocksDB
Affects Version/s: 10.2.6, 10.2.8
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Julien Muchembled Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File general.log.gz    
Issue Links:
Relates
relates to MDEV-14460 Memory leak with only SELECT statements Closed

 Description   

Test case:

  • DB almost empty
  • 2 transactions with thousands queries inserting 1 row, some of them result in IntegrityError (DUP_ENTRY):
    • txn1: 18645 inserts, including 7467 DUP_ENTRY
    • txn2: 23178 inserts, including 9833 DUP_ENTRY
  • the other queries in the transactions are SELECT, but whenever SHOW PROCESSLIST returns something, it's an INSERT
  • final DB size: 20 MB

RocksDB:

  • default configuration
  • txn1: 4min 25s
  • txn2: 10min 10s

TokuDB:

  • tokudb_cache_size explicitely set to 512M to match RocksDB
  • txn1: 11s
  • txn2: 15s

perf shows a lot of time spent in rollback, but the application does not do any explicitely

+ 99.43% 0.00% mysqld libpthread-2.13.so [.] start_thread 
+ 99.38% 0.00% mysqld mysqld [.] handle_one_connection 
+ 99.38% 0.00% mysqld mysqld [.] do_handle_one_connection(CONNECT*) 
+ 99.37% 0.00% mysqld mysqld [.] do_command(THD*) 
+ 99.36% 0.01% mysqld mysqld [.] dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) 
+ 99.27% 0.01% mysqld mysqld [.] mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) 
+ 98.95% 0.01% mysqld mysqld [.] mysql_execute_command(THD*) 
+ 97.48% 0.00% mysqld mysqld [.] trans_rollback_stmt(THD*) 
+ 97.48% 0.00% mysqld mysqld [.] ha_rollback_trans(THD*, bool) 
+ 97.48% 0.00% mysqld ha_rocksdb.so [.] 0x00000000001a44ad 
+ 97.48% 0.00% mysqld ha_rocksdb.so [.] myrocks::Rdb_transaction_impl::rollback_stmt() 
+ 97.48% 0.00% mysqld ha_rocksdb.so [.] rocksdb::TransactionImpl::RollbackToSavePoint() 
+ 97.47% 0.01% mysqld ha_rocksdb.so [.] rocksdb::TransactionBaseImpl::RollbackToSavePoint() 
+ 97.46% 0.00% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::RollbackToSavePoint() 
+ 97.35% 0.28% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::Rep::ReBuildIndex() 
+ 67.90% 32.38% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchEntryComparator::operator()(rocksdb::WriteBatchIndexEntry const*, rocksdb::WriteBatchIndexEntry const*) const 
+ 51.86% 0.51% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::Rep::UpdateExistingEntryWithCfId(unsigned int, rocksdb::Slice const&) 
+ 43.25% 0.32% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::Rep::AddNewEntry(unsigned int) 
+ 42.45% 9.94% mysqld ha_rocksdb.so [.] rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::WriteBatchEntryComparator const&>::FindGreaterOrEqual(rocksdb::WriteBatchIndexEntry* const&) const 
+ 42.14% 0.77% mysqld ha_rocksdb.so [.] rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::WriteBatchEntryComparator const&>::Insert(rocksdb::WriteBatchIndexEntry* const&) 
+ 34.56% 5.20% mysqld ha_rocksdb.so [.] rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::WriteBatchEntryComparator const&>::FindLessThan(rocksdb::WriteBatchIndexEntry* const&, rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::Wr
+ 11.24% 11.20% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchEntryComparator::CompareKey(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&) const 
+ 9.29% 9.26% mysqld ha_rocksdb.so [.] myrocks::Rdb_pk_comparator::Compare(rocksdb::Slice const&, rocksdb::Slice const&) const 
+ 5.20% 5.18% mysqld libc-2.13.so [.] 0x0000000000138670



 Comments   
Comment by Julien Muchembled [ 2017-09-05 ]

The attached general log dumps the executed requests for a few seconds.

The code of our software can be found at https://lab.nexedi.com/nexedi/neoppod/blob/master/neo/storage/database/mysqldb.py

  • _setup method (line 175) for schemas
  • storeData method (line 549, only the try...except matters) for the inserts

In mysql_execute_command, I added prints around trans_rollback_stmt, and it is really called as many as times as there are DUP_ENTRY. At the beginning, trans_rollback_stmt is fast and then it gets slower and slower: up to 100ms. If I sum up everything, all the time is really spent there.

Comment by Andrii Nikitin (Inactive) [ 2017-09-09 ]

I can confirm that I can easily observe problem behavior when long transaction meets many 'duplicate key' errors :

engine=${1:-rocksdb}
inserts=100000
wrap=10000
 
mysql -e 'drop table if exists t' test
mysql -e "create table t(a int primary key) engine=$engine" test
 
rm -f l.sql
 
for (( i=1; i<$inserts; i++ )) ; do
  echo "insert into t select $i%$wrap;" >> l.sql
done
 
mysql -e 'select version()'
mysql -e 'show global status where variable_name in ("Com_insert_select","Handler_commit","Handler_rollback")'
mysql --force -e 'start transaction; source l.sql; commit;' test &>test.log
# should show 100K commands executed
mysql -e 'show global status where variable_name in ("Com_insert_select","Handler_commit","Handler_rollback")'

(It also worth to note that MyRocks does show even worse performance in my tests, so the problem is not specific to MariaDB .)

Long transactions is not the strongest side of RocksDB engine, so it is kind of expected behavior.
Nevertheless, I've opened https://github.com/facebook/mysql-5.6/issues/702 to get confirmation from MyRocks developers.

Did you consider to execute SQL command below (in session before import) to solve performance problem? (assuming that you understand consequences):

set rocksdb_commit_in_the_middle=1;

Comment by Sergei Petrunia [ 2017-09-12 ]

Also filed against upstream as https://github.com/facebook/mysql-5.6/issues/702

Comment by Sergei Petrunia [ 2017-09-20 ]

Did a quick investigation and posted the outcome in issue #702. It's fixable but it will take more than a few hours.

Comment by Sergei Petrunia [ 2018-04-19 ]

Work with savepoints has been re-worked in MDEV-14460.

Comment by Otto Kekäläinen [ 2018-07-06 ]

psergey Any updates here?

Generated at Thu Feb 08 08:07:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.