MariaDB Server · MDEV-13739

Huge slowness at inserting rows, CPU-bound

Details

    Description

      Test case:

      • DB almost empty
      • 2 transactions with thousands of queries, each inserting 1 row; some of them result in IntegrityError (DUP_ENTRY):
        • txn1: 18645 inserts, including 7467 DUP_ENTRY
        • txn2: 23178 inserts, including 9833 DUP_ENTRY
      • the other queries in the transactions are SELECTs, but whenever SHOW PROCESSLIST returns something, it is an INSERT
      • final DB size: 20 MB

      RocksDB:

      • default configuration
      • txn1: 4min 25s
      • txn2: 10min 10s

      TokuDB:

      • tokudb_cache_size explicitly set to 512M to match RocksDB
      • txn1: 11s
      • txn2: 15s
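      The workload above can be reduced to the following pattern (a minimal sketch using Python's sqlite3 module as a stand-in for the MySQL client; the table name, key distribution, and counts are illustrative assumptions, not taken from the application):

      ```python
      # Sketch of the reported workload: many single-row INSERTs inside one long
      # transaction, where duplicate keys raise IntegrityError and the
      # application catches the error and carries on. sqlite3 is only a
      # stand-in here; the real application talks to MariaDB.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value BLOB)")

      inserted = dup_entry = 0
      with conn:  # one long transaction
          for i in range(1000):
              try:
                  # i % 600 makes ~40% of keys collide, mimicking the
                  # DUP_ENTRY ratio in the report
                  conn.execute("INSERT INTO data (id, value) VALUES (?, ?)",
                               (i % 600, b"payload"))
                  inserted += 1
              except sqlite3.IntegrityError:
                  dup_entry += 1  # failed statement rolls back, txn continues

      print(inserted, dup_entry)  # prints: 600 400
      ```

      Each caught IntegrityError corresponds to a per-statement rollback on the server side, which is where the engines diverge so sharply.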

      perf shows a lot of time spent in rollback, although the application never issues an explicit rollback:

      + 99.43% 0.00% mysqld libpthread-2.13.so [.] start_thread 
      + 99.38% 0.00% mysqld mysqld [.] handle_one_connection 
      + 99.38% 0.00% mysqld mysqld [.] do_handle_one_connection(CONNECT*) 
      + 99.37% 0.00% mysqld mysqld [.] do_command(THD*) 
      + 99.36% 0.01% mysqld mysqld [.] dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) 
      + 99.27% 0.01% mysqld mysqld [.] mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) 
      + 98.95% 0.01% mysqld mysqld [.] mysql_execute_command(THD*) 
      + 97.48% 0.00% mysqld mysqld [.] trans_rollback_stmt(THD*) 
      + 97.48% 0.00% mysqld mysqld [.] ha_rollback_trans(THD*, bool) 
      + 97.48% 0.00% mysqld ha_rocksdb.so [.] 0x00000000001a44ad 
      + 97.48% 0.00% mysqld ha_rocksdb.so [.] myrocks::Rdb_transaction_impl::rollback_stmt() 
      + 97.48% 0.00% mysqld ha_rocksdb.so [.] rocksdb::TransactionImpl::RollbackToSavePoint() 
      + 97.47% 0.01% mysqld ha_rocksdb.so [.] rocksdb::TransactionBaseImpl::RollbackToSavePoint() 
      + 97.46% 0.00% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::RollbackToSavePoint() 
      + 97.35% 0.28% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::Rep::ReBuildIndex() 
      + 67.90% 32.38% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchEntryComparator::operator()(rocksdb::WriteBatchIndexEntry const*, rocksdb::WriteBatchIndexEntry const*) const 
      + 51.86% 0.51% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::Rep::UpdateExistingEntryWithCfId(unsigned int, rocksdb::Slice const&) 
      + 43.25% 0.32% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchWithIndex::Rep::AddNewEntry(unsigned int) 
      + 42.45% 9.94% mysqld ha_rocksdb.so [.] rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::WriteBatchEntryComparator const&>::FindGreaterOrEqual(rocksdb::WriteBatchIndexEntry* const&) const 
      + 42.14% 0.77% mysqld ha_rocksdb.so [.] rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::WriteBatchEntryComparator const&>::Insert(rocksdb::WriteBatchIndexEntry* const&) 
      + 34.56% 5.20% mysqld ha_rocksdb.so [.] rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::WriteBatchEntryComparator const&>::FindLessThan(rocksdb::WriteBatchIndexEntry* const&, rocksdb::SkipList<rocksdb::WriteBatchIndexEntry*, rocksdb::Wr
      + 11.24% 11.20% mysqld ha_rocksdb.so [.] rocksdb::WriteBatchEntryComparator::CompareKey(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&) const 
      + 9.29% 9.26% mysqld ha_rocksdb.so [.] myrocks::Rdb_pk_comparator::Compare(rocksdb::Slice const&, rocksdb::Slice const&) const 
      + 5.20% 5.18% mysqld libc-2.13.so [.] 0x0000000000138670
      

      Attachments

      Issue Links

      Activity

             The attached general log captures the executed queries over a few seconds.

            The code of our software can be found at https://lab.nexedi.com/nexedi/neoppod/blob/master/neo/storage/database/mysqldb.py

            • _setup method (line 175) for schemas
            • storeData method (line 549, only the try...except matters) for the inserts

             In mysql_execute_command, I added prints around trans_rollback_stmt, and it is indeed called as many times as there are DUP_ENTRY errors. At the beginning trans_rollback_stmt is fast, but then it gets slower and slower: up to 100 ms per call. Summing everything up, all the time really is spent there.
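             The slowdown described above matches the profile: rocksdb::WriteBatchWithIndex::Rep::ReBuildIndex() appears to rebuild the whole write-batch index from scratch on each statement rollback, so the i-th failed statement pays a cost proportional to the batch size accumulated so far, and n failures inside one transaction cost O(n²) overall. A toy cost model of that asymptotic behavior (an illustration only, not the RocksDB code; the truncate-to-savepoint alternative is hypothetical):

             ```python
             # Toy model: a rollback that rebuilds the whole write-batch index
             # (what the profile shows) vs. a hypothetical rollback that only
             # truncates the batch back to the last savepoint.
             def cost_rebuild(n_failures, entries_per_stmt=1):
                 # i-th rollback re-inserts every entry accumulated so far: O(i) work
                 return sum(i * entries_per_stmt for i in range(1, n_failures + 1))

             def cost_truncate(n_failures, entries_per_stmt=1):
                 # truncating to a savepoint touches only the failed statement's entries
                 return n_failures * entries_per_stmt

             print(cost_rebuild(7467), cost_truncate(7467))  # prints: 27881778 7467
             ```

             With the ratio from txn1 (7467 DUP_ENTRY), the rebuild strategy does several thousand times more index work than a truncate-style strategy would; this is consistent with the per-rollback time growing from negligible to ~100 ms over the transaction.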

             jmuchemb Julien Muchembled added the comment above.

             I can confirm that I can easily observe the problem behavior when a long transaction meets many 'duplicate key' errors:

            engine=${1:-rocksdb}
            inserts=100000
            wrap=10000
             
            mysql -e 'drop table if exists t' test
            mysql -e "create table t(a int primary key) engine=$engine" test
             
            rm -f l.sql
             
            for (( i=1; i<=$inserts; i++ )) ; do
              echo "insert into t select $i%$wrap;" >> l.sql
            done
             
            mysql -e 'select version()'
            mysql -e 'show global status where variable_name in ("Com_insert_select","Handler_commit","Handler_rollback")'
            mysql --force -e 'start transaction; source l.sql; commit;' test &>test.log
            # should show 100K commands executed
            mysql -e 'show global status where variable_name in ("Com_insert_select","Handler_commit","Handler_rollback")'
            

             (It is also worth noting that MyRocks shows even worse performance in my tests, so the problem is not specific to MariaDB.)

             Long transactions are not the strongest side of the RocksDB engine, so this is somewhat expected behavior.
            Nevertheless, I've opened https://github.com/facebook/mysql-5.6/issues/702 to get confirmation from MyRocks developers.

             Did you consider executing the SQL command below (in the session, before the import) to work around the performance problem? (Assuming you understand the consequences.)

            set rocksdb_commit_in_the_middle=1;
            

             anikitin Andrii Nikitin (Inactive) added the comment above.
            psergei Sergei Petrunia added a comment - Also filed against upstream as https://github.com/facebook/mysql-5.6/issues/702

            Did a quick investigation and posted the outcome in issue #702. It's fixable but it will take more than a few hours.

             psergei Sergei Petrunia added the comment above.

            Work with savepoints has been re-worked in MDEV-14460.

             psergei Sergei Petrunia added the comment above.

            psergey Any updates here?

             otto Otto Kekäläinen added the comment above.

            People

              psergei Sergei Petrunia
              jmuchemb Julien Muchembled
              Votes: 0
              Watchers: 9

