[MDEV-15599] InnoDB operations significantly slower after upgrade to MariaDB 10x Created: 2018-03-19  Updated: 2019-01-28  Resolved: 2019-01-28

Status: Closed
Project: MariaDB Server
Component/s: Configuration
Affects Version/s: 10.1.24, 10.2.8
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Lori Patrick Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

CentOS 6.5 and 7.3


Attachments: File master_innodb_engine_status.out     File my.cnf.10-2-8     File my.cnf.5-5-34     File slave_innodb_engine_status.out    

 Description   

I recently upgraded my servers from 5.5.34 to 10x. For my hosts running CentOS 6x I went with 10.1.24 and hosts running CentOS 7x I went with 10.2.8.

I have noticed a slow down in certain innodb operations on both 10.1.24 and 10.2.8.

For example, I have a replication master that is running 5.5.34 and a delete operation took 20 seconds. When it replicated to the 10.2.8 slave, it took 70 seconds. Some additional details:

• Both servers had been rebooted and all caches flushed.
• No other connection activity was happening, so not a potential concurrency issue.
• Both CentOS 7.3 configured the same (swapiness = 1, numa-interleaving enabled, etc). • Hardware is identical
• my.cnf(s) are the same except on the 10.2.8 slave commented out: innodb_additional_mem_pool_size – thread_cache_size – thread_concurrency

Are there any config settings in 10x I can tweak that address this kind of performance issue?



 Comments   
Comment by Alice Sherepa [ 2018-03-20 ]

please add examples of the "slow" queries, EXPLAIN EXTENDED for those queries in 5.5 and 10.0 and you .cnf file(s)

Comment by Lori Patrick [ 2018-03-20 ]

I can't do an explain on a delete operation on 5.5.34, but for the 10.2.8 server it would be:

explain extended DELETE FROM audit_db_dropcopy.drop_copy_fill_events WHERE trade_date < "2017-03-20";
-------------------------------------------------------------------------------------------------

id select_type table type possible_keys key key_len ref rows filtered Extra

-------------------------------------------------------------------------------------------------

1 SIMPLE drop_copy_fill_events ALL NULL NULL NULL NULL 13860085 100.00 Using where

-------------------------------------------------------------------------------------------------

I am attaching the 5.5.34 and 10.2.8 my.cnf(s)

my.cnf.5-5-34 my.cnf.10-2-8

Comment by Alice Sherepa [ 2018-03-20 ]

please provide also output of SHOW CREATE TABLE audit_db_dropcopy.drop_copy_fill_events, so I will try to compare it on 5.5 and 10.2

Comment by Lori Patrick [ 2018-03-20 ]

CREATE TABLE `drop_copy_fill_events` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`tstamp` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP,
`curr_update_seq` bigint(20) unsigned NOT NULL DEFAULT '0',
`origin` varchar(50) NOT NULL DEFAULT '',
`order_id` varchar(64) DEFAULT NULL,
`order_state` varchar(30) NOT NULL DEFAULT '',
`event` varchar(30) NOT NULL DEFAULT '',
`gv_ts` bigint(20) unsigned NOT NULL DEFAULT '0',
`exchange_ts` bigint(20) unsigned NOT NULL DEFAULT '0',
`sent_ts` bigint(20) unsigned NOT NULL DEFAULT '0',
`ts` bigint(20) unsigned NOT NULL DEFAULT '0',
`exchange_order_id` varchar(50) DEFAULT '',
`ad_id` varchar(100) DEFAULT '',
`exchange` varchar(30) NOT NULL DEFAULT '',
`symbol` varchar(40) DEFAULT '',
`root_symbol` varchar(30) NOT NULL DEFAULT '',
`side` varchar(10) NOT NULL DEFAULT '',
`order_type` varchar(30) NOT NULL DEFAULT '',
`price` decimal(15,8) NOT NULL DEFAULT '0.00000000',
`quantity` int(10) unsigned NOT NULL DEFAULT '0',
`quantity_numerator` int(10) DEFAULT '0',
`quantity_denominator` int(10) DEFAULT '0',
`remaining_quantity` int(10) unsigned NOT NULL DEFAULT '0',
`user_id` varchar(20) NOT NULL DEFAULT '',
`context` varchar(20) NOT NULL DEFAULT '',
`ad_text` varchar(100) DEFAULT '',
`exchange_session` varchar(55) DEFAULT NULL,
`trader_id` varchar(20) DEFAULT '',
`fill_id` varchar(100) DEFAULT '',
`trade_id` varchar(30) DEFAULT NULL,
`match_id` varchar(20) DEFAULT NULL,
`total_filled` int(10) unsigned NOT NULL DEFAULT '0',
`trade_date` date DEFAULT '0000-00-00',
`leg_type` varchar(30) DEFAULT '',
`exchange_account` varchar(30) DEFAULT '',
`account_number` varchar(30) DEFAULT '',
`liquidity` varchar(20) DEFAULT '',
`busted` tinyint(4) DEFAULT '0',
`corrected` tinyint(4) DEFAULT '0',
`insert_ts` bigint(20) unsigned NOT NULL DEFAULT '0',
`ts_str` varchar(50) DEFAULT NULL,
`gv_ts_str` varchar(50) DEFAULT NULL,
`exchange_ts_str` varchar(50) DEFAULT NULL,
`sent_ts_str` varchar(50) DEFAULT NULL,
`value_date` date DEFAULT '0000-00-00',
`short_sell` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `user_idx` (`user_id`),
KEY `exchange_ts_idx` (`exchange_ts`),
KEY `fill_id_str` (`fill_id`),
KEY `exchange_order_id_idx` (`exchange_order_id`),
KEY `ts_str_idx` (`ts_str`),
KEY `tstamp_idx` (`tstamp`),
KEY `order_id_user_id_idx` (`order_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Comment by Alice Sherepa [ 2018-03-27 ]

Is it possible for you to compare execution time of the query without replication? And please provide output of "free -m" and SHOW ENGINE INNODB STATUS \G

Comment by Lori Patrick [ 2018-03-27 ]

I am not sure what you mean about without replication? When I noticed the problem, it ran it on each server with binary logging enabled, so those are time not thru replication but done locally on each host after a reboot without any other active threads. Do you need me to replicate in some other way?

If you want me to get the engine status and free -m when the delete runs, I can grab that on Sat after I reboot. These are production hosts I can't reboot until the weekend.

As of right this moment, the output of show engine innodb status \G is attached.

free -m output from master:
[root@ch4pxdbpts04 lpatrick]# free -m
total used free shared buff/cache available
Mem: 64150 22032 38487 43 3630 41510
Swap: 8191 0 8191

free -m output from slave:
[root@ch0pxdbpts05 lpatrick]# free -m
total used free shared buff/cache available
Mem: 64150 38921 21997 43 3230 24629
Swap: 8191 0 8191

Comment by Elena Stepanova [ 2018-12-31 ]

It looks like this one fell through the cracks, sorry about that.
Are you still having the problem?

The very first thing that comes to mind looking at the table structure, the query and the plan, is that both 20 and 70 seconds sound very lucky for a table with ~14M rows and without an index to use. Was it comparative timing for this very query, or could it be a different one?

Further, you didn't mention it, so just in case – is it certain that the structures and data are identical on both servers? Is it possible that the table on the master has less data or, for example, the table on the slave has a trigger which the master doesn't have (quite typical for slaves used for analytics)?

Regarding replication, one of the concerns is that, given binlog_format=mixed in your config, the server can rather easily switch from the STATEMENT format to ROW (it might even be caused not by this particular query, but by something that happened earlier in the session). You said that both master and slave execute the delete operation, but you didn't say that you saw the slave execute the delete query, so we can't rule it out yet. It would mean that instead of one statement removing N rows, your slave would execute N statements removing 1 row. It might even be not a bad thing, given the presence of PK and absence of an index on trade_date column, but it would definitely affect execution time, for better or worse (depending on N).

Generated at Thu Feb 08 08:22:36 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.