[MDEV-8687] CREATE TEMPORARY TABLE vs plain SELECT causes 280x perf drop: InnoDB takes unneeded locks Created: 2015-08-28  Updated: 2015-10-23  Resolved: 2015-10-23

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.0, 10.1
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Stewart Smith Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug Votes: 0
Labels: innodb
Environment:

ppc64el


Attachments: HTML File psdoit     HTML File sql    
Issue Links:
Relates
relates to MDEV-8684 UT_RELAX_CPU isn't at all relaxing Closed
Sprint: 10.0.22

 Description   

With the attached SQL and psdoit script to run queries concurrently, you can see that the CREATE TEMPORARY TABLE part of the query causes concurrency to tank. The related issue of MDEV-8684 is for problems in the InnoDB mutex code, while this bug is for the fact that InnoDB is down in lock_rec_convert_impl_to_exp() when it probably shouldn't be.

It appears that CREATE TEMPORARY TABLE is triggering InnoDB to go and do a whole bunch of locking that it doesn't do when you just run a plain SELECT.

The addition of CREATE TEMPORARY TABLE makes performance 280x worse.

 # mysql -u root < sql
 
 # time ./psdoit 1
 0.022s
 
 # time ./psdoit 10
 0.112s
 
 # time ./psdoit 100
 7.150s
 
 # time ./psdoit 200
 30.422s
 
 # time ./psdoit 400
 190.378s

Things get very bad, very fast. We spend almost all our time in a locking
food fight:

     37.42%  mysqld         mysqld              [.] ut_delay(unsigned long)                                                                                     
		    |
		    ---ut_delay(unsigned long)
		       |          
		       |--99.74%-- mutex_spin_wait(void*, bool, char const*, unsigned long)
		       |          |          
		       |          |--74.70%-- pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long) [clone .constprop.71]
		       |          |          lock_rec_convert_impl_to_expl(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const*)
		       |          |          lock_clust_rec_read_check_and_lock(unsigned long, buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const
		       |          |          sel_set_rec_lock(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const*, unsigned long, unsigned long, 
		       |          |          row_search_for_mysql(unsigned char*, unsigned long, row_prebuilt_t*, unsigned long, unsigned long)
		       |          |          |          
		       |          |          |--76.07%-- ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)
		       |          |          |          handler::index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)
		       |          |          |          handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)
		       |          |          |          join_read_key2(THD*, st_join_table*, TABLE*, st_table_ref*)
		       |          |          |          sub_select(JOIN*, st_join_table*, bool)
		       |          |          |

and:

     27.96%  mysqld         [kernel.kallsyms]   [k] _raw_spin_lock                                                                                   
		    |
		    ---_raw_spin_lock
		       |          
		       |--58.11%-- futex_wait_setup
		       |          |          
		       |          |--99.96%-- 0
		       |          |          futex_wait
		       |          |          do_futex
		       |          |          sys_futex
		       |          |          system_call
		       |          |          |          
		       |          |          |--64.91%-- __lll_lock_wait
		       |          |          |          |          
		       |          |          |          |--83.30%-- pthread_mutex_lock
		       |          |          |          |          |          
		       |          |          |          |          |--52.72%-- os_event_reset(os_event*)
		       |          |          |          |          |          sync_array_reserve_cell(sync_array_t*, void*, unsigned long, char const*, unsigned long, unsi
		       |          |          |          |          |          mutex_spin_wait(void*, bool, char const*, unsigned long)
		       |          |          |          |          |          |          
		       |          |          |          |          |          |--75.08%-- pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long) [clone .constprop.7
		       |          |          |          |          |          |          lock_rec_convert_impl_to_expl(buf_block_t const*, unsigned char const*, dict_index
		       |          |          |          |          |          |          lock_clust_rec_read_check_and_lock(unsigned long, buf_block_t const*, unsigned cha
		       |          |          |          |          |          |          sel_set_rec_lock(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned
		       |          |          |          |          |          |          row_search_for_mysql(unsigned char*, unsigned long, row_prebuilt_t*, unsigned long 



 Comments   
Comment by Stewart Smith [ 2015-08-31 ]

One possibility as to what's going on is that InnoDB is taking locks as if statement based binary logging is enabled.

I'd argue that in this day and age, it should do so based on mixed mode replication rather than statement based, although I haven't looked at what the default binlog-format is in modern MariaDB (although there's probably a strong case for mixed mode if it isn't already)

Comment by Elena Stepanova [ 2015-09-01 ]
  1 10 100 200 400
MariaDB 5.5.45 0.019 0.084 2.160 14.047 60.398
MySQL 5.5.45 0.022 0.084 1.085 3.891 41.689
MariaDB 10.0.21 0.026 0.097 1.658 20.697 163.562
MySQL 5.6.26 0.043 0.167 1.975 7.036 82.006
MariaDB 10.1.6 0.020 0.075 1.973 27.845 152.873
MySQL 5.7.8 0.043 0.201 2.147 4.043 21.544

Small values are too volatile for meaningful comparison. High values are better on MySQL everywhere, on 5.7 they are quite impressive (but even on 5.6 they are much better than 10.0/10.1).

Comment by Sergey Vojtovich [ 2015-09-08 ]

jplindst, could you check this? Should we merge fix from MySQL?

Comment by Jan Lindström (Inactive) [ 2015-10-23 ]

This is not a bug as default isolation level is REPEATABLE_READ. For plain select we do not need to take row locks, instead a consistent read is enough. But for CREATE [TEMPORARY] TABLE ... SELECT we need to take row locks to maintain selected range consistent also remember that InnoDB does not know is binary log used or not only for innodb-locks-unsafe-for-binlog case is optimized.

Generated at Thu Feb 08 07:29:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.