Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8687

CREATE TEMPORARY TABLE vs plain SELECT causes 280x perf drop: InnoDB takes unneeded locks

Details

    • 10.0.22

    Description

      With the attached SQL and psdoit script to run queries concurrently, you can see that the CREATE TEMPORARY TABLE part of the query causes concurrency to tank. The related issue of MDEV-8684 is for problems in the InnoDB mutex code, while this bug is for the fact that InnoDB is down in lock_rec_convert_impl_to_exp() when it probably shouldn't be.

      It appears that CREATE TEMPORARY TABLE is triggering InnoDB to go and do a whole bunch of locking that it doesn't do when you just run a plain SELECT.

      The addition of CREATE TEMPORARY TABLE makes performance 280x worse.

       # mysql -u root < sql
       
       # time ./psdoit 1
       0.022s
       
       # time ./psdoit 10
       0.112s
       
       # time ./psdoit 100
       7.150s
       
       # time ./psdoit 200
       30.422s
       
       # time ./psdoit 400
       190.378s

      Things get very bad, very fast. We spend almost all our time in a locking
      food fight:

           37.42%  mysqld         mysqld              [.] ut_delay(unsigned long)                                                                                     
      		    |
      		    ---ut_delay(unsigned long)
      		       |          
      		       |--99.74%-- mutex_spin_wait(void*, bool, char const*, unsigned long)
      		       |          |          
      		       |          |--74.70%-- pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long) [clone .constprop.71]
      		       |          |          lock_rec_convert_impl_to_expl(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const*)
      		       |          |          lock_clust_rec_read_check_and_lock(unsigned long, buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const
      		       |          |          sel_set_rec_lock(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const*, unsigned long, unsigned long, 
      		       |          |          row_search_for_mysql(unsigned char*, unsigned long, row_prebuilt_t*, unsigned long, unsigned long)
      		       |          |          |          
      		       |          |          |--76.07%-- ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)
      		       |          |          |          handler::index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)
      		       |          |          |          handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)
      		       |          |          |          join_read_key2(THD*, st_join_table*, TABLE*, st_table_ref*)
      		       |          |          |          sub_select(JOIN*, st_join_table*, bool)
      		       |          |          |

      and:

           27.96%  mysqld         [kernel.kallsyms]   [k] _raw_spin_lock                                                                                   
      		    |
      		    ---_raw_spin_lock
      		       |          
      		       |--58.11%-- futex_wait_setup
      		       |          |          
      		       |          |--99.96%-- 0
      		       |          |          futex_wait
      		       |          |          do_futex
      		       |          |          sys_futex
      		       |          |          system_call
      		       |          |          |          
      		       |          |          |--64.91%-- __lll_lock_wait
      		       |          |          |          |          
      		       |          |          |          |--83.30%-- pthread_mutex_lock
      		       |          |          |          |          |          
      		       |          |          |          |          |--52.72%-- os_event_reset(os_event*)
      		       |          |          |          |          |          sync_array_reserve_cell(sync_array_t*, void*, unsigned long, char const*, unsigned long, unsi
      		       |          |          |          |          |          mutex_spin_wait(void*, bool, char const*, unsigned long)
      		       |          |          |          |          |          |          
      		       |          |          |          |          |          |--75.08%-- pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long) [clone .constprop.7
      		       |          |          |          |          |          |          lock_rec_convert_impl_to_expl(buf_block_t const*, unsigned char const*, dict_index
      		       |          |          |          |          |          |          lock_clust_rec_read_check_and_lock(unsigned long, buf_block_t const*, unsigned cha
      		       |          |          |          |          |          |          sel_set_rec_lock(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned
      		       |          |          |          |          |          |          row_search_for_mysql(unsigned char*, unsigned long, row_prebuilt_t*, unsigned long 

      Attachments

        1. psdoit
          0.6 kB
        2. sql
          314 kB

        Issue Links

          Activity

            stewart-ibm Stewart Smith added a comment -

            One possibility as to what's going on is that InnoDB is taking locks as if statement based binary logging is enabled.

            I'd argue that in this day and age, it should do so based on mixed mode replication rather than statement based, although I haven't looked at what the default binlog-format is in modern MariaDB (although there's probably a strong case for mixed mode if it isn't already)

            stewart-ibm Stewart Smith added a comment - One possibility as to what's going on is that InnoDB is taking locks as if statement based binary logging is enabled. I'd argue that in this day and age, it should do so based on mixed mode replication rather than statement based, although I haven't looked at what the default binlog-format is in modern MariaDB (although there's probably a strong case for mixed mode if it isn't already)
              1 10 100 200 400
            MariaDB 5.5.45 0.019 0.084 2.160 14.047 60.398
            MySQL 5.5.45 0.022 0.084 1.085 3.891 41.689
            MariaDB 10.0.21 0.026 0.097 1.658 20.697 163.562
            MySQL 5.6.26 0.043 0.167 1.975 7.036 82.006
            MariaDB 10.1.6 0.020 0.075 1.973 27.845 152.873
            MySQL 5.7.8 0.043 0.201 2.147 4.043 21.544

            Small values are too volatile for meaningful comparison. High values are better on MySQL everywhere, on 5.7 they are quite impressive (but even on 5.6 they are much better than 10.0/10.1).

            elenst Elena Stepanova added a comment -   1 10 100 200 400 MariaDB 5.5.45 0.019 0.084 2.160 14.047 60.398 MySQL 5.5.45 0.022 0.084 1.085 3.891 41.689 MariaDB 10.0.21 0.026 0.097 1.658 20.697 163.562 MySQL 5.6.26 0.043 0.167 1.975 7.036 82.006 MariaDB 10.1.6 0.020 0.075 1.973 27.845 152.873 MySQL 5.7.8 0.043 0.201 2.147 4.043 21.544 Small values are too volatile for meaningful comparison. High values are better on MySQL everywhere, on 5.7 they are quite impressive (but even on 5.6 they are much better than 10.0/10.1).

            jplindst, could you check this? Should we merge fix from MySQL?

            svoj Sergey Vojtovich added a comment - jplindst , could you check this? Should we merge fix from MySQL?

            This is not a bug as default isolation level is REPEATABLE_READ. For plain select we do not need to take row locks, instead a consistent read is enough. But for CREATE [TEMPORARY] TABLE ... SELECT we need to take row locks to maintain selected range consistent also remember that InnoDB does not know is binary log used or not only for innodb-locks-unsafe-for-binlog case is optimized.

            jplindst Jan Lindström (Inactive) added a comment - This is not a bug as default isolation level is REPEATABLE_READ. For plain select we do not need to take row locks, instead a consistent read is enough. But for CREATE [TEMPORARY] TABLE ... SELECT we need to take row locks to maintain selected range consistent also remember that InnoDB does not know is binary log used or not only for innodb-locks-unsafe-for-binlog case is optimized.

            People

              jplindst Jan Lindström (Inactive)
              stewart-ibm Stewart Smith
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.