Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19749

MDL scalability regression after backup locks

Details

    Description

      In 7a9dfdd GLOBAL and COMMIT namespaces were combined into BACKUP, which doubled load on BACKUP lock mutex.

      Can be fixed by implementing something similar to MySQL WL#7306 "Improve MDL performance and scalability by implementing lock-free lock acquisition for DML".

      Attachments

        Issue Links

          Activity

            I am working on the following solution (which is very much like Marko's suggestion) to remove some of the issues with BACKUP STAGE BLOCK_COMMIT;

            Adding before MDL_REQUEST_INIT() in ha_commit_trans()

            If (!(thd->backup_state= backup_is_running))

            { my_atomic_add32(&thd_in_commit, 1, MY_MEMORY_ORDER_RELAXED)) }

            else

            { MDL_REQUEST_INIT(...) }

            and when releasing:

            if (mdl_backup.ticket)

            { .... }

            else if (thd->backup_state)

            { my_atomic_add32(&thd_in_commit, -1, MY_MEMORY_ORDER_RELAXED)) thd->backup_state= 0; }

            If we have the above, we could in BACKUP_STAGE_START do something like

            backup_running= 1;
            do

            { while (thd_in_commit) sleep(1); }

            while (some_thd_has_backup_state_set())

            The effect would be to replace the MDL_backup lock in ha_commit_trans() with two atomic increment and two thread-local memory assignments when backup is not running. The cost is a slightly slower start of the backup.

            monty Michael Widenius added a comment - I am working on the following solution (which is very much like Marko's suggestion) to remove some of the issues with BACKUP STAGE BLOCK_COMMIT; Adding before MDL_REQUEST_INIT() in ha_commit_trans() If (!(thd->backup_state= backup_is_running)) { my_atomic_add32(&thd_in_commit, 1, MY_MEMORY_ORDER_RELAXED)) } else { MDL_REQUEST_INIT(...) } and when releasing: if (mdl_backup.ticket) { .... } else if (thd->backup_state) { my_atomic_add32(&thd_in_commit, -1, MY_MEMORY_ORDER_RELAXED)) thd->backup_state= 0; } If we have the above, we could in BACKUP_STAGE_START do something like backup_running= 1; do { while (thd_in_commit) sleep(1); } while (some_thd_has_backup_state_set()) The effect would be to replace the MDL_backup lock in ha_commit_trans() with two atomic increment and two thread-local memory assignments when backup is not running. The cost is a slightly slower start of the backup.

            monty, my idea was that the "fast" code path (no backup is running) would not run any atomic read-modify-write operations. This idea would work if the logic that handles the BACKUP statement could somehow count all active THD objects and wait until the atomic counter has reached that value, instead of the counter being 0.

            marko Marko Mäkelä added a comment - monty , my idea was that the "fast" code path (no backup is running) would not run any atomic read-modify-write operations. This idea would work if the logic that handles the BACKUP statement could somehow count all active THD objects and wait until the atomic counter has reached that value, instead of the counter being 0.

            Pushed to bb-10.11-monty for testing

            monty Michael Widenius added a comment - Pushed to bb-10.11-monty for testing

            monty, I revised the logic a little and created https://github.com/MariaDB/server/pull/3966 for this, currently as a draft. It seems to me that the global atomic counter may be redundant. I am still reviewing the logic for potential race conditions.

            marko Marko Mäkelä added a comment - monty , I revised the logic a little and created https://github.com/MariaDB/server/pull/3966 for this, currently as a draft. It seems to me that the global atomic counter may be redundant. I am still reviewing the logic for potential race conditions.
            marko Marko Mäkelä added a comment -

            As far as I understand from the preliminary performance test results, monty replaced the MDL_key::BACKUP acquisition in ha_commit_trans() but not in open_tables(), where the relevant section is starting as follows:

                enum enum_mdl_type mdl_type= MDL_BACKUP_DML;
             
                if (table->s->table_category != TABLE_CATEGORY_USER)
                  mdl_type= MDL_BACKUP_SYS_DML;
                else if (table->s->online_backup)
                  mdl_type= MDL_BACKUP_TRANS_DML;
             
                if (table_list->mdl_request.is_write_lock_request() &&
                    ! (flags & (MYSQL_OPEN_IGNORE_GLOBAL_READ_LOCK |
                                MYSQL_OPEN_FORCE_SHARED_MDL |
                                MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL |
                                MYSQL_OPEN_SKIP_SCOPED_MDL_LOCK)) &&
                    ! ot_ctx->has_protection_against_grl(mdl_type))
                {
                  MDL_request protection_request;
                  MDL_deadlock_handler mdl_deadlock_handler(ot_ctx);
             
                  if (thd->has_read_only_protection())
                  {
                    MYSQL_UNBIND_TABLE(table->file);
                    table->vcol_cleanup_expr(thd);
                    tc_release_table(table);
                    DBUG_RETURN(TRUE);
                  }
             
                  MDL_REQUEST_INIT(&protection_request, MDL_key::BACKUP, "", "", mdl_type,
                                   MDL_STATEMENT);
            

            It seems to me that it could be useful to check flags and possibly the presence of any pending BACKUP STAGE or FLUSH TABLES WITH READ LOCK request before assigning mdl_type or executing the rest of the above logic.

            marko Marko Mäkelä added a comment - As far as I understand from the preliminary performance test results, monty replaced the MDL_key::BACKUP acquisition in ha_commit_trans() but not in open_tables() , where the relevant section is starting as follows: enum enum_mdl_type mdl_type= MDL_BACKUP_DML;   if (table->s->table_category != TABLE_CATEGORY_USER) mdl_type= MDL_BACKUP_SYS_DML; else if (table->s->online_backup) mdl_type= MDL_BACKUP_TRANS_DML;   if (table_list->mdl_request.is_write_lock_request() && ! (flags & (MYSQL_OPEN_IGNORE_GLOBAL_READ_LOCK | MYSQL_OPEN_FORCE_SHARED_MDL | MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL | MYSQL_OPEN_SKIP_SCOPED_MDL_LOCK)) && ! ot_ctx->has_protection_against_grl(mdl_type)) { MDL_request protection_request; MDL_deadlock_handler mdl_deadlock_handler(ot_ctx);   if (thd->has_read_only_protection()) { MYSQL_UNBIND_TABLE(table->file); table->vcol_cleanup_expr(thd); tc_release_table(table); DBUG_RETURN(TRUE); }   MDL_REQUEST_INIT(&protection_request, MDL_key::BACKUP, "" , "" , mdl_type, MDL_STATEMENT); It seems to me that it could be useful to check flags and possibly the presence of any pending BACKUP STAGE or FLUSH TABLES WITH READ LOCK request before assigning mdl_type or executing the rest of the above logic.

            People

              monty Michael Widenius
              svoj Sergey Vojtovich
              Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.