Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12009

Allow to force kill user threads/query which are flagged as high priority by Galera

Details

    Description

      See also: https://github.com/codership/galera/issues/448

      In some cases Galera can mark a user thread as high priority(TOI DDLs) making it unkillable (with misleading message at MDEV-12008)
      I think it should be allowed a way to kill such thread/query under the responsibility of the DBA.
      Syntax could be:
      KILL [CONNECTION | QUERY] processlist_id [FORCE]

      Attachments

        Issue Links

          Activity

            claudio.nanni Claudio Nanni created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Assignee Nirbhay Choubey [ nirbhay_c ]
            serg Sergei Golubchik made changes -
            Assignee Nirbhay Choubey [ nirbhay_c ]
            serg Sergei Golubchik made changes -
            Labels galera
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.1 [ 16100 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Assignee Seppo Jaakola [ seppo ]
            jplindst Jan Lindström (Inactive) made changes -
            Affects Version/s 10.1.21 [ 22113 ]
            Affects Version/s 5.5.54-galera [ 22404 ]
            Affects Version/s 10.0.29-galera [ 22405 ]
            Issue Type Bug [ 1 ] Task [ 3 ]

            This is a feature request. bar How hard it would be to add this new FORCE keyword for kill and does it require change on storage engine API or can it be added to current structures as a additional info ?

            jplindst Jan Lindström (Inactive) added a comment - - edited This is a feature request. bar How hard it would be to add this new FORCE keyword for kill and does it require change on storage engine API or can it be added to current structures as a additional info ?
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.1 [ 16100 ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Seppo Jaakola [ seppo ] Alexander Barkov [ bar ]
            bar Alexander Barkov added a comment - - edited

            jplindst From a glance it looks like something not hard to do.

            If I understand the code correctly, the change should be done in kill_one_thread() in this code:

                if (((thd->security_ctx->master_access & SUPER_ACL) ||
                    thd->security_ctx->user_matches(tmp->security_ctx)) &&
                    !wsrep_thd_is_BF(tmp, false))
                {
                  tmp->awake_no_mutex(kill_signal);
                  error=0;
                }
                else
                  error= (type == KILL_TYPE_QUERY ? ER_KILL_QUERY_DENIED_ERROR :
                                                    ER_KILL_DENIED_ERROR);
            

            Most likely it is wsrep_thd_is_BF() who decides if the thread is kill-able.
            But I'm not fully sure. There are no comments for wsrep_thd_is_BF() in the code.

            I suggest you discuss with psergey and serg.

            When adding a new FORCE flag, please consider introducing a new class Sql_cmd_kill instead of adding this flag directly to LEX, as well ass moving LEX::kill_type and LEX::kill_signal into this new class Sql_cmd_kill. I can help with this refactoring.

            bar Alexander Barkov added a comment - - edited jplindst From a glance it looks like something not hard to do. If I understand the code correctly, the change should be done in kill_one_thread() in this code: if (((thd->security_ctx->master_access & SUPER_ACL) || thd->security_ctx->user_matches(tmp->security_ctx)) && !wsrep_thd_is_BF(tmp, false )) { tmp->awake_no_mutex(kill_signal); error=0; } else error= (type == KILL_TYPE_QUERY ? ER_KILL_QUERY_DENIED_ERROR : ER_KILL_DENIED_ERROR); Most likely it is wsrep_thd_is_BF() who decides if the thread is kill-able. But I'm not fully sure. There are no comments for wsrep_thd_is_BF() in the code. I suggest you discuss with psergey and serg . When adding a new FORCE flag, please consider introducing a new class Sql_cmd_kill instead of adding this flag directly to LEX, as well ass moving LEX::kill_type and LEX::kill_signal into this new class Sql_cmd_kill. I can help with this refactoring.
            serg Sergei Golubchik made changes -
            Description See also: https://github.com/codership/galera/issues/448

            In some cases Galera can mark a user thread as high priority(TOI DDLs) making it unkillable (with misleading message at https://jira.mariadb.org/browse/MDEV-12008)
            I think it should be allowed a way to kill such thread/query under the responsibility of the DBA.
            Syntax could be:
            KILL [CONNECTION | QUERY] processlist_id [FORCE]
            See also: https://github.com/codership/galera/issues/448

            In some cases Galera can mark a user thread as high priority(TOI DDLs) making it unkillable (with misleading message at MDEV-12008)
            I think it should be allowed a way to kill such thread/query under the responsibility of the DBA.
            Syntax could be:
            KILL [CONNECTION | QUERY] processlist_id [FORCE]

            Why is this check wsrep_thd_is_BF there in the first place? It directly contradicts the comment above it, that says

                /*
                  If we're SUPER, we can KILL anything, including system-threads.
                  No further checks.
                  ...
              */
            

            serg Sergei Golubchik added a comment - Why is this check wsrep_thd_is_BF there in the first place? It directly contradicts the comment above it, that says /* If we're SUPER, we can KILL anything, including system-threads. No further checks. ... */
            serg Sergei Golubchik made changes -
            Assignee Alexander Barkov [ bar ] Jan Lindström [ jplindst ]

            ralf.gebhardt@mariadb.com, it's probably a bug. See Sergei's comment. Perhaps a super user should be able to kill such threads even without this new FORCE flag. In such case it's just 5 minutes.
            If we still want to add FORCE, then it is 4 hours.

            bar Alexander Barkov added a comment - ralf.gebhardt@mariadb.com , it's probably a bug. See Sergei's comment. Perhaps a super user should be able to kill such threads even without this new FORCE flag. In such case it's just 5 minutes. If we still want to add FORCE, then it is 4 hours.

            serg Does thd->security_ctx->user_matches(tmp->security_ctx)) match with current user or superuser ?

            jplindst Jan Lindström (Inactive) added a comment - - edited serg Does thd->security_ctx->user_matches(tmp->security_ctx)) match with current user or superuser ?

            thd->security_ctx->user_matches(tmp->security_ctx)) means that thd->security_ctx has the same user as tmp->security_ctx.
            thd->security_ctx is the current user.
            But I don't see how it's relevant for the current issue.

            serg Sergei Golubchik added a comment - thd->security_ctx->user_matches(tmp->security_ctx)) means that thd->security_ctx has the same user as tmp->security_ctx . thd->security_ctx is the current user. But I don't see how it's relevant for the current issue.
            jplindst Jan Lindström (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.4 [ 22408 ]

            I will allow SUPER to kill both high priority transactions as well as wsrep applier threads as it should be.

            jplindst Jan Lindström (Inactive) added a comment - I will allow SUPER to kill both high priority transactions as well as wsrep applier threads as it should be.
            jplindst Jan Lindström (Inactive) made changes -
            issue.field.resolutiondate 2019-03-28 08:52:40.0 2019-03-28 08:52:40.223
            jplindst Jan Lindström (Inactive) made changes -
            Component/s Galera [ 10124 ]
            Fix Version/s 10.1.39 [ 23305 ]
            Fix Version/s 10.2.24 [ 23308 ]
            Fix Version/s 10.3.14 [ 23216 ]
            Fix Version/s 10.4.4 [ 23310 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]

            I reverted this change, because I think that it depends on a correct fix of MDEV-18464.

            marko Marko Mäkelä added a comment - I reverted this change , because I think that it depends on a correct fix of MDEV-18464 .
            marko Marko Mäkelä made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Stalled [ 10000 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.3.14 [ 23216 ]
            Fix Version/s 10.1.39 [ 23305 ]
            Fix Version/s 10.2.24 [ 23308 ]
            Fix Version/s 10.4.4 [ 23310 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.1 [ 16100 ]

            Killing wsrep high priority threads is dangerous and could lead to cluster inconsistent state. Thus it should not be allowed.

            jplindst Jan Lindström (Inactive) added a comment - Killing wsrep high priority threads is dangerous and could lead to cluster inconsistent state. Thus it should not be allowed.
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Resolution Won't Fix [ 2 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 79546 ] MariaDB v4 [ 133122 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 168417

            People

              jplindst Jan Lindström (Inactive)
              claudio.nanni Claudio Nanni
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.