Details

    Description

      I've recently upgraded a few MariaDB Galera clusters from 10.1 to 10.2. Since this upgrade, MariaDB / Galera has been extremely unstable on every single cluster.

      One of the issues I'm seeing quite frequently is statements stuck in the "query end" state that never finish, which appears to hold the table metadata locks. This is compounded by the fact that simple SELECT statements apparently need a metadata lock on these same tables in order to complete.

      The processes stuck in "query end" cannot be killed, and because SELECT statements are waiting on metadata locks, they cannot complete either. The only "workaround", if you could call it that, is to take the entire cluster offline and start it up again with --wsrep-new-cluster.

      This is happening anywhere from every couple of hours to every few days, on clusters that previously ran for months at a time and only ever saw downtime during upgrades.

      Here is an example process list:

      MariaDB [(none)]> show processlist;
      +--------+-------------+--------------------+-----------+---------+--------+---------------------------------+------------------------------------------------------------------------------------------------------+----------+
      | Id     | User        | Host               | db        | Command | Time   | State                           | Info                                                                                                 | Progress |
      +--------+-------------+--------------------+-----------+---------+--------+---------------------------------+------------------------------------------------------------------------------------------------------+----------+
      |      2 | system user |                    | NULL      | Sleep   |   2913 | Waiting for table metadata lock | UPDATE `prfix_file` SET `file_views`=`file_views`+1 WHERE (`file_hash`=UNHEX('36bae2de8f3399e891a30b |    0.000 |
      |      1 | system user |                    | NULL      | Sleep   | 888882 | wsrep aborter idle              | NULL                                                                                                 |    0.000 |
      |      3 | system user |                    | NULL      | Daemon  |   NULL | InnoDB purge coordinator        | NULL                                                                                                 |    0.000 |
      |      5 | system user |                    | NULL      | Daemon  |   NULL | InnoDB purge worker             | NULL                                                                                                 |    0.000 |
      |      4 | system user |                    | NULL      | Daemon  |   NULL | InnoDB purge worker             | NULL                                                                                                 |    0.000 |
      |      6 | system user |                    | NULL      | Daemon  |   NULL | InnoDB purge worker             | NULL                                                                                                 |    0.000 |
      |      7 | system user |                    | NULL      | Daemon  |   NULL | InnoDB shutdown handler         | NULL                                                                                                 |    0.000 |
      | 485377 | user000     | xxx.xx.xx.53:11779 | database1 | Query   |   2913 | query end                       | INSERT INTO `prfix_user` (`user_name`) VALUES ('new user name')                                      |    0.000 |
      | 485430 | user000     | xxx.xx.xx.52:46210 | database1 | Query   |   2788 | query end                       | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='36964')              |    0.000 |
      | 485933 | user000     | xxx.xx.xx.52:47277 | database1 | Query   |   1410 | query end                       | DELETE FROM `prfix_session` WHERE (`access`<1503723562)                                              |    0.000 |
      | 485993 | user000     | xxx.xx.xx.54:44107 | database1 | Query   |   1242 | query end                       | INSERT INTO `prfix_anchor` (`anchor_source`, `anchor_dest`) VALUES ('https:/www.example.com/some123' |    0.000 |
      | 486023 | user000     | xxx.xx.xx.52:47490 | database1 | Query   |   1163 | query end                       | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='29042')              |    0.000 |
      | 486150 | user000     | xxx.xx.xx.51:60057 | database1 | Query   |    761 | query end                       | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='5240')               |    0.000 |
      | 486223 | user000     | xxx.xx.xx.52:48073 | database1 | Query   |    497 | query end                       | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='25304')              |    0.000 |
      | 486254 | user000     | xxx.xx.xx.51:60409 | database1 | Query   |    395 | query end                       | UPDATE `prfix_gallery` SET `gallery_views`=`gallery_views`+1 WHERE (`gallery_id`='114141')           |    0.000 |
      | 486285 | user000     | xxx.xx.xx.52:48281 | database1 | Query   |    269 | query end                       | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='13309')              |    0.000 |
      | 486360 | root        | localhost          | NULL      | Query   |      0 | init                            | show processlist                                                                                     |    0.000 |
      | 486363 | user000     | xxx.xx.xx.52:48581 | database1 | Query   |     47 | Waiting for table metadata lock | SELECT SQL_NO_CACHE ga.*, us.user_id, us.user_name, us.user_url, th.thumb_hash, th.file_hash, fl.fil |    0.000 |
      | 486364 | user000     | xxx.xx.xx.53:21831 | database1 | Query   |     46 | Waiting for table metadata lock | SELECT us.*, ga.*, th.thumb_hash FROM `prfix_gallery` AS `ga` LEFT JOIN `prfix_file_thumb` AS `th` O |    0.000 |
      | 486365 | user000     | xxx.xx.xx.51:60884 | database1 | Query   |     34 | Waiting for table metadata lock | SELECT us.*, ga.*, th.thumb_hash FROM `prfix_gallery` AS `ga` LEFT JOIN `prfix_file_thumb` AS `th` O |    0.000 |
      | 486366 | user000     | xxx.xx.xx.51:60887 | database1 | Query   |     31 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u |    0.000 |
      | 486373 | user000     | xxx.xx.xx.54:62339 | database1 | Query   |     26 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u |    0.000 |
      | 486375 | user000     | xxx.xx.xx.51:60907 | database1 | Query   |     17 | Waiting for table metadata lock | SELECT us.*, ga.*, th.thumb_hash FROM `prfix_gallery` AS `ga` LEFT JOIN `prfix_file_thumb` AS `th` O |    0.000 |
      | 486381 | user000     | xxx.xx.xx.51:60934 | database1 | Query   |      6 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u |    0.000 |
      +--------+-------------+--------------------+-----------+---------+--------+---------------------------------+------------------------------------------------------------------------------------------------------+----------+
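
For anyone triaging similar output, here is a minimal Python sketch that parses the text printed by `show processlist` and picks out threads that have been sitting in "query end" for a long time. The function names `parse_processlist` and `stuck_threads` and the 60-second threshold are assumptions made for illustration; they are not part of MariaDB or Galera. The sample rows are a subset copied from the process list above, including the client's truncated Info values.

```python
from collections import Counter

# A subset of rows copied from the process list in this report.
SAMPLE = """
+--------+-------------+--------------------+-----------+---------+--------+
| Id     | User        | Host               | db        | Command | Time   | State                           | Info | Progress |
+--------+-------------+--------------------+-----------+---------+--------+
|      1 | system user |                    | NULL      | Sleep   | 888882 | wsrep aborter idle              | NULL | 0.000 |
|      3 | system user |                    | NULL      | Daemon  |   NULL | InnoDB purge coordinator        | NULL | 0.000 |
| 485933 | user000     | xxx.xx.xx.52:47277 | database1 | Query   |   1410 | query end                       | DELETE FROM `prfix_session` WHERE (`access`<1503723562) | 0.000 |
| 486285 | user000     | xxx.xx.xx.52:48281 | database1 | Query   |    269 | query end                       | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='13309') | 0.000 |
| 486360 | root        | localhost          | NULL      | Query   |      0 | init                            | show processlist | 0.000 |
| 486381 | user000     | xxx.xx.xx.51:60934 | database1 | Query   |      6 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u | 0.000 |
+--------+-------------+--------------------+-----------+---------+--------+
"""


def parse_processlist(text):
    """Turn the client's ASCII table into a list of dicts keyed by column name.

    Rows whose cell count differs from the header (for example, when the
    Info text itself contains a '|') are skipped rather than guessed at.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Only data lines start and end with '|'; borders start with '+'.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def stuck_threads(rows, state="query end", min_seconds=60):
    """Return rows in the given state whose Time is at least min_seconds."""
    stuck = []
    for row in rows:
        seconds = row.get("Time", "")
        # Daemon threads report Time as NULL, so ignore non-numeric values.
        if row.get("State") == state and seconds.isdigit() and int(seconds) >= min_seconds:
            stuck.append(row)
    return stuck


rows = parse_processlist(SAMPLE)
print(Counter(row["State"] for row in rows))
for row in stuck_threads(rows):
    print(row["Id"], row["Time"], row["Info"])
```

This only summarizes what the client already printed. It does not explain why the threads are stuck, and it cannot release the locks.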
      

          Activity

            darkain Vincent Milum Jr created issue -
            anikitin Andrii Nikitin (Inactive) made changes -
            Field Original Value New Value
            Assignee Andrii Nikitin [ anikitin ]
            darkain Vincent Milum Jr made changes -
            Affects Version/s 10.2.9 [ 22611 ]
            Environment Debian Jessie
            FreeBSD 10.3 Jail
            Debian Jessie
            Debian Stretch
            FreeBSD 10.3 Jail
            darkain Vincent Milum Jr made changes -
            Attachment stack-1.txt [ 44343 ]
            Attachment stack-2.txt [ 44344 ]
            anikitin Andrii Nikitin (Inactive) made changes -
            Labels galera innodb galera innodb need_feedback
            elenst Elena Stepanova made changes -
            Labels galera innodb need_feedback galera innodb
            darkain Vincent Milum Jr made changes -
            Affects Version/s 10.2.10 [ 22615 ]
            elenst Elena Stepanova made changes -
            Assignee Andrii Nikitin [ anikitin ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.2 [ 14601 ]
            Assignee Sachin Setiya [ sachin.setiya.007 ]
            serg Sergei Golubchik made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Jan Lindström [ jplindst ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.2.19 [ 23207 ]
            Fix Version/s 10.3.11 [ 23141 ]
            Fix Version/s 10.2 [ 14601 ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 82760 ] MariaDB v4 [ 152883 ]

            People

              jplindst Jan Lindström (Inactive)
              darkain Vincent Milum Jr
              Votes: 2
              Watchers: 7
