I've recently upgraded a few MariaDB Galera clusters from 10.1 to 10.2. Since this upgrade, MariaDB / Galera has been extremely unstable on every single cluster.
One of the issue I'm seeing quite frequently is the combination of "query end" never ending, which appears to lock the table metadata. This issue is compounded by the fact that simple SELECT statements apparently need a metadata lock on these tables in order to complete.
The processes with "query end" hanging around cannot be killed. And because SELECT statements are waiting on metadata locks, they cannot complete either. The only "work around" if you could call it that is to literally take the entire cluster offline and start it up again with --wsrep-new-cluster
This is happening anywhere from every couple of hours to every few days, on clusters that otherwise lasted months of uptime and only ever saw downtime during upgrades.
Here is an example process list
MariaDB [(none)]> show processlist;
+--------+-------------+--------------------+-----------+---------+--------+---------------------------------+------------------------------------------------------------------------------------------------------+----------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+--------+-------------+--------------------+-----------+---------+--------+---------------------------------+------------------------------------------------------------------------------------------------------+----------+
| 2 | system user | | NULL | Sleep | 2913 | Waiting for table metadata lock | UPDATE `prfix_file` SET `file_views`=`file_views`+1 WHERE (`file_hash`=UNHEX('36bae2de8f3399e891a30b | 0.000 |
| 1 | system user | | NULL | Sleep | 888882 | wsrep aborter idle | NULL | 0.000 |
| 3 | system user | | NULL | Daemon | NULL | InnoDB purge coordinator | NULL | 0.000 |
| 5 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 |
| 4 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 |
| 6 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 |
| 7 | system user | | NULL | Daemon | NULL | InnoDB shutdown handler | NULL | 0.000 |
| 485377 | user000 | xxx.xx.xx.53:11779 | database1 | Query | 2913 | query end | INSERT INTO `prfix_user` (`user_name`) VALUES ('new user name') | 0.000 |
| 485430 | user000 | xxx.xx.xx.52:46210 | database1 | Query | 2788 | query end | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='36964') | 0.000 |
| 485933 | user000 | xxx.xx.xx.52:47277 | database1 | Query | 1410 | query end | DELETE FROM `prfix_session` WHERE (`access`<1503723562) | 0.000 |
| 485993 | user000 | xxx.xx.xx.54:44107 | database1 | Query | 1242 | query end | INSERT INTO `prfix_anchor` (`anchor_source`, `anchor_dest`) VALUES ('https:/www.example.com/some123' | 0.000 |
| 486023 | user000 | xxx.xx.xx.52:47490 | database1 | Query | 1163 | query end | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='29042') | 0.000 |
| 486150 | user000 | xxx.xx.xx.51:60057 | database1 | Query | 761 | query end | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='5240') | 0.000 |
| 486223 | user000 | xxx.xx.xx.52:48073 | database1 | Query | 497 | query end | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='25304') | 0.000 |
| 486254 | user000 | xxx.xx.xx.51:60409 | database1 | Query | 395 | query end | UPDATE `prfix_gallery` SET `gallery_views`=`gallery_views`+1 WHERE (`gallery_id`='114141') | 0.000 |
| 486285 | user000 | xxx.xx.xx.52:48281 | database1 | Query | 269 | query end | UPDATE `prfix_anchor` SET `anchor_clicks`=`anchor_clicks`+1 WHERE (`anchor_id`='13309') | 0.000 |
| 486360 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
| 486363 | user000 | xxx.xx.xx.52:48581 | database1 | Query | 47 | Waiting for table metadata lock | SELECT SQL_NO_CACHE ga.*, us.user_id, us.user_name, us.user_url, th.thumb_hash, th.file_hash, fl.fil | 0.000 |
| 486364 | user000 | xxx.xx.xx.53:21831 | database1 | Query | 46 | Waiting for table metadata lock | SELECT us.*, ga.*, th.thumb_hash FROM `prfix_gallery` AS `ga` LEFT JOIN `prfix_file_thumb` AS `th` O | 0.000 |
| 486365 | user000 | xxx.xx.xx.51:60884 | database1 | Query | 34 | Waiting for table metadata lock | SELECT us.*, ga.*, th.thumb_hash FROM `prfix_gallery` AS `ga` LEFT JOIN `prfix_file_thumb` AS `th` O | 0.000 |
| 486366 | user000 | xxx.xx.xx.51:60887 | database1 | Query | 31 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u | 0.000 |
| 486373 | user000 | xxx.xx.xx.54:62339 | database1 | Query | 26 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u | 0.000 |
| 486375 | user000 | xxx.xx.xx.51:60907 | database1 | Query | 17 | Waiting for table metadata lock | SELECT us.*, ga.*, th.thumb_hash FROM `prfix_gallery` AS `ga` LEFT JOIN `prfix_file_thumb` AS `th` O | 0.000 |
| 486381 | user000 | xxx.xx.xx.51:60934 | database1 | Query | 6 | Waiting for table metadata lock | SELECT * FROM `prfix_user` AS `us` LEFT JOIN `prfix_file_thumb` AS `th` ON (`th`.`file_hash`=`us`.`u | 0.000 |
+--------+-------------+--------------------+-----------+---------+--------+---------------------------------+------------------------------------------------------------------------------------------------------+----------+
Andrii Nikitin (Inactive)
made changes -
2017-09-28 09:48
Field
Original Value
New Value
Assignee
Andrii Nikitin
[ anikitin
]
Vincent Milum Jr
made changes -
2017-10-16 03:30
Affects Version/s
10.2.9
[ 22611
]
Environment
Debian Jessie
FreeBSD 10.3 Jail
Debian Jessie
Debian Stretch
FreeBSD 10.3 Jail
Elena Stepanova
made changes -
2017-12-14 15:47
Labels
galera innodb need_feedback
galera innodb
Vincent Milum Jr
made changes -
2017-12-14 22:28
Affects Version/s
10.2.10
[ 22615
]
Elena Stepanova
made changes -
2018-01-09 00:16
Assignee
Andrii Nikitin
[ anikitin
]
Elena Stepanova
made changes -
2018-03-04 22:06
Fix Version/s
10.2
[ 14601
]
Assignee
Sachin Setiya
[ sachin.setiya.007
]
Sergei Golubchik
made changes -
2018-05-02 10:59
Assignee
Sachin Setiya
[ sachin.setiya.007
]
Jan Lindström
[ jplindst
]
Elena Stepanova
made changes -
2019-01-29 21:09
Fix Version/s
10.2.19
[ 23207
]
Fix Version/s
10.3.11
[ 23141
]
Fix Version/s
10.2
[ 14601
]
Resolution
Fixed
[ 1
]
Status
Open
[ 1
]
Closed
[ 6
]
Sergei Golubchik
made changes -
2021-12-06 21:45
Workflow
MariaDB v3
[ 82760
]
MariaDB v4
[ 152883
]
{"report":{"fcp":782,"ttfb":253.2999997138977,"pageVisibility":"visible","entityId":63322,"key":"jira.project.issue.view-issue","isInitial":true,"threshold":1000,"elementTimings":{},"userDeviceMemory":8,"userDeviceProcessors":64,"apdex":1,"journeyId":"5e865df5-6323-4b91-bc4f-ef133d2fe964","navigationType":0,"readyForUser":867.6999998092651,"redirectCount":0,"resourceLoadedEnd":944.0999999046326,"resourceLoadedStart":259.2999997138977,"resourceTiming":[{"duration":8,"initiatorType":"link","name":"https://jira.mariadb.org/s/2c21342762a6a02add1c328bed317ffd-CDN/lu2cib/820016/12ta74/0a8bac35585be7fc6c9cc5a0464cd4cf/_/download/contextbatch/css/_super/batch.css","startTime":259.2999997138977,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":259.2999997138977,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":267.2999997138977,"responseStart":0,"secureConnectionStart":0},{"duration":7.900000095367432,"initiatorType":"link","name":"https://jira.mariadb.org/s/7ebd35e77e471bc30ff0eba799ebc151-CDN/lu2cib/820016/12ta74/494e4c556ecbb29f90a3d3b4f09cb99c/_/download/contextbatch/css/jira.browse.project,project.issue.navigator,jira.view.issue,jira.general,jira.global,atl.general,-_super/batch.css?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&slack-enabled=true&whisper-enabled=true","startTime":259.59999990463257,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":259.59999990463257,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":267.5,"responseStart":0,"secureConnectionStart":0},{"duration":70.80000019073486,"initiatorType":"script","name":"https://jira.mariadb.org/s/0917945aaa57108d00c5076fea35e069-CDN/lu2cib/820016/12ta74/0a8bac35585be7fc6c9cc5a0464cd4cf/_/download/contextbatch/js/_super/batch.js?locale=en","startTime":259.7999997138977,"connectEnd":259.7999997138977,"connectStart":259.7999997138977,"domainLookupEnd":259.7999997138977,"domainLookupStart":259.7999997138977,"fetchStart":259.7999997138977,"redirectEnd":0,"redirectStart":0,"requestStart":259.7999997138977,"responseEnd":330.59999990463257,"responseStart":330.59999990463257,"secureConnectionStart":259.7999997138977},{"duration":154.89999961853027,"initiatorType":"script","name":"https://jira.mariadb.org/s/2d8175ec2fa4c816e8023260bd8c1786-CDN/lu2cib/820016/12ta74/494e4c556ecbb29f90a3d3b4f09cb99c/_/download/contextbatch/js/jira.browse.project,project.issue.navigator,jira.view.issue,jira.general,jira.global,atl.general,-_super/batch.js?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&locale=en&slack-enabled=true&whisper-enabled=true","startTime":260,"connectEnd":260,"connectStart":260,"domainLookupEnd":260,"domainLookupStart":260,"fetchStart":260,"redirectEnd":0,"redirectStart":0,"requestStart":260,"responseEnd":414.8999996185303,"responseStart":414.8999996185303,"secureConnectionStart":260},{"duration":158.30000019073486,"initiatorType":"script","name":"https://jira.mariadb.org/s/a9324d6758d385eb45c462685ad88f1d-CDN/lu2cib/820016/12ta74/c92c0caa9a024ae85b0ebdbed7fb4bd7/_/download/contextbatch/js/atl.global,-_super/batch.js?locale=en","startTime":260.19999980926514,"connectEnd":260.19999980926514,"connectStart":260.19999980926514,"domainLookupEnd":260.19999980926514,"domainLookupStart":260.19999980926514,"fetchStart":260.19999980926514,"redirectEnd":0,"redirectStart":0,"requestStart":260.19999980926514,"responseEnd":418.5,"responseStart":418.5,"secureConnectionStart":260.19999980926514},{"duration":158.60000038146973,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:calendar-en/jira.webresources:calendar-en.js","startTime":260.3999996185303,"connectEnd":260.3999996185303,"connectStart":260.3999996185303,"domainLookupEnd":260.3999996185303,"domainLookupStart":260.3999996185303,"fetchStart":260.3999996185303,"redirectEnd":0,"redirectStart":0,"requestStart":260.3999996185303,"responseEnd":419,"responseStart":419,"secureConnectionStart":260.3999996185303},{"duration":215.19999980926514,"initiatorType":"link","name":"https://jira.mariadb.org/s/b04b06a02d1959df322d9cded3aeecc1-CDN/lu2cib/820016/12ta74/a2ff6aa845ffc9a1d22fe23d9ee791fc/_/download/contextbatch/css/jira.global.look-and-feel,-_super/batch.css","startTime":260.69999980926514,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":260.69999980926514,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":475.8999996185303,"responseStart":0,"secureConnectionStart":0},{"duration":158.59999990463257,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:calendar-localisation-moment/jira.webresources:calendar-localisation-moment.js","startTime":260.69999980926514,"connectEnd":260.69999980926514,"connectStart":260.69999980926514,"domainLookupEnd":260.69999980926514,"domainLookupStart":260.69999980926514,"fetchStart":260.69999980926514,"redirectEnd":0,"redirectStart":0,"requestStart":260.69999980926514,"responseEnd":419.2999997138977,"responseStart":419.2999997138977,"secureConnectionStart":260.69999980926514},{"duration":158.90000009536743,"initiatorType":"script","name":"https://jira.mariadb.org/rest/api/1.0/shortcuts/820016/47140b6e0a9bc2e4913da06536125810/shortcuts.js?context=issuenavigation&context=issueaction","startTime":260.8999996185303,"connectEnd":260.8999996185303,"connectStart":260.8999996185303,"domainLookupEnd":260.8999996185303,"domainLookupStart":260.8999996185303,"fetchStart":260.8999996185303,"redirectEnd":0,"redirectStart":0,"requestStart":260.8999996185303,"responseEnd":419.7999997138977,"responseStart":419.7999997138977,"secureConnectionStart":260.8999996185303},{"duration":215,"initiatorType":"link","name":"https://jira.mariadb.org/s/3ac36323ba5e4eb0af2aa7ac7211b4bb-CDN/lu2cib/820016/12ta74/d176f0986478cc64f24226b3d20c140d/_/download/contextbatch/css/com.atlassian.jira.projects.sidebar.init,-_super,-project.issue.navigator,-jira.view.issue/batch.css?jira.create.linked.issue=true","startTime":261.09999990463257,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":261.09999990463257,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":476.09999990463257,"responseStart":0,"secureConnectionStart":0},{"duration":159.09999990463257,"initiatorType":"script","name":"https://jira.mariadb.org/s/5d5e8fe91fbc506585e83ea3b62ccc4b-CDN/lu2cib/820016/12ta74/d176f0986478cc64f24226b3d20c140d/_/download/contextbatch/js/com.atlassian.jira.projects.sidebar.init,-_super,-project.issue.navigator,-jira.view.issue/batch.js?jira.create.linked.issue=true&locale=en","startTime":261.19999980926514,"connectEnd":261.19999980926514,"connectStart":261.19999980926514,"domainLookupEnd":261.19999980926514,"domainLookupStart":261.19999980926514,"fetchStart":261.19999980926514,"redirectEnd":0,"redirectStart":0,"requestStart":261.19999980926514,"responseEnd":420.2999997138977,"responseStart":420.2999997138977,"secureConnectionStart":261.19999980926514},{"duration":630.5999999046326,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:bigpipe-js/jira.webresources:bigpipe-js.js","startTime":262.2999997138977,"connectEnd":262.2999997138977,"connectStart":262.2999997138977,"domainLookupEnd":262.2999997138977,"domainLookupStart":262.2999997138977,"fetchStart":262.2999997138977,"redirectEnd":0,"redirectStart":0,"requestStart":262.2999997138977,"responseEnd":892.8999996185303,"responseStart":892.8999996185303,"secureConnectionStart":262.2999997138977},{"duration":632,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:bigpipe-init/jira.webresources:bigpipe-init.js","startTime":262.2999997138977,"connectEnd":262.2999997138977,"connectStart":262.2999997138977,"domainLookupEnd":262.2999997138977,"domainLookupStart":262.2999997138977,"fetchStart":262.2999997138977,"redirectEnd":0,"redirectStart":0,"requestStart":262.2999997138977,"responseEnd":894.2999997138977,"responseStart":894.2999997138977,"secureConnectionStart":262.2999997138977},{"duration":405.90000009536743,"initiatorType":"xmlhttprequest","name":"https://jira.mariadb.org/rest/webResources/1.0/resources","startTime":487.2999997138977,"connectEnd":487.2999997138977,"connectStart":487.2999997138977,"domainLookupEnd":487.2999997138977,"domainLookupStart":487.2999997138977,"fetchStart":487.2999997138977,"redirectEnd":0,"redirectStart":0,"requestStart":487.2999997138977,"responseEnd":893.1999998092651,"responseStart":893.1999998092651,"secureConnectionStart":487.2999997138977},{"duration":181.80000019073486,"initiatorType":"script","name":"https://www.google-analytics.com/analytics.js","startTime":775.8999996185303,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":775.8999996185303,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":957.6999998092651,"responseStart":0,"secureConnectionStart":0},{"duration":32.80000019073486,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/e65b778d185daf5aee24936755b43da6/_/download/contextbatch/js/browser-metrics-plugin.contrib,-_super,-project.issue.navigator,-jira.view.issue,-atl.general/batch.js?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&slack-enabled=true&whisper-enabled=true","startTime":911.2999997138977,"connectEnd":911.2999997138977,"connectStart":911.2999997138977,"domainLookupEnd":911.2999997138977,"domainLookupStart":911.2999997138977,"fetchStart":911.2999997138977,"redirectEnd":0,"redirectStart":0,"requestStart":911.2999997138977,"responseEnd":944.0999999046326,"responseStart":944.0999999046326,"secureConnectionStart":911.2999997138977}],"fetchStart":1,"domainLookupStart":1,"domainLookupEnd":1,"connectStart":1,"connectEnd":1,"requestStart":16,"responseStart":254,"responseEnd":257,"domLoading":257,"domInteractive":1005,"domContentLoadedEventStart":1005,"domContentLoadedEventEnd":1052,"domComplete":1176,"loadEventStart":1176,"loadEventEnd":1176,"userAgent":"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)","marks":[{"name":"bigPipe.sidebar-id.start","time":967.5},{"name":"bigPipe.sidebar-id.end","time":968.3999996185303},{"name":"bigPipe.activity-panel-pipe-id.start","time":968.5},{"name":"bigPipe.activity-panel-pipe-id.end","time":972.6999998092651},{"name":"activityTabFullyLoaded","time":1070.7999997138977}],"measures":[],"correlationId":"8ae6b28b4ae6c2","effectiveType":"4g","downlink":9.1,"rtt":0,"serverDuration":158,"dbReadsTimeInMs":26,"dbConnsTimeInMs":37,"applicationHash":"9d11dbea5f4be3d4cc21f03a88dd11d8c8687422","experiments":[]}}