[MDEV-24117] Memory management problem in statistics state for queries that have a large field IN (values) part Created: 2020-11-04 Updated: 2021-12-29 Resolved: 2020-11-11
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 10.2.35, 10.2, 10.3, 10.4, 10.5 |
| Fix Version/s: | 10.2.36, 10.3.27, 10.4.17, 10.5.8 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Geza Lucz | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 11 |
| Labels: | regression | ||
| Environment: | CentOS 7 |
| Attachments: | mariadbbug.sql |
| Description |
A query with a large IN () section will be stuck in the "statistics" state and use up all RAM. Multiple such queries will crash the server regardless of the memory limits specified in my.cnf. I noticed this in 10.2.35; downgrading to 10.2.34 fixes the problem. An example is the WordPress query "select post_id, meta_value from wp_postmeta where meta_id in (1,2,3 etc ...)".
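Purely as an illustration of the query shape that triggers the bug, the following sketch builds a WordPress-style query with an arbitrarily large IN list. The table and column names follow the wp_postmeta example above; the value count is an arbitrary assumption, not taken from the report.

```python
# Illustrative only: generate a query of the shape described in the report,
# "select post_id, meta_value from wp_postmeta where meta_id in (...)",
# with a very large list of constants in the IN () section.

def build_large_in_query(n_values: int) -> str:
    """Build the WordPress-style query with n_values constants in IN ()."""
    in_list = ", ".join(str(i) for i in range(1, n_values + 1))
    return ("select post_id, meta_value from wp_postmeta "
            f"where meta_id in ({in_list})")

# ~50k constants is an assumed size, chosen only to make the blow-up visible.
query = build_large_in_query(50_000)
print(len(query))
```

Running such a query against an affected server version (e.g. 10.2.35) is what reporters observed getting stuck in the "statistics" state.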
| Comments |
| Comment by Niels Hendriks [ 2020-11-04 ] |

Same issue here. A server that was previously fine with 12 GB RAM had to be upgraded to 40 GB RAM to be "stable" with 10.2.35 on Debian Stretch. Has anyone checked whether 10.3/10.4/10.5 are also affected? We are considering whether to downgrade or to upgrade to the next major release to fix it. EDIT: Can confirm memory is fine again on 10.2.34.
| Comment by Pavel [ 2020-11-05 ] |

I ran into this issue on 10.5. I tested on both Debian and Windows: 10.5.6 is fine, 10.5.7 has the bug. I attached `mariadbbug.sql` with an example table and query.
| Comment by Frank Sagurna [ 2020-11-07 ] |

I can confirm this issue. I wondered why I was seeing queries in the process list with the status "statistics"; I had never seen this state before. Then I had problems with memory usage, up to the point that the server was killed by the kernel OOM killer.
| Comment by Alice Sherepa [ 2020-11-09 ] |

Thanks a lot! The regression appeared after commit 291be49474.
| Comment by Sergei Petrunia [ 2020-11-09 ] |

Debugging 1_my.test, counting SEL_ARG::SEL_ARG allocations. Indeed, the counts are much higher than before the patch. Another odd thing in key_or(): key2_shared==true. This doesn't seem correct: the testcase has just one index and we use only one key part from it. How can a SEL_ARG graph be shared in this case?
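To see why a wrongly "shared" tree matters for the allocation counts, here is a toy model (not MariaDB's actual SEL_ARG code; all names and the cloning policy are simplifying assumptions). When the accumulator tree is believed to be shared, a copy-on-write merge must clone all existing nodes before adding each new range, turning linear allocation into quadratic.

```python
# Toy model only: count node allocations while OR-merging n single-point
# ranges (as from a large IN list) into an accumulator tree.
# shared=True models key2_shared==true: every merge first clones the whole
# accumulator (copy-on-write) before inserting the new range.

def allocations(n_ranges: int, shared: bool) -> int:
    allocated = 0   # total SEL_ARG-like nodes ever allocated
    tree_size = 0   # nodes currently in the accumulator tree
    for _ in range(n_ranges):
        if shared:
            allocated += tree_size  # clone every existing node first
        allocated += 1              # node for the new range itself
        tree_size += 1
    return allocated

print(allocations(10_000, shared=False))  # linear: 10000 nodes
print(allocations(10_000, shared=True))   # quadratic: ~50 million nodes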
| Comment by Sergei Petrunia [ 2020-11-09 ] |

The reference counters are increased here in tree_or (denote this as LOC1). The only place where rt1 and rt2 are used is this one (denote it as LOC2). Note that LOC2 has `if (... && !no_imerge_from_ranges)`, which gives an idea: if rt1 and rt2 are only used when no_imerge_from_ranges==FALSE, the reference counters could be increased only in that case.

Unfortunately, this doesn't work: it creates a test failure in main.range_vs_index_merge, not sure for which reason.
| Comment by Sergei Petrunia [ 2020-11-09 ] |

Ok, the issue is caused by the wrong patch being pushed. The patch that was pushed was a preliminary version. During the review, I had requested that the problematic part of the patch be removed, which was accepted and done. But then, due to some error, the wrong (preliminary) patch was pushed.
| Comment by Sergei Petrunia [ 2020-11-09 ] |

A testcase for this bug (thanks to serg for the idea):
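The actual testcase is not reproduced in this export. Purely as a hypothetical sketch of the shape such a repro script might take (table name, column names, row counts, and the use of MariaDB's seq_1_to_N sequence tables are all assumptions, not taken from the issue):

```python
# Hypothetical sketch only: build a minimal SQL script of the general shape
# needed to repro this class of bug: a keyed table plus a query with many
# constants in IN (). All identifiers and sizes below are assumed.

def build_repro_script(n_rows: int = 1000, n_in: int = 20000) -> str:
    lines = [
        "create table t1 (a int, b int, key(a));",
        f"insert into t1 select seq, seq from seq_1_to_{n_rows};",
    ]
    in_list = ",".join(str(i) for i in range(n_in))
    lines.append(f"select * from t1 where a in ({in_list});")
    return "\n".join(lines)

print(build_repro_script()[:80])
```

On an affected build, running the generated statements should show the query sitting in the "statistics" state while memory grows.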
| Comment by Roope Pääkkönen (Inactive) [ 2020-11-10 ] |

We were hit by this as well, fortunately only in testing. Is a hotfix release for this possible?
| Comment by Javier Carvajal [ 2020-11-11 ] |

I have the same issue. The server had to be upgraded to 32 GB RAM with 10.2.35 on Debian 9 (Stretch). RAM usage fluctuates wildly, and the server was killed by the kernel OOM killer. Any quick fix?
| Comment by Ferdi van der Werf [ 2020-11-11 ] |

I'd recommend downgrading to 10.2.34 and apt-mark holding it until a fix is available. We've seen servers downgraded to 10.2.34 return to normal RAM usage.
| Comment by Sergei Petrunia [ 2020-11-11 ] |

Hello, a set of releases is about to be made (10.2.36, 10.3.27, 10.4.17, 10.5.8) which will contain the fix for this.
| Comment by Sergei Petrunia [ 2020-11-11 ] |

The commits that fix this are (all 3 were made in a row): https://github.com/MariaDB/server/commit/f81eef62e7742806e5e74b5f37f35b7cd2f82291
| Comment by Daniel Black [ 2020-11-11 ] |

For all those waiting patiently, packages have been released: https://mariadb.org/mariadb-10-5-8-10-4-17-10-3-27-and-10-2-36-now-available/