[MDEV-26921] Setting table_definition_cache to -1 causes the server to run out of memory Created: 2021-10-27  Updated: 2022-02-07  Resolved: 2022-02-07

Status: Closed
Project: MariaDB Server
Component/s: Configuration
Affects Version/s: 10.5
Fix Version/s: 10.6.0

Type: Bug Priority: Major
Reporter: Larry Adams Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: crash

Issue Links:
Duplicate
is duplicated by MDEV-22219 negative values on system variables l... Closed

 Description   

I noticed that when I set table_definition_cache to -1 (auto-tune), server memory kept increasing until the server ran out of memory and mariadb was killed by the OOM killer. When I looked at the server before this happened the next time, I noted that table_definition_cache was already at 2097152. When I set the table definition cache back to a small static number (2000), tons of memory was freed, and since then the system has been pretty stable.
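
For illustration, the workaround described above might look like the following at runtime. This is a minimal sketch, not the reporter's actual session: it assumes a client connection with enough privileges to change global variables, and the only value taken from the report is 2000.

```sql
-- Replace the -1 setting with a small static value, as described in the report.
SET GLOBAL table_definition_cache = 2000;

-- Confirm that the new value is in effect.
SELECT @@global.table_definition_cache;
```

A value set with SET GLOBAL does not survive a restart, so the same setting would also need to go into the server configuration file.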



 Comments   
Comment by Larry Adams [ 2021-10-27 ]

This is 10.5.12 BTW.

Comment by Larry Adams [ 2021-10-27 ]

Note that at the same time I also reduced table_open_cache from 20k to 2000. I'm not sure which of these changes may have resolved the issue. Sorry.

Comment by Sergei Golubchik [ 2021-10-27 ]

after you set table_definition_cache to -1, what was its value? select @@table_definition_cache — what does it show?
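
The check being asked for could be run roughly as follows. This is only a sketch: the first query is the one named in the question, while the second is an assumption about which status counter is useful to watch alongside it (Open_table_definitions, the number of cached table definitions) and was not requested in the thread.

```sql
-- Effective value of the variable after -1 was applied.
SELECT @@global.table_definition_cache;

-- Assumption: the counter of currently cached table definitions,
-- which can be compared against the limit above over time.
SHOW GLOBAL STATUS LIKE 'Open_table_definitions';
```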

Comment by Larry Adams [ 2021-11-16 ]

Sergei,

After setting this value, the status variable increases until it reaches the max value. By then, the server has used up all the cache memory and has started dipping into swap. I've upgraded to 10.5.13, but I have yet to turn the setting back to -1 out of fear of it crashing again. I would like to keep this open for a bit and report back, since it's a corner-case setting and has an easy workaround for now. I'll get back to it within the next few weeks.

If memory serves me right, as previously mentioned, I set the value to a static value, but I then also flushed connections, which I did not mention previously. Sorry, I did not move slowly enough to capture which action did the trick. But since the value was so high, I just assumed that this was likely the culprit.

Larry

Comment by Elena Stepanova [ 2022-01-06 ]

serg,

after you set table_definition_cache to -1, what was its value? select @@table_definition_cache — what does it show?

It is set to 2097152, the maximum for table_definition_cache, which is how negative values were customarily handled before 10.6.
In 10.6, a fix was attempted in the scope of MDEV-22219.
The attempt failed: after the change the server wouldn't start with -1 at all, for which MDEV-25386 was filed and remains open so far.

TheWitness,

I noticed that when I set [table_definition_cache] to -1 (auto-tune)

Why do you expect table_definition_cache to be auto-tunable this way (or any other way)? Does the MariaDB documentation say so somewhere? If so, could you please paste the link? The documentation should probably be revised.

Comment by Sergei Golubchik [ 2022-01-06 ]

-1 historically means auto-tune for many performance-schema-* options, so it's understandable that one might try to generalize that.

Comment by Elena Stepanova [ 2022-01-06 ]

This generalization can be dangerous, not just because of the behavior described here and in MDEV-22219 – it will hopefully be fixed – but also because for some variables -1 can be a valid value with a completely different effect from what one would expect with auto-sizing. One example that I could find quickly is max_user_connections; there can be others.

Anyway, if the assumption was the only reason, and we don't have it erroneously documented somewhere, we can consider this report to be a duplicate of MDEV-22219. If needed, the fix could later be backported to 10.5, but MDEV-25386 needs to be fixed first.

Comment by Larry Adams [ 2022-01-06 ]

I'm okay with simply updating the doco to warn against the -1 setting for the moment.

Comment by Larry Adams [ 2022-01-06 ]

Things have been super stable since I went with a static setting. We have not re-enabled replication yet, but standing pretty good on one leg.

Comment by Sergei Golubchik [ 2022-02-07 ]

greenman, see above. I'm closing this as a duplicate. You might want to see whether the manual needs any clarifications or warnings for before-10.6 versions.
