[MDEV-17124] mariadb 10.1.34, views and prepared statements: ERROR 1615 (HY000): Prepared statement needs to be re-prepared Created: 2018-09-03 Updated: 2022-11-18 Resolved: 2022-09-30 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Prepared Statements, Views |
| Affects Version/s: | 10.0, 10.1, 10.1.34, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9 |
| Fix Version/s: | 10.3.37, 10.4.27, 10.5.18, 10.6.11, 10.7.7, 10.8.6, 10.9.4, 10.10.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jaco Kroon | Assignee: | Oleksandr Byelkin |
| Resolution: | Fixed | Votes: | 15 |
| Labels: | upstream-fixed | ||
| Environment: |
Gentoo Linux |
||
| Issue Links: |
|
||||||||||||
| Description |
|
There are various reports over the internet of this. Seems most reports are just indicating that it's a load issue. I can however reproduce this with two connections so it can't be load. One thread to perform database backups, of the form:
(My actual script does a bit more than that but for testing it can be reduced to that, used as a secondary backup mechanism.) I restarted mariadb, then stopped the slave for the sake of eliminating that as cause. show processlist shows only two connections before I start, the slave slaving off of this host, and the interactive session I'm using to connect. I then start the backup loop (causing a third connection), and then this in the interactive session:
At this point I still only see three connections in processlist: mysqldump, my interactive and the slave reading replication logs. My system has nearly 25000 tables and views, will likely go up to to nearly 26000 soon, so I should increase these two variables anyway:
But since each database only has around 105 tables, I still don't get why this is happening. If mysqldump is holding 105 tables open, and I'm using 2 (foo table and foo2 view), then "forced open" should still be <110, which is well below the cache limit. Each of the 248 databases will soon be expanded to utilize the same views and prepared statements, so I'm afraid the problem is going to get worse. I'll be upping those values to 10240 anyway, so hopefully it'll avoid the issue, but I still don't see how this can be happening. Normally we don't see >500 concurrent connections, most of them idle at any given point in time.
So it's definitely hitting those limits. Increasing those variables to 5120 allows the status values to start growing immediately. During this time I can once more issue "execute test". However, once it reaches 5120 again, same problem. I have three problems here: 1. The prepared statement (according to what I understand) should be valid until there are DDL changes to the underlying tables (and views?) - this doesn't seem to be the case. As per previous reports, I can reproduce this even if the test DB isn't even in the backup list (nor was it during my testing above). If I'm not mistaken I'm going to need to set my open_{table,definition}_cache values to some value larger than my tables in order to have an operational system. This doesn't seem right. |
| Comments |
| Comment by Elena Stepanova [ 2018-09-03 ] | |||||||||||||||||||||||||||||||||||||
|
Thanks for the report and test case, I can reproduce it. | |||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2018-09-03 ] | |||||||||||||||||||||||||||||||||||||
|
Here is a non-concurrent MTR test case which seeems to reproduce the problem reliably:
Reproducible on MariaDB 10.0-10.4 and MySQL 5.6, 5.7. | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2018-11-09 ] | |||||||||||||||||||||||||||||||||||||
|
So the problem is like this: On the other hand if table is removed from the cache between PREPARE and EXECUTE there is no an error (which should be investigated. | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2018-11-12 ] | |||||||||||||||||||||||||||||||||||||
|
There is no problem with tables because there is no error observer so the error just ignored. For view observer is set. | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2018-11-12 ] | |||||||||||||||||||||||||||||||||||||
|
reprepare does not help because repeat the same problem with full table definition cache: underlying table remove view definition first and on the next circle we have the same problem the object from cache is "different" (but in fact the same). | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2018-11-13 ] | |||||||||||||||||||||||||||||||||||||
|
revision-id: 5e63f616a7ed511e6e69581f6659f60f1ec8633b (mariadb-10.1.37-9-g5e63f616a7e)
The problem is that if table definition cache (TDC) is full of real tables which are in tables cache, view definition can not stay there so will me removed by its own underlying tables. | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2019-04-30 ] | |||||||||||||||||||||||||||||||||||||
|
The idea of hecking definition version if TDC versions mismatch appeared to be wrong, due to triggest which do not touch frm of the tables (so do not change definition version) but can change tables for pre-locking so PS should be re-prepared. Possible solution: | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2019-04-30 ] | |||||||||||||||||||||||||||||||||||||
|
holyfoot got idea to use "anchor" TABLE object during execution of the query with a view to keep definition in the cache. | |||||||||||||||||||||||||||||||||||||
| Comment by Jon Capcik [ 2019-12-13 ] | |||||||||||||||||||||||||||||||||||||
|
Has there been any progress on this? We've recently upgraded from 10.0.35 to 10.3.13 in Amazon RDS, and this issue with prepared statements and views has plagued us for the past few days. We have multiple database servers, but on one we have 169 databases each with 510 tables + views (which will be increasing). As we have now learned since the upgrade, the table_definition_cache value has to be greater than 86190 (169 * 510) in order to reliably no longer receive "Prepared statement needs to be re-prepared" errors. It would be great if this could be resolved, as prior to the upgrade, the value was only 2000 and we did not run into this issue. Am I correct in assuming that one of the culprits is the fact that views are stored in the TDC - It would be great if there were some documentation on this somewhere, something along the lines of:
| |||||||||||||||||||||||||||||||||||||
| Comment by Jaco Kroon [ 2020-03-23 ] | |||||||||||||||||||||||||||||||||||||
|
Just to confirm: I can still reproduce this error reliably with 10.2.29-MariaDB-log This is assuming that: table_open_cache >= table_definition_cache As soon as table_definition_cache > table_open_cache then I find I can't reproduce any more (granted, I've only tried the interactive loop 10 or so times). Perhaps an enforcement from code to ensure that perspective that table_definition_cache > table_open_cache is in order. My questions is whether this possibly an off by one error issue (< vs <= kind of thing) somewhere in the code, or whether setting table_definition_cache > table_open_cache is merely hiding the issue. The original response from Elena is hinting at the latter. The other question is: how much bigger should table_definition_cache be than table_open_cache. I'm guessing on the order of number of concurrently required prepared statements? | |||||||||||||||||||||||||||||||||||||
| Comment by Jaco Kroon [ 2020-05-26 ] | |||||||||||||||||||||||||||||||||||||
|
Hi, We just bumped into this with:
Which confirms that setting table_definition_cache > table_open_cache merely hides the issue. MariaDB version 10.4.12 in this particular case. | |||||||||||||||||||||||||||||||||||||
| Comment by Jaco Kroon [ 2020-05-29 ] | |||||||||||||||||||||||||||||||||||||
|
MariaDB 10.4 is definitely much worse. | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2020-06-29 ] | |||||||||||||||||||||||||||||||||||||
|
You can read my report above why it happens, in short table definition cache size does not correspond table cache size (or vice versa). | |||||||||||||||||||||||||||||||||||||
| Comment by Jaco Kroon [ 2020-06-29 ] | |||||||||||||||||||||||||||||||||||||
|
Thank you Oleksandr, You've made various statements above, which is semi coherent. I am however not completely understanding what you're saying, I suspect this is due to a combination of language barrier combined with the fact that I have zero understanding of the MariaDB internals. In short: it's all good and well to know what causes the problem - but how do we fix or at least avoid the issue? Because continually just setting these caches larger and larger doesn't seem to be a practical long-term solution. I already need these at ~50'000 in order to adequately prevent the problem. In your last comment you state "table definition cache size does not correspond [with?] table cache size" - either these sizes correspond or they don't, there is no direction here that I can infer as implied by "or vice versa", so I'm not sure what you're saying? On my dev machine I've currently got (amongst others) the following cache variables which I'm understanding relates: table_open_cache=512 So from previous descriptions, the understanding is that table_definition_cache > table_open_cache should avoid the issue. On my last report, this is not the case, unless it should be: table_open_cache * table_open_cache_instances < table_definition_cache ?? | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2020-10-07 ] | |||||||||||||||||||||||||||||||||||||
|
It would be overkill probably. I am trying to fix it in the other way | |||||||||||||||||||||||||||||||||||||
| Comment by Jaroslav [ 2021-09-27 ] | |||||||||||||||||||||||||||||||||||||
|
Is there any update on fix for this issue? It seems to still happen. | |||||||||||||||||||||||||||||||||||||
| Comment by eason xu [ 2021-12-14 ] | |||||||||||||||||||||||||||||||||||||
|
@Oleksandr Byelkin | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-01-17 ] | |||||||||||||||||||||||||||||||||||||
|
Yes, I am busy with the fixing, hopefully patch will be ready soon (in days). | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-01-19 ] | |||||||||||||||||||||||||||||||||||||
|
Branch : bb-10.2-
| |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-01-24 ] | |||||||||||||||||||||||||||||||||||||
|
I added also cherrypick of timemer fix:
| |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2022-09-09 ] | |||||||||||||||||||||||||||||||||||||
|
The latest patch is
and it looks like exactly the patch 349283c5e7a that I've already reviewed in https://lists.launchpad.net/maria-developers/msg13077.html | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-09-26 ] | |||||||||||||||||||||||||||||||||||||
|
| |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-09-27 ] | |||||||||||||||||||||||||||||||||||||
|
| |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-09-27 ] | |||||||||||||||||||||||||||||||||||||
|
OK; there is windows compile problem due to using the macro, I will fix | |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-09-27 ] | |||||||||||||||||||||||||||||||||||||
|
| |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-09-28 ] | |||||||||||||||||||||||||||||||||||||
|
| |||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2022-09-29 ] | |||||||||||||||||||||||||||||||||||||
|
| |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2022-09-29 ] | |||||||||||||||||||||||||||||||||||||
|
280c2cc2946 is ok to push |