[MDEV-6357] QUERY CACHE - Query Cache And Limit Clause Created: 2014-06-18 Updated: 2014-07-08 Due: 2014-07-02 Resolved: 2014-07-07 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.0.10 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Minor |
| Reporter: | BELUGABEHR | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | querycache | ||
| Description |
|
In reading about the query cache, I came across an undocumented situation. https://mariadb.com/kb/en/query-cache/ Basically, the cache should consume queries that always return the same results. With that said, I have noticed that the query cache will cache a query with a LIMIT clause without an ORDER BY clause. This seems incorrect because the LIMIT could return different results for different queries. For completeness sake, I would suggest not caching queries that use a LIMIT without an ORDER BY clause. |
| Comments |
| Comment by Daniel Black [ 2014-06-18 ] |
|
Just because a LIMIT without an ORDER BY could return a different result doesn't mean that the result will be returned from the cache is incorrect. If the output needs a deterministic order then a ORDER BY should be used. If no caching is needed then SQL_NO_CACHE can be used. I'd be of the impression failing to cache LIMIT-full and ORDER BY-less queries would be an unnecessary change in behaviour but that's just my opinion. What aspect of this is undocumented? There's nothing special about ORDER BY or LIMIT so its not documented as such. |
| Comment by Elena Stepanova [ 2014-06-18 ] |
|
I agree with Daniel's comment, or maybe I don't quite understand the nature of complaint. Could you please describe a situation where the result would actually be wrong because of using the query cache? |
| Comment by Elena Stepanova [ 2014-07-07 ] |
|
Closing as incomplete for now, if you have any additional information, please comment to re-open. |
| Comment by BELUGABEHR [ 2014-07-08 ] |
|
Elena, Thank you for following up on this trivial remark. I ask that you don't sink too much time into thinking about this. I am probably thinking about this too hard myself. "It also means [the] query has to be exactly the same and deterministic." It's the word "deterministic" that I find interesting in the description. I honestly cannot imagine a true business scenario, but a limit clause without a order by is not deterministic and the result set could change row order between queries. No? To store an un-ordered result list does not seem true to form. Within the documentation for replication, it states: "Statement-based replication of LIMIT clauses in DELETE, UPDATE, and INSERT ... SELECT statements is unsafe since the order of the rows affected is not defined." So, it does seem to me that caching a result set that has not been first ordered may not be the desired behavior. |
| Comment by Daniel Black [ 2014-07-08 ] |
|
ah. the select without a order by isn't deterministic in the sense that exactly which rows are returned isn't predictable. With no data changes to the tables (or global config changes) however two sequential queries will return the same result (there's no chance the query plan will change). Because this is the case it may as well go in the query cache. Data changes will invalidate the cache anyway. |
| Comment by Elena Stepanova [ 2014-07-08 ] |
|
Yes, using the word "deterministic" may be not completely accurate in this case; but then again, it is not official documentation, and besides later in the same post Peter introduces the definition of "deterministic" query, which actually fits fine a query with LIMIT without ORDER BY:
This query indeed might provide same result – even though there is no obligation to (unlike a query with CONNECTION_ID(), which might not produce the same result if it's run from different connections). As far as I can see, the official MySQL manual avoids claiming that a query must be deterministic, and instead just provides a list of exceptions explicitly. |