Details
-
Task
-
Status: Open (View Workflow)
-
Trivial
-
Resolution: Unresolved
-
None
Description
Using the idea of MDEV-4598, we could implement external query cache at mysql server side, this is very nice and i think it's better than client side
instead of executing the query cache inside of only one mysql server machine, we will execute it in a "mysql cache server machine", and the mysql server machine will report to client the mysql result (like it do today) and to cache server. in this case we DON'T NEED to change mysql client protocol, but we will have two tcp/ip connections, one for client and other to cache server, maybe the network will be a problem now... but it's a nice solution...
the api is near to MDEV-4598, but the query cache server don't need to check mysql server, it will only receive mysql server information... check:
when mysql execute a query, it will change data or cache a result (if it have query cache), the point here is two functions:
invalidate query cache and put query in cache
the other part of the query cache server is retrieve queries in cache and send to client, but instead of mysql server < - > cache server, we will do it at mysql client < - > cache server and mysql client < - > mysql server
maybe we could do it at mysql server side and mysql client don't see any change, just mysql server, now we have a seconds server, and anyone could connect to it (we should consider cache server auth methods? it's like handler socket daemon...)
--------
mysql server <-> cache server function (the write part of cache server):
cache_add_query(server_id,sql_flags,sql,result,tables_checksum){
|
server_id could be a GTID or a local query cache server ID
|
sql_flags = schema used, sql_mode, and others flags
|
sql = the query
|
result = the result of the query
|
tables_checksum = the tables used + a checksum
|
|
this function send all parameters to cache server
|
}
|
cache_table_changed(server_id,tables_checksum){
|
here mysql_server send information that a table changed
|
}
|
cache_remove_table(server_id,table){
|
remove queries that use that table
|
}
|
cache_remove_query(server_id,query_id){
|
remove queries from cache using the query id
|
}
|
cache_remove_server(server_id){
|
remove queries from cache using the server id
|
}
|
check that the tables_checksum is the main part here... and we don't know if mysql have partition or not, the cache server will always understand that we have parititions, in other words we always have db+table+partition information, when mysql server don't have partition, partition will be considered 0 or -1
when a query have 10 partitions we will use 10 db+table+paritition, in this case the .FRM file will need to record information of checksum from all partitions, not only from the table
i think that a nice 'checksum' could be:
server start time (unix time from mysql process startup) +
database name +
table name +
partition information +
table/partition counters +
table/partition checksum
table checksum can be something about table create time or another thing like this (md5 of .frm file for example)
check that a server can flush table (close table) from 'table cache' or not, in this case we could implement table counters in memory, and when server flush the table we write counters to disk
in checksum we have the server start time, a server crash will invalidate the cache, i think it's a nice feature, but users will want a on/off switch here
--------
mysql client < - > cache server function (the read part of query cache):
cache_get_query(server_id,sql_flags,sql){
|
here client will send server_id,sql_flags,the sql and the results
|
}
|
cache_dump_queries(server_id){
|
read all queries from cache
|
}
|
cache_dump_tables(server_id){
|
read all tables checksums from cache
|
}
|
|
check that qc_info.cc will have no use now... maybe we could implement a qc_info.cc at mysql server that read all cache information from query cache server
|
check that in this case we could be mysql client, or mysql server
if we use it at mysql server, the client will transparent return cache information
------
mysql client <-> mysql server functions (the instrumentation part of query cache that's not exposed to users today):
since sql_flags is something that change from one mysql/mariadb version to another... maybe we should implement in client library and mysql protocol, a new function to get sql_flags:
|
|
cache_get_sql_flags(){
|
return the sql_flags used in the current mysql client <-> mysql server connection
|
}
|
|
or maybe implement it at SQL language: SELECT SQL_CACHE_FLAGS()
|
returning a binary field with sql flags
|
------
the second part now, is how to scale cache servers?
we can hash the server_id+sql_flags+sql text, and select a query cache using this hash function
in this case we can have for example 5 mysql server + 2 cache server, or anyother mysql/cache server configuration
number of mysql server < number of cache server is stupid in my opnion, example 1 mysql server and 5 cache servers, but we can use it
using qc_info.cc i found that qc normally take about 10us to found the query in query cache, and small queries expend >100us to execute, if we have a network that's slower than query expend time, maybe external query cache server is a waste of hardware, maybe in this case we could use local query cache + external query cache? a small query cache (100MB) at mysql server + a big query cache (16GB) in query cache server?
Attachments
Issue Links
- duplicates
-
MDEV-4598 QUERY CACHE - Client Side Query Cache
- Open