[MDEV-20609] Full table scan in INFORMATION_SCHEMA.PARAMETERS/ROUTINES Created: 2019-09-17  Updated: 2023-08-07  Resolved: 2022-10-13

Status: Closed
Project: MariaDB Server
Component/s: Server
Fix Version/s: 10.11.0

Type: Task Priority: Critical
Reporter: Diego Dupin Assignee: Oleg Smirnov
Resolution: Fixed Votes: 2
Labels: CONNECTOR_RELATED, Preview_10.11

Attachments: File mdev20609.diff     PNG File screenshot-1.png     PNG File screenshot-2.png    
Issue Links:
PartOf
is part of MDEV-29547 prepare 10.11.0 preview releases Closed
Problem/Incident
causes CONJ-684 Performance of con.getMetaData().getP... Confirmed
causes MDEV-29550 Valgrind/MSAN uninitialised value err... Closed
causes MDEV-29858 Missing DBUG_RETURN or DBUG_VOID_RETU... Closed
Relates
relates to MDEV-28979 Feature Request: Allow for indexes to... Closed
relates to MDEV-29104 Performance Issues Reading informatio... Closed

 Description   

It seems there is an issue when querying on I_S.PARAMETERS table (used in java metadata) :
when doing this kind of query (one result) :
SELECT * FROM INFORMATION_SCHEMA.PARAMETERS WHERE SPECIFIC_SCHEMA = 'testj' AND SPECIFIC_NAME = 'p2_512'
It takes 15ms if 400 parameters in I_S.PARAMETERS table
It takes 65ms if 4K parameters
It takes 350ms if 20K parameters

It seems to execute a full table scan in INFORMATION_SCHEMA.PARAMETERS

The same query takes 6ms on mysql 8 with 20K same parameters



 Comments   
Comment by Sergei Golubchik [ 2019-09-17 ]

I_S code has support for up to two indexed columns. INFORMATION_SCHEMA.PARAMETERS should make use of that.

Comment by Diego Dupin [ 2020-08-31 ]

mysql 8 : DESCRIBE SELECT * FROM INFORMATION_SCHEMA.PARAMETERS WHERE SPECIFIC_SCHEMA = 'testj' AND SPECIFIC_NAME = 'p2_512'

mariadb 10.5:

Comment by Sergei Petrunia [ 2021-10-19 ]

Ok I_S.PARAMETERS table provides information about parameters for stored routines.

MariaDB code has optimization for e.g. I_S.COLUMNS where it can infer the database and/or table name from the WHERE clause.

The optimization is invoked in optimize_schema_tables_reads(). See the optimize_for_get_all_tables() call.

However the optimization is not used for I_S.PARAMETERS table. The execution takes branch:

      /* skip I_S optimizations specific to get_all_tables */
      if (table_list->schema_table->fill_table != get_all_tables)
        continue;

because

(gdb) p table_list->schema_table->fill_table
  $14 = (int (*)(THD *, TABLE_LIST *, COND *)) 0x555555f9af34 <fill_schema_proc(THD*, TABLE_LIST*, Item*)>

Comment by Sergei Petrunia [ 2021-10-19 ]

I'ts not only the optimization code that's missing.
The execution part in fill_schema_proc currently uses a full index scan over mysql.proc table (I am not sure why it is a full index scan and not a full table scan).

As far as I understand, it parses each SP unconditionally (no matter whether its columns are needed or not).

Comment by Sergei Petrunia [ 2021-10-19 ]

Possible optimizations would be (in the order of decreasing importance):
1. Try to infer a lookup value from the WHERE condition. This has huge practical importance as most queries request information about one specific stored procedure.
2. Before loading SP, check if the part of the WHERE that we can compute without loading is satisfied.

3. Don't load the SP if we don't need any columns.

Comment by Sergei Petrunia [ 2021-10-19 ]

On first glance, it seems calc_lookup_values_from_cond can be used to infer the lookup value.

Comment by Juan [ 2021-12-06 ]

Hi sanja,

We have a customer that has identified this problem as causing a plugin they use to execute stored procedures to consume 99% of the query time in most cases.

Is there any workaround to help with these missing optimizations from MySQL other than just reading I_S.PARAMETERS into a temporary table and indexing that properly for their plugin to use? (I don't know if this is possible - just trying to see what could be done short of implementing the missing optimizations)

-Juan

Comment by Oleksandr Byelkin [ 2021-12-11 ]

I got an idea to create "shadow" table by triggers on mysql.proc as a workaround, but mariadb does not support triggers on system tables. So the only posibility is creating "shadow" table by schedule (I am not sure if it is acceptable).

Comment by Sergei Golubchik [ 2022-07-06 ]

What shadow table? The fix is, as psergei wrote, use calc_lookup_values_from_cond and then index_read() to read the correct row for the requested routine

Comment by Oleksandr Byelkin [ 2022-07-19 ]

calc_lookup_values_from_cond can be applied only to schema, using it for routine name looks more like a feature

Comment by Oleksandr Byelkin [ 2022-07-20 ]

we discussed it with psergei

calc_lookup_values_from_cond can not be used directly to make optimization possible (it require different fields and applied to database instead of a file system)

There are 2 way to get functionality (both are not a bugfixes bug features):

1. make something like calc_lookup_values_from_cond but for routines with 3 parameters (schema, routine name, routine type) and then in case of present of this parameters make index range scan instead index scan used now.

2. make optimizer think that there are indexes (in both cases routines and tables) and let optimizer process it normally (it is more work but potentially more gain especially in joins).

Comment by Sergei Petrunia [ 2022-07-20 ]

See also MDEV-29104 which was filed about the same (or similar) issue.

Comment by Sergei Golubchik [ 2022-08-01 ]

calc_lookup_values_from_cond() can perfectly be used directly. I've attached a proof-of-concept patch that does just that.

Comment by Oleg Smirnov [ 2022-09-13 ]

The solution also targets queries from I_S.ROUTINES:

SELECT * FROM information_schema.ROUTINES WHERE routine_schema='test' AND routine_name='proc1';
SELECT * FROM information_schema.ROUTINES WHERE routine_schema='test';

The mysql.proc table primary key will be used if routine_schema and/or routine_name are specified explicitly in the query.

Comment by Elena Stepanova [ 2022-09-15 ]

The tree isn't ready for testing, it failed in buildbot.

Comment by Oleg Smirnov [ 2022-09-16 ]

The fix was pushed, the testing may continue.

Comment by Daniel Black [ 2022-09-17 ]

related, pushed for increased usage of information_schema.PARAMETERS in mybatis-3 so much appreciate the increased performance.

Comment by Diego Dupin [ 2022-09-28 ]

Just for confirmation: correction has been benchmarked using java connector and that shows that performance doesn't decrease using 400/4K/20K parameters.

Comment by Ramesh Sivaraman [ 2022-09-29 ]

Performance looks good when compared to 10.6 version.

10.11.0 9f7d2824779785506d1bb316540428990e815245 (Optimized)

for i in 1 5 20 100; do /test/mtest/MD280922-mariadb-10.11.0-linux-x86_64-opt/bin/mysqlslap -uroot -S/test/mtest/MD280922-mariadb-10.11.0-linux-x86_64-opt/socket.sock -q "select * from information_schema.parameters where SPECIFIC_NAME = 'aaa_sp50001'; " -i 10 -c $i; done
Benchmark
	Average number of seconds to run all queries: 0.021 seconds
	Minimum number of seconds to run all queries: 0.010 seconds
	Maximum number of seconds to run all queries: 0.039 seconds
	Number of clients running queries: 1
	Average number of queries per client: 1
 
Benchmark
	Average number of seconds to run all queries: 0.080 seconds
	Minimum number of seconds to run all queries: 0.058 seconds
	Maximum number of seconds to run all queries: 0.095 seconds
	Number of clients running queries: 5
	Average number of queries per client: 1
 
Benchmark
	Average number of seconds to run all queries: 0.434 seconds
	Minimum number of seconds to run all queries: 0.411 seconds
	Maximum number of seconds to run all queries: 0.450 seconds
	Number of clients running queries: 20
	Average number of queries per client: 1
 
Benchmark
	Average number of seconds to run all queries: 2.793 seconds
	Minimum number of seconds to run all queries: 2.754 seconds
	Maximum number of seconds to run all queries: 2.859 seconds
	Number of clients running queries: 100
	Average number of queries per client: 1

Comment by Ramesh Sivaraman [ 2022-10-07 ]

Performance comparison using Perl and Python mysql connector

perl mysql connector
10.6.10
-----------
Queries: 1 | Execution time(seconds) : 0.237706 
Queries: 5 | Execution time(seconds) : 1.392153 
Queries: 20 | Execution time(seconds) : 5.468022 
Queries: 100 | Execution time(seconds) : 27.572478 
 
10.7.6
----------
Queries: 1 | Execution time(seconds) : 0.206260 
Queries: 5 | Execution time(seconds) : 0.917974 
Queries: 20 | Execution time(seconds) : 3.775994 
Queries: 100 | Execution time(seconds) : 18.507874 
 
10.11 Bug fix branch
-------------------------------
Queries: 1 | Execution time(seconds) : 0.010314 
Queries: 5 | Execution time(seconds) : 0.049723 
Queries: 20 | Execution time(seconds) : 0.195309 
Queries: 100 | Execution time(seconds) : 0.975044 
 
python mysql connector
10.6.10
----------
Queries: 5  Execution time(seconds) :  1.1055042743682861
Queries: 20  Execution time(seconds) :  5.25916600227356
Queries: 100  Execution time(seconds) :  27.54954743385315
 
10.7.6
---------
Queries: 5  Execution time(seconds) :  0.7787322998046875
Queries: 20  Execution time(seconds) :  3.5663089752197266
Queries: 100  Execution time(seconds) :  18.440176963806152
 
10.11 Bug fix branch
-------------------------------
Queries: 5  Execution time(seconds) :  0.04043316841125488
Queries: 20  Execution time(seconds) :  0.18743038177490234
Queries: 100  Execution time(seconds) :  0.9795618057250977

Comment by Ramesh Sivaraman [ 2022-10-11 ]

okay to push.

Comment by Oleg Smirnov [ 2022-10-13 ]

Pushed to 10.11.

Generated at Thu Feb 08 09:00:48 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.