[MDEV-29104] Performance Issues Reading information_schema.PARAMETERS Created: 2022-07-14  Updated: 2023-03-21  Resolved: 2022-10-14

Status: Closed
Project: MariaDB Server
Component/s: Server
Fix Version/s: 10.11.0

Type: Task Priority: Major
Reporter: Juan Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Preview_10.11

Attachments: File cs0401486-test.sql     Text File mdev29104-test-log.txt    
Issue Links:
PartOf
is part of MDEV-29547 prepare 10.11.0 preview releases Closed
Relates
relates to MDEV-20609 Full table scan in INFORMATION_SCHEMA... Closed

 Description   

Reading from information_schema.PARAMETERS table in MariaDB is slower than MySQL by a factor of 20x in the best of cases, and deteriorates linearly in MariaDB once the number of reading threads exceeds the number of available cores, whereas in MySQL the deterioration remains more-or-less flat.

The following table shows relative performance running mysqlslap against the dataset provided in the attached sql file cs0401486-test.sql with this series of commands

for i in 1 5 20 100
do
	mysqlslap -q "select * from information_schema.parameters where SPECIFIC_NAME = 'aaa_sp50001'" -i 10 -c $i
done

Tests conducted in CentOS 7 machines with 1G RAM and approx. 800 real IOPS with one and four Intel i7-9750H cores @ 2.592GHz, average times reported:

CoresThreads MDB 6.7  MySQL 8 Diff %
  1     1     0.193    0.008   2413%
  1     5     0.901    0.015   6007%
  4     1     0.171    0.004   4275%
  4     5     0.296    0.007   4229%
  4     20    1.159    0.066   1756%
  4    100    19.017   0.340   5593%

Timings all on MariaDB 10.6.7. Confirmed on 10.6.8, 10.7.4, and 10.8.3 in Rocky Linux.



 Comments   
Comment by Elena Stepanova [ 2022-07-26 ]

Assigning to psergei because it was previously discussed that he was looking into similar issues. Please reassign if needed.

Comment by Sergei Petrunia [ 2022-09-16 ]

Closing as Duplicate of MDEV-20609

Comment by Sergei Petrunia [ 2022-09-16 ]

Wait this MDEV (and the underlying support issue) uses this query:

select * from information_schema.parameters where SPECIFIC_NAME = 'aaa_sp50001'

That is, they search for the routine aaa_sp50001 *in all databases*. I doubt MDEV-20609 would be of help. I also think that it's reasonable to ask the customer to use a condtion on the database name, unless they indeed want to search in all databases.

Also they claim MySQL is faster for such query. oleg.smirnov can you check if MySQL-8.0 has any optimizations for this?

Comment by Oleg Smirnov [ 2022-09-19 ]

MySQL has significantly different structure here - I_S.PARAMETERS is a view:

CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER 
VIEW `PARAMETERS` AS 
select 
  `cat`.`name` AS `SPECIFIC_CATALOG`,
  `sch`.`name` AS `SPECIFIC_SCHEMA`,
  `rtn`.`name` AS `SPECIFIC_NAME`,
  if((`rtn`.`type` = 'FUNCTION'),(`prm`.`ordinal_position` - 1),`prm`.`ordinal_position`) AS `ORDINAL_POSITION`,
  if(((`rtn`.`type` = 'FUNCTION') and (`prm`.`ordinal_position` = 1)),NULL,`prm`.`mode`) AS `PARAMETER_MODE`,
  if(((`rtn`.`type` = 'FUNCTION') and (`prm`.`ordinal_position` = 1)),NULL,`prm`.`name`) AS `PARAMETER_NAME`,
  substring_index(substring_index(`prm`.`data_type_utf8`,'(',1),' ',1) AS `DATA_TYPE`,
  internal_dd_char_length(`prm`.`data_type`,`prm`.`char_length`,`col`.`name`,0) AS `CHARACTER_MAXIMUM_LENGTH`,
  internal_dd_char_length(`prm`.`data_type`,`prm`.`char_length`,`col`.`name`,1) AS `CHARACTER_OCTET_LENGTH`,
  `prm`.`numeric_precision` AS `NUMERIC_PRECISION`,
  if((`prm`.`numeric_precision` is null),NULL,ifnull(`prm`.`numeric_scale`,0)) AS `NUMERIC_SCALE`,
  `prm`.`datetime_precision` AS `DATETIME_PRECISION`,
  (case `prm`.`data_type` 
    when 'MYSQL_TYPE_STRING' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_VAR_STRING' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_VARCHAR' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_TINY_BLOB' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_MEDIUM_BLOB' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_BLOB' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_LONG_BLOB' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_ENUM' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) 
    when 'MYSQL_TYPE_SET' then if((`cs`.`name` = 'binary'),NULL,`cs`.`name`) else NULL end) AS `CHARACTER_SET_NAME`,
  (case `prm`.`data_type` when 'MYSQL_TYPE_STRING' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_VAR_STRING' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_VARCHAR' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_TINY_BLOB' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_MEDIUM_BLOB' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_BLOB' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_LONG_BLOB' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_ENUM' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) 
    when 'MYSQL_TYPE_SET' then if((`cs`.`name` = 'binary'),NULL,`col`.`name`) else NULL end) AS `COLLATION_NAME`,
  `prm`.`data_type_utf8` AS `DTD_IDENTIFIER`,
  `rtn`.`type` AS `ROUTINE_TYPE` 
  from (((((`mysql`.`parameters` `prm` join `mysql`.`routines` `rtn` on((`prm`.`routine_id` = `rtn`.`id`))) 
    join `mysql`.`schemata` `sch` on((`rtn`.`schema_id` = `sch`.`id`))) 
    join `mysql`.`catalogs` `cat` on((`cat`.`id` = `sch`.`catalog_id`))) 
    join `mysql`.`collations` `col` on((`prm`.`collation_id` = `col`.`id`))) 
    join `mysql`.`character_sets` `cs` on((`col`.`character_set_id` = `cs`.`id`))) 
  where (0 <> can_access_routine(`sch`.`name`,`rtn`.`name`,`rtn`.`type`,`rtn`.`definer`,false))

So the explain plans looks like as follows:

mysql> explain select * from parameters where specific_name='format_bytes';
+----+-------------+-------+------------+--------+--------------------------+---------+---------+----------------------------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys            | key     | key_len | ref                        | rows | filtered | Extra                                      |
+----+-------------+-------+------------+--------+--------------------------+---------+---------+----------------------------+------+----------+--------------------------------------------+
|  1 | SIMPLE      | cat   | NULL       | index  | PRIMARY                  | name    | 194     | NULL                       |    1 |   100.00 | Using index                                |
|  1 | SIMPLE      | prm   | NULL       | ALL    | routine_id,collation_id  | NULL    | NULL    | NULL                       |   83 |   100.00 | Using where; Using join buffer (hash join) |
|  1 | SIMPLE      | rtn   | NULL       | eq_ref | PRIMARY,schema_id        | PRIMARY | 8       | mysql.prm.routine_id       |    1 |    10.00 | Using where                                |
|  1 | SIMPLE      | sch   | NULL       | eq_ref | PRIMARY,catalog_id       | PRIMARY | 8       | mysql.rtn.schema_id        |    1 |   100.00 | Using where                                |
|  1 | SIMPLE      | col   | NULL       | eq_ref | PRIMARY,character_set_id | PRIMARY | 8       | mysql.prm.collation_id     |    1 |   100.00 | NULL                                       |
|  1 | SIMPLE      | cs    | NULL       | eq_ref | PRIMARY                  | PRIMARY | 8       | mysql.col.character_set_id |    1 |   100.00 | NULL                                       |
+----+-------------+-------+------------+--------+--------------------------+---------+---------+----------------------------+------+----------+--------------------------------------------+
6 rows in set, 1 warning (0.00 sec)

Comment by Oleg Smirnov [ 2022-09-19 ]

Short overview of how the Data Dictionary architecture has been reworked in MySQL 8.0: https://www.slideshare.net/StleDeraas/data-dictionary-pl17. One of the reasons mentioned is low performance of queries to INFORMATION_SCHEMA tables.

Comment by Oleg Smirnov [ 2022-09-19 ]

Implemented an optimization suggested by psergei:

commit ac6d5ff303323a45b0125a02f41c760b30d0e1fd (HEAD -> bb-10.11-MDEV-29104, origin/bb-10.11-MDEV-29104)
Author: Oleg Smirnov <olernov@gmail.com>
Date:   Mon Sep 19 21:36:09 2022 +0700
 
    MDEV-29104 Optimize queries to INFORMATION_SCHEMA.PARAMETERS/ROUTINES
 
    For queries like
        "SELECT * FROM INFORMATION_SCHEMA.PARAMETERS
          WHERE SPECIFIC_NAME='proc_name'"
    and
        "SELECT * FROM INFORMATION_SCHEMA.ROUTINES
          WHERE ROUTINE_NAME='proc_name'"
    there is a possibility to avoid loading a stored procedure code and
    parsing it to retrieve parameters.
    If the name of the procedure/function is specified explicitly then
    it is possible to filter out routines that do not match at
    an early stage.

Comment by Sergei Petrunia [ 2022-09-22 ]

queries	Average query time		
queries	improved2-MDEV-29104	improved1-MDEV-20609	10.11-vanilla
1	0.006	0.138	0.136
5	0.033	0.283	0.293
20	0.158	2.152	2.146
100	0.8	10.802	10.868

Comment by Ramesh Sivaraman [ 2022-10-11 ]

okay to push

Comment by Oleg Smirnov [ 2022-10-14 ]

Pushed to 10.11.

Generated at Thu Feb 08 10:05:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.