[MDEV-27110] partition statistics Created: 2021-11-23  Updated: 2023-11-09

Status: Open
Project: MariaDB Server
Component/s: Partitioning
Fix Version/s: None

Type: Task Priority: Major
Reporter: Jinny Yang Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: eits, optimizer-feature


 Description   

Hi team,
When I enabled EITS and tried to collect statistics for one particular partition, it collected EITS for the whole table. Only innodb engine statistics is collected for each partition separately.
And when I ran some queries, I noticed optimizer referred both engine stats and EITS, although use_stat_tables=PREFERABLY.

May I know if this is designed on purpose, and will EITS be collected for each partition separately in the future?

Many thanks.
Best regards,
Jinny

MariaDB [jinny]> show create table orders\G
*************************** 1. row ***************************
       Table: orders
Create Table: CREATE TABLE `orders` (
  `o_orderkey` int(11) NOT NULL,
  `o_custkey` int(11) DEFAULT NULL,
  `o_orderstatus` char(1) DEFAULT NULL,
  `o_totalprice` decimal(10,2) DEFAULT NULL,
  `o_orderDATE` date NOT NULL,
  `o_orderpriority` char(15) DEFAULT NULL,
  `o_clerk` char(15) DEFAULT NULL,
  `o_shippriority` int(11) DEFAULT NULL,
  `o_comment` varchar(79) DEFAULT NULL,
  PRIMARY KEY (`o_orderkey`,`o_orderDATE`),
  KEY `i_o_orderdate` (`o_orderDATE`),
  KEY `i_o_custkey` (`o_custkey`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
 PARTITION BY RANGE (year(`o_orderDATE`))
(PARTITION `p_1992` VALUES LESS THAN (1993) ENGINE = InnoDB,
 PARTITION `p_1993` VALUES LESS THAN (1994) ENGINE = InnoDB,
 PARTITION `p_1994` VALUES LESS THAN (1995) ENGINE = InnoDB,
 PARTITION `p_1995` VALUES LESS THAN (1996) ENGINE = InnoDB,
 PARTITION `p_1996` VALUES LESS THAN (1997) ENGINE = InnoDB,
 PARTITION `p_1997` VALUES LESS THAN (1998) ENGINE = InnoDB,
 PARTITION `p_max_future_dates` VALUES LESS THAN MAXVALUE ENGINE = InnoDB)
1 row in set (0.000 sec)
 
 
MariaDB [jinny]> set use_stat_tables=PREFERABLY;
Query OK, 0 rows affected (0.000 sec)
 
 
MariaDB [jinny]> select @@use_stat_tables;
+-------------------+
| @@use_stat_tables |
+-------------------+
| PREFERABLY        |
+-------------------+
1 row in set (0.000 sec)
 
 
MariaDB [jinny]> alter table orders analyze partition p_1995;
+--------------+---------+----------+-----------------------------------------+
| Table        | Op      | Msg_type | Msg_text                                |
+--------------+---------+----------+-----------------------------------------+
| jinny.orders | analyze | status   | Engine-independent statistics collected |
| jinny.orders | analyze | status   | OK                                      |
+--------------+---------+----------+-----------------------------------------+
2 rows in set (19.475 sec)
 
 
MariaDB [jinny]>  select * from mysql.table_stats where table_name like 'orders%';
+---------+------------+-------------+
| db_name | table_name | cardinality |
+---------+------------+-------------+
| jinny   | orders     |     1500000 |
+---------+------------+-------------+
 
MariaDB [jinny]> select * from mysql.innodb_table_stats where table_name like 'orders%';
+---------------+-----------------------------+---------------------+---------+----------------------+--------------------------+
| database_name | table_name                  | last_update         | n_rows  | clustered_index_size | sum_of_other_index_sizes |
+---------------+-----------------------------+---------------------+---------+----------------------+--------------------------+
| jinny         | orders#P#p_1992             | 2021-11-19 12:07:21 |  224875 |                 1892 |                      770 |
| jinny         | orders#P#p_1993             | 2021-11-19 12:07:21 |  226044 |                 1892 |                      770 |
| jinny         | orders#P#p_1994             | 2021-11-19 12:07:21 |  224410 |                 1892 |                      770 |
| jinny         | orders#P#p_1995             | 2021-11-19 12:23:40 |  225875 |                 1892 |                      770 |>>>>>>>>>>>>>>>>>>>>>>>this is affected by "alter table orders analyze partition p_1995"
| jinny         | orders#P#p_1996             | 2021-11-19 12:07:21 |  226310 |                 1892 |                      770 |
| jinny         | orders#P#p_1997             | 2021-11-19 12:07:21 |  226102 |                 1892 |                      770 |
| jinny         | orders#P#p_max_future_dates | 2021-11-19 12:07:21 |  132749 |                 1123 |                      514 |
+---------------+-----------------------------+---------------------+---------+----------------------+--------------------------+
 
 
MariaDB [jinny]> select sum(n_rows) from mysql.innodb_table_stats where database_name='jinny' and table_name like 'orders%';
+-------------+
| sum(n_rows) |
+-------------+
|     1489127 |                                                                            >>>>>>>>>>>>>>number of rows from engine statistics
+-------------+
1 row in set (0.000 sec)
 
 
MariaDB [jinny]>  select * from mysql.table_stats where table_name like 'orders%';
+---------+------------+-------------+
| db_name | table_name | cardinality |
+---------+------------+-------------+
| jinny   | orders     |     1500000 |                                                     >>>>>>>>>>>>>>number of rows from EITS statistics
+---------+------------+-------------+
 
 
MariaDB [jinny]> explain format=json select * from orders\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "orders",
      "partitions": [
        "p_1992",
        "p_1993",
        "p_1994",
        "p_1995",
        "p_1996",
        "p_1997",
        "p_max_future_dates"
      ],
      "access_type": "ALL",
      "rows": 1489127,                                                                  >>>>>>>>>>>>>>the same as engine statistics
      "filtered": 100
    }
  }
}
 
 
MariaDB [jinny]> explain format=json select * from orders where o_orderdate='1995-05-01' and o_orderstatus in ('O','P')\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "orders",
      "partitions": ["p_1995"],
      "access_type": "ref",
      "possible_keys": ["i_o_orderdate"],
      "key": "i_o_orderdate",
      "key_length": "3",
      "used_key_parts": ["o_orderDATE"],
      "ref": ["const"],
      "rows": 629,
      "filtered": 50.78125,                                                              >>>>>>>>>>>>>>EITS
      "attached_condition": "orders.o_orderstatus in ('O','P')"
    }
  }
}
1 row in set (0.000 sec)



 Comments   
Comment by Sergei Petrunia [ 2021-11-23 ]

Yes, it is our intent that EITS statistics are collected on a per-partition basis.

The current approach where EITS statistics are for the whole table - this was either oversight or lack of resources, or both. There is no argument why statistics should be global, while there are plenty of arguments that it should be per-partition.

Generated at Thu Feb 08 09:50:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.