[MDEV-6017] Add support for Indexes on Expressions Created: 2014-04-03  Updated: 2023-12-22

Status: In Review
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 11.5

Type: New Feature Priority: Major
Reporter: Fabio Alessandro Locati Assignee: Sergei Golubchik
Resolution: Unresolved Votes: 20
Labels: gsoc19, index, optimizer-feature

Issue Links:
Blocks
blocks MDEV-4414 INDEX - SUBSTRING, LEFT and others st... Open
blocks MDEV-4430 INDEX Optimization - Virtual Column, ... Closed
blocks MXS-3624 Use indexed virtual columns if such e... Open
is blocked by MDEV-5800 indexes on virtual (not materialized)... Closed
Duplicate
duplicates MDEV-17993 WHERE F(X) = constY should used F(X) ... Closed
PartOf
includes MDEV-5631 Implement Reverse Key Index Closed

 Description   

An index on expression means something like

CREATE TABLE t1 (a int, b int, INDEX (a/2+b));
...
SELECT * FROM t1 WHERE a/2+b=100;

in this case the optimizer should be able to use an index.

This task naturally splits in two steps:

  1. add expression matching into the optimizer, use it for generated columns. Like in
    CREATE TABLE t1 (a int, b int, c INT GENERATED ALWAYS AS (a/2+b), INDEX (c));
  2. support the syntax to create an index on expression directly, this will automatically create a hidden generated column under the hood

original task description is visible in the history



 Comments   
Comment by VAROQUI Stephane [ 2014-04-04 ]

What about using the virtual columns feature ? https://mariadb.com/kb/en/virtual-columns/

Comment by roberto spadim [ 2015-01-29 ]

this allow implement of virtual columns with index

Comment by Daniel Black [ 2017-05-30 ]

fale as mentioned the virtual columns combined with indexes on these provide the equivalent functionality to the Postgres implementation even if slightly different. Queries need to be rewritten to reference the virtual column to use the index on that column. There are no triggers involved. Is this an acceptable implement of your feature request?

Comment by VAROQUI Stephane [ 2017-05-30 ]

May be a limited implementation of simple case of parser rewriting constant to virtual constant can be quick win like:
SELECT id FROM t WHERE c1="www.mariadb.com"

rewrite to for the case of virtual column v1 defined as SUBSTRING_INDEX(c1, '.', 2);

SELECT id FROM t WHERE c1=="www.mariadb.com" AND v1= SUBSTRING_INDEX('www.mariadb.org', '.', 2);

Comment by Sergei Golubchik [ 2017-05-30 ]

danblack, using virtual columns is an acceptable workaround in many cases.

This feature is mainly about using virtual columns automatically, when it's expression is used in a query, not only when a virtual column is used by name. Combining it with MDEV-10177 we can implement "indexes on expressions" with no (user-visible) virtual columns involved.

Comment by Pavel Cibulka [ 2018-08-07 ]

Can you also include Partial / Filtered Indexes MDEV-15140. MariaDB / MySQL is only major DB not supporting this feature.

Comment by Sergei Golubchik [ 2018-08-07 ]

Yes, of course. But it's MDEV-15140, a separate feature

Comment by Nikita Malyavin [ 2019-08-17 ]

According to pgsql docs: https://www.postgresql.org/docs/11/sql-createindex.html
We'll have INDEX((a+b/2))
That's reasonable since we can't distinguish a function call from ident(NUM) on parser level

Comment by Federico Razzoli [ 2023-02-13 ]

Some questions:

  1. Will UNIQUE expressional indexes be supported?
  2. Can we assume that all syntaxes supported for regular index will be supported for expressional indexes, even if this is not shown in the description? Eg: index names, IF [NOT] EXISTS.
  3. Will the index expression be visible in the information_schema?
  4. Is the hidden generated column VIRTUAL or PERSISTENT?
  5. When creating the index on an existing table, will LOCK and ALGORITHM be the same we have with ALTER TABLE ... ADD COLUMN ... GENERATED AS ... , ADD INDEX ... ?
  6. Will expressional index creation be faster than the above?
Comment by Sergei Golubchik [ 2023-02-27 ]

All answers are preliminary, but that's what I think will be:

  1. yes
  2. yes
  3. good question, we haven't thought of that. May be not at first. There is no information_schema table specifically for indexes (STATISTICS is column-centric, one index may occupy many rows)
  4. VIRTUAL
  5. yes
  6. likely not
Generated at Thu Feb 08 07:08:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.