[MDEV-4419] SQL PARSER - Enum Optimization - some queries could use index and could return 'Impossible Where' when using ENUM fields and constants outside ENUM declaration Created: 2013-04-23 Updated: 2015-11-17 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | roberto spadim | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | optimizer | ||
| Attachments: |
|
| Description |
|
Hi guys, i found a optimization (maybe a bug?) that could be 'easly' optimized, check this query...
this result in 446196 rows (read all table), and changing it to:
result in 16 rows and 'using where' (an index)
— the idea is... — A second optimization is: check that enum is a diferent field type (for where part, select part and order by/group by parts)... for sit_spa enum ('d','s') not null for sit_spa enum ('d','s') (without not null), enum_index NULL = NULL when using enum_value ='3' and enum_index=2, example: enum ('1','3') — We have a range optimizer (in opt_range.cc), it can already do pretty, much what you say, it can change, for example
to
So, perhaps, all you need to do is to make sure this code is used for your ENUM queries. — but reading sql_select.cc we have remove_eq_conds that check conditions one by one, and can rewrite the condition or return a impossible where, or a always true condition — something like:
|
| Comments |
| Comment by Elena Stepanova [ 2013-04-23 ] | ||
|
Hi Roberto, Please provide the full data dump for dig_lotes. Another question, >> optimize "enum_field!=enum_value" removing the enum_value from enum declaration list. Would you be willing to implement the optiization and provide a patch? | ||
| Comment by roberto spadim [ 2013-04-23 ] | ||
|
hi elena, i don't have many contact with source code of mariadb, but i didn't understand if you got the main idea, something like... enum_field can have enum(x,y,z,....) values + wrong value of enum( out of fields ) before execute field!=value check if the field is a enum or not, if yes it could be optimized to field IN (possible values) (i didn't checked if NOT IN () is optimized but i'm sure that IN is optimized, that's the work around that i'm using to execute SQL via index instead full table scan) i don't know where could i help in the code :/ | ||
| Comment by roberto spadim [ 2013-04-23 ] | ||
|
using the table of
result = Using where, 446577 rows, type= ALL and this (using 'enum field to optimize'):
result = Using where, 28 rows, key = consulta_lote, type = range CREATE TABLE `dig_lotes` (
`data_spa_d` datetime NOT NULL DEFAULT '0000-00-00 00:00:00', | ||
| Comment by roberto spadim [ 2013-04-23 ] | ||
|
here a smaller test table with the same 'problem': | ||
| Comment by roberto spadim [ 2013-04-23 ] | ||
|
SEE THE DIFERENCE WITH TABLE OF LAST POST (5000 ROWS) 1) 2) | ||
| Comment by roberto spadim [ 2013-04-23 ] | ||
|
an example in php... function optimize_diference_enum($enum_values,$enum_diff_value){ i don't know where in mariadb code $enum_field_values could be checked when optimizing the query enum have a very nice point for optimizer, since we know what values can be found without reading the table! for example: 1)select * from tmp_test where sit_spa='asdfasdf' 2)select * from tmp_test where sit_spa!='a' i will stop posting examples in this MDEV, it's too long | ||
| Comment by Elena Stepanova [ 2013-04-24 ] | ||
|
Hi Roberto, >> i don't have many contact with source code of mariadb, but i didn't understand if you got the main idea... I got your idea, it's just when somebody says that something is easy to implement, I hope they already know how exactly to implement it, hence the simplest way to get it done is to provide it in the form of a patch. Anyway, I will re-attach your test case to the report if you don't mind, so it's not lost, and will convert it into a task (a.k.a feature request). | ||
| Comment by Elena Stepanova [ 2013-04-24 ] | ||
|
SQL from http://www.spadim.com.br/mdev%204419.sql is attached as mdev4419.tar.gz | ||
| Comment by roberto spadim [ 2013-04-24 ] | ||
|
yes i agree with you, it's not 'easy' since i don't know how to do it | ||
| Comment by roberto spadim [ 2013-06-14 ] | ||
|
Two points of optimization: | ||
| Comment by roberto spadim [ 2013-06-14 ] | ||
|
to anyone who want help, maybe get_mm_leaf in opt_range.cc is the point to optimize, i don't know the source code, but i will register here, to don't forget or be a start point to anyone |