[MDEV-27781] Expose syntax parse tree with name resolution Created: 2022-02-08 Updated: 2022-04-14 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Parser |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | Rick James | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | energy, innodb, performance | ||
| Issue Links: |
|
||||||||
| Description |
|
I would like to get the parse tree of a SELECT statement after the names have been resolved. The goal from there is to develop an automated INDEX generator that comes close to optimal for the query. Currently, I am experimenting with PHP and regexps and having moderately good success. The name resolution would take care of single-table queries (wherein I can't easily know what is a column versus a keyword, JOINs without fully qualified column names, etc. Also, the parse tree would take care of some edge cases that my regexps don't cover. I understand that the MariaDB code builds a tree of objects to be fed to the Optimizer. The addon would extract the attributes that I need: constant/column/keyword, column name, table_id, datatype, etc. I see different approaches for hooking this into the server: a plugin or container, or my preference: something similar to "Optimizer Trace" – turn on a flag, and some JSON is made available. Yes, I do understand that there are a lot of variations in optimizing indexes – histograms, probes, BKA, MRR, filter percentage, etc. But you might be horrified to hear how many users don't know what an index is or say "but I indexed every column" or have not heard of a 'composite' index or don't know the difference between JOIN and LEFT JOIN or how an auto_increment PK might slow down a query. I expect my 'addon' will be a useful crutch for novices – They extract the tree; my tool gives them an INDEX recommendation. |