Details

    Description

      Create a plugin API and an example plugin to do query transformations.

      This API should not force plugin to parse or work with the SQL string. Instead it may provide a DOM-like representation of the query and let the plugin to manipulate the tree nodes.

      Attachments

        Issue Links

          Activity

            this implementation from https://github.com/mariadb/server/commit/c511453f2ff22addde52e4fda3795755f52a5a77 has a couple of issues:

            • it used the audit plugin, which is conceptually wrong, audit plugins are essentially only observers by definition and cannot change anything in the query
            • it is very spider-specific, the plugin has minimal changes in the server and all the logic is done inside spider, basically it has a whole SQL parser internally now. We definitely cannot request all query rewrite plugins to implement SQL parsers, we need a much simpler solution.

            Let's discuss the API first.

            serg Sergei Golubchik added a comment - this implementation from https://github.com/mariadb/server/commit/c511453f2ff22addde52e4fda3795755f52a5a77 has a couple of issues: it used the audit plugin, which is conceptually wrong, audit plugins are essentially only observers by definition and cannot change anything in the query it is very spider-specific, the plugin has minimal changes in the server and all the logic is done inside spider, basically it has a whole SQL parser internally now. We definitely cannot request all query rewrite plugins to implement SQL parsers, we need a much simpler solution. Let's discuss the API first.
            bscalzo Bert Scalzo added a comment -

            If the initial version should work 100% the same as the current 5.7 MySQL plug-in - that would be ideal and would keep things very simple. The MySQL implementation essentially has a table with five main columns - id (pk), database, original sql, replacement SQL, and enabled flag. The idea is that a DBA or developer can review the slow log to find a problematic SQL, and then simply insert a row with the original vs the replacement. That allows a simple way for correcting bad SQL where you cannot access the code (i.e. third party tools). Note that the SQL strings do NOT replace constant or literal values with ? or anything. This is acceptable to match MySQL implementation for compatibility.

            In the future, maybe we could discuss making this feature smarter and work more like PostgreSQL - where we could consider allowing replacement of constant and literal values with a ?. But remember that if the SQL says select * from employee where col=value, then the value really does matter. If the table has fewer rows matching the value then using the index if better, otherwise doing a full table scan is better. The point is that just using a ? rather than the value would not make such replacement easy. In fact PostgreSQL offers the ability to take the SQL string with ? such as select * from table where col=? and to run a function to return the value to facilitate doing such replacement. So we should discuss and debate before choosing this direction.

            In the distant future, having the ability to work like Oracle or SQL Server and creating a named execution plan "stored profile" or "SQL Store" object where the plan is the main object could also be discussed, but there are problem with this approach as well. If I find a query where the original has N where clause conditions such as col1=1 and col2=2, etc and then the rewrite has either N+1 or N-1 conditions in the rewrite, then the optimizer in these other databases sees that difference and uses the original SQL and redoes the execution plan. So these databases will not allow for the tuning person to replace a SQL unless the numbers of conditions with values are identical. That can defaet the prupose of a SQL rewrite entirely. So again recommend that we discuss in more detail.

            Note that right now there is a 3rd party product from QIKR called HyperQuery which uses the MySQL plug-in as is to take bad SQL strings found in the slow log, runs an AI engine against a knowledge base to remove all bad programming mistakes and that further tries to rewrite the SQL to be more efficient for the original intent than what was coded. That tool offers magnitudes of performance improvement. That is not because of any issue or flaw with the optimizer, but rather that many SQL coders and tools that generate SQL for many databases (BI tools) create ASCII correct SQL that often does not perform well universally. So having the ability to do a simple replacement as with the MySQL plug-in allows for tools like this to add significant value to the database platform.

            bscalzo Bert Scalzo added a comment - If the initial version should work 100% the same as the current 5.7 MySQL plug-in - that would be ideal and would keep things very simple. The MySQL implementation essentially has a table with five main columns - id (pk), database, original sql, replacement SQL, and enabled flag. The idea is that a DBA or developer can review the slow log to find a problematic SQL, and then simply insert a row with the original vs the replacement. That allows a simple way for correcting bad SQL where you cannot access the code (i.e. third party tools). Note that the SQL strings do NOT replace constant or literal values with ? or anything. This is acceptable to match MySQL implementation for compatibility. In the future, maybe we could discuss making this feature smarter and work more like PostgreSQL - where we could consider allowing replacement of constant and literal values with a ?. But remember that if the SQL says select * from employee where col=value, then the value really does matter. If the table has fewer rows matching the value then using the index if better, otherwise doing a full table scan is better. The point is that just using a ? rather than the value would not make such replacement easy. In fact PostgreSQL offers the ability to take the SQL string with ? such as select * from table where col=? and to run a function to return the value to facilitate doing such replacement. So we should discuss and debate before choosing this direction. In the distant future, having the ability to work like Oracle or SQL Server and creating a named execution plan "stored profile" or "SQL Store" object where the plan is the main object could also be discussed, but there are problem with this approach as well. If I find a query where the original has N where clause conditions such as col1=1 and col2=2, etc and then the rewrite has either N+1 or N-1 conditions in the rewrite, then the optimizer in these other databases sees that difference and uses the original SQL and redoes the execution plan. So these databases will not allow for the tuning person to replace a SQL unless the numbers of conditions with values are identical. That can defaet the prupose of a SQL rewrite entirely. So again recommend that we discuss in more detail. Note that right now there is a 3rd party product from QIKR called HyperQuery which uses the MySQL plug-in as is to take bad SQL strings found in the slow log, runs an AI engine against a knowledge base to remove all bad programming mistakes and that further tries to rewrite the SQL to be more efficient for the original intent than what was coded. That tool offers magnitudes of performance improvement. That is not because of any issue or flaw with the optimizer, but rather that many SQL coders and tools that generate SQL for many databases (BI tools) create ASCII correct SQL that often does not perform well universally. So having the ability to do a simple replacement as with the MySQL plug-in allows for tools like this to add significant value to the database platform.
            james.wang james wang added a comment -

            Can I also suggest that the rewrite plugin able to:

            rewrites OR with UNION automatically if ORed tables are different and their columns are indexed?
            e.g.
            select t1.col2,t2.col3 from table t1 inner join table t2 using id where t1.col1=blah_blah OR t2.col2=blah_blah;
            both t1.col1 and t2.col2 are indexed.

            replaced with:
            (select t1.col2,t2.col3 from table t1 inner join table t2 using id where t1.col1=blah_blah)
            UNION
            (select t1.col2,t2.col3 from table t1 inner join table t2 using id where t2.col2=blah_blah);

            Thanks

            james.wang james wang added a comment - Can I also suggest that the rewrite plugin able to: rewrites OR with UNION automatically if ORed tables are different and their columns are indexed? e.g. select t1.col2,t2.col3 from table t1 inner join table t2 using id where t1.col1=blah_blah OR t2.col2=blah_blah; both t1.col1 and t2.col2 are indexed. replaced with: (select t1.col2,t2.col3 from table t1 inner join table t2 using id where t1.col1=blah_blah) UNION (select t1.col2,t2.col3 from table t1 inner join table t2 using id where t2.col2=blah_blah); Thanks
            stephane@skysql.com VAROQUI Stephane added a comment - - edited

            @james wang , i think this case is already covered by index merge optimization or is not the case ?
            But for sure having an UNION optimization for C1 IN RANGE AND C2 IN RANGE for covering index (C1,C2) could be more efficient compare to intersect or ICP , also in your case UNION does a distinct that could cost a lot more if you don't work on index but on temporary tables with big dataset.

            stephane@skysql.com VAROQUI Stephane added a comment - - edited @james wang , i think this case is already covered by index merge optimization or is not the case ? But for sure having an UNION optimization for C1 IN RANGE AND C2 IN RANGE for covering index (C1,C2) could be more efficient compare to intersect or ICP , also in your case UNION does a distinct that could cost a lot more if you don't work on index but on temporary tables with big dataset.
            james.wang james wang added a comment -

            @VAROQUI Stephane if ORs for the same table, yes "index merge optimization" covers .

            If different tables, nope

            james.wang james wang added a comment - @VAROQUI Stephane if ORs for the same table, yes "index merge optimization" covers . If different tables, nope

            People

              serg Sergei Golubchik
              stephane@skysql.com VAROQUI Stephane
              Votes:
              7 Vote for this issue
              Watchers:
              21 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.