Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7648

Extra data in ANALYZE FORMAT=JSON $stmt

Details

    Description

      ANALYZE FORMAT=JSON $stmt can (should) print more data about execution.

      Requested things

      • execution time of various parts of the statement
      • Real IO done by accesses to different tables
      • etc etc

      Attachments

        Issue Links

          Activity

            Tracking table read times will require start/stop points.

            The choice of starting point is obvious:

            • In JOIN::save_explain(), remember the current counter values. This is where the tracking starts

            As for ending, there are options:

            • Do it somewhere in JOIN::cleanup(full=true). This allows to save the data from index_merge's extra handlers before they are removed.
            • Do it when producing the ANALYZE output. This will work (tables are closed after ANALYZE is printed). This is not as intrusive (ANALYZE code is in one place).
            • Install into PFS_table a callback, which the table will call when it is closed. This is useful for index_merge/DS-MRR's extra handlers, and also compatible with log_slow_query. However, it is not compatible with the current location of code that prints ANALYZE: tables are still open when ANALYZE output is produced. Moving the code that prints ANALYZE further is difficult, because 1. other parts of the query might be freed 2. it goes against the general structure of the code.
            psergei Sergei Petrunia added a comment - Tracking table read times will require start/stop points. The choice of starting point is obvious: In JOIN::save_explain(), remember the current counter values. This is where the tracking starts As for ending, there are options: Do it somewhere in JOIN::cleanup(full=true). This allows to save the data from index_merge's extra handlers before they are removed. Do it when producing the ANALYZE output. This will work (tables are closed after ANALYZE is printed). This is not as intrusive (ANALYZE code is in one place). Install into PFS_table a callback, which the table will call when it is closed. This is useful for index_merge/DS-MRR's extra handlers, and also compatible with log_slow_query. However, it is not compatible with the current location of code that prints ANALYZE: tables are still open when ANALYZE output is produced. Moving the code that prints ANALYZE further is difficult, because 1. other parts of the query might be freed 2. it goes against the general structure of the code.
            psergei Sergei Petrunia added a comment - - edited

            Hit an interesting property in P_S: http://bugs.mysql.com/bug.php?id=76100. It doesn't directly affect this task.

            psergei Sergei Petrunia added a comment - - edited Hit an interesting property in P_S: http://bugs.mysql.com/bug.php?id=76100 . It doesn't directly affect this task.

            Ok so we now can collect the time that is spent accessing the tables.
            The next step is "execution time of various parts of the statement". We want to track the time spent in various subqueries, i.e SELECTs. P_S and profiling system has calls like this:

            void JOIN::exec_inner()
            {
              ...
              THD_STAGE_INFO(thd, stage_executing);

            It's tempting to re-use those to collect the time spent in SELECTs... but we can't do that, because

            • stage_executing doesn't tell which select is being run.
            • there is no THD_STAGE_INFO() call at the end of JOIN::exec_inner
              • when a child select changes current stage, nobody will care to set it back?
            • the stages are not timed by default.

            select * from performance_schema.setup_instruments where name like 'stage%';

            shows that stage tracking is not enabled by default.

            psergei Sergei Petrunia added a comment - Ok so we now can collect the time that is spent accessing the tables. The next step is "execution time of various parts of the statement". We want to track the time spent in various subqueries, i.e SELECTs. P_S and profiling system has calls like this: void JOIN::exec_inner() { ... THD_STAGE_INFO(thd, stage_executing); It's tempting to re-use those to collect the time spent in SELECTs... but we can't do that, because stage_executing doesn't tell which select is being run. there is no THD_STAGE_INFO() call at the end of JOIN::exec_inner when a child select changes current stage, nobody will care to set it back? the stages are not timed by default. select * from performance_schema.setup_instruments where name like 'stage%'; shows that stage tracking is not enabled by default.
            psergei Sergei Petrunia added a comment - - edited

            Examples of output with r_time_ms, time spent to access the table:
            https://gist.github.com/spetrunia/ae497fb33e2b5ff2b2c6
            https://gist.github.com/spetrunia/aa4c22b748eb74906950

            ANALYZE in PostgreSQL:
            https://gist.github.com/spetrunia/cab742456f9b0ebb74cf

            PostgreSQL also support a kind of ANALYZE FORMAT=JSON. (their syntax is explain (analyze, format json) select):
            https://gist.github.com/spetrunia/e0e5b706f4691aa81006

            psergei Sergei Petrunia added a comment - - edited Examples of output with r_time_ms, time spent to access the table: https://gist.github.com/spetrunia/ae497fb33e2b5ff2b2c6 https://gist.github.com/spetrunia/aa4c22b748eb74906950 ANALYZE in PostgreSQL: https://gist.github.com/spetrunia/cab742456f9b0ebb74cf PostgreSQL also support a kind of ANALYZE FORMAT=JSON. (their syntax is explain (analyze, format json) select ): https://gist.github.com/spetrunia/e0e5b706f4691aa81006

            Switched from using P_S to using our own timing code.

            psergei Sergei Petrunia added a comment - Switched from using P_S to using our own timing code.

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.