Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33151

Add more columns to TABLE_STATISTICS and USER STATS

Details

    Description

      Following changes are for the userstat plugin:

      Add the following columns to TABLE_STATISTICS for MariaDB monitors:

      • ROWS_INSERTED, ROWS_DELETED, ROWS_UPDATED, KEY_READ_HITS and KEY_READ_MISSES

      Add the following statistics to CLIENT_STATISTICS and USER_STATISTICS:

      • KEY_READ_HITS and KEY_READ_MISSES

      Attachments

        Issue Links

          Activity

            Commit message:

            MDEV-33151 Add more columns to TABLE_STATISTICS and USER STATS

            Columns added to TABLE_STATISTICS

            • ROWS_INSERTED, ROWS_DELETED, ROWS_UPDATED, KEY_READ_HITS and
              KEY_READ_MISSES.

            Columns added to CLIENT_STATISTICS and USER_STATISTICS:

            • KEY_READ_HITS and KEY_READ_MISSES.

            User visible changes (except new columns):

            • CLIENT_STATISTICS and USER_STATISTICS has columns KEY_READ_HITS and
              KEY_READ_MISSES added after column ROWS_UPDATED before SELECT_COMMANDS.

            Other changes:

            • All row variables that are part of user_stats moved to
              'struct rows_stats' to make it easy to clear all of them at once.
            • ha_read_key_misses added to STATUS_VAR

            Notes:

            • userstat.result has a change of handler_read_key. This is because
              use-stat-tables is now disabled for the test.
            monty Michael Widenius added a comment - Commit message: MDEV-33151 Add more columns to TABLE_STATISTICS and USER STATS Columns added to TABLE_STATISTICS ROWS_INSERTED, ROWS_DELETED, ROWS_UPDATED, KEY_READ_HITS and KEY_READ_MISSES. Columns added to CLIENT_STATISTICS and USER_STATISTICS: KEY_READ_HITS and KEY_READ_MISSES. User visible changes (except new columns): CLIENT_STATISTICS and USER_STATISTICS has columns KEY_READ_HITS and KEY_READ_MISSES added after column ROWS_UPDATED before SELECT_COMMANDS. Other changes: All row variables that are part of user_stats moved to 'struct rows_stats' to make it easy to clear all of them at once. ha_read_key_misses added to STATUS_VAR Notes: userstat.result has a change of handler_read_key. This is because use-stat-tables is now disabled for the test.

            Code can be found in bb-11.5-monty

            monty Michael Widenius added a comment - Code can be found in bb-11.5-monty

            to test this feature separately please use bb-11.5-MDEV-33151-userstat

            serg Sergei Golubchik added a comment - to test this feature separately please use bb-11.5- MDEV-33151 -userstat

            Current status:

            MDEV-33913 is a simple bug which needs to be fixed, a link to the proposed fix is in a comment.

            In regard to KEY_READ_HITS and KEY_READ_MISSES, I find the functionality confusing. First, the counters are located strangely. I would expect that key_read_hits and key_read_misses in their implemented semantics (number of rows the key has found / hasn't found) to make sense for per-index statistics, while for client/user/table it makes more sense to count the number of queries which used or didn't use the index. I'm not sure how much use is in counting total number of key hits/misses per client, without detalisation per index.

            Secondly, MDEV-33916 was filed for KEY_READ_MISSES at least in CLIENT/USER statistics not being meaningful .

            Further, I find the KEY_READ_HITS results strange even for simple queries, like

            MariaDB [test]> create or replace table t (a int primary key);
            Query OK, 0 rows affected (0.901 sec)
             
            MariaDB [test]> insert into t values (1),(2),(3),(4),(5);
            Query OK, 5 rows affected (0.014 sec)
            Records: 5  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> flush table_statistics;
            Query OK, 0 rows affected (0.001 sec)
             
            MariaDB [test]> select a from t where a > 3;
            +---+
            | a |
            +---+
            | 4 |
            | 5 |
            +---+
            2 rows in set (0.001 sec)
             
            MariaDB [test]> show table_statistics \G
            *************************** 1. row ***************************
                       Table_schema: test
                         Table_name: t
                          Rows_read: 2
                       Rows_changed: 0
            Rows_changed_x_#indexes: 0
                      Rows_inserted: 0
                       Rows_updated: 0
                       Rows_deleted: 0
                      Key_read_hits: 1
                    Key_read_misses: 0
            1 row in set (0.001 sec)
             
            MariaDB [test]> select a from t where a in (4,5,6,7);
            +---+
            | a |
            +---+
            | 4 |
            | 5 |
            +---+
            2 rows in set (0.001 sec)
             
            MariaDB [test]> show table_statistics \G
            *************************** 1. row ***************************
                       Table_schema: test
                         Table_name: t
                          Rows_read: 4
                       Rows_changed: 0
            Rows_changed_x_#indexes: 0
                      Rows_inserted: 0
                       Rows_updated: 0
                       Rows_deleted: 0
                      Key_read_hits: 3
                    Key_read_misses: 2
            1 row in set (0.001 sec)
            

            That is, for the first query the hits are "1", for the second query "2". It can probably be justified from the internal implementation perspective, but would be hard to explain to users.

            Given the above, I think it would be useful to have the feature reviewed by an optimizer expert (it doesn't look like it was reviewed at all). psergei agreed to do this, so I'm assigning it to him.

            elenst Elena Stepanova added a comment - Current status: MDEV-33913 is a simple bug which needs to be fixed, a link to the proposed fix is in a comment. In regard to KEY_READ_HITS and KEY_READ_MISSES , I find the functionality confusing. First, the counters are located strangely. I would expect that key_read_hits and key_read_misses in their implemented semantics (number of rows the key has found / hasn't found) to make sense for per-index statistics, while for client/user/table it makes more sense to count the number of queries which used or didn't use the index. I'm not sure how much use is in counting total number of key hits/misses per client, without detalisation per index. Secondly, MDEV-33916 was filed for KEY_READ_MISSES at least in CLIENT/USER statistics not being meaningful . Further, I find the KEY_READ_HITS results strange even for simple queries, like MariaDB [test]> create or replace table t (a int primary key ); Query OK, 0 rows affected (0.901 sec)   MariaDB [test]> insert into t values (1),(2),(3),(4),(5); Query OK, 5 rows affected (0.014 sec) Records: 5 Duplicates: 0 Warnings: 0   MariaDB [test]> flush table_statistics; Query OK, 0 rows affected (0.001 sec)   MariaDB [test]> select a from t where a > 3; + ---+ | a | + ---+ | 4 | | 5 | + ---+ 2 rows in set (0.001 sec)   MariaDB [test]> show table_statistics \G *************************** 1. row *************************** Table_schema: test Table_name: t Rows_read: 2 Rows_changed: 0 Rows_changed_x_#indexes: 0 Rows_inserted: 0 Rows_updated: 0 Rows_deleted: 0 Key_read_hits: 1 Key_read_misses: 0 1 row in set (0.001 sec)   MariaDB [test]> select a from t where a in (4,5,6,7); + ---+ | a | + ---+ | 4 | | 5 | + ---+ 2 rows in set (0.001 sec)   MariaDB [test]> show table_statistics \G *************************** 1. row *************************** Table_schema: test Table_name: t Rows_read: 4 Rows_changed: 0 Rows_changed_x_#indexes: 0 Rows_inserted: 0 Rows_updated: 0 Rows_deleted: 0 Key_read_hits: 3 Key_read_misses: 2 1 row in set (0.001 sec) That is, for the first query the hits are "1", for the second query "2". It can probably be justified from the internal implementation perspective, but would be hard to explain to users. Given the above, I think it would be useful to have the feature reviewed by an optimizer expert (it doesn't look like it was reviewed at all). psergei agreed to do this, so I'm assigning it to him.
            psergei Sergei Petrunia added a comment - - edited

            That is, for the first query the hits are "1", for the second query "2". It can probably be justified from the internal implementation perspective, but would be hard to explain to users.

            The query

            MariaDB [test]> select a from t where a > 3;

            has made one "hit" where it was looking for row with a right after "3".
            The row that it has read after this was not counted.

            MariaDB [test]> select a from t where a in (4,5,6,7);
            

            Here we found rows with a=4 and a=5 (two hits) and haven't found anything with a=6 and a=7 (two misses)...

            I'm still thinking about how one interprets that...
            KEY_READ_HITS and KEY_READ_MISSES are the lookups that were made into indexes.
            Index reads normally are:
            1. a lookup which can either find row or not. Lookup is relatively expensive.
            2. possibly, reads of one or more subsequent/previous index records. These should be relatively cheaper.

            This patch counts #1, it counts both the case when the row is found and when it is not. It looks somewhat incomplete that #2 are not counted...
            and I agree that one would expect to see these statistics collected on per-index basis, too.

            Another question is: should lookups to find the first / last index records be counted as KEY_READ_HITS... They are also index dives so it would be logical to count them...

            psergei Sergei Petrunia added a comment - - edited That is, for the first query the hits are "1", for the second query "2". It can probably be justified from the internal implementation perspective, but would be hard to explain to users. The query MariaDB [test]> select a from t where a > 3; has made one "hit" where it was looking for row with a right after "3". The row that it has read after this was not counted. MariaDB [test]> select a from t where a in (4,5,6,7); Here we found rows with a=4 and a=5 (two hits) and haven't found anything with a=6 and a=7 (two misses)... I'm still thinking about how one interprets that... KEY_READ_HITS and KEY_READ_MISSES are the lookups that were made into indexes. Index reads normally are: 1. a lookup which can either find row or not. Lookup is relatively expensive. 2. possibly, reads of one or more subsequent/previous index records. These should be relatively cheaper. This patch counts #1, it counts both the case when the row is found and when it is not. It looks somewhat incomplete that #2 are not counted... and I agree that one would expect to see these statistics collected on per-index basis, too. Another question is: should lookups to find the first / last index records be counted as KEY_READ_HITS ... They are also index dives so it would be logical to count them...

            Hitting Review Done as the patch does what it says it will do...

            I have the following comments (also elaborated above), but it's a matter of opinion whether to consider them to be within the scope of this MDEV or outside:

            • Should we have a counter like KEY_READ_NEXT to count rows that we read from index in next[_same] and prev[_same] calls?
            • Should we count index_first()/index_last() as lookups also?
            • Should we have KEY_READ_HITS/MISSES in INDEX statistics?
            psergei Sergei Petrunia added a comment - Hitting Review Done as the patch does what it says it will do... I have the following comments (also elaborated above), but it's a matter of opinion whether to consider them to be within the scope of this MDEV or outside: Should we have a counter like KEY_READ_NEXT to count rows that we read from index in next [_same] and prev [_same] calls? Should we count index_first()/index_last() as lookups also? Should we have KEY_READ_HITS/MISSES in INDEX statistics?

            People

              serg Sergei Golubchik
              monty Michael Widenius
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.