MariaDB Server
MDEV-17297

stats.records=0 for a table of Archive engine when it has rows, when we run ANALYZE command

Details

    Description

An mtr test:

      --source include/have_archive.inc
      CREATE TABLE gis_point  (fid INTEGER PRIMARY KEY AUTO_INCREMENT, g POINT)engine=archive;
      INSERT INTO gis_point VALUES 
      (101, PointFromText('POINT(10 10)')),
      (102, PointFromText('POINT(20 10)')),
      (103, PointFromText('POINT(20 20)')),
      (104, PointFromWKB(AsWKB(PointFromText('POINT(10 20)'))));
      set @@optimizer_use_condition_selectivity=4;
      set @@use_stat_tables=PREFERABLY;
      ANALYZE TABLE gis_point;
      explain select * from gis_point;
      

      +----+-------------+----------+--------+----------------+------+---------+-------+-------+----------+
      | id | select_type | table    | type   | possible_keys  | key  | key_len | ref   | rows  | Extra    |
      +----+-------------+----------+--------+----------------+------+---------+-------+-------+----------+
      |  1 | SIMPLE      | gis_point| ALL    | NULL           | NULL | NULL    | NULL  | 0     |          |
      +----+-------------+----------+--------+----------------+------+---------+-------+-------+----------+
      

      select * from mysql.table_stats;
      db_name	table_name	cardinality
      test	gis_point	0
      

      After debugging I see that when we enter the function collect_statistics_for_table, the Archive engine returns HA_ERR_END_OF_FILE as soon as we start reading the rows, because stats.records = 0.

        Activity

            ha_archive::rnd_init() assumes that ha_archive::info() was called before it:

              /* We rewind the file so that we can read from the beginning if scan */
              if (scan)
              {
                scan_rows= stats.records;
            

            and later in ha_archive::rnd_next()

              if (!scan_rows)
              {
                rc= HA_ERR_END_OF_FILE;
                goto end;
              }
              scan_rows--;
            

            This assumption is false. If Archive needs to know the number of rows before rnd_init, it should calculate this number in external_lock, not rely on info being invoked by the caller.

            serg Sergei Golubchik added a comment

            Following the f2f discussion in the big room:

            So when one tries the suggested change:

             diff --git a/storage/archive/ha_archive.cc b/storage/archive/ha_archive.cc
            index 487c0038239..a2931ea840f 100644
            --- a/storage/archive/ha_archive.cc
            +++ b/storage/archive/ha_archive.cc
            @@ -1171,6 +1171,7 @@ int ha_archive::rnd_init(bool scan)
               /* We rewind the file so that we can read from the beginning if scan */
               if (scan)
               {
            +    stats.records= share->rows_recorded;
                 scan_rows= stats.records;
                 DBUG_PRINT("info", ("archive will retrieve %llu rows", 
                                     (unsigned long long) scan_rows));
            

            EITS still sees 0 rows.

            psergei Sergei Petrunia added a comment

            The reason it gets 0 rows is here:

              #0  ha_archive::unpack_row (this=0x7fff7806f558, file_to_read=0x7fff7806f9f8, record=0x7fff78060c18 "\377") at /home/psergey/dev-git/10.3-clean/storage/archive/ha_archive.cc:1249
              #1  0x00007ffff085f749 in ha_archive::get_row_version3 (this=0x7fff7806f558, file_to_read=0x7fff7806f9f8, buf=0x7fff78060c18 "\377") at /home/psergey/dev-git/10.3-clean/storage/archive/ha_archive.cc:1293
              #2  0x00007ffff085f156 in ha_archive::get_row (this=0x7fff7806f558, file_to_read=0x7fff7806f9f8, buf=0x7fff78060c18 "\377") at /home/psergey/dev-git/10.3-clean/storage/archive/ha_archive.cc:1199
              #3  0x00007ffff085fc51 in ha_archive::rnd_next (this=0x7fff7806f558, buf=0x7fff78060c18 "\377") at /home/psergey/dev-git/10.3-clean/storage/archive/ha_archive.cc:1399
              #4  0x0000555555efb34b in handler::ha_rnd_next (this=0x7fff7806f558, buf=0x7fff78060c18 "\377") at /home/psergey/dev-git/10.3-clean/sql/handler.cc:2765
              #5  0x0000555555cc0553 in collect_statistics_for_table (thd=0x7fff78000d60, table=0x7fff7806e910) at /home/psergey/dev-git/10.3-clean/sql/sql_statistics.cc:2751
              #6  0x0000555555d78757 in mysql_admin_table (thd=0x7fff78000d60, tables=0x7fff78015ec0, check_opt=0x7fff78005b88, operator_name=0x5555567850b7 "analyze", lock_type=TL_READ_NO_INSERT, org_open_for_modify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x555555effc4e <handler::ha_analyze(THD*, st_ha_check_opt*)>, view_operator_func=0x0) at /home/psergey/dev-git/10.3-clean/sql/sql_admin.cc:882
              #7  0x0000555555d7a18e in Sql_cmd_analyze_table::execute (this=0x7fff78016510, thd=0x7fff78000d60) at /home/psergey/dev-git/10.3-clean/sql/sql_admin.cc:1313
              #8  0x0000555555c102da in mysql_execute_command (thd=0x7fff78000d60) at /home/psergey/dev-git/10.3-clean/sql/sql_parse.cc:6285
              #9  0x0000555555c1563b in mysql_parse (thd=0x7fff78000d60, rawbuf=0x7fff78015dd8 "ANALYZE TABLE gis_point", length=23, parser_state=0x7fffd99355d0, is_com_multi=false, is_next_command=false) at /home/psergey/dev-git/10.3-clean/sql/sql_parse.cc:8091
            

              /* If we read nothing we are at the end of the file */
              if (read == 0 || read != ARCHIVE_ROW_HEADER_SIZE)
                DBUG_RETURN(HA_ERR_END_OF_FILE);
            

            psergei Sergei Petrunia added a comment

            This is a step further than where it was failing previously.

            (It was failing in ha_archive::rnd_next, here:

              if (!scan_rows)
              {
                rc= HA_ERR_END_OF_FILE;
                goto end;
              }
            

            Now we get a bit further, to the rc= get_row(&archive, buf); call.)

            psergei Sergei Petrunia added a comment

            varun, please compare the execution between ANALYZE TABLE and SELECT * FROM t1 and check what else has not been initialized.

            psergei Sergei Petrunia added a comment

            So I am debugging two servers, one running ANALYZE TABLE, the other running SELECT * FROM:

            Common:

            drop table gis_point;
            CREATE TABLE gis_point  (fid INTEGER PRIMARY KEY AUTO_INCREMENT, g POINT)engine=archive;
            INSERT INTO gis_point VALUES 
            (101, PointFromText('POINT(10 10)')),
            (102, PointFromText('POINT(20 10)')),
            (103, PointFromText('POINT(20 20)')),
            (104, PointFromWKB(AsWKB(PointFromText('POINT(10 20)'))));
             
            set @@optimizer_use_condition_selectivity=4;
            set @@use_stat_tables=PREFERABLY;
            

            Debuggee1:

            ANALYZE TABLE gis_point;
            

            Debuggee2:

            select * from gis_point;
            

            psergei Sergei Petrunia added a comment

            Both binaries have the above patch with

            +    stats.records= share->rows_recorded;
            

            applied.
            Both are stopped in the first ha_archive::rnd_next call.

            They are reading their archive files at the same position:

            psergey@pslp4:~$ cat /proc/27957/fdinfo/28
            pos:    1047
            flags:  02100000
            mnt_id: 90
            psergey@pslp4:~$ cat /proc/27924/fdinfo/52
            pos:    1047
            flags:  02100000
            mnt_id: 90
            

            However, this call in azread

                  s->stream.avail_in = (uInt)mysql_file_read(s->file, (uchar *)s->inbuf,
                                                             AZ_BUFSIZE_READ, MYF(0));
            

            produces

            • 10 bytes in the ANALYZE TABLE debuggee
            • 56 bytes in the SELECT-* debuggee.

            The files are indeed of different sizes:

            psergey@pslp4:~$ ls -lah /proc/27957/fd/28
            lr-x------ 1 psergey psergey 64 ноя  7 10:45 /proc/27957/fd/28 -> /home/psergey/dev-git/10.3/mysql-test/var/install.db/j1/gis_point.ARZ
            psergey@pslp4:~$ ls -la /proc/27924/fd/52
            lr-x------ 1 psergey psergey 64 ноя  7 10:45 /proc/27924/fd/52 -> /home/psergey/dev-git/10.3-cp/mysql-test/var/install.db/j1/gis_point.ARZ
            

            psergey@pslp4:~$ ls -la /home/psergey/dev-git/10.3/mysql-test/var/install.db/j1/gis_point.ARZ
            -rw-rw---- 1 psergey psergey 1057 ноя  7 10:38 /home/psergey/dev-git/10.3/mysql-test/var/install.db/j1/gis_point.ARZ
             
            psergey@pslp4:~$ ls -la /home/psergey/dev-git/10.3-cp/mysql-test/var/install.db/j1/gis_point.ARZ
            -rw-rw---- 1 psergey psergey 1103 ноя  7 10:38 /home/psergey/dev-git/10.3-cp/mysql-test/var/install.db/j1/gis_point.ARZ
            

            In both cases the file has been read to the end:

            1057 = 1047 + 10
            1103 = 1047 + 56
            

            psergei Sergei Petrunia added a comment

            In both debuggees, the INSERT INTO ... statement causes 4 calls to azwrite(). They don't seem to write anything to the file, though; they just accumulate data in the internal buffer.

            The SELECT-* debuggee has this call to azflush:

            (gdb) wher
              #0  azflush (s=0x7fff78041458, flush=2) at /home/psergey/dev-git/10.3-cp/storage/archive/azio.c:685
              #1  0x00007ffff086096d in ha_archive::info (this=0x7fff78033c28, flag=18) at /home/psergey/dev-git/10.3-cp/storage/archive/ha_archive.cc:1688
              #2  0x0000555555d279b6 in TABLE_LIST::fetch_number_of_rows (this=0x7fff78016070) at /home/psergey/dev-git/10.3-cp/sql/table.cc:8354
              #3  0x0000555555c553d4 in make_join_statistics (join=0x7fff780167c8, tables_list=..., keyuse_array=0x7fff78016ab8) at /home/psergey/dev-git/10.3-cp/sql/sql_select.cc:4444
              #4  0x0000555555c4c884 in JOIN::optimize_inner (this=0x7fff780167c8) at /home/psergey/dev-git/10.3-cp/sql/sql_select.cc:1888
              #5  0x0000555555c4af23 in JOIN::optimize (this=0x7fff780167c8) at /home/psergey/dev-git/10.3-cp/sql/sql_select.cc:1451
              #6  0x0000555555c54a3a in mysql_select (thd=0x7fff78000d60, tables=0x7fff78016070, wild_num=1, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fff780167a0, unit=0x7fff78004c10, select_lex=0x7fff78005380) at /home/psergey/dev-git/10.3-cp/sql/sql_select.cc:4213
              #7  0x0000555555c46a69 in handle_select (thd=0x7fff78000d60, lex=0x7fff78004b48, result=0x7fff780167a0, setup_tables_done_option=0) at /home/psergey/dev-git/10.3-cp/sql/sql_select.cc:373
              #8  0x0000555555c116fa in execute_sqlcom_select (thd=0x7fff78000d60, all_tables=0x7fff78016070) at /home/psergey/dev-git/10.3-cp/sql/sql_parse.cc:6547
              #9  0x0000555555c07a6f in mysql_execute_command (thd=0x7fff78000d60) at /home/psergey/dev-git/10.3-cp/sql/sql_parse.cc:3769
              #10 0x0000555555c1567b in mysql_parse (thd=0x7fff78000d60, rawbuf=0x7fff78015e68 "select * from gis_point", length=23, parser_state=0x7fffd99355d0, is_com_multi=false, is_next_command=false) at /home/psergey/dev-git/10.3-cp/sql/sql_parse.cc:8091
            

            while the ANALYZE TABLE debuggee does not have it.

            psergei Sergei Petrunia added a comment
            psergei Sergei Petrunia added a comment - http://lists.askmonty.org/pipermail/commits/2018-November/013064.html. serg, please review.

            People

              psergei Sergei Petrunia
              varun Varun Gupta (Inactive)
              Votes: 0
              Watchers: 3

