MariaDB Server / MDEV-7144

Warnings "bytes lost" during server shutdown after running connect.part_file test in buildbot

Details

    Description

      Buildbot example: http://buildbot.askmonty.org/buildbot/builders/kvm-dgcov-jaunty-i386/builds/4449/steps/test_2/logs/stdio

      I can reproduce the problem by running the test with valgrind on a debug build. The valgrind report itself comes out OK, but the warnings are generated on shutdown. It happens almost every time I run the test:

      perl ./mtr connect.part_file  --valgrind-mysqld
      Logging: ./mtr  connect.part_file --valgrind-mysqld
      ...
       
      worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
      connect.part_file                        [ pass ]   4482
      ***Warnings generated in error logs during shutdown after running tests: connect.part_file
       
      Warning: 65536 bytes lost at 0xfaea070, allocated by T@0 at 
      valgrind_report                          [ pass ]       
      --------------------------------------------------------------------------
      The servers were restarted 0 times
      Spent 4.482 of 31 seconds executing testcases
       
      Warnings in log: All 2 tests were successful.
       
      Errors/warnings were found in logfiles during server shutdown after running the
      following sequence(s) of tests:
          connect.part_file
      mysql-test-run: *** ERROR: There where errors/warnings in server logs after running test cases.


          Activity

            Elena,

            "If you mean that this test should be run with the mentioned change in TDBCSV::PrepareWriting"

            I didn't mean that. Running the small test without the change is just to check whether the warning is raised because of that particular UPDATE command. What you could do is split the original test into parts to check which command(s) cause the warning.

            It would also be worth checking with a partition table made with another engine.

            Thanks and regards,
            Olivier

            bertrandop Olivier Bertrand added a comment

            Hi Olivier,

            No, I am not getting the warnings on your test case. Here is the minimal one that triggers the "bytes lost" warnings for me (they don't look quite the same as the initial one, but I suppose that's only because there are more of them; the first of the three warnings could be the same as the initial, truncated one):

            --source include/have_partition.inc
            CREATE TABLE t1 (
              id INT NOT NULL,
              msg VARCHAR(32)
            ) ENGINE=CONNECT TABLE_TYPE=CSV AVG_ROW_LENGTH=10
            PARTITION BY RANGE(id) (
            PARTITION first VALUES LESS THAN(10),
            PARTITION middle VALUES LESS THAN(50),
            PARTITION last VALUES LESS THAN(MAXVALUE));
            --error ER_GET_ERRMSG
            UPDATE t1 set id = 41 WHERE msg = 'four';

            perl ./mtr part_file2  --valgrind-mysqld --mysqld=--plugin-load-add=ha_connect
             
            ...
            CREATE TABLE t1 (
            id INT NOT NULL,
            msg VARCHAR(32)
            ) ENGINE=CONNECT TABLE_TYPE=CSV AVG_ROW_LENGTH=10
            PARTITION BY RANGE(id) (
            PARTITION first VALUES LESS THAN(10),
            PARTITION middle VALUES LESS THAN(50),
            PARTITION last VALUES LESS THAN(MAXVALUE));
            UPDATE t1 set id = 41 WHERE msg = 'four';
            ERROR HY000: Got error 174 'Cannot update column id because it is used for partitioning' from CONNECT
            ***Warnings generated in error logs during shutdown after running tests: main.part_file2
             
            Warning: 65536 bytes lost at 0xfaa3070, allocated by T@0 at mysys/mf_iocache.c:231, mysys/mf_cache.c:69, sql/sql_update.cc:580, sql/sql_parse.cc:3295, sql/sql_parse.cc:6407, sql/sql_parse.cc:1301, sql/sql_parse.cc:996, sql/sql_connect.cc:1379
            Warning:    8 bytes lost at 0xf850b70, allocated by T@0 at mysys/my_malloc.c:239, mysys/mf_cache.c:65, sql/sql_update.cc:580, sql/sql_parse.cc:3295, sql/sql_parse.cc:6407, sql/sql_parse.cc:1301, sql/sql_parse.cc:996, sql/sql_connect.cc:1379
            Warning:   48 bytes lost at 0xf84ffb0, allocated by T@0 at mysys/my_malloc.c:239, mysys/mf_cache.c:64, sql/sql_update.cc:580, sql/sql_parse.cc:3295, sql/sql_parse.cc:6407, sql/sql_parse.cc:1301, sql/sql_parse.cc:996, sql/sql_connect.cc:1379
             
            mysql-test-run: *** ERROR: Test suite aborted

            I tried the same test with MyISAM, InnoDB, and Archive engines (the latter is a dynamic library, like CONNECT), and got no warnings there.
            Of course, the other difference is that none of them returns an error on the UPDATE; that part is CONNECT-specific, so maybe the warnings are somehow related to it.

            elenst Elena Stepanova added a comment

            Elena,

            Great! Thanks to this information I have a better idea of what is happening.
            It begins in mysql_update (in sql_update.cc):

            int mysql_update:
            ...
                  /*
                    We are doing a search on a key that is updated. In this case
                    we go through the matching rows, save a pointer to them and
                    update these in a separate loop based on the pointer.
                  */
                  IO_CACHE tempfile;
                  if (open_cached_file(&tempfile, mysql_tmpdir, TEMP_PREFIX,
                                       DISK_BUFFER_SIZE, MYF(MY_WME)))
                    goto err;

                  /* If quick select is used, initialize it before retrieving rows. */
                  if (select && select->quick && select->quick->reset())
                  {
                    close_cached_file(&tempfile);
                    goto err;
                  }
                  table->file->try_semi_consistent_read(1);

                  /*
                    When we get here, we have one of the following options:
                    A. query_plan.index == MAX_KEY
                       This means we should use full table scan, and start it with
                       init_read_record call
                    B. query_plan.index != MAX_KEY
                       B.1 quick select is used, start the scan with init_read_record
                       B.2 quick select is not used, this is full index scan (with LIMIT)
                           Full index scan must be started with init_read_record_idx
                  */

                  if (query_plan.index == MAX_KEY || (select && select->quick))
                  {
                    if (init_read_record(&info, thd, table, select, 0, 1, FALSE))
                      goto err;
                  }
                  else
                    init_read_record_idx(&info, thd, table, 1, query_plan.index, reverse);

            It calls open_cached_file, which in turn calls init_io_cache

            int init_io_cache:
            ...
                  if ((info->buffer= (uchar*) my_malloc(buffer_block, flags)) != 0)
            ...

            This is where the buffer of 65536 bytes is allocated. When the update is
            done, for instance on a MyISAM table, close_cached_file is called

            void close_cached_file(IO_CACHE *cache)
            {
              DBUG_ENTER("close_cached_file");
              if (my_b_inited(cache))
              {
                File file=cache->file;
                cache->file= -1;        /* Don't flush data */
                (void) end_io_cache(cache);
                if (file >= 0)
                {
                  (void) my_close(file,MYF(0));
            #ifdef CANT_DELETE_OPEN_FILES
                  if (cache->file_name)
                  {
                    (void) my_delete(cache->file_name, MYF(MY_WME | ME_NOINPUT));
                    my_free(cache->file_name);
                  }
            #endif
                }
                my_free(cache->dir);
                my_free(cache->prefix);
              }
              DBUG_VOID_RETURN;
            }

            and in end_io_cache

            int end_io_cache(IO_CACHE *info)
            ...
                my_free(info->buffer);
            ...

            the allocated memory is freed.
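
            To make the ownership pattern concrete, here is a minimal, self-contained
            C sketch of the allocate/free pairing described above. The toy_cache type
            and toy_* functions are invented stand-ins for illustration only, not the
            real mysys API:

              #include <stdlib.h>

              /* Toy stand-ins for IO_CACHE / init_io_cache / end_io_cache */
              typedef struct { unsigned char *buffer; } toy_cache;

              static int toy_init(toy_cache *c, size_t size)
              {
                c->buffer= malloc(size);      /* like init_io_cache's my_malloc */
                return c->buffer == NULL;
              }

              static void toy_end(toy_cache *c)
              {
                free(c->buffer);              /* like end_io_cache's my_free */
                c->buffer= NULL;
              }

              int main(void)
              {
                toy_cache tempfile;
                if (toy_init(&tempfile, 65536))
                  return 1;
                /* ... use the cache ... */
                toy_end(&tempfile);           /* every init paired with an end */
                return 0;
              }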

            However, in the case of a CONNECT table, the init_read_record call goes
            through several functions until it reaches ha_connect::rnd_init

            int ha_connect::rnd_init(bool scan)
            ...
              if (OpenTable(g, xmod == MODE_DELETE))
                DBUG_RETURN(HA_ERR_INITIALIZATION);

            and in ha_connect::OpenTable

            int ha_connect::OpenTable(PGLOBAL g, bool del)
            ...
                  if (part_id && bitmap_is_set(part_id, fp->field_index)) {
                    // Trying to update a column used for partitioning
                    // This cannot be currently done because it may require
                    // a row to be moved in another partition.
                    sprintf(g->Message,
                            "Cannot update column %s because it is used for partitioning",
                            p);
                    return HA_ERR_INTERNAL_ERROR;
                  } // endif part_id

            This causes the following statement in mysql_update:

                    if (init_read_record(&info, thd, table, select, 0, 1, FALSE))
                      goto err;
            ...
            err:
              delete select;
              free_underlaid_joins(thd, select_lex);
              table->disable_keyread();
              thd->abort_on_warning= 0;
              DBUG_RETURN(1);

            to take the goto err branch. In delete select, close_cached_file is indeed
            called, but the cache passed to it is not the one that was allocated
            previously, so the function does nothing (my_b_inited(cache) is false).
            This explains the warnings: the memory is indeed lost.
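
            The leak pattern can be illustrated with another small stand-alone C
            sketch (again with invented toy_* stand-ins, not the real server code):
            the error path runs cleanup on a different, never-initialized cache, so
            the guard that mimics my_b_inited() is false and the first buffer is
            never freed:

              #include <stdlib.h>
              #include <string.h>

              typedef struct { unsigned char *buffer; } toy_cache;

              /* Mimics close_cached_file's guard: no-op unless the cache was inited */
              static void toy_close(toy_cache *c)
              {
                if (c->buffer != NULL)        /* the my_b_inited() check */
                {
                  free(c->buffer);
                  c->buffer= NULL;
                }
              }

              int main(void)
              {
                toy_cache tempfile;           /* what mysql_update allocates  */
                toy_cache select_cache;       /* what 'delete select' closes  */
                memset(&select_cache, 0, sizeof(select_cache));

                if ((tempfile.buffer= malloc(65536)) == NULL)
                  return 1;

                /* Error path: only the never-inited cache is closed, so toy_close
                   does nothing and tempfile.buffer (65536 bytes) is leaked -
                   exactly what valgrind reports on shutdown. */
                toy_close(&select_cache);
                return 0;
              }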

            Is this a CONNECT bug or a MariaDB bug? I can't tell, but it could happen
            with any engine that returns an error at this point.

            Perhaps adding the HA_NOT_DELETE_WITH_CACHE flag to CONNECT would remove
            the problem, but that would be masking it rather than really fixing it.

            bertrandop Olivier Bertrand added a comment

            "Is this a CONNECT bug or a MariaDB bug?"

            It's probably best to ask serg about it.

            elenst Elena Stepanova added a comment
            serg Sergei Golubchik added a comment - edited

            Looks like a MariaDB bug to me. Another goto err a few lines above is correctly accompanied by close_cached_file(&tempfile).

            I'll fix it. Thanks for pointing out exactly what line to fix!
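
            For reference, a sketch of the kind of one-line fix this implies, modeled
            on the error path a few lines above that already pairs its goto err with
            close_cached_file(&tempfile) (my reading of the comment, not necessarily
            the exact committed patch):

              if (query_plan.index == MAX_KEY || (select && select->quick))
              {
                if (init_read_record(&info, thd, table, select, 0, 1, FALSE))
                {
                  close_cached_file(&tempfile);  /* free the IO_CACHE buffer */
                  goto err;                      /* before the error exit    */
                }
              }
              else
                init_read_record_idx(&info, thd, table, 1, query_plan.index, reverse);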


            People

              serg Sergei Golubchik
              elenst Elena Stepanova
              Votes: 0
              Watchers: 3

