MariaDB Server: MDEV-31646

Online alter applies binlog cache limit to cache writes


    Description

      Note: To get the error shown below (ER_TRANS_CACHE_FULL), I had to modify log.cc:

      --- a/sql/log.cc
      +++ b/sql/log.cc
      @@ -7984,7 +7984,7 @@ int Event_log::write_cache(THD *thd, IO_CACHE *cache)
         } while ((length= my_b_fill(cache)));
       
         DBUG_ASSERT(carry == 0);
      -  DBUG_ASSERT(!writer.checksum_len || writer.remains == 0);
      +//  DBUG_ASSERT(!writer.checksum_len || writer.remains == 0);
       
         DBUG_RETURN(0);                               // All OK
       }
      

      This over-zealous assertion prevents any testing that causes binlog write errors, even errors which are supposed to be handled properly (this has been discussed with the replication team); and the test cannot be run on a non-debug build anyway, because it uses the debug sync facility.
      The test can also be run with unmodified code; it will just fail with the assertion failure upon the UPDATE instead of ER_TRANS_CACHE_FULL, and the point remains the same. Note that the assertion failure as such is not the topic of this report; it is only a side effect of applying the limit.


      Binary logging has two hard limits for the caches: max_binlog_cache_size applies to (multi-statement) transactions, and max_binlog_stmt_cache_size applies to individual statements.
      Online alter applies max_binlog_cache_size to both. It would be easier to understand if the limits were not applied at all; it is going to be confusing enough that binlog variables take effect when binary logging isn't even enabled, and it will be more confusing still if they also have a different meaning there.
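
      For reference, this is how the two limits are supposed to behave under regular binary logging. The sketch below is not from the report; it assumes binlog_format=ROW, a server with binary logging enabled, and two hypothetical pre-populated tables (one MyISAM, one InnoDB) large enough to overflow a 4096-byte cache:

      Sketch: the two cache limits under regular binary logging

      # Non-transactional changes go to the statement cache,
      # so only max_binlog_stmt_cache_size should matter here.
      set global max_binlog_stmt_cache_size= 4096;
      --error ER_STMT_CACHE_FULL
      update big_myisam_table set a = repeat('x', 16);
       
      # Transactional changes go to the transaction cache,
      # so only max_binlog_cache_size should matter here.
      set global max_binlog_cache_size= 4096;
      start transaction;
      --error ER_TRANS_CACHE_FULL
      update big_innodb_table set a = repeat('x', 16);
      rollback;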

      Online alter test case

      --source include/have_debug_sync.inc
      --source include/have_sequence.inc
       
      create table t (pk int primary key, a varchar(16)) engine=MyISAM;
      insert into t select seq, NULL from seq_1_to_20000;
       
      set @cache.size= @@max_binlog_cache_size;
      set global max_binlog_cache_size= 4096;
       
      --connect (con2,localhost,root,,)
      --send
        set debug_sync= 'now wait_for do_updates';
       
      --connect (con1,localhost,root,,)
      set debug_sync= 'alter_table_online_progress signal do_updates wait_for go';
      --send
        alter table t add b int, algorithm=copy, lock=none;
       
      --connection con2
      --reap
      --error ER_TRANS_CACHE_FULL
      update t set a = 'update-2' where pk > 1 order by pk limit 10000;
      show warnings;
       
      set debug_sync= 'now signal go';
       
      --connection con1
      --reap
      show warnings;
       
      # Cleanup
      --disconnect con1
      --disconnect con2
      --connection default
      drop table t;
      set debug_sync= reset;
      set global max_binlog_cache_size= @cache.size;
      

      bb-11.2-oalter 6e6e950c8abf40041b146b538c4825856c89e64b

      update t set a = 'update-2' where pk > 1 order by pk limit 10000;
      ERROR HY000: Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mariadbd variable and try again
      show warnings;
      Level	Code	Message
      Error	1197	Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mariadbd variable and try again
      Error	1534	Writing one row to the row-based binary log failed
      Warning	1196	Some non-transactional changed tables couldn't be rolled back
      

      If we instead lower max_binlog_stmt_cache_size in the test case (sketched below), the test does not fail.
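
      The variant amounts to the following substitution (a sketch, not a full test; the elided parts are the same connections and sync points as in the test case above):

      Variant: lowering the statement cache limit instead

      set @cache.size= @@max_binlog_stmt_cache_size;
      set global max_binlog_stmt_cache_size= 4096;
      # ... same connections and sync points as above ...
      # The UPDATE no longer hits a cache limit and completes:
      update t set a = 'update-2' where pk > 1 order by pk limit 10000;
      # ...
      set global max_binlog_stmt_cache_size= @cache.size;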

      Here is the matching test case for the traditional binary log:

      Binlog test case (for comparison)

      --source include/have_binlog_format_row.inc
      --source include/have_sequence.inc
       
      create table t (pk int primary key, a varchar(16)) engine=MyISAM;
      insert into t select seq, NULL from seq_1_to_20000;
       
      set @cache.size= @@max_binlog_cache_size;
      set global max_binlog_cache_size= 4096;
       
      --connect (con2,localhost,root,,)
      update t set a = 'update-2' where pk > 1 order by pk limit 10000;
      show warnings;
       
      # Cleanup
      --disconnect con2
      --connection default
      drop table t;
      set global max_binlog_cache_size= @cache.size;
      

      Here the UPDATE is not affected by the limit change and passes.
      And if we lower max_binlog_stmt_cache_size instead, it fails (see the sketch below).
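
      Variant: lowering the statement cache limit in the binlog test (sketch, not a full test)

      # With row-based binlogging of a MyISAM (non-transactional) table,
      # the statement cache limit is the one that applies, so the
      # expected error here is ER_STMT_CACHE_FULL rather than
      # ER_TRANS_CACHE_FULL:
      set @cache.size= @@max_binlog_stmt_cache_size;
      set global max_binlog_stmt_cache_size= 4096;
      --error ER_STMT_CACHE_FULL
      update t set a = 'update-2' where pk > 1 order by pk limit 10000;
      set global max_binlog_stmt_cache_size= @cache.size;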

People

    Assignee: Nikita Malyavin (nikitamalyavin)
    Reporter: Elena Stepanova (elenst)