Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26126

Replication delay increasing infinitely from grandpa master when enable parallel replication and use temporary table?

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.4.18
    • Fix Version/s: 10.4
    • Component/s: Replication
    • Labels:
    • Environment:
      Distributor ID: CentOS
      Description: CentOS Linux release 7.8.2003 (Core)
      Release: 7.8.2003
      Codename: Core
      Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

      Description

      Dear community:

      we have a replcation topological structure like below

      t1-node-master-dbmyd01
          |    |     A
          |    V     |
          |   t1-node-master-dbmyd02(direct slave, enable parallel replication)
          |
          V
         t1-node-master-dbmyd03(direct slave, enable parallel replication)
                  |
                  V
                  t1-node-standby-dbmym01(indrect slave, enable parallel replication, issue happened about replication issue)
      

      When Im testing the below case on t1-node-master-dbmyd01:

      use sbtest;
      DROP TABLE IF EXISTS real_table;
      CREATE TABLE real_table (
        id int primary key,
        str varchar(10)
      );
       
      DROP TEMPORARY  TABLE IF EXISTS tmp_table;
      CREATE TEMPORARY TABLE tmp_table (
        id int primary key,
        str varchar(10)
      );
       
      INSERT INTO real_table VALUES (1, 'str1');
      INSERT INTO real_table VALUES (2, 'str2');
      INSERT INTO tmp_table SELECT id, str FROM real_table;
      UPDATE real_table SET str = concat(str, '***') WHERE id IN (SELECT id FROM tmp_table);
      

      It will cause t1-node-standby-dbmym01 replication delay, and can not be disappeared.

      Node t1-node-master-dbmyd02 and t1-node-master-dbmyd03 did not have this issue.

      And when I'm checking the processlist, can find below prompt on t1-node-standby-dbmym01:

       
      (dbausr@localhost) [(none)] 18:39:34> show processlist;
      +--------+-------------+-----------+------+--------------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+----------+
      | Id     | User        | Host      | db   | Command      | Time | State                                                                       | Info                                                                                    | Progress |
      +--------+-------------+-----------+------+--------------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+----------+
      |      1 | system user |           | NULL | Daemon       | NULL | InnoDB purge coordinator                                                    | NULL                                                                                    |    0.000 |
      |      2 | system user |           | NULL | Daemon       | NULL | InnoDB purge worker                                                         | NULL                                                                                    |    0.000 |
      |      4 | system user |           | NULL | Daemon       | NULL | InnoDB purge worker                                                         | NULL                                                                                    |    0.000 |
      |      3 | system user |           | NULL | Daemon       | NULL | InnoDB purge worker                                                         | NULL                                                                                    |    0.000 |
      |      5 | system user |           | NULL | Daemon       | NULL | InnoDB shutdown handler                                                     | NULL                                                                                    |    0.000 |
      | 463111 | dbausr      | localhost | NULL | Query        |    0 | Init                                                                        | show processlist                                                                        |    0.000 |
      | 463112 | system user |           | NULL | Slave_IO     |   42 | Waiting for master to send event                                            | NULL                                                                                    |    0.000 |
      | 463114 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463115 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463116 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463117 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463118 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463120 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463119 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463121 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463122 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463123 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463124 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463125 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463126 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463127 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463128 | system user |           | NULL | Slave_worker |   42 | Update_rows_log_event::find_row(-1)                                         | UPDATE real_table SET str = concat(str, '@@@@@') WHERE id IN (SELECT id FROM tmp_table) |    0.000 |
      | 463129 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463130 | system user |           | NULL | Slave_worker |   42 | Waiting for prior transaction to commit                                     | INSERT INTO real_table VALUES (2, 'str2')                                               |    0.000 |
      | 463131 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463132 | system user |           | NULL | Slave_worker |   42 | Waiting for prior transaction to commit                                     | INSERT INTO real_table VALUES (1, 'str1')                                               |    0.000 |
      | 463133 | system user |           | NULL | Slave_worker |   42 | Waiting for work from SQL thread                                            | NULL                                                                                    |    0.000 |
      | 463113 | system user |           | NULL | Slave_SQL    |   42 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                                                                                    |    0.000 |
      +--------+-------------+-----------+------+--------------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+----------+
      
      

      after several times checking, the processlist promt will like below:

      (dbausr@localhost) [(none)] 18:40:05> show processlist;
      +--------+-------------+-----------+------+--------------+------+-----------------------------------------------------------------------------+-------------------------------------------+----------+
      | Id     | User        | Host      | db   | Command      | Time | State                                                                       | Info                                      | Progress |
      +--------+-------------+-----------+------+--------------+------+-----------------------------------------------------------------------------+-------------------------------------------+----------+
      |      1 | system user |           | NULL | Daemon       | NULL | InnoDB purge coordinator                                                    | NULL                                      |    0.000 |
      |      2 | system user |           | NULL | Daemon       | NULL | InnoDB purge worker                                                         | NULL                                      |    0.000 |
      |      4 | system user |           | NULL | Daemon       | NULL | InnoDB purge worker                                                         | NULL                                      |    0.000 |
      |      3 | system user |           | NULL | Daemon       | NULL | InnoDB purge worker                                                         | NULL                                      |    0.000 |
      |      5 | system user |           | NULL | Daemon       | NULL | InnoDB shutdown handler                                                     | NULL                                      |    0.000 |
      | 463111 | dbausr      | localhost | NULL | Query        |    0 | Init                                                                        | show processlist                          |    0.000 |
      | 463112 | system user |           | NULL | Slave_IO     |   59 | Waiting for master to send event                                            | NULL                                      |    0.000 |
      | 463114 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463115 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463116 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463117 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463118 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463120 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463119 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463121 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463122 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463123 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463124 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463125 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463126 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463127 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463128 | system user |           | NULL | Slave_worker |   59 | Waiting for prior transaction to commit                                     | NULL                                      |    0.000 |
      | 463129 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463130 | system user |           | NULL | Slave_worker |   59 | Waiting for prior transaction to commit                                     | INSERT INTO real_table VALUES (2, 'str2') |    0.000 |
      | 463131 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463132 | system user |           | NULL | Slave_worker |   59 | Waiting for prior transaction to commit                                     | INSERT INTO real_table VALUES (1, 'str1') |    0.000 |
      | 463133 | system user |           | NULL | Slave_worker |   59 | Waiting for work from SQL thread                                            | NULL                                      |    0.000 |
      | 463113 | system user |           | NULL | Slave_SQL    |   59 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                                      |    0.000 |
      +--------+-------------+-----------+------+--------------+------+-----------------------------------------------------------------------------+-------------------------------------------+----------+
      
      

      After I disabled the parallel replication, this phenomenon disappeared, so is there any issue will cause the replication delay when enable parallel replication and use temporary table?

      below is node t1-node-standby-dbmym01's configuration options:

      [mysqld]
      ########## Server Level ##########
      large_pages=ON
      port=3306
      basedir=/opt/maria10.1
      datadir=/opt/maria10.1/data/db
      tmpdir=/opt/maria10.1/data/tmp
      socket=/opt/maria10.1/data/mysql.sock
      pid-file=/opt/maria10.1/data/mysqld.pid
      lc-messages-dir=/opt/maria10.1/share
      lc-messages=en_US
      lower-case-table-names=1
       
      ##### Character #####
      character-set-client-handshake=FALSE
      character-set-server=utf8mb4
      collation-server=utf8mb4_unicode_ci
       
      ##### Security/Mode #####
      skip_name_resolve=1
      bind_address=0.0.0.0
      explicit_defaults_for_timestamp=1
      sql_mode=ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
      skip-ssl
      #ssl
      #ssl_ca=/opt/maria10.1/data/ssl/ca-cert.pem
      #ssl_cert=/opt/maria10.1/data/ssl/server-cert.pem
      #ssl_key=/opt/maria10.1/data/ssl/server-key.pem
       
      ##### Connection #####
      max_connections=3050
      max_connect_errors=65536
      max_user_connections=3050
      max_allowed_packet=128M
      connect_timeout=20
      back_log=65535
      net_read_timeout=30
      net_write_timeout=60
      net_buffer_length=2K # default 16K
      max_prepared_stmt_count=1048576
       
      ##### per-thread Buffer #####
      #read_buffer_size=128K
      #read_rnd_buffer_size=256K
      sort_buffer_size=4M
      join_buffer_size=2M
      binlog_cache_size=4M
      thread_stack=292K
      ## MRR
      optimizer_switch="index_merge_sort_intersection=on,mrr=on,mrr_cost_based=on,mrr_sort_keys=on,optimize_join_buffer_size=on"
      join_cache_level=8
      join_buffer_space_limit=250M
      mrr_buffer_size=4M
       
      ##### Global Buffer #####
      key_buffer_size=10M
      query_cache_type=0
      query_cache_size=0
      #query_cache_size=64M
      #query_cache_limit=8M
      #query_cache_min_res_unit=1M
       
      ##### THREAD POOL #####
      thread_handling=pool-of-threads
      thread_pool_max_threads=3050
      #thread_cache_size=256 # If the thread pool is active, thread_cache_size is ignored.
       
      ##### TABLE CACHE #####
      table_definition_cache=16500
      table_open_cache=2000
      table_open_cache_instances=8
       
      ##### IO #####
      open_files_limit=65000
       
      ##### OTHERS #####
      tmp_table_size=500M
      max_heap_table_size=16M # Memory Engine
       
       
      ########## Server LOG ##########
      ##### general log #####
      general_log=0
      #general_log_file=/opt/maria10.1/log/general.log
       
      ##### error log #####
      log_error=/opt/maria10.1/log/alert_TMRART01.log
      log_warnings=2
       
      ##### binary log #####
      server_id=16110
      log_bin=/opt/maria10.1/binlog/BINLOG
      binlog_format=ROW
      binlog_row_image=FULL
      sync_binlog=1
      log_slave_updates=1
      expire_logs_days=7
      max_binlog_size=128M
      max_binlog_cache_size=2G # equal to wsrep_max_ws_size
       
      ##### GTID #####
      gtid_domain_id=111
      gtid_strict_mode=1
      gtid_ignore_duplicates=ON
       
      ##### slow log #####
      slow_query_log=1
      slow_query_log_file=/opt/maria10.1/log/slow.log
      log_output='FILE'
      long_query_time=0.5
      log_slow_admin_statements=0
      log_slow_rate_limit=1
      log_slow_verbosity='query_plan,explain'
      #log_queries_not_using_indexes=1
       
      ##### audit log #####
      plugin-load=server_audit=server_audit.so
      server_audit=FORCE_PLUS_PERMANENT
      server_audit_logging=1
      server_audit_file_path=/opt/maria10.1/log/server_audit.log
      server_audit_events='CONNECT,QUERY'
      server_audit_file_rotate_size=524288000 # 500M
      server_audit_file_rotations=9
      server_audit_output_type='file'
      #server_audit_excl_users='lepus_monitor'
      server_audit_query_log_limit=10240
       
      ##### sql error log #####
      plugin_load_add=sql_errlog
      sql_error_log_filename='/opt/maria10.1/log/sql_errors.log'
      sql_error_log_rotate=1
      sql_error_log_rotations=9
      sql_error_log_size_limit=524288000 # 500M
       
      ########## INNODB ##########
      default_storage_engine=InnoDB
      ##### Tablespace #####
      innodb_file_per_table=1
       
      ##### Buffer Pool #####
      #innodb_data_file_path=ibdata1:50M;ibdata2:50M:autoextend
      innodb_buffer_pool_size=100G # innodb data cache. 70% of total RAM
      innodb_buffer_pool_instances=8
      innodb_log_buffer_size=200M
      innodb_max_dirty_pages_pct=60
      innodb_buffer_pool_dump_pct=60
       
      ##### Redo #####
      innodb_log_file_size=1280M
      innodb_log_files_in_group=2
       
      ##### Undo #####
      innodb_undo_directory=/opt/maria10.1/data/db
      innodb_undo_tablespaces=4
      innodb_undo_logs=128
      innodb_purge_threads=16
      innodb_purge_batch_size=5000
       
      ##### Locking #####
      innodb_print_all_deadlocks=1
      innodb_lock_wait_timeout=50 # 50s
      innodb_autoinc_lock_mode=2
      innodb_rollback_on_timeout=1
       
      ##### Data Security #####
      innodb_doublewrite=1
      innodb_strict_mode=1
      innodb_flush_log_at_trx_commit=1
      transaction-isolation=READ-COMMITTED
       
      ##### CPU #####
      innodb_sync_array_size=8
       
      ##### IO #####
      innodb_flush_method=O_DIRECT
      innodb_purge_threads=4
      innodb_io_capacity=10000 # if SSD, set 4000
      innodb_flush_neighbors=1 # if SSD, set 0
      innodb_read_io_threads=16
      innodb_write_io_threads=8
      innodb_open_files=60000
      innodb_use_native_aio=1
       
      ##### Turning #####
      #innodb_defragment=1
       
      ########## Performance Schema ##########
      performance_schema=ON
      performance_schema_instrument='stage/%=ON'
      performance_schema_instrument='%mdl%=ON'
      performance_schema_consumer_events_stages_current=ON
      performance_schema_consumer_events_stages_history_long=ON
      performance_schema_consumer_events_statements_current=ON
      performance_schema_consumer_events_statements_history_long=ON
      performance_schema_consumer_events_waits_current=ON
      performance_schema_consumer_events_waits_history_long=ON
       
      ########## Rpl_Slave ##########
      ##### only on slave node #####
      relay_log=/opt/maria10.1/binlog/RELAY
      max_relay_log_size=128M
      relay_log_recovery=1
      relay_log_purge=1
      replicate_wild_ignore_table=sys.%
      replicate_wild_ignore_table=test.%
      replicate_wild_ignore_table=jnxdba.%
      replicate_wild_ignore_table=mysql.slow_log
      read_only=1
      skip_slave_start=1
       
      ########## Rpl_Semi_Sync ##########
      #rpl_semi_sync_master_wait_point=AFTER_SYNC
      ##### master #####
      #rpl_semi_sync_master_enabled=0
      #rpl_semi_sync_master_timeout=1000 # 1s
      #rpl_semi_sync_master_timeout=18446744073709551615 # max
      ##### slave #####
      #rpl_semi_sync_slave_enabled=1
      slave_parallel_mode=optimistic
      slave_parallel_threads=20
      slave_parallel_max_queued=2147483647 #max value
       
      [galera]
      ########## GALERA ##########
      wsrep_on=OFF # switch of galera
      wsrep_cluster_name="mariarac"
      wsrep_cluster_address=gcomm://
      #wsrep_cluster_address="gcomm://IPDB1,IPDB2,IPDB3"
      #wsrep_node_name=hostname
      #wsrep_node_address="IPDBLOCAL"
      wsrep_provider=/opt/maria10.1/galera/libgalera_smm.so
      wsrep_provider_options="gcache.size=10G; gcache.page_size=128M; gcache.keep_pages_size=5G; gmcast.listen_addr=tcp://IPDBLOCAL:4567; gcs.fc_limit=256; gcs.fc_factor=0.9; evs.keepalive_period=PT10S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT45S; evs.install_timeout=PT45S;"
      wsrep_slave_threads=1
      wsrep_max_ws_rows=0
      wsrep_max_ws_size=2147483647 # 2G, default/max value
      wsrep_log_conflicts=1
      wsrep_sync_wait=1
      ##### GTID #####
      wsrep_gtid_mode=ON
      wsrep_gtid_domain_id=1
      ##### SST #####
      #wsrep_sst_method=xtrabackup-v2 # if 10.2, xtrabackup
      wsrep_sst_method=mariabackup # if 10.3, mariabackup
      wsrep_sst_auth=bkpuser:bkpuser
      ##### DEBUG #####
      #wsrep_debug=0
      ##### Galera-4 #####
      #wsrep_ignore_apply_errors=7
       
      # this is only for embedded server
      [embedded]
       
      # This group is only read by MariaDB-10.3 servers.
      # If you use the same .cnf file for MariaDB of different versions,
      # use this group for options that older servers don't understand
      [mariadb-10.3]
      #innodb_fast_shutdown=1
       
      # MariaDB Plugin
      #plugin_maturity=gamma
      #plugin_load_add=metadata_lock_info
      #plugin_load_add=query_cache_info
      #plugin_load_add=wsrep_info
      #plugin_load_add=ha_connect
      #plugin_load_add=disks
       
      [mariadb-10.4]
      unix_socket=OFF
       
      [xtrabackup]
      user=bkpuser
      password=bkpuser
      
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              Elkin Andrei Elkin
              Reporter:
              cui3093 Kuijun Cui
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:

                  Git Integration