Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5217

Assorted failures with parallel replication on concurrent DML/DDL flow and periodic FLUSH LOGS

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.0.6
    • Component/s: None
    • Labels:
      None

      Description

      I've got several types of failures by running the provided test with the same parameters over and over again. Here is the statistics, more details on each type of the failure below:

      Out of 10 consequent runs,

      Error in Log_event::read_log_event(): 'Event too big' 1
      Error in Log_event::read_log_event(): 'read error' 4
      Error in Log_event::read_log_event(): 'Event too small' 2
      Error reading relay log event: Aborting slave SQL thread because of partial event read 1
      Could not execute .... event (Can't find record) 2

      In addition to "Could not execute .... event", I also at least once had a subsequent crash with "safe_mutex: Trying to destroy a mutex LOCK_wait_commit that was locked".

      Examples of error messages:

      131031 11:25:27 [ERROR] Error in Log_event::read_log_event(): 'read error', data_len: 2818048, event_type: 0
      131031 11:25:27 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
      131031 11:25:27 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Internal MariaDB error code: 1594
      131031 11:25:27 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000019' position 287

      131031 11:19:17 [ERROR] Slave SQL thread: I/O error reading event(errno: 175  cur_log->error: 12)
      131031 11:19:17 [ERROR] Error reading relay log event: Aborting slave SQL thread because of partial event read
      131031 11:19:17 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Internal MariaDB error code: 1594
      131031 11:19:17 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000048' position 287

      131031 10:54:58 [ERROR] Slave SQL: Could not execute Delete_rows event on table test.DD; Can't find record in 'DD', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000033, end_log_pos 688, Internal MariaDB error code: 1032
      131031 10:54:58 [ERROR] Slave SQL: Can't find record in 'DD', Internal MariaDB error code: 1032
      131031 10:54:58 [Warning] Slave: Can't find record in 'DD' Error_code: 1032
      131031 10:54:58 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000033' position 326

      131031 10:51:49 [ERROR] Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
      131031 10:51:49 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
      131031 10:51:49 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Internal MariaDB error code: 1594
      131031 10:51:49 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000052' position 326

      Interestingly, in this case slave didn't stop but later also issued another error:

      131031 10:31:22 [ERROR] Error in Log_event::read_log_event(): 'Event too big', data_len: 1987208563, event_type: 100
      131031 10:31:22 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
      131031 10:31:22 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Internal MariaDB error code: 1594
      131031 10:31:22 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000022' position 4
      131031 10:35:55 [ERROR] Slave SQL: Could not execute Delete_rows event on table test.CC; Can't find record in 'CC', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000081, end_log_pos 1184, Internal MariaDB error code: 1032
      131031 10:35:55 [Warning] Slave: Can't find record in 'CC' Error_code: 1032
      131031 10:35:55 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000081' position 287
      131031 10:35:55 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-bin.000081' at position 326

      safe_mutex: Trying to destroy a mutex LOCK_wait_commit that was locked at /home/elenst/bzr/10.0-knielsen/sql/sql_class.cc, line 5805 at /home/elenst/bzr/10.0-knielsen/sql/sql_class.cc, line 5631
      131031  3:45:35 [ERROR] mysqld got signal 6 ;

      RQG grammar (parallel-replication-6.yy):

      query:
      	dml_ddl | dml_ddl | dml_ddl | dml_ddl | dml_ddl |
      	FLUSH LOGS
      ;
       
      create_drop_table:
      	CREATE TABLE IF NOT EXISTS { 't_'.abs($$) } LIKE _table | 
      	DROP TABLE IF EXISTS { 't_'.abs($$) } ;
       
      dml_ddl:
      	delete | insert | create_drop_table ;
       
      delete:
      	DELETE FROM _table WHERE `col_int_nokey` BETWEEN _tinyint[invariant] AND _tinyint[invariant] + 2 ;
       
      insert:
      	INSERT INTO _table ( `col_int_key` , `col_int_nokey` )   VALUES ( _tinyint, _tinyint ), ( _tinyint, _tinyint ) ;

      RQG command line (works on the standard lp:randgen tree):

      perl ./runall-new.pl --grammar=parallel-replication-6.yy  --threads=12 --queries=100M --duration=300  --mysqld=--slave-parallel-threads=8 --rpl_mode=row --basedir=<your basedir> --vardir=<your vardir>

      revision-id: knielsen@knielsen-hq.org-20131030065230-kp8dykgyeth6ma55
      revno: 3690
      branch-nick: 10.0-knielsen

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              knielsen Kristian Nielsen
              Reporter:
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: