Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21675

Galera: RQG (mss): Master node disconnects from the cluster after voting on: Duplicate entry '0' for key 'PRIMARY', Error_code: 1062

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.4, 10.5.1
    • Fix Version/s: 10.4.13, 10.5.2
    • Component/s: Galera
    • Labels:
    • Environment:
      CentOS Linux release 7.6.1810 (Core)

      Description

      When RQG is used with "--galera=mss", master node disconnects from the cluster after voting on: "Duplicate entry '0' for key 'PRIMARY', Error_code: 1062".

      Galera lib 26.4.4(r4774): Repository: mariadb-corporation/es-galera; branch es-mariadb-4.x; Revision af1fa5b5d20c50ec827f45b9724ec964ab13efdd; debug build.

      RQG: Repository: MariaDB/randgen; master branch; Revision 3bf9c73c1b010facc78d210dfc875dfbaaf1ccdd

      Run:

      perl ./runall-new.pl --grammar=conf/galera/galera_stress.yy --gendata=conf/galera/galera_stress.zz --duration=300 --queries=200M --threads=8 --galera=mss --basedir=/home/stepan/ment/10.5 --vardir=/home/stepan/rqg/var "--mysqld=--wsrep-provider=/usr/lib/libgalera_enterprise_smm.so" "--mysqld=--wsrep_sst_method=rsync" "--mysqld=--core" "--mysqld=--general-log" "--mysqld=--general-log-file=queries.log" "--mysqld=--log-output=file" "--mysqld=--wsrep-debug=none" "--mysqld=--wsrep-sync-wait=15" "--mysqld=--wsrep_retry_autocommit=0" "--mysqld=--wsrep_log_conflicts=1" "--mysqld=--wsrep_on=ON" "--mysqld=--default-storage-engine=innodb" "--mysqld=--sort_buffer_size=200M" "--mysqld=--innodb-autoinc-lock-mode=2" "--mysqld=--innodb-lock-wait-timeout=3"
      

      Master 0 error log:
      2020-02-06 12:01:40 2 [ERROR] WSREP: Vote 0 (success) on 674921f6-48c7-11ea-b9a9-4a0630e482fa:240 is inconsistent with group. Leaving cluster.

      10.5.1-ES 57950ded281731263f6aa358d43c7b9d51f3dbfb debug

      2020-02-06 12:00:12 2 [Note] WSREP: Server cnt7glr11.localdomain synced with group
      2020-02-06 12:00:12 2 [Note] WSREP: Server status change joined -> synced
      2020-02-06 12:00:12 2 [Note] WSREP: Synchronized with group, ready for connections
      2020-02-06 12:00:12 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2020-02-06 12:00:14 0 [Note] WSREP: async IST sender served
      2020-02-06 12:00:14 0 [Note] WSREP: 2.0 (cnt7glr11.localdomain): State transfer from 0.0 (cnt7glr11.localdomain) complete.
      2020-02-06 12:00:14 0 [Note] WSREP: Member 2.0 (cnt7glr11.localdomain) synced with group.
      2020-02-06 12:01:14 8 [Note] WSREP: MDL conflict db=test table=table10000_innodb_key_pk_parts_2_int_autoinc ticket=1 solved by abort
      2020-02-06 12:01:14 8 [Note] WSREP: MDL conflict db=test table=table10000_innodb_key_pk_parts_2_int_autoinc ticket=1 solved by abort
      2020-02-06 12:01:14 0 [Note] WSREP: MDL conflict db=test table=table10000_innodb_key_pk_parts_2_int_autoinc ticket=1 solved by abort
      2020-02-06 12:01:14 0 [Note] WSREP: MDL conflict db=test table=table10000_innodb_key_pk_parts_2_int_autoinc ticket=9 solved by abort
      2020-02-06 12:01:14 0 [Note] WSREP: MDL conflict db=test table=table10000_innodb_key_pk_parts_2_int_autoinc ticket=9 solved by abort
      2020-02-06 12:01:40 0 [Note] WSREP: Member 2(cnt7glr11.localdomain) initiates vote on 674921f6-48c7-11ea-b9a9-4a0630e482fa:240,fdcab7194c6d731e:  Duplicate entry '0' for key 'PRIMARY', Error_code: 1062;
      2020-02-06 12:01:40 0 [Note] WSREP: Votes over 674921f6-48c7-11ea-b9a9-4a0630e482fa:240:
         fdcab7194c6d731e:   1/3
      Waiting for more votes.
      2020-02-06 12:01:40 2 [Note] WSREP: Got vote request for seqno 674921f6-48c7-11ea-b9a9-4a0630e482fa:240
      2020-02-06 12:01:40 0 [Note] WSREP: Member 1(cnt7glr11.localdomain) initiates vote on 674921f6-48c7-11ea-b9a9-4a0630e482fa:240,fdcab7194c6d731e:  Duplicate entry '0' for key 'PRIMARY', Error_code: 1062;
      2020-02-06 12:01:40 0 [Note] WSREP: Votes over 674921f6-48c7-11ea-b9a9-4a0630e482fa:240:
         fdcab7194c6d731e:   2/3
      Winner: fdcab7194c6d731e
      2020-02-06 12:01:40 0 [Note] WSREP: Recovering vote result from history: 674921f6-48c7-11ea-b9a9-4a0630e482fa:240,fdcab7194c6d731e
      2020-02-06 12:01:40 2 [ERROR] WSREP: Vote 0 (success) on 674921f6-48c7-11ea-b9a9-4a0630e482fa:240 is inconsistent with group. Leaving cluster.
      2020-02-06 12:01:40 2 [Note] WSREP: Closing send monitor...
      2020-02-06 12:01:40 2 [Note] WSREP: Closed send monitor.
      2020-02-06 12:01:40 2 [Note] WSREP: gcomm: terminating thread
      2020-02-06 12:01:40 2 [Note] WSREP: gcomm: joining thread
      2020-02-06 12:01:40 2 [Note] WSREP: gcomm: closing backend
      2020-02-06 12:01:40 2 [Note] WSREP: view(view_id(NON_PRIM,6747ff9c-a2d2,3) memb {
      	6747ff9c-a2d2,0
      } joined {
      } left {
      } partitioned {
      	67bcc51c-bea9,0
      	7026dfc8-9b76,0
      })
      2020-02-06 12:01:40 2 [Note] WSREP: PC protocol downgrade 1 -> 0
      2020-02-06 12:01:40 2 [Note] WSREP: view((empty))
      2020-02-06 12:01:40 2 [Note] WSREP: gcomm: closed
      2020-02-06 12:01:40 14 [Warning] WSREP: 0x5634837cfd98 down context(s) not set
      2020-02-06 12:01:40 14 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected)
      2020-02-06 12:01:40 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
      2020-02-06 12:01:40 0 [Note] WSREP: Flow-control interval: [16, 16]
      2020-02-06 12:01:40 0 [Note] WSREP: Received NON-PRIMARY.
      2020-02-06 12:01:40 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 240)
      2020-02-06 12:01:40 0 [Note] WSREP: New SELF-LEAVE.
      2020-02-06 12:01:40 0 [Note] WSREP: Flow-control interval: [0, 0]
      2020-02-06 12:01:40 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
      2020-02-06 12:01:40 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: -1)
      2020-02-06 12:01:40 0 [Note] WSREP: RECV thread exiting 0: Success
      2020-02-06 12:01:40 14 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
      2020-02-06 12:01:40 14 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
      2020-02-06 12:01:40 2 [Note] WSREP: recv_thread() joined.
      2020-02-06 12:01:40 2 [Note] WSREP: Closing replication queue.
      2020-02-06 12:01:40 2 [Note] WSREP: Closing slave action queue.
      2020-02-06 12:01:40 2 [Note] WSREP: ================================================
      View:
        id: 674921f6-48c7-11ea-b9a9-4a0630e482fa:-1
        status: non-primary
        protocol_version: 4
        capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
        final: no
        own_index: 0
        members(1):
      	0: 6747ff9c-48c7-11ea-a2d2-6ee13cc9cdf6, cnt7glr11.localdomain
      =================================================
      2020-02-06 12:01:40 2 [Note] WSREP: Non-primary view
      2020-02-06 12:01:40 2 [Note] WSREP: Server status change synced -> connected
      2020-02-06 12:01:40 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2020-02-06 12:01:40 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2020-02-06 12:01:40 14 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
      2020-02-06 12:01:40 2 [Note] WSREP: ================================================
      View:
        id: 674921f6-48c7-11ea-b9a9-4a0630e482fa:-1
        status: non-primary
        protocol_version: 4
        capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
        final: yes
        own_index: -1
        members(0):
      =================================================
      2020-02-06 12:01:40 2 [Note] WSREP: Non-primary view
      2020-02-06 12:01:40 2 [Note] WSREP: Server status change connected -> disconnected
      2020-02-06 12:01:40 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2020-02-06 12:01:40 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2020-02-06 12:01:40 0 [Note] WSREP: Service thread queue flushed.
      2020-02-06 12:01:40 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: 5
      2020-02-06 12:01:40 2 [Note] WSREP: Applier thread exiting ret: 0 thd: 2
      2020-02-06 12:01:40 2 [Warning] Aborted connection 2 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)
      2020-02-06 12:01:40 12 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
      2020-02-06 12:01:40 14 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
      

      Used runall-new.pl and appropriate yy and zz files are also attached.

        Attachments

        1. galera_stress.yy
          3 kB
        2. galera_stress.zz
          1 kB
        3. mysql0.err
          1.23 MB
        4. mysql1.err
          30 kB
        5. mysql2.err
          27 kB
        6. runall-new.pl
          32 kB

          Activity

            People

            Assignee:
            jplindst Jan Lindström
            Reporter:
            stepan.patryshev Stepan Patryshev
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved: