[MDEV-18548] Add column in ALTER TABLE results in MDL BF-BF conflict and mysqld shutdown Created: 2019-02-12  Updated: 2021-12-23  Resolved: 2021-12-23

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.2.19
Fix Version/s: 10.2.42

Type: Bug Priority: Major
Reporter: Pim Rupert Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 1
Labels: None
Environment:

MariaDB-server-10.2.19-1.el7.centos.x86_64
galera-25.3.24-1.rhel7.el7.centos.x86_64
Linux 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux



 Description   

Schema change on MariaDB 10.2 Galera cluster done via pt-online-schema-change on master1 results in MDL BF-BF conflict and mysqld shutdown on master2.

Logs on master1:

2019-02-11 17:18:06 139920911415040 [Note] WSREP: MDL conflict db=redacteddbname table=_email_channel_new ticket=7 solved by abort
2019-02-11 17:18:07 139906639951616 [Note] WSREP: MDL conflict db=redacteddbname table=email_channel ticket=9 solved by abort

Logs for master2:

2019-02-11 17:18:06 140455700174592 [Note] WSREP: MDL BF-BF conflict
schema:  redacteddbname
request: (10 	seqno 17904291 	wsrep (1, 1, 0) cmd 0 147 	update `channel` set `last_activity_at` = '2019-02-11 17:18:06', `updated_at` = '2019-02-11 17:18:06' where `id` = 2632>�a\)
granted: (11 	seqno 17904290 	wsrep (1, 1, 0) cmd 0 3 	ALTER TABLE `redacteddbname`.`_email_channel_new` add email_theme_id int unsigned null after branding, add constraint email_channel_email_theme_id_foreign foreign key (email_theme_id) references email_theme (id) on delete set null on update cascade)
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 17904299)
2019-02-11 17:18:07 140455699560192 [ERROR] Slave SQL: Error executing row event: 'Table 'redacteddbname._email_channel_old' doesn't exist', Internal MariaDB error code: 1146
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: RBR event 59 Update_rows_v1 apply warning: 1146, 17904299
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: Failed to apply app buffer: seqno: 17904299, status: 1
	 at galera/src/trx_handle.cpp:apply():353
Retrying 2th time
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 17904299)
2019-02-11 17:18:07 140455699560192 [ERROR] Slave SQL: Error executing row event: 'Table 'redacteddbname._email_channel_old' doesn't exist', Internal MariaDB error code: 1146
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: RBR event 59 Update_rows_v1 apply warning: 1146, 17904299
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: Failed to apply app buffer: seqno: 17904299, status: 1
	 at galera/src/trx_handle.cpp:apply():353
Retrying 3th time
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 17904299)
2019-02-11 17:18:07 140455699560192 [ERROR] Slave SQL: Error executing row event: 'Table 'redacteddbname._email_channel_old' doesn't exist', Internal MariaDB error code: 1146
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: RBR event 59 Update_rows_v1 apply warning: 1146, 17904299
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: Failed to apply app buffer: seqno: 17904299, status: 1
	 at galera/src/trx_handle.cpp:apply():353
Retrying 4th time
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 17904299)
2019-02-11 17:18:07 140455699560192 [ERROR] Slave SQL: Error executing row event: 'Table 'redacteddbname._email_channel_old' doesn't exist', Internal MariaDB error code: 1146
2019-02-11 17:18:07 140455699560192 [Warning] WSREP: RBR event 59 Update_rows_v1 apply warning: 1146, 17904299
2019-02-11 17:18:07 140455699560192 [ERROR] WSREP: Failed to apply trx: source: 27c270fe-1a82-11e9-a225-87b513ed78e3 version: 4 local: 0 state: APPLYING flags: 1 conn_id: 62712540 trx_id: 582056665 seqnos (l: 17904759, g: 17904299, s: 17904298, d: 17904092, ts: 4733595338897186)
2019-02-11 17:18:07 140455699560192 [ERROR] WSREP: Failed to apply trx 17904299 4 times
2019-02-11 17:18:07 140455699560192 [ERROR] WSREP: Node consistency compromised, aborting...
2019-02-11 17:18:07 140455699560192 [Note] WSREP: Closing send monitor...
2019-02-11 17:18:07 140455699560192 [Note] WSREP: Closed send monitor.
2019-02-11 17:18:07 140455699560192 [Note] WSREP: gcomm: terminating thread
2019-02-11 17:18:07 140455699560192 [Note] WSREP: gcomm: joining thread
2019-02-11 17:18:07 140455699560192 [Note] WSREP: gcomm: closing backend
2019-02-11 17:18:07 140455699560192 [Note] WSREP: view(view_id(NON_PRIM,27c270fe,77) memb {
	aebcc706,0
} joined {
} left {
} partitioned {
	27c270fe,0
	cf17ce0a,0
})
2019-02-11 17:18:07 140455699560192 [Note] WSREP: view((empty))
2019-02-11 17:18:07 140455699560192 [Note] WSREP: gcomm: closed
2019-02-11 17:18:07 140456000202496 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Flow-control interval: [16, 16]
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Trying to continue unpaused monitor
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Received NON-PRIMARY.
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 17904300)
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Received self-leave message.
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Flow-control interval: [0, 0]
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Trying to continue unpaused monitor
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2019-02-11 17:18:07 140456000202496 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 17904300)
2019-02-11 17:18:07 140456000202496 [Note] WSREP: RECV thread exiting 0: Success
2019-02-11 17:18:07 140455699560192 [Note] WSREP: recv_thread() joined.
2019-02-11 17:18:07 140455699560192 [Note] WSREP: Closing replication queue.
2019-02-11 17:18:07 140455699560192 [Note] WSREP: Closing slave action queue.
2019-02-11 17:18:07 140455699560192 [Note] WSREP: /usr/sbin/mysqld: Terminated.



 Comments   
Comment by Jan Lindström (Inactive) [ 2019-02-13 ]

Can I have show create table output for following tables: channel, redacteddbname._email_channel_old, redacteddbname._email_channel_new, email_channel and email_theme

Comment by Pim Rupert [ 2019-02-13 ]

Jan, thanks for looking into this. Unfortunately I have no show create table output for the _email_channel_old and _email_channel_new tables, as they were created temporarily by pt-osc. Hereby what we do know:

CREATE TABLE `channel` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `phone` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `title` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `username` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `password` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `status_update` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `status` enum('ACTIVE','UNVERIFIED','CODE_SENT','PENDING','UNASSIGNED') COLLATE utf8mb4_unicode_ci NOT NULL,
  `type` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'WHATSAPP',
  `is_wa_connector` tinyint(1) NOT NULL DEFAULT 0,
  `logo_path` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `account_type` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `last_login_at` datetime DEFAULT NULL,
  `last_activity_at` datetime DEFAULT NULL,
  `expires_at` datetime DEFAULT NULL,
  `telegram_last_update_id` datetime DEFAULT NULL,
  `notification_email` text COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `is_running` tinyint(1) NOT NULL DEFAULT 0,
  `agency_id` int(10) unsigned DEFAULT NULL,
  `auto_reply` enum('ENABLED','DISABLED','IN_OPENING_HOURS','OUTSIDE_OPENING_HOURS') COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'ENABLED',
  `wa_server_id` int(10) unsigned DEFAULT NULL,
  `connection_error_notification_email` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `price` decimal(8,2) NOT NULL DEFAULT 15.00,
  `color` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `show_ticket_fields` tinyint(1) NOT NULL DEFAULT 1,
  `show_contact_fields` tinyint(1) NOT NULL DEFAULT 1,
  `can_be_deleted_at` datetime DEFAULT NULL,
  `requested_by` int(11) DEFAULT NULL,
  `reopen_closed_ticket` tinyint(1) NOT NULL DEFAULT 0,
  `deleted_at` timestamp NULL DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `agency_id` (`agency_id`),
  KEY `channel_wa_server_id_foreign` (`wa_server_id`),
  KEY `type` (`type`),
  KEY `username` (`username`(191)),
  KEY `phone` (`phone`(191)),
  KEY `status` (`status`),
  KEY `password` (`password`(191)),
  CONSTRAINT `channel_ibfk_1` FOREIGN KEY (`agency_id`) REFERENCES `agency` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
  CONSTRAINT `channel_wa_server_id_foreign` FOREIGN KEY (`wa_server_id`) REFERENCES `wa_server` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=40846 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
 
CREATE TABLE `email_channel` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `channel_id` int(10) unsigned NOT NULL,
  `signature` text COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `auto_reply_enabled` tinyint(1) NOT NULL DEFAULT 1,
  `auto_reply_subject` text COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `auto_reply_body` text COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `sender_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `branding_service_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `branding` tinyint(1) NOT NULL DEFAULT 1,
  `email_theme_id` int(10) unsigned DEFAULT NULL,
  `sender_name_personal` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `sender_email` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `sender_email_id` int(10) unsigned DEFAULT NULL,
  `add_conversation_to_reply` tinyint(1) NOT NULL DEFAULT 1,
  `split_by_subject` int(11) NOT NULL DEFAULT 1,
  `split_by_sender` int(11) NOT NULL DEFAULT 0,
  `font_family` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `embed_attachments` tinyint(1) NOT NULL DEFAULT 0,
  `prepend_ticket_number_to_subject` tinyint(1) NOT NULL DEFAULT 0,
  `spam_setting` int(11) NOT NULL DEFAULT 1,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `email_channel_channel_id_foreign` (`channel_id`),
  KEY `email_channel_sender_email_id_foreign` (`sender_email_id`),
  KEY `_email_channel_email_theme_id_foreign` (`email_theme_id`),
  CONSTRAINT `__email_channel_channel_id_foreign` FOREIGN KEY (`channel_id`) REFERENCES `channel` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `__email_channel_sender_email_id_foreign` FOREIGN KEY (`sender_email_id`) REFERENCES `email_channel_verified_domain` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
  CONSTRAINT `_email_channel_email_theme_id_foreign` FOREIGN KEY (`email_theme_id`) REFERENCES `email_theme` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=11785 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
 
CREATE TABLE `email_theme` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `agency_id` int(10) unsigned NOT NULL,
  `title` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `font_family` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `font_size` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `font_color` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `line_height` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `agency_id_foreign` (`agency_id`),
  CONSTRAINT `agency_id_foreign` FOREIGN KEY (`agency_id`) REFERENCES `agency` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=335 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Comment by Pim Rupert [ 2019-02-13 ]

For further clarification, the following command ran the ALTER:

pt-online-schema-change --alter "add email_theme_id int unsigned null after branding, add constraint email_channel_email_theme_id_foreign foreign key (email_theme_id) references email_theme (id) on delete set null on update cascade" D=redacteddbname,t=email_channel --execute -u redactedusername -p redactedpw

Comment by Brendan P [ 2019-05-02 ]

This is a terrible bug, it should be easily reproducible to anyone whom uses pt-online-schema-change on latest galera3, just manipulate some foreign keys or change anything related to the *PRIMARY key. Almost as if something is being parallel applied, the online DDL occurs however something in the replication pipeline still tries to apply something to a nonexistent table, thus innodb throws an error.

What is missing from the original report, is when this happens, it forces an entire SST on the whole database, which if you have a multi TiB db can be a very painful experience.

Generated at Thu Feb 08 08:44:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.