[MDEV-17766] Mariadb fails when the table is deleted (Internal MariaDB error code: 1146) Created: 2018-11-19  Updated: 2022-01-21  Resolved: 2022-01-21

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.3.10
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Dominik Fišer Assignee: Jan Lindström (Inactive)
Resolution: Incomplete Votes: 2
Labels: galera
Environment:

Debian GNU/Linux 9.5 (stretch), galera cluster (wsrep_provider_version 25.3.24(r3825)), mariabackup



 Description   

We have a problem after upgrade to Mariadb 10.3.10 (from 10.2). We are deleting all tables in database each hour (it's demo database) and importing same tables again. Every few days mariadb fails totally. We are using galera cluster with mariabackup. One node failed.
Table schema is:

CREATE TABLE `shopio_vouchers_products` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `vouchers_id` int(10) NOT NULL,
  `products_id` int(10) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_ids` (`vouchers_id`,`products_id`),
  KEY `products_id` (`products_id`),
  CONSTRAINT `shopio_vouchers_products_ibfk_1` FOREIGN KEY (`products_id`) REFERENCES `shopio_products` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `shopio_vouchers_products_ibfk_2` FOREIGN KEY (`vouchers_id`) REFERENCES `shopio_vouchers` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

See log detail:

lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 1876588494)
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [ERROR] Slave SQL: Error executing row event: 'Table 's_demo.shopio_vouchers_products' doesn't exist', Internal MariaDB error code: 1146
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: RBR event 23 Update_rows_v1 apply warning: 1146, 1876588494
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: Failed to apply app buffer: seqno: 1876588494, status: 1
lis 19 00:00:11 server mysqld[26688]:          at galera/src/trx_handle.cpp:apply():353
lis 19 00:00:11 server mysqld[26688]: Retrying 2th time
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 1876588494)
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [ERROR] Slave SQL: Error executing row event: 'Table 's_demo.shopio_vouchers_products' doesn't exist', Internal MariaDB error code: 1146
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: RBR event 23 Update_rows_v1 apply warning: 1146, 1876588494
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: Failed to apply app buffer: seqno: 1876588494, status: 1
lis 19 00:00:11 server mysqld[26688]:          at galera/src/trx_handle.cpp:apply():353
lis 19 00:00:11 server mysqld[26688]: Retrying 3th time
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 1876588494)
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [ERROR] Slave SQL: Error executing row event: 'Table 's_demo.shopio_vouchers_products' doesn't exist', Internal MariaDB error code: 1146
lis 19 00:00:11 server mysqld[26688]: 2018-11-19  0:00:11 14 [Warning] WSREP: RBR event 23 Update_rows_v1 apply warning: 1146, 1876588494
lis 19 00:00:12 server systemd[1]: mariadb.service: Main process exited, code=killed, status=6/ABRT
lis 19 00:00:12 server systemd[1]: mariadb.service: Unit entered failed state.
lis 19 00:00:12 server systemd[1]: mariadb.service: Failed with result 'signal'.

Any ideas what could be wrong?



 Comments   
Comment by David Delbecq [ 2019-03-11 ]

Hello,

we have a similar problem on our galera cluster when using the percona toolkit 'pt-online-schema-change'.

As a reminder, this tool does the following

  • create a new table
  • create insert/delete/update triggers copying modifications (update old table -> update new table)
  • copy data from old table to

If the tool needs to cancel it's operation (due to detected heavy load or other sanity check reason), it does the following

  • stop it's copy operation
  • drop the trigger
  • drop the new table

for some reasons, it seems that the delete table is performed by mariadb while some cluster operation are still pending on the new table, leading to a mysql crash. It is always reproductible in our case as we are manipulating tables having hundreds of modifications per second. When this happen, the cluster node gets down is a full SST is required.

See our mariadb logs below.

Mar 06 13:46:01 db.service.com mysqld[240367]: at galera/src/trx_handle.cpp:apply():353
Mar 06 13:46:01 db.service.com mysqld[240367]: Retrying 3th time
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 3131068)
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [ERROR] Slave SQL: Error executing row event: 'Table 'MyData._mytable_new' doesn't exist', Internal MariaDB error code: 1146
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: RBR event 12 Delete_rows_v1 apply warning: 1146, 3131068
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: Failed to apply app buffer: seqno: 3131068, status: 1
Mar 06 13:46:01 db.service.com mysqld[240367]: at galera/src/trx_handle.cpp:apply():353
Mar 06 13:46:01 db.service.com mysqld[240367]: Retrying 4th time
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 3131068)
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [ERROR] Slave SQL: Error executing row event: 'Table 'MyData._mytable_new' doesn't exist', Internal MariaDB error code: 1146
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: RBR event 12 Delete_rows_v1 apply warning: 1146, 3131068
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: failed to replay trx: source: fa958901-3f52-11e9-85aa-939b6d70d4b8 version: 4 local: 1 state: REPLAYING flags: 1 conn_id: 115521 trx_id: 20478282566 seqnos (l: 3257204, g: 3131068, s: 3131066, d: 3131057, ts: 8997189420350868)
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [Warning] WSREP: Failed to apply trx 3131068 4 times
Mar 06 13:46:01 db.service.com mysqld[240367]: [110B blob data]
Mar 06 13:46:01 db.service.com mysqld[240367]: 2019-03-06 13:46:01 140273500833536 [ERROR] Aborting
Mar 06 13:46:13 db.service.com mysqld[240367]: 190306 13:46:13 [ERROR] mysqld got signal 11 ;
Mar 06 13:46:13 db.service.com mysqld[240367]: This could be because you hit a bug. It is also possible that this binary
Mar 06 13:46:13 db.service.com mysqld[240367]: or one of the libraries it was linked against is corrupt, improperly built,
Mar 06 13:46:13 db.service.com mysqld[240367]: or misconfigured. This error can also be caused by malfunctioning hardware.
Mar 06 13:46:13 db.service.com mysqld[240367]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Mar 06 13:46:13 db.service.com mysqld[240367]: We will try our best to scrape up some info that will hopefully help
Mar 06 13:46:13 db.service.com mysqld[240367]: diagnose the problem, but since we have already crashed,
Mar 06 13:46:13 db.service.com mysqld[240367]: something is definitely wrong and this may fail.
Mar 06 13:46:13 db.service.com mysqld[240367]: Server version: 10.2.22-MariaDB
Mar 06 13:46:13 db.service.com mysqld[240367]: key_buffer_size=67108864
Mar 06 13:46:13 db.service.com mysqld[240367]: read_buffer_size=1048576
Mar 06 13:46:13 db.service.com mysqld[240367]: max_used_connections=281
Mar 06 13:46:13 db.service.com mysqld[240367]: max_threads=602
Mar 06 13:46:13 db.service.com mysqld[240367]: thread_count=340
Mar 06 13:46:13 db.service.com mysqld[240367]: It is possible that mysqld could use up to
Mar 06 13:46:13 db.service.com mysqld[240367]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1927693 K  bytes of memory
Mar 06 13:46:13 db.service.com mysqld[240367]: Hope that's ok; if not, decrease some variables in the equation.
Mar 06 13:46:13 db.service.com mysqld[240367]: Thread pointer: 0x7f8df80008c8
Mar 06 13:46:13 db.service.com mysqld[240367]: Attempting backtrace. You can use the following information to find out
Mar 06 13:46:13 db.service.com mysqld[240367]: where mysqld died. If you see no messages after this, something went
Mar 06 13:46:13 db.service.com mysqld[240367]: terribly wrong...
Mar 06 13:46:13 db.service.com mysqld[240367]: stack_bottom = 0x7f94001d7d30 thread_stack 0x49000
Mar 06 13:46:13 db.service.com mysqld[240367]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x558837f4cd6e]
Mar 06 13:46:13 db.service.com mysqld[240367]: /usr/sbin/mysqld(handle_fatal_signal+0x355)[0x5588379ce665]
Mar 06 13:46:13 db.service.com mysqld[240367]: /lib64/libpthread.so.0(+0xf6d0)[0x7fb16c4516d0]
Mar 06 13:46:13 db.service.com mysqld[240367]: /usr/sbin/mysqld(+0x470b83)[0x5588377ceb83]
Mar 06 13:46:13 db.service.com mysqld[240367]: /usr/sbin/mysqld(_Z14acl_check_hostPKcS0_+0x3b)[0x5588377de69b]
Mar 06 13:46:14 db.service.com mysqld[240367]: /usr/sbin/mysqld(+0x5abdbf)[0x558837909dbf]
Mar 06 13:46:19 db.service.com systemd[1]: mariadb.service: main process exited, code=killed, status=11/SEGV
Mar 06 13:46:19 db.service.com systemd[1]: Unit mariadb.service entered failed state.
Mar 06 13:46:19 db.service.com systemd[1]: mariadb.service failed.

Comment by Jan Lindström (Inactive) [ 2022-01-21 ]

Can you please try with more recent version of MariaDB server and Galera library.

Generated at Thu Feb 08 08:38:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.