[MDEV-15822] WSREP: BF lock wait long for trx - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.1(EOL), 10.2(EOL), 10.3(EOL)
Fix Version/s: 10.1.35, 10.2.17, 10.3.9
Component/s: Galera
Labels:
None
Environment:
CentOS 7.4 (with microcode_ctl-2.1-22.2.el7.x86_64, AFAIK withdrawn because of stability issues)

Description

hi,

this might be duplicate if ~~MDEV-15747~~, but is for Verision 10.2.12 so I ma not sure

We ran in this bug last month, but I had no time to look deeper into this until now.

It was on a Cluster with 2 nodes. The problems started on node "DB2", and shifted to node "DB1" when trying to restart "DB2". The restart failed, so I rebooted the node.

here is the timeline of what happend when on node DB2:
08:00:24 - problem on node DB2
10:27:58 - trying to restart mariadb (failed!)

from /var/log/messages:
Mar 13 10:29:28 DB2 systemd: mariadb.service stop-sigterm timed out. Skipping SIGKILL.
Mar 13 10:30:59 DB2 systemd: mariadb.service stop-final-sigterm timed out. Skipping SIGKILL. Entering failed mode.
Mar 13 10:30:59 DB2 systemd: Stopped MariaDB 10.2.12 database server.
Mar 13 10:30:59 DB2 systemd: Unit mariadb.service entered failed state.
Mar 13 10:30:59 DB2 systemd: mariadb.service failed.
Mar 13 10:31:13 DB2 systemd: Starting MariaDB 10.2.12 database server...
Mar 13 10:32:43 DB2 systemd: mariadb.service start-pre operation timed out. Terminating.
Mar 13 10:32:43 DB2 systemd: Failed to start MariaDB 10.2.12 database server.
Mar 13 10:32:43 DB2 systemd: Unit mariadb.service entered failed state.
Mar 13 10:32:43 DB2 systemd: mariadb.service failed.

10:41 - rebooted node DB2

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

DB1-mysqld.log-20180313
197 kB
2018-04-09 11:17
DB2-mysqld.log-20180313
613 kB
2018-04-09 11:17
galera.diff
4 kB
2018-06-28 10:08

Issue Links

is duplicated by

MDEV-15154 WSREP: BF lock wait long after a TRUNCATE TABLE

Closed

Activity

Ascending order - Click to sort in descending order

Reinhard Sojka created issue - 2018-04-09 11:22

Elena Stepanova made changes - 2018-04-09 11:31

Field	Original Value	New Value
Fix Version/s		10.2 [ 14601 ]
Assignee		Sachin Setiya [ sachin.setiya.007 ]

Sergei Golubchik made changes - 2018-05-02 10:59

Assignee

Sachin Setiya [ sachin.setiya.007 ]

Jan Lindström [ jplindst ]

Jan Lindström (Inactive) added a comment - 2018-06-05 10:42 - edited

There is record lock wait between galera BF (brure force) DDL-transaction doing ALTER TABLE and mysql.innodb_table_stats update:

---TRANSACTION 74849215, ACTIVE 44 sec

LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1

MySQL thread id 750205, OS thread handle 140417619298048, query id 229007447 62.218.60.201 typo3 Unlocking tables

ALTER TABLE `tx_dorauktion_domain_model_konfiguration` CHANGE `steigerungsstufen_art` `steigerungsstufen_art` INT DEFAULT 0 NOT NULL

------- TRX HAS BEEN WAITING 44 SEC FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 1 page no 3 n bits 256 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 74849215 lock_mode X locks gap before rec insert intention waiting

Record lock, heap no 3 PHYSICAL RECORD: n_fields 8; compact format; info bits 0

 0: len 5; hex 7479706f33; asc typo3;;

 1: len 14; hex 6261636b656e645f6c61796f7574; asc backend_layout;;

 2: len 6; hex 0000011fe295; asc       ;;

 3: len 7; hex e70000019a0110; asc        ;;

 4: len 4; hex 5a950b50; asc Z  P;;

 5: len 8; hex 0000000000000000; asc         ;;

 6: len 8; hex 0000000000000001; asc         ;;

 7: len 8; hex 0000000000000002; asc         ;;

------------------

TABLE LOCK table `mysql`.`innodb_table_stats` trx id 74849215 lock mode IX

RECORD LOCKS space id 1 page no 3 n bits 256 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 74849215 lock_mode X

Record lock, heap no 50 PHYSICAL RECORD: n_fields 8; compact format; info bits 32

 0: len 5; hex 7479706f33; asc typo3;;

 1: len 30; hex 74785f646f7261756b74696f6e5f646f6d61696e5f6d6f64656c5f6b6f6e; asc tx_dorauktion_domain_model_kon; (total 40 bytes);

 2: len 6; hex 000004761bbf; asc    v  ;;

 3: len 7; hex 580000014c0a69; asc X   L i;;

 4: len 4; hex 5aa67ff3; asc Z   ;;

 5: len 8; hex 000000000000000a; asc         ;;

 6: len 8; hex 0000000000000001; asc         ;;

 7: len 8; hex 0000000000000004; asc         ;;

RECORD LOCKS space id 1 page no 3 n bits 256 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 74849215 lock_mode X locks gap before rec insert intention waiting

Record lock, heap no 3 PHYSICAL RECORD: n_fields 8; compact format; info bits 0

 0: len 5; hex 7479706f33; asc typo3;;

 1: len 14; hex 6261636b656e645f6c61796f7574; asc backend_layout;;

 2: len 6; hex 0000011fe295; asc       ;;

 3: len 7; hex e70000019a0110; asc        ;;

 4: len 4; hex 5a950b50; asc Z  P;;

 5: len 8; hex 0000000000000000; asc         ;;

 6: len 8; hex 0000000000000001; asc         ;;

 7: len 8; hex 0000000000000002; asc         ;;

Jan Lindström (Inactive) added a comment - 2018-06-05 10:42 - edited There is record lock wait between galera BF (brure force) DDL-transaction doing ALTER TABLE and mysql.innodb_table_stats update: ---TRANSACTION 74849215, ACTIVE 44 sec LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1 MySQL thread id 750205, OS thread handle 140417619298048, query id 229007447 62.218.60.201 typo3 Unlocking tables ALTER TABLE `tx_dorauktion_domain_model_konfiguration` CHANGE `steigerungsstufen_art` `steigerungsstufen_art` INT DEFAULT 0 NOT NULL ------- TRX HAS BEEN WAITING 44 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1 page no 3 n bits 256 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 74849215 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 3 PHYSICAL RECORD: n_fields 8; compact format; info bits 0 0: len 5; hex 7479706f33; asc typo3;; 1: len 14; hex 6261636b656e645f6c61796f7574; asc backend_layout;; 2: len 6; hex 0000011fe295; asc ;; 3: len 7; hex e70000019a0110; asc ;; 4: len 4; hex 5a950b50; asc Z P;; 5: len 8; hex 0000000000000000; asc ;; 6: len 8; hex 0000000000000001; asc ;; 7: len 8; hex 0000000000000002; asc ;; ------------------ TABLE LOCK table `mysql`.`innodb_table_stats` trx id 74849215 lock mode IX RECORD LOCKS space id 1 page no 3 n bits 256 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 74849215 lock_mode X Record lock, heap no 50 PHYSICAL RECORD: n_fields 8; compact format; info bits 32 0: len 5; hex 7479706f33; asc typo3;; 1: len 30; hex 74785f646f7261756b74696f6e5f646f6d61696e5f6d6f64656c5f6b6f6e; asc tx_dorauktion_domain_model_kon; (total 40 bytes); 2: len 6; hex 000004761bbf; asc v ;; 3: len 7; hex 580000014c0a69; asc X L i;; 4: len 4; hex 5aa67ff3; asc Z ;; 5: len 8; hex 000000000000000a; asc ;; 6: len 8; hex 0000000000000001; asc ;; 7: len 8; hex 0000000000000004; asc ;; RECORD LOCKS space id 1 page no 3 n bits 256 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 74849215 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 3 PHYSICAL RECORD: n_fields 8; compact format; info bits 0 0: len 5; hex 7479706f33; asc typo3;; 1: len 14; hex 6261636b656e645f6c61796f7574; asc backend_layout;; 2: len 6; hex 0000011fe295; asc ;; 3: len 7; hex e70000019a0110; asc ;; 4: len 4; hex 5a950b50; asc Z P;; 5: len 8; hex 0000000000000000; asc ;; 6: len 8; hex 0000000000000001; asc ;; 7: len 8; hex 0000000000000002; asc ;;

Jan Lindström (Inactive) made changes - 2018-06-05 10:43

Assignee

Jan Lindström [ jplindst ]

Vasil [ dimov ]

Jan Lindström (Inactive) added a comment - 2018-06-05 10:45 - edited

Dimov Questions: Why these mysql.innodb_index_stats and mysql.innodb_table_stats updates are done in different trx compared to trx that causes the update ? Secondly, why it is done in middle of huge DDL ? Finally, can we somehow move this update in case of BF later ? Does Galera replicate the these update from every updated node to other nodes and do this same update regardless of this replication ?

Jan Lindström (Inactive) added a comment - 2018-06-05 10:45 - edited Dimov Questions: Why these mysql.innodb_index_stats and mysql.innodb_table_stats updates are done in different trx compared to trx that causes the update ? Secondly, why it is done in middle of huge DDL ? Finally, can we somehow move this update in case of BF later ? Does Galera replicate the these update from every updated node to other nodes and do this same update regardless of this replication ?

Jan Lindström (Inactive) added a comment - 2018-06-05 10:56

Meanwhile: a work-around could be `SET GLOBAL innodb_stats_auto_recalc=OFF`

Jan Lindström (Inactive) added a comment - 2018-06-05 10:56 Meanwhile: a work-around could be `SET GLOBAL innodb_stats_auto_recalc=OFF`

Jan Lindström (Inactive) added a comment - 2018-06-05 10:58

Another similar case:

---TRANSACTION 119735696, ACTIVE 43 sec

  LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)

  MySQL thread id 250661, OS thread handle 140410807035648, query id 195176750 62.218.60.201 typo3 Opening tables

  TRUNCATE `cf_cache_imagesizes_tags`

  ------- TRX HAS BEEN WAITING 43 SEC FOR THIS LOCK TO BE GRANTED:

  RECORD LOCKS space id 1 page no 3 n bits 184 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 119735696 lock_mode X waiting

Jan Lindström (Inactive) added a comment - 2018-06-05 10:58 Another similar case: ---TRANSACTION 119735696, ACTIVE 43 sec LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s) MySQL thread id 250661, OS thread handle 140410807035648, query id 195176750 62.218.60.201 typo3 Opening tables TRUNCATE `cf_cache_imagesizes_tags` ------- TRX HAS BEEN WAITING 43 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1 page no 3 n bits 184 index PRIMARY of table `mysql`.`innodb_table_stats` trx id 119735696 lock_mode X waiting

Jan Lindström (Inactive) made changes - 2018-06-18 14:08

Assignee

Vasil [ dimov ]

Jan Lindström [ jplindst ]

Jan Lindström (Inactive) made changes - 2018-06-18 14:08

Assignee

Jan Lindström [ jplindst ]

Vasil [ dimov ]

Vasil (Inactive) added a comment - 2018-06-18 14:39

> Why these mysql.innodb_index_stats and mysql.innodb_table_stats updates are done in different trx compared to trx that causes the update?

The stats recalculations are done in an async manner with the DML that crossed the threshold and made stats recalculation necessary. In async manner by a background thread. This is to avoid delay of the DML with stats recalculation.

> Secondly, why it is done in middle of huge DDL?

It is done in an async manner. Once a DML changes too many rows a table if flagged as "needs stats recalc" and later at some point the background stats thread picks that table and recalculates its statistics, like a background/hidden ANALYZE TABLE. So, it can happen at any time, also in the middle of huge DDL, if triggered by some previous DML.

> Finally, can we somehow move this update in case of BF later?

But this shouldn't be necessary. Both operations should run concurrently and serialize where necessary. The background stats recalc is pretty much the same as the user running ANALYZE TABLE manually in another terminal. We can't ask the user not to run or delay his ANALYZE during BF.

> Does Galera replicate the these update from every updated node to other nodes and do this same update regardless of this replication?

I would guess so, but I am not sure.

Vasil (Inactive) added a comment - 2018-06-18 14:39 > Why these mysql.innodb_index_stats and mysql.innodb_table_stats updates are done in different trx compared to trx that causes the update? The stats recalculations are done in an async manner with the DML that crossed the threshold and made stats recalculation necessary. In async manner by a background thread. This is to avoid delay of the DML with stats recalculation. > Secondly, why it is done in middle of huge DDL? It is done in an async manner. Once a DML changes too many rows a table if flagged as "needs stats recalc" and later at some point the background stats thread picks that table and recalculates its statistics, like a background/hidden ANALYZE TABLE. So, it can happen at any time, also in the middle of huge DDL, if triggered by some previous DML. > Finally, can we somehow move this update in case of BF later? But this shouldn't be necessary. Both operations should run concurrently and serialize where necessary. The background stats recalc is pretty much the same as the user running ANALYZE TABLE manually in another terminal. We can't ask the user not to run or delay his ANALYZE during BF. > Does Galera replicate the these update from every updated node to other nodes and do this same update regardless of this replication? I would guess so, but I am not sure.

Jan Lindström (Inactive) added a comment - 2018-06-26 10:25

Dimov ok, can we at least somehow avoid after significant change done by BF DDL-transaction putting that table for background statistic calculation ? We maybe can't avoid the situation where that table is selected before BF transaction starts. In my understanding MDL should cause wait for ANALYZE table during other DDL. Same MDL or InnoDB row locks should avoid situation where some other transaction will do significant change to same table causing statistic recalc during BF DDL or that transaction would be selected as victim.

Jan Lindström (Inactive) added a comment - 2018-06-26 10:25 Dimov ok, can we at least somehow avoid after significant change done by BF DDL-transaction putting that table for background statistic calculation ? We maybe can't avoid the situation where that table is selected before BF transaction starts. In my understanding MDL should cause wait for ANALYZE table during other DDL. Same MDL or InnoDB row locks should avoid situation where some other transaction will do significant change to same table causing statistic recalc during BF DDL or that transaction would be selected as victim.

Vasil (Inactive) added a comment - 2018-06-26 10:35

A table is queued for background stats recalc from `row_update_statistics_if_needed()` - maybe the logic can be fiddled to avoid the call to `dict_stats_recalc_pool_add()`?

There is also `dict_stats_recalc_pool_del()` which can be used to remove a table from the queue of already present.

Vasil (Inactive) added a comment - 2018-06-26 10:35 A table is queued for background stats recalc from `row_update_statistics_if_needed()` - maybe the logic can be fiddled to avoid the call to `dict_stats_recalc_pool_add()`? There is also `dict_stats_recalc_pool_del()` which can be used to remove a table from the queue of already present.

Jan Lindström (Inactive) made changes - 2018-06-26 11:00

Affects Version/s

10.2.12 [ 22810 ]

Jan Lindström (Inactive) made changes - 2018-06-26 11:01

Affects Version/s

10.2 [ 14601 ]

Jan Lindström (Inactive) made changes - 2018-06-26 11:01

Affects Version/s

10.1 [ 16100 ]

Jan Lindström (Inactive) made changes - 2018-06-26 11:01

Fix Version/s

10.1 [ 16100 ]

Jan Lindström (Inactive) made changes - 2018-06-26 11:27

Assignee

Vasil [ dimov ]

Jan Lindström [ jplindst ]

Jan Lindström (Inactive) made changes - 2018-06-26 11:27

Status

Open [ 1 ]

In Progress [ 3 ]

Jan Lindström (Inactive) made changes - 2018-06-28 10:08

Attachment

galera.diff [ 45822 ]

Jan Lindström (Inactive) added a comment - 2018-06-28 10:09

Dimov I do not have yet fully working test case, I can get DDL to BF state but do not know yet how to get BF lock long wait. Can you see the attached galera.diff and comment my "try" ?

Jan Lindström (Inactive) added a comment - 2018-06-28 10:09 Dimov I do not have yet fully working test case, I can get DDL to BF state but do not know yet how to get BF lock long wait. Can you see the attached galera.diff and comment my "try" ?

Vasil (Inactive) added a comment - 2018-06-28 12:59 - edited

Just for the records, a copy from IM:

I do not see a problem in your patch. If it fixes the problem, then it should be ok. Some minor things: make sure that the checks will not crash due to trx or trx->mysql_thd being NULL (I am not sure if that is possible). Also in the test - you do not need galera_bf_background_statistics.opt with --innodb_stats_persistent=ON inside, it should be the default and you already check @@innodb_stats_persistent at the beginning of the test. In "--eval SELECT $wsrep_local_bf_aborts_after - $wsrep_local_bf_aborts_before = 1 AS ..." I would remove the " = 1" part because it would mask any failure as 0 instead of providing the actual difference for analysis. E.g. if you just do select new - old as ... and the .result file contains 1 and anything else is returned, like 0 or 2 or 3 or 5 you will actually see that value in the failure message from mtr.

Vasil (Inactive) added a comment - 2018-06-28 12:59 - edited Just for the records, a copy from IM: I do not see a problem in your patch. If it fixes the problem, then it should be ok. Some minor things: make sure that the checks will not crash due to trx or trx->mysql_thd being NULL (I am not sure if that is possible). Also in the test - you do not need galera_bf_background_statistics.opt with --innodb_stats_persistent=ON inside, it should be the default and you already check @@innodb_stats_persistent at the beginning of the test. In "--eval SELECT $wsrep_local_bf_aborts_after - $wsrep_local_bf_aborts_before = 1 AS ..." I would remove the " = 1" part because it would mask any failure as 0 instead of providing the actual difference for analysis. E.g. if you just do select new - old as ... and the .result file contains 1 and anything else is returned, like 0 or 2 or 3 or 5 you will actually see that value in the failure message from mtr.

Marko Mäkelä added a comment - 2018-07-05 09:47

Dimov, I believe that in MariaDB’s mysql-test-run, InnoDB persistent statistics are disabled by default.

Marko Mäkelä added a comment - 2018-07-05 09:47 Dimov , I believe that in MariaDB’s mysql-test-run , InnoDB persistent statistics are disabled by default.

Vasil (Inactive) added a comment - 2018-07-05 10:23

Ok, then of course, the `.opt` file is necessary.

Vasil (Inactive) added a comment - 2018-07-05 10:23 Ok, then of course, the `.opt` file is necessary.

Zdravelina Sokolovska (Inactive) made changes - 2018-07-05 11:20

Affects Version/s

10.3 [ 22126 ]

Jan Lindström (Inactive) made changes - 2018-07-25 05:25

issue.field.resolutiondate

2018-07-25 05:25:02.0

2018-07-25 05:25:02.416

Jan Lindström (Inactive) made changes - 2018-07-25 05:25

Component/s	wsrep [ 11500 ]
Fix Version/s		10.1.35 [ 23116 ]
Fix Version/s		10.2.17 [ 23111 ]
Fix Version/s		10.3.9 [ 23114 ]
Fix Version/s	10.2 [ 14601 ]
Fix Version/s	10.1 [ 16100 ]
Resolution		Fixed [ 1 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2018-07-25 10:16

Link

This issue is duplicated by ~~MDEV-15154~~ [ ~~MDEV-15154~~ ]

Chow King Tak added a comment - 2018-11-22 02:23

I am using version 10.3.9 but I just encountered the BF error. It blocked all transactions and I needed to restart the node. Is there any configuration in the my.cnf I have to adjust to avoid the error?

The following are some of the error messages:

2018-11-22 8:53:28 0 [Note] InnoDB: WSREP: BF lock wait long for trx:0x2129df query: DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP'Â¢YÃµ[^S^B
TRANSACTION 2173407, ACTIVE 42084 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 2, OS thread handle 140537284146944, query id 5316319 Delete_rows_log_event::find_row(278043)
DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP'
2018-11-22 8:53:29 0 [Note] InnoDB: WSREP: BF lock wait long for trx:0x2129df query: DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP'Â¢YÃµ[^S^B
TRANSACTION 2173407, ACTIVE 42085 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 2, OS thread handle 140537284146944, query id 5316319 Delete_rows_log_event::find_row(278043)
DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP'

=====================================
2018-11-22 08:53:29 0x7fd0f9431700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 16 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 8323 srv_active, 0 srv_shutdown, 59026 srv_idle
srv_master_thread log flush and writes: 67349
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 212583
OS WAIT ARRAY INFO: signal count 659238
RW-shared spins 0, rounds 1190209, OS waits 78610
RW-excl spins 0, rounds 10315112, OS waits 63310
RW-sx spins 213949, rounds 3318375, OS waits 31377
Spin rounds per wait: 1190209.00 RW-shared, 10315112.00 RW-excl, 15.51 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 2173554
Purge done for trx's n:o < 2173504 undo n:o < 0 state: running but idle
History list length 3
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 422011681601128, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 422011681596232, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 2173426, ACTIVE 41969 sec
4 lock struct(s), heap size 1136, 129 row lock(s), undo log entries 258
MySQL thread id 17, OS thread handle 140537170228992, query id 5333660 kpcp-portalapp1.egis.hksarg 10.88.223.9 e3logexp10 Init
commit
TABLE LOCK table `pe3logex1`.`db_instance_restart_time_to_portal` trx id 2173426 lock mode IX
RECORD LOCKS space id 22 page no 3 n bits 272 index PRIMARY of table `pe3logex1`.`db_instance_restart_time_to_portal` trx id 2173426 lock_mode X locks rec but not gap
RECORD LOCKS space id 22 page no 4 n bits 256 index PRIMARY of table `pe3logex1`.`db_instance_restart_time_to_portal` trx id 2173426 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 12; compact format; info bits 32
0: len 8; hex 800000000001590c; asc Y ;;
1: len 6; hex 0000002129f2; asc !) ;;
2: len 7; hex 00000008230110; asc # ;;
3: len 5; hex 4353464953; asc CSFIS;;
4: len 2; hex 4b50; asc KP;;
5: len 10; hex 6373666973696e747031; asc csfisintp1;;
6: len 11; hex 703063736669736d646231; asc p0csfismdb1;;
7: len 2; hex 4b50; asc KP;;
8: len 5; hex 99a05b26c2; asc [& ;;
9: len 3; hex 52554e; asc RUN;;
10: len 5; hex 99a16b52c2; asc kR ;;
11: len 5; hex 99a16b5300; asc kS ;;
Record lock, heap no 3 PHYSICAL RECORD: n_fields 12; compact format; info bits 32
0: len 8; hex 800000000001590e; asc Y ;;
1: len 6; hex 0000002129f2; asc !) ;;
2: len 7; hex 0000000823013a; asc # :;;
3: len 4; hex 45504159; asc EPAY;;
4: len 2; hex 4b50; asc KP;;
5: len 9; hex 65706179696e747031; asc epayintp1;;
6: len 10; hex 7031657061796d646231; asc p1epaymdb1;;
7: len 2; hex 4b50; asc KP;;
8: len 5; hex 99a0480ade; asc H ;;
9: len 3; hex 52554e; asc RUN;;
10: len 5; hex 99a16b52c2; asc kR ;;
11: len 5; hex 99a16b5300; asc kS ;;

Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 32
0: len 8; hex 8000000000015910; asc Y ;;
1: len 6; hex 0000002129f2; asc !) ;;
2: len 7; hex 00000008230164; asc # d;;
3: len 5; hex 4f47455053; asc OGEPS;;
4: len 2; hex 4b50; asc KP;;
5: len 10; hex 6f67657073696e747031; asc ogepsintp1;;
6: len 12; hex 70306f676570736d64627332; asc p0ogepsmdbs2;;
7: len 2; hex 5350; asc SP;;
8: len 5; hex 99a04e0aed; asc N ;;
9: len 3; hex 52554e; asc RUN;;
10: len 5; hex 99a16b52c2; asc kR ;;
11: len 5; hex 99a16b5300; asc kS ;;

Chow King Tak added a comment - 2018-11-22 02:23 I am using version 10.3.9 but I just encountered the BF error. It blocked all transactions and I needed to restart the node. Is there any configuration in the my.cnf I have to adjust to avoid the error? The following are some of the error messages: 2018-11-22 8:53:28 0 [Note] InnoDB: WSREP: BF lock wait long for trx:0x2129df query: DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP'Â¢YÃµ[^S^B TRANSACTION 2173407, ACTIVE 42084 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s) MySQL thread id 2, OS thread handle 140537284146944, query id 5316319 Delete_rows_log_event::find_row(278043) DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP' 2018-11-22 8:53:29 0 [Note] InnoDB: WSREP: BF lock wait long for trx:0x2129df query: DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP'Â¢YÃµ[^S^B TRANSACTION 2173407, ACTIVE 42085 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s) MySQL thread id 2, OS thread handle 140537284146944, query id 5316319 Delete_rows_log_event::find_row(278043) DELETE FROM pe3logex1.tmp_app_to_db_status_report WHERE site='SP' ===================================== 2018-11-22 08:53:29 0x7fd0f9431700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 16 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 8323 srv_active, 0 srv_shutdown, 59026 srv_idle srv_master_thread log flush and writes: 67349 ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 212583 OS WAIT ARRAY INFO: signal count 659238 RW-shared spins 0, rounds 1190209, OS waits 78610 RW-excl spins 0, rounds 10315112, OS waits 63310 RW-sx spins 213949, rounds 3318375, OS waits 31377 Spin rounds per wait: 1190209.00 RW-shared, 10315112.00 RW-excl, 15.51 RW-sx ------------ TRANSACTIONS ------------ Trx id counter 2173554 Purge done for trx's n:o < 2173504 undo n:o < 0 state: running but idle History list length 3 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 422011681601128, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---TRANSACTION 422011681596232, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---TRANSACTION 2173426, ACTIVE 41969 sec 4 lock struct(s), heap size 1136, 129 row lock(s), undo log entries 258 MySQL thread id 17, OS thread handle 140537170228992, query id 5333660 kpcp-portalapp1.egis.hksarg 10.88.223.9 e3logexp10 Init commit TABLE LOCK table `pe3logex1`.`db_instance_restart_time_to_portal` trx id 2173426 lock mode IX RECORD LOCKS space id 22 page no 3 n bits 272 index PRIMARY of table `pe3logex1`.`db_instance_restart_time_to_portal` trx id 2173426 lock_mode X locks rec but not gap RECORD LOCKS space id 22 page no 4 n bits 256 index PRIMARY of table `pe3logex1`.`db_instance_restart_time_to_portal` trx id 2173426 lock_mode X locks rec but not gap Record lock, heap no 2 PHYSICAL RECORD: n_fields 12; compact format; info bits 32 0: len 8; hex 800000000001590c; asc Y ;; 1: len 6; hex 0000002129f2; asc !) ;; 2: len 7; hex 00000008230110; asc # ;; 3: len 5; hex 4353464953; asc CSFIS;; 4: len 2; hex 4b50; asc KP;; 5: len 10; hex 6373666973696e747031; asc csfisintp1;; 6: len 11; hex 703063736669736d646231; asc p0csfismdb1;; 7: len 2; hex 4b50; asc KP;; 8: len 5; hex 99a05b26c2; asc [& ;; 9: len 3; hex 52554e; asc RUN;; 10: len 5; hex 99a16b52c2; asc kR ;; 11: len 5; hex 99a16b5300; asc kS ;; Record lock, heap no 3 PHYSICAL RECORD: n_fields 12; compact format; info bits 32 0: len 8; hex 800000000001590e; asc Y ;; 1: len 6; hex 0000002129f2; asc !) ;; 2: len 7; hex 0000000823013a; asc # :;; 3: len 4; hex 45504159; asc EPAY;; 4: len 2; hex 4b50; asc KP;; 5: len 9; hex 65706179696e747031; asc epayintp1;; 6: len 10; hex 7031657061796d646231; asc p1epaymdb1;; 7: len 2; hex 4b50; asc KP;; 8: len 5; hex 99a0480ade; asc H ;; 9: len 3; hex 52554e; asc RUN;; 10: len 5; hex 99a16b52c2; asc kR ;; 11: len 5; hex 99a16b5300; asc kS ;; Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 32 0: len 8; hex 8000000000015910; asc Y ;; 1: len 6; hex 0000002129f2; asc !) ;; 2: len 7; hex 00000008230164; asc # d;; 3: len 5; hex 4f47455053; asc OGEPS;; 4: len 2; hex 4b50; asc KP;; 5: len 10; hex 6f67657073696e747031; asc ogepsintp1;; 6: len 12; hex 70306f676570736d64627332; asc p0ogepsmdbs2;; 7: len 2; hex 5350; asc SP;; 8: len 5; hex 99a04e0aed; asc N ;; 9: len 3; hex 52554e; asc RUN;; 10: len 5; hex 99a16b52c2; asc kR ;; 11: len 5; hex 99a16b5300; asc kS ;;

Sergei Golubchik made changes - 2021-12-06 21:47

Workflow

MariaDB v3 [ 86446 ]

MariaDB v4 [ 154125 ]

peng gao added a comment - 2022-09-16 01:59

Hi All:

Like Chow King Tak, we also use mariadb 10.3.9 with gelera, one node stuck for ever.finally, we restart cluster,
errlog like ,

2022-09-14 9:33:43 59 [Note] InnoDB: WSREP: BF lock wait long for trx:0x12a95746f query: insert into *** () TRANSACTION 5009405039, ACTIVE 3706 sec inserting
mysql tables in use 1, locked 1
2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1
MySQL thread id 59, OS thread handle 139771672282880, query id 2495060706 Write_rows_log_event::write_row(842660726)
insert into *** ()

Here insert transaction stuck 3706s,state is inserting .

Have any fixed here ? Thanks..

peng gao added a comment - 2022-09-16 01:59 Hi All: Like Chow King Tak, we also use mariadb 10.3.9 with gelera, one node stuck for ever.finally, we restart cluster, errlog like , 2022-09-14 9:33:43 59 [Note] InnoDB: WSREP: BF lock wait long for trx:0x12a95746f query: insert into *** () TRANSACTION 5009405039, ACTIVE 3706 sec inserting mysql tables in use 1, locked 1 2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1 MySQL thread id 59, OS thread handle 139771672282880, query id 2495060706 Write_rows_log_event::write_row(842660726) insert into *** () Here insert transaction stuck 3706s,state is inserting . Have any fixed here ? Thanks..

Jira Automation (IT) made changes - 2024-07-04 07:39

Zendesk Related Tickets

132121 115763

People

Assignee:: Jan Lindström (Inactive)

Reporter:: Reinhard Sojka

Votes:: 1 Vote for this issue

Watchers:: 10 Start watching this issue

Dates

Created:: 2018-04-09 11:22

Updated:: 2024-07-08 00:05

Resolved:: 2018-07-25 05:25

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration