[MDEV-12837] WSREP: BF lock wait long Created: 2017-05-18  Updated: 2020-10-06  Resolved: 2017-12-07

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, wsrep
Affects Version/s: 10.2.5
Fix Version/s: 10.1.35, 10.2.17, 10.3.9

Type: Bug Priority: Critical
Reporter: Ranjan Ghosh Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 19
Labels: None
Environment:

Ubuntu 16.04.2 LTS (Xenial)


Attachments: File mysql_backtrace.log    
Issue Links:
PartOf
includes MDEV-14401 Stored procedure that declares a hand... Closed
Problem/Incident
is caused by MDEV-11039 Add new scheduling algorithm for redu... Closed
Relates
relates to MDEV-14793 Document innodb-lock-schedule-algorit... Closed
relates to MDEV-15747 WSREP: BF lock wait long for trx Closed
relates to MDEV-15154 WSREP: BF lock wait long after a TRUN... Closed
relates to MDEV-16664 InnoDB: Failing assertion: !other_loc... Closed

 Description   

Hi there,

Every few days our cluster stalls. You always know doomsday has arrived when this message appears in the log:

May 17 06:46:25 uhu1 mysqld[27227]: WSREP: BF lock wait long

This message then repeats many times, and other strange output eventually appears (see below). The cluster never recovers; we always need to restart all nodes. I find this especially surprising since there are various timeouts defined. If there is a deadlock, I wonder why it doesn't resolve automatically after some time by aborting with an error message. Googling for this message to find out whether it might be an application problem didn't yield any results either.

May 17 06:46:26 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:27 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:27 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:28 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:29 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:30 uhu1 mysqld[27227]: message repeated 6 times: [ WSREP: BF lock wait long]
May 17 06:46:31 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:32 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:33 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:34 uhu1 mysqld[27227]: message repeated 8 times: [ WSREP: BF lock wait long]
May 17 06:46:35 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:36 uhu1 mysqld[27227]: message repeated 2 times: [ WSREP: BF lock wait long]
May 17 06:46:37 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:38 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:39 uhu1 mysqld[27227]: message repeated 2 times: [ WSREP: BF lock wait long]
May 17 06:46:40 uhu1 mysqld[27227]: WSREP: BF lock wait long
May 17 06:46:40 uhu1 mysqld[27227]: =====================================
May 17 06:46:40 uhu1 mysqld[27227]: 2017-05-17 06:46:40 0x7f52c3cdd700 INNODB MONITOR OUTPUT
May 17 06:46:40 uhu1 mysqld[27227]: =====================================
May 17 06:46:40 uhu1 mysqld[27227]: Per second averages calculated from the last 16 seconds
May 17 06:46:40 uhu1 mysqld[27227]: -----------------
May 17 06:46:40 uhu1 mysqld[27227]: BACKGROUND THREAD
May 17 06:46:40 uhu1 mysqld[27227]: -----------------
May 17 06:46:40 uhu1 mysqld[27227]: srv_master_thread loops: 570080 srv_active, 0 srv_shutdown, 120059 srv_idle
May 17 06:46:40 uhu1 mysqld[27227]: srv_master_thread log flush and writes: 690137
May 17 06:46:40 uhu1 mysqld[27227]: ----------
May 17 06:46:40 uhu1 mysqld[27227]: SEMAPHORES
May 17 06:46:40 uhu1 mysqld[27227]: ----------
May 17 06:46:40 uhu1 mysqld[27227]: OS WAIT ARRAY INFO: reservation count 2567993
May 17 06:46:40 uhu1 mysqld[27227]: OS WAIT ARRAY INFO: signal count 1741581
May 17 06:46:40 uhu1 mysqld[27227]: RW-shared spins 0, rounds 1276685, OS waits 365942
May 17 06:46:40 uhu1 mysqld[27227]: RW-excl spins 0, rounds 16000314, OS waits 198061
May 17 06:46:40 uhu1 mysqld[27227]: RW-sx spins 210079, rounds 4201641, OS waits 50893
May 17 06:46:40 uhu1 mysqld[27227]: Spin rounds per wait: 1276685.00 RW-shared, 16000314.00 RW-excl, 20.00 RW-sx
May 17 06:46:40 uhu1 mysqld[27227]: ------------------------
May 17 06:46:40 uhu1 mysqld[27227]: LATEST DETECTED DEADLOCK
May 17 06:46:40 uhu1 mysqld[27227]: ------------------------
May 17 06:46:40 uhu1 mysqld[27227]: 2017-05-17 03:24:35 0x7f52add94700
May 17 06:46:40 uhu1 mysqld[27227]: *** (1) TRANSACTION:
May 17 06:46:40 uhu1 mysqld[27227]: TRANSACTION 295857376, ACTIVE 0 sec inserting
May 17 06:46:40 uhu1 mysqld[27227]: mysql tables in use 1, locked 1
May 17 06:46:40 uhu1 mysqld[27227]: LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
May 17 06:46:40 uhu1 mysqld[27227]: MySQL thread id 1704966, OS thread handle 0x7f52ae768700, query id 400303172 localhost 127.0.0.1 eposservicescom update
May 17 06:46:40 uhu1 mysqld[27227]: INSERT INTO semaphore (name, value, expire) VALUES ('state:Drupal\\Core\\Cache\\CacheCollector', '1193996601591ba65393eb33.09632175', '1494984305.6058')
May 17 06:46:40 uhu1 mysqld[27227]: *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
May 17 06:46:40 uhu1 mysqld[27227]: RECORD LOCKS space id 4205 page no 3 n bits 72 index PRIMARY of table "eposservicescom"."semaphore" trx id 295857376 lock_mode X locks rec but not gap waiting
May 17 06:46:40 uhu1 mysqld[27227]: Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
May 17 06:46:40 uhu1 mysqld[27227]:  0: len 30; hex 73746174653a44727570616c5c436f72655c43616368655c436163686543; asc state:Drupal\Core\Cache\CacheC; (total 38 bytes);
May 17 06:46:40 uhu1 mysqld[27227]:  1: len 6; hex 000011a26cb5; asc     l ;;
May 17 06:46:40 uhu1 mysqld[27227]:  2: len 7; hex 40000006a82965; asc @    )e;;
May 17 06:46:40 uhu1 mysqld[27227]:  3: len 30; hex 36353632373439313735393162613635316337623362302e313335333631; asc 656274917591ba651c7b3b0.135361; (total 32 bytes);
May 17 06:46:40 uhu1 mysqld[27227]:  4: len 8; hex 4850f49be946d641; asc HP   F A;;
May 17 06:46:40 uhu1 mysqld[27227]: *** (2) TRANSACTION:
May 17 06:46:40 uhu1 mysqld[27227]: TRANSACTION 295857375, ACTIVE 0 sec inserting
May 17 06:46:40 uhu1 mysqld[27227]: mysql tables in use 1, locked 1
May 17 06:46:40 uhu1 mysqld[27227]: 3 lock struct(s), heap size 1136, 2 row lock(s)
May 17 06:46:40 uhu1 mysqld[27227]: MySQL thread id 1704967, OS thread handle 0x7f52add94700, query id 400303173 localhost 127.0.0.1 eposservicescom update
May 17 06:46:40 uhu1 mysqld[27227]: INSERT INTO semaphore (name, value, expire) VALUES ('state:Drupal\\Core\\Cache\\CacheCollector', '1474565599591ba65393ebe9.44223691', '1494984305.6058')
May 17 06:46:40 uhu1 mysqld[27227]: *** (2) HOLDS THE LOCK(S):
May 17 06:46:40 uhu1 mysqld[27227]: RECORD LOCKS space id 4205 page no 3 n bits 72 index PRIMARY of table "eposservicescom"."semaphore" trx id 295857375 lock mode S locks rec but not gap
May 17 06:46:40 uhu1 mysqld[27227]: Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
May 17 06:46:40 uhu1 mysqld[27227]:  0: len 30; hex 73746174653a44727570616c5c436f72655c43616368655c436163686543; asc state:Drupal\Core\Cache\CacheC; (total 38 bytes);
May 17 06:46:40 uhu1 mysqld[27227]:  1: len 6; hex 000011a26cb5; asc     l ;;
May 17 06:46:40 uhu1 mysqld[27227]:  2: len 7; hex 40000006a82965; asc @    )e;;
May 17 06:46:40 uhu1 mysqld[27227]:  3: len 30; hex 36353632373439313735393162613635316337623362302e313335333631; asc 656274917591ba651c7b3b0.135361; (total 32 bytes);
May 17 06:46:40 uhu1 mysqld[27227]:  4: len 8; hex 4850f49be946d641; asc HP   F A;;
May 17 06:46:40 uhu1 mysqld[27227]: *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
May 17 06:46:40 uhu1 mysqld[27227]: RECORD LOCKS space id 4205 page no 3 n bits 72 index PRIMARY of table "eposservicescom"."semaphore" trx id 295857375 lock_mode X locks rec but not gap waiting
May 17 06:46:40 uhu1 mysqld[27227]: Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
May 17 06:46:40 uhu1 mysqld[27227]:  0: len 30; hex 73746174653a44727570616c5c436f72655c43616368655c436163686543; asc state:Drupal\Core\Cache\CacheC; (total 38 bytes);
May 17 06:46:40 uhu1 mysqld[27227]:  1: len 6; hex 000011a26cb5; asc     l ;;
May 17 06:46:40 uhu1 mysqld[27227]:  2: len 7; hex 40000006a82965; asc @    )e;;
May 17 06:46:40 uhu1 mysqld[27227]:  3: len 30; hex 36353632373439313735393162613635316337623362302e313335333631; asc 656274917591ba651c7b3b0.135361; (total 32 bytes);
May 17 06:46:40 uhu1 mysqld[27227]:  4: len 8; hex 4850f49be946d641; asc HP   F A;;
May 17 06:46:40 uhu1 mysqld[27227]: *** WE ROLL BACK TRANSACTION (2)



 Comments   
Comment by Michael Weber [ 2017-05-24 ]

Hello all.
Since the update from 10.1 to 10.2.6 yesterday, we have been facing the same issue: "WSREP: BF lock wait long".
The whole cluster is stuck after that; we can only bring it back by restarting the node that logs "WSREP: BF lock wait long".
Note: we are using Debian with kernel 3.16.39-1+deb8u2

May 23 17:39:00 he-rz22-sql01 mysqld[14930]: WSREP: BF lock wait long
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: WSREP: BF lock wait long
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: =====================================
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: 2017-05-23 17:39:01 0x7f7365ba3700 INNODB MONITOR OUTPUT
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: =====================================
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: Per second averages calculated from the last 16 seconds
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: -----------------
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: BACKGROUND THREAD
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: -----------------
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: srv_master_thread loops: 632 srv_active, 0 srv_shutdown, 788 srv_idle
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: srv_master_thread log flush and writes: 1420
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ----------
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: SEMAPHORES
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ----------
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: OS WAIT ARRAY INFO: reservation count 11612
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: OS WAIT ARRAY INFO: signal count 7425
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: RW-shared spins 0, rounds 1553, OS waits 655
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: RW-excl spins 0, rounds 4065, OS waits 84
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: RW-sx spins 392, rounds 3750, OS waits 34
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: Spin rounds per wait: 1553.00 RW-shared, 4065.00 RW-excl, 9.57 RW-sx
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ------------
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: TRANSACTIONS
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ------------
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: Trx id counter 246763323
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: Purge done for trx's n:o < 246762955 undo n:o < 0 state: running but idle
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: History list length 35
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: LIST OF TRANSACTIONS FOR EACH SESSION:
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ---TRANSACTION 421620296599400, not started
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: 0 lock struct(s), heap size 1136, 0 row lock(s)
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ---TRANSACTION 246763322, ACTIVE 3 sec starting index read
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: mysql tables in use 1, locked 1
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: MySQL thread id 2113, OS thread handle 0x7f735eac8700, query id 47052 172.16.0.12 USER updating
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: UPDATE `devices` SET `updated_at`=1495553938, `last_contact`=1495553938 WHERE `id`=145
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: ------- TRX HAS BEEN WAITING 3 SEC FOR THIS LOCK TO BE GRANTED:
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: RECORD LOCKS space id 851 page no 5 n bits 128 index PRIMARY of table `DATABASE`.`devices` trx id 246763322 lock_mode X locks rec but not gap waiting
May 23 17:39:01 he-rz22-sql01 mysqld[14930]: Record lock, heap no 18 PHYSICAL RECORD: n_fields 54; compact format; info bits 0

Comment by Ranjan Ghosh [ 2017-06-08 ]

Does anybody at least have an idea what causes this? Our cluster sometimes stalls multiple times a day and needs to be restarted, which is very, very ugly. I can't say for sure, but I have the feeling this bug was only introduced in recent versions (10.2.3, 10.2.4 ... ?). If only I knew a workaround or some setting that made this error at least less likely to appear. Right now, it's not really usable. I'm even considering turning off Galera and letting MariaDB run as a single node for the time being. My current config, in case that's of any help:

[mysqld]
 
# Enough buffer for data, indexes etc
innodb_buffer_pool_size=4G
 
# Do not wait too long for locks - connections might build up (Especially important for Drupal semaphore table)
innodb_lock_wait_timeout=10
 
#Recommended by Drupal
max_allowed_packet=512M
 
#Try to prevent crashes
skip-log-bin
 
# Purge faster :-)
innodb_purge_threads = 16
 
# Replicate faster :-)
wsrep_slave_threads = 32
 
# Reduce dead-locks
transaction-isolation = READ-COMMITTED
 
#Query Cache for faster performance
query_cache_type = 1
query_cache_size = 128M # Should not be too large to avoid cache contention
 
#mysql settings mandatory for galera
binlog_format=ROW
binlog_row_image=MINIMAL
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
innodb_flush_log_at_trx_commit=0
 
# Accept up to 300 incoming connections but use the thread pool to prevent too many simultaneous connections actually working at the same time
max_connections=300
thread-handling=pool-of-threads
thread_pool_size=32
extra-port=3307
extra-max-connections=1
 
#galera settings
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="uhu_cluster"
wsrep_cluster_address="gcomm://node1,node2"
wsrep_node_address=192.168.0.1
wsrep_provider_options='gcache.size=2G'   
 
enforce_storage_engine=InnoDB
sql_mode=NO_AUTO_CREATE_USER

Comment by Michael Weber [ 2017-06-09 ]

Bad News: we still got this issue.
For the moment I have configured the load balancer to send requests to only one node. That seems to work for a while; we have had no crash for three days.
Waiting for an update/patch...

Comment by Ranjan Ghosh [ 2017-06-16 ]

Unfortunately, even if I send requests only to 1 node, our cluster still crashes. Currently, I've got only 1 single node running. This defeats the whole purpose why we are using MariaDB/Galera in the first place, though. It is, however, stable.

When researching this bug, I find many different anecdotal reports over the years:

None with a real solution or any explanation of why (i.e. under which circumstances) this happens. The error seems very random to me; sometimes it happens at night when there is hardly any load on the server at all. All reports have in common that the database never seems to recover by itself, and in all reported cases a restart seemed to be required. Therefore, at the very least, I think the database should just shut down (signal 11) when this error happens. That would be far better than the current situation, where the database happily accepts connections which then just hang/freeze forever. I think this bug is particularly evil because of exactly that property: we use Galera specifically to have automatic failover if something goes wrong. This bug stalls the database, and because it isn't properly shut down, the second node cannot take over and hangs as well.

Comment by Daniel Black [ 2017-06-18 ]

Please try to get a backtrace

Also what galera version are you running? Have you tried 25.3.25?

edit: camera -> galera (groan - phone autospell)

Comment by Ranjan Ghosh [ 2017-06-19 ]

You meant Galera version, I assume? Well, I use the one that comes with the OS (Ubuntu) AFAICT:

~ # dpkg -l | grep galera
ii  galera-3                             25.3.19-xenial                       amd64        Replication framework for transactional applications

I've also added the MariaDB PPAs but the most recent version I can find there is 25.3.20. How can I upgrade to a newer version? Is there a separate PPA?

Furthermore, I understand why you'd like to have a backtrace, although it's obviously very difficult to get in this case. I'd first need to enable the second node again. Then I'd need to compile a separate MariaDB version. Then wait until it hangs; it doesn't crash, it only hangs. Sometimes this happens only once every 1-2 days, sometimes in the very early morning hours. Then I would need to send it a "kill -6", if I understand correctly, and grab the backtrace. On a live server with important websites on it, that is not an easy feat. I'll need some time to make that possible. In the meantime, any insights into why this might happen would be much appreciated. Right now I'm totally in the dark about what to avoid or watch out for...

Comment by Ranjan Ghosh [ 2017-06-27 ]

OK, I wrote a small bash script as an intermediate step (see below; systemd required). Perhaps this is also useful to others. I will add it to the crontab to run every minute. It watches the log and shuts the node down when the problem occurs. This hopefully makes it possible again to run with two nodes unsupervised, because the system should recover automatically. Now, the next step: compile MariaDB with debug symbols. Let's see how this works out...

#!/bin/bash
 
# Don't do anything if MariaDB isn't running or is already in the process of shutting down
if ! (systemctl status mariadb | grep --quiet 'active (running)'); then
        echo "MariaDB not running";
        exit;
fi
 
# Get the start date/time of MariaDB and the date/time of 10 minutes ago
ts_start=$(systemctl show -p ExecMainStartTimestamp mariadb | cut -d "=" -f 2 | awk '{print $2 " " $3}')
ts_ago=$(date --date="10 minutes ago" +"%Y-%m-%d %H:%M:%S")
 
# Use whatever is greater as a log start date (i.e: Get the logs only for the current run but only the most recent 10 minutes)
if [[ "$ts_start" > "$ts_ago" ]]; then
        ts=$ts_start;
else
        ts=$ts_ago;
fi
 
# If the log contains "BF lock wait long", the database is frozen and needs to be shut down.
if (journalctl -u mariadb --since "$ts" | grep --quiet "BF lock wait long"); then
        echo "BF lock wait detected! Shutting down MariaDB.";
        systemctl --no-block stop mariadb
fi
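To run the script above unsupervised every minute, as described, a crontab entry is all that is needed. A minimal sketch; the script path and log file here are assumptions, not taken from the thread:

```shell
# Hypothetical crontab entry: run the watchdog every minute and keep its
# output for later inspection (adjust paths to your own setup).
* * * * * /usr/local/bin/bf-lock-watchdog.sh >> /var/log/bf-lock-watchdog.log 2>&1
```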

Comment by Ranjan Ghosh [ 2017-06-27 ]

@danblack: I hope it will be sufficient to build MariaDB with just the additional debug symbols to get the backtrace, right? I won't need the ~30% slower full-debug version? Obviously I'd like to avoid that, as it is a production server. I'm referring to this page:

https://mariadb.com/kb/en/mariadb/compiling-mariadb-for-debugging/

Comment by Daniel Black [ 2017-06-27 ]

There are debug symbol packages for the existing install in the MariaDB repos that should be installable, e.g.:
http://mirror.aarnet.edu.au/pub/MariaDB/repo/10.2/ubuntu/pool/main/m/mariadb-10.2/mariadb-server-core-10.2-dbgsym_10.2.6+maria~xenial_amd64.deb

Don't forget galera syms too:
http://mirror.aarnet.edu.au/pub/MariaDB/repo/10.2/ubuntu/pool/main/g/galera-3/galera-3-dbg_25.3.20-xenial_amd64.deb

Otherwise use -DCMAKE_BUILD_TYPE=RelWithDebInfo and it won't be any slower.

Nice script. An alternative way:

journalctl -u mariadb -f | while read -r line
do
  if [[ $line =~ "BF lock wait long" ]]; then
    # Same action as in the script above
    systemctl --no-block stop mariadb
  fi
done

Comment by Ranjan Ghosh [ 2017-07-05 ]

OK, I've now installed the dbgsym packages. I wondered whether I needed to restart the DB to actually "enable" the symbols; I did it anyway, just to be sure.
D'oh, your script is of course way shorter than mine. Thanks for sharing. I've added the line to create the backtrace, which I got from here:

https://mariadb.com/kb/en/mariadb/how-to-produce-a-full-stack-trace-for-mysqld/#getting-a-stack-trace-for-a-running-mysqld-server

Although I think the docs were a bit wrong/misleading (they didn't work for me): "program" should really be "-p". I now have this line to get the backtrace:

gdb --batch --eval-command="thread apply all bt" -p $(pgrep -xn mysqld) > /root/mysql_backtrace.log

When I ran it, it created a backtrace most of the time. Sometimes, though, it just hung at the end and I had to kill gdb; perhaps I was impatient. I hope this won't happen during the BF lock wait.

OK, now that everything is set up, I guess we'll just have to wait until the rare striped tiger runs in front of the camera :-/
Thanks again, @danblack, for all the helpful info. It's much appreciated.
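The detection logic from the watchdog script and the gdb command above can be combined so that the backtrace is captured automatically before the node is stopped. A minimal sketch; the function names are our own invention, and only the grep pattern, the gdb invocation, and the systemctl command come from this thread:

```shell
#!/bin/bash
# bf_lock_detected: reads log text on stdin and succeeds if the hang
# marker from this bug report is present.
bf_lock_detected() {
  grep -q "WSREP: BF lock wait long"
}

# dump_backtrace: attach gdb to the newest mysqld process and dump all
# thread stacks, as suggested earlier in the thread.
dump_backtrace() {
  gdb --batch --eval-command="thread apply all bt" \
      -p "$(pgrep -xn mysqld)" > /root/mysql_backtrace.log 2>&1
}

# In the real watchdog the input would come from:
#   journalctl -u mariadb --since "$ts"
# Here we feed a sample line to demonstrate the trigger.
if printf '%s\n' "May 17 06:46:25 uhu1 mysqld[27227]: WSREP: BF lock wait long" | bf_lock_detected; then
  echo "BF lock wait detected"
  # dump_backtrace                     # disabled here: needs a live mysqld
  # systemctl --no-block stop mariadb  # disabled here: would stop the service
fi
```

The ordering matters: the backtrace has to be taken while the server is still hung, i.e. before `systemctl stop`.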

Comment by Ranjan Ghosh [ 2017-07-08 ]

Ha! Finally the crash happened. The script worked perfectly. The DB was shut down. The backtrace was automatically created. Here it is:

mysql_backtrace.log

Comment by Ralf Fischer [ 2017-07-13 ]

Exactly the same here: I just upgraded a 3-node cluster from 10.1.25 to 10.2.7. After 4 minutes I got "WSREP: BF lock wait long", and the cluster hangs for all updates/inserts until the hanging node is restarted. After less than 5 minutes I got the same message again. Any hints on downgrading back to 10.1?

Debian 7.1,
Linux mariadb-node1 3.2.0-4-amd64 #1 SMP Debian 3.2.81-2 x86_64 GNU/Linux

Thanks in advance

Comment by Tomislav Mihaldinec [ 2017-07-15 ]

Same issue here, but on CentOS 7.3.

Linux taskdb02 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
KVM

Jul 14 14:50:22 vmtaskdb03 mysqld: mysql tables in use 1, locked 1
Jul 14 14:50:22 vmtaskdb03 mysqld: 5 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
Jul 14 14:50:22 vmtaskdb03 mysqld: MySQL thread id 2027, OS thread handle 0x7f5196e33700, query id 30214 10.232.10.31 taskitud_ianis query end
Jul 14 14:50:22 vmtaskdb03 mysqld: INSERT INTO `mw_campaign_track_open` (`campaign_id`, `subscriber_id`, `ip_address`, `user_agent`, `date_added`) VALUES (6351, 2314769, '90.9.29.141', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0', NOW())
Jul 14 14:50:22 vmtaskdb03 mysqld: TABLE LOCK table `taskitud_database`.`mw_campaign_track_open` trx id 16643769 lock mode IX
Jul 14 14:50:22 vmtaskdb03 mysqld: TABLE LOCK table `taskitud_database`.`mw_campaign` trx id 16643769 lock mode IS
Jul 14 14:50:22 vmtaskdb03 mysqld: RECORD LOCKS space id 8 page no 76 n bits 120 index PRIMARY of table `taskitud_database`.`mw_campaign` trx id 16643769 lock mode S locks rec but not gap
Jul 14 14:50:45 vmtaskdb03 journal: Suppressed 5655 messages from /system.slice/mariadb.service
Jul 14 14:50:45 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:46 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:47 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:48 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:49 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:50 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:51 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:52 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:53 vmtaskdb03 mysqld: WSREP: BF lock wait long
Jul 14 14:50:54 vmtaskdb03 mysqld: WSREP: BF lock wait long

thanks

Comment by Chris Anderson [ 2017-07-28 ]

We have also received this error running on RedHat 7.3, MariaDB-server-10.2.7-1.el7, and galera-25.3.20-1.rhel7. The error only occurred on a system with high load and did not occur in the testing environments that have a much lower load. Does this bug affect all of 10.2? Does this bug affect any 10.1.x release?

Comment by Ranjan Ghosh [ 2017-07-28 ]

Well, I'm only a poor user - no MariaDB guru, but what I can say is this:

1) There are some anecdotal reports of people upgrading from 10.1.x to 10.2.x who suddenly encountered this bug. This suggests that the error probably didn't happen on 10.1.

2) I myself never used 10.1 with Galera, so I can't tell for sure. But I've been using Galera with many earlier 10.2.x versions and have always had this bug, as far as I can remember. This suggests that all of 10.2 is probably affected, or at least that it won't help to downgrade from, say, 10.2.7 to 10.2.5 as a workaround.

3) As stated in one of my previous comments, when researching this message on the web, it does indeed show up in many varieties/versions of MySQL over the years. This suggests that the message might only be a symptom that has had multiple causes. I assume, though, that the current influx of error reports all have the same underlying reason.

Up until now, I don't have the slightest idea why it happens. The word "lock" in the error message, however, suggests that it has something to do with locking problems. That is, the more InnoDB locking is going on, the more conflicts/deadlocks occur, and the more likely the error is to show up. This is perhaps why you're experiencing the bug under high load. Just wildly guessing here, of course. On our server, the bug happens at arbitrary times, sometimes in the middle of the night, and not necessarily under exceptionally high load.

In a previous comment, I posted my MySQL configuration. I'd like to know if someone sees something striking when comparing it to their config. Some correlation, anything.

I find it strange that there seems to be so little concern about this bug while the current MariaDB version is even considered "stable". For us, it renders the whole Galera feature totally useless; it's a huge problem. We are currently running without proper failover. The reason for this calm is probably that only very few people are affected, even though I assume there is a sizable population out there using a current MariaDB version with Galera, especially now that it's out of beta. This leads to the question: what are they doing differently than us? I'd really like to find out, because there doesn't seem to be much progress on this otherwise.

Comment by Chris Anderson [ 2017-07-28 ]

Thanks, Ranjan, for your response. We also upgraded from an older version of 10.1.x. We were trying to fix another bug and then ran into this much bigger bug; sorry, I didn't include that info. We are assessing a downgrade and want to stay above 10.1.23 if possible, but we would like to know beforehand whether the bug exists there, instead of going through all the work of a downgrade and then being surprised by the same bug we are trying to get away from. Hopefully we can get a response from MariaDB to confirm this.

Comment by Ralf Fischer [ 2017-07-28 ]

We used a lot of 10.1 versions up to 10.1.25 in a three node cluster and never had that error. After upgrading to 10.1.7 we got that error every 5-15 minutes.

Comment by Chris Anderson [ 2017-07-28 ]

Thanks Ralf, that is very helpful

Comment by Chris Anderson [ 2017-07-28 ]

Also, Ralf, just to clarify: you got this bug after you upgraded to 10.2.7, correct? You said 10.1.7, and it may just be a typo.

Comment by Ralf Fischer [ 2017-07-28 ]

Yes, sorry. Upgraded from 10.1.25 to 10.2.7...

Comment by Michael Weber [ 2017-07-28 ]

Note: we used several versions of MariaDB within the 10.1 major release; that issue never occurred.
After upgrading to 10.2 we are facing this issue.
So I think a downgrade to 10.1 can be a workaround.

Comment by Daniel Hawton [ 2017-08-03 ]

I too am getting this issue, on fresh installs of 10.2 with no upgrade.

Comment by Vladimir Titov [ 2017-08-16 ]

Got the same issue with a fresh install of mariadb-server 10.2.7 from the MariaDB repo. The server OS is Ubuntu 16.04.
Some related logs:

Aug 15 22:38:22 mariadb mysqld[13566]: WSREP: BF lock wait long
Aug 15 22:38:27 mariadb mysqld[13566]: message repeated 5 times: [ WSREP: BF lock wait long]
Aug 15 22:38:27 mariadb mysqld[13566]: =====================================
Aug 15 22:38:27 mariadb mysqld[13566]: 2017-08-15 22:38:27 0x7fbfdb84d700 INNODB MONITOR OUTPUT
Aug 15 22:38:27 mariadb mysqld[13566]: =====================================

Basically the same log as @Michael Weber posted.

Comment by Nathan Bohman [ 2017-08-22 ]

Could this perhaps be related to MDEV-13331? I am experiencing the same issue and noticed that MDEV-13331 was introduced in 10.2.2 and fixed in 10.2.8. I find that I hit the bug when the application tied to our DB (JIRA) attempts to make a ton of updates to the same row at the same time, resulting in a row lock. It would make sense if UPDATE got a similar patch as DROP from upstream and was ignoring innodb_lock_wait_timeout.

Comment by Ranjan Ghosh [ 2017-08-22 ]

Very interesting idea. It makes a whole lot of sense to me: there is a DELETE CASCADE; it locks; it never times out due to the bug; the lock is never released; more and more stuff that's supposed to be replicated piles up behind the lock. Boom!

Unfortunately I cannot upgrade our cluster to 10.2.8 at the moment (perhaps in the forthcoming weeks). It would be really helpful, though, if someone else could verify this earlier, especially those whose cluster shuts down after a few minutes. With our cluster, I'd probably have to wait for days to see whether the error is really resolved.

Comment by Ranjan Ghosh [ 2017-08-22 ]

Ah. Bummer! I used this query on my database:

SELECT CONCAT(table_name, '.', column_name) AS 'foreign key',
       CONCAT(referenced_table_name, '.', referenced_column_name) AS 'references'
FROM information_schema.key_column_usage
WHERE referenced_table_name IS NOT NULL;

This is supposed to show all tables with foreign keys. Unfortunately, there are none. So MDEV-13331 cannot be the only reason, AFAICT, because my DB keeps crashing anyway.

Comment by Ralf Fischer [ 2017-08-24 ]

Just checked 10.2.8 with our 3-node cluster, and as soon as we write to two or more nodes instead of only one, we get the same error as with the other 10.2 versions.

Aug 24 07:22:46 mysql-node2 mysqld: WSREP: BF lock wait long
Aug 24 07:22:46 mysql-node2 mysqld:
Aug 24 07:22:46 mysql-node2 mysqld: =====================================
Aug 24 07:22:46 mysql-node2 mysqld: 2017-08-24 07:22:46 0x7f4e89a4b700 INNODB MONITOR OUTPUT
Aug 24 07:22:46 mysql-node2 mysqld: =====================================
Aug 24 07:22:46 mysql-node2 mysqld: Per second averages calculated from the last 15 seconds
Aug 24 07:22:46 mysql-node2 mysqld: -----------------
Aug 24 07:22:46 mysql-node2 mysqld: BACKGROUND THREAD
Aug 24 07:22:46 mysql-node2 mysqld: -----------------
Aug 24 07:22:46 mysql-node2 mysqld: srv_master_thread loops: 40245 srv_active, 0 srv_shutdown, 25086 srv_idle
Aug 24 07:22:46 mysql-node2 mysqld: srv_master_thread log flush and writes: 65331
Aug 24 07:22:46 mysql-node2 mysqld: ----------
Aug 24 07:22:46 mysql-node2 mysqld: SEMAPHORES
Aug 24 07:22:46 mysql-node2 mysqld: ----------
Aug 24 07:22:46 mysql-node2 mysqld: OS WAIT ARRAY INFO: reservation count 520504
Aug 24 07:22:46 mysql-node2 mysqld: OS WAIT ARRAY INFO: signal count 1030299
Aug 24 07:22:46 mysql-node2 mysqld: RW-shared spins 0, rounds 1439898, OS waits 149766
Aug 24 07:22:46 mysql-node2 mysqld: RW-excl spins 0, rounds 14688668, OS waits 112658
Aug 24 07:22:46 mysql-node2 mysqld: RW-sx spins 410293, rounds 5591848, OS waits 49179
Aug 24 07:22:46 mysql-node2 mysqld: Spin rounds per wait: 1439898.00 RW-shared, 14688668.00 RW-excl, 13.63 RW-sx
Aug 24 07:22:46 mysql-node2 mysqld: ------------------------
Aug 24 07:22:46 mysql-node2 mysqld: LATEST FOREIGN KEY ERROR
Aug 24 07:22:46 mysql-node2 mysqld: ------------------------
Aug 24 07:22:46 mysql-node2 mysqld: 2017-08-24 06:50:26 0x7f53fe4e9700 Transaction:
Aug 24 07:22:46 mysql-node2 mysqld: TRANSACTION 3020224898, ACTIVE 0 sec inserting
Aug 24 07:22:46 mysql-node2 mysqld: mysql tables in use 2, locked 2
Aug 24 07:22:46 mysql-node2 mysqld: 5 lock struct(s), heap size 1136, 2 row lock(s)
Aug 24 07:22:46 mysql-node2 mysqld: MySQL thread id 2, OS thread handle 0x7f53fe4e9700, query id 5813680 copy to tmp table
Aug 24 07:22:46 mysql-node2 mysqld: ALTER TABLE `catalog_product_flat_1` ADD CONSTRAINT `FK_CAT_PRD_FLAT_1_ENTT_ID_CAT_PRD_ENTT_ENTT_ID` FOREIGN KEY (`entity_id`) REFERENCES `catalog_product_entity` (`entity_id`) ON DELETE CASCADE ON UPDATE CASCADE
Aug 24 07:22:46 mysql-node2 mysqld: Foreign key constraint fails for table `magento_felt_b2b`.`#sql-4c6d_2`:
Aug 24 07:22:46 mysql-node2 mysqld: ,
Aug 24 07:22:46 mysql-node2 mysqld: CONSTRAINT `FK_CAT_PRD_FLAT_1_ENTT_ID_CAT_PRD_ENTT_ENTT_ID` FOREIGN KEY (`entity_id`) REFERENCES `catalog_product_entity` (`entity_id`) ON DELETE CASCADE ON UPDATE CASCADE in parent table, in index PRIMARY tuple:

Comment by Nathan Bohman [ 2017-08-24 ]

Sorry, I should have been more clear. 10.2.8 fixed a bug that was pulled in from upstream MySQL that caused MariaDB to ignore timeouts for DROP operations. I'm wondering if a similar piece of code was pulled in at the same time for other operations like UPDATE and SELECT?

Comment by Ranjan Ghosh [ 2017-08-24 ]

Well, the issue you mentioned earlier, MDEV-13331, is only about DELETE CASCADE with FOREIGN KEYS (FK). At least this is what it says in the title. And this is, indeed, the bug Ralf sent the logs from. Although it seems to happen to him with 10.2.8 as well (@Ralf: Perhaps you'd like to mention over there as well that it's still not fixed for you). For my part, I'm not aware of using any foreign keys at all (see my previous comment on how I figured this out). So, this bug here seems to be at least not exclusively about foreign keys. I now understand your idea is: try to find some similar pulled-in code for other operations... It's difficult to find that, though, without knowing specifically what exactly we're searching for...

Comment by Dutchak Vitalij [ 2017-08-29 ]

Same bug after upgrading to version 10.2.7.

Comment by Dutchak Vitalij [ 2017-08-29 ]

Does 10.2.8 have the same issue?

Comment by Michael Neumann [ 2017-09-19 ]

I can confirm that the problem still exists in 10.2.8:

Sep 19 08:34:50 dbm-ace-live-data-03 mysqld: 2017-09-19  8:34:50 140484532815616 [Warning] WSREP: Failed to report last committed 10768816, -4 (Interrupted system call)
Sep 19 08:34:54 dbm-ace-live-data-03 mysqld: 2017-09-19  8:34:54 140484532815616 [Warning] WSREP: Failed to report last committed 10768818, -4 (Interrupted system call)
Sep 19 08:35:01 dbm-ace-live-data-03 CRON[1269]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Sep 19 08:35:02 dbm-ace-live-data-03 mysqld: 2017-09-19  8:35:02 140484532815616 [Warning] WSREP: Failed to report last committed 10768925, -4 (Interrupted system call)
Sep 19 08:35:51 dbm-ace-live-data-03 mysqld: WSREP: BF lock wait long
Sep 19 08:36:00 dbm-ace-live-data-03 mysqld: message repeated 9 times: [ WSREP: BF lock wait long]
Sep 19 08:36:01 dbm-ace-live-data-03 CRON[1361]: (root) CMD (/bin/chmod -R a+r /var/log >/dev/null 2>&1)
Sep 19 08:36:01 dbm-ace-live-data-03 mysqld: WSREP: BF lock wait long
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: message repeated 5 times: [ WSREP: BF lock wait long]
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld:
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: =====================================
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: 2017-09-19 08:36:06 0x7fc3dc748700 INNODB MONITOR OUTPUT
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: =====================================
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: Per second averages calculated from the last 50 seconds
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: -----------------
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: BACKGROUND THREAD
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: -----------------
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: srv_master_thread loops: 333435 srv_active, 0 srv_shutdown, 144764 srv_idle
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: srv_master_thread log flush and writes: 478163
Sep 19 08:36:06 dbm-ace-live-data-03 mysqld: ----------

So, is it an option to downgrade to 10.1?

Comment by Michael Weber [ 2017-10-04 ]

So, "good news" for me:
since version 10.2.9 we have had the cluster running without any issues for >5 days.

mysql --version
mysql  Ver 15.1 Distrib 10.2.9-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
 
systemctl status mysql
● mariadb.service - MariaDB database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: active (running) since Thu 2017-09-28 16:27:02 CEST; 5 days ago

I don't trust myself, but I hope that this version fixed it for me/us.

Any experience with this version from other users?

Comment by Michael Neumann [ 2017-10-05 ]

No good news for me, the problem still exists:

dbm-ace-pre-02:~ # mysql --version
mysql  Ver 15.1 Distrib 10.2.9-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
....
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: WSREP: BF lock wait long
Oct  5 09:45:59 dbm-ace-pre-02 mysqld:
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: =====================================
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: 2017-10-05 09:45:58 0x7fb33bcf5700 INNODB MONITOR OUTPUT
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: =====================================
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: Per second averages calculated from the last 30 seconds
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: -----------------
Oct  5 09:45:59 dbm-ace-pre-02 mysqld: BACKGROUND THREAD

My setup consists of three data nodes with two MaxScale proxies in front of them and one hardware load balancer as the entry point.

I can very easily reproduce the problem when I switch the two MaxScale proxies to "Galera mode", i.e. writes to all nodes allowed.

Comment by Ranjan Ghosh [ 2017-10-05 ]

Hm. Right, still unsolved with the most recent version. Sigh. I have to admit I had faintly hoped I might return from my vacation and find this bug finally solved. What I don't understand is that nobody on the core dev team seems to even be interested in this. Don't get me wrong: as a developer, I know that there are difficult bugs that take a lot of time to figure out. But there is not even a reaction, nothing. I think this is a huge bug that in fact renders Galera completely useless. If the database locks up once per day (some even report it happening every few minutes) and requires manual intervention, it is, let's be honest, simply not usable on a production server. And I don't need Galera on a dev system. So you cannot use it at all. But the integrated Galera was one of the main selling points of MariaDB over MySQL. At least it was for us. For a long time I thought: "OK, it's a beta version. Just wait. They'll certainly fix this in the future." Now MariaDB 10.2.x is even considered "stable", AFAIK. Nothing could be further from the truth.

Comment by Michael Neumann [ 2017-10-05 ]

You took the words right out of my mouth. The reason we gave MariaDB (Galera) a shot is the write scalability. We tested it with one of our key systems and ran into this bug. The combination with MaxScale appeared so promising. Now we are moving to our Oracle RAC system.

Comment by Sachin Setiya (Inactive) [ 2017-10-05 ]

Hi Everyone!

I am assigned to this bug, but I was very busy with other bugs. I will look into this bug this week and hopefully come up with a workaround / patch.

  • Sachin
Comment by Ralf Fischer [ 2017-10-05 ]

Hi Sachin, that's great! Thanks in advance for your support!

Comment by Andrii Nikitin (Inactive) [ 2017-10-05 ]

I was able to reproduce the problem with the script below, which downloads and unpacks 10.2.9, configures 3 nodes on ports 3307, 3308 and 3309, and puts some concurrent write load on them:

set -e
ver=${1:-10.2.9}
# just use current directory if called from framework
if [ ! -f common.sh ] ; then
  [ -d mariadb-environs ] || git clone http://github.com/AndriiNikitin/mariadb-environs
  cd mariadb-environs
  ./get_plugin.sh galera
fi
 
_template/plant_cluster.sh cluster1
echo m1 > cluster1/nodes.lst
echo m2 >> cluster1/nodes.lst
echo m3 >> cluster1/nodes.lst
cluster1/replant.sh ${ver}
 
./build_or_download.sh m1
 
cluster1/gen_cnf.sh innodb_buffer_pool_size=1G \
        innodb_log_file_size=256M \
        innodb_flush_log_at_trx_commit=2 \
        innodb_lock_wait_timeout=10 \
        transaction-isolation=READ-COMMITTED \
        innodb_stats_persistent=0
cluster1/install_db.sh
# cluster1/galera_setup_acl.sh # this needed for mysqldump
cluster1/galera_start_new.sh wsrep_sst_method=rsync
 
sleep 10
cluster1/galera_cluster_size.sh
 
for i in {1..5}; do
  m1*/sql.sh create table t$i select seq as a, seq as b from seq_1_to_10000
  m1*/sql.sh "alter table t$i add unique index(a)"
done
 
trap 'kill $(jobs -p)' EXIT
 
for m in {1..3}; do
  for i in {1..5}; do
    export I=$i
    export M=$m
    ( while : ; do m$M*/sql.sh "start transaction; update t$I set b=b+1 where a=$I; delete from t$I where a=2*$I; do sleep(10); commit" 2>/dev/null || : ; done; )&
  done
  ( while : ; do
    for i in {1..30}; do m$M*/sql.sh "delete from t$i where a=floor(rand()*10000);" &>/dev/null ; done
  done ) &
  ( while : ; do
    for i in {1..30}; do m$M*/sql.sh "delete from t$i where a=floor(rand()*10000);" &>/dev/null ; done
  done ) &
done
 
while : ; do
  m1*/sql.sh show processlist
  m2*/sql.sh show processlist
  m3*/sql.sh show processlist
  grep 'BF lock wait long' m1*/dt/error.log | wc -l
  grep 'BF lock wait long' m2*/dt/error.log | wc -l
  grep 'BF lock wait long' m3*/dt/error.log | wc -l
  sleep 10
done

The output of the write SQL commands is suppressed; only the processlist of each node is displayed, followed by the number of "BF lock wait long" occurrences in the error log.
After a few minutes I can see problems in the Galera committer threads:

0
0
0
2	system user		NULL	Sleep	99	wsrep aborter idle	NULL	0.000
1	system user		NULL	Sleep	10	committed 25	NULL	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
43	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
44	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
45	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
46	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
47	root	localhost	test	Query	9	updating	delete from t4 where a=floor(rand()*10000)	0.000
48	root	localhost	test	Query	9	updating	delete from t4 where a=floor(rand()*10000)	0.000
49	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
50	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	97	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	9	committed 30	NULL	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
27	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
29	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
30	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
31	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
32	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
33	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	89	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	20	Update_rows_log_event::find_row(13)	update t1 set b=b+1 where a=1	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
12	root	localhost	test	Query	20	init	commit	0.000
14	root	localhost	test	Query	20	init	commit	0.000
15	root	localhost	test	Query	30	query end	delete from t1 where a=floor(rand()*10000)	0.000
20	root	localhost	test	Query	15	init	commit	0.000
21	root	localhost	test	Query	10	init	commit	0.000
25	root	localhost	test	Query	9	updating	update t1 set b=b+1 where a=1	0.000
26	root	localhost	test	Query	0	init	show processlist	0.000
0
0
10
2	system user		NULL	Sleep	109	wsrep aborter idle	NULL	0.000
1	system user		NULL	Sleep	9	Update_rows_log_event::find_row(35)	update t4 set b=b+1 where a=4	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
47	root	localhost	test	Query	19	query end	delete from t4 where a=floor(rand()*10000)	0.000
49	root	localhost	test	Query	9	init	commit	0.000
51	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
52	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
53	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
54	root	localhost	test	Query	9	updating	update t4 set b=b+1 where a=4	0.000
55	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	107	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	9	committed 40	NULL	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
34	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
35	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
36	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
37	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
38	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
39	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	99	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	30	Update_rows_log_event::find_row(13)	update t1 set b=b+1 where a=1	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
12	root	localhost	test	Query	30	init	commit	0.000
14	root	localhost	test	Query	30	init	commit	0.000
15	root	localhost	test	Query	40	query end	delete from t1 where a=floor(rand()*10000)	0.000
20	root	localhost	test	Query	25	init	commit	0.000
21	root	localhost	test	Query	20	init	commit	0.000
27	root	localhost	test	Query	8	updating	update t1 set b=b+1 where a=1	0.000
28	root	localhost	test	Query	0	init	show processlist	0.000
0
0
20
2	system user		NULL	Sleep	119	wsrep aborter idle	NULL	0.000
1	system user		NULL	Sleep	19	Update_rows_log_event::find_row(35)	update t4 set b=b+1 where a=4	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
47	root	localhost	test	Query	29	query end	delete from t4 where a=floor(rand()*10000)	0.000
49	root	localhost	test	Query	19	init	commit	0.000
51	root	localhost	test	Query	9	init	commit	0.000
52	root	localhost	test	Query	9	init	commit	0.000
53	root	localhost	test	Query	9	init	commit	0.000
56	root	localhost	test	Query	8	updating	update t4 set b=b+1 where a=4	0.000
57	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	117	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	19	committed 40	NULL	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
40	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
41	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
42	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
43	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
44	root	localhost	test	Query	9	User sleep	do sleep(10)	0.000
45	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	109	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	40	Update_rows_log_event::find_row(13)	update t1 set b=b+1 where a=1	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
12	root	localhost	test	Query	40	init	commit	0.000
14	root	localhost	test	Query	40	init	commit	0.000
15	root	localhost	test	Query	50	query end	delete from t1 where a=floor(rand()*10000)	0.000
20	root	localhost	test	Query	35	init	commit	0.000
21	root	localhost	test	Query	30	init	commit	0.000
29	root	localhost	test	Query	7	updating	update t1 set b=b+1 where a=1	0.000
30	root	localhost	test	Query	0	init	show processlist	0.000
9
0
30
2	system user		NULL	Sleep	129	wsrep aborter idle	NULL	0.000
1	system user		NULL	Sleep	29	Update_rows_log_event::find_row(35)	update t4 set b=b+1 where a=4	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
47	root	localhost	test	Query	39	query end	delete from t4 where a=floor(rand()*10000)	0.000
49	root	localhost	test	Query	29	init	commit	0.000
51	root	localhost	test	Query	20	init	commit	0.000
52	root	localhost	test	Query	19	init	commit	0.000
53	root	localhost	test	Query	19	init	commit	0.000
58	root	localhost	test	Query	7	updating	update t4 set b=b+1 where a=4	0.000
59	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	127	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	29	committed 40	NULL	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
40	root	localhost	test	Query	9	init	commit	0.000
41	root	localhost	test	Query	9	init	commit	0.000
42	root	localhost	test	Query	9	init	commit	0.000
43	root	localhost	test	Query	9	init	commit	0.000
44	root	localhost	test	Query	9	init	commit	0.000
46	root	localhost	test	Query	0	init	show processlist	0.000
1	system user		NULL	Sleep	119	wsrep aborter idle	NULL	0.000
2	system user		NULL	Sleep	50	Update_rows_log_event::find_row(13)	update t1 set b=b+1 where a=1	0.000
3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
12	root	localhost	test	Query	50	init	commit	0.000
14	root	localhost	test	Query	50	init	commit	0.000
15	root	localhost	test	Query	60	query end	delete from t1 where a=floor(rand()*10000)	0.000
20	root	localhost	test	Query	45	init	commit	0.000
21	root	localhost	test	Query	40	init	commit	0.000
31	root	localhost	test	Query	6	updating	update t1 set b=b+1 where a=1	0.000
32	root	localhost	test	Query	0	init	show processlist	0.000
19
0
40

sachin.setiya.007, let me know if this script is OK or if I should work on an mtr case with similar load.

Comment by Sachin Setiya (Inactive) [ 2017-10-13 ]

Hi anikitin!

I tried to run this script on jessie (buildbot). One hour run, no "BF lock wait long". The thing is, the script is not deterministic (and I think the bug isn't either). If we could have some mtr case which simulates it for sure, that would be great.

Comment by Sukan [ 2017-10-13 ]

Hi Sachin,

The scenario below should help you reproduce the bug.

1. On a single node, updates for a particular ID are fired continuously.
2. On another node, I set a different value using the same WHERE clause.

I have a table dummy, where I generated concurrent updates on node 1 and node 2 at the same time.

*Node 1* update dummy set status=100 where id='716071007'; // This runs with 500 concurrent connections and 100 iterations
*Node 2* update dummy set status=101 where id=716071007;  // This runs with 500 concurrent connections and 100 iterations

The Galera Cluster should have thrown a deadlock error, but instead it keeps accepting the queries.

On node 2, when the query from node 1 replicates and at the same time a direct UPDATE hits node 2, it throws the below error:

WSREP: BF lock wait long
WSREP: BF lock wait long

This makes the cluster stall, and we need to forcefully kill it to bring it back.

NOTE: Tested in 10.2.8 and 10.2.9, both producing "BF lock wait long", but not in 10.1.26.

Steps to reproduce :

CREATE TABLE `dummy` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `service` varchar(20),
  `status` int(11),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=716071002 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.43 sec)

MariaDB [test]> insert into dummy (service,status) values ('MySQL',1),('MSSQL',2),('MONGODB',3);
Query OK, 3 rows affected (0.13 sec)
Records: 3 Duplicates: 0 Warnings: 0

*Generate concurrent updates*

*Node 1* update dummy set status=100 where id='716071007';
*Node 2* update dummy set status=101 where id=716071007;

Process list:

Id User Host db Command Time State Info Progress
1 system user   NULL Sleep 4 Update_rows_log_event::find_row(718) update dummy set status=100 where id='716071007' 0.000
2 system user   NULL Sleep 501 wsrep aborter idle NULL 0.000
6 system user   NULL Daemon NULL InnoDB purge coordinator NULL 0.000
3 system user   NULL Daemon NULL InnoDB purge worker NULL 0.000
4 system user   NULL Daemon NULL InnoDB purge worker NULL 0.000
5 system user   NULL Daemon NULL InnoDB purge worker NULL 0.000
7 system user   NULL Daemon NULL InnoDB shutdown handler NULL 0.000
11 monitor localhost:44902 NULL Sleep 0   NULL 0.000
17 root localhost test Query 0 init show processlist 0.000
22 sbtest localhost:46666 test Query 4 updating update dummy set status=101 where id=716071007 0.000
23 sbtest localhost:46696 test Query 4 updating update dummy set status=101 where id=716071007 0.000
24 sbtest localhost:46850 test Query 4 updating update dummy set status=101 where id=716071007 0.000
25 sbtest localhost:46854 test Query 4 updating update dummy set status=101 where id=716071007 0.000
26 sbtest localhost:46856 test Query 4 updating update dummy set status=101 where id=716071007 0.000
36 sbtest localhost:47108 test Query 4 updating update dummy set status=100 where id='716071007' 0.000
38 sbtest localhost:47144 test Query 4 updating update dummy set status=101 where id=716071007 0.000
39 sbtest localhost:47182 test Query 4 updating update dummy set status=101 where id=716071007 0.000
37 sbtest localhost:47140 test Query 4 updating update dummy set status=101 where id=716071007 0.000
40 sbtest localhost:47186 test Query 4 updating update dummy set status=101 where id=716071007 0.000
42 sbtest localhost:47222 test Query 4 updating update dummy set status=101 where id=716071007 0.000
41 sbtest localhost:47202 test Query 4 updating update dummy set status=100 where id='716071007' 0.000
43 sbtest localhost:47232 test Query 4 updating update dummy set status=100 where id='716071007' 0.000
27 sbtest localhost:46892 test Query 4 query end update dummy set status=100 where id='716071007' 0.000
30 sbtest localhost:47042 test Query 4 updating update dummy set status=101 where id=716071007 0.000
31 sbtest localhost:47048 test Query 4 updating update dummy set status=101 where id=716071007 0.000
32 sbtest localhost:47062 test Query 4 updating update dummy set status=101 where id=716071007 0.000
33 sbtest localhost:47066 test Query 4 updating update dummy set status=101 where id=716071007 0.000

Finally, the cluster stalls until we kill mysqld:

+----+-------------+-----------------+------+---------+------+--------------------------------------+--------------------------------------------------+----------+
| Id | User        | Host            | db   | Command | Time | State                                | Info                                             | Progress |
+----+-------------+-----------------+------+---------+------+--------------------------------------+--------------------------------------------------+----------+
|  1 | system user |                 | NULL | Sleep   |  144 | Update_rows_log_event::find_row(718) | update dummy set status=100 where id='716071007' |    0.000 |
|  2 | system user |                 | NULL | Sleep   |  641 | wsrep aborter idle                   | NULL                                             |    0.000 |
|  6 | system user |                 | NULL | Daemon  | NULL | InnoDB purge coordinator             | NULL                                             |    0.000 |
|  3 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker                  | NULL                                             |    0.000 |
|  4 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker                  | NULL                                             |    0.000 |
|  5 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker                  | NULL                                             |    0.000 |
|  7 | system user |                 | NULL | Daemon  | NULL | InnoDB shutdown handler              | NULL                                             |    0.000 |
| 11 | monitor     | localhost:44902 | NULL | Sleep   |    0 |                                      | NULL                                             |    0.000 |
| 17 | root        | localhost       | test | Query   |    0 | init                                 | show processlist                                 |    0.000 |
| 27 | sbtest      | localhost:46892 | test | Query   |  144 | query end                            | update dummy set status=100 where id='716071007' |    0.000 |
+----+-------------+-----------------+------+---------+------+--------------------------------------+--------------------------------------------------+----------+

Hope this will help you in addressing the issue.

-Sukan
DBA, Mafiree

Comment by Ari Maniatis [ 2017-10-16 ]

@sachin.setiya.007 Are you able to reproduce the problem now? I can make it happen on 10.2.7 within a few minutes by running Jira with the load-balancing JDBC connection across two nodes. This happens with very little load almost right away.

Since we have no workaround and no way of avoiding the bug, there needs to be some recommendation to avoid Galera entirely with MariaDB 10.2.x. Clearly this isn't ready for production. There are also other important issues to watch out for, since 10.2.9 will not start with Galera because of this bug.

Is there some way we can get a higher level of attention for Galera issues, or more automated testing in this area? I think this could be MariaDB's key advantage over the competition if this stuff were just really solid.

Comment by Sachin Setiya (Inactive) [ 2017-10-18 ]

Sorry, I was travelling, but I will try to do it now. I managed to hit the bug with anikitin's script. So I am writing an mtr test, and a fix is on the way.

Comment by Tas Dinq [ 2017-10-28 ]

This bug also affects server version 10.2.6-MariaDB-10.2.6+maria~xenial-log (mariadb.org binary distribution) in my production environment. It's driving me crazy; I need to force a restart of the full cluster each time.

I have a lot of client processes updating a table, sometimes with the same id on different nodes, like:
UPDATE foo SET foo.bar = 1 or 0 WHERE id=xxx;
very similar to Sukan's test case.

Any progress on this bug?
Or any temporary workaround for it?

Comment by Sukan [ 2017-10-30 ]

Hi Tas,

I would suggest you run the same cluster effectively as a single node: you can point all the write traffic to one node. This would be a temporary workaround without making any big changes.

Otherwise you can downgrade the cluster completely to a 10.1.x version, where we ran the test and didn't hit this bug.

Thanks,
Sukan
DBA, Mafiree

Comment by Tas Dinq [ 2017-11-01 ]

Thank you, Sukan! Thank you for your advice.

I'm trying to use MaxScale's ReadWriteSplit module to route all writes to a single node; I hope my life will be easier with this.
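For anyone else trying the same workaround, a minimal readwritesplit service definition in maxscale.cnf might look like the sketch below. The server names, credentials and port here are placeholders, not taken from this thread; adjust them to your cluster and check the MaxScale documentation for your version.

```ini
# maxscale.cnf -- sketch only; node1/node2/node3, the monitor user and
# port 4006 are hypothetical placeholders.
[Splitter-Service]
type=service
router=readwritesplit
servers=node1,node2,node3
user=maxscale
password=maxscale_pw

[Splitter-Listener]
type=listener
service=Splitter-Service
protocol=MariaDBClient
port=4006
```

With readwritesplit, all writes from a session are routed to a single master server, which sidesteps the multi-node write conflicts that trigger "BF lock wait long" in this report.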

Comment by Anton Baykov [ 2017-11-08 ]

Hi Sachin,
any progress on this bug?

Comment by TAO ZHOU [ 2017-11-09 ]

I hit the error 'BF-BF X lock conflict' even though I am only writing to one node.

2017-11-09  4:18:49 50152503296 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-11-09 10:30:37 50153508864 [Note] InnoDB: BF-BF X lock conflict,mode: 1027 supremum: 0
2017-11-09 10:30:37 50153508864 [Note] InnoDB: conflicts states: my 0 locked 0
RECORD LOCKS space id 316718 page no 3049 n bits 240 index PRIMARY of table `

Comment by Mark [ 2017-11-09 ]

Is there any update on this ticket working towards a fix? There has been a distinct lack of feedback about this issue. Not that I expect a fix right away, but let us know what's happening, and maybe give us a better understanding of what the issue is.

We have been experiencing this problem even though we are sending all of the traffic to one node only (out of 3). Our application seems to lock up about once a month or so.

Mark

Comment by Sachin Setiya (Inactive) [ 2017-11-10 ]

Hi,

I have been assigned to this issue, but for around 10 days I was on vacation. Currently I am occupied with 10715 and 10177. I will look into this issue after those are done. Also, this issue requires somewhat more understanding on my part, because I am not familiar with the InnoDB code.

Regards
sachin

Comment by Ari Maniatis [ 2017-11-11 ]

Hi Sachin

I know you have your own set of priorities based on your customers, but I'd like to encourage you to look at this bug with higher priority, particularly when compared to building a new feature for 10.3 (MDEV-10177). This bug:

  • completely crashes mariadb
  • has no workaround
  • affects anyone who is using galera, even if they disable writing to multiple nodes

In short, this bug is completely catastrophic for all production use of MariaDB if Galera is enabled. I think MariaDB management should consider pulling the entire 10.2 release from production-ready status and think about how users might roll back to 10.1 if this bug can't be fixed in the next minor 10.2.11 release. This should be on the front page of your site or download page.

Comment by Andrii Nikitin (Inactive) [ 2017-11-15 ]

mtr test case is in MDEV-14401 according to instructions from https://jira.mariadb.org/browse/MDEV-12837?focusedCommentId=101543&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-101543

Comment by Sergei Golubchik [ 2017-11-16 ]

Reassigned to jplindst because seppo has found that:

My debugging session from today shows that the potential smoking gun is
function lock_grant_and_move_on_page() in lock0lock.cc. This function does not
honor locking priority policy, and leaves a high priority lock in wait queue
favoring a normal priority lock.

Comment by Jan Lindström (Inactive) [ 2017-11-17 ]

When using VATS we favor older transactions; this is bad for Galera, where we should favor BF transactions.
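
The scheduling conflict can be illustrated with a simplified model (not InnoDB code): an age-based policy like VATS can grant a lock to an old local transaction ahead of a Galera BF (brute force) applier, which must never be left waiting behind normal transactions. The function and field names below are purely illustrative:

```python
# Simplified model of the lock-grant ordering problem behind this bug.

def grant_order_age_based(waiters):
    """VATS-like policy: grant the longest-waiting transaction first."""
    return sorted(waiters, key=lambda w: w["age"], reverse=True)

def grant_order_bf_first(waiters):
    """What Galera needs: BF waiters always come first; age only
    breaks ties within the same priority class."""
    return sorted(waiters, key=lambda w: (not w["bf"], -w["age"]))

waiters = [
    {"trx": "local-old", "age": 30, "bf": False},   # ordinary local transaction
    {"trx": "bf-applier", "age": 2, "bf": True},    # replication (BF) applier
]

print([w["trx"] for w in grant_order_age_based(waiters)])  # BF applier left waiting
print([w["trx"] for w in grant_order_bf_first(waiters)])   # BF applier granted first
```

Under the age-based ordering the BF applier stays stuck in the wait queue behind the long-running local transaction, which is exactly the "BF lock wait long" condition.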

Comment by Jan Lindström (Inactive) [ 2017-11-23 ]

https://github.com/MariaDB/server/commit/7d4d54e415c3d6081b71b024532b5ba8f1addc4a

Comment by Marko Mäkelä [ 2017-11-24 ]

Please try to find a solution that does not involve adding fields to trx_t. I cannot approve the addition of trx_t::trx_mutex_taken.

There are many Galera test failures on bb-10.2-MDEV-12837. Can you please rebase the fix on the latest 10.2, and then post a detailed analysis of all test failures?

Comment by Jan Lindström (Inactive) [ 2017-11-27 ]

https://github.com/MariaDB/server/commit/9f6dd866c163822e2e3b52345685722c6dec2b2b

Comment by Marko Mäkelä [ 2017-11-28 ]

The change looks much cleaner now, but I’d like to have one more review round.

Comment by Marko Mäkelä [ 2017-12-04 ]

I would like to see a revised test push to buildbot that contains only this change, based on the latest 10.2. The current bb-10.2-MDEV-12837 contains some commits related to MDEV-14401. It seems that MDEV-14401 actually depends on this MDEV-12837. It looks like the changes to lock_rec_lock_slow() and lock_table() in the initial MDEV-14401 patch have been made redundant by the MDEV-12837 change to innobase_init().

Comment by Jan Lindström (Inactive) [ 2017-12-05 ]

Split the commit into two parts:

https://github.com/MariaDB/server/commit/1ffd8b30f9f86aedd42c91f72db9d54f2f8576d7

The test case is in:

https://github.com/MariaDB/server/commit/7f555777b36983d4b4590df6634aa22170e90bc3

because it requires both fixes.

Comment by Marko Mäkelä [ 2017-12-05 ]

I sent some review comments to the first commit. I still think that we must convert the algorithm at InnoDB startup, not while InnoDB is already open for SQL.
Where is this running on Buildbot?

Comment by Jan Lindström (Inactive) [ 2017-12-05 ]

bb-10.2-galera

Comment by Marko Mäkelä [ 2017-12-06 ]

I now see that bb-10.2-galera consists of two commits on top of 10.2: MDEV-12837 and MDEV-14401.
If the MDEV-12837 fix were working as requested (adjusting the configuration at InnoDB startup), then I think that all the configuration changes

innodb-lock-schedule-algorithm=fcfs

in the MDEV-14401 fix should be unnecessary and should be removed. The default VATS should be fine. We should not even need to suppress any warning; at startup, InnoDB could issue an informational note (not warning) that the incompatible parameter was adjusted.

One more question: Why does the MDEV-14401 commit disable the following occasionally failing tests:

diff --git a/mysql-test/suite/galera/disabled.def b/mysql-test/suite/galera/disabled.def
index 423c83208f8..801b12a1f84 100644
--- a/mysql-test/suite/galera/disabled.def
+++ b/mysql-test/suite/galera/disabled.def
@@ -56,3 +56,8 @@ MW-284 : MDEV-13549 Galera test failures 10.1
 galera_as_slave : MDEV-13549 Galera test failures 10.1
 galera_var_innodb_disallow_writes : MDEV-10949
 galera_kill_applier : race condition at the start of the test
+MW-328C: MDEV-13549 Galera test failures 10.1
+MW-328A: MDEV-13549 Galera test failures 10.1
+MW-328B: MDEV-13549 Galera test failures 10.1
+MW-328: MDEV-13549 Galera test failures 10.1
+galera_suspend_slave: MDEV-13549 Galera test failures 10.1
\ No newline at end of file

The commit message does not explain this.
I think that if it is considered necessary to disable these tests, they should be disabled in a separate commit, with a reference to separate tasks, such as MDEV-14149 or MDEV-13876 or even MDEV-13549.

Comment by Jan Lindström (Inactive) [ 2017-12-07 ]

commit da3a3a68df34c7fef387ce890d3925166edeef2c
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Dec 7 12:26:29 2017 +0200

MDEV-12837: WSREP: BF lock wait long

Problem was a merge error from MySQL wsrep i.e. Galera.

wsrep_on_check
New check function. Galera can't be enabled
if innodb-lock-schedule-algorithm=VATS.

innobase_kill_query
In Galera async kill we could own lock mutex.

innobase_init
If Variance-Aware-Transaction-Scheduling Algorithm (VATS) is
used on Galera we fall back to First-Come-First-Served (FCFS)
with notice to user.

Changed innodb-lock-schedule-algorithm as read-only parameter
as it was designed to be.

lock_reset_lock_and_trx_wait
Use ib::hex() to print out transaction ID.

lock_rec_other_has_expl_req,
lock_rec_other_has_conflicting,
RecLock::add_to_waitq
lock_rec_lock_slow
lock_table_other_has_incompatible
lock_rec_insert_check_and_lock
lock_prdt_other_has_conflicting

Change pointer to conflicting lock to normal pointer as this
pointer contents could be changed later.

RecLock::create
Conflicting lock pointer is moved to last parameter with
default value NULL. This conflicting transaction could
be selected as victim in Galera if requesting transaction
is BF (brute force) transaction. In this case contents
of conflicting lock pointer will be changed. Use ib::hex() to print
transaction ids.
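
The innobase_init behaviour described in the commit message can be sketched as follows. This is an illustrative model only; the function name is hypothetical and the real logic lives in the C++ server startup code:

```python
# Sketch of the startup fallback: VATS cannot be used together with
# Galera, so InnoDB adjusts the setting to FCFS and notifies the user.

def effective_lock_schedule_algorithm(configured, wsrep_on):
    if wsrep_on and configured.upper() == "VATS":
        print("Note: innodb-lock-schedule-algorithm=VATS is not supported "
              "with Galera; falling back to FCFS")
        return "FCFS"
    return configured

print(effective_lock_schedule_algorithm("VATS", wsrep_on=True))   # FCFS (with note)
print(effective_lock_schedule_algorithm("VATS", wsrep_on=False))  # VATS
```

Doing the adjustment once at startup (rather than while InnoDB is already open for SQL) is what Marko asked for in the review comments above.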

Comment by Jan Lindström (Inactive) [ 2017-12-07 ]

greenman Can you document that innodb-lock-schedule-algorithm is a read-only variable (i.e. it can't be changed dynamically) and that the VATS method is not supported on Galera?

Here is what the documentation should mention:

  • 10.1: the default is FCFS, and InnoDB will refuse to start if VATS is used with Galera.
  • >=10.2: the default is VATS, and under Galera it will be changed to FCFS at startup.

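
For reference, a user who wants to avoid relying on the automatic fallback can pin the algorithm explicitly in the server configuration. A minimal my.cnf sketch (the option accepts the values fcfs and vats):

```ini
# my.cnf fragment: pin the lock scheduling algorithm explicitly.
# On 10.2+ with Galera the server would otherwise adjust VATS to FCFS
# at startup; setting FCFS here makes the configuration explicit.
[mysqld]
innodb_lock_schedule_algorithm = fcfs
```
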
Comment by Ranjan Ghosh [ 2017-12-07 ]

Wow, incredible. After many months of suffering, finally a MariaDB version that really deserves the label "stable"... perhaps before Christmas? Can it be true? Don't tell me I'm dreaming. Now we're all waiting anxiously for the binary to arrive in the repos. Thanks to everyone involved and, especially, to Jan for solving this.

Comment by Ari Maniatis [ 2017-12-24 ]

@jplindst Was there supposed to be a commit to the 10.2 branch as well? I could only see one to the 10.1 branch (where I think the bug wasn't manifesting) and to 10.3.

https://github.com/MariaDB/server/search?q=MDEV-12837&type=Commits&utf8=%E2%9C%93

Are you able to confirm this will land in 10.2.12? That will help us plan our testing for being able to use galera cluster in production.

Comment by Sergei Golubchik [ 2017-12-24 ]

No, there wasn't supposed to be a separate 10.2 commit — every 10.1 commit will inevitably land in 10.2 eventually.

Normally this happens at least before the next 10.2 release, so this commit will be in the 10.2.12, just like "Fix Version/s" field at the top of the page says.

Comment by Jan Lindström (Inactive) [ 2017-12-28 ]

Hi, there seems to be some kind of information loss here: commit da3a3a68df34c7fef387ce890d3925166edeef2c is for 10.2 (10.1 had no merge error and FCFS is the default there, so the change in 10.1 is slightly different).

Comment by Ranjan Ghosh [ 2017-12-28 ]

I'd suggest adding a note (like the one below 10.1.22) on:

https://downloads.mariadb.org/mariadb/+releases/

Perhaps something like: "Updating from 10.2.11 to 10.2.12 is highly recommended for Galera users because previous versions are unstable".
In addition, IMHO, this should also be mentioned in the release notes. The term "stable" on the above-mentioned page is really misleading for older versions, because we all know they will probably crash sooner or later due to this bug, even with very ordinary usage patterns.

Comment by Daniel Black [ 2017-12-29 ]

rgpublic great work. I suggest the description in https://mariadb.com/kb/en/library/mariadb-10130-release-notes/ could be meaningfully expanded (or replaced). This is a page humble users like you and me can edit, and it probably serves as a good enough place for people who are looking to upgrade to read.

Comment by Ranjan Ghosh [ 2017-12-29 ]

Ah, really cool. Thanks, Daniel. I didn't know that everyone can edit these pages. I've added a warning box and a reference to this bug here:

https://mariadb.com/kb/en/library/mariadb-10212-release-notes/

For 10.1.x I think the existing line under "Notable changes" should be sufficient, as the bug doesn't seem to manifest there, so there's no need to cause panic.

Comment by Adrian Tarau [ 2018-01-24 ]

It looks like it still happens:

innodb_version 5.7.20
protocol_version 10
slave_type_conversions
version 10.2.12-MariaDB
version_comment MariaDB Server
version_compile_machine x86_64
version_compile_os Linux
version_malloc_library system
version_ssl_library OpenSSL 1.0.1e-fips 11 Feb 2013
wsrep_patch_version wsrep_25.21

What can I provide to help troubleshoot this? It looks like the issue is triggered by a truncate table...

LOCK WAIT
MySQL thread id 138641, OS thread handle 139936878089984, query id 139638795 ........ Opening tables
TRUNCATE TABLE `ds$etl_dimension_discovery_local_standard_1`
2018-01-24 11:51:38 139938787669760 [Note] InnoDB: WSREP: BF lock wait long for trx:0x1a3598f query: TRUNCATE TABLE `ds$etl_dimension_discovery_local_standard_1`
TRANSACTION 27482511, ACTIVE 1997 sec

As soon as this kicks in, everything locks. I cannot reproduce it in a consistent way. Also, when it happens, kill 138641 says "now an owner"... even for root! How is that possible?

Comment by Jan Lindström (Inactive) [ 2018-01-24 ]

Can I have the full unedited error log? Setting wsrep_debug=on, waiting until it repeats, and then sending the error log again would also help.

Comment by Adrian Tarau [ 2018-01-24 ]

I'll try, though it's hard to say when it happens. Hopefully the log will not be GBs of data with debug on.

Comment by Adrian Tarau [ 2018-01-25 ]

I have 500MB of logs (uncompressed). I'll send you an email with a link to download them.

Forgot to mention: we are using a 3-node cluster (development database) with HAProxy, with one node dedicated to writes; nothing interesting is logged on the (mostly) read-only nodes. A stress test is started with the database in the same state; after a while it locks, and it is always triggered by a DDL statement (TRUNCATE in this case; in the past I've seen these deadlocks with ALTER TABLE ... ADD PARTITION as well). Once the DDL enters the deadlock, no other query can be executed (and there are plenty of threads/connections available); even queries not related to that table lock up.

The first reference to a deadlock starts at: 2018-01-24 15:39:09 139784560858880 [Note] InnoDB: WSREP: BF lock wait long for trx:0x1e9eb4f query

Comment by Neil Skrypuch [ 2018-01-29 ]

I am also still seeing this issue with 10.2.12, although the behaviour has changed somewhat compared to 10.2.9 (the previous version we were running).

Now with 10.2.12, it seems to only happen on one master at a time (not multiple masters simultaneously). Also, the query that gets regularly printed to the logs is always a DDL statement now, which it was not before.

The other noteworthy change with 10.2.12 is that the log output is much terser. Instead of a complete dump of SHOW ENGINE INNODB STATUS; every second, we now just see 5 lines per second, like so:

Jan 18 02:53:12 tm2 mysqld[16437]: 2018-01-18 2:53:12 140246917351168 [Note] InnoDB: WSREP: BF lock wait long for trx:0x76a45d4 query: CREATE TABLE IF NOT EXISTS ...
Jan 18 02:53:12 tm2 mysqld[16437]: TRANSACTION 124405204, ACTIVE 269206 sec
Jan 18 02:53:12 tm2 mysqld[16437]: LOCK WAIT
Jan 18 02:53:12 tm2 mysqld[16437]: MySQL thread id 16, OS thread handle 140246857389824, query id 27596013 creating table
Jan 18 02:53:12 tm2 mysqld[16437]: CREATE TABLE IF NOT EXISTS ...

Comment by Adrian Tarau [ 2018-01-30 ]

I do not see a way to re-open the defect.

Comment by Sergei Golubchik [ 2018-01-31 ]

adrian.tarau, please create a new one. You can refer to this one in the description if you'd like or copy-paste from it.

Comment by Jan Lindström (Inactive) [ 2018-07-25 ]

Updated to the correct releases where the actual fix is:

commit 57cde8ccd19675dc98e3cbacf0ef5c72cb188e49 (HEAD -> 10.1, origin/10.1)
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Wed Jul 25 08:21:25 2018 +0300

MDEV-15822: WSREP: BF lock wait long for trx

In Galera BF (brute force) transactions may not wait for lock requests
and normally BF-transaction would select transaction holding conflicting
locks as a victim for rollback. However, background statistic calculation
transaction is InnoDB internal transaction and it has no thd i.e. it can't be
selected as a victim. If background statistics calculation transaction holds
conflicting locks to statistics tables it will cause BF lock wait long
error message. Correct way to handle background statistics calculation is to
acquire thd for transaction but that change is too big for GA-releases and
there are other reported problems on background statistics calculation.

This fix avoids adding a table to background statistics calculation if

Generated at Thu Feb 08 08:00:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.