[MXS-4775] KafkaCDC: current_gtid.txt is moving but is behind Created: 2023-09-23  Updated: 2023-09-29  Resolved: 2023-09-29

Status: Closed
Project: MariaDB MaxScale
Component/s: avrorouter, kafkacdc
Affects Version/s: 23.02.3
Fix Version/s: 6.4.11, 22.08.9, 23.02.5, 23.08.2

Type: Bug Priority: Major
Reporter: Presnickety Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: replication
Environment:

RHEL v8.2
VMware vCenter v7
MariaDB v10.7 (3 node Galera cluster)
MaxScale v23.02.3 (one instance per MariaDB VM for redundancy)


Attachments: PNG File MXS-4775-01.PNG     PNG File MXS-4775-02.PNG     PNG File MXS-4775-03.PNG     PNG File MXS-4775-04.PNG     PNG File MXS-4775-05.PNG     Text File MXS-4775-CPU-all-soft-lockup-events01.txt     Text File MXS-4775-all-coredumps01.txt     Text File MXS-4775-coredump_20230916150821.txt     Text File MXS-4775-coredump_20230918181252.txt     Text File MXS-4775-coredump_20230921221527.txt     Text File MXS-4775-coredump_20230925091218.txt     File Screencast from 2023-09-28 10-00-51.webm     File maxscale.cnf     File my.cnf     File perf-replicator.svg     Zip Archive perf.zip    
Issue Links:
Relates
relates to MXS-4785 KafkaCDC JSON conversion is taking mo... Closed

 Description   

Hi There,

We've observed that the GTID value within current_gtid.txt is behind what is reported by the maxctrl CLI. We have three MaxScale instances, and current_gtid.txt on each one exhibits the same behaviour. The value within the current_gtid.txt files is at least a day old, yet the binary logs are purged after 3 hrs.

Thanks.
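A quick way to quantify the lag is to compare the sequence number (the last dash-separated field of a domain-server_id-sequence GTID) from current_gtid.txt against the one maxctrl reports. The GTID values below are placeholders taken from later log excerpts; in practice, read them from the file and from `maxctrl show services`:

```shell
file_gtid='1-1-2048859537'   # placeholder: value read from current_gtid.txt
live_gtid='1-1-2049123448'   # placeholder: value reported by maxctrl

# Strip everything up to the last '-' to get the sequence numbers,
# then subtract to see how many events behind the file is.
lag=$(( ${live_gtid##*-} - ${file_gtid##*-} ))
echo "sequence lag: $lag events"
```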



 Comments   
Comment by markus makela [ 2023-09-26 ]

What does maxctrl show services report as the GTID position?

Are there also any errors in the MaxScale log?

Is the MaxScale instance using a lot of CPU or is it hitting some other resource limitation?

Is the kafka broker still getting updates via MaxScale or does it appear stalled?

Comment by Presnickety [ 2023-09-27 ]

Hi Markus,

Please see the attached snips for the three servers; there's a difference between the GTIDs reported by maxctrl list servers and maxctrl show services.

Yesterday, servers 02 & 03 reported the following errors:

2023-09-26 05:57:38 error : (159) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:38 error : (159) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:38 error : (160) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:38 error : (160) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:38 error : (161) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:38 error : (161) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:38 error : (162) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:38 error : (162) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:39 error : (163) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:39 error : (163) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:39 error : (164) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:39 error : (164) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:39 error : (165) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:39 error : (165) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:39 error : (166) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:57:39 error : (166) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:57:51 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:57:54 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:05 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:11 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:26 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:31 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:37 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:42 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:48 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:53 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:58:59 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:04 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:10 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:14 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:15 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:20 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:32 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:33 error : (168) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:59:33 error : (168) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:59:33 error : (169) [readwritesplit] (Read-Write-Split); No valid candidates for session command (COM_STMT_PREPARE: call session_insert_dumptruck(?,?,?,?,?,?,?,?,?)). Connection status:
2023-09-26 05:59:33 error : (169) [MariaDBProtocol] Routing the query failed. Session will be closed.
2023-09-26 05:59:37 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:42 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:46 error : Failed to execute query on server 'viexh-session-usage-mdb-01' ([10.195.241.79]:3306): Can't connect to server on '10.195.241.79' (110)
2023-09-26 05:59:47 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:52 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
2023-09-26 05:59:57 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.
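A rough triage of an excerpt like the one above is to count how often each error repeats, separating the replication errors (1236, i.e. the requested GTID state is no longer in any binlog) from the routing errors. A minimal sketch using a few representative lines copied into a temp file; in practice, point grep at the real MaxScale log:

```shell
# Sample lines copied from the excerpt above (placeholder file path).
cat > /tmp/mxs_errors.log <<'EOF'
2023-09-26 05:57:51 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files.
2023-09-26 05:57:54 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files.
2023-09-26 05:59:33 error : (168) [MariaDBProtocol] Routing the query failed. Session will be closed.
EOF

# Count the binlog-purged replication errors.
grep -c 'replicated event: 1236' /tmp/mxs_errors.log   # -> 2
```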

Comment by markus makela [ 2023-09-27 ]

One possible reason is that the broker can't keep up with the message flow (example here). How loaded is the broker itself (CPU and disk IO)?

Comment by Presnickety [ 2023-09-27 ]

We also saw the following repeated notice messages from servers 02 & 03:

2023-09-26 05:57:51 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2048859537'
2023-09-26 05:57:54 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:05 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:11 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:14 notice : Started replicating from [10.195.241.79]:3306 at GTID '1-1-2048859537'
2023-09-26 05:58:26 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:31 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:37 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:42 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:48 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:53 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:59 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:04 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:10 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:14 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2048859537'
2023-09-26 05:59:15 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:20 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2048859537'
2023-09-26 05:59:32 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2048859537'
2023-09-26 05:59:37 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2048859537'
2023-09-26 05:59:42 notice : Started replicating from [10.195.241.79]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:42 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:47 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:52 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:59:57 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
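Notably, the replicator keeps reconnecting every few seconds at only two GTID positions, which suggests it is stuck in a retry loop rather than advancing. Listing the distinct positions makes that visible; a sketch using a few sample lines from above (temp file path is a placeholder):

```shell
# Sample notice lines copied from the excerpt above.
cat > /tmp/mxs_notices.log <<'EOF'
2023-09-26 05:57:51 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2048859537'
2023-09-26 05:57:54 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:05 notice : Started replicating from [10.195.241.80]:3306 at GTID '1-1-2049123448'
2023-09-26 05:58:14 notice : Started replicating from [10.195.241.79]:3306 at GTID '1-1-2048859537'
EOF

# Extract each GTID and count how often the replicator restarts at it.
grep -o "GTID '[^']*'" /tmp/mxs_notices.log | sort | uniq -c
```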

Comment by Presnickety [ 2023-09-27 ]

We have 16 vCPUs in the VM, so CPU and memory usage are OK:

top - 17:12:03 up 3 days, 10 min, 1 user, load average: 9.33, 9.96, 10.05
Tasks: 500 total, 2 running, 498 sleeping, 0 stopped, 0 zombie
%Cpu(s): 20.8 us, 20.6 sy, 0.2 ni, 45.8 id, 2.4 wa, 1.9 hi, 8.4 si, 0.0 st
MiB Mem : 257679.8 total, 103067.5 free, 23620.4 used, 130992.0 buff/cache
MiB Swap: 16384.0 total, 16384.0 free, 0.0 used. 231188.0 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
341798 mysql 20 0 219.9g 35.1g 16.0g S 281.2 13.9 7968:12 mariadbd
3609832 maxscale 20 0 2958488 205256 17244 S 264.7 0.1 124:51.81 maxscale
1204 root 20 0 4195540 1.1g 1.0g S 21.5 0.4 182:27.48 falcon-sensor-b
1386 root 20 0 262588 43968 41288 S 7.3 0.0 83:45.45 sssd_nss
5970 netdata 39 19 58576 37604 1864 S 4.6 0.0 199:47.77 apps.plugin
3424 231272 39 19 213504 138852 11512 S 3.3 0.1 50:12.05 netdata

Comment by Presnickety [ 2023-09-27 ]

Depending on whether the write master experiences a failover or a drop in connections, the KafkaCDC feed will continue even while servers 02 & 03 are showing the above errors. Once the write master has a failover event, it will log errors as per the above. Sometimes a few errors are logged and then the feed continues; other times there are constant errors and the feed stops. We configured a tmpfs for the KafkaCDC directory as per MXS-4404 and experienced the same behaviour; since the version upgrade we have reverted to the xfs file system.

Comment by Presnickety [ 2023-09-27 ]

CPU and memory on the broker host are OK. We had the same issue when using a Kafka broker cluster in the same data centre / same subnet as the load balancer VMs:

top - 17:45:32 up 50 days, 1:26, 1 user, load average: 0.69, 0.24, 0.09
Tasks: 357 total, 1 running, 356 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.6 us, 4.6 sy, 0.0 ni, 91.4 id, 0.2 wa, 0.6 hi, 1.7 si, 0.0 st
MiB Mem : 31836.1 total, 232.0 free, 2746.5 used, 29315.5 buff/cache
MiB Swap: 65534.0 total, 61606.4 free, 3927.6 used. 29089.6 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13379 kafka 20 0 8670120 1.2g 1580 S 12.6 3.7 2664:20 java
13361 kafka 20 0 1524452 11448 896 S 7.3 0.0 1489:04 rootlessport
57 root 20 0 0 0 0 S 1.7 0.0 184:16.87 kswapd0
69785 root 20 0 2873880 308516 270984 S 1.3 0.9 219:45.74 falcon-sensor-b
897 root 20 0 465708 5284 2512 S 0.3 0.0 68:23.47 vmtoolsd
81034 splunk 20 0 386892 124628 6576 S 0.3 0.4 81:34.40 splunkd

Comment by markus makela [ 2023-09-27 ]

If that top output was taken when MaxScale was processing events and the database was in normal usage, MaxScale is using quite a lot of CPU. How loaded was the remote Kafka broker during that time?

I think that the slowdown might be somewhere in the kafkacdc processing that's taking a lot of CPU time. Are you able to trace the MaxScale process with perf to see where the CPU time is being spent? You can do this with the following commands:

sudo perf record -F 1000 -g -p $(pidof maxscale) -- sleep 60
sudo perf script > perf.script 

Please then upload the generated perf.script file here and we can produce a CPU flamegraph from it (using this).

Comment by markus makela [ 2023-09-28 ]

Screencast from 2023-09-28 10-00-51.webm
I think I managed to reproduce this with a local write-intensive workload. It seems like some part of the KafkaCDC processing is a bottleneck that causes it to lag behind.

Comment by Presnickety [ 2023-09-28 ]

Hello Markus,

I managed to prepare the perf.script. FYI I've attached a graph of the CPU soft lockups since the platform was upgraded to vCenter 7 late Feb/Mar, the replicator process appears quite a lot.

Thanks.

Comment by markus makela [ 2023-09-28 ]

OK, it looks like at least part of what is in that perf output is the same as what I found in my local testing. In the perf output the blocking calls that happen when the server status information is requested from another thread do not show as much as they did when I locally profiled the code. However, what does show up is the binlog event-to-JSON conversion followed by the JSON-to-string conversion.

The synchronization problem is easier to solve by moving the updates to happen in parallel, so that if the set of servers replication is done from changes, or the ownership of the cluster is lost, the replicating thread will pick it up shortly afterwards. This is not perfect, since there's a small gap during which events are still replicated even if the server is no longer part of the service or the cluster has lost ownership. But given the nature of KafkaCDC, and the fact that there's no guarantee of events being delivered only once, it doesn't seem like a showstopper. Ideally, updating the servers of a service or losing cluster ownership would trigger a function to be called, but I suspect that it wouldn't improve the situation in practice.

The JSON conversion is harder to solve. I'll open a separate bug for that, as fixing it requires more work.

Comment by markus makela [ 2023-09-28 ]

Do those soft lockup errors have a stacktrace in them?

Comment by markus makela [ 2023-09-28 ]

Here's the perf output visualized as a flame graph.
perf-replicator.svg
I filed MXS-4785 for the remaining performance problems.

Comment by Presnickety [ 2023-09-29 ]

Hi Markus,

I found dumps only for the processes labelled 'Replicator', not for processes labelled 'cdc::Replicator'. Dumps for the following events have been attached separately:

Sep 25 09:12:18 viexh-session-usage-mdb-03 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 34s! [Replicator:427436]
Sep 21 22:15:27 viexh-session-usage-mdb-03 kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 32s! [Replicator:3953414]
Sep 18 18:12:52 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 55s! [Replicator:2579371]
Sep 16 15:08:21 viexh-session-usage-mdb-03 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [Replicator:1135010]

Also attached are all core dumps since this was enabled in August 2023, and all CPU soft lockup events involving the 'Replicator' and 'cdc::Replicator' processes.

Thanks.

Comment by markus makela [ 2023-09-29 ]

OK, those look like perfectly normal stacktraces of an idle MaxScale, which means that it's probably a problem related to virtualization. The reason why it's the Replicator thread might be that it's doing a blocking network read which, for some reason, is more likely to be treated as a locked-up kernel call when the VM gets scheduled out to make room for another one. I'd recommend keeping an eye on those and monitoring whether the virtualization server is under heavy load or there's a lot of memory pressure.

Comment by Presnickety [ 2023-09-29 ]

Hi Markus,

Actually, several different processes are affected, mostly the swapper process:

Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#14 stuck for 34s! [swapper/14:0]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#7 stuck for 34s! [swapper/7:0]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 34s! [systemd-journal:897]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 34s! [mariadbd:581089]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 34s! [kworker/5:1:682146]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 34s! [systemctl:702532]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 34s! [apps.plugin:3757745]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 34s! [rdk:broker2:449783]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 34s! [mariadbd:525894]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#8 stuck for 34s! [swapper/8:0]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 34s! [swapper/10:0]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#15 stuck for 34s! [swapper/15:0]
Sep 27 13:30:07 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 32s! [swapper/4:0]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [swapper/0:0]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 21s! [kworker/u33:3:3394515]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 21s! [swapper/3:0]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [swapper/6:0]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 21s! [sh:3400486]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#8 stuck for 21s! [swapper/8:0]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#14 stuck for 21s! [ps:3400483]
Sep 27 12:17:45 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 21s! [PLUGINSD[apps]:5901]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#15 stuck for 23s! [swapper/15:0]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/3:0]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 23s! [WEB_SERVER[stat:5903]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#14 stuck for 23s! [swapper/14:0]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [swapper/12:0]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#8 stuck for 23s! [lsof:1857482]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [swapper/10:0]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 23s! [swapper/11:0]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [curl:1858014]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#7 stuck for 23s! [mariadbd:1857442]
Sep 26 06:01:29 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [swapper/1:0]
Sep 26 05:58:14 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [swapper/8:0]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 38s! [kworker/5:3:1673086]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 38s! [polkitd:1267]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 38s! [swapper/1:0]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 38s! [swapper/2:0]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 38s! [cpu.sh:1699259]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 38s! [sed:1699256]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 38s! [swapper/4:0]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 38s! [top:1699244]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 38s! [swapper/10:0]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#15 stuck for 38s! [evreap:5182]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 38s! [cpu.sh:1699258]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 38s! [apps.plugin:5970]
Sep 26 02:52:33 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#14 stuck for 38s! [sssd_be:1349]
Sep 24 10:17:43 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 21s! [ksoftirqd/9:70]
Sep 24 05:41:02 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 129s! [mariadbd:1213315]
Sep 24 05:41:02 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 129s! [swapper/4:0]
Sep 24 05:41:02 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 129s! [swapper/2:0]
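One way to confirm that the lockups aren't tied to any one process is to tally which process name the watchdog flags. A minimal sketch, using a few sample lines from the journal excerpt above (temp file path is a placeholder; in practice, feed in the full journal extract):

```shell
# Sample watchdog lines copied from the excerpt above.
cat > /tmp/lockups.log <<'EOF'
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#14 stuck for 34s! [swapper/14:0]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#7 stuck for 34s! [swapper/7:0]
Sep 28 17:35:25 viexh-session-usage-mdb-01 kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 34s! [mariadbd:581089]
EOF

# Extract the process name from the trailing [name/cpu:pid] field,
# dropping the per-CPU suffix so swapper/14 and swapper/7 count together.
sed -n 's/.*! \[\([^]:/]*\)[]:/].*/\1/p' /tmp/lockups.log | sort | uniq -c | sort -rn
```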

Generated at Thu Feb 08 04:31:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.