[MDEV-19473] MariaDB Galera Cluster response longer than 1 second Created: 2019-05-14  Updated: 2019-12-12  Resolved: 2019-12-12

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Update
Affects Version/s: 10.2.22
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: xiaolingfeng Assignee: Jan Lindström (Inactive)
Resolution: Incomplete Votes: 0
Labels: galera
Environment:

OS: Ubuntu 18.04 server 64bit
Linux: 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Database: mysql Ver 15.1 Distrib 10.2.22-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2


Attachments: slow.log, status.txt

 Description   

Responses from the MariaDB Galera Cluster take longer than 1 second.

The collected slow log samples are as follows:

SET timestamp=1557759155;
UPDATE `api_logs` SET `response_code` = 200
WHERE `id` = 104258;

# User@Host: CD.com[CD.com] @ [192.168.1.141]
# Thread_id: 1154 Schema: ciosa_productivo QC_hit: No
# Query_time: 1.055928 Lock_time: 0.000034 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 1
SET timestamp=1557759155;
INSERT INTO `ci_sessions` (`session_id`, `ip_address`, `user_agent`, `last_activity`, `user_data`) VALUES ('6adcce0d0cc4dae7dbb3dcb7171979c0', '200.39.24.163', 'Mozilla/5.0 (Windows NT 6.2; WOW64; Trident/7.0; rv:11.0) like Gecko', 1557759147, '');
# User@Host: CD.com[CD.com] @ [192.168.1.56]
# Thread_id: 1157 Schema: ciosa_productivo QC_hit: No
# Query_time: 1.105405 Lock_time: 0.000024 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 1
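
In both samples Lock_time is tiny compared with Query_time, which suggests the time is not being spent in the lock waits that Lock_time records. A minimal Python sketch for pulling these two fields out of a slow log follows; the function name `timings` and the `SAMPLE` text are illustrative assumptions, with the values copied from the entries above.

```python
import re

# Header lines in the usual slow-log layout; values copied from the report.
SAMPLE = """\
# User@Host: CD.com[CD.com] @ [192.168.1.141]
# Thread_id: 1154 Schema: ciosa_productivo QC_hit: No
# Query_time: 1.055928 Lock_time: 0.000034 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 1
# User@Host: CD.com[CD.com] @ [192.168.1.56]
# Thread_id: 1157 Schema: ciosa_productivo QC_hit: No
# Query_time: 1.105405 Lock_time: 0.000024 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 1
"""

# Matches the timing line of each entry, whatever prefix the line carries.
TIMING = re.compile(r"Query_time:\s*([\d.]+)\s+Lock_time:\s*([\d.]+)")


def timings(log_text):
    """Return a list of (query_time, lock_time) pairs in seconds."""
    return [(float(q), float(l)) for q, l in TIMING.findall(log_text)]


for query_time, lock_time in timings(SAMPLE):
    share = lock_time / query_time * 100
    print(f"query {query_time:.3f}s, lock {lock_time:.6f}s ({share:.4f}% of total)")
```

To run it against a real file, pass the contents of the slow log to `timings` instead of `SAMPLE`.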


 Comments   
Comment by Geoff Montee (Inactive) [ 2019-05-15 ]

Your SHOW GLOBAL STATUS output shows that your cluster spends a lot of time in flow control:

| wsrep_flow_control_paused    | 0.337543                                     |
| wsrep_flow_control_paused_ns | 685159687666                                 |
| wsrep_flow_control_recv      | 1939                                         |
| wsrep_flow_control_sent      | 1809                                         |

Flow control can lead to longer response times, which appears to be exactly what you are having issues with.
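
To put the numbers above in more readable terms, here is a small Python sketch that parses the status rows and converts them. It assumes that wsrep_flow_control_paused is the fraction of time replication was paused since the last FLUSH STATUS and that wsrep_flow_control_paused_ns is the total paused time in nanoseconds; `parse_status` and `ROWS` are hypothetical names used only for this illustration.

```python
def parse_status(table_text):
    """Parse '| name | value |' rows into a dict of strings."""
    status = {}
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Keep only two-column rows; border lines such as +----+ are skipped.
        if len(cells) == 2 and cells[0]:
            status[cells[0]] = cells[1]
    return status


# Rows copied from the SHOW GLOBAL STATUS output quoted above.
ROWS = """\
| wsrep_flow_control_paused    | 0.337543                                     |
| wsrep_flow_control_paused_ns | 685159687666                                 |
| wsrep_flow_control_recv      | 1939                                         |
| wsrep_flow_control_sent      | 1809                                         |
"""

status = parse_status(ROWS)
paused_fraction = float(status["wsrep_flow_control_paused"])
paused_seconds = int(status["wsrep_flow_control_paused_ns"]) / 1e9

print(f"paused {paused_fraction:.1%} of the sampled interval")
print(f"total paused time: {paused_seconds:.1f} s")
```

Under those assumptions, replication was paused for roughly a third of the sampled interval.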

Usually, flow control can be avoided by better optimizing your cluster's configuration. The main parameter that you would want to set is gcs.fc_limit:

http://galeracluster.com/documentation-webpages/galeraparameters.html#gcs-fc-limit

I also see the following in your SHOW GLOBAL STATUS output:

| wsrep_local_recv_queue_avg   | 135.605601                                   |

So you may want to set gcs.fc_limit to 128, or even higher.
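
As a rough illustration of why 128 may still be on the low side, the Python sketch below compares the reported average receive queue with a few candidate limits and builds the statement that changes the setting at runtime through wsrep_provider_options. The candidate values (including 16, assumed here to be the provider default) and both helper names are assumptions made for this example.

```python
RECV_QUEUE_AVG = 135.605601  # wsrep_local_recv_queue_avg from the status output


def queue_exceeds(limit, recv_queue_avg=RECV_QUEUE_AVG):
    """True when the average receive queue is longer than the candidate limit."""
    return recv_queue_avg > limit


def set_fc_limit_sql(limit):
    """Build the statement that sets gcs.fc_limit via wsrep_provider_options."""
    return f"SET GLOBAL wsrep_provider_options = 'gcs.fc_limit={int(limit)}';"


# 16 is assumed to be the default; 128 is the value suggested above.
for candidate in (16, 128, 256):
    if queue_exceeds(candidate):
        note = "average queue still exceeds this limit"
    else:
        note = "limit is above the average queue"
    print(f"gcs.fc_limit={candidate}: {note}")

print(set_fc_limit_sql(128))
```

A value set this way does not survive a restart; to keep it, the same gcs.fc_limit option would also go into wsrep_provider_options in the server configuration file.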

I would also suggest reading through the following documentation pages:

http://galeracluster.com/documentation-webpages/nodestates.html

http://galeracluster.com/documentation-webpages/managingfc.html

http://galeracluster.com/documentation-webpages/detectingaslownode.html

Comment by Alexey [ 2019-05-15 ]

It may also help to
1. check wsrep_apply_window status and adjust wsrep_slave_threads variable accordingly.
2. check if innodb_buffer_pool_size is too big and the process is swapping.
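
For the second point, one possible quick check on Linux is to compare innodb_buffer_pool_size with the physical memory and swap figures in /proc/meminfo. The Python sketch below parses text in that format; the sample values and the 14 GiB buffer pool are made up for illustration, and the helper names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:   value kB' lines into a dict of kB integers."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_kb(info):
    """Swap currently in use, in kB."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def buffer_pool_share(buffer_pool_bytes, info):
    """Fraction of physical memory taken by the InnoDB buffer pool."""
    return buffer_pool_bytes / (info["MemTotal"] * 1024)


# Made-up sample values, not taken from the reporter's system.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:          204800 kB
SwapTotal:       2097152 kB
SwapFree:        1048576 kB
"""

info = parse_meminfo(SAMPLE)
share = buffer_pool_share(14 * 1024**3, info)  # hypothetical 14 GiB buffer pool
print(f"swap in use: {swap_used_kb(info)} kB")
print(f"buffer pool share of RAM: {share:.1%}")
```

On a real host, the text would come from reading /proc/meminfo and the size from innodb_buffer_pool_size. Swap in use does not by itself prove that the server process is the one swapping, so this is only a first hint.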

Comment by Faustin Lammler [ 2019-10-10 ]

Hi xiaolingfeng, could you please verify that the credentials in your slow.log attachment are not sensitive?
This Jira instance is public, so I suggest you take appropriate measures if necessary.

Regards,
Faustin

Comment by Jan Lindström (Inactive) [ 2019-12-12 ]

No feedback provided.
