[MDEV-24159] Millions of GRA_*.log files Created: 2020-11-06  Updated: 2022-01-05  Resolved: 2021-12-27

Status: Closed
Project: MariaDB Server
Component/s: Galera, wsrep
Affects Version/s: 10.4.14
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Mark Reibert Assignee: Jan Lindström (Inactive)
Resolution: Incomplete Votes: 2
Labels: None


 Description   

I am running a three-node MariaDB/Galera cluster on Ubuntu 16.04 with MariaDB 10.4.14 and Galera 26.4.5. The cluster is set up in "master/slave" mode such that all connections are to one (master) node. At times the two slave nodes start generating GRA_*.log files at an alarming rate, on the order of 100k files per day.

Something seems to trigger the situation; that is, I typically have some information in the server log corresponding to the first GRA_*.log file. One such example is:

2020-11-05 13:15:02 69 [Note] InnoDB: BF-BF X lock conflict,mode: 1027 supremum: 0conflicts states: my executing locked committing
RECORD LOCKS space id 251 page no 1933311 n bits 160 index PRIMARY of table `zabbix`.`problem` trx id 3608288917 lock_mode X locks rec but not gap
Record lock, heap no 88 PHYSICAL RECORD: n_fields 16; compact format; info bits 32
 0: len 8; hex 000000000a212cd3; asc      !, ;;
 1: len 6; hex 0000d7121e95; asc       ;;
 2: len 7; hex 16000980311ea3; asc     1  ;;
 3: len 4; hex 80000000; asc     ;;
 4: len 4; hex 80000000; asc     ;;
 5: len 8; hex 00000000001af108; asc         ;;
 6: len 4; hex dd9fd628; asc    (;;
 7: len 4; hex 81c70daa; asc     ;;
 8: len 8; hex 00000000135ec5e3; asc      ^  ;;
 9: len 4; hex dfa2a5cb; asc     ;;
 10: len 4; hex b565950e; asc  e  ;;
 11: SQL NULL;
 12: len 8; hex 0000000000000000; asc         ;;
 13: len 30; hex 5669727475616c20736572766572202f436f6d6d6f6e2f44544c535f6470; asc Virtual server /Common/DTLS_dp; (total 88 bytes);
 14: len 4; hex 80000000; asc     ;;
 15: len 4; hex 80000002; asc     ;;
Record lock, heap no 89 PHYSICAL RECORD: n_fields 16; compact format; info bits 32
 0: len 8; hex 000000000a212cd1; asc      !, ;;
 1: len 6; hex 0000d7121e95; asc       ;;
 2: len 7; hex 16000980311dd2; asc     1  ;;
 3: len 4; hex 80000000; asc     ;;
 4: len 4; hex 80000000; asc     ;;
 5: len 8; hex 00000000001af107; asc         ;;
 6: len 4; hex dd9fd627; asc    ';;
 7: len 4; hex b89c75a7; asc   u ;;
 8: len 8; hex 00000000135ec5dc; asc      ^  ;;
 9: len 4; hex dfa2a5cb; asc     ;;
 10: len 4; hex b70de8af; asc     ;;
 11: SQL NULL;
 12: len 8; hex 0000000000000000; asc         ;;
 13: len 30; hex 5669727475616c20736572766572202f436f6d6d6f6e2f64705f53545f47; asc Virtual server /Common/dp_ST_G; (total 83 bytes);
 14: len 4; hex 80000000; asc     ;;
 15: len 4; hex 80000002; asc     ;;
[121B blob data]
2020-11-05 13:15:02 69 [ERROR] mysqld: Can't find record in 'problem'
2020-11-05 13:15:02 69 [Warning] WSREP: Ignoring error 'Can't find record in 'problem'' on Delete_rows_v1 event. Error_code: 1032
2020-11-05 13:15:02 69 [Warning] Slave SQL: Could not execute Delete_rows_v1 event on table zabbix.problem; Can't find record in 'problem', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 2835058, Internal MariaDB error code: 1032

After this initial event the server generates GRA_*.log files at an alarming rate, 1–2 per second or about 100k per day. There are no further entries in the server log corresponding to the flood of GRA_*.log files. If this goes unnoticed you'll have millions of GRA_*.log files in /var/lib/mysql in a matter of weeks, and the server will grind to a halt.
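To gauge how quickly the files are accumulating, a quick shell check like the following can help. This is a minimal sketch: the helper names and the default datadir path are ours; adjust to your installation.

```shell
# Count GRA_*.log files directly under a datadir.
# Assumes the default /var/lib/mysql layout; pass your datadir if different.
count_gra_logs() {
    dir=${1:-/var/lib/mysql}
    find "$dir" -maxdepth 1 -name 'GRA_*.log' 2>/dev/null | wc -l
}

# Files created in the last hour, to estimate the current creation rate.
recent_gra_logs() {
    dir=${1:-/var/lib/mysql}
    find "$dir" -maxdepth 1 -name 'GRA_*.log' -mmin -60 2>/dev/null | wc -l
}
```

At the 1–2 per second rate described above, `recent_gra_logs` would report on the order of several thousand files per hour.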

I have not been able to determine reproduction steps; however, it is important to note that if I set wsrep_slave_threads=1 (dynamically, without a server restart) the creation of GRA_*.log files immediately stops. I can then return wsrep_slave_threads to a value >1 and no GRA_*.log files are created (well, until a few days later when the situation repeats).
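The workaround described above can be sketched as a small script. This is an illustration only, assuming a local mysql client with sufficient privileges; `run_sql`, `toggle_wsrep_slave_threads`, and `RESTORE_THREADS` are our placeholder names, and the restore value of 64 matches the configuration shown below.

```shell
# Value to restore after the toggle (64 per the reported configuration).
RESTORE_THREADS=${RESTORE_THREADS:-64}

# Thin wrapper around the mysql client so the SQL can be inspected
# or redirected when testing the script itself.
run_sql() {
    mysql -e "$1"
}

# Drop the applier thread count to 1 to stop GRA_*.log creation,
# then restore the normal value, as described in the report.
toggle_wsrep_slave_threads() {
    run_sql "SET GLOBAL wsrep_slave_threads = 1;"
    run_sql "SET GLOBAL wsrep_slave_threads = $RESTORE_THREADS;"
}
```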

It is also important to note that the cluster remains in sync throughout (wsrep_local_state_comment reports Synced on all nodes), as if all these GRA_*.log files are being generated erroneously.
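To see which transactions the slave nodes are rejecting, GRA_*.log files can be decoded with mysqlbinlog. They lack the binlog magic bytes and format description event, so the usual technique is to prepend a header taken from a real binlog of the same server version; the exact header length varies by version, so treat this as a sketch to verify against your own binlogs.

```shell
# Prepend a binlog header to a GRA_*.log file so mysqlbinlog can read it.
# The header file (magic bytes + format description event) must be extracted
# from a binlog produced by the same server version.
make_readable() {
    header=$1   # e.g. the leading bytes of an existing binlog
    gra=$2      # the GRA_*.log file to inspect
    out=$3      # combined, mysqlbinlog-readable output file
    cat "$header" "$gra" > "$out"
}

# Then decode the failed writeset, e.g.:
#   mysqlbinlog --base64-output=decode-rows -vv decoded.binlog
```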



 Comments   
Comment by Mark Reibert [ 2020-11-06 ]

Here are the relevant parameters set on the server:

binlog_format = ROW
innodb_autoinc_lock_mode = 2
 
wsrep_on=ON
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_provider_options="gcs.fc_master_slave = yes; gcs.fc_limit = 10000; gcache.size = 64800M"
wsrep_slave_threads = 64
 
innodb_buffer_pool_size = 1133G
innodb_buffer_pool_instances = 1
innodb_log_file_size = 2G

Comment by Walter Doekes [ 2021-04-06 ]

vvv - IGNORE THIS REPLY - ISSUE TURNED OUT TO BE POOR CONFIG - vvv

We also have a cluster with millions of GRA* files with 10.4.17+maria~bionic and galera 26.4.6-bionic.

In our case however, the cluster is not in sync. We've identified at least one query that was executed on two nodes (node2, node3) but not on the third (node1, donor).

When browsing backwards in the logs, I see that the gtid_seq_no (and gtid) is in sync until that happens – we're using wsrep_gtid_mode=OFF for the record:

  • In binlogs on all three nodes:

    #210402  2:56:42 server id 1  end_log_pos 1012988783 CRC32 0x5ae3b110   Xid = 695577552
    COMMIT/*!*/;
    

    (note that the binlog files (with log-slave-updates) do not start at the same position nor have the same names; also for the record)

    # at 1012988783
    #210402  2:56:42 server id 1  end_log_pos 1012988825 CRC32 0x086217be   GTID 0-1-691597701 trans
    /*!100001 SET @@session.gtid_seq_no=691597701*//*!*/;
    START TRANSACTION
    /*!*/;
    # at 1012988825
    # at 1012988971
    #210402  2:56:42 server id 1  end_log_pos 1012988971 CRC32 0x8f842e7b   Annotate_rows:
    #Q> UPDATE my_aspnet_sessions SET LockId = 11380, Flags = 0  WHERE SessionId = 'xxx' AND ApplicationId = 4
    #210402  2:56:42 server id 1  end_log_pos 1012989046 CRC32 0x73db3b86   Table_map: `ej`.`my_aspnet_sessions` mapped to number 691652
    # at 1012989046
    #210402  2:56:42 server id 1  end_log_pos 1012990372 CRC32 0x5c84ad21   Update_rows: table id 691652 flags: STMT_END_F
     
    BINLOG '
    ...
    (Xid 695577553 and 695577554)
    ...
    #210402  2:56:42 server id 1  end_log_pos 1047855470 CRC32 0x69df4b83   Xid = 695577554
    COMMIT/*!*/;
    

  • But then on node2 and node3:

    # at 1047855470
    #210402  2:56:42 server id 1  end_log_pos 1047855512 CRC32 0x8cbefbe3   GTID 0-1-691597703 trans
    /*!100001 SET @@session.gtid_seq_no=691597703*//*!*/;
    START TRANSACTION
    /*!*/;
    # at 1047855512
    # at 1047855564
    #210402  2:56:42 server id 1  end_log_pos 1047855564 CRC32 0x4286165e   Annotate_rows:
    #Q> DELETE FROM sphinx_file_delta
    #210402  2:56:42 server id 1  end_log_pos 1047855622 CRC32 0x05fc2b00   Table_map: `ej`.`sphinx_file_delta` mapped to number 477
    # at 1047855622
    #210402  2:56:42 server id 1  end_log_pos 1047855665 CRC32 0x24891824   Delete_rows: table id 477 flags: STMT_END_F
     
    BINLOG '
    ymtmYBMBAAAAOgAAAAYCdT4AAN0BAAAAAAEAAmVqABFzcGhpbnhfZmlsZV9kZWx0YQABAwAAACv8
    BQ==
    ymtmYBkBAAAAKwAAADECdT4AAN0BAAAAAAEAAf/+yNZGAP7L1kYAJBiJJA==
    '/*!*/;
    ### DELETE FROM `ej`.`sphinx_file_delta`
    ### WHERE
    ###   @1=4642504
    ### DELETE FROM `ej`.`sphinx_file_delta`
    ### WHERE
    ###   @1=4642507
    # Number of rows: 2
    # at 1047855665
    #210402  2:56:42 server id 1  end_log_pos 1047855696 CRC32 0x78f236e7   Xid = 695577555
    COMMIT/*!*/;
    

    We can see that Xid = 695577555 is executed, which deletes two records. This is not replicated to node1. We've seen that those two records were still available on node1.

  • And then on all nodes we see the next query, but on this one (node1, the donor) it is at gtid_seq_no=691597703 instead of gtid_seq_no=691597704 (which we see on the other two):

    # at 1012992064
    #210402  2:56:43 server id 1  end_log_pos 1012992106 CRC32 0x13385462   GTID 0-1-691597703 trans
    /*!100001 SET @@session.gtid_seq_no=691597703*//*!*/;
    START TRANSACTION
    /*!*/;
    # at 1012992106
    # at 1012992290
    #210402  2:56:43 server id 1  end_log_pos 1012992290 CRC32 0xf6859c0c   Annotate_rows:
    #Q> UPDATE my_aspnet_sessions SET  Locked = 1, LockDate = NOW() WHERE SessionId = 'yyy' AND ApplicationId = 4 AND Locked = 0 AND Expires > NOW()
    #210402  2:56:43 server id 1  end_log_pos 1012992365 CRC32 0xd0d50b98   Table_map: `ej`.`my_aspnet_sessions` mapped to number 691652
    # at 1012992365
    #210402  2:56:43 server id 1  end_log_pos 1012993683 CRC32 0x471464ad   Update_rows: table id 691652 flags: STMT_END_F
     
    BINLOG '
    ...
    # Number of rows: 1
    # at 1012993683
    #210402  2:56:43 server id 1  end_log_pos 1012993714 CRC32 0x3fb35660   Xid = 695577556
    COMMIT/*!*/;
    

    This is Xid = 695577556 on all nodes. But from now on, node1's gtid_seq_no is 1 lower than the rest. In the latest binlog I checked (four days later), gtid_seq_no is off by 3.

(No idea if the gtid_seq_no is relevant, but it appears to match the point when the replication goes out of sync.)

Thoughts:

  • I don't know if this is related to this ticket, as we do have an out-of-sync db at this point;
  • but, there are a gazillion GRA_* logs, even though the gtid_seq_no seems to be off by just 3 after 4.5 days (I'd expect mostly logs related to the missing transaction)

Notes:

  • there is no mysql replication going on in our case, only wsrep;
  • server_id=1, gtid_domain_id=0, wsrep_gtid_domain_id=0;
  • there are more GTIDs than just the 0-1-XXX: gtid_binlog_state = 0-1-701737491,21-21-618455277,21-22-618455300;
  • wsrep_slave_threads is 64 (node1) and 24 (node2, node3);
  • wsrep_local_state_comment = Synced according to show global status, but (at least for a while) it was not the case;
  • there are a bunch of INSERT IGNORE statements going on.

^^^ - IGNORE THIS REPLY - ISSUE TURNED OUT TO BE POOR CONFIG - ^^^

Comment by Walter Doekes [ 2021-04-09 ]

Sooo... ignore my reply.

It turned out that node1 had a stray replicate_wild_ignore_table= that everyone missed.

Move along people, nothing to see here..

Comment by Jan Lindström (Inactive) [ 2021-04-09 ]

mreibert I need something to analyze: at least some steps to reproduce, and error logs from when something goes wrong (preferably with the --wsrep-debug=1 setting).

Comment by Mark Reibert [ 2021-04-16 ]

jplindst I completely understand your request; unfortunately this problem only occurs on my production cluster, which is not conducive to debugging since the primary database is 7TB in size. (SSTs are 8-hour events!)

That being said, I can enable the error log and set wsrep_debug since according to the docs it is dynamic, and then just wait for the next time the GRA_*.log files spew out. What enumeration value do you want for wsrep_debug?

Comment by Mark Reibert [ 2022-01-03 ]

Hello jplindst — I see you closed this issue as incomplete, but note I was waiting for a response from you as to what value you want for wsrep_debug.

Comment by Jan Lindström (Inactive) [ 2022-01-04 ]

mreibert It was in my request, i.e. wsrep-debug=1. I closed this as we could not reproduce it.

Comment by Mark Reibert [ 2022-01-05 ]

jplindst — I apologize for the confusion. You requested wsrep-debug=1, however the documentation indicates this is an enumeration and I am unsure which enumeration value maps to 1.

This was a very frequent problem for me when I was running 10.4.14, however since I am now running 10.4.22 I generally do not see the creation of GRA_*.log files. So somewhere between 10.4.14 and 10.4.22—which brings along with it new versions of the Galera library—things improved!

Generated at Thu Feb 08 09:27:53 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.