[MDEV-30973] Enhance output of innodb deadlocks (SAMU-124, SAMU-131) Created: 2023-03-30 Updated: 2023-12-22 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Replication, Server, Storage Engine - InnoDB |
| Fix Version/s: | 11.5 |
| Type: | Task | Priority: | Major |
| Reporter: | Rob Schwyzer | Assignee: | Aleksey Midenkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | ServiceNow | ||
| Issue Links: |
|
||||||||||||
| Description |
| Comments |
| Comment by Aleksey Midenkov [ 2023-08-17 ] | |
|
This ticket implements the following requests:
As per SAMU-124 we need to output deadlock info into multiple destinations. Since marko in c68007d refactored DeadlockChecker into Deadlock namespace now this design does not allow printing into multiple streams. I propose to refactor back namespace into a class, so that second filehandle/stream could be stored for the output. Instead of stderr there should be array of streams where output of deadlock info goes. marko, please approve. | |
| Comment by Marko Mäkelä [ 2023-08-17 ] | |
|
The InnoDB deadlock checker was rewritten in MDEV-24783 (MariaDB Server 10.6.0). The output format was changed as part of that as well. The old output would only show 2 transactions, while the new one can identify all transactions that are part of the waits-for loop. | |
| Comment by Marko Mäkelä [ 2023-08-17 ] | |
|
Is there any change that is applicable to MariaDB Server 10.6 or later? I suppose that the GTID is stored somewhere in THD, whose definition is hidden from InnoDB. The replication team should be able to help with that. I do not even know at which point of time the GTID is being assigned. It could be right before transaction commit and therefore unavailable most of the time. I think that it may be useful to carefully read | |
| Comment by Aleksey Midenkov [ 2023-08-17 ] | |
|
marko Can you please be specific on the stated topic. | |
| Comment by Marko Mäkelä [ 2023-08-18 ] | |
|
I found 2 change sets that have been applied to a MariaDB Server 10.4 based source code repository. The change set labeled SAMU-131 would extend thd_get_error_context_description() in the case that THD::LOCK_thd_data could be acquired without any waiting. (That condition is needed in order to avoid a deadlock like It might be that this extension could break some third-party tools that could be parsing the output. Therefore, it might be advisable to introduce a configuration parameter for enabling the changed format. That said, I think that it is a reasonable change, to allow better understanding of locking conflicts on replicas. I think that serg needs to review this change, and consider it for inclusion in all supported major versions. When it comes to SAMU-124, I do not think that I can review the change set in its current form, because it conflicts with the refactoring that was made in MDEV-24783. Could the desired effect be achieved by setting innodb_print_all_deadlocks=ON? Instead of being written to a separate file, the output would then be written to the server error log. | |
| Comment by Aleksey Midenkov [ 2023-08-21 ] | |
|
marko did you read this: As per SAMU-124 we need to output deadlock info into multiple destinations. Since Marko Mäkelä in c68007d refactored DeadlockChecker into Deadlock namespace now this design does not allow printing into multiple streams. I propose to refactor back namespace into a class, so that second filehandle/stream could be stored for the output. Instead of stderr there should be array of streams where output of deadlock info goes. Marko Mäkelä, please approve. Do you agree/disagree with the proposed changes? | |
| Comment by Sergei Golubchik [ 2023-08-21 ] | |
|
I don't think it's a meaningful idea to enable logging of something into an arbitrary number of streams. | |
| Comment by Aleksey Midenkov [ 2023-08-21 ] | |
|
serg maybe rob.schwyzer@mariadb.com will better explain this. From my POV it is a) hard to link transaction retry with the deadlock; b) filter-out deadlocks from error log requires complex script (I mean something more than simple grep). serg Logging some aspect into separate file is a powerful feature against constant invention of grep expressions and running grep --line-buffered. I had such experience with another software and it definitely has meaning. | |
| Comment by Sergei Golubchik [ 2023-08-22 ] | |
|
midenok, rob.schwyzer@mariadb.com, my comment was about not logging the same information into two log files. If deadlocks are difficult to filter out from the error log, they could be printed in a form that makes grepping easy. If extensive deadlock information clutters error log and makes it difficult to see anything else there — well, perhaps deadlocks should go into a separate log file indeed. Instead of the error log, not in addition to it. | |
| Comment by Rob Schwyzer [ 2023-08-22 ] | |
|
serg, gotcha, and I agree. Again, my one recommendation would be to leave an option so they can still go into the error log if wanted, as many containerized setups in particular only expose one log file from a service like MariaDB. But I fully agree with the default mode of consumption being to put deadlocks into a separate log file. And I also agree with it being a binary choice- so if a separate log file is the default, then if a variable can switch that to push to the error log instead, it would only go to the error log- not the separate deadlock log. In short, I agree with the content only going to one log file rather than multiple. | |
| Comment by Aleksey Midenkov [ 2023-08-24 ] | |
Good point. Another reason for deadlocks in error.log might be analysing them in relation to other logging that happens in InnoDB and that can be written only to stderr to my best knowledge. Since now multi-destination output is even harder to implement I can stay with single stream option which defaults to stderr. | |
| Comment by Aleksey Midenkov [ 2023-08-29 ] | |
|
I came to conclusion that coupling of retry message and deadlock message is too complex, requires some THD API extension (there is no TABLE at time of retry logging). SNOW version has it, but in CS this should be added by additional request. | |
| Comment by Aleksey Midenkov [ 2023-09-01 ] | |
|
Updated bb-11.0-midenok on top of 11.3 | |
| Comment by Marko Mäkelä [ 2023-09-15 ] | |
|
I think that we need a branch where the regression test suite passes. The InnoDB changes are only a small part of this; there is more code outside InnoDB. | |
| Comment by Aleksey Midenkov [ 2023-10-12 ] | |
|
Please review bb-11.0-midenok | |
| Comment by Marko Mäkelä [ 2023-10-25 ] | |
|
Very many ./mtr --suite=encryption tests fail due to
|