[MDEV-17423] failed tc log preventing mariadb from starting - error message to be improved Created: 2018-10-10  Updated: 2023-04-27

Status: Confirmed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.1.26
Fix Version/s: 10.4

Type: Bug
Priority: Major
Reporter: Paolo Benvenuto
Assignee: Oleksandr Byelkin
Resolution: Unresolved
Votes: 0
Labels: None
Environment: Debian stretch

 Description   

My main disk partition was full, and it seems that MariaDB crashed; when it tried to write tc.log, nothing more than a zero-length file was saved.

On the next server start, MariaDB failed to start:

2018-10-10 21:13:01 139737626755648 [Note] Recovering after a crash using tc.log
2018-10-10 21:13:01 139737626755648 [ERROR] Can't init tc log
2018-10-10 21:13:01 139737626755648 [ERROR] Aborting

I had to manually delete /var/lib/mysql/tc.log.

However, it seems a trivial change to let MariaDB check tc.log's length before trying to init it: a zero-length tc.log can safely be deleted by MariaDB, and the server can then start.



 Comments   
Comment by Elena Stepanova [ 2018-10-13 ]

On the other hand, it makes sense to alert the users about the hopelessly corrupt tc.log and let them deal with the problem manually (for example, maybe they do need it and have a backup of it)?

Comment by Elena Stepanova [ 2018-11-10 ]

serg, any opinion?

Comment by Sergei Golubchik [ 2018-11-11 ]

You mean, instead of "Can't init tc log", say something like "tc.log is hopelessly corrupt, manual intervention is required"? That's ok.

But automatically ignoring a corrupt tc.log is wrong; it might cause inconsistent data.

Comment by Elena Stepanova [ 2018-11-11 ]

In my comment I only meant that even though it's technically easy to ignore a zero-sized log, I don't think it's a good idea, because users should know that something went wrong and deal with it.
I didn't mean any specific changes in logging, although yes, the "Can't init tc log" message often causes confusion; if it can be improved, that would be great.

Comment by Geoff Montee (Inactive) [ 2019-03-25 ]

I think it would probably be helpful to have some sql_print_error calls added to the TC_LOG_MMAP::open function that provide more information on the cause of failures when the init process fails. Currently, only one of the "goto err" calls that appear in that function is preceded by an sql_print_error call with a descriptive error message.

https://github.com/MariaDB/server/blob/1ef50a34ec53d0e3e43776f414dd99f343d5d6ba/sql/log.cc#L9061
