[MDEV-8537] Damaged binlog index Created: 2015-07-24 Updated: 2015-07-27 Resolved: 2015-07-27 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | OTHER |
| Affects Version/s: | 10.0.20 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Igor Pashev | Assignee: | Unassigned |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
AWS EC2 / NixOS / Linux-amd64 |
||
| Attachments: |
|
| Description |
|
A few times we have ended up with a damaged binlog index on two of our servers running MariaDB 10.0.20. I can't pin down anything exact, but here is what we saw. Once the index looked fine but probably contained invisible characters, and none of the master commands worked (RESET MASTER, SHOW MASTER STATUS). Another time the index contained some binary data (attached). This happened on both the master and the slave (both run the same version, 10.0.20, with the same config). Yet another time the index on the master had something on its first line, and mysqldump --master-data=1/2 triggered a SIGSEGV every time:
mysqld printed something weird on stderr (attached). |
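The "invisible symbols" case can be checked mechanically. The following is a minimal sketch, assuming the binlog index is a plain text file with one binlog file name per line and that anything outside printable ASCII counts as suspicious (a legitimately non-ASCII path would also be flagged). The function names `find_suspicious_lines` and `check_index_file` are hypothetical helpers, not part of MariaDB.

```python
def find_suspicious_lines(data: bytes):
    """Return (line_number, raw_line) pairs containing non-printable bytes.

    Assumption: every valid index entry consists only of printable ASCII
    (0x20..0x7E). Control characters, NUL bytes, stray carriage returns
    and other binary data are reported.
    """
    suspicious = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        if any(b < 0x20 or b > 0x7E for b in line):
            suspicious.append((lineno, line))
    return suspicious


def check_index_file(path):
    """Read an index file in binary mode and scan it for suspicious lines."""
    with open(path, "rb") as fh:
        return find_suspicious_lines(fh.read())
```

Calling `check_index_file` on the index file path (whatever it is in the given setup) returns an empty list for a clean file. Reading in binary mode is deliberate, so that undecodable bytes are reported instead of raising a decoding error.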
| Comments |
| Comment by Elena Stepanova [ 2015-07-27 ] |
|
Judging by the logs, you have more problems than just binlog index corruption, and they appear to be related. Do you have some kind of wrapper around mysqld / mysqld_safe which starts/stops/restarts the server? For example, where do these records in the error log come from?
and more importantly
The obvious problem (which might or might not be the cause of the binlog issue, but there is a good chance it is) is that your server does not shut down properly upon restart, so during restart you have concurrent use of essential files. The binlog index may be one of them; if both the "old" and the "new" instances write to it at once, it's not surprising that it ends up with garbage. There are visible indications that the concurrency problem does exist.
So, the normal shutdown has just started and has not completed (it would have said "Shutdown complete"). But somehow, something decides it's "ready to mysql" and starts a new server. Apparently, it does not even SIGKILL the previous server before doing so (that would have been a bad idea, but starting a new one without stopping the old one is even worse):
As the log says, something is still using the aria control file. Given the above, it's apparently the old process which is still shutting down. There are other weird artefacts which might be caused by the same issue:
etc. I suggest getting rid of this problem (e.g. by fixing the wrapper so that it waits for the shutdown to finish, with all proper checks on the PID), then starting fresh to see whether the binlog index corruption re-appears. If the wrapper is something we provide that I'm not aware of, please let me know. |
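The suggestion to wait for the shutdown to finish could look roughly like the sketch below. It assumes a POSIX system and that the wrapper already knows the PID of the old mysqld process; `pid_is_alive` and `wait_for_exit` are hypothetical helper names, and the timeout values are placeholders rather than recommendations.

```python
import os
import time


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def wait_for_exit(pid: int, timeout: float = 300.0, poll_interval: float = 0.5) -> bool:
    """Block until the process is gone; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while pid_is_alive(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

A wrapper would start the new server only after `wait_for_exit(old_pid)` returns True. If the wrapper is the parent of mysqld, reaping the child with `os.waitpid` is the more reliable check, because an exited but unreaped child still appears alive to `os.kill(pid, 0)`.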
| Comment by Igor Pashev [ 2015-07-27 ] |
|
Yes, you are right. Our server is running under a supervisor, which itself is under a super-supervisor. Thank you for your analysis! I'd close this case as "invalid". |
| Comment by Elena Stepanova [ 2015-07-27 ] |
|
Closing as not-a-bug for now. If the issue with the binlog index corruption re-appears on a cleaner setup, please comment to re-open. |