[MDEV-6623] Key corruption on upgrade to v10.0.13
| Created: | 2014-08-21 | Updated: | 2014-10-22 | Resolved: | 2014-10-22 |
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - Aria |
| Affects Version/s: | 10.0.13 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Charles C | Assignee: | Elena Stepanova |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Environment: | Ubuntu "Trusty" 14.04 |
| Attachments: |
|
| Description |
|
Running MariaDB from the deb repository. We have been running v10.0.x for 6 months or so. This week, I updated the server to 10.0.13. We immediately started seeing problems at the application layer with apparently bad or missing data. Downgrading to 10.0.12 (and .11, and .10) did not fix the issue, which seemed to indicate that the data itself was bad. I saw indications that the server was possibly corrupting keys: merely applying a limit caused a query to return no rows (id is the primary key on this table).
Still running 10.0.10, I then dumped the table contents via mysqldump and reloaded them, and the problems disappeared. We're using Aria for most of the tables, including the affected one. |
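The symptom described above, where adding a limit to a primary-key lookup changes which rows come back, can be checked mechanically. What follows is a minimal Python sketch under stated assumptions, not the reporter's actual query (which is not preserved in this ticket). The function name `limit_changes_result` is hypothetical, the `?` placeholder style assumes a DB-API driver that uses qmark parameters, and the in-memory SQLite table in the demo is only a stand-in so the sketch runs without a MariaDB server.

```python
import re
import sqlite3

# Identifiers cannot be bound as query parameters, so validate them instead.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def limit_changes_result(cursor, table, pk_column, pk_value):
    """Return True if adding LIMIT 1 changes the result of a primary-key lookup.

    `cursor` is any DB-API cursor accepting qmark ("?") placeholders
    (an assumption; adjust the placeholder for other drivers).
    """
    for name in (table, pk_column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    base = f"SELECT {pk_column} FROM {table} WHERE {pk_column} = ?"

    cursor.execute(base, (pk_value,))
    without_limit = cursor.fetchall()

    cursor.execute(base + " LIMIT 1", (pk_value,))
    with_limit = cursor.fetchall()

    # On a healthy table, a lookup on a unique key returns the same first row
    # whether or not LIMIT 1 is applied.
    return without_limit[:1] != with_limit


if __name__ == "__main__":
    # Stand-in data: an in-memory SQLite table, not an Aria table.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL)")
    cur.execute("INSERT INTO invoices (id, total) VALUES (?, ?)", (1, 10.0))
    print(limit_changes_result(cur, "invoices", "id", 1))
```

A `True` result for any key would only show that the two lookups disagree. It would not identify the cause, and a `False` result does not prove the index is intact.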
| Comments |
| Comment by Elena Stepanova [ 2014-08-21 ] |
|
Hi, Were there any errors in the server error log on server startup and/or on executing the problematic queries when the problem was observed? Thanks. |
| Comment by Charles C [ 2014-08-21 ] |
|
Letting apt-get update the server resulted in the db server restarting a couple of times (iirc), and I also restarted it by hand once when I saw the weird errors happening. Looking in the error log at the time didn't seem to show any errors, but I've just gone over it again. In the middle of one of those startups (I'm not sure which one it was) from the time I did the upgrade, this is the only error:
That's followed by a normal startup, so whatever that problem was, it was transient. I did a CHECK TABLE EXTENDED on the invoices table at the time (as well as some others), and it showed the normal "OK" status, like:
Table schema etc. follows. Note this is after I downgraded to 10.0.10 and did a dump/load of this (and the other) tables, so the data corruption is already fixed by this point.
|
| Comment by Elena Stepanova [ 2014-08-22 ] |
|
Thank you. Please also attach your cnf file(s). |
| Comment by Charles C [ 2014-08-22 ] |
|
my.cnf plus conf.d/*.cnf files |
| Comment by Charles C [ 2014-08-22 ] |
|
My configuration dir's contents attached - I've only anonymized the company name, a staff member's name, and the IP address of the server. One other thing that might be relevant is that the data directory is on a PCI-express based SSD, not on spinning discs, so it has very low latency. The filesystem the data is on is also XFS, not ext4 or whatever else. |
| Comment by Elena Stepanova [ 2014-10-15 ] |
|
I've been trying to reproduce it, but no luck. Do you happen to know with which version the table was initially created, and with which version it was last updated before the problem happened? Even though I believe the problem was real, without any way to reproduce it there isn't much we can do. And reproducing is tricky if the only visible effect of data corruption is a wrong result set: even if I actually hit the problem, I won't know unless it translates into a wrong result on my query/data. |
| Comment by Charles C [ 2014-10-15 ] |
|
All installations were from the MariaDB deb repo. The first version of MariaDB I installed was 10.0.10: mariadb-server:amd64 (10.0.10+maria-1~trusty). However, the server was upgraded to 10.0.11 and then 10.0.12 before these tables were created (or at least the tables were dropped and recreated after that upgrade to 10.0.12+maria-1~trusty). If I'm wrong about the tables being recreated after the 10.0.12 upgrade, then it would have been under 10.0.11. Everything was running apparently normally on 10.0.12. The tables were in flux with reads and writes on this version. Then, on 20 August, I upgraded the server to 10.0.13+maria-1~trusty, and that's when I started seeing the key corruption and incorrect query results immediately after the upgrade. |
| Comment by Elena Stepanova [ 2014-10-22 ] |
|
I also tried with the previous 10.0.x version, but no luck. Some bugs around strange status of keys, recovery, or data in Aria: |