[MDEV-9383] Server fails to read master.info after upgrade 10.0 -> 10.1 Created: 2016-01-08 Updated: 2016-04-08 Resolved: 2016-04-08 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.1.10, 10.0, 10.1 |
| Fix Version/s: | 10.1.14 |
| Type: | Bug | Priority: | Major |
| Reporter: | Igor Pashev | Assignee: | Kristian Nielsen |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Environment: | Linux |
| Attachments: |
|
| Description |
|
I tried to upgrade the server 10.0.22 -> 10.1.10.
I guess I can't use mysql_upgrade when the server is not running. |
| Comments |
| Comment by Igor Pashev [ 2016-01-08 ] | |||||||||||||||
|
--skip-slave-start does not help | |||||||||||||||
| Comment by Elena Stepanova [ 2016-01-11 ] | |||||||||||||||
|
Please paste a bigger unabridged log fragment, covering the previous server session (before the upgrade) if you still have it, when it started and shut down normally, and continuing onto the startup attempt of the upgraded server, until it aborted. If there is any confidential information in there – IPs or whatever – you can obfuscate it. Please also attach your cnf file(s) and paste the output of ls -l of the directory where your binary logs, index files and such reside. Thanks. | |||||||||||||||
| Comment by Igor Pashev [ 2016-01-21 ] | |||||||||||||||
|
Attached log and cnf file. I found some IRC conversation as well.
I tried to reproduce the issue with a fresh setup, but couldn't: the migration went flawlessly. | |||||||||||||||
| Comment by Igor Pashev [ 2016-01-21 ] | |||||||||||||||
|
One more thing. When it's started with - | |||||||||||||||
| Comment by Elena Stepanova [ 2016-01-22 ] | |||||||||||||||
|
Thank you. Could you please pack all *.info files and upload them to our FTP (ftp.askmonty.org/private)? As you can see yourself, there is nothing remotely confidential in those files, except for maybe host names/addresses (you can obfuscate those if you want); but the server fails to parse some of them, it would be useful to see why. Is it intentional that you have so many master connections? Seems to be a lot of garbage in there, maybe some of it stale or inconsistent. | |||||||||||||||
| Comment by Elena Stepanova [ 2016-02-19 ] | |||||||||||||||
|
Please comment to re-open if you have further information on the issue. | |||||||||||||||
| Comment by Anton Avramov [ 2016-03-17 ] | |||||||||||||||
|
I have the same problem trying to upgrade from Ubuntu .deb packages. I also had some old replication setups, but I managed to downgrade and remove all of them. The problem still persists. | |||||||||||||||
| Comment by Andreas Leathley [ 2016-04-03 ] | |||||||||||||||
|
I had the same problem when upgrading from 10.0 to 10.1, while an upgrade from 5.5 to 10.1 worked without problems. In both instances I uninstalled MariaDB and then installed the new version (this worked best so far). I noticed the following lines were missing in master.info in the 10.0->10.1 upgrade: do_domain_ids=0 When I added these lines, my master configuration could be read again and the replication slave worked as expected. It seems the 10.0 to 10.1 upgrade does not add these lines and then cannot read the configuration - maybe because "using_gtid=0" was already in the master configuration, while this is (of course) missing when upgrading from 5.5. I think MariaDB 10.1 should check the master.info for such a 10.0 configuration and also update it if necessary. | |||||||||||||||
| Comment by Anton Avramov [ 2016-04-04 ] | |||||||||||||||
|
I confirm that the solution proposed by Mr. Leathley fixes the problem for me. I then reran the apt-get command, and this time everything has been fine so far. | |||||||||||||||
| Comment by Elena Stepanova [ 2016-04-06 ] | |||||||||||||||
|
lukav@lukav.com, thanks a lot for providing the files, it's much clearer now. The problem with the info file is not the three missing lines – 10.1 is perfectly fine starting without them, – but the extra empty line at the end of the file. I see it in lukav@lukav.com's archive, and apparently, iquito had it too. I get the same exact error if I insert an empty line manually into my info file. The question is where this empty line came from – I am not getting it by just firing the recent 10.0 server and shutting it down, apparently, something else caused it. | |||||||||||||||
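To check an info file for the condition described above, something along these lines could be used. It is only a sketch: the function name `count_trailing_blank_lines` is made up for illustration, and it looks only for empty lines at the very end of the text, which is one of several places stray lines might appear.

```python
def count_trailing_blank_lines(text: str) -> int:
    """Return how many empty (or whitespace-only) lines end the given text.

    A normal final newline does not count: "a\nb\n" has zero trailing
    blank lines, while "a\nb\n\n" has one.
    """
    count = 0
    # splitlines() does not produce an extra element for a final newline,
    # so only genuinely empty lines show up as "" here.
    for line in reversed(text.splitlines()):
        if line.strip():
            break
        count += 1
    return count


if __name__ == "__main__":
    healthy = "using_gtid=0\n"
    damaged = "using_gtid=0\n\n"
    print(count_trailing_blank_lines(healthy))
    print(count_trailing_blank_lines(damaged))
```

To inspect a real file, read its contents first (for example with `open(path).read()`) and pass the text to the function; a non-zero result means the file ends with at least one empty line.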
| Comment by Andreas Leathley [ 2016-04-06 ] | |||||||||||||||
|
Ah, so the END_MARKER solved it. Well, I can only describe my case, where the situation was as follows. The master was running MariaDB 5.5.34. I updated the slave from 5.5.34 to 10.0.10 about two years ago and did not change anything else as far as I remember (no CHANGE MASTER TO or any other manual changes to master.info). The slave then continued to work until a few days ago, when I changed the master (to MariaDB 10.1.13) and upgraded MariaDB on the slave from 10.0.10 to 10.1.13. When I then started the slave, the errors about master.info came up. Upgrading from 5.5.34 to 10.1.13 on other slaves worked flawlessly. When comparing, I noticed that MariaDB 10.0.10 added the "using_gtid=0" in master.info, and 10.1.13 did not change master.info in any way, so I would guess 10.0.10 also added the newline at the end, which led to the problems in 10.1.13. | |||||||||||||||
| Comment by Anton Avramov [ 2016-04-06 ] | |||||||||||||||
|
I can also say that this was a regularly upgraded installation. Others have: I can upload the files if this would help? | |||||||||||||||
| Comment by Elena Stepanova [ 2016-04-06 ] | |||||||||||||||
|
iquito, it is actually a very good story. So, you did not use multi-master replication at all? It narrows down the search considerably. Are you using parallel replication? lukav@lukav.com, yes, I agree that the new server should ignore erroneous lines; the problem is, if we don't know why erroneous lines appear, we don't know where the server should expect and ignore them – is it at the end of the file only, or is it after using_gtid (which is not the end of the file in 10.1, so if 10.1 still has the bug that causes those empty lines sporadically, they can appear between using_gtid and do_domain_ids), or can it be in a random place of the file at all, in which case ignoring will be way more complicated. | |||||||||||||||
| Comment by Andreas Leathley [ 2016-04-06 ] | |||||||||||||||
|
No, I only use "regular" replication, with just the out-of-the-box settings - so just the default parallel replication settings (which would be "conservative" mode now, I guess), and nothing fancy at all. | |||||||||||||||
| Comment by Andreas Leathley [ 2016-04-06 ] | |||||||||||||||
|
It seems as though in both our cases the additional newline was inserted after the using_gtid line, which was added by MariaDB 10.0. That is why I would point my finger at 10.0 (10.0.10 in my case) - but of course I don't know if there are more scenarios which are possible/likely. | |||||||||||||||
| Comment by Andreas Leathley [ 2016-04-06 ] | |||||||||||||||
|
I still have the master.info which was used in 5.5.34 (which was then upgraded to 10.0.10 and then to 10.1.13) - maybe that helps, I only anonymized IP, username and password. master.info.example Unfortunately I don't have the version used in 10.0.10 anymore, I just know that the only "noticeable" difference was the using_gtid=0 (and probably the newline at the end). | |||||||||||||||
| Comment by Elena Stepanova [ 2016-04-06 ] | |||||||||||||||
|
iquito, thank you. | |||||||||||||||
| Comment by Elena Stepanova [ 2016-04-06 ] | |||||||||||||||
|
Thanks all. So, the issue is two-fold. In 10.0, do not add garbage at the end of the file beyond the line count.
The second part of the problem is 10.1 – as lukav@lukav.com said above, it shouldn't be this sensitive to a trailing empty line.
| |||||||||||||||
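The second point, a reader that is not thrown off by a stray empty line, can be illustrated with a small sketch. This is not the server's parsing code. It assumes only what the thread shows: that the tail of the info file holds `key=value` lines such as `using_gtid=0` and `do_domain_ids=0`, and that 10.1 ends the file with an `END_MARKER` line. The function name `parse_key_value_tail` is hypothetical.

```python
def parse_key_value_tail(lines, end_marker="END_MARKER"):
    """Parse key=value lines, skipping empty lines and stopping at end_marker.

    Anything after the marker is treated as stale junk and never read.
    Lines without "=" are skipped here; a real reader might prefer to
    report them instead.
    """
    options = {}
    for line in lines:
        stripped = line.strip()
        if stripped == end_marker:
            break  # everything past the marker is ignored
        if not stripped:
            continue  # tolerate empty lines wherever they appear
        key, sep, value = stripped.partition("=")
        if not sep:
            continue  # not a key=value line
        options[key] = value
    return options


if __name__ == "__main__":
    tail = ["using_gtid=0", "", "do_domain_ids=0", "END_MARKER", "junk=1"]
    print(parse_key_value_tail(tail))
```

Skipping empty lines anywhere in the section is the most lenient choice. Whether that is appropriate depends on where stray lines can actually appear, which the thread has not yet settled.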
| Comment by Kristian Nielsen [ 2016-04-07 ] | |||||||||||||||
|
Seems the bug was introduced with this commit:
When the function read_mi_key_from_file() sees an empty line, it handles it (There seems to be more potential problems with this code? For example, what | |||||||||||||||
| Comment by Kristian Nielsen [ 2016-04-07 ] | |||||||||||||||
|
BTW, the issue with leaving extra stuff at the end of master.info is fixed in 10.1 with END_MARKER. | |||||||||||||||
| Comment by Anton Avramov [ 2016-04-07 ] | |||||||||||||||
|
MDEV-9383-100grama.sample I've compiled the endings of the .info files from some of our clients' multi-master setups. I think this will give you a good sample of the "damage" the bug caused. | |||||||||||||||
| Comment by Kristian Nielsen [ 2016-04-07 ] | |||||||||||||||
|
Suggested fix: http://lists.askmonty.org/pipermail/commits/2016-April/009251.html | |||||||||||||||
| Comment by Kristian Nielsen [ 2016-04-07 ] | |||||||||||||||
|
BTW, what happens here is that when a new master.info file is written, it So if the new file happens to be shorter, extra junk is left at the In 10.1, a line "END_MARKER" is added at the bottom to easily avoid reading But then apparently some bugs were introduced with the previously mentioned An easy work-around, until a fixed 10.1 release becomes available, is to | |||||||||||||||
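The comment above is cut off in several places in this export. Read together with the earlier remark about extra stuff being left at the end of master.info, one plausible reading is that the file is rewritten in place without being truncated. The sketch below illustrates that general effect with an ordinary temporary file, under that assumption. The helper names and file contents are invented for the demonstration, and none of this is MariaDB code.

```python
import os
import tempfile


def overwrite_without_truncate(path, content):
    """Write content at the start of an existing file, keeping its old length."""
    # "r+b" opens for reading and writing without truncating the file,
    # so any old bytes beyond len(content) stay where they are.
    with open(path, "r+b") as f:
        f.write(content)


def read_until_marker(path, marker=b"END_MARKER"):
    """Return the lines that come before the marker line, ignoring the rest."""
    lines = []
    with open(path, "rb") as f:
        for raw in f:
            line = raw.rstrip(b"\n")
            if line == marker:
                break
            lines.append(line)
    return lines


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "master.info.demo")
        # Older, longer version of the file.
        with open(path, "wb") as f:
            f.write(b"line1\nline2\nline3-is-long\n")
        # Newer, shorter version written over it in place.
        overwrite_without_truncate(path, b"new1\nEND_MARKER\n")
        with open(path, "rb") as f:
            raw = f.read()
        return raw, read_until_marker(path)


if __name__ == "__main__":
    raw, parsed = demo()
    print(raw)     # the tail of the old content is still present
    print(parsed)  # a reader that stops at the marker never sees it
```

The file keeps its old length after the shorter rewrite, so the tail of the old content survives. A reader that stops at an end marker is unaffected by that tail, while a reader that consumes the whole file sees it as garbage.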
| Comment by Kristian Nielsen [ 2016-04-08 ] | |||||||||||||||
|
Pushed to 10.1 |