Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9383

Server fails to read master.info after upgrade 10.0 -> 10.1

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.1.10, 10.0(EOL), 10.1(EOL)
    • 10.1.14
    • Replication
    • None
    • Linux

    Description

      I tried to upgrade the server 10.0.22 -> 10.1.10.
      Old version was shutdown, but new failed to start:

      2016-01-08 22:25:00 7f8017479700 InnoDB: Error: Column last_update in table "mysql"."innodb_table_stats" is INT UNSIGNED NOT NULL but should be BINARY(4) NOT NULL (type mismatch).
      2016-01-08 22:25:00 7f8017479700 InnoDB: Error: Fetch of persistent statistics requested for table "mysql"."gtid_slave_pos" but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
      2016-01-08 22:25:01 140188123334592 [Note] Server socket created on IP: '::'.
      2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
      2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
      2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dhk.info' failed
      2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
      2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
      2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dth.info' failed
      2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
      2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
      2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dsg.info' failed
      2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
      2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
      2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-delivery.info' failed
      2016-01-08 22:25:01 140188123334592 [Warning] Reading of some Master_info entries failed
      2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize multi master structures
      2016-01-08 22:25:01 140188123334592 [ERROR] Aborting
      

      I guess I can't use mysql_upgrade when the server is not running.

      Attachments

        1. log.txt
          23 kB
        2. mysqld.cnf
          1 kB
        3. relay.ls.txt.gz
          34 kB
        4. master-info.txt
          4 kB
        5. master.info.example
          0.1 kB
        6. MDEV-9383-100grama.sample
          0.5 kB
        7. MDEV-9383-ck.sample
          0.7 kB
        8. MDEV-9383-miss60.sample
          0.9 kB

        Activity

          ip1981 Igor Pashev created issue -
          ip1981 Igor Pashev added a comment -

          --skip-slave-start does not help

          ip1981 Igor Pashev added a comment - --skip-slave-start does not help

          ip1981,

          Please paste a bigger unabridged log fragment, covering the previous server session (before the upgrade) if you still have it, when it started and shut down normally, and continuing onto the startup attempt of the upgraded server, until it aborted. If there is any confidential information in there – IPs or whatever – you can obfuscate it.

          Please also attach your cnf file(s) and paste the output of ls -l of the directory where your binary logs, index files and such reside.

          Thanks.

          elenst Elena Stepanova added a comment - ip1981 , Please paste a bigger unabridged log fragment, covering the previous server session (before the upgrade) if you still have it, when it started and shut down normally, and continuing onto the startup attempt of the upgraded server, until it aborted. If there is any confidential information in there – IPs or whatever – you can obfuscate it. Please also attach your cnf file(s) and paste the output of ls -l of the directory where your binary logs, index files and such reside. Thanks.
          elenst Elena Stepanova made changes -
          Field Original Value New Value
          Labels need_feedback
          ip1981 Igor Pashev made changes -
          Attachment log.txt [ 41405 ]
          ip1981 Igor Pashev made changes -
          Attachment mysqld.cnf [ 41406 ]
          ip1981 Igor Pashev added a comment -

          Attached log and cnf file. I found some IRC conversation as well

          2015-10-29 11:34:36 Jippi Hi, I just upgraded a slave from 10.0.21 to 10.1.8  - looks like something went wrong with the gtid replication - the slave refuses to start up - I’m getting “ERROR 1201 (HY000): Could not initialize master info structure for ''; more error messages can be found in the MariaDB error log” when running ‘start slave’ and “[ERROR] Failed to initialize master info using_gtid” in my log file
          2015-10-29 11:34:42 Jippi anyone know what the cause for that could be
          2015-10-29 11:35:07 paulo-tada joined chan
          2015-10-29 11:35:16 Jippi my mysql.gtid_slave_pos table looks correct and with the last values before upgrade
          2015-10-29 11:36:23 Jippi selecting  @@GLOBAL.gtid_slave_pos also looks correct
          2015-10-29 11:38:10 Jippi even doing the full ‘change master …’ fail with the same error
          2015-10-29 11:39:14 Jippi hmm okay, so a “reset slave;” fixed it

          I'd tried to reproduce the issue with fresh setup, but couldn't: migration went flawlessly.

          ip1981 Igor Pashev added a comment - Attached log and cnf file. I found some IRC conversation as well 2015-10-29 11:34:36 Jippi Hi, I just upgraded a slave from 10.0.21 to 10.1.8 - looks like something went wrong with the gtid replication - the slave refuses to start up - I’m getting “ERROR 1201 (HY000): Could not initialize master info structure for ''; more error messages can be found in the MariaDB error log” when running ‘start slave’ and “[ERROR] Failed to initialize master info using_gtid” in my log file 2015-10-29 11:34:42 Jippi anyone know what the cause for that could be 2015-10-29 11:35:07 paulo-tada joined chan 2015-10-29 11:35:16 Jippi my mysql.gtid_slave_pos table looks correct and with the last values before upgrade 2015-10-29 11:36:23 Jippi selecting @@GLOBAL.gtid_slave_pos also looks correct 2015-10-29 11:38:10 Jippi even doing the full ‘change master …’ fail with the same error 2015-10-29 11:39:14 Jippi hmm okay, so a “reset slave;” fixed it I'd tried to reproduce the issue with fresh setup, but couldn't: migration went flawlessly.
          ip1981 Igor Pashev made changes -
          Attachment relay.ls.txt.gz [ 41407 ]
          ip1981 Igor Pashev made changes -
          Attachment master-info.txt [ 41408 ]
          elenst Elena Stepanova made changes -
          Labels need_feedback
          ip1981 Igor Pashev added a comment -

          One more thing. When it's started with -skip-slave-start mysqld terminated after [ERROR] Aborting. While without -skip-slave-start it keeps running, socket is created, but can't be read (mysql cli hangs), strace shows recvfrom as the last call.

          ip1981 Igor Pashev added a comment - One more thing. When it's started with - skip-slave-start mysqld terminated after [ERROR] Aborting . While without -skip-slave-start it keeps running, socket is created, but can't be read (mysql cli hangs), strace shows recvfrom as the last call.

          ip1981,

          Thank you. Could you please pack all *.info files and upload them to our FTP (ftp.askmonty.org/private)? As you can see yourself, there is nothing remotely confidential in those files, except for maybe host names/addresses (you can obfuscate those if you want); but the server fails to parse some of them, it would be useful to see why.

          Is it intentional that you have so many master connections? Seems to be a lot of garbage in there, maybe some of it stale or inconsistent.

          elenst Elena Stepanova added a comment - ip1981 , Thank you. Could you please pack all *.info files and upload them to our FTP (ftp.askmonty.org/private)? As you can see yourself, there is nothing remotely confidential in those files, except for maybe host names/addresses (you can obfuscate those if you want); but the server fails to parse some of them, it would be useful to see why. Is it intentional that you have so many master connections? Seems to be a lot of garbage in there, maybe some of it stale or inconsistent.
          elenst Elena Stepanova made changes -
          Labels need_feedback

          Please comment to re-open if you have further information on the issue.

          elenst Elena Stepanova added a comment - Please comment to re-open if you have further information on the issue.
          elenst Elena Stepanova made changes -
          Component/s OTHER [ 10125 ]
          Fix Version/s N/A [ 14700 ]
          Resolution Incomplete [ 4 ]
          Status Open [ 1 ] Closed [ 6 ]
          lukav@lukav.com Anton Avramov added a comment -

          I have the same problem trying to upgrade from ubuntu .deb packages.
          I have uploaded tar.gz to the ftp with the config folder .info files and syslog cut that represents stopping the old instance and running the new.

          I also had some old replications but have managed to downgrade and remove all of them. The problem still persists.

          lukav@lukav.com Anton Avramov added a comment - I have the same problem trying to upgrade from ubuntu .deb packages. I have uploaded tar.gz to the ftp with the config folder .info files and syslog cut that represents stopping the old instance and running the new. I also had some old replications but have managed to downgrade and remove all of them. The problem still persists.
          elenst Elena Stepanova made changes -
          Assignee Elena Stepanova [ elenst ]
          Resolution Incomplete [ 4 ]
          Status Closed [ 6 ] Stalled [ 10000 ]
          elenst Elena Stepanova made changes -
          Labels need_feedback

          I had the same problem when upgrading von 10.0 to 10.1, while an upgrade from 5.5 to 10.1 worked without problems. In both instances I uninstalled MariaDB and then installed the new version (this worked best so far). I noticed the following lines were missing in master.info in the 10.0->10.1 upgrade:

          do_domain_ids=0
          ignore_domain_ids=0
          END_MARKER

          When I added these lines, my master configuration could be read again and the replication slave worked as expected. It seems the 10.0 to 10.1 upgrade does not add these lines and then cannot read the configuration - maybe because "using_gtid=0" was already in the master configuration, while this is (of course) missing when upgrading from 5.5. I think MariaDB 10.1 should check the master.info for such a 10.0 configuration and also update it if necessary.

          iquito Andreas Leathley added a comment - I had the same problem when upgrading von 10.0 to 10.1, while an upgrade from 5.5 to 10.1 worked without problems. In both instances I uninstalled MariaDB and then installed the new version (this worked best so far). I noticed the following lines were missing in master.info in the 10.0->10.1 upgrade: do_domain_ids=0 ignore_domain_ids=0 END_MARKER When I added these lines, my master configuration could be read again and the replication slave worked as expected. It seems the 10.0 to 10.1 upgrade does not add these lines and then cannot read the configuration - maybe because "using_gtid=0" was already in the master configuration, while this is (of course) missing when upgrading from 5.5. I think MariaDB 10.1 should check the master.info for such a 10.0 configuration and also update it if necessary.
          lukav Anton Avramov added a comment -

          I confirm that the solution proposed by Mr. Leathley fix the problem for me.
          I'va installed the new package via apt-get install mariadb-server.
          The installation failed as before, on configuring the instance.
          I then manually edit master-[myslavename].info file and added:
          do_domain_ids=0
          ignore_domain_ids=0
          just after "using_gtid=0", preserving the last line.

          Then I've rerun the apt-get command and this time it was all fine so far.
          The slave continued as normal and the upgrade procedures ware finished, as far as I can tell.

          lukav Anton Avramov added a comment - I confirm that the solution proposed by Mr. Leathley fix the problem for me. I'va installed the new package via apt-get install mariadb-server. The installation failed as before, on configuring the instance. I then manually edit master- [myslavename] .info file and added: do_domain_ids=0 ignore_domain_ids=0 just after "using_gtid=0", preserving the last line. Then I've rerun the apt-get command and this time it was all fine so far. The slave continued as normal and the upgrade procedures ware finished, as far as I can tell.
          elenst Elena Stepanova added a comment - - edited

          lukav@lukav.com, thanks a lot for providing the files, it's much clearer now.

          The problem with the info file is not the three missing lines – 10.1 is perfectly fine starting without them, – but the extra empty line at the end of the file. I see it in lukav@lukav.com's archive, and apparently, iquito had it too. I get the same exact error if I insert an empty line manually into my info file.
          By adding the three lines, you got rid of the empty one, and even if you didn't, it doesn't matter anymore because it's after the END MARKER now.

          The question is where this empty line came from – I am not getting it by just firing the recent 10.0 server and shutting it down, apparently, something else caused it.

          elenst Elena Stepanova added a comment - - edited lukav@lukav.com , thanks a lot for providing the files, it's much clearer now. The problem with the info file is not the three missing lines – 10.1 is perfectly fine starting without them, – but the extra empty line at the end of the file. I see it in lukav@lukav.com 's archive, and apparently, iquito had it too. I get the same exact error if I insert an empty line manually into my info file. By adding the three lines, you got rid of the empty one, and even if you didn't, it doesn't matter anymore because it's after the END MARKER now. The question is where this empty line came from – I am not getting it by just firing the recent 10.0 server and shutting it down, apparently, something else caused it.

          Ah so the END MARKER solved it. Well I can only describe my case, where the situation was:

          Master running MariaDB 5.5.34
          Slave running MariaDB 5.5.34

          I updated the slave from 5.5.34 to 10.0.10 about two years ago, and did not change anything else as far as I remember (no CHANGE MASTER TO or any other manual changes to master.info), the slave then continued to work until a few days ago, when I changed the master (to MariaDB 10.1.13) and upgraded MariaDB on the slave from 10.0.10 to 10.1.13. When I then started the slave, the errors about master.info came up. Upgrading from 5.5.34 to 10.1.13 on other slaves worked flawlessly. When comparing I noticed that MariaDB 10.0.10 added the "using_gtid=0" in master.info, and 10.1.13 did not change master.info in any way, so I would guess 10.0.10 also added the newline at the end, which lead to the problems in 10.1.13.

          iquito Andreas Leathley added a comment - Ah so the END MARKER solved it. Well I can only describe my case, where the situation was: Master running MariaDB 5.5.34 Slave running MariaDB 5.5.34 I updated the slave from 5.5.34 to 10.0.10 about two years ago, and did not change anything else as far as I remember (no CHANGE MASTER TO or any other manual changes to master.info), the slave then continued to work until a few days ago, when I changed the master (to MariaDB 10.1.13) and upgraded MariaDB on the slave from 10.0.10 to 10.1.13. When I then started the slave, the errors about master.info came up. Upgrading from 5.5.34 to 10.1.13 on other slaves worked flawlessly. When comparing I noticed that MariaDB 10.0.10 added the "using_gtid=0" in master.info, and 10.1.13 did not change master.info in any way, so I would guess 10.0.10 also added the newline at the end, which lead to the problems in 10.1.13.
          lukav Anton Avramov added a comment -

          I can also say that this was installation regularly upgraded.
          I've check one of our clients multi master setups and can say that some of the Info files have those extra files, some doesn't.
          Some have more that one extra lines.
          Some have something like:
          using_gtid=1
          0

          Others have:
          using_gtid=1
          =0

          I can upload the files if this would help?
          But I think that obviously there was a bug at some point that introduced those newlines, and the new 10.1 should just remove or ignore them.

          lukav Anton Avramov added a comment - I can also say that this was installation regularly upgraded. I've check one of our clients multi master setups and can say that some of the Info files have those extra files, some doesn't. Some have more that one extra lines. Some have something like: using_gtid=1 0 Others have: using_gtid=1 =0 I can upload the files if this would help? But I think that obviously there was a bug at some point that introduced those newlines, and the new 10.1 should just remove or ignore them.
          elenst Elena Stepanova added a comment - - edited

          iquito, it is actually a very good story. So, you did not use multi-master replication at all? It narrows down the search considerably. Are you using parallel replication?

          lukav@lukav.com, yes, I agree that the new server should ignore erroneous lines; the problem is, if we don't know why erroneous lines appear, we don't know where the server should expect and ignore them – is it at the end of the file only, or is it after using_gtid (which is not the end of the file in 10.1, so if 10.1 still has the bug that causes those empty lines sporadically, they can appear between using_gtid and do_domain_ids), or can it be in a random place of the file at all, in which case ignoring will be way more complicated.

          elenst Elena Stepanova added a comment - - edited iquito , it is actually a very good story. So, you did not use multi-master replication at all? It narrows down the search considerably. Are you using parallel replication? lukav@lukav.com , yes, I agree that the new server should ignore erroneous lines; the problem is, if we don't know why erroneous lines appear, we don't know where the server should expect and ignore them – is it at the end of the file only, or is it after using_gtid (which is not the end of the file in 10.1, so if 10.1 still has the bug that causes those empty lines sporadically, they can appear between using_gtid and do_domain_ids), or can it be in a random place of the file at all, in which case ignoring will be way more complicated.

          No, I only use "regular" replication, with just the out-of-the-box settings - so just the default parallel replication settings (which would be "conservative" mode now, I guess), and nothing fancy at all.

          iquito Andreas Leathley added a comment - No, I only use "regular" replication, with just the out-of-the-box settings - so just the default parallel replication settings (which would be "conservative" mode now, I guess), and nothing fancy at all.

          I seems as though in both our cases the additional newline was inserted after the using_gtid line, which was added by MariaDB 10.0, that is why I would point my finger at 10.0.(10 in my case) - but of course I don't know if there are more scenarios which are possible/likely.

          iquito Andreas Leathley added a comment - I seems as though in both our cases the additional newline was inserted after the using_gtid line, which was added by MariaDB 10.0, that is why I would point my finger at 10.0.(10 in my case) - but of course I don't know if there are more scenarios which are possible/likely.
          iquito Andreas Leathley made changes -
          Attachment master.info.example [ 41992 ]

          I still have the master.info which was used in 5.5.34 (which was then upgraded to 10.0.10 and then to 10.1.13) - maybe that helps, I only anonymized IP, username and password. master.info.example

          Unfortunately I don't have the version used in 10.0.10 anymore, I just know that the only "noticeable" difference was the enable_gtid=0 (and probably the newline at the end).

          iquito Andreas Leathley added a comment - I still have the master.info which was used in 5.5.34 (which was then upgraded to 10.0.10 and then to 10.1.13) - maybe that helps, I only anonymized IP, username and password. master.info.example Unfortunately I don't have the version used in 10.0.10 anymore, I just know that the only "noticeable" difference was the enable_gtid=0 (and probably the newline at the end).
          elenst Elena Stepanova added a comment - - edited

          iquito, thank you.

          elenst Elena Stepanova added a comment - - edited iquito , thank you.
          elenst Elena Stepanova made changes -
          Status Stalled [ 10000 ] Confirmed [ 10101 ]

          Thanks all.
          Old 10.0 did all kinds of crazy things with master.info, but even the current one can produce this empty line (or something other than empty, but it seems only an empty line causes the problem).

          So, the issue is two-fold. In 10.0, do not add garbage at the end of the file beyond the line count.
          Here is how I'm getting an empty line on 10.0, reproducible on 10.0 commit f8adeccd78bff80725a95b73447e34a5f4528179.

          # Assuming a server with server-id=N --log-bin --port=3306 is already running
          # Start a fresh 10.0 server with --log-bin --server-id=2 --port=3307, otherwise defaults
          change master to master_host='127.0.0.1', master_port=3306, master_user='root';
          start slave;
          stop slave;
          change master to master_log_pos=4000;
           
          # At this point, an empty line appears in master.info after using_gtid=0
          

          The second part of the problem is 10.1 – as lukav@lukav.com said above, it shouldn't be this sensitive to a trailing empty line.

          • shutdown the 10.0 server which was used above
          • start a fresh 10.1 server with the same options on the datadir
          • behold the errors

            2016-04-07  0:53:53 140486837634944 [Note] Server socket created on IP: '::'.
            2016-04-07  0:53:53 140486837634944 [ERROR] Failed to initialize master info using_gtid
            2016-04-07  0:53:53 140486837634944 [ERROR] Error reading master configuration
            2016-04-07  0:53:53 140486837634944 [ERROR] Failed to initialize the master info structure
            2016-04-07  0:53:53 140486837634944 [ERROR] Failed to allocate memory for the Master Info structure
            2016-04-07  0:53:53 140486837634944 [Note] /bld/10.1/bin/mysqld: ready for connections.
            Version: '10.1.13-MariaDB-debug'  socket: '/bld/10.1/data/tmp/mysql.sock'  port: 3307  Source distribution
            

            Reproducible on 10.1 commit 4b6a3518e4dc9088d1f42cd9bc487d06137d2760.

          elenst Elena Stepanova added a comment - Thanks all. Old 10.0 did all kinds of crazy things with master.info, but even the current one can produce this empty line (or something other than empty, but it seems only an empty line causes the problem). So, the issue is two-fold. In 10.0, do not add garbage at the end of the file beyond the line count. Here is how I'm getting an empty line on 10.0, reproducible on 10.0 commit f8adeccd78bff80725a95b73447e34a5f4528179. # Assuming a server with server-id=N --log-bin --port=3306 is already running # Start a fresh 10.0 server with --log-bin --server-id=2 --port=3307, otherwise defaults change master to master_host= '127.0.0.1' , master_port=3306, master_user= 'root' ; start slave; stop slave; change master to master_log_pos=4000;   # At this point, an empty line appears in master.info after using_gtid=0 The second part of the problem is 10.1 – as lukav@lukav.com said above, it shouldn't be this sensitive to a trailing empty line. shutdown the 10.0 server which was used above start a fresh 10.1 server with the same options on the datadir behold the errors 2016-04-07 0:53:53 140486837634944 [Note] Server socket created on IP: '::'. 2016-04-07 0:53:53 140486837634944 [ERROR] Failed to initialize master info using_gtid 2016-04-07 0:53:53 140486837634944 [ERROR] Error reading master configuration 2016-04-07 0:53:53 140486837634944 [ERROR] Failed to initialize the master info structure 2016-04-07 0:53:53 140486837634944 [ERROR] Failed to allocate memory for the Master Info structure 2016-04-07 0:53:53 140486837634944 [Note] /bld/10.1/bin/mysqld: ready for connections. Version: '10.1.13-MariaDB-debug' socket: '/bld/10.1/data/tmp/mysql.sock' port: 3307 Source distribution Reproducible on 10.1 commit 4b6a3518e4dc9088d1f42cd9bc487d06137d2760.
          elenst Elena Stepanova made changes -
          Component/s Replication [ 10100 ]
          Component/s OTHER [ 10125 ]
          Fix Version/s 10.0 [ 16000 ]
          Fix Version/s 10.1 [ 16100 ]
          Fix Version/s N/A [ 14700 ]
          Affects Version/s 10.0 [ 16000 ]
          Affects Version/s 10.1 [ 16100 ]
          Assignee Elena Stepanova [ elenst ] Kristian Nielsen [ knielsen ]
          Environment Linux, Multi-source replication Linux
          ip1981 Igor Pashev made changes -
          Description I tried to upgrade the server 10.0.22 -> 10.1.10.
          Old version was shutdown, but new failed to start:
          {noformat}
          2016-01-08 22:25:00 7f8017479700 InnoDB: Error: Column last_update in table "mysql"."innodb_table_stats" is INT UNSIGNED NOT NULL but should be BINARY(4) NOT NULL (type mismatch).
          2016-01-08 22:25:00 7f8017479700 InnoDB: Error: Fetch of persistent statistics requested for table "mysql"."gtid_slave_pos" but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
          2016-01-08 22:25:01 140188123334592 [Note] Server socket created on IP: '::'.
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dhk.info' failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dth.info' failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dsg.info' failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-delivery.info' failed
          2016-01-08 22:25:01 140188123334592 [Warning] Reading of some Master_info entries failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize multi master structures
          2016-01-08 22:25:01 140188123334592 [ERROR] Aborting
          {noformat}

          I guess I can't user mysql_upgrade when the server is not running.
          I tried to upgrade the server 10.0.22 -> 10.1.10.
          Old version was shutdown, but new failed to start:
          {noformat}
          2016-01-08 22:25:00 7f8017479700 InnoDB: Error: Column last_update in table "mysql"."innodb_table_stats" is INT UNSIGNED NOT NULL but should be BINARY(4) NOT NULL (type mismatch).
          2016-01-08 22:25:00 7f8017479700 InnoDB: Error: Fetch of persistent statistics requested for table "mysql"."gtid_slave_pos" but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
          2016-01-08 22:25:01 140188123334592 [Note] Server socket created on IP: '::'.
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dhk.info' failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dth.info' failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-bob@002dsg.info' failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize master info using_gtid
          2016-01-08 22:25:01 140188123334592 [ERROR] Error reading master configuration
          2016-01-08 22:25:01 140188123334592 [ERROR] Initialized Master_info from 'master-delivery.info' failed
          2016-01-08 22:25:01 140188123334592 [Warning] Reading of some Master_info entries failed
          2016-01-08 22:25:01 140188123334592 [ERROR] Failed to initialize multi master structures
          2016-01-08 22:25:01 140188123334592 [ERROR] Aborting
          {noformat}

          I guess I can't use mysql_upgrade when the server is not running.

          Seems the bug was introduced with this commit:

          commit a50ddebb5cfa7b79540d155e8e41c7a07c4c3fbf
          Author: Nirbhay Choubey <nirbhay@mariadb.com>
          Date:   Wed Dec 3 22:30:48 2014 -0500
           
              MDEV-6593 : domain_id based replication filters
              
              Implementation for domain ID based filtering of replication events.
          

          When the function read_mi_key_from_file() sees an empty line, it handles it
          incorrectly and returns whatever randomly happened to be the old key (which
          is probably "using_gtid"). Thus the code tries to parse a phantom line
          "using_gtid=", and fails.

          (There seems to be more potential problems with this code? For example, what
          happens if it sees a line like "using_gtid" or "do_domain_ids", without any
          "=" sign?)

          knielsen Kristian Nielsen added a comment - Seems the bug was introduced with this commit: commit a50ddebb5cfa7b79540d155e8e41c7a07c4c3fbf Author: Nirbhay Choubey <nirbhay@mariadb.com> Date: Wed Dec 3 22:30:48 2014 -0500   MDEV-6593 : domain_id based replication filters Implementation for domain ID based filtering of replication events. When the function read_mi_key_from_file() sees an empty line, it handles it incorrectly and returns whatever randomly happened to be the old key (which is probably "using_gtid"). Thus the code tries to parse a phantom line "using_gtid=", and fails. (There seems to be more potential problems with this code? For example, what happens if it sees a line like "using_gtid" or "do_domain_ids", without any "=" sign?)

          BTW, the issue with leaving extra stuff at the end of master.info is fixed in 10.1 with END_MARKER.

          knielsen Kristian Nielsen added a comment - BTW, the issue with leaving extra stuff at the end of master.info is fixed in 10.1 with END_MARKER.
          lukav Anton Avramov made changes -
          Attachment MDEV-9383-100grama.sample [ 41993 ]
          Attachment MDEV-9383-ck.sample [ 41994 ]
          Attachment MDEV-9383-miss60.sample [ 41995 ]
          lukav Anton Avramov added a comment -

          MDEV-9383-100grama.sample MDEV-9383-ck.sample MDEV-9383-miss60.sample

          I've compiled the ending of the .info files in some of our clients multi-masters.
          I've used the command: tail n 6 /var/lib/mysql/master*.info > /tmp/MDEV-9383-[host].sample

          I think this would give you a good samples of the "damages" the bug made

          lukav Anton Avramov added a comment - MDEV-9383-100grama.sample MDEV-9383-ck.sample MDEV-9383-miss60.sample I've compiled the ending of the .info files in some of our clients multi-masters. I've used the command: tail n 6 /var/lib/mysql/master *.info > /tmp/ MDEV-9383 - [host] .sample I think this would give you a good samples of the "damages" the bug made
          knielsen Kristian Nielsen made changes -
          Summary Upgrade 10.0.22 -> 10.1.10: unable to start Server fails to read master.info after upgrade 10.0 -> 10.1
          knielsen Kristian Nielsen added a comment - Suggested fix: http://lists.askmonty.org/pipermail/commits/2016-April/009251.html

          BTW, what happens here is that when a new master.info file is written, it
          does not truncate the file (for performance reasons).

          So if the new file happens to be shorter, extra junk is left at the
          end. This has always been so, but it has become more of a problem now that
          MySQL and MariaDB are each adding new lines to the file without
          coordinating.

          In 10.1, a line "END_MARKER" is added at the bottom to easily avoid reading
          any left-over junk, but this change was deemed too dangerous for stable 10.0
          (which is fine, since 10.0 is not going to add anothing more to
          master.info).

          But then apparently some bugs were introduced with the previously mentioned
          patch, which could cause 10.1 to incorrectly interpret some cases of extra
          junk left by 10.0, causing the problems seen here. In particular, an empty
          line was treated as an empty "using_gtid=" line, causing the failure.

          An easy work-around, until a fixed 10.1 release becomes available, is to
          just remove anything after the "using_gtid=X" line (but do not remove the
          newline character at the end of that line).

          knielsen Kristian Nielsen added a comment - BTW, what happens here is that when a new master.info file is written, it does not truncate the file (for performance reasons). So if the new file happens to be shorter, extra junk is left at the end. This has always been so, but it has become more of a problem now that MySQL and MariaDB are each adding new lines to the file without coordinating. In 10.1, a line "END_MARKER" is added at the bottom to easily avoid reading any left-over junk, but this change was deemed too dangerous for stable 10.0 (which is fine, since 10.0 is not going to add anothing more to master.info). But then apparently some bugs were introduced with the previously mentioned patch, which could cause 10.1 to incorrectly interpret some cases of extra junk left by 10.0, causing the problems seen here. In particular, an empty line was treated as an empty "using_gtid=" line, causing the failure. An easy work-around, until a fixed 10.1 release becomes available, is to just remove anything after the "using_gtid=X" line (but do not remove the newline character at the end of that line).

          Pushed to 10.1

          knielsen Kristian Nielsen added a comment - Pushed to 10.1
          knielsen Kristian Nielsen made changes -
          Fix Version/s 10.1.14 [ 21804 ]
          Fix Version/s 10.0 [ 16000 ]
          Fix Version/s 10.1 [ 16100 ]
          Resolution Fixed [ 1 ]
          Status Confirmed [ 10101 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 73534 ] MariaDB v4 [ 149983 ]

          People

            knielsen Kristian Nielsen
            ip1981 Igor Pashev
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.