MariaDB Server / MDEV-6459

max_relay_log_size and sql_slave_skip_counter misbehave on PPC64

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 10.0.13
    • 10.0.14
    • None
    • None
    • PPC64 RHEL 6.5

    Description

      The following tests fail on PPC64 due to misbehaving variables: multi_source.skip_counter, rpl.rpl_auto_increment, rpl.rpl_mdev6020, rpl.rpl_skip_replication, rpl.rpl_stm_max_relay_size, sys_vars.max_relay_log_size_basic, sys_vars.sql_slave_skip_counter_basic.

      BB link: http://buildbot.askmonty.org/buildbot/builders/bintar-rhel6-p8/builds/211/steps/test/logs/stdio

      Most failures look like the following:

      @@ -21,17 +21,17 @@
       set global sql_slave_skip_counter = 2;
       select @@global.sql_slave_skip_counter;
       @@global.sql_slave_skip_counter
      -2
      +8589934592

          Activity

            Kristian, please review the fix for this bug.

            The patch has been pushed to 10.0.13:

            revno: 4293
            revision-id: svoj@mariadb.org-20140718154521-mwoz6ezimga0axcj
            parent: svoj@mariadb.org-20140718111625-uch1ssbh8kf6i4ib
            committer: Sergey Vojtovich <svoj@mariadb.org>
            branch nick: 10.0
            timestamp: Fri 2014-07-18 19:45:21 +0400
            message:
              MDEV-6459 - max_relay_log_size and sql_slave_skip_counter
                          misbehave on PPC64
              
              There was a mix of ulong and uint casts/variables which caused
              incorrect value to be passed to/retrieved from max_relay_log_size
              and sql_slave_skip_counter.
              
              This mix failed to work on big-endian PPC64 where sizeof(int)= 4,
              sizeof(long)= 8. E.g. session_var(thd, uint)= 1 will in fact store
              0x100000000.

            svoj Sergey Vojtovich added a comment.
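            The failure mode described in the commit message can be reproduced without PPC64 hardware. The following is a minimal sketch, not the server code: it models an 8-byte variable as a byte array, performs a 4-byte write at its start address (the effect of accessing a ulong through a uint pointer), and reads all 8 bytes back under a chosen byte order. The function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Lay out a 32-bit value in memory as a CPU of the given byte order would. */
static void store_u32(unsigned char *buf, uint32_t v, int big_endian)
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        buf[i] = (unsigned char)(v >> shift);
    }
}

/* Read 8 bytes of memory as a 64-bit value in the given byte order. */
static uint64_t load_u64(const unsigned char *buf, int big_endian)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        v |= (uint64_t)buf[i] << shift;
    }
    return v;
}

/* Model of "write through a 4-byte pointer, read back as 8 bytes". */
uint64_t write_u32_read_u64(uint32_t v, int big_endian)
{
    unsigned char storage[8] = {0};     /* the 8-byte variable, zeroed */
    store_u32(storage, v, big_endian);  /* 4-byte write at its address */
    return load_u64(storage, big_endian);
}
```

            With the little-endian layout, write_u32_read_u64(2, 0) returns 2, so the type mix goes unnoticed. With the big-endian layout, write_u32_read_u64(2, 1) returns 8589934592, the value shown in the failing test diff, because the four written bytes land in the high half of the 8-byte variable.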

            Try to avoid long in sysvars; prefer int or longlong instead. They are a lot more stable between architectures: int is typically 32-bit, longlong is 64-bit. But long can be either, so the variable gets different limits on different platforms, which makes documenting the variable (and writing test cases) rather complicated.

            serg Sergei Golubchik added a comment.
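            To make the point about platform-dependent limits concrete, here is a small sketch in standard C. It assumes 8-bit bytes; the helper names are hypothetical. The upper limit of an unsigned variable follows directly from its width, so a ulong variable has a limit of 2^32 - 1 where long is 4 bytes and 2^64 - 1 where long is 8 bytes.

```c
#include <limits.h>
#include <stdint.h>

/* Largest value an unsigned integer of `bytes` bytes can hold
   (assumes 8-bit bytes; widths of 8 bytes or more saturate at 64 bits). */
uint64_t unsigned_max_for_width(unsigned bytes)
{
    return bytes >= 8 ? UINT64_MAX : (((uint64_t)1 << (8 * bytes)) - 1);
}

/* 1 if unsigned long has the same range as unsigned int on this
   platform, 0 if it is wider (as on 64-bit Linux, including PPC64). */
int ulong_matches_uint(void)
{
    return ULONG_MAX == UINT_MAX;
}
```

            unsigned_max_for_width(sizeof(unsigned long)) equals ULONG_MAX on any platform, but its numeric value differs between platforms, whereas the limits of the fixed-width types uint32_t and uint64_t do not.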

            The max value for sql_slave_skip_counter is UINT_MAX and for max_relay_log_size it is 1024L*1024*1024. That is, both fit in a 32-bit unsigned integer.

            Not sure if there was a good reason to choose ulong and not the other type. Since Kristian created this code, I'd better hand this recommendation off to him.

            svoj Sergey Vojtovich added a comment.

            > Not sure if there was a good reason to choose ulong and not the other
            > type. Since Kristian created this code, I'm better handing off this
            > recommendation to him.

            I don't think I could have created the code for max_relay_log_size and
            sql_slave_skip_counter? Those have existed since far before I started working
            on replication, AFAIK?

            Generally, I would agree with Serg that it's best to avoid using ulong. Using
            ulonglong seems fine here.

            I have noticed that binlog sizes and offsets have a tendency to use 32-bit
            values around the replication code (which is generally wrong for file
            offsets). I suspect that there are other bugs related to this lingering
            around.

            Using ulonglong by default when adding or otherwise changing code seems a
            reasonable approach to me, where there are no performance concerns that would
            suggest using a 32-bit type (and that does not seem to be the case here).

             - Kristian.
            knielsen Kristian Nielsen added a comment.
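            The concern about 32-bit file offsets can be illustrated with a short sketch (plain C, not taken from the replication code; the function name is hypothetical). Any offset at or beyond 4 GiB silently wraps when it passes through a 32-bit variable.

```c
#include <stdint.h>

/* Pass a 64-bit file offset through a 32-bit variable, as code that
   stores binlog positions in 32-bit types effectively does. */
uint64_t offset_through_u32(uint64_t offset)
{
    uint32_t narrowed = (uint32_t)offset;  /* keeps only the low 32 bits */
    return narrowed;
}
```

            For an offset of 5 GiB (5368709120), offset_through_u32 returns 1073741824, which is 1 GiB: a valid-looking but wrong position in the file. Offsets below 4 GiB pass through unchanged, which is why such bugs tend to stay hidden until files grow large.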

            People

              serg Sergei Golubchik
              svoj Sergey Vojtovich