Details

    • Type: Technical task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Environment: CentOS 5 buildbot, SELinux

    Description

      Galera installation tests on CentOS 5 in buildbot fail with an obscure "Lost connection to MySQL server during query" error on enabling wsrep_provider:
      http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos5-x86/builds/2248/steps/test/logs/stdio
      http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos5-amd64/builds/1981/steps/test/logs/stdio

      I did some digging, and it turned out that the actual cause is strict SELinux settings, which Galera cannot deal with (this is documented in the Galera FAQ: http://www.codership.com/wiki/doku.php?id=faq#qnothing_works_damnit).

      In my experiments, it was enough to switch the level from Enforcing, which is its current setting in the VM, to Permissive at runtime using setenforce. Setting the level permanently and rebooting the machine also helped, of course.

      If the level is set/kept high on purpose, please consider conditionally setting it to Permissive at runtime for Galera tests (if you do so, please make sure setenforce is called with its full path, since it is not on the PATH). If there is no particular reason to have it Enforcing by default, maybe it's easier to reconfigure it permanently.
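
The conditional runtime switch suggested above might look roughly like the sketch below. It is only an illustration: the function name `selinux_make_permissive` and the `GETENFORCE`, `SETENFORCE` and `SUDO` variables are hypothetical, and the `/usr/sbin` locations are an assumption about where the tools live in the VM.

```shell
#!/bin/sh
# Hypothetical helper: switch SELinux to Permissive only if it is Enforcing.
# The tool locations below are assumptions; adjust them to the actual VM.
GETENFORCE=${GETENFORCE:-/usr/sbin/getenforce}
SETENFORCE=${SETENFORCE:-/usr/sbin/setenforce}
SUDO=${SUDO-sudo}   # set SUDO to an empty string if the step already runs as root

selinux_make_permissive() {
    # getenforce prints Enforcing, Permissive or Disabled.
    # If the tool is missing, there is nothing to do.
    mode=$("$GETENFORCE" 2>/dev/null) || return 0
    if [ "$mode" = "Enforcing" ]; then
        # Full path on purpose: setenforce is not on the PATH in the VM.
        $SUDO "$SETENFORCE" Permissive
    fi
}
```

A Galera test step could call `selinux_make_permissive` before installing the packages; regular tests would simply not call it, so the VM keeps its default level for them.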

      I checked the 64-bit build, but I suppose the cause of the failure on x86 is the same.

          Activity

            elenst Elena Stepanova created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            dbart Daniel Bartholomew made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            dbart Daniel Bartholomew made changes -
            Status In Progress [ 3 ] Open [ 1 ]

            dbart Daniel Bartholomew added a comment - Let me know if other VMs need the same thing done to them and/or if we need to do more to the CentOS VMs.

            elenst Elena Stepanova added a comment - Yes, thanks, let's wait and see how the next buildbot run goes (I don't want to trigger a re-run now since the buildbot is busy as it is, and our question is not urgent).

            dbart Daniel Bartholomew added a comment - Is this issue resolved now?

            elenst Elena Stepanova added a comment - We haven't had a new Galera build in buildbot since then, so the status is still unknown.

            elenst Elena Stepanova added a comment -
            We finally have a new build. Strangely, the fix worked for amd64, but not for x86.
            amd64 (good): http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos5-amd64/builds/2016/steps/test/logs/stdio
            x86 (not good): http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos5-x86/builds/2285/steps/test/logs/stdio

            If you are sure that x86 was modified, I'll try later to see what else might cause the problem, but it looks very much like the previous one.

            dbart Daniel Bartholomew added a comment -
            I've checked both of the x86 VMs:

            • vm-centos5-i386-install.qcow2
            • vm-centos6-i386-install.qcow2

            They both have SELINUX=permissive set.

            Very strange that the fix on amd64 worked and the exact same fix on x86 didn't.
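
A check like the one described in this comment could be scripted along the following lines. This is a sketch only: the function name `selinux_configured_mode` is hypothetical, and `/etc/selinux/config` is assumed to be the file that carries the `SELINUX=` setting in the image.

```shell
#!/bin/sh
# Hypothetical helper: print the SELINUX= value from a config file, lowercased.
# Pass a different path to inspect the config of a mounted VM image.
selinux_configured_mode() {
    config=${1:-/etc/selinux/config}
    # Only uncommented SELINUX= lines match; SELINUXTYPE= does not.
    # If the setting appears more than once, the last one is reported.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$config" \
        | tail -n 1 | tr 'A-Z' 'a-z'
}
```

Note that this reports the configured level, which is applied at boot. The level actually in effect on a running machine is what `getenforce` prints, and the two can differ if the level was changed at runtime.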
            elenst Elena Stepanova made changes -
            Assignee Daniel Bartholomew [ dbart ] Elena Stepanova [ elenst ]

            elenst Elena Stepanova added a comment - It means I'll need to repeat the same exercise of manually reproducing the problem in a cloned VM to see what's going on there.

            elenst Elena Stepanova added a comment -
            Hi Daniel,

            It turns out that wsrep crashes when the server is run with an old version of libgcc. The VM (vm-centos5-i386-install.qcow2) has
            libgcc.i386 4.1.2-44.el5

            while the currently available version is
            libgcc i386 4.1.2-54.el5

            I'll file a bug report for Galera so that Codership can check why it crashes, but meanwhile we might upgrade the library in the VM (or maybe not, I don't know what our VM upgrade policies are). If you think it makes sense to upgrade, please go ahead; if not, please close this issue as fixed, since the initial problem with SELinux has been resolved anyway.
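
To spot a VM that still carries the older library, the two version strings quoted above could be compared with something like the sketch below. The function name `version_older_than` is hypothetical, and the comparison is deliberately simplified (it splits on `.` and `-` and compares numerically where possible); it is not rpm's full version-comparison algorithm.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version-release string $1 is older than $2.
# Simplified comparison, not rpm's full algorithm.
version_older_than() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, /[.-]/); nb = split(b, y, /[.-]/)
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] == y[i]) continue
            # Both parts numeric: compare as numbers, otherwise as strings.
            if (x[i] ~ /^[0-9]+$/ && y[i] ~ /^[0-9]+$/)
                exit !((x[i] + 0) < (y[i] + 0))
            exit !(x[i] < y[i])
        }
        exit 1   # equal versions: not older
    }'
}

# Example with the two versions quoted in the comment:
if version_older_than "4.1.2-44.el5" "4.1.2-54.el5"; then
    echo "libgcc in the VM is older than the available package"
fi
```

Inside the VM, the installed string could presumably be obtained with `rpm -q --qf '%{VERSION}-%{RELEASE}\n' libgcc`; on a multilib system that query may print one line per installed architecture.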
            elenst Elena Stepanova made changes -
            Assignee Elena Stepanova [ elenst ] Daniel Bartholomew [ dbart ]

            dbart Daniel Bartholomew added a comment - I'm fine with upgrading the library in the VM

            elenst Elena Stepanova added a comment -
            If you do that, could you please back up a VM image with the old library, so that I can re-test once the bug is fixed?
            (The bug was filed as MDEV-4344.)

            elenst Elena Stepanova added a comment -
            With the new version of Galera, SELinux breaks tests on RHEL5 (I've seen and checked failures on x86, but I suppose it's the same on amd64; it just currently lags behind).
            The failure is a bit different: it hides inside the init script and shows up in messages as follows:

            SELinux is preventing the mysqld from using potentially mislabeled files (./tmp.wBeQpv4640). For complete SELinux messages. run sealert -l 3efbde02-94d1-4e88-88ce-88a8a5cd3987

            Switching to Permissive fixes it; please do so for Galera tests.

            I suppose if we don't want to duplicate VM images and don't want to change it for regular tests, we can switch it off dynamically for the test by adding a conditional step
            sudo bash -c "echo 0 > /selinux/enforce"
            or the like.
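
One possible shape for such a conditional step is a small wrapper that switches enforcement off only for the duration of the test and restores the previous value afterwards. The sketch below is an illustration under assumptions: the function name `with_selinux_permissive` and the `ENFORCE_FILE` variable are hypothetical, and the wrapper assumes it already runs with enough privileges to write to `/selinux/enforce` (the step quoted above uses `sudo bash -c` for that).

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with SELinux enforcement switched off,
# then restore the previous value.
ENFORCE_FILE=${ENFORCE_FILE:-/selinux/enforce}

with_selinux_permissive() {
    if [ ! -w "$ENFORCE_FILE" ]; then
        # SELinux absent, or file not writable by this user: just run the step.
        "$@"
        return $?
    fi
    previous=$(cat "$ENFORCE_FILE")
    echo 0 > "$ENFORCE_FILE"      # 0 = permissive, 1 = enforcing
    "$@"
    status=$?
    echo "$previous" > "$ENFORCE_FILE"
    return $status
}
```

A Galera step would then be invoked as `with_selinux_permissive <test command>`, leaving regular tests and the VM image itself unchanged. If the file is not writable, the wrapper runs the command without touching the level, so a failure caused by SELinux would still surface in that case.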
            elenst Elena Stepanova made changes -
            Summary "Galera in buildbot: SELinux in CentOS 5 builders breaks Galera installation tests" changed to "Galera in buildbot: SELinux in CentOS 5 and RHEL5 builders breaks Galera installation tests"
            dbart Daniel Bartholomew made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            dbart Daniel Bartholomew made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            dbart Daniel Bartholomew made changes -
            Assignee Daniel Bartholomew [ dbart ] Elena Stepanova [ elenst ]

            elenst Elena Stepanova added a comment - RHEL and CentOS builders are green now, closing as fixed.
            elenst Elena Stepanova made changes -
            Assignee Elena Stepanova [ elenst ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow defaullt [ 26507 ] MariaDB v2 [ 44132 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB v2 [ 44132 ] MariaDB v3 [ 64244 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 64244 ] MariaDB v4 [ 146490 ]

            People

              Unassigned Unassigned
              elenst Elena Stepanova