[MDEV-3956] START SLAVE crashes the server after upgrade to 5.5 on RHEL5 Created: 2012-12-21  Updated: 2013-01-06  Resolved: 2013-01-06

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.28a
Fix Version/s: None

Type: Bug Priority: Major
Reporter: VAROQUI Stephane Assignee: Elena Stepanova
Resolution: Fixed Votes: 0
Labels: None
Environment:

Linux RHEL 5 x86_64



 Comments   
Comment by VAROQUI Stephane [ 2012-12-21 ]

The master servers are 5.3, but this does not matter.
This issue is related to the RPM installation and dynamic libraries:
the client, compat, and common RPMs were upgraded before the server was installed.
With all RPMs at 5.5 and the server at 5.3, the crash did not show up.

A workaround has been found:
install from the Linux tar.gz and change basedir.

So it is similar to LP:1037711.
Glibc version: 2.5

121221 11:52:56 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='192.168.45.143', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='192.168.45.143', master_port='3306', master_log_file='po43-bin.004221', master_log_pos='4'.
121221 11:53:07 [Note] Slave SQL thread initialized, starting replication in log 'po43-bin.004221' at position 4, relay log './potest-relay-bin.000001' position: 4
121221 11:53:07 [Note] Slave I/O thread: connected to master 'repl@192.168.45.143:3306',replication started in log 'po43-bin.004221' at position 4
121221 11:53:07 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 5.5.25-MariaDB-log
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=13
max_threads=202
thread_count=11
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1699825 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x2113ad40
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x41bc20b8 thread_stack 0x40000
??:0(my_print_stacktrace)[0xa834be]
??:0(handle_fatal_signal)[0x6cc10c]
:0()[0x3e00e0ebe0]
??:0(open_tables(THD*, TABLE_LIST*, unsigned int, unsigned int, Prelocking_strategy*))[0x539813]
??:0(open_and_lock_tables(THD*, TABLE_LIST*, bool, unsigned int, Prelocking_strategy*))[0x53a24e]
??:0(execute_sqlcom_select(THD*, TABLE_LIST*))[0x5764a7]
??:0(mysql_execute_command(THD*))[0x57cb58]
??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x57f6c1]
??:0(Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int))[0x7a3839]
??:0(apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*))[0x5086d2]
??:0(exec_relay_log_event(THD*, Relay_log_info*))[0x510390]
??:0(handle_slave_sql)[0x5116a4]
:0()[0x3e00e0677d]
:0()[0x3e006d3c1d]

Version MariaDB-5.5.25-rhel5-x86_64-server.rpm
[root@potest/data/log]$ ldd /usr/sbin/mysqld
linux-vdso.so.1 => (0x00007fffb9dfd000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e00e00000)
libz.so.1 => /lib64/libz.so.1 (0x0000003e01e00000)
librt.so.1 => /lib64/librt.so.1 (0x0000003e02200000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003e02a00000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003e00a00000)
libssl.so.6 => /lib64/libssl.so.6 (0x0000003956800000)
libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003e03a00000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003e01200000)
libm.so.6 => /lib64/libm.so.6 (0x0000003e01a00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003e00600000)
/lib64/ld-linux-x86-64.so.2 (0x0000003e00200000)
libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003956c00000)
libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003956400000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003957800000)
libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003957000000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003e03200000)
libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003957400000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003e03600000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003e02600000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003956000000)
libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003e01600000)

version mariadb-5.5.28a-linux-x86_64.tar.gz
[root@potest/usr/local/mysql/bin]$ ldd mysqld
linux-vdso.so.1 => (0x00007fffe8dfd000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e00e00000)
librt.so.1 => /lib64/librt.so.1 (0x0000003e02200000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003e02a00000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003e00a00000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003e01200000)
libm.so.6 => /lib64/libm.so.6 (0x0000003e01a00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003e00600000)
/lib64/ld-linux-x86-64.so.2 (0x0000003e00200000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003e03200000)

[root@potest/data/log]$ uname -a
Linux potest.paybox.com 2.6.18-308.13.1.el5 #1 SMP Thu Jul 26 05:45:09 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@potest/home/admin]$ cat /proc/version
Linux version 2.6.18-308.13.1.el5 (mockbuild@x86-002.build.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)) #1 SMP Thu Jul 26 05:45:09 EDT 2012
[root@potest/home/admin]$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
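
The two ldd listings above differ mainly in which shared libraries each build links against: the RPM build pulls in libz, libssl, libcrypto and the Kerberos-related libraries, while the tarball build does not. A minimal sketch of how such listings could be compared, assuming each listing has been saved as text; the function name `ldd_libs` and the variables `RPM_LDD` and `TARBALL_LDD` are hypothetical, and the listings are abbreviated excerpts of the ones above:

```python
def ldd_libs(ldd_output):
    """Return the set of library names listed in `ldd` output."""
    libs = set()
    for line in ldd_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first whitespace-separated token is the library name or path,
        # e.g. "libz.so.1" or "/lib64/ld-linux-x86-64.so.2".
        libs.add(line.split()[0])
    return libs


# Abbreviated excerpts of the two listings (illustrative only).
RPM_LDD = """\
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e00e00000)
libz.so.1 => /lib64/libz.so.1 (0x0000003e01e00000)
libssl.so.6 => /lib64/libssl.so.6 (0x0000003956800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003e00600000)
"""
TARBALL_LDD = """\
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e00e00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003e00600000)
"""

# Libraries linked by the RPM build but not by the tarball build.
only_in_rpm = sorted(ldd_libs(RPM_LDD) - ldd_libs(TARBALL_LDD))
print(only_in_rpm)  # ['libssl.so.6', 'libz.so.1']
```

A difference in linked libraries alone does not show that a library caused the crash; it only narrows down what differs between the two builds.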

Comment by Elena Stepanova [ 2012-12-24 ]

Hi Stephane!

You wrote:

>> a client , compat , commun rpm have been upgraded prior to install the server .
>> With all rpm to 5.5 and the server to 5.3 the crash did not show up

Could you please go into some more details on how exactly you did the upgrade?
I'm trying to reproduce it (granted, I don't have RHEL 5, trying on Fedora 17, but I'm not sure it's relevant at the stage where I get stuck), and I don't seem to be able to do a partial upgrade – not in a clean way, at least.

I have shared/server/client 5.3.11 RPMs installed.
If I try to upgrade shared/client to 5.5.28a (via rpm -U), I'm getting massive conflicts, e.g. compat-5.5.28a conflicts with shared-5.3.11, etc., which I kind of expected.
If I try to remove shared/client 5.3.11 first (via rpm -e), it also fails due to failed dependencies, as the server 5.3.11 depends on the client.

So what was the sequence of your upgrade? Did you ignore failed dependencies, or did you do it some other way entirely?

Thanks.

Comment by VAROQUI Stephane [ 2012-12-24 ]

Yes, we used --force --nodeps to bypass the conflicts.

Comment by Elena Stepanova [ 2013-01-06 ]

So far I couldn't reproduce the problem.
I ran a similar test on Fedora 17, and as ugly as the process was, it didn't cause a subsequent crash on slave startup.
The list of dynamic libraries is very close, with a couple of exceptions (see below).

Stephane,

Could you please elaborate a bit on why you think the problem is caused by dynamic libraries (rather than different server versions – your rpm is 5.5.25 and tarball is 5.5.28, – or different configuration, file location, and so on)?

Thanks.

ldd mysqld:
linux-vdso.so.1 => (0x00007fffb0b6f000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e40e00000)
libz.so.1 => /lib64/libz.so.1 (0x0000003e41e00000)
librt.so.1 => /lib64/librt.so.1 (0x0000003e41600000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003e4d600000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003e41200000)
libssl.so.6 => /lib64/libssl.so.6 (0x00007f16fd747000)
libcrypto.so.6 => /lib64/libcrypto.so.6 (0x00007f16fd3ad000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x0000003e4de00000)
libm.so.6 => /lib64/libm.so.6 (0x0000003e41a00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003e40a00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003e40600000)
libfreebl3.so => /lib64/libfreebl3.so (0x0000003e4d200000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000003e52a00000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000003e52e00000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003e50200000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000003e52600000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003e45600000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000003e53200000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003e50a00000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003e42e00000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003e42200000)

rpm -qa | grep -i mariadb
MariaDB-shared-5.3.11-118.el5.x86_64
MariaDB-server-5.5.25-1.x86_64
MariaDB-devel-5.3.11-118.el5.x86_64
MariaDB-compat-5.5.25-1.x86_64
MariaDB-client-5.5.25-1.x86_64
MariaDB-common-5.5.25-1.x86_64

MariaDB [test]> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
...
Master_Log_File: master-bin.000002
Read_Master_Log_Pos: 245
Relay_Log_File: fedora17-relay-bin.000002
Relay_Log_Pos: 530
Relay_Master_Log_File: master-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
...
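
The `rpm -qa` listing above shows packages from two release series installed side by side, which is the state a forced partial upgrade leaves behind. A rough sketch of how such a mix could be flagged automatically; the regular expression assumes the package-name layout seen in that listing, and `group_by_series` is a hypothetical helper, not part of any MariaDB or RPM tooling:

```python
import re
from collections import defaultdict

# Package names as printed by `rpm -qa | grep -i mariadb` above.
PACKAGES = [
    "MariaDB-shared-5.3.11-118.el5.x86_64",
    "MariaDB-server-5.5.25-1.x86_64",
    "MariaDB-devel-5.3.11-118.el5.x86_64",
    "MariaDB-compat-5.5.25-1.x86_64",
    "MariaDB-client-5.5.25-1.x86_64",
    "MariaDB-common-5.5.25-1.x86_64",
]

# Assumed layout: MariaDB-<component>-<major>.<minor>.<patch>-<release>.<arch>
PATTERN = re.compile(
    r"^MariaDB-(?P<component>\w+)-(?P<series>\d+\.\d+)\.(?P<patch>\d+\w*)-"
)


def group_by_series(packages):
    """Map each release series (e.g. '5.5') to the components installed from it."""
    groups = defaultdict(list)
    for pkg in packages:
        match = PATTERN.match(pkg)
        if match:
            groups[match.group("series")].append(match.group("component"))
    return dict(groups)


groups = group_by_series(PACKAGES)
for series, components in sorted(groups.items()):
    print(series, sorted(components))
if len(groups) > 1:
    print("mixed release series installed:", sorted(groups))
```

For the listing above this reports shared and devel from 5.3 alongside server, compat, client and common from 5.5.
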
Comment by VAROQUI Stephane [ 2013-01-06 ]

Elena,
Thanks for the effort in trying to reproduce it.

We cannot reproduce it easily. But what makes me think of a library issue is that the same server release taken from the tar.gz did not crash.
We did not check the MD5 of the RPM file, so it could also be a corrupted download. I will check the MD5 next time, if we still have the RPM on the server.

Stéphane
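
The MD5 check mentioned above could be done along these lines. This is only a sketch using Python's standard `hashlib`; the helper name `file_md5` is hypothetical, and the resulting digest would still have to be compared by hand against the checksum published for the download:

```python
import hashlib
import sys


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python md5check.py <path-to-rpm>
    print(file_md5(sys.argv[1]))
```

A matching checksum would rule out a corrupted download, but not a version difference between the RPM and the tarball.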

Comment by Elena Stepanova [ 2013-01-06 ]

Hi Stephane,

> the same server release taken from tar.gz did not crash

But which server release did you actually compare? Was it 5.5.25 or 5.5.28a?

Because your original description says
Version MariaDB-5.5.25-rhel5-x86_64-server.rpm
version mariadb-5.5.28a-linux-x86_64.tar.gz

which is not the same server release.

When you say that you cannot reproduce it easily, can it be that you are now trying MariaDB-5.5.28a RPMs and they work? If so, it could have been simply fixed – e.g. there was a bugfix for an invalid write in Query_log_event (see http://bugs.mysql.com/bug.php?id=64624) – maybe that was what you had hit?

Comment by VAROQUI Stephane [ 2013-01-06 ]

Stephane Varoqui | Senior Consultant EMEA
SkySQL Ab | www.skysql.com Location: Paris -France | Tel : +33 609 013 638
SkySQL - The first choice in affordable MySQL® Database solutions for the Enterprise and Cloud

OK, I did not notice that. The last RPM was three releases behind, so maybe this has been fixed in the latest release.

Comment by Elena Stepanova [ 2013-01-06 ]

Okay, then I will close it as 'fixed' for now, assuming that you indeed had hit the bug# 64624 or its relatives, since it looks similar. If you encounter the problem again, please feel free to re-open.

Comment by VAROQUI Stephane [ 2013-01-06 ]

Thanks Elena

Generated at Thu Feb 08 06:52:37 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.