[MDEV-6340] Mariadb 10.0.12 fatal "Lost connection" error w/ GCC 4.9 'Release' build; workaround ~ CFLAGS="-fno-delete-null-pointer-checks" - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 10.0.12
Fix Version/s: 10.0.13
Component/s: None
Labels: None
Environment:

uname -rm
3.15.0-1.gc9b3c8c-desktop x86_64
mysqld --version
mysqld Ver 10.0.12-MariaDB-log for Linux on x86_64 (Source distribution)
bzr log | head
------------------------------------------------------------
revno: 4253
committer: Sergei Golubchik <sergii@pisem.net>
branch nick: 10.0
timestamp: Fri 2014-06-13 13:25:32 +0200
message:
promote server_audit and sequence plugins to stable
------------------------------------------------------------
revno: 4252
committer: Sergei Golubchik <sergii@pisem.net>
php -v
PHP 5.6.0-dev (cli) (built: Jun 12 2014 10:24:30)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.6.0-dev, Copyright (c) 1998-2014 Zend Technologies
with Zend OPcache v7.0.4-dev, Copyright (c) 1999-2014, by Zend Technologies

Description

After a clean/new install of MariaDB 10.0.11, undertaking a completely NEW drush-install from clean Drupal v7.28 source, I get the following fatal error + crash:

    SQLSTATE[HY000]: General error: 2006 MySQL server has gone away

Attachments

Issue Links

is duplicated by

MDEV-6360 MariaDB crashes on simple insert or alter table clauses

Closed

relates to

MDEV-6202 MariaDB 10.0.10 doesn't work with KDE's Akonadi

Closed

links to

Mageia bug report

Activity

View 11 older comments

GrantK added a comment - 2014-07-06 04:23 - edited

gcc 4.9.1 is neither released, nor shipping with any distribution; GCC 4.9.0 is.

Is the decision, then, to simply ignore builds of RELEASE MariaDB being broken with RELEASE GCC, and kick the ball down the road to GCC 4.9.1, whenever it's released?

How, exactly, do we RE-OPEN this?

Sergei Golubchik added a comment - 2014-07-06 12:28

I've reopened it.

But 4.9.0 is pretty much the bleeding edge, most distributions don't ship it (and, as you can see, they have good reasons not to). On the other hand, 4.9.1 is already in Mageia Cauldron (which is in the development stage and won't be declared stable anytime soon).

I will try to see if we can change something in MariaDB to avoid this gcc bug. But given that it is apparently a gcc bug, and given everything I wrote above, this won't be a high-priority bug, sorry.

GrantK added a comment - 2014-07-06 16:03

Can you provide a reference to the specific GCC bug that you suggest is fixed?

In apparent reference to

"Operational Notification – Changes in gcc Code Optimization Can Cause a Crash in BIND"
https://kb.isc.org/article/AA-01167

as pointed out by showaz in bind's #irc, the bind dev team posted a GCC bug here,

"GCC 4.9 generates incorrect object code"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61236,

for which a workaround is the similar

-fno-delete-null-pointer-checks

At GCC, that bug has been resolved as INVALID by the GCC team, and, as a result, the bind team committed fixes to their repository branches to address the crash and work around the optimization issue.

In that bug report, it's glibc that's called into question, not gcc.

Noting as posted above here, in the mariadb backtrace,

...
/lib64/libpthread.so.0(+0x80db)[0x7fcf4ff230db]
/lib64/libc.so.6(clone+0x6d)[0x7fcf4ebd390d]

So, is it in fact GCC, as you've ascribed, or glibc/other, that's involved with the MariaDB crashes?

Sergei Golubchik added a comment - 2014-07-30 22:59

jplindst, please take a look at the following patch:

=== modified file 'storage/innobase/include/lock0lock.h'
--- storage/innobase/include/lock0lock.h        2014-05-07 15:32:23 +0000
+++ storage/innobase/include/lock0lock.h        2014-07-30 19:36:42 +0000
@@ -277,31 +277,31 @@
 UNIV_INTERN
 dberr_t
 lock_rec_insert_check_and_lock(
 /*===========================*/
        ulint           flags,  /*!< in: if BTR_NO_LOCKING_FLAG bit is
                                set, does nothing */
        const rec_t*    rec,    /*!< in: record after which to insert */
        buf_block_t*    block,  /*!< in/out: buffer block of rec */
        dict_index_t*   index,  /*!< in: index */
        que_thr_t*      thr,    /*!< in: query thread */
        mtr_t*          mtr,    /*!< in/out: mini-transaction */
        ibool*          inherit)/*!< out: set to TRUE if the new
                                inserted record maybe should inherit
                                LOCK_GAP type locks from the successor
                                record */
-       __attribute__((nonnull, warn_unused_result));
+       __attribute__((nonnull(2,3,4,6,7), warn_unused_result));
 /*********************************************************************//**

(the same for xtradb, of course).

Here's why: the old declaration promises that thr can never be NULL, and gcc-4.9.0 trusts that and optimizes accordingly. But in fact, the function starts with

lock_rec_insert_check_and_lock(
/*===========================*/
...
	ibool*		inherit)
{
...
	if (flags & BTR_NO_LOCKING_FLAG) {

		return(DB_SUCCESS);
	}

	trx = thr_get_trx(thr);

so when BTR_NO_LOCKING_FLAG is set, thr can be NULL (and it is NULL in this stack trace: btr_insert_on_non_leaf_level_func → btr_cur_optimistic_insert → btr_cur_ins_lock_and_undo → lock_rec_insert_check_and_lock). The patch fixes this by removing the nonnull attribute for thr: the new list nonnull(2,3,4,6,7) omits argument 5, which is thr. Another solution would be to move the check for BTR_NO_LOCKING_FLAG out of the function and keep the nonnull attribute.
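
For illustration only, here is a minimal, self-contained sketch of the same pattern outside InnoDB. Every name in it (NO_LOCKING_FLAG, struct query_thread, enum status, check_and_lock) is a hypothetical stand-in, not the real InnoDB definition. It assumes GCC or Clang, since __attribute__ is a compiler extension. The point is only that the nonnull list names the arguments that must never be NULL and leaves out the one that callers may legitimately pass as NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the InnoDB flag and types; not the real ones. */
#define NO_LOCKING_FLAG 1UL

struct query_thread { int trx_id; };

enum status { STATUS_SUCCESS = 0, STATUS_LOCKED = 1 };

/* nonnull(2) covers only `rec`. `thr` (argument 3) is deliberately left out
   of the list because callers pass NULL when NO_LOCKING_FLAG is set. */
enum status check_and_lock(unsigned long flags, const char *rec,
                           const struct query_thread *thr)
    __attribute__((nonnull(2), warn_unused_result));

enum status check_and_lock(unsigned long flags, const char *rec,
                           const struct query_thread *thr)
{
    (void) rec;  /* unused in this sketch */

    if (flags & NO_LOCKING_FLAG) {
        return STATUS_SUCCESS;  /* thr is never dereferenced on this path */
    }

    assert(thr != NULL);  /* thr is required only when locking is requested */
    return thr->trx_id > 0 ? STATUS_LOCKED : STATUS_SUCCESS;
}
```

With this declaration, a call such as check_and_lock(NO_LOCKING_FLAG, "r", NULL) stays within what the compiler has been told. With a blanket nonnull on every pointer argument, the same call would break the promise made to the compiler, which is the mismatch the patch above removes.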

Jan Lindström (Inactive) added a comment - 2014-07-31 07:08

The patch is correct; I just do not follow why we bother to call this function at all if BTR_NO_LOCKING_FLAG is set. Removing the call(s) could need a deeper fix.

People

Assignee: Sergei Golubchik

Reporter: GrantK

Votes: 0

Watchers: 5

Dates

Created: 2014-06-14 20:15

Updated: 2014-07-31 10:52

Resolved: 2014-07-31 10:52

MariaDB Server