[MDEV-3422] LP:925985 - Innodb crash on JOIN query when "join_buffer_size" > "join_buffer_space_limit" Created: 2012-02-03  Updated: 2015-02-02  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Troex Nevelin (Inactive) Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug925985.xml     File LPexportBug925985_bug-925985-db.sql     File LPexportBug925985_bug-925985-query.sql     File LPexportBug925985_bug925985.test     Text File LPexportBug925985_maria-innodb-crash.txt    

 Description   

I don't how to describe the problem but the query below crashes innodb (xtradb). It does not happening on mysql 5.1 & 5.5
I'm using 5.3.3-MariaDB-rc-mariadb108~squeeze-log from mariadb repository on debian 6 64bit

Query:

SELECT DISTINCT
B.TYPE_SID,
B.ID BANNER_ID,
B.WEIGHT BANNER_WEIGHT,
B.SHOWS_FOR_VISITOR,
B.FIX_CLICK,
B.FIX_SHOW,
B.KEYWORDS BANNER_KEYWORDS,
DATE_FORMAT(B.DATE_SHOW_FIRST, '%d.%m.%Y %H:%i:%s') DATE_SHOW_FIRST,
DATE_FORMAT(B.DATE_SHOW_FROM, '%d.%m.%Y %H:%i:%s') DATE_SHOW_FROM,
DATE_FORMAT(B.DATE_SHOW_TO, '%d.%m.%Y %H:%i:%s') DATE_SHOW_TO,
B.FLYUNIFORM FLYUNIFORM,
B.MAX_SHOW_COUNT MAX_SHOW_COUNT,
B.SHOW_COUNT SHOW_COUNT,
C.ID CONTRACT_ID,
C.WEIGHT CONTRACT_WEIGHT,
C.KEYWORDS CONTRACT_KEYWORDS
FROM
b_adv_type T

INNER JOIN b_adv_banner B ON (
B.ACTIVE='Y'
and B.TYPE_SID = T.SID
and B.STATUS_SID = 'PUBLISHED'
and (B.FOR_NEW_GUEST is null or B.FOR_NEW_GUEST='N')
and (ifnull(B.MAX_SHOW_COUNT,0)>ifnull(B.SHOW_COUNT,0) or ifnull(B.MAX_SHOW_COUNT,0)=0)
and (ifnull(B.MAX_CLICK_COUNT,0)>ifnull(B.CLICK_COUNT,0) or ifnull(B.MAX_CLICK_COUNT,0)=0)
and (ifnull(B.MAX_VISITOR_COUNT,0)>ifnull(B.VISITOR_COUNT,0) or ifnull(B.MAX_VISITOR_COUNT,0)=0)
and (B.DATE_SHOW_FROM<=now() or B.DATE_SHOW_FROM is null or length(B.DATE_SHOW_FROM)<=0)
and (B.DATE_SHOW_TO>=now() or B.DATE_SHOW_TO is null or length(B.DATE_SHOW_TO)<=0))

INNER JOIN b_adv_banner_2_site BS ON (
BS.BANNER_ID = B.ID
and BS.SITE_ID = 'en')

INNER JOIN b_adv_contract C ON (
C.ID = B.CONTRACT_ID
and C.ACTIVE='Y'
and (ifnull(C.MAX_SHOW_COUNT,0)>ifnull(C.SHOW_COUNT,0) or ifnull(C.MAX_SHOW_COUNT,0)=0)
and (ifnull(C.MAX_CLICK_COUNT,0)>ifnull(C.CLICK_COUNT,0) or ifnull(C.MAX_CLICK_COUNT,0)=0)
and (ifnull(C.MAX_VISITOR_COUNT,0)>ifnull(C.VISITOR_COUNT,0) or ifnull(C.MAX_VISITOR_COUNT,0)=0)
and (C.DATE_SHOW_FROM<=now() or C.DATE_SHOW_FROM is null or length(C.DATE_SHOW_FROM)<=0)
and (C.DATE_SHOW_TO>=now() or C.DATE_SHOW_TO is null or length(C.DATE_SHOW_TO)<=0))

INNER JOIN b_adv_contract_2_site CS ON (
CS.CONTRACT_ID = B.CONTRACT_ID
and CS.SITE_ID = 'en')

INNER JOIN b_adv_contract_2_type CT ON (
CT.CONTRACT_ID = C.ID
and (CT.TYPE_SID = 'ALL' or CT.TYPE_SID = T.SID))

INNER JOIN b_adv_banner_2_weekday BW ON (
BW.BANNER_ID = B.ID
and BW.C_WEEKDAY='FRIDAY'
and BW.C_HOUR = '11')

INNER JOIN b_adv_contract_2_weekday CW ON (
CW.CONTRACT_ID = C.ID
and CW.C_WEEKDAY='FRIDAY'
and CW.C_HOUR = '11')

LEFT JOIN b_adv_banner_2_group UG1 ON (
(UG1.BANNER_ID = B.ID
 and UG1.GROUP_ID in (1,2) and UG1.GROUP_ID<>2)
)

LEFT JOIN b_adv_banner_2_page BP1 ON (
BP1.BANNER_ID = B.ID
and BP1.SHOW_ON_PAGE='Y')

LEFT JOIN b_adv_banner_2_page BP2 ON (
BP2.BANNER_ID = B.ID
and BP2.SHOW_ON_PAGE='N'
and '/club/index.php' like concat(BP2.PAGE, '%'))

LEFT JOIN b_adv_contract_2_page CP1 ON (
CP1.CONTRACT_ID = C.ID
and CP1.SHOW_ON_PAGE='Y')

LEFT JOIN b_adv_contract_2_page CP2 ON (
CP2.CONTRACT_ID = C.ID
and CP2.SHOW_ON_PAGE='N'
and '/club/index.php' like concat(CP2.PAGE, '%'))

LEFT JOIN b_adv_banner_2_stat_adv BA ON BA.BANNER_ID = B.ID
LEFT JOIN b_adv_banner_2_country BC ON BC.BANNER_ID = B.ID AND (
(
(B.STAT_TYPE is null OR length(B.STAT_TYPE)=0 OR B.STAT_TYPE='COUNTRY')
AND BC.COUNTRY_ID='UA'
) OR (
B.STAT_TYPE='REGION'
AND BC.COUNTRY_ID='UA'
AND BC.REGION=''
) OR (
B.STAT_TYPE='CITY'
AND BC.CITY_ID='2'
)
)

WHERE
T.ACTIVE = 'Y'

and (
B.STAT_COUNT is null
or B.STAT_COUNT = 0
or BC.BANNER_ID is not null
)
and BP2.ID is null
and CP2.ID is null
and (BP1.ID is null or '/club/index.php' like concat(BP1.PAGE, '%'))
and (CP1.ID is null or '/club/index.php' like concat(CP1.PAGE, '%'))
and (BA.STAT_ADV_ID is null or BA.STAT_ADV_ID='0')
and (BC.COUNTRY_ID is null or BC.COUNTRY_ID='UA')
 and
 (
(B.SHOW_USER_GROUP = 'Y' and UG1.GROUP_ID is not null)
or
(B.SHOW_USER_GROUP <> 'Y' and UG1.GROUP_ID is null)
 )
ORDER BY B.TYPE_SID desc, C.ID desc

This query is used in most popular russian CMS (bitrix) upon user login. I would like to provide more info if needed



 Comments   
Comment by Troex Nevelin (Inactive) [ 2012-02-03 ]

Re: innodb crash on big query (please review)

Comment by Troex Nevelin (Inactive) [ 2012-02-03 ]

syslog output on crash
LPexportBug925985_maria-innodb-crash.txt

Comment by Troex Nevelin (Inactive) [ 2012-02-03 ]

Re: innodb crash on big query (please review)
Downgraded to 5.2.10-MariaDB-mariadb107~squeeze-log without re-creating tables - now this query works fine

Comment by Elena Stepanova [ 2012-02-06 ]

Re: innodb crash on big query (please review)
Hi,

Could you please provide information about the tables used in the query?
Ideally, the full database dump, but if the contents is confidential and can't be obfuscated, then at least the structure (e.g. the output of SHOW CREATE TABLE) of the involved tables, and, if possible, some idea about the size of the tables.

You can use the private section on ftp.askmonty.org to upload the data you don't want to make public.

Thanks.

Comment by Troex Nevelin (Inactive) [ 2012-02-06 ]

Re: innodb crash on big query (please review)
Okay, I figured out what exactly lead to the problem.
This problem was not happening with default server configuration, but only happend with my config, after excluding options one by one I found that option which is causing crash - "join_buffer_size = 4M".

Setting this option down to 2M or less will not cause crash. As I see "join_buffer_size" was bigger than "join_buffer_space_limit" which is by default 2M.

Setting "join_buffer_space_limit = 4M" and "join_buffer_size = 4M" did the trick and that query does not crash the DB anymore.

If you still need the database sample - let me know, I'll upload it to the private ftp area.

Comment by Elena Stepanova [ 2012-02-07 ]

Re: innodb crash on big query (please review)
Yes, please upload it anyway.
Thanks.

Comment by Troex Nevelin (Inactive) [ 2012-02-08 ]

Re: innodb crash on big query (please review)
I've extracted part of db which doesn't contain any private info. The error is still happening with query and setting described above. Correct result of this query must return one row.

Comment by Troex Nevelin (Inactive) [ 2012-02-08 ]

I've extracted part of db which doesn't contain any private info. The error is still happening with query and setting described above. Correct result of this query must return one row.
sample db to reproduce crash
LPexportBug925985_bug-925985-db.sql

Comment by Troex Nevelin (Inactive) [ 2012-02-08 ]

Re: innodb crash on big query (please review)

Comment by Troex Nevelin (Inactive) [ 2012-02-08 ]

Query in separate file
LPexportBug925985_bug-925985-query.sql

Comment by Elena Stepanova [ 2012-02-13 ]

Re: Innodb crash on JOIN query when "join_buffer_size" > "join_buffer_space_limit"
Thank you.
I suppose even if the combination of parameters is not very smart, it is not bad enough to allow server to die.
I'm attaching a somewhat reduced MTR test case.

Crash is reproducible on release builds (5.3.3, 5.3.4 from buildbot).

Debug builds don't crash, but cause valgrind warnings:

==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x8329AC6: JOIN_CACHE::join_matching_records(bool) (sql_join_cache.cc:2202)
==16502== by 0x83297C2: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2062)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>, st_table, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)
==16502==
==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x832A0CF: JOIN_CACHE::join_null_complements(bool) (sql_join_cache.cc:2496)
==16502== by 0x83298D1: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2096)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>, st_table, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)
==16502==
==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x83296F8: JOIN_CACHE::restore_last_record() (sql_join_cache.cc:2009)
==16502== by 0x83299CC: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2134)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>, st_table, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)

Comment by Elena Stepanova [ 2012-02-13 ]

Thank you.
I suppose even if the combination of parameters is not very smart, it is not bad enough to allow server to die.
I'm attaching a somewhat reduced MTR test case.

Crash is reproducible on release builds (5.3.3, 5.3.4 from buildbot).

Debug builds don't crash, but cause valgrind warnings:

==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x8329AC6: JOIN_CACHE::join_matching_records(bool) (sql_join_cache.cc:2202)
==16502== by 0x83297C2: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2062)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>, st_table, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)
==16502==
==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x832A0CF: JOIN_CACHE::join_null_complements(bool) (sql_join_cache.cc:2496)
==16502== by 0x83298D1: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2096)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>, st_table, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)
==16502==
==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x83296F8: JOIN_CACHE::restore_last_record() (sql_join_cache.cc:2009)
==16502== by 0x83299CC: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2134)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>, st_table, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)

MTR test case
LPexportBug925985_bug925985.test

Comment by Igor Babaev [ 2012-02-14 ]

Re: Innodb crash on JOIN query when "join_buffer_size" > "join_buffer_space_limit"
The following simple test case crashes the server (debug version) due to the same problem.

CREATE TABLE t1 (a int);
INSERT INTO t1 VALUES (5), (3);

CREATE TABLE t2 (a int, b int);
INSERT INTO t2 VALUES
(3,30), (1,10), (7,70), (2,20),
(3,31), (1,11), (7,71), (2,21),
(3,32), (1,12), (7,72), (2,22);

CREATE TABLE t3 (b int, c int);
INSERT INTO t3 VALUES (32, 302), (42,400), (30,300);

set @tmp_optimizer_switch=@@optimizer_switch;
set optimizer_switch='optimize_join_buffer_size=off';
set join_buffer_space_limit=4096;
set join_buffer_size=4096*2;
set join_cache_level=2;
set optimizer_switch='outer_join_with_cache=on';

EXPLAIN
SELECT * FROM t1, t2 LEFT JOIN t3 ON t2.b=t3.b WHERE t1.a=t2.a;
SELECT * FROM t1, t2 LEFT JOIN t3 ON t2.b=t3.b WHERE t1.a=t2.a;

set join_buffer_space_limit=default;
set join_buffer_size=default;
set join_cache_level=default;
set optimizer_switch=@tmp_optimizer_switch;

DROP TABLE t1,t2,t3;

Comment by Rasmus Johansson (Inactive) [ 2012-02-20 ]

Launchpad bug id: 925985

Generated at Thu Feb 08 06:48:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.