[MDEV-4897] Assertion `share->tdc.prev == 0 && share->tdc.next == 0' failed in TABLE_SHARE* tdc_acquire_share(THD*, const char*, const char*, const char*, uint, uint, TABLE**)
Created: 2013-08-14  Updated: 2013-08-22  Resolved: 2013-08-15

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.4
Fix Version/s: 10.0.4

Type: Bug
Priority: Major
Reporter: Elena Stepanova
Assignee: Sergey Vojtovich
Resolution: Fixed
Votes: 0
Labels: None

Attachments: HTML File threads1    
Issue Links: relates to MDEV-4702 Reduce usage of LOCK_open (Closed)

 Description   

mysqld: 10.0/sql/table_cache.cc:901: TABLE_SHARE* tdc_acquire_share(THD*, const char*, const char*, const char*, uint, uint, TABLE**): Assertion `share->tdc.prev == 0 && share->tdc.next == 0' failed.

#7  0x00007f27641da192 in __GI___assert_fail (assertion=0xfbf840 "share->tdc.prev == 0 && share->tdc.next == 0", file=0xfbf3c0 "10.0/sql/table_cache.cc", line=901, function=0xfbfc20 "TABLE_SHARE* tdc_acquire_share(THD*, const char*, const char*, const char*, uint, uint, TABLE**)") at assert.c:103
#8  0x00000000007b04ef in tdc_acquire_share (thd=0x310edd8, db=0x7f26f4c2bd80 "test", table_name=0x7f26f439e6f0 "k", key=0x7f26f46dc7ed "test", key_length=7, flags=3, out_table=0x7f2747e96258) at 10.0/sql/table_cache.cc:901
#9  0x00000000005d45d1 in open_table (thd=0x310edd8, table_list=0x7f26f46dc3b0, mem_root=0x7f2747e96610, ot_ctx=0x7f2747e96650) at 10.0/sql/sql_base.cc:2245
#10 0x00000000005d6e41 in open_and_process_table (thd=0x310edd8, lex=0x3112810, tables=0x7f26f46dc3b0, counter=0x7f2747e9676c, flags=0, prelocking_strategy=0x7f2747e967a0, has_prelocking_list=false, ot_ctx=0x7f2747e96650, new_frm_mem=0x7f2747e96610) at 10.0/sql/sql_base.cc:3739
#11 0x00000000005d8081 in open_tables (thd=0x310edd8, start=0x7f2747e96720, counter=0x7f2747e9676c, flags=0, prelocking_strategy=0x7f2747e967a0) at 10.0/sql/sql_base.cc:4282
#12 0x00000000005d8f35 in open_and_lock_tables (thd=0x310edd8, tables=0x7f26f46dc3b0, derived=true, flags=0, prelocking_strategy=0x7f2747e967a0) at 10.0/sql/sql_base.cc:4897
#13 0x00000000005cdaab in open_and_lock_tables (thd=0x310edd8, tables=0x7f26f46dc3b0, derived=true, flags=0) at 10.0/sql/sql_base.h:485
#14 0x0000000000616b6d in mysql_insert (thd=0x310edd8, table_list=0x7f26f46dc3b0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at 10.0/sql/sql_insert.cc:731
#15 0x0000000000637bbc in mysql_execute_command (thd=0x310edd8) at 10.0/sql/sql_parse.cc:3380
#16 0x000000000063fd7c in mysql_parse (thd=0x310edd8, rawbuf=0x7f26f4fab720 "INSERT INTO k ( `col_int_nokey`, `col_int_key` ) VALUES ( 6 , 3 ) , ( 9 , 7 ) , ( 7 , 0 ) , ( 3 , 5 )", length=101, parser_state=0x7f2747e974f0) at 10.0/sql/sql_parse.cc:6264
#17 0x0000000000632bae in dispatch_command (command=COM_QUERY, thd=0x310edd8, packet=0x31157e9 " INSERT INTO k ( `col_int_nokey`, `col_int_key` ) VALUES ( 6 , 3 ) , ( 9 , 7 ) , ( 7 , 0 ) , ( 3 , 5 ) ", packet_length=103) at 10.0/sql/sql_parse.cc:1277
#18 0x0000000000631fe6 in do_command (thd=0x310edd8) at 10.0/sql/sql_parse.cc:983
#19 0x0000000000752983 in do_handle_one_connection (thd_arg=0x310edd8) at 10.0/sql/sql_connect.cc:1379
#20 0x00000000007526d6 in handle_one_connection (arg=0x310edd8) at 10.0/sql/sql_connect.cc:1293
#21 0x0000000000e556bd in pfs_spawn_thread (arg=0x308f0c8) at 10.0/storage/perfschema/pfs.cc:1853
#22 0x00007f27651c0e9a in start_thread (arg=0x7f2747e98700) at pthread_create.c:308
#23 0x00007f276429ecbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

(All threads stack trace is attached as 'threads1')

The core dump doesn't seem to be very helpful, since both pointers are already NULL by the time it is inspected:

(gdb) frame 8
#8  0x00000000007b04ef in tdc_acquire_share (thd=0x310edd8, db=0x7f26f4c2bd80 "test",
    table_name=0x7f26f439e6f0 "k", key=0x7f26f46dc7ed "test", key_length=7, flags=3,
    out_table=0x7f2747e96258) at 10.0/sql/table_cache.cc:901
901       DBUG_ASSERT(share->tdc.prev == 0 && share->tdc.next == 0);
(gdb) p share->tdc.prev
$1 = (TABLE_SHARE **) 0x0
(gdb) p share->tdc.next
$2 = (TABLE_SHARE *) 0x0

bzr version-info

revision-id: svoj@mariadb.org-20130814085017-hgdzpt71613pyfkw
revno: 3794
branch-nick: 10.0

Query (0x7f26f4fab720): INSERT INTO k ( `col_int_nokey`, `col_int_key` ) VALUES ( 6 , 3 ) , ( 9 , 7 ) , ( 7 , 0 ) , ( 3 , 5 )
Connection ID (thread ID): 14
Status: NOT_KILLED

RQG grammar (mdev4897.yy)

query_init:
	SET GLOBAL table_open_cache = 10 ; CREATE TABLE IF NOT EXISTS A ( `i` INT ) ; CREATE TABLE IF NOT EXISTS B ( `i` INT ) ;
 
thread1:
	SELECT * FROM B t1, B t2, B t3, B t4, B t5, B t6, B t7, B t8, B t9, B t10 ;
 
query:
	SELECT * FROM A ;

gentest command line (assuming the server is running on port 3306, all default options suffice):

perl ./gentest.pl --threads=8 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=3306:user=root:database=test --grammar=mdev4897.yy

runall command line (starts the server on port 19300 and runs the same test):

perl ./runall.pl --grammar=/home/elenst/bug/mdev4897.yy --skip-gendata --threads=8 --queries=100M --duration=600 --basedir=/home/elenst/bzr/10.0 --vardir=/home/elenst/test_results/mdev4897

It takes a few minutes to hit the failure on my machine.



 Comments   
Comment by Sergey Vojtovich [ 2013-08-14 ]

The most obvious way to reproduce it should be as follows:

1. create two tables (e.g. create table t1(a int); create table t2(a int))
2. access the first table (e.g. select * from t1), now its table instance is in the table cache
3. access the other table so that the t1 table instance is evicted from the cache, but its share stays
(e.g. select * from t2 as a1, t2 as a2, ... t2 as aN where N is table cache size)
4. let 2 (or more) threads access t1 (e.g. select * from t1)
5. if we're still not crashed goto 2

There is probably an easier way to evict t2 from the table cache, but it needs more code analysis.

This assertion is not fully correct: when multiple threads acquire the same previously unused share, only one thread removes the share from the unused list. The other threads skip this step and may continue even if that thread has not yet removed the share from the unused list.
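
To make the interleaving concrete, here is a minimal sketch of the situation described above. It is not the server code: `Share`, `UnusedList`, `take_reference`, `assertion_would_hold` and `simulate_interleaving` are hypothetical names invented for illustration, and the two "threads" are replayed sequentially in one thread so the outcome is deterministic. It assumes that the first thread to take a reference is the one responsible for unlinking the share.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for TABLE_SHARE: only the fields the sketch needs.
struct Share {
  std::atomic<int> ref_count{0};
  Share **prev = nullptr;  // non-null while linked into the unused list
  Share *next = nullptr;
};

// Hypothetical intrusive list of unused shares.
struct UnusedList {
  Share *head = nullptr;

  void link(Share *s) {
    s->next = head;
    if (head) head->prev = &s->next;
    s->prev = &head;
    head = s;
  }

  void unlink(Share *s) {
    assert(s->prev != nullptr);  // must currently be linked
    *s->prev = s->next;
    if (s->next) s->next->prev = s->prev;
    s->prev = nullptr;
    s->next = nullptr;
  }
};

// Take a reference. Returns true if this caller took the first reference
// and is therefore the one that has to unlink the share (an assumption).
bool take_reference(Share &s) { return s.ref_count.fetch_add(1) == 0; }

// The condition that the failing assertion checked.
bool assertion_would_hold(const Share &s) {
  return s.prev == nullptr && s.next == nullptr;
}

struct Outcome {
  bool a_is_owner;
  bool b_is_owner;
  bool holds_for_b_before_unlink;
  bool holds_after_unlink;
};

// Replays one problematic interleaving of two acquiring threads, A and B.
Outcome simulate_interleaving() {
  UnusedList unused;
  Share other, share;
  unused.link(&other);
  unused.link(&share);  // share is cached but unused: ref_count == 0, linked

  Outcome r{};
  r.a_is_owner = take_reference(share);  // A takes the first reference
  r.b_is_owner = take_reference(share);  // B arrives before A has unlinked
  r.holds_for_b_before_unlink = assertion_would_hold(share);  // B's view
  unused.unlink(&share);                 // A finally unlinks the share
  r.holds_after_unlink = assertion_would_hold(share);
  return r;
}
```

In this replay, thread B holds a valid reference while the share is still linked, so the asserted condition is false for B even though nothing is wrong. Once A has unlinked the share, both pointers are null again, which may also be why the core dump above shows them as 0x0.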

Comment by Sergey Vojtovich [ 2013-08-14 ]

Ehm,

5. if we're still not crashed goto 3

...to evict t1 from the table cache...

sorry for confusion.

Comment by Elena Stepanova [ 2013-08-14 ]

Thanks for the analysis. Based on that, I was able to create a stress test which reproduces the problem reliably on my machine, and I added the details to the problem description. You can try it out on yours, or I can try your fix on mine, or I can create a test setup on perro.

Comment by Sergey Vojtovich [ 2013-08-15 ]

Hi Elena,

just remove this assertion and test again.

There is an even simpler test case, which I didn't think about yesterday:
1. create a view
2. let multiple threads access this view
and that's it.

Comment by Sergey Vojtovich [ 2013-08-15 ]

Fixed in 10.0.4.

http://bazaar.launchpad.net/~maria-captains/maria/10.0/revision/3801

Comment by Elena Stepanova [ 2013-08-16 ]

Removing the assertion worked all right so far. I'm running more tests on the new revision.
