[MDEV-12428] SIGSEGV in buf_page_decrypt_after_read() during DDL
Created: 2017-04-02  Updated: 2020-01-24  Resolved: 2017-04-04

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2
Fix Version/s: 10.1.23, 10.2.5

Type: Bug
Priority: Blocker
Reporter: Elena Stepanova
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Problem/Incident
is caused by MDEV-11581 Mariadb starts innodb encryption thre... Closed
is caused by MDEV-11738 Mariadb uses 100% of several of my 8 ... Closed
Relates
relates to MDEV-12602 InnoDB: Failing assertion: space->n_p... Closed
relates to MDEV-12694 test failure: encryption.create_or_re... Closed

Description

https://internal.askmonty.org/buildbot/builders/p8-rhel71-bintar-debug/builds/1775/steps/test/logs/stdio
https://internal.askmonty.org/buildbot/builders/kvm-fulltest2/builds/7628/steps/test_2/logs/stdio

rpl.rpl_commit_after_flush 'innodb,row'  w1 [ fail ]
        Test ended at 2017-04-02 08:39:50
 
CURRENT_TEST: rpl.rpl_commit_after_flush
mysqltest: In included file "./include/rpl_init.inc": 
included from ./include/master-slave.inc at line 38:
included from /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/mysql-test/suite/rpl/t/rpl_commit_after_flush.test at line 3:
At line 170: query 'SET GLOBAL gtid_slave_pos= ""' failed: 2013: Lost connection to MySQL server during query

10.2 238c6700dd5

#3  <signal handler called>
#4  0x00000000281c36cc in buf_page_decrypt_after_read (bpage=0x3fff95adbf40) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/buf/buf0buf.cc:7543
#5  0x00000000281bdcd0 in buf_page_io_complete (bpage=0x3fff95adbf40, evict=false) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/buf/buf0buf.cc:5949
#6  0x00000000281f2a68 in buf_read_page_low (err=0x3fff95794414, sync=true, type=0, mode=132, page_id=..., page_size=..., unzip=false, rbpage=0x3fff957945d8) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/buf/buf0rea.cc:217
#7  0x00000000281f329c in buf_read_page (page_id=..., page_size=..., bpage=0x3fff957945d8) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/buf/buf0rea.cc:417
#8  0x00000000281b8130 in buf_page_get_gen (page_id=..., page_size=..., rw_latch=2, guess=0x0, mode=10, file=0x288c8b60 "/home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/btr/btr0btr.cc", line=1096, mtr=0x3fff95794af0, err=0x0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/buf/buf0buf.cc:4340
#9  0x0000000028146244 in btr_free_root_check (page_id=..., page_size=..., index_id=45, mtr=0x3fff95794af0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/btr/btr0btr.cc:1095
#10 0x0000000028146dc8 in btr_free_if_exists (page_id=..., page_size=..., index_id=45, mtr=0x3fff95794af0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/btr/btr0btr.cc:1372
#11 0x0000000028208830 in dict_drop_index_tree (rec=0x3fff96320a48 "", pcur=0x3fff95795008, mtr=0x3fff95794af0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/dict/dict0crea.cc:1052
#12 0x000000002805cd20 in DropIndex::operator() (this=0x3fff957951a0, mtr=0x3fff95794af0, pcur=0x3fff95795008) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/row/row0trunc.cc:949
#13 0x0000000028067b8c in IndexIterator::for_each<DropIndex> (this=0x3fff95794af0, callback=...) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/row/row0trunc.cc:110
#14 0x00000000280650d8 in SysIndexIterator::for_each<DropIndex> (this=0x3fff95795168, callback=...) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/row/row0trunc.cc:168
#15 0x000000002805f5e8 in row_truncate_table_for_mysql (table=0x3fff2c0835a8, trx=0x3fff964808c8) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/row/row0trunc.cc:1966
#16 0x0000000027e58020 in ha_innobase::truncate (this=0x3fff3407d668) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/innobase/handler/ha_innodb.cc:13755
#17 0x0000000027b35378 in handler::ha_truncate (this=0x3fff3407d668) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/handler.cc:4069
#18 0x0000000027a1790c in rpl_slave_state::truncate_state_table (this=0x10010f94190, thd=0x3fff48000b00) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/rpl_gtid.cc:405
#19 0x0000000027a19b30 in rpl_slave_state::load (this=0x10010f94190, thd=0x3fff48000b00, state_from_master=0x3fff480125e8 "", len=0, reset=true, in_statement=true) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/rpl_gtid.cc:1070
#20 0x0000000027820a58 in rpl_gtid_pos_update (thd=0x3fff48000b00, str=0x3fff480125e8 "", len=0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_repl.cc:4511
#21 0x00000000279a5ebc in Sys_var_gtid_slave_pos::global_update (this=0x28fe1cc8 <Sys_gtid_slave_pos>, thd=0x3fff48000b00, var=0x3fff48012598) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sys_vars.cc:1651
#22 0x00000000276cd2f4 in sys_var::update (this=0x28fe1cc8 <Sys_gtid_slave_pos>, thd=0x3fff48000b00, var=0x3fff48012598) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/set_var.cc:208
#23 0x00000000276cfd20 in set_var::update (this=0x3fff48012598, thd=0x3fff48000b00) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/set_var.cc:825
#24 0x00000000276cf6a0 in sql_set_variables (thd=0x3fff48000b00, var_list=0x3fff48005310, free=true) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/set_var.cc:726
#25 0x00000000277d3fbc in mysql_execute_command (thd=0x3fff48000b00) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_parse.cc:4811
#26 0x00000000277debec in mysql_parse (thd=0x3fff48000b00, rawbuf=0x3fff48012448 "SET GLOBAL gtid_slave_pos= \"\"", length=29, parser_state=0x3fff957969e8, is_com_multi=false, is_next_command=false) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_parse.cc:7874
#27 0x00000000277ca5b8 in dispatch_command (command=COM_QUERY, thd=0x3fff48000b00, packet=0x3fff480ebf51 "SET GLOBAL gtid_slave_pos= \"\"", packet_length=29, is_com_multi=false, is_next_command=false) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_parse.cc:1812
#28 0x00000000277c8b84 in do_command (thd=0x3fff48000b00) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_parse.cc:1362
#29 0x0000000027980a30 in do_handle_one_connection (connect=0x100115632a0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_connect.cc:1354
#30 0x000000002798068c in handle_one_connection (arg=0x100115632a0) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/sql/sql_connect.cc:1260
#31 0x0000000028550808 in pfs_spawn_thread (arg=0x10011607710) at /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/storage/perfschema/pfs.cc:1862
#32 0x00003fff9bf27cec in start_thread (arg=0x3fff95798180) at pthread_create.c:312
#33 0x00003fff9b791140 in clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:96



Comments
Comment by Marko Mäkelä [ 2017-04-03 ]

I believe that this is a regression caused by merging the MDEV-11738/MDEV-11581 fix from 10.1 to 10.2.

In 10.2, the problem occurs when the replication subsystem (exercised by the replication tests) internally executes TRUNCATE TABLE mysql.gtid_slave_pos. For the crash to occur, I believe that some pages of the table must be missing from the InnoDB buffer pool. Then, when looking up the tablespace while reading a page from the file, InnoDB would get a NULL pointer, because both fil_space_t::stop_new_ops and fil_space_t::is_being_truncated would be set during TRUNCATE TABLE.

In 10.1, the bug should be harder to trigger, but I believe that it is there as well, introduced by the above-mentioned fix. The scenario would be that a background operation reads pages into the buffer pool while the tablespace is being dropped, truncated or rebuilt (DROP/TRUNCATE/ALTER/OPTIMIZE TABLE).

My suggested fix is to add a parameter to fil_space_acquire() that indicates that the caller is doing it for low-level I/O. The only caller that would set this flag would be buf_page_decrypt_after_read(). With this change, the replication tests in 10.2 seem to work.
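
The idea can be illustrated with a small self-contained model. This is only a sketch under stated assumptions, not the InnoDB code: ToySpace, ToySpaceRegistry, toy_decrypt_after_read() and the for_io parameter name are hypothetical, and the exact condition under which the real lookup refuses a tablespace is assumed here to be "either flag is set". Only the two flag names are taken from the analysis above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy stand-in for fil_space_t. Only the two flags named in the report are
// modeled; everything else in this sketch is hypothetical.
struct ToySpace {
    bool stop_new_ops = false;        // flag named in the report
    bool is_being_truncated = false;  // flag named in the report
    int pending = 0;                  // references currently held by callers
};

// Toy registry standing in for the tablespace lookup.
class ToySpaceRegistry {
public:
    ToySpace& create(std::uint32_t id) { return spaces_[id]; }

    // Hypothetical analogue of fil_space_acquire() with the proposed flag.
    // ASSUMPTION: the lookup is refused when either flag is set, unless the
    // caller declares that it is completing low-level I/O.
    ToySpace* acquire(std::uint32_t id, bool for_io) {
        auto it = spaces_.find(id);
        if (it == spaces_.end()) {
            return nullptr;  // no such tablespace
        }
        ToySpace& space = it->second;
        if (!for_io && (space.stop_new_ops || space.is_being_truncated)) {
            return nullptr;  // DDL in progress: refuse ordinary operations
        }
        ++space.pending;
        return &space;
    }

    void release(ToySpace* space) { --space->pending; }

private:
    std::map<std::uint32_t, ToySpace> spaces_;
};

// Toy analogue of the read-completion step. Returns true if the tablespace
// metadata was available so that the page could be post-processed.
bool toy_decrypt_after_read(ToySpaceRegistry& registry, std::uint32_t id,
                            bool for_io) {
    ToySpace* space = registry.acquire(id, for_io);
    if (space == nullptr) {
        // The reported crash corresponds to using this pointer unchecked.
        return false;
    }
    registry.release(space);
    return true;
}
```

In this model, with both flags set, a call with for_io=false gets no tablespace, while a call with for_io=true still does. That mirrors the intent that a page read already in flight can complete during TRUNCATE TABLE while new ordinary operations remain blocked.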

Comment by Marko Mäkelä [ 2017-04-03 ]

bb-10.1-marko
bb-10.2-marko

Comment by Jan Lindström (Inactive) [ 2017-04-03 ]

OK to push.
