
SIGABRT due to running out of memory for InnoDB locks

Details

    Description

      # mysqld options required for replay: --innodb-buffer-pool-size=300M 
      # Repeat entire testcase (inc server startup) if necessary to reproduce crash (usually a single attempt is sufficient)
      SET GLOBAL innodb_buffer_pool_size=12*1024*1024;
      CREATE TABLE t1 (d DOUBLE) ENGINE=InnoDB;
      INSERT INTO t1 VALUES (0x0061),(0x0041),(0x00E0),(0x00C0),(0x1EA3),(0x1EA2),(0x00E3),(0x00C3),(0x00E1),(0x00C1),(0x1EA1),(0x1EA0);
      INSERT INTO t1 SELECT t1.* FROM t1,t1 t2,t1 t3,t1 t4,t1 t5,t1 t6;
      

      Leads to:

      10.10.0 081a284712bb661349e2e3802077b12211cede3e (Optimized)

      Core was generated by `/test/MD310522-mariadb-10.10.0-linux-x86_64-opt/bin/mysqld --no-defaults --core'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      [Current thread is 1 (Thread 0x14bae07b4700 (LWP 3389561))]
      (gdb) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x000014bb12242859 in __GI_abort () at abort.c:79
      #2  0x00005601e3746893 in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /test/10.10_opt/storage/innobase/ut/ut0ut.cc:530
      #3  0x00005601e374f45a in buf_LRU_check_size_of_non_data_objects () at /test/10.10_opt/storage/innobase/include/ut0ut.h:337
      #4  buf_LRU_get_free_block (have_mutex=have_mutex@entry=false) at /test/10.10_opt/storage/innobase/buf/buf0lru.cc:404
      #5  0x00005601e37a6ff7 in buf_page_init_for_read (mode=132, page_id={m_id = 21474838481}, zip_size=0, unzip=<optimized out>) at /test/10.10_opt/storage/innobase/buf/buf0rea.cc:119
      #6  0x00005601e3eb3c12 in buf_read_page_low (unzip=false, zip_size=0, page_id={m_id = 21474838481}, mode=132, sync=false, space=0x14ba68019740, err=<synthetic pointer>) at /test/10.10_opt/storage/innobase/buf/buf0rea.cc:299
      #7  buf_read_page_background (space=space@entry=0x14ba68019740, page_id={m_id = 0}, zip_size=0) at /test/10.10_opt/storage/innobase/buf/buf0rea.cc:487
      #8  0x00005601e3e8ce09 in btr_cur_prefetch_siblings (block=0x14bb0957bd60, index=<optimized out>) at /test/10.10_opt/storage/innobase/include/buf0types.h:95
      #9  0x00005601e3e988ae in btr_cur_optimistic_insert (flags=<optimized out>, cursor=0x14bae07b1360, offsets=offsets@entry=0x14bae07b1348, heap=heap@entry=0x14bae07b1340, entry=entry@entry=0x14ba6801aa98, rec=rec@entry=0x14bae07b1358, big_rec=0x14bae07b1338, n_ext=<optimized out>, thr=0x14ba6801c788, mtr=0x14bae07b1ab0) at /test/10.10_opt/storage/innobase/btr/btr0cur.cc:3453
      #10 0x00005601e3e0cf4d in row_ins_clust_index_entry_low (flags=<optimized out>, mode=<optimized out>, index=0x14ba6801adf8, n_uniq=<optimized out>, entry=0x14ba6801aa98, n_ext=<optimized out>, thr=<optimized out>) at /test/10.10_opt/storage/innobase/row/row0ins.cc:2754
      #11 0x00005601e3e0e6e2 in row_ins_clust_index_entry (index=0x14ba6801adf8, entry=0x14ba6801aa98, thr=0x14ba6801c788, n_ext=0) at /test/10.10_opt/storage/innobase/row/row0ins.cc:3156
      #12 0x00005601e3e0ee31 in row_ins_index_entry (thr=0x14ba6801c788, entry=<optimized out>, index=<optimized out>) at /test/10.10_opt/storage/innobase/row/row0ins.cc:3293
      #13 row_ins_index_entry_step (thr=0x14ba6801c788, node=<optimized out>) at /test/10.10_opt/storage/innobase/row/row0ins.cc:3461
      #14 row_ins (thr=0x14ba6801c788, node=<optimized out>) at /test/10.10_opt/storage/innobase/row/row0ins.cc:3608
      #15 row_ins_step (thr=thr@entry=0x14ba6801c788) at /test/10.10_opt/storage/innobase/row/row0ins.cc:3754
      #16 0x00005601e3e1ec15 in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x14ba6801b458 <incomplete sequence \375>, prebuilt=0x14ba6801c098, ins_mode=ROW_INS_NORMAL) at /test/10.10_opt/storage/innobase/row/row0mysql.cc:1318
      #17 0x00005601e3d73a1a in ha_innobase::write_row (this=0x14ba6801b870, record=0x14ba6801b458 <incomplete sequence \375>) at /test/10.10_opt/storage/innobase/handler/ha_innodb.cc:7842
      #18 0x00005601e3afda50 in handler::ha_write_row (this=0x14ba6801b870, buf=0x14ba6801b458 <incomplete sequence \375>) at /test/10.10_opt/sql/handler.cc:7551
      #19 0x00005601e3872c3d in write_record (thd=0x14ba68000c58, table=0x14ba68017ca8, info=info@entry=0x14ba680141f0, sink=0x0) at /test/10.10_opt/sql/sql_insert.cc:2162
      #20 0x00005601e3873833 in select_insert::send_data (this=0x14ba680141a0, values=<optimized out>) at /test/10.10_opt/sql/sql_insert.cc:4126
      #21 0x00005601e391046c in select_result_sink::send_data_with_check (u=<optimized out>, sent=<optimized out>, items=@0x14ba68045e78: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x14ba68049238, last = 0x14ba68049238, elements = 1}, <No data fields>}, this=<optimized out>) at /test/10.10_opt/sql/sql_class.h:5689
      #22 select_result_sink::send_data_with_check (sent=<optimized out>, u=<optimized out>, items=@0x14ba68045e78: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x14ba68049238, last = 0x14ba68049238, elements = 1}, <No data fields>}, this=<optimized out>) at /test/10.10_opt/sql/sql_class.h:5679
      #23 end_send (join=0x14ba68045ab0, join_tab=0x14ba68055e60, end_of_records=<optimized out>) at /test/10.10_opt/sql/sql_select.cc:22427
      #24 0x00005601e38e7173 in evaluate_join_record (join=join@entry=0x14ba68045ab0, join_tab=0x14ba68055ab0, error=error@entry=0) at /test/10.10_opt/sql/sql_select.cc:21421
      #25 0x00005601e391cee7 in AGGR_OP::end_send (this=0x14ba68049228) at /test/10.10_opt/sql/sql_select.cc:29660
      #26 0x00005601e391d120 in sub_select_postjoin_aggr (join=0x14ba68045ab0, join_tab=0x14ba68055ab0, end_of_records=<optimized out>) at /test/10.10_opt/sql/sql_select.cc:20901
      #27 0x00005601e39264df in do_select (procedure=<optimized out>, join=0x14ba68045ab0) at /test/10.10_opt/sql/sql_select.cc:20738
      #28 JOIN::exec_inner (this=0x14ba68045ab0) at /test/10.10_opt/sql/sql_select.cc:4786
      #29 0x00005601e3926a48 in JOIN::exec (this=this@entry=0x14ba68045ab0) at /test/10.10_opt/sql/sql_select.cc:4564
      #30 0x00005601e3924c51 in mysql_select (thd=0x14ba68000c58, tables=0x14ba68011508, fields=@0x14ba680111c8: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x14ba680114c0, last = 0x14ba680114c0, elements = 1}, <No data fields>}, conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=<optimized out>, result=0x14ba680141a0, unit=0x14ba68004cb8, select_lex=0x14ba68010f28) at /test/10.10_opt/sql/sql_select.cc:5044
      #31 0x00005601e3925397 in handle_select (thd=thd@entry=0x14ba68000c58, lex=lex@entry=0x14ba68004be0, result=result@entry=0x14ba680141a0, setup_tables_done_option=setup_tables_done_option@entry=1073741824) at /test/10.10_opt/sql/sql_select.cc:578
      #32 0x00005601e38b74dc in mysql_execute_command (thd=0x14ba68000c58, is_called_from_prepared_stmt=<optimized out>) at /test/10.10_opt/sql/sql_parse.cc:4708
      #33 0x00005601e38a3bb5 in mysql_parse (rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, thd=0x14ba68000c58) at /test/10.10_opt/sql/sql_parse.cc:8036
      #34 mysql_parse (thd=0x14ba68000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/10.10_opt/sql/sql_parse.cc:7958
      #35 0x00005601e38af6ca in dispatch_command (command=COM_QUERY, thd=0x14ba68000c58, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/10.10_opt/sql/sql_class.h:1364
      #36 0x00005601e38b15f2 in do_command (thd=0x14ba68000c58, blocking=blocking@entry=true) at /test/10.10_opt/sql/sql_parse.cc:1407
      #37 0x00005601e39c78af in do_handle_one_connection (connect=<optimized out>, connect@entry=0x5601e6906368, put_in_cache=put_in_cache@entry=true) at /test/10.10_opt/sql/sql_connect.cc:1418
      #38 0x00005601e39c7b8d in handle_one_connection (arg=0x5601e6906368) at /test/10.10_opt/sql/sql_connect.cc:1312
      #39 0x000014bb12753609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #40 0x000014bb1233f133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      10.10.0 081a284712bb661349e2e3802077b12211cede3e (Debug)

      Core was generated by `/test/MD310522-mariadb-10.10.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      [Current thread is 1 (Thread 0x1455640cc700 (LWP 3589026))]
      (gdb) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x00001455874f5859 in __GI_abort () at abort.c:79
      #2  0x000055f12541eaad in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /test/10.10_dbg/storage/innobase/ut/ut0ut.cc:530
      #3  0x000055f1254bf9a5 in buf_LRU_check_size_of_non_data_objects () at /test/10.10_dbg/storage/innobase/include/ut0ut.h:337
      #4  buf_LRU_get_free_block (have_mutex=have_mutex@entry=false) at /test/10.10_dbg/storage/innobase/buf/buf0lru.cc:404
      #5  0x000055f1254c06dd in buf_page_init_for_read (mode=mode@entry=132, page_id=<optimized out>, zip_size=zip_size@entry=0, unzip=unzip@entry=false) at /test/10.10_dbg/storage/innobase/buf/buf0rea.cc:119
      #6  0x000055f1254c2ef0 in buf_read_page_low (err=err@entry=0x1455640c8ae4, space=space@entry=0x1454ac0218f0, sync=sync@entry=false, mode=mode@entry=132, page_id={m_id = 21474838467}, zip_size=0, unzip=false) at /test/10.10_dbg/storage/innobase/buf/buf0rea.cc:299
      #7  0x000055f1254c535a in buf_read_page_background (space=space@entry=0x1454ac0218f0, page_id={m_id = 21474838467}, zip_size=<optimized out>) at /test/10.10_dbg/storage/innobase/buf/buf0rea.cc:487
      #8  0x000055f12544c0bf in btr_cur_prefetch_siblings (block=block@entry=0x14557e833800, index=index@entry=0x1454ac0205c0) at /test/10.10_dbg/storage/innobase/include/buf0types.h:95
      #9  0x000055f125452fac in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x1455640c9160, offsets=offsets@entry=0x1455640c9148, heap=heap@entry=0x1455640c9140, entry=entry@entry=0x1454ac01eff0, rec=rec@entry=0x1455640c9158, big_rec=0x1455640c9138, n_ext=<optimized out>, thr=0x1454ac026f00, mtr=0x1455640c9710) at /test/10.10_dbg/storage/innobase/btr/btr0cur.cc:3453
      #10 0x000055f12533c37b in row_ins_clust_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=2, index=index@entry=0x1454ac0205c0, n_uniq=n_uniq@entry=0, entry=entry@entry=0x1454ac01eff0, n_ext=n_ext@entry=0, thr=<optimized out>) at /test/10.10_dbg/storage/innobase/row/row0ins.cc:2754
      #11 0x000055f12533e562 in row_ins_clust_index_entry (index=index@entry=0x1454ac0205c0, entry=entry@entry=0x1454ac01eff0, thr=thr@entry=0x1454ac026f00, n_ext=n_ext@entry=0) at /test/10.10_dbg/storage/innobase/row/row0ins.cc:3156
      #12 0x000055f1253402e8 in row_ins_index_entry (thr=0x1454ac026f00, entry=0x1454ac01eff0, index=0x1454ac0205c0) at /test/10.10_dbg/storage/innobase/row/row0ins.cc:3293
      #13 row_ins_index_entry_step (thr=0x1454ac026f00, node=<optimized out>) at /test/10.10_dbg/storage/innobase/row/row0ins.cc:3461
      #14 row_ins (thr=0x1454ac026f00, node=<optimized out>) at /test/10.10_dbg/storage/innobase/row/row0ins.cc:3608
      #15 row_ins_step (thr=thr@entry=0x1454ac026f00) at /test/10.10_dbg/storage/innobase/row/row0ins.cc:3754
      #16 0x000055f12536293d in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x1454ac025b18 <incomplete sequence \375>, prebuilt=0x1454ac0267f0, ins_mode=ROW_INS_NORMAL) at /test/10.10_dbg/storage/innobase/row/row0mysql.cc:1318
      #17 0x000055f1251eca63 in ha_innobase::write_row (this=0x1454ac025fa0, record=0x1454ac025b18 <incomplete sequence \375>) at /test/10.10_dbg/storage/innobase/handler/ha_innodb.cc:7842
      #18 0x000055f124ed76c9 in handler::ha_write_row (this=0x1454ac025fa0, buf=0x1454ac025b18 <incomplete sequence \375>) at /test/10.10_dbg/sql/handler.cc:7551
      #19 0x000055f124b9afd2 in write_record (thd=0x1454ac000db8, table=0x1454ac0256a8, info=info@entry=0x1454ac017710, sink=0x0) at /test/10.10_dbg/sql/sql_insert.cc:2162
      #20 0x000055f124b9b460 in select_insert::send_data (this=0x1454ac0176c0, values=<optimized out>) at /test/10.10_dbg/sql/sql_insert.cc:4126
      #21 0x000055f124c54fc9 in select_result_sink::send_data_with_check (sent=<optimized out>, u=<optimized out>, items=@0x1454ac06e3a8: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x1454ac071758, last = 0x1454ac071758, elements = 1}, <No data fields>}, this=<optimized out>) at /test/10.10_dbg/sql/sql_class.h:5689
      #22 end_send (join=0x1454ac06dfe0, join_tab=0x1454ac07ed60, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:22427
      #23 0x000055f124c2702e in evaluate_join_record (join=join@entry=0x1454ac06dfe0, join_tab=0x1454ac07e9b0, error=error@entry=0) at /test/10.10_dbg/sql/sql_select.cc:21421
      #24 0x000055f124c63943 in AGGR_OP::end_send (this=this@entry=0x1454ac071748) at /test/10.10_dbg/sql/sql_select.cc:29660
      #25 0x000055f124c63c4b in sub_select_postjoin_aggr (join=0x1454ac06dfe0, join_tab=0x1454ac07e9b0, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:20901
      #26 0x000055f124c3c9d1 in sub_select (join=join@entry=0x1454ac06dfe0, join_tab=join_tab@entry=0x1454ac07e600, end_of_records=end_of_records@entry=true) at /test/10.10_dbg/sql/sql_select.cc:21135
      #27 0x000055f124c3ce4a in sub_select_cache (join=0x1454ac06dfe0, join_tab=0x1454ac07e600, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:20968
      #28 0x000055f124c3c9d1 in sub_select (join=join@entry=0x1454ac06dfe0, join_tab=join_tab@entry=0x1454ac07e250, end_of_records=end_of_records@entry=true) at /test/10.10_dbg/sql/sql_select.cc:21135
      #29 0x000055f124c3ce4a in sub_select_cache (join=0x1454ac06dfe0, join_tab=0x1454ac07e250, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:20968
      #30 0x000055f124c3c9d1 in sub_select (join=join@entry=0x1454ac06dfe0, join_tab=join_tab@entry=0x1454ac07dea0, end_of_records=end_of_records@entry=true) at /test/10.10_dbg/sql/sql_select.cc:21135
      #31 0x000055f124c3ce4a in sub_select_cache (join=0x1454ac06dfe0, join_tab=0x1454ac07dea0, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:20968
      #32 0x000055f124c3c9d1 in sub_select (join=join@entry=0x1454ac06dfe0, join_tab=join_tab@entry=0x1454ac07daf0, end_of_records=end_of_records@entry=true) at /test/10.10_dbg/sql/sql_select.cc:21135
      #33 0x000055f124c3ce4a in sub_select_cache (join=0x1454ac06dfe0, join_tab=0x1454ac07daf0, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:20968
      #34 0x000055f124c3c9d1 in sub_select (join=join@entry=0x1454ac06dfe0, join_tab=join_tab@entry=0x1454ac07d740, end_of_records=end_of_records@entry=true) at /test/10.10_dbg/sql/sql_select.cc:21135
      #35 0x000055f124c3ce4a in sub_select_cache (join=0x1454ac06dfe0, join_tab=0x1454ac07d740, end_of_records=<optimized out>) at /test/10.10_dbg/sql/sql_select.cc:20968
      #36 0x000055f124c3c9d1 in sub_select (join=0x1454ac06dfe0, join_tab=0x1454ac07d390, end_of_records=true) at /test/10.10_dbg/sql/sql_select.cc:21135
      #37 0x000055f124c702b1 in do_select (procedure=<optimized out>, join=0x1454ac06dfe0) at /test/10.10_dbg/sql/sql_select.cc:20738
      #38 JOIN::exec_inner (this=this@entry=0x1454ac06dfe0) at /test/10.10_dbg/sql/sql_select.cc:4786
      #39 0x000055f124c70814 in JOIN::exec (this=this@entry=0x1454ac06dfe0) at /test/10.10_dbg/sql/sql_select.cc:4564
      #40 0x000055f124c6e598 in mysql_select (thd=thd@entry=0x1454ac000db8, tables=0x1454ac014a28, fields=@0x1454ac0146e8: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x1454ac0149e0, last = 0x1454ac0149e0, elements = 1}, <No data fields>}, conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2202244745984, result=0x1454ac0176c0, unit=0x1454ac004fd8, select_lex=0x1454ac014448) at /test/10.10_dbg/sql/sql_select.cc:5044
      #41 0x000055f124c6ed8e in handle_select (thd=thd@entry=0x1454ac000db8, lex=lex@entry=0x1454ac004f00, result=result@entry=0x1454ac0176c0, setup_tables_done_option=setup_tables_done_option@entry=1073741824) at /test/10.10_dbg/sql/sql_select.cc:578
      #42 0x000055f124be8f9d in mysql_execute_command (thd=thd@entry=0x1454ac000db8, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/10.10_dbg/sql/sql_parse.cc:4708
      #43 0x000055f124bd4e3a in mysql_parse (thd=thd@entry=0x1454ac000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1455640cb470) at /test/10.10_dbg/sql/sql_parse.cc:8036
      #44 0x000055f124be2422 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x1454ac000db8, packet=packet@entry=0x1454ac00b6d9 "INSERT INTO t1 SELECT t1.* FROM t1,t1 t2,t1 t3,t1 t4,t1 t5,t1 t6", packet_length=packet_length@entry=64, blocking=blocking@entry=true) at /test/10.10_dbg/sql/sql_class.h:1364
      #45 0x000055f124be4b2c in do_command (thd=0x1454ac000db8, blocking=blocking@entry=true) at /test/10.10_dbg/sql/sql_parse.cc:1407
      #46 0x000055f124d443c0 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55f127e141d8, put_in_cache=put_in_cache@entry=true) at /test/10.10_dbg/sql/sql_connect.cc:1418
      #47 0x000055f124d448c9 in handle_one_connection (arg=0x55f127e141d8) at /test/10.10_dbg/sql/sql_connect.cc:1312
      #48 0x0000145587a06609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #49 0x00001455875f2133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      Though earlier versions do not crash, the query takes a very long time in all MariaDB versions, as well as in all MySQL Server versions. See MDEV-28804.
      As an interesting side note, MySQL 5.7 runs just under two minutes slower than MariaDB 10.7.5 on the first query, started at approximately the same time on the same machine.

          Activity

            marko Marko Mäkelä added a comment:

            If I remember correctly, the prototype that I had produced performed very badly in stress tests.
            debarun Debarun Banerjee added a comment (edited):

            I have analyzed the issue. The root cause is that explicit record locks take up much more space in the buffer pool; it is a regression in 10.6.
            Let's look at the problem from the user's perspective. We use a buffer pool of sufficient size (128M) to prevent other critical symptoms (I will come to those shortly) and focus on the buffer-pool memory occupied by non-data objects, using the test mentioned in the bug. I use a transaction block so that the locks are not released immediately and we can measure the memory.

            Test

            CREATE TABLE t1 (d DOUBLE) ENGINE=InnoDB;
            INSERT INTO t1 VALUES (0x0061),(0x0041),(0x00E0),(0x00C0),(0x1EA3),(0x1EA2),(0x00E3),(0x00C3),(0x00E1),(0x00C1),(0x1EA1),(0x1EA0);
             
            START TRANSACTION;
            INSERT INTO t1 SELECT t1.* FROM t1,t1 t2,t1 t3,t1 t4,t1 t5,t1 t6;
             
            SELECT POOL_SIZE, FREE_BUFFERS, DATABASE_PAGES, MODIFIED_DATABASE_PAGES, POOL_SIZE - (FREE_BUFFERS + DATABASE_PAGES) FROM information_schema.innodb_buffer_pool_stats;
            

            10.5

            +-----------+--------------+----------------+-------------------------+---------------------------------------------+
            | POOL_SIZE | FREE_BUFFERS | DATABASE_PAGES | MODIFIED_DATABASE_PAGES | POOL_SIZE - (FREE_BUFFERS + DATABASE_PAGES) |
            +-----------+--------------+----------------+-------------------------+---------------------------------------------+
            |      8015 |            0 |           7609 |                    1485 |                                         406 |
            +-----------+--------------+----------------+-------------------------+---------------------------------------------+
            

            10.6

            +-----------+--------------+----------------+-------------------------+---------------------------------------------+
            | POOL_SIZE | FREE_BUFFERS | DATABASE_PAGES | MODIFIED_DATABASE_PAGES | POOL_SIZE - (FREE_BUFFERS + DATABASE_PAGES) | 
            +-----------+--------------+----------------+-------------------------+---------------------------------------------+
            |      8089 |         1536 |           3724 |                     280 |                                        2829 |
            +-----------+--------------+----------------+-------------------------+---------------------------------------------+
            

            The last column measures the buffer-pool pages occupied by non-data objects, which in this case are the explicit row locks. We can observe that usage increases from 406 pages (6.34M) to 2829 pages (44.2M), a substantial increase in lock memory usage of roughly 7x.
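            The page-to-size conversion above can be checked with a quick sketch (this assumes the default 16 KiB innodb_page_size; the page counts are the ones from the tables above):

```python
# Convert buffer-pool page counts to MiB, assuming the default
# 16 KiB InnoDB page size (innodb_page_size=16k).
PAGE_SIZE = 16 * 1024

def pages_to_mib(pages: int) -> float:
    return pages * PAGE_SIZE / (1024 * 1024)

print(round(pages_to_mib(406), 2))   # → 6.34  (10.5 non-data pages)
print(round(pages_to_mib(2829), 2))  # → 44.2  (10.6 non-data pages)
```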

            Analysis shows the issue was introduced in 10.6 by the following commit:

            commit 898dcf93a886a68450dce7713208fdad82f9178e (HEAD -> 10.6)
            Author: Marko Mäkelä <marko.makela@mariadb.com>
            Date:   Tue Jan 26 09:00:43 2021 +0200
             
                Cleanup the lock creation
                
                LOCK_MAX_N_STEPS_IN_DEADLOCK_CHECK, LOCK_MAX_DEPTH_IN_DEADLOCK_CHECK,
                LOCK_RELEASE_INTERVAL: Replace with the bare use of the constants.
                
                lock_rec_create_low(): Remove LOCK_PAGE_BITMAP_MARGIN altogether.
                We already have REDZONE_SIZE as a 'safety margin' in AddressSanitizer
                builds, to catch any out-of-bounds access.
            

            static const ulint LOCK_PAGE_BITMAP_MARGIN = 64;

            It removed an important optimization for lock bitmap pre-allocation. We pre-allocate about 8 bytes of extra bitmap space with every lock object to accommodate similar locks on the same page by the same transaction. When that space is exhausted, a new lock object is created, again with an 8-byte pre-allocation. With this optimization removed, we are left with only 1 byte of pre-allocation. When a large number of records are locked in a single page, we end up re-allocating the lock too many times, and the total allocation grows almost quadratically (n^2). With the lock struct size being 88 bytes, here is how the overall memory allocation grows in 10.5 and 10.6 with 512 records locked in a page.

            10.5: allocates 8 lock objects, each with an extra 8-byte bitmap margin:
            (88 + 1*(1 + 8)) + (88 + 2*(1 + 8)) + ... + (88 + 8*(1 + 8)) = 97 + 106 + 115 + ... + 160 ~ 1K

            10.6: allocates 64 lock objects, each with a 1-byte bitmap increment and no margin:
            (88 + 1) + (88 + 2) + ... + (88 + 64) = 89 + 90 + 91 + ... + 152 ~ 8K
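            The two series can be checked with a small sketch (a rough model only: the 88-byte lock struct size and the allocation pattern follow the figures in this analysis, not actual InnoDB code):

```python
# Rough model of total lock-struct memory for 512 records locked on one
# page by one transaction. LOCK_SIZE and the margin follow the figures
# in the analysis above; this is an approximation, not InnoDB code.
LOCK_SIZE = 88  # bytes per lock struct

# 10.5: each lock carries an 8-byte bitmap margin (covering 64 more
# records), so 512 records need 8 lock objects with growing bitmaps.
mem_10_5 = sum(LOCK_SIZE + i * (1 + 8) for i in range(1, 9))

# 10.6: no margin, so a new (slightly larger) lock object is allocated
# every 8 records; 512 records need 64 allocations.
mem_10_6 = sum(LOCK_SIZE + i for i in range(1, 65))

print(mem_10_5)  # → 1028  (~1 KiB)
print(mem_10_6)  # → 7712  (~8 KiB)
```

            The point is not the exact totals but the growth pattern: without the margin, the number of re-allocations per page, and hence total bytes allocated, grows quadratically in the number of locked records.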

            1. While the 10.5 design could be improved as well, it is a reasonable approach: the number of lock objects per page stays small enough to avoid the quadratic growth pattern.

            2. The advantage of the 10.6 design is that it wastes less space per lock when the lock pattern is scattered and few rows are locked per page.

            Since locking many rows in a page is quite common in InnoDB under REPEATABLE READ (the default) and SERIALIZABLE isolation, and the lock object itself already costs us 88 bytes, the extra 8 bytes of pre-allocation seems reasonable, and the bug should be fixed by bringing back the 10.5 design. We may further research and work out a better design altogether for lock allocation in the future.

            The affected scenarios include:
            1. Large range scans with UPDATE, DELETE, or SELECT ... FOR UPDATE
            2. INSERT INTO ... SELECT where we select from the table into which we are inserting

            I will upload a patch for the fix shortly.


            debarun Debarun Banerjee added a comment:

            While testing the scenario, I observed two more independent issues that affect the outcome when free buffer-pool pages run low because too many pages are allocated by locks.

            1. For a low BP size (BP < 80M), we could observe a hang in the page cleaner if too many pages are occupied by locks. This issue is present in 10.5 as well. I am filing a separate issue for it: MDEV-34166.

            2. There is another regression in 10.6 that prevents the DML from exiting with ER_LOCK_TABLE_FULL and instead leads to the FATAL error and server exit. I am filing a separate issue for it: MDEV-34167.

            debarun Debarun Banerjee added a comment:

            Uploaded patch for review.

            marko Marko Mäkelä added a comment:

            Thank you, a good catch and fix.

            People

              debarun Debarun Banerjee
              Roel Roel Van de Paar