[MDEV-33087] ALTER TABLE...ALGORITHM=COPY should build indexes more efficiently
Created: 2023-12-20  Updated: 2024-01-30

Status: Confirmed
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Storage Engine - InnoDB
Affects Version/s: 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2, 11.3, 11.4
Fix Version/s: 10.11, 11.0, 11.1, 11.2, 11.3

Type: Bug
Priority: Critical
Reporter: Marko Mäkelä
Assignee: Thirunarayanan Balathandayuthapani
Resolution: Unresolved
Votes: 0
Labels: performance

Issue Links:
Blocks
is blocked by MDEV-24621 In bulk insert, pre-sort and build in... Closed
is blocked by MDEV-26740 Inplace alter rebuild increases file ... Closed
Relates
relates to MDEV-16356 Allow ALGORITHM=NOCOPY for ADD CONSTR... Open
relates to MDEV-33094 row-by-row logging needs to be disabl... Open
relates to MDEV-33329 ALTER TABLE...FORCE fails to recalcul... Open

Description

As noted in MDEV-26740, ALTER TABLE...ALGORITHM=COPY does not use the MDEV-24621 optimization, which pre-sorts the records of each index and builds the index page by page.
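
To illustrate what "pre-sort and build page by page" means, here is a toy model. It is not InnoDB code: the type `Page`, the function `bulk_build_leaf_pages()` and the fixed `page_capacity` are hypothetical names chosen for this sketch, and integer keys stand in for index records. The point is only that sorted input lets each leaf page be filled once, left to right, instead of searching the tree and modifying a page for every inserted row.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy stand-in for an index leaf page: just a list of keys.
// (Hypothetical; real InnoDB pages are byte arrays with headers.)
using Page = std::vector<int>;

// Sort all keys first, then fill pages sequentially.
// Assumes page_capacity >= 1.
std::vector<Page> bulk_build_leaf_pages(std::vector<int> keys,
                                        std::size_t page_capacity) {
  std::sort(keys.begin(), keys.end());  // the "pre-sort" step
  std::vector<Page> pages;
  for (int key : keys) {
    // Start a new page only when the current one is full.
    if (pages.empty() || pages.back().size() == page_capacity) {
      pages.emplace_back();
    }
    pages.back().push_back(key);
  }
  return pages;
}
```

In this model each page is written exactly once, whereas the row-by-row path visible in the stack trace further down commits a mini-transaction for every inserted record.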

I tested this on MariaDB Server 11.2, which implements MDEV-16329:

./mtr --rr main.alter_table_online_debug,nobinlog

I set a breakpoint inside ha_innobase::extra(HA_EXTRA_END_ALTER_COPY) and started to debug its surroundings:

break ha_innodb.cc:15692
continue
break ha_innobase::write_row

For the first statement that hits the first breakpoint (alter table t1 add b int NULL, algorithm= copy, lock= none), we can see that row-level undo logging is disabled as expected (MDEV-11415). However, a data watchpoint on FIL_PAGE_LSN in the clustered index root page shows that the initial copying phase still uses the regular row-by-row insert API:

11.2 96250c82691169921a38a38bc24910294294eb24

#2  0x0000560582646449 in mtr_t::commit (this=...) at /mariadb/11/storage/innobase/mtr/mtr0mtr.cc:438
#3  0x00005605826a68e8 in row_ins_clust_index_entry_low (flags=..., mode=..., index=..., n_uniq=..., entry=..., n_ext=..., thr=...) at /mariadb/11/storage/innobase/row/row0ins.cc:2895
#4  0x00005605826a6bca in row_ins_clust_index_entry (index=..., entry=..., thr=..., n_ext=...) at /mariadb/11/storage/innobase/row/row0ins.cc:3243
#5  0x00005605826a6e84 in row_ins_index_entry (index=..., entry=..., thr=...) at /mariadb/11/storage/innobase/row/row0ins.cc:3375
#6  0x00005605826a6f50 in row_ins_index_entry_step (node=..., thr=...) at /mariadb/11/storage/innobase/row/row0ins.cc:3543
#7  0x00005605826a71a0 in row_ins (node=..., thr=...) at /mariadb/11/storage/innobase/row/row0ins.cc:3660
#8  0x00005605826a73d9 in row_ins_step (thr=...) at /mariadb/11/storage/innobase/row/row0ins.cc:3789
#9  0x00005605826bd2a9 in row_insert_for_mysql (mysql_rec=..., prebuilt=..., ins_mode=...) at /mariadb/11/storage/innobase/row/row0mysql.cc:1314
#10 0x00005605824d641b in ha_innobase::write_row (this=..., record=...) at /mariadb/11/storage/innobase/handler/ha_innodb.cc:7847
#11 0x0000560581ff2a35 in handler::ha_write_row (this=..., buf=...) at /mariadb/11/sql/handler.cc:7852
#12 0x00005605822adb11 in copy_data_between_tables (thd=..., from=..., to=..., ignore=..., order_num=..., order=..., copied=..., deleted=..., alter_info=..., alter_ctx=..., online=..., start_alter_id=...)
    at /mariadb/11/sql/sql_table.cc:12186
#13 0x00005605822b0fce in mysql_alter_table (thd=..., new_db=..., new_name=..., create_info=..., table_list=..., recreate_info=..., alter_info=..., order_num=..., order=..., ignore=..., if_exists=...)
    at /mariadb/11/sql/sql_table.cc:11335
#14 0x0000560582324b53 in Sql_cmd_alter_table::execute (this=..., thd=...) at /mariadb/11/sql/sql_alter.cc:701
#15 0x00005605821eb5c3 in mysql_execute_command (thd=..., is_called_from_prepared_stmt=...) at /mariadb/11/sql/sql_parse.cc:5777
#16 0x00005605821ec175 in mysql_parse (thd=..., rawbuf=..., length=..., parser_state=...) at /mariadb/11/sql/sql_parse.cc:7808

Because this code path provides a work-around for MDEV-26740, we should fix that bug before implementing this optimization.

It would be good to test the fix on 11.2 before applying it to the earliest supported version (10.11), because MDEV-16329 extends the function copy_data_between_tables() with a call to online_alter_read_from_binlog().



Comments
Comment by Marko Mäkelä [ 2023-12-20 ]

Related to this: although CREATE TABLE…SELECT disables row-by-row undo logging, it currently uses the less efficient row_ins() API:

--source include/have_innodb.inc
--source include/have_sequence.inc
CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB STATS_PERSISTENT=0
SELECT seq a FROM seq_1_to_3;
TRUNCATE t1;
SET STATEMENT unique_checks=0,foreign_key_checks=0 FOR
INSERT INTO t1 SELECT * FROM seq_1_to_10;
DROP TABLE t1;

For the above SQL, only the INSERT…SELECT will invoke the more efficient trx_t::bulk_insert_apply().

Comment by Marko Mäkelä [ 2023-12-20 ]

If foreign_key_checks=1 (the default) and there are any FOREIGN KEY…REFERENCES clauses in the table that is being altered, then I think that we must keep using the row-by-row interface, so that row_ins_check_foreign_constraints() will check the FOREIGN KEY constraints. This is something that could be improved later in MDEV-16356.
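
As a rough illustration of that condition only: the struct `AlterCopyContext` and the function `must_use_row_by_row()` below are hypothetical names for this sketch and do not exist in the server. A real implementation would read the session variable and the table's foreign key metadata instead.

```cpp
#include <cstddef>

// Hypothetical summary of the state that matters for the decision;
// these are not InnoDB or SQL-layer data structures.
struct AlterCopyContext {
  bool foreign_key_checks;     // session value of foreign_key_checks
  std::size_t n_foreign_keys;  // FOREIGN KEY...REFERENCES clauses in the altered table
};

// Returns true if the copy phase has to stay on the row-by-row insert
// path so that the FOREIGN KEY constraints are checked for every row.
bool must_use_row_by_row(const AlterCopyContext& ctx) {
  return ctx.foreign_key_checks && ctx.n_foreign_keys > 0;
}
```

With this predicate, a table without FOREIGN KEY clauses, or a session with foreign_key_checks=0, would be eligible for the bulk path.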
