[MDEV-5171]  Add support for --innodb-optimize-keys to mysqldump. Created: 2013-10-22  Updated: 2023-10-24  Resolved: 2023-04-11

Status: Closed
Project: MariaDB Server
Component/s: Scripts & Clients
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Peter (Stig) Edwards Assignee: Marko Mäkelä
Resolution: Won't Fix Votes: 5
Labels: None

Attachments: Text File mdev5171.diff.txt    
Issue Links:
Relates
relates to MDEV-515 innodb bulk insert Closed
relates to MDEV-11415 Remove excessive undo logging during ... Closed
relates to MDEV-16281 Implement parallel CREATE INDEX, ALTE... Open
relates to MDEV-24621 In bulk insert, pre-sort and build in... Closed
relates to MDEV-32250 To benefit from MDEV-515 , please mak... Open
Sprint: 10.0.20, 10.0.21

 Description   

Hello and thank you for MariaDB,

If the --innodb-optimize-keys mysqldump option were available in MariaDB, I would use it when backing up and moving tables with mysqldump. It can also be used to shrink InnoDB table files on mysqld instances where "ALTER TABLE table_name ROW_FORMAT=Compact" does not use fast index creation, and where expand_fast_index_creation is not available, so that "OPTIMIZE TABLE table_name" and "ALTER TABLE table_name ENGINE=INNODB" do not use fast index creation either.

Having support for expand_fast_index_creation would also be great, but I think there is value in just adding the pragmatic mysqldump option.

Applying the latest changes, with fixes, for the mysqldump option to MariaDB 10 was relatively easy. The original work and the subsequent patches, with tests, were created by Alexey Kopytov.

Here are some links to the background:

http://bugs.mysql.com/bug.php?id=57583
http://bugs.mysql.com/bug.php?id=49120
http://www.percona.com/doc/percona-server/5.5/management/innodb_expanded_fast_index_creation.html#expand_fast_index_creation
http://www.mysqlperformanceblog.com/2012/06/19/building-indexes-by-sorting-in-innodb-aka-fast-index-creation/
http://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/
http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/
https://bugs.launchpad.net/percona-server/+bug/989253
https://bugs.launchpad.net/percona-server/+bug/858945
https://bugs.launchpad.net/percona-server/+bug/744103
https://github.com/facebook/mysql-5.6/commit/7862a74ddb48eceaf2a48531d20550752c868a46

I tested using the example from Mark Callaghan in http://bugs.mysql.com/bug.php?id=57583

create table rt (i int primary key auto_increment, j float) engine=innodb;
insert into rt values (null, 1);
create index x2 on rt(j);
insert into rt select null, rand(0) from rt;     -- run 21 times, for 2,097,152 rows
 
120M test/rt.ibd
 
    Data_length: 62472192
   Index_length: 50937856
      Data_free: 7340032
 
mariadb-10.0.4-linux-x86_64/bin/mysqldump --order-by-primary test rt | mysql -D test2
 
120M test2/rt.ibd
 
    Data_length: 62472192
   Index_length: 50937856
      Data_free: 7340032
 
Percona-Server-5.6.13-rel61.0-461.Linux.x86_64/bin/mysqldump --order-by-primary --innodb-optimize-keys test rt | mysql -D test3
 
92M test3/rt.ibd
 
    Data_length: 62472192
   Index_length: 30998528
      Data_free: 0
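
The smaller Index_length in test3 is consistent with the secondary index being built after the rows are loaded, rather than maintained row by row during the load. Below is a rough sketch of the statement order such a dump would replay, assuming the behaviour described in the Percona documentation linked above. The column definitions and sample rows are illustrative and are not copied from actual mysqldump output.

```sql
-- Assumed shape of a dump written with --innodb-optimize-keys (illustrative only).

-- 1. Create the table with the primary key only; the secondary key x2 is left out.
CREATE TABLE rt (
  i INT NOT NULL AUTO_INCREMENT,
  j FLOAT DEFAULT NULL,
  PRIMARY KEY (i)
) ENGINE=InnoDB;

-- 2. Load the rows in primary key order (--order-by-primary).
INSERT INTO rt VALUES (1, 1), (2, 0.5), (3, 0.25);

-- 3. Add the secondary key afterwards, so that InnoDB can build it by sorting
--    (fast index creation) instead of inserting into it one row at a time.
ALTER TABLE rt ADD KEY x2 (j);
```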

Thanks again.



 Comments   
Comment by Elena Stepanova [ 2013-10-22 ]

I suppose since it's just an option in a client program, it can be added even after 10.0.5, setting to 10.0.6 for now.

Comment by Peter (Stig) Edwards [ 2013-10-23 ]

Alexey pointed me to a recent bug with mysqldump --innodb-optimize-keys in Percona Server, where incorrect CREATE TABLE statements are produced for partitioned tables. It has not been fixed yet:
https://bugs.launchpad.net/percona-server/+bug/1233841

Comment by Peter (Stig) Edwards [ 2013-10-30 ]

This is a diff against 10.0. It contains just the changes needed for mysqldump from the Percona 5.5 code, plus two other tiny changes: an update to the man page and the addition of the warning for the duplicate index in the test result. Attribution may also be needed.

added:
mysql-test/r/percona_mysqldump_innodb_optimize_keys.result
mysql-test/t/percona_mysqldump_innodb_optimize_keys.test
modified:
client/client_priv.h
client/mysqldump.c
man/mysqldump.1

The patch would still have this bug: https://bugs.launchpad.net/percona-server/+bug/1233841

Comment by Patryk Pomykalski [ 2013-11-25 ]

The above bug is fixed.

Comment by Jan Lindström (Inactive) [ 2015-05-12 ]

http://lists.askmonty.org/pipermail/commits/2015-May/007813.html

Comment by Jan Lindström (Inactive) [ 2015-07-27 ]

http://lists.askmonty.org/pipermail/commits/2015-July/008193.html

Could you review this again? I had to rewrite the internal parser to support unquoted identifiers and multi-column keys.

Comment by Rasmus Johansson (Inactive) [ 2016-11-28 ]

MySQL has the same feature request:
http://bugs.mysql.com/bug.php?id=49120

Comment by Marko Mäkelä [ 2017-10-23 ]

Maybe the problem could be solved simply by using a special "bulk load" mechanism when executing a transaction that performs the first insert into an empty table or partition. This could be part of MDEV-515.

Comment by Sergei Golubchik [ 2018-02-14 ]

Wouldn't it be better for InnoDB to support ALTER TABLE DISABLE/ENABLE KEYS that mysqldump already uses around inserts anyway?

A simple solution could be implemented purely in ha_innodb.cc, by dropping indexes on ALTER TABLE DISABLE KEYS and recreating them online on ALTER TABLE ENABLE KEYS. This would not require InnoDB to support disabled indexes, and dropping indexes is fast for an empty table.
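
For illustration, the sketch below shows how the proposal would play out for the rt table from the Description. The versioned comments quoted in the SQL comments are what mysqldump already writes around a table's rows. The DROP INDEX and ADD INDEX statements are written out by hand to show the suggested mapping; they are hypothetical and do not describe what ha_innodb.cc actually does.

```sql
-- Table from the Description, created together with its secondary index.
CREATE TABLE rt (i INT PRIMARY KEY AUTO_INCREMENT, j FLOAT, KEY x2 (j)) ENGINE=InnoDB;

-- mysqldump writes: /*!40000 ALTER TABLE `rt` DISABLE KEYS */;
-- Hypothetical InnoDB handling: drop the secondary index (cheap on an empty table).
ALTER TABLE rt DROP INDEX x2;

-- Row values are illustrative.
INSERT INTO rt VALUES (1, 1), (2, 0.5), (3, 0.25);

-- mysqldump writes: /*!40000 ALTER TABLE `rt` ENABLE KEYS */;
-- Hypothetical InnoDB handling: recreate the index once, over all loaded rows.
ALTER TABLE rt ADD INDEX x2 (j);
```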

Comment by Marko Mäkelä [ 2021-02-09 ]

The scope of MDEV-515 was reduced, and MDEV-24621 was filed for optimizing index creation during an insert into an empty table. With that, I think that a LOAD DATA statement should do the right thing.

Unfortunately, currently mysqldump may generate multiple INSERT statements per table. Only the first INSERT statement into an empty table would be optimized by MDEV-515 and MDEV-24621. Could we change that to LOAD?

Comment by Daniel Black [ 2021-02-15 ]

By default mysqldump runs with --opt, which implies --extended-insert, so there is one INSERT per table (up to the max-packet-size, default 24M). It can be configured otherwise.

For LOAD DATA we'd need the mariadb client (or server, since mysqldump output isn't strictly for consumption by the mysql/mariadb client) to recognize an inline version of LOAD DATA. Otherwise we'd need some form of multiple files, like https://github.com/maxbube/mydumper/commits/master (which seems to be getting activity again).

Comment by Marko Mäkelä [ 2021-05-26 ]

In MDEV-24818 we hacked the code so that a multi-statement INSERT transaction from mysqldump will be accelerated using the MDEV-515 mechanism. The data loading can be accelerated considerably further by MDEV-24621.

Comment by Marko Mäkelä [ 2023-04-11 ]

The Description feels a bit outdated.

git show mariadb-10.0.20:storage/innobase/include/univ.i|grep VERSION|head -3

#define INNODB_VERSION_MAJOR	5
#define INNODB_VERSION_MINOR	6
#define INNODB_VERSION_BUGFIX	25

MariaDB 10.0.20 claims to be based on the InnoDB from MySQL 5.6.25. That version includes Alter_inplace_info::RECREATE_TABLE, which had been added in MySQL 5.6.13 to support OPTIMIZE TABLE using the WL#6255 InnoDB online table rebuild algorithm whose original version was released as part of MySQL 5.6.8.

I think that the last missing piece to speed up data loading is MDEV-16281, to implement the multi-threaded creation of index trees.

Comment by Marko Mäkelä [ 2023-10-24 ]

Note: To benefit from MDEV-24621, a special option --no-autocommit needs to be specified to mariadb-dump until MDEV-32250 makes it the default.
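
As a rough illustration of why the option matters: with --no-autocommit, the dump wraps each table's INSERT statements in a single transaction, which is the multi-statement INSERT case that the 2021-05-26 comment says is accelerated. The sketch below shows only the shape of that part of the output. The row values are made up, and the surrounding statements that a real dump contains are omitted.

```sql
-- Shape of the per-table data section with --no-autocommit (values are illustrative).
set autocommit=0;
INSERT INTO `rt` VALUES (1,1),(2,0.5);
INSERT INTO `rt` VALUES (3,0.25),(4,0.125);
commit;
```

Without the option, each INSERT above would be committed on its own, so only the first one would be an insert into an empty table.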
