MariaDB Server: MDEV-16291

Allow ALGORITHM=NOCOPY for most ALTER TABLE in InnoDB

Details

    Description

      MDEV-11424 introduces support for ALGORITHM=INSTANT for failure-free ALTER TABLE operations that do not involve adding or dropping indexes.

      Operations that may fail will require validation by reading all data in the table, and thus they cannot be instantaneous. While we cannot support ALGORITHM=INSTANT, we can support ALGORITHM=NOCOPY, that is, perform the validation and use the MDEV-11424 mechanism for representing the operation, without modifying any data.

      Examples of such operations include:

      • Adding or removing the UNSIGNED attribute of an integer column
      • Adding the NOT NULL attribute
      • Shortening CHAR or VARCHAR (to produce errors or warnings for data truncation)
      • Changing the character encoding of a CHAR or VARCHAR

      Examples of changes that will require the table to be rebuilt (that will not be supported with ALGORITHM=NOCOPY) include:

      • Adding, dropping or changing PRIMARY KEY columns
      • Changing the ROW_FORMAT or ENGINE
      • Altering a table that is in ROW_FORMAT=COMPRESSED

      Metadata format changes

      • This work will extend the 'default row' record introduced in MDEV-11369 further, to store 'conversion recipes' that make it possible to process records that correspond to an earlier definition of the table. To indicate the presence of the recipes in the 'default row', a new bit in info_bits will be set, and an additional BLOB column is appended to the record for storing the recipes.

      TO BE DEFINED: Exactly what metadata needs to be stored, and how? At least the following:

      • Character set encoding, so that data can be converted to the newest encoding
      • Whether integer columns are signed or unsigned
      • The original fixed length of string columns
      • Do we want to support conversions between string and integer columns? ENUM? BIT? SET?
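A minimal sketch of what one such recipe entry might look like, covering the items listed above. All names and the byte layout are hypothetical, not MariaDB code; the point is only that the per-column metadata is small and can be serialized into the BLOB appended to the 'default row' record:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-column conversion recipe entry.
struct col_recipe_t {
  uint16_t charset_id;   // source character set of a CHAR/VARCHAR column
  bool     is_unsigned;  // signedness of an integer column
  uint16_t fixed_len;    // original fixed length of a string column
};

// Serialize recipe entries into a byte buffer, as they might be stored
// in the BLOB column appended to the 'default row' record.
std::vector<uint8_t> serialize(const std::vector<col_recipe_t>& recipes) {
  std::vector<uint8_t> blob;
  for (const col_recipe_t& r : recipes) {
    blob.push_back(uint8_t(r.charset_id >> 8));   // big-endian charset id
    blob.push_back(uint8_t(r.charset_id));
    blob.push_back(r.is_unsigned ? 1 : 0);
    blob.push_back(uint8_t(r.fixed_len >> 8));    // big-endian length
    blob.push_back(uint8_t(r.fixed_len));
  }
  return blob;
}
```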

      Data format changes

      Since this work is a refinement of MDEV-11424, any INSERT or UPDATE after an ALTER TABLE operation that changes the data format would cause the clustered index leaf page to be rewritten in ROW_FORMAT=REDUNDANT.

      In this work, we must store a 'conversion recipe number' in the record or page header.

      • We can repurpose the 64-bit page header field PAGE_MAX_TRX_ID. MDEV-6076 repurposed it for PAGE_ROOT_AUTO_INC in the root page. If the table consists of a single root page or it later shrinks to a single page, we can simply convert the table to non-instant format. Having a per-page format would require that all records of the page be converted to the newest format on any write.
      • In the ROW_FORMAT=REDUNDANT record header, there are 3 spare info_bits that we could use to support a small maximum number of conversion recipes.
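To illustrate the second option: assuming (hypothetically) that the 3 spare info_bits occupy the low bits of the info byte, a recipe number of up to 7 could be stored and read back with simple masking. The mask position is illustrative; the real REDUNDANT record header layout differs:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: assume the 3 spare info_bits are the 3 low bits
// of the info byte, giving recipe numbers 0..7 (0 = current format).
constexpr uint8_t RECIPE_MASK = 0x07;

uint8_t set_recipe_no(uint8_t info_bits, uint8_t recipe_no) {
  assert(recipe_no <= RECIPE_MASK);
  // Clear the spare bits, then store the recipe number.
  return uint8_t((info_bits & ~RECIPE_MASK) | recipe_no);
}

uint8_t get_recipe_no(uint8_t info_bits) {
  return info_bits & RECIPE_MASK;
}
```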

      We may choose to limit the number of conversion recipes that can be supported. If an ALTER TABLE operation would require a conversion recipe, but there is no space to store one, the operation would be refused with ALGORITHM=NOCOPY. After any ALTER TABLE that rebuilds the table, the maximum number of conversion recipes will be available again.

      How it works

      The 'default row' record needs to store the conversion recipes that allow each clustered index leaf page record format version to be converted to the latest table definition. These recipes will be cached in dict_table_t. The 'default row' record will only be read when the table definition is first loaded to the data dictionary cache.

      Adding or dropping indexes, or renaming columns or indexes, would not touch the 'default row' record.

      If we store recipes for converting from version 1 to latest, 2 to latest, and so on, then each instant ALTER (MDEV-11424, MDEV-11369, MDEV-15562) may have to rewrite the old recipes, so that they will work from 1 to latest+1, 2 to latest+1 and so on, and finally add a recipe from latest to latest+1. Because we will store the recipes in a BLOB, and because BLOBs are copy-on-write in InnoDB, all conversion recipe pages will have to be rewritten in any case.
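The rewriting of the recipe list on each instant ALTER can be modeled with a toy example, in which a 'row' is just an int and each recipe is a function converting a row from version i to the latest version (everything here is illustrative, not InnoDB code):

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy model: a "row" is an int; recipes[i-1] converts a row written
// under dictionary version i to the latest version.
using recipe = std::function<int(int)>;

// On an instant ALTER that introduces a new latest version, compose
// every stored recipe with the new conversion step, then append a
// recipe for the previous latest version (the new step itself).
void add_version(std::vector<recipe>& recipes, recipe step) {
  for (recipe& r : recipes) {
    recipe old = r;
    r = [old, step](int row) { return step(old(row)); };
  }
  recipes.push_back(step);
}
```

This mirrors the observation in the text: every existing recipe has to be rewritten on each instant ALTER, which is acceptable because the recipe BLOB is copy-on-write anyway.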

      Read and write access to the clustered index would be affected as follows:
      Writes would always use the latest dictionary version. If we use a per-page identifier (instead of a per-record one) and the page being written to contains other old-format records, we could convert all of them while we are modifying the page.

      Reads would have to be prepared to convert records from old format to the latest one.

      Possible future work: Cleaning up conversion recipes

      There could be a separate online operation for clean-up, such as OPTIMIZE INDEX PRIMARY ON t1, which would also defragment the index. This would be useful if writes to the table are rare, because reads would not write back any converted records.

      The clean-up operation could also remove old version history from the 'default row' record. After MDEV-16264, we could also have a background cleanup operation that would (with some granularity) remember the last processed PRIMARY KEY values, by storing them in the (otherwise ignored) keys of the 'default row' records.

      Activity

            There are a few reasons why I would require a full table rebuild for ROW_FORMAT changes:

            • The attribute is supposed to apply to all indexes of the table; if we allowed NOCOPY operation, we would only change the format of subsequently modified clustered index leaf pages.
            • The maximum record length depends on the ROW_FORMAT, and so does the maximum column prefix length in secondary indexes.
            • Undo logging of column prefix indexes depends on the ROW_FORMAT.

            Technically, we could allow each clustered index leaf page to be in a different ROW_FORMAT, but it would be tricky to implement and test the logic for the column prefix indexes, and it could also be tricky to document the limitations when changing from ROW_FORMAT=DYNAMIC (the default since MariaDB Server 10.2) to a more restrictive format.

            marko Marko Mäkelä added a comment

            Similar to ROW_FORMAT, changing KEY_BLOCK_SIZE (which only applies to ROW_FORMAT=COMPRESSED, which does not support any form of instant ALTER TABLE) or encryption parameters will require that the table be rebuilt.

            marko Marko Mäkelä added a comment

            Currently, column type conversions are implemented in Copy_field::do_copy(), which is called by copy_data_between_tables(). This makes use of a function pointer, pointing to a conversion function, such as Field_long::store() or do_copy_not_null(). These conversion functions require that the data be available in Field::ptr.

            InnoDB stores data in a different format internally. Integers are stored in big-endian format, and the sign bit is inverted, so that data can be compared with memcmp(). In order to use Copy_field, the ALGORITHM=INPLACE code in InnoDB would have to convert both the source data and the copied data. It seems that we would have to refactor Copy_field and Field::get_copy_func() so that the copied data would be in the storage engine format.
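The integer encoding described above can be sketched as follows: big-endian byte order with the sign bit inverted, so that memcmp() on the stored bytes agrees with numeric comparison. The helper name is made up; only the encoding itself is taken from the text:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Store a signed 32-bit integer the way InnoDB keeps it internally:
// big-endian, with the sign bit flipped, so that memcmp() on the
// stored bytes yields the numeric ordering.
void store_int32(uint8_t* dst, int32_t value) {
  uint32_t u = uint32_t(value) ^ 0x80000000U;  // invert the sign bit
  dst[0] = uint8_t(u >> 24);
  dst[1] = uint8_t(u >> 16);
  dst[2] = uint8_t(u >> 8);
  dst[3] = uint8_t(u);
}
```

Because the SQL layer's Copy_field works on the little-endian, sign-bit-intact Field::ptr representation, any conversion done through it would need a transformation to and from this format on both ends.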

            marko Marko Mäkelä added a comment

            MDEV-17520 would be revived as part of this task.

            For dictionary format versioning, we could use the hidden system fields DB_TRX_ID and DB_ROLL_PTR as follows.

            If DB_TRX_ID=0 (a possibility that was introduced in MDEV-12288), the most significant bit in DB_ROLL_PTR must be set. Up until now, all the 55 least significant bits were 0. This task will repurpose the least significant bits to be the dictionary format version.

            Whenever DB_TRX_ID is being reset to 0, we would also update the dictionary format version in DB_ROLL_PTR.
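A sketch of the proposed DB_ROLL_PTR encoding, assuming a 56-bit (7-byte) value whose most significant bit is the flag that must be set when DB_TRX_ID=0, with the 55 low bits repurposed as the dictionary format version (the constant names are illustrative):

```cpp
#include <cassert>
#include <cstdint>

// DB_ROLL_PTR is 7 bytes = 56 bits. When DB_TRX_ID=0, bit 55 is set
// and, until this task, the 55 least significant bits were all 0.
constexpr uint64_t ROLL_PTR_FLAG   = 1ULL << 55;
constexpr uint64_t VERSION_MASK    = ROLL_PTR_FLAG - 1;  // low 55 bits

// Build a DB_ROLL_PTR carrying a dictionary format version.
uint64_t make_versioned_roll_ptr(uint64_t version) {
  assert(version <= VERSION_MASK);
  return ROLL_PTR_FLAG | version;
}

// Extract the dictionary format version from such a DB_ROLL_PTR.
uint64_t roll_ptr_version(uint64_t roll_ptr) {
  return roll_ptr & VERSION_MASK;
}
```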

            This work would extend the metadata BLOB, which was introduced in MDEV-15562, to store conversion recipes for converting clustered index leaf page records from older dictionary format versions to the latest one.

            We could extend the metadata BLOB further, to remember the DB_TRX_ID of each instant ALTER TABLE operation that introduced a new dictionary format version. This would allow us to look up the corresponding clustered index definition, or the recipe to convert the record to the latest format. This is needed in a rare case where some purgeable history existed for the table while instant ALTER TABLE was executed. We might skip this, and do a full purge before continuing with the ALTER TABLE (so that all clustered index records will carry DB_TRX_ID=0).

            marko Marko Mäkelä added a comment

            People

              marko Marko Mäkelä
              Votes: 6
              Watchers: 6