[MDEV-20590] Introduce a file format constraint to ALTER TABLE Created: 2019-09-13 Updated: 2022-06-07 Resolved: 2020-03-30 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Storage Engine - InnoDB |
| Fix Version/s: | 10.3.23, 10.4.13, 10.5.3 |
| Type: | Task | Priority: | Major |
| Reporter: | Geoff Montee (Inactive) | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||
| Description |
|
If a table is altered using the new INSTANT algorithm, it can force the table to use a non-canonical format:
This is generally not a problem. However, there may be cases where users may want to avoid putting a table into this format. For example, users may want to avoid bugs similar to
https://mariadb.com/kb/en/library/innodb-online-ddl-operations-with-algorithminstant/#known-bugs We may want to add a way for users to avoid putting tables into the non-canonical INSTANT format, if they would like. How would it be implemented?:
|
| Comments |
| Comment by Sergei Golubchik [ 2019-09-23 ] | ||||||||||||
|
I don't understand. Why cannot ALGORITHM=INPLACE prevent instant changes? | ||||||||||||
| Comment by Marko Mäkelä [ 2019-09-23 ] | ||||||||||||
|
As noted in our documentation, the ALGORITHM clause (other than ALGORITHM=COPY) specifies the least efficient algorithm that the user is willing to accept. There currently is no mechanism for specifying the most efficient algorithm. Two cases of ALGORITHM=INSTANT affect the data file format of InnoDB: Related to this, I think that it could be useful to have something that identifies the actual format of the InnoDB data files (‘normal’, ‘instant add to the end’, ‘instant add/drop/reorder’) of tables and partitions. It is not currently being reflected anywhere. Maybe refining the ALGORITHM clause is wrong for this, and we’d want something like ALTER COMPATIBLE TABLE, or we would want to introduce a settable parameter which would cause the refusal of certain cases of ALTER TABLE…ALGORITHM=NOCOPY or ALTER TABLE…ALGORITHM=INSTANT. Suggestions for better names are welcome, but I would consider reintroducing the once-removed innodb_file_format parameter in MariaDB Server 10.3 onwards, with the following options:
| ||||||||||||
| Comment by Sergei Golubchik [ 2019-09-25 ] | ||||||||||||
|
I feel that "ALGORITHM clause specifies the least efficient algorithm that the user is willing to accept" is a rather unexpected semantics. In particular, because ALGORITHM=COPY means just that, COPY. May be we'd better change it to say that ALGORITHM=XXX means exactly that, XXX. And to have "least efficient" one can specify ALGORITHM >= INPLACE ? (or <=, whatever ordering of algorithms we can come up with). Same semantics for the LOCK clause. Alternatively, I was thinking of ALGORITHM=INPLACE,INSTANT,NOCOPY but it's more verbose and, I feel, redundantly specific, a user generally only needs "at least as efficient as XXX", not a list of specific algorithms. to summarize, if "ALGORITHM" is an ENUM, than one can use <=, if it's a SET one has to list all options. But both approaches are well known to MariaDB users, nothing new to learn here. | ||||||||||||
| Comment by Marko Mäkelä [ 2019-10-01 ] | ||||||||||||
|
serg, could you please give a concrete proposal that would be compatible with replication from older versions? | ||||||||||||
| Comment by Sergei Golubchik [ 2019-11-07 ] | ||||||||||||
|
Yes. One of
or
the second syntax is shorter to type, but it implies we'll always be able to order algorithms, even in the future. Sounds like a rather risky assumption. Both proposals extend the current syntax, so replication from older versions will work normally. | ||||||||||||
| Comment by Marko Mäkelä [ 2019-11-08 ] | ||||||||||||
|
Let us take an example: replication of the following statement to a 10.3 server:
GeoffMontee provided the following motivation in the Description:
serg, you did not cover SET alter_algorithm at all. It was introduced in 10.3 by The user could want to use the INSTANT format only on a specific set of nodes that are being replicated from or to, by setting a configuration parameter, which could be alter_algorithm or something else (such as the repurposed innodb_file_format that I suggested). Also, I have trouble understanding the following part of the proposal:
That seems to contradict with the following in
ALGORITHM=INPLACE was introduced in MySQL 5.6 and MariaDB 10.0 already. It can indeed cover 3 fundamentally different classes of operations:
For ALGORITHM=INSTANT or ALGORITHM=COPY, the proposal “use exactly algorithm XXX” is clear, and does not change anything. But, what does “use exactly algorithm NOCOPY” mean? That the operation must consume some time that is proportional to the size of the table? Should we reject the following (if the column is only being renamed)?
If yes, that would break replication from older versions. Note: I do not think that we can redefine ALGORITHM=NOCOPY to only cover those operations that do not require changes to the file format (
Also, what does “use exactly algorithm INPLACE” mean? That it should imply the FORCE clause, which was introduced in MariaDB 5.5.42, to ensure that the table will be rebuilt? But, that would again break the replication of the following from older versions:
Assuming that the type of the column does not change, the first operation is instantaneous. The second operation will cause the table to be copied. And InnoDB could refuse table-rebuilding ALGORITHM=INPLACE in some cases, most notably, if multiple FULLTEXT INDEX exist in the table. Furthermore, not all cases of table-rebuilding ALGORITHM=INPLACE allow LOCK=NONE. In summary, the proposal would seem to not only make replication (and binlog roll-forward) considerably slower (by replacing schema-only changes with table rebuild operations), but also cause it to abort because previously valid operations could be rejected. I would appreciate feedback on my proposal of introducing a configuration parameter that would restrict the storage engine’s choice of file formats. | ||||||||||||
| Comment by Sergei Golubchik [ 2019-11-08 ] | ||||||||||||
|
I did not mean metadata only changes. They're basically always done without any data changes by updating the metadata only.
Not really. It means data-changing ALTER that doesn't rebuild the primary key and is not deferred (does not use non-canonical INSTANT format). COPY means upper-layer copy, copy_data_between_tables(). INPLACE is a superset of NOCOPY, data-changing non-deferred ALTER, even if it copies the table internally (so, yes, ALGORITHM=INPLACE,NOCOPY is redundant, unfortunately). INSTANT means deferred ALTER operations. session variable alter_algorithm — I'd rather made it a SET too, not an ENUM. | ||||||||||||
| Comment by Marko Mäkelä [ 2019-11-08 ] | ||||||||||||
|
serg, what you think might be logical and what the MariaDB Server 10.3 and 10.4 GA releases implement are two different things.
Currently, the following is accepted by MariaDB Server 10.3 and 10.4:
You seem to be suggesting that such operations should be rejected. I think that we should ensure that replication from older versions to newer ones always works. If we changed the interpretation of ALGORITHM=NOCOPY in your suggested way in a later release (major release or point release), replication would be broken. The | ||||||||||||
| Comment by Marko Mäkelä [ 2019-11-12 ] | ||||||||||||
|
While I have a fairly good reason to believe that the corruption bug in MariaDB 10.3 was fixed in
| ||||||||||||
| Comment by Marko Mäkelä [ 2019-11-16 ] | ||||||||||||
|
If we make a file format parameter part of the syntax, we could both control the choice of formats and detect the current format of the table:
The alter_format=DEFAULT or FORMAT=DEFAULT would be to adjust the file format if needed, to avoid table rebuild. | ||||||||||||
| Comment by Sergei Golubchik [ 2019-11-23 ] | ||||||||||||
|
I primarily want us to have a logical, consistent, storage engine independent ALTER TABLE syntax. Adding even more InnoDB specific keywords is hardly a step in the right direction. Currently
I'd prefer them to be either strictly ordered and inclusive (every value includes all following values, COPY includes INPLACE) or exclusive (INPLACE does not include NOCOPY, NOCOPY does not include INSTANT). With clearly defined engine-independent semantics (NOCOPY is weird, INSTANT is fine). If replication is your only concern, it's easy to make the replication thread to ignore the ALGORITHM clause completely. It won't fix mysqlbinlog|mysql case, but replication will work no matter what we change here. | ||||||||||||
| Comment by Marko Mäkelä [ 2019-11-26 ] | ||||||||||||
|
serg, I feel that it would be counter-intuitive to redefine ALGORITHM=INSTANT to refer only to those operations that change the data format of the table. As far as I understand, renaming a column, changing the collation of a non-indexed column, or extending the maximum length of a VARCHAR column are examples of operations that ALGORITHM=INSTANT could allow for any storage engine. As a matter of fact, some of these operations are currently supported by multiple storage engines. For many users, the file format does not matter, but the algorithmic complexity or the expected duration of the operation does. The current definition of ALGORITHM=INSTANT should be intuitive to them. At the same time, some users might want to control the file format as well. That could be provided by introducing a storage engine specific parameter, if extending the ALTER TABLE syntax is not desirable. However, in my opinion innodb_file_per_table and innodb_encrypt_tables were mistakes, and there should be no engine-specific ‘side channels’ that modify the operation of ALTER TABLE. | ||||||||||||
| Comment by Marko Mäkelä [ 2020-01-30 ] | ||||||||||||
|
One more use case for this feature could be replication:
Especially after instant DROP COLUMN, the performance of subsequent UPDATE and INSERT can suffer compared to the canonical format, because the clustered index records would keep storing dummy garbage values for the dropped columns. So, as a maintenance task, we might want to replace instances where the ‘dirty’ instant ALTER TABLE was invoked with ‘clean’ replicas. serg, for this use case, I think that a non-replicated InnoDB parameter (such as repurposing innodb_file_format like I suggested above) would work best. Do you agree? | ||||||||||||
| Comment by Marko Mäkelä [ 2020-02-25 ] | ||||||||||||
|
I think that our decision on this must be consistent with other potential changes in this area, such as the MDEV-18845 proposal to introduce SET alter_lock, extending the | ||||||||||||
| Comment by Marko Mäkelä [ 2020-02-26 ] | ||||||||||||
|
Based on a discussion with serg, I intend to proceed with my plan to repurpose the deprecated parameter innodb_file_format. The alter_algorithm parameter could be clarified or revised as part of MDEV-18845 later. | ||||||||||||
| Comment by Marko Mäkelä [ 2020-03-13 ] | ||||||||||||
|
I am thinking to revive the deprecated innodb_file_format as an ENUM, with the following names, starting with MariaDB 10.3:
In old configuration files, users can have innodb_file_format=barracuda, which was necessary to unlock the creation of ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED tables until that option became the default in MariaDB 10.2 (and MySQL 5.7). The value innodb_file_format=antelope was accepted in the past to limit InnoDB to the MySQL 5.0.3 ROW_FORMAT=REDUNDANT and ROW_FORMAT=COMPACT, but it is unlikely that anyone would explicitly configure that. Because ENUM values cannot have multiple names, the name innodb_file_format=barracuda that was originally introduced in the Oracle-distributed InnoDB Plugin for MySQL 5.1 would be used. It is a rather accurate and canonical name of the data file format, after all. | ||||||||||||
| Comment by Marko Mäkelä [ 2020-03-16 ] | ||||||||||||
|
Unfortunately, the ENUM values add and drop would have to quoted:
Due to this, we’d better use different format names. | ||||||||||||
| Comment by Elena Stepanova [ 2020-03-16 ] | ||||||||||||
|
I hope you would consider different names even regardless the syntax error. In my opinion, it should either be a real name , something abstract (like barracuda), or something descriptive and accurate. "Add" or "drop" are not descriptive or accurate names for the meaning they are supposed to carry, they are just confusing. File format is a lot more than the way one particular variant of one particular SQL command is executed, it can't be named based on that. File format "add" – if anything, I would guess that it has something to do with how the underlying file gets written to, like always adding contents to the file, never replacing or removing, something like that; but certainly not that it's about how instant ALTER TABLE ... ADD is allowed to be. | ||||||||||||
| Comment by Marko Mäkelä [ 2020-03-16 ] | ||||||||||||
|
I am considering to use append and mapped instead of ‘add’ or ‘drop’. Here is the 10.3 version. The commit message includes some reasoning for these names. | ||||||||||||
| Comment by Marko Mäkelä [ 2020-03-18 ] | ||||||||||||
|
After some more discussions, I am suggesting the following names:
I would use _ instead of . so that quotes around the parameter values can be avoided. We might want to omit the force_…_compatible variants and always enforce the requested format in ALTER TABLE or OPTIMIZE TABLE. That could cause a negative surprise for users who might have a leftover innodb_file_format=barracuda in their configuration files and could have executed instant ADD/DROP in an earlier 10.3 or 10.4 version. However, they should have noticed the deprecation warning for that parameter. For future file format changes after MariaDB Server 10.5, we might introduce another parameter to control whether instant ADD/DROP is allowed, or we could introduce further values for innodb_file_format. In my vision, we could extend the 10.4 metadata record format with further things (persistent COUNT in MDEV-18188, or secondary index metadata for simpler IMPORT in MDEV-11658), and users could want to use that format while preventing instant ADD/DROP/reorder column. | ||||||||||||
| Comment by Marko Mäkelä [ 2020-03-18 ] | ||||||||||||
|
From the user point of view, the file format might be an uninteresting implementation detail. What the user wants is a guarantee that there will be no overhead for future DML operations. Instant ADD COLUMN or DROP COLUMN always incurs some of that. So, reviving the deprecated parameter innodb_file_format might be a bad idea, especially given the future plans to improve the .ibd file format in ways that have nothing to do with instant ALTER TABLE operations. I revise my suggestion to introduce a new parameter innodb_instant_alter_column_allowed:
The maximum and default values would be as follows:
| ||||||||||||
| Comment by Matthias Leich [ 2020-03-19 ] | ||||||||||||
|
| ||||||||||||
| Comment by Marko Mäkelä [ 2020-03-30 ] | ||||||||||||
|
I pushed this to 10.3 and merged up to 10.5. Starting with 10.3.24, we introduce the parameter with the default value add_last, and the non-default value never would introduce a constraint. Starting with MariaDB 10.4.14, 10.5.3, and later, we introduce the parameter with the default value will be add_drop_reorder.
|