[MDEV-27720] Misleading "Out of memory" message on failure to write to innodb_tmpdir Created: 2022-02-02  Updated: 2022-02-03

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5.13, 10.6.5
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Hartmut Holzgraefe Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

When innodb_tmpdir is set to a directory that the MariaDB server process cannot write to, an INPLACE ALTER TABLE with concurrent DML running on the same table in parallel will eventually fail with an "Out of memory" error message, even though the problem is actually file-system / permission related.

MariaDB [test]> alter table t1 engine=innodb, algorithm=inplace;
ERROR 1041 (HY000): Out of memory.

The DML statement in the other session on the other hand will report the problem correctly:

MariaDB [test]> insert into t1 select null, md5(rand()) from t1 limit 10000;
ERROR 1 (HY000): Can't create/write to file '/var/tmp/mysql/innodb_tmpdir/ibLWAFHd' (Errcode: 13 "Permission denied")
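For reference, a minimal reproduction sketch. The report does not include the table definition or setup, so the table schema, row counts, and the use of the Sequence engine below are assumptions, not taken from the original:

```sql
-- Assumed setup: innodb_tmpdir points at a directory mysqld cannot write to.
SET GLOBAL innodb_tmpdir = '/var/tmp/mysql/innodb_tmpdir';

-- Hypothetical table matching the statements in the report.
CREATE TABLE t1 (
  id INT AUTO_INCREMENT PRIMARY KEY,
  payload CHAR(32)
) ENGINE=InnoDB;
-- Seed rows (uses MariaDB's Sequence engine for brevity).
INSERT INTO t1 (payload) SELECT md5(rand()) FROM seq_1_to_100000;

-- Session 1:
ALTER TABLE t1 ENGINE=InnoDB, ALGORITHM=INPLACE;

-- Session 2, run repeatedly while the ALTER is in progress:
INSERT INTO t1 SELECT NULL, md5(rand()) FROM t1 LIMIT 10000;
```

As described above, the second concurrent INSERT batch is the one that triggers the failure, once the in-memory log buffer is exhausted and InnoDB tries to spill to innodb_tmpdir.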

What I also noticed: it was only the second time I ran the above INSERT in parallel to the ALTER that things failed. So INPLACE seems to first collect concurrent changes in memory, and only starts to write them to actual temporary files in innodb_tmpdir after reaching a certain threshold?

Maybe that's the reason for the "Out of memory" message: the in-memory buffer for this is full, and extending things to disk is not possible?

Anyway, seeing two completely different error messages here is not good.

The user running the concurrent INSERT may not even be aware of the ongoing ALTER and so may have no idea why there's suddenly an error message about not being able to create a file.

And the user running the ALTER may not be aware of the concurrent DML, especially when not explicitly setting the ALGORITHM (this includes running OPTIMIZE TABLE, which implicitly does a copying ALTER TABLE in the background), and so will be left rather clueless by the "Out of memory" message.



 Comments   
Comment by Hartmut Holzgraefe [ 2022-02-02 ]

Ah, the KB is not clear on this:

"The table is extended by innodb_sort_buffer_size up to the limit set by this variable. If this limit is exceeded, the online DDL operation fails and all uncommitted changes are rolled back."

The MySQL manual has this instead, which IMHO is much clearer:

"The temporary log file is extended when needed by the value of innodb_sort_buffer_size, up to the maximum specified by innodb_online_alter_log_max_size. If a temporary log file exceeds the upper size limit, the ALTER"

So temporary log file data for an ongoing INPLACE ALTER TABLE is stored in RAM in a buffer of innodb_sort_buffer_size first, and only written out to a tmp file in innodb_tmpdir when more than innodb_sort_buffer_size bytes are needed?

That would explain why I was only getting an error on my 2nd batch INSERT ...
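The variables involved in this buffering behaviour can be inspected on a running server; a quick sketch (the values returned will of course depend on the server's configuration):

```sql
-- Show the in-memory sort buffer size, the cap on the online-ALTER log,
-- and where temporary files would be spilled to.
SELECT @@innodb_sort_buffer_size,
       @@innodb_online_alter_log_max_size,
       @@innodb_tmpdir;
```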

Comment by Marko Mäkelä [ 2022-02-03 ]

First of all, the ALGORITHM=INPLACE syntax is misleading, and I think that ALGORITHM=ENGINE would have been more appropriate. It simply requests that the ALTER TABLE operation be executed natively by the storage engine. The alternative to that is ALGORITHM=COPY, which will create a new table definition using an intermediate name, then execute the equivalent of INSERT…SELECT, and finally rename the old table to an intermediate name, rename the intermediate table to the user-visible name, and drop the old table.

MDEV-13134 introduced the keywords ALGORITHM=NOCOPY and ALGORITHM=INSTANT in an attempt to clarify this a little.

The redundant specification of ENGINE=InnoDB is the old way of requesting that the table be rebuilt. The new way is the FORCE keyword. So, the requested operation is an online table rebuild.
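To illustrate, the two equivalent ways of requesting a rebuild of the reporter's table would be (a sketch, using the table name from the report):

```sql
-- Old style: a redundant ENGINE clause requests the rebuild.
ALTER TABLE t1 ENGINE=InnoDB, ALGORITHM=INPLACE;

-- New style: the FORCE keyword requests the same rebuild explicitly.
ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;
```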

Depending on the size of the table, InnoDB may create temporary files for performing a merge sort in the native ALTER TABLE. For a table rebuild, sorting on the clustered index will be skipped unless the PRIMARY KEY definition is being changed in a way that may require sorting. Sorting will be necessary for all secondary indexes, because for rebuilding the table, all indexes will be constructed from the clustered index of the old table. This is why a table rebuild will trivially fix any corruption in all secondary indexes.

While the native ALTER TABLE is performing the table scan, merge sorting and initial index creation, concurrently executing DML transactions may generate log. In a table rebuild, a single log for concurrent modifications will be used. There is a small memory buffer; once it runs out, the log will spill to a temporary file. In online secondary index creation, there will be a separate log for each ADD INDEX operation that is part of the ALTER TABLE.

Generated at Thu Feb 08 09:55:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.