A user is seeing an 'Undo log record is too big.' error that occurs only if they update two mediumtext fields at once, and only if the existing value in each of those fields is between 3962 and 4030 characters long.
Some other interesting notes about this:
- The table definition has ROW_FORMAT=COMPACT, so it uses the Antelope format.
- The only indexes on the two mediumtext fields have prefix lengths of 255.
- The problem is repeatable on MySQL 5.5 and 5.7, and on MariaDB 10.1 and 10.2, so I don't think it has been fixed in any released version of MySQL or MariaDB.
I have attached an SQL file that can be used to reproduce this issue.
Geoff Montee (Inactive) added a comment - Some thoughts from marko:
For ROW_FORMAT=COMPACT (or ROW_FORMAT=REDUNDANT), a local prefix of 768
bytes will be stored in the clustered index leaf page record for all
columns that are longer than that.
This could be a limitation that was previously unknown to me, and it is
theoretically possible that we can fix it. I would have to carefully
analyze the test case first. I suppose that InnoDB’s practice of
dynamically deciding which columns to move off-page is related to this.
It is also possible that the problem was made worse by a BLOB bug fix that
sometimes causes non-updated columns to be added to the update vector (and
to the undo log record):
commit ce0a1e85e24e48b8171f767b44330da635a6ea0a
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Wed Apr 23 18:45:35 2014 +0530
Bug #16963396 INNODB: USE OF LARGE EXTERNALLY-STORED FIELDS MAKES CRASH
RECOVERY LOSE DATA
This fix is present only in 5.7.5 (and MariaDB 10.2.2) onwards. It is a
follow-up to an earlier fix by me, which may cause redo log overrun (making
InnoDB crash-unsafe):
commit 41bb3537ba507799ab0143acd75ccab72192931e
Author: Marko Mäkelä <marko.makela@oracle.com>
Date: Mon Aug 29 11:16:42 2011 +0300
Bug#12704861 Corruption after a crash during BLOB update
That fix was present in 5.1.60, 5.5.17, and 5.6.4. But with that earlier fix, the
undo log records should not be "padded" with non-updated columns. I am only
saying that the problem is likely worse in 10.2.
The ultimate fix would be a new undo log format that would lift any size
restrictions, as outlined in https://jira.mariadb.org/browse/MDEV-11657
Claudio Nanni added a comment - For completeness, the problem is also reproducible with a single field; the upper limit appears to be 8061 characters.
+------------------+
| LENGTH(a_str_18) |
+------------------+
|            63381 |
+------------------+
1 row in set (0.02 sec)
mysql> update test_tab set a_str_18=CONCAT(a_str_18,1);
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> UPDATE test_tab set a_str_18=LEFT(a_str_18,8000);
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update test_tab set a_str_18=CONCAT(a_str_18,1);
ERROR 1713 (HY000): Undo log record is too big.
Marko Mäkelä added a comment - I believe that the problem must be in the function dtuple_convert_big_rec() or in the related logic.
When a clustered index leaf page record is too long to fit in a page, InnoDB would start moving the longest columns to off-page storage.
The goal is to have room for at least 2 records in each leaf page. (For ROW_FORMAT=COMPRESSED, the minimum is 1 record for leaf pages and 2 for non-leaf pages.)
Some boundary condition must be off by a few bytes.
Marko Mäkelä added a comment - My guess was wrong, and this indeed is a bug with the InnoDB undo log format.
Here is how I debugged it. I took the attached test, prepended the line
--source include/have_innodb.inc
saved it into a .test file, and executed it with
./mtr --manual-gdb …
and then
break ha_innobase::update_row
run
break trx_undo_page_report_modify
c
fin
In this function, I started to single-step. The first invocation of trx_undo_page_report_modify() can ‘safely’ fail; there is a fall-back that retries with an empty undo log page.
On the second invocation, we can see the following:
(gdb) p first_free
$3 = 56
(gdb) p undo_page
$4 = (ib_page_t *) 0x7fffef1f4000 ""
(gdb) display ptr-$3-$4
After some single-stepping, I set breakpoints at every return statement in the function.
Two loops are the problem:
for (i = 0; i < upd_get_n_fields(update) - extended; i++) {
ulint pos = upd_get_nth_field(update, i)->field_no;
/* Write field number to undo log */
if (trx_undo_left(undo_page, ptr) < 5) {
return (0);
}
…
if (flen != UNIV_SQL_NULL) {
if (trx_undo_left(undo_page, ptr) < flen) {
return (0);
}
ut_memcpy(ptr, field, flen);
ptr += flen;
}
}
The first one (above) logs every updated column.
The second loop (below) logs every indexed column, including columns that were already logged above:
for (col_no = 0; col_no < dict_table_get_n_cols(table);
col_no++) {
const dict_col_t* col
= dict_table_get_nth_col(table, col_no);
if (col->ord_part) {
ulint pos;
/* Write field number to undo log */
if (trx_undo_left(undo_page, ptr) < 5 + 15) {
return (0);
}
pos = dict_index_get_nth_col_pos(index,
col_no);
ptr += mach_write_compressed(ptr, pos);
…
}
}
The space runs out during this loop, apparently at the very last user column. We abort at ptr-$3-$4=15933, right here in the second loop:
/* Write field number to undo log */
if (trx_undo_left(undo_page, ptr) < 5 + 15) {
return (0);
}
Because this second loop is writing the column number to the undo log record, the fix might be as simple as omitting those indexed columns that were logged as updated.
I will have to carefully check that such a change would be safe with the undo log parsing code.
Furthermore, in MariaDB 10.2 (and MySQL 5.7) the undo log format has changed due to indexed virtual columns (MDEV-5800).
Omitting the numbers of updated non-virtual columns must be determined to be safe also when indexed virtual columns are present (and affected by the update).
Marko Mäkelä added a comment - I pushed to bb-5.5-marko.
The merge upwards will require some effort. At least in 10.2 there are additional tests failing, because the undo log record size limitations are no longer being exceeded.
(In 10.2, also the code is slightly different due to spatial indexes and indexed virtual columns.)
Marko Mäkelä added a comment - This bug caused a permanent ‘corruption’ of data files that causes a performance regression when accessing secondary indexes:
MDEV-14799 After UPDATE of indexed columns, old values will not be purged from secondary indexes