A user is seeing an 'Undo log record is too big.' error that occurs only if they update two mediumtext fields at once, and only if the existing value in each of those fields is between 3962 and 4030 characters long.
Some other interesting notes about this:
- The table definition has ROW_FORMAT=COMPACT, so it uses the Antelope format.
- The only indexes on the two mediumtext fields have prefix lengths of 255.
- The problem is repeatable on MySQL 5.5 and 5.7, and on MariaDB 10.1 and 10.2, so I don't think it has been fixed in any released version of MySQL or MariaDB.
I have attached an SQL file that can be used to reproduce this issue.
Geoff Montee (Inactive) added a comment - Some thoughts from marko:
For ROW_FORMAT=COMPACT (or ROW_FORMAT=REDUNDANT), a local prefix of 768
bytes will be stored in the clustered index leaf page record for all
columns that are longer than that.
This could be a limitation that was previously unknown to me, and it is
theoretically possible that we can fix it. I would have to carefully
analyze the test case first. I suppose that InnoDB’s practice of
dynamically deciding which columns to move off-page is related to this.
It is also possible that the problem was made worse by a BLOB bug fix that
sometimes causes non-updated columns to be added to the update vector (and
to the undo log record):
commit ce0a1e85e24e48b8171f767b44330da635a6ea0a
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Wed Apr 23 18:45:35 2014 +0530
Bug #16963396 INNODB: USE OF LARGE EXTERNALLY-STORED FIELDS MAKES CRASH
RECOVERY LOSE DATA
This fix is present only in 5.7.5 (and MariaDB 10.2.2) onwards. It is a
follow-up to an earlier fix by me, which may cause redo log overrun (making
InnoDB crash-unsafe):
commit 41bb3537ba507799ab0143acd75ccab72192931e
Author: Marko Mäkelä <marko.makela@oracle.com>
Date: Mon Aug 29 11:16:42 2011 +0300
Bug#12704861 Corruption after a crash during BLOB update
That fix was present in 5.1.60, 5.5.17, and 5.6.4. But with that earlier fix, the
undo log records should not be "padded" with non-updated columns. I am only
saying that the problem is likely worse in 10.2.
The ultimate fix would be a new undo log format that would lift any size
restrictions, as outlined in https://jira.mariadb.org/browse/MDEV-11657
Claudio Nanni added a comment - For completeness, the problem is also reproducible with a single field; the upper limit appears to be 8061 characters.
+------------------+
| LENGTH(a_str_18) |
+------------------+
|            63381 |
+------------------+
1 row in set (0.02 sec)
mysql> update test_tab set a_str_18=CONCAT(a_str_18,1);
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> UPDATE test_tab set a_str_18=LEFT(a_str_18,8000);
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update test_tab set a_str_18=CONCAT(a_str_18,1);
ERROR 1713 (HY000): Undo log record is too big.
Marko Mäkelä added a comment - I believe that the problem must be in the function dtuple_convert_big_rec() or in the related logic.
When a clustered index leaf page record is too long to fit in a page, InnoDB would start moving the longest columns to off-page storage.
The goal is to have room for at least 2 records in each leaf page. (For ROW_FORMAT=COMPRESSED, the minimum is 1 record for leaf pages and 2 for non-leaf pages.)
Some boundary condition must be off by a few bytes.
Marko Mäkelä added a comment - My guess was wrong, and this indeed is a bug with the InnoDB undo log format.
Here is how I debugged it. I took the attached test, prepended the line
--source include/have_innodb.inc
saved it into a .test file, and executed it with
./mtr --manual-gdb …
and then
break ha_innobase::update_row
run
break trx_undo_page_report_modify
c
fin
In this function, I started to single-step. The first invocation of trx_undo_page_report_modify() can ‘safely’ fail; there is a fall-back that retries with an empty undo log page.
On the second invocation, we can see the following:
(gdb) p first_free
$3 = 56
(gdb) p undo_page
$4 = (ib_page_t *) 0x7fffef1f4000 ""
(gdb) display ptr-$3-$4
After some single-stepping, I set breakpoints at every return statement in the function.
Two loops are the problem:
for (i = 0; i < upd_get_n_fields(update) - extended; i++) {
ulint pos = upd_get_nth_field(update, i)->field_no;
/* Write field number to undo log */
if (trx_undo_left(undo_page, ptr) < 5) {
return (0);
}
…
if (flen != UNIV_SQL_NULL) {
if (trx_undo_left(undo_page, ptr) < flen) {
return (0);
}
ut_memcpy(ptr, field, flen);
ptr += flen;
}
}
The first one (above) logs every updated column.
The second loop (below) logs every indexed column, including columns that were already logged above:
for (col_no = 0; col_no < dict_table_get_n_cols(table);
col_no++) {
const dict_col_t* col
= dict_table_get_nth_col(table, col_no);
if (col->ord_part) {
ulint pos;
/* Write field number to undo log */
if (trx_undo_left(undo_page, ptr) < 5 + 15) {
return (0);
}
pos = dict_index_get_nth_col_pos(index,
col_no);
ptr += mach_write_compressed(ptr, pos);
…
}
}
The space runs out during this loop, apparently at the very last user column. We abort at ptr-$3-$4=15933, right here in the second loop:
/* Write field number to undo log */
if (trx_undo_left(undo_page, ptr) < 5 + 15) {
return (0);
}
Because this second loop is writing the column number to the undo log record, the fix might be as simple as omitting those indexed columns that were logged as updated.
I will have to carefully check that such a change would be safe with the undo log parsing code.
Furthermore, in MariaDB 10.2 (and MySQL 5.7) the undo log format has changed due to indexed virtual columns (MDEV-5800).
Omitting the numbers of updated non-virtual columns must be determined to be safe also when indexed virtual columns are present (and affected by the update).
Marko Mäkelä added a comment - I pushed to bb-5.5-marko.
The merge upwards will require some effort. At least in 10.2 there are additional tests failing, because the undo log record size limitations are no longer being exceeded.
(In 10.2, also the code is slightly different due to spatial indexes and indexed virtual columns.)
Marko Mäkelä added a comment - This bug caused a permanent ‘corruption’ of data files that causes a performance regression when accessing secondary indexes:
MDEV-14799 After UPDATE of indexed columns, old values will not be purged from secondary indexes