[MDEV-33208] InnoDB: Trying to access update undo rec field 50 in index `PRIMARY` of table NNN but index has only 14 fields Created: 2024-01-10  Updated: 2024-01-18

Status: Needs Feedback
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.11.6
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Alex Assignee: Marko Mäkelä
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Linux firefly 4.19.0-25-amd64 #1 SMP Debian 4.19.289-2 (2023-08-08) x86_64 GNU/Linux
10.11.6-MariaDB-1:10.11.6+maria~deb10-log


Issue Links:
Relates
relates to MDEV-33275 buf_flush_LRU(): mysql_mutex_assert_o... Closed

 Description   

Hit this error:

2024-01-09 20:15:31 0 [ERROR] InnoDB: Trying to access update undo rec field 50 in index `PRIMARY` of table `meshok`.`btc_invoice` but index has only 14 fields Submit a detailed bug report to https://jira.mariadb.org/. Run also CHECK TABLE `meshok`.`btc_invoice`. n_fields = 44, i = 0
240109 20:15:31 [ERROR] mysqld got signal 11 ;
Sorry, we probably made a mistake, and this is a bug.

Your assistance in bug reporting will enable us to fix this for the next release.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.11.6-MariaDB-1:10.11.6+maria~deb10-log source revision: fecd78b83785d5ae96f2c6ff340375be803cd299
key_buffer_size=131072
read_buffer_size=131072
max_used_connections=1368
max_threads=5002
thread_count=1268
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 11017983 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x560e3e1e4ec8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f66850a6c38 thread_stack 0x49000
/usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x560e3c6bf9ae]
/usr/sbin/mariadbd(handle_fatal_signal+0x475)[0x560e3c1b9d95]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7f9a2e5d4730]
/usr/sbin/mariadbd(+0xdbe208)[0x560e3c535208]
/usr/sbin/mariadbd(+0xd84a03)[0x560e3c4fba03]
/usr/sbin/mariadbd(+0x6f5fba)[0x560e3be6cfba]
/usr/sbin/mariadbd(+0xddb083)[0x560e3c552083]
/usr/sbin/mariadbd(_ZN5tpool10task_group7executeEPNS_4taskE+0x8c)[0x560e3c65462c]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x5f)[0x560e3c65241f]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbbb2f)[0x7f9a2e4f4b2f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7f9a2e5c9fa3]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f9a2e1ed06f]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): (null)
Connection ID (thread ID): 0
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=off,cset_narrowing=off

The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains
information that should help you find out what is causing the crash.

We think the query pointer is invalid, but we will try to print it anyway.
Query:

Writing a core file...
Working directory at /db/main/tables
Resource Limits:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 1031220 1031220 processes
Max open files 32768 32768 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 1031220 1031220 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Core pattern: core

Kernel version: Linux version 4.19.0-25-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.289-2 (2023-08-08)



 Comments   
Comment by Marko Mäkelä [ 2024-01-10 ]

Can you please post a full stack trace on the crash, as instructed by https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ (which was included in the error log excerpt that you posted)?

Based on the incompletely resolved stack trace, it looks like InnoDB is crashing either while rolling back an incomplete transaction, or while purging the history of some committed transactions. This crash should be removed (it was missed in MDEV-13542), and we should find out the root cause, to understand what caused the corruption in the first place.

Does the server start up if you set innodb_force_recovery=2 or 3?

Comment by Alex [ 2024-01-13 ]

innodb_force_recovery=2 helped

Comment by Alex [ 2024-01-15 ]

Downgrade to 10.11.4 fixes the problem

Comment by Marko Mäkelä [ 2024-01-15 ]

Can you provide the requested details about the crash?

I don’t think that much changed between 10.11 and 11.0 with regard to undo logs. MDEV-19229 was already implemented in 10.11. MDEV-29986 changed the default value of innodb_undo_tablespaces in 11.0.

Comment by Alex [ 2024-01-15 ]

Too complicated for me. Will observe 10.11.4. Will it happen again or not.

Comment by Jonas Liljestrand [ 2024-01-16 ]

I'm currenty planing on upgrading to MariaDB 10.11.6 and while preparing for that I found this issue (here) and want to know more about it before going live.

Alex may I ask you for details on why you decided to downgrade to 10.11.4?

Comment by Alex [ 2024-01-16 ]

@jonlil , Well, the problem appear after I upgraded from 10.11.4 to 10.11.6. That is why I downgraded. Will observe.

Comment by Marko Mäkelä [ 2024-01-16 ]

10.11.6 fixes some corruption bugs that affect 10.11.4, such as MDEV-32530, MDEV-31767 and MDEV-30531.

This bug might share a root cause with MDEV-33189, which is equally hard to reproduce. If that is the case, the bug would have been introduced already by MDEV-26827 in 10.11.3. It could be that some change that was made after 10.11.4 made it easier to hit this bug in your workload.

Comment by Marko Mäkelä [ 2024-01-17 ]

If my hypothesis in MDEV-33189 is correct, this corruption could be prevented by setting innodb_adaptive_flushing=OFF. Can anyone try that?

Generated at Thu Feb 08 10:37:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.