  1. MariaDB Server
  2. MDEV-28312

[FATAL] InnoDB: Page old data size XXX new data size XXX, page old max ins size XXX new max ins size XXX

Details

    Description

      Hello,
      we have been running a Galera cluster of three MariaDB 10.6 nodes for some time without any issues.
      Recently, two of the three servers crashed with the same error:

      2022-04-13  6:07:32 163019 [ERROR] [FATAL] InnoDB: Page old data size 15868 new data size 12512, page old max ins size 15 new max ins size 3371
      220413  6:07:32 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.6.4-MariaDB-1:10.6.4+maria~focal-log
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=71
      max_threads=2002
      thread_count=73
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4539512 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x7fd980000c58
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7fda262b5d98 thread_stack 0x49000
      /usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x556304819f12]
      ??:0(my_print_stacktrace)[0x5563042d42b5]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fe11f5883c0]
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fe11f08e18b]
      ??:0(gsignal)[0x7fe11f06d859]
      /usr/sbin/mariadbd(+0x654687)[0x556303f74687]
      ??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x556303f74fbf]
      ??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x5563046d89ab]
      ??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x5563045f79a2]
      ??:0(void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&))[0x5563045fab61]
      ??:0(void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&))[0x55630470bcd3]
      ??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x5563045fbf78]
      ??:0(void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&))[0x5563045ff132]
      ??:0(void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&))[0x55630460024b]
      ??:0(void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&))[0x5563046f6826]
      ??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x556304660997]
      ??:0(void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag))[0x556304661eea]
      ??:0(void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag))[0x5563046643fe]
      ??:0(void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag))[0x556304674754]
      ??:0(void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag))[0x5563045c863a]
      ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x5563042e354f]
      ??:0(handler::ha_write_row(unsigned char const*))[0x556304069b0d]
      ??:0(write_record(THD*, TABLE*, st_copy_info*, select_result*))[0x5563040702c0]
      ??:0(mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool, select_result*))[0x5563040aa5c6]
      ??:0(mysql_execute_command(THD*, bool))[0x55630409a367]
      ??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x556304099db6]
      ??:0(mysql_init_multi_delete(LEX*))[0x5563040a76ea]
      ??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x5563040a8208]
      ??:0(do_command(THD*, bool))[0x5563041b7867]
      ??:0(do_handle_one_connection(CONNECT*, bool))[0x5563041b7bbd]
      ??:0(handle_one_connection)[0x55630451517d]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609)[0x7fe11f57c609]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7fe11f16a293]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7fd9800108d0): INSERT INTO `log` (`data_rozpoczecia`, `data_zakonczenia`, `data_startu`, `data_kontaktu`, `data_usuniecia`, `data_odlozenia`, `data_aktualizacji`, `czas_sesji`, `czas_trwania`, `id_ankieta`, `id_respondent`, `id_strona`, `id_log`, `id_log_org`, `invalid`, `timeout`, `odlozone`, `zakwalifikowany`, `auth_key`, `verify_key`, `smclientid`, `respondent_key`, `external_key`, `user_token`, `testowe`, `mobile`, `niedokonczone`, `ponowne`, `screenout`, `usunieto`, `kolejnosc`, `sciezka`, `miedzyczas`, `ip`, `referer`, `user_agent`, `from_widget`, `ft_integration`, `is_copied`) VALUES ('2022-04-12 18:51:05', NULL, NULL, NULL, NULL, NULL, NULL, '0', NULL, '689412', '0', '3363250', '0', '0', '0', '0', '0', '0', NULL, NULL, '', NULL, '', NULL, '0', '0', '1', '0', '0', '0', '{\"1\":\"3363250\",\"2\":\"3363368\",\"3\":\"3363380\",\"4\":\"3363388\",\"5\":\"3363408\",\"6\":\"3363418\",\"7\":\"3363426\",\"8\":\"3363428\"}', NULL, '[1649782265]', '108.174.2.215', '', 'LinkedInBot/1.0 (compatible; Mozilla/5.0; Apache-HttpClient +http://www.linkedin.com)', '0', NULL, '0')
       
      Connection ID (thread ID): 163019
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
       
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /var/lib/mysql
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units     
      Max cpu time              unlimited            unlimited            seconds   
      Max file size             unlimited            unlimited            bytes     
      Max data size             unlimited            unlimited            bytes     
      Max stack size            8388608              unlimited            bytes     
      Max core file size        0                    unlimited            bytes     
      Max resident set          unlimited            unlimited            bytes     
      Max processes             144426               144426               processes 
      Max open files            32768                32768                files     
      Max locked memory         65536                65536                bytes     
      Max address space         unlimited            unlimited            bytes     
      Max file locks            unlimited            unlimited            locks     
      Max pending signals       144426               144426               signals   
      Max msgqueue size         819200               819200               bytes     
      Max nice priority         0                    0                    
      Max realtime priority     0                    0                    
      Max realtime timeout      unlimited            unlimited            us        
      Core pattern: |/usr/share/apport/apport %p %s %c %d %P %E
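
The four numbers in the [FATAL] line at the top of the log above are easier to compare between nodes once they are extracted. The following is a minimal sketch, assuming the message keeps exactly the wording shown in this report; the function name `parse_page_size_fatal` and the dictionary keys are made up for illustration and are not part of MariaDB.

```python
import re

# Matches the fatal message exactly as it appears in the log above.
FATAL_RE = re.compile(
    r"\[FATAL\] InnoDB: Page old data size (\d+) new data size (\d+), "
    r"page old max ins size (\d+) new max ins size (\d+)"
)

def parse_page_size_fatal(line):
    """Return the four sizes and their deltas, or None if the line does not match."""
    m = FATAL_RE.search(line)
    if m is None:
        return None
    old_data, new_data, old_ins, new_ins = map(int, m.groups())
    return {
        "old_data_size": old_data,
        "new_data_size": new_data,
        "old_max_ins_size": old_ins,
        "new_max_ins_size": new_ins,
        # Hypothetical helper fields: plain differences, new minus old.
        "data_size_delta": new_data - old_data,
        "max_ins_size_delta": new_ins - old_ins,
    }

line = ("2022-04-13  6:07:32 163019 [ERROR] [FATAL] InnoDB: Page old data size 15868 "
        "new data size 12512, page old max ins size 15 new max ins size 3371")
info = parse_page_size_fatal(line)
print(info["data_size_delta"], info["max_ins_size_delta"])
```

For the line reported here, the data size drops by 3356 bytes and the maximum insert size grows by the same 3356 bytes.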
      

      We have searched for similar reports, but found neither an identified cause nor a proposed solution. The only hint we found is that the issue may be related to indexes in some way, and we think it started after a particular index was created on the log table.

      We were forced to re-sync two of the cluster nodes from scratch. One of them then started to hang with the following errors:

      2022-04-13  6:42:17 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2022-04-13  6:42:17 0 [Note] InnoDB: Number of pools: 1
      2022-04-13  6:42:17 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2022-04-13  6:42:17 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
      2022-04-13  6:42:18 0 [Note] InnoDB: Using Linux native AIO
      2022-04-13  6:42:18 0 [Note] InnoDB: Initializing buffer pool, total size = 27917287424, chunk size = 134217728
      2022-04-13  6:42:18 0 [Note] InnoDB: Completed initialization of buffer pool
      2022-04-13  6:42:18 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=6295863748620,6295863748620
      2022-04-13  6:42:19 0 [Note] InnoDB: 2 transaction(s) which must be rolled back or cleaned up in total 2 row operations to undo
      2022-04-13  6:42:19 0 [Note] InnoDB: Trx id counter is 37394291227
      2022-04-13  6:42:19 0 [Note] InnoDB: Starting final batch to recover 25860 pages from redo log.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]
      2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
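
The recovery log above repeats the same corruption message nine times, always for the same page. To check whether a longer error log points at one page or at several, the messages can be grouped by page id. The following is a minimal sketch, assuming the message format is exactly as shown in this report; `summarize_corrupt_pages` is a hypothetical helper name, and the two sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the recovery message as it appears in the log above.
CORRUPT_RE = re.compile(
    r"\[ERROR\] InnoDB: Not applying (?P<record>\w+) due to corruption on "
    r"\[page id: space=(?P<space>\d+), page number=(?P<page>\d+)\]"
)

def summarize_corrupt_pages(lines):
    """Count corruption messages per (redo record type, space id, page number)."""
    counts = Counter()
    for line in lines:
        m = CORRUPT_RE.search(line)
        if m:
            counts[(m["record"], int(m["space"]), int(m["page"]))] += 1
    return counts

sample = [
    "2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]",
    "2022-04-13  6:42:20 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.",
    "2022-04-13  6:42:20 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=3076, page number=2608001]",
]

summary = summarize_corrupt_pages(sample)
for (record, space, page), n in summary.items():
    print(f"{record} space={space} page={page}: {n} message(s)")
```

To run it against a real file, pass an open file object instead of `sample`; the path of the error log depends on the server configuration.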
      

      We think that the error quoted in the title is the root cause and that it is probably corrupting some of the data.

      We'll appreciate any help!
      Regards

            People

              jplindst Jan Lindström (Inactive)
              r.kubik Robert Kubik