Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12630

InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0 or signal 8, encryption.innodb_encryption-page-compression failed in buildbot

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/7915/steps/test_5/logs/stdio

      encryption.innodb_encryption-page-compression 'xtradb' w3 [ fail ]
              Test ended at 2017-04-28 23:18:55
       
      CURRENT_TEST: encryption.innodb_encryption-page-compression
      Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-ca=/mnt/buildbot/build/mariadb-10.1.23/mysql-test/std_data/cacert.pem'
      Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-cert=/mnt/buildbot/build/mariadb-10.1.23/mysql-test/std_data/client-cert.pem'
      Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-key=/mnt/buildbot/build/mariadb-10.1.23/mysql-test/std_data/client-key.pem'
      Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown option '--loose-skip-ssl'
      innodb_open_files should not be greater than the open_files_limit.
      2017-04-28 23:18:54 b05ffb40  InnoDB: Assertion failure in thread 2959080256 in file buf0flu.cc line 2881
      InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      mysqltest got signal 6
      read_command_buf (0x91ff580):  while; end
      conn->name (0x8f8f8f8f): Can't read from address 0x8f8f8f8f
      conn->cur_query (0x8f8f8f8f): Can't read from address 0x8f8f8f8f
      Attempting backtrace...
      stack_bottom = 0x0 thread_stack 0x48400
      /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded(my_print_stacktrace+0x32)[0x81f2c05]
      /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded[0x81d5e83]
      /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded[0x81d5ec1]
      [0xb76e7400]
      [0xb76e7424]
      /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xb71c31ef]
      /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xb71c6835]
      mysys/stacktrace.c:267(my_print_stacktrace)[0x8a646bc]
      /lib/i386-linux-gnu/libpthread.so.0(+0x6d4c)[0xb76cad4c]
      /lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0xb727face]
      Writing a core file...
      

      Attachments

        Issue Links

          Activity

            I ran across this once when running 10.1 tests locally, in another encryption test:

            encryption.create_or_replace 'cbc,innodb_plugin' w3 [ fail ]
                    Test ended at 2017-08-23 10:19:31
             
            CURRENT_TEST: encryption.create_or_replace
            mysqltest: At line 101: Cannot reap on a connection without pending send
            …
            2017-08-23 10:18:39 140209668322880 [Note] /mariadb/10.1/build/sql/mysqld: ready for connections.
            Version: '10.1.27-MariaDB-debug'  socket: '/mariadb/10.1/build/mysql-test/var/tmp/3/mysqld.1.sock'  port: 16140  Source distribution
            2017-08-23 10:18:39 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 1.
            2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 4.
            2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #2 thread id 140209240880896 total threads 4.
            2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #3 thread id 140209232488192 total threads 4.
            2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #4 thread id 140209224095488 total threads 4.
            170823 10:19:31 [ERROR] mysqld got signal 8 ;
            

            This is clearly a division by zero in the debug output in fil_crypt_realloc_iops().

            		DBUG_PRINT("ib_crypt",
            			("thr_no: %u only waited %lu%% skip re-estimate.",
            			state->thread_no,
            			(100 * state->cnt_waited) / state->batch));
            

            I believe that thanks to MDEV-11713 in MariaDB 10.2.4 this will not trigger, because DBUG_PRINT would not evaluate the expression unless the label is active.

            $1 = {thread_no = 0, first = false, space = 0x7f84cc0ae898, offset = 7, 
              batch = 0, min_key_version_found = 0, end_lsn = 9366944, 
              estimated_max_iops = 1, allocated_iops = 1, cnt_waited = 0, 
              sum_waited_us = 0, crypt_stat = {pages_read_from_cache = 0, 
                pages_read_from_disk = 0, pages_modified = 0, pages_flushed = 0, 
                estimated_iops = 1}, scrub_data = {space = 166, scrubbing = false, 
                compressed = false, current_table = 0x0, current_index = 0x0, 
                savepoint = 0, scrub_stat = {page_reorganizations = 0, page_splits = 0, 
                  page_split_failures_underflow = 0, 
                  page_split_failures_out_of_filespace = 0, 
                  page_split_failures_missing_index = 0, 
                  page_split_failures_unknown = 0}}}
            (gdb) p srv_alloc_time
            $2 = 3
            

            Based on the above information, it seems to me that either space->batch was never assigned after it was zero-initialized by rotate_thread_t::rotate_thread_t(uint), or space->batch was assigned to 0 in fil_crypt_find_page_to_rotate():

            	bool found = crypt_data->rotate_state.max_offset >=
            		crypt_data->rotate_state.next_offset;
             
            	if (found) {
            		state->offset = crypt_data->rotate_state.next_offset;
            		ulint remaining = crypt_data->rotate_state.max_offset -
            			crypt_data->rotate_state.next_offset;
             
            		if (batch <= remaining) {
            			state->batch = batch;
            		} else {
            			state->batch = remaining;
            		}
            	}
            

            If remaining==0 is possible here, then state->batch would be assigned to 0 as well.

            Even though the failure is in debug output only, I think that the logic should be reviewed carefully. Is there anywhere else an assumption that the rotate_thread_t::batch is always growing and never resetting to 0?

            marko Marko Mäkelä added a comment - I ran across this once when running 10.1 tests locally, in another encryption test: encryption.create_or_replace 'cbc,innodb_plugin' w3 [ fail ] Test ended at 2017-08-23 10:19:31   CURRENT_TEST: encryption.create_or_replace mysqltest: At line 101: Cannot reap on a connection without pending send … 2017-08-23 10:18:39 140209668322880 [Note] /mariadb/10.1/build/sql/mysqld: ready for connections. Version: '10.1.27-MariaDB-debug' socket: '/mariadb/10.1/build/mysql-test/var/tmp/3/mysqld.1.sock' port: 16140 Source distribution 2017-08-23 10:18:39 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 1. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 4. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #2 thread id 140209240880896 total threads 4. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #3 thread id 140209232488192 total threads 4. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #4 thread id 140209224095488 total threads 4. 170823 10:19:31 [ERROR] mysqld got signal 8 ; This is clearly a division by zero in the debug output in fil_crypt_realloc_iops(). DBUG_PRINT("ib_crypt", ("thr_no: %u only waited %lu%% skip re-estimate.", state->thread_no, (100 * state->cnt_waited) / state->batch)); I believe that thanks to MDEV-11713 in MariaDB 10.2.4 this will not trigger, because DBUG_PRINT would not evaluate the expression unless the label is active. $1 = {thread_no = 0, first = false, space = 0x7f84cc0ae898, offset = 7, batch = 0, min_key_version_found = 0, end_lsn = 9366944, estimated_max_iops = 1, allocated_iops = 1, cnt_waited = 0, sum_waited_us = 0, crypt_stat = {pages_read_from_cache = 0, pages_read_from_disk = 0, pages_modified = 0, pages_flushed = 0, estimated_iops = 1}, scrub_data = {space = 166, scrubbing = false, compressed = false, current_table = 0x0, current_index = 0x0, savepoint = 0, scrub_stat = {page_reorganizations = 0, page_splits = 0, page_split_failures_underflow = 0, page_split_failures_out_of_filespace = 0, page_split_failures_missing_index = 0, page_split_failures_unknown = 0}}} (gdb) p srv_alloc_time $2 = 3 Based on the above information, it seems to me that either space->batch was never assigned after it was zero-initialized by rotate_thread_t::rotate_thread_t(uint), or space->batch was assigned to 0 in fil_crypt_find_page_to_rotate(): bool found = crypt_data->rotate_state.max_offset >= crypt_data->rotate_state.next_offset;   if (found) { state->offset = crypt_data->rotate_state.next_offset; ulint remaining = crypt_data->rotate_state.max_offset - crypt_data->rotate_state.next_offset;   if (batch <= remaining) { state->batch = batch; } else { state->batch = remaining; } } If remaining==0 is possible here, then state->batch would be assigned to 0 as well. Even though the failure is in debug output only, I think that the logic should be reviewed carefully. Is there anywhere else an assumption that the rotate_thread_t::batch is always growing and never resetting to 0?

            MariaDB Server 10.1 reached its end of life on October 17, 2020, and the last release (mariadb-10.1.48) was a couple of weeks after that.

            In the cross-reference, I did not find any failures exactly like this, even for 10.1. The last test failure on 2020-08-31 was something harmless:

            10.1 f69cc26757733724254ee37aec5a092f

            encryption.innodb_encryption-page-compression 'innodb_plugin' w2 [ fail ]  Found warnings/errors in server log file!
                    Test ended at 2020-08-31 12:01:48
            line
            2020-08-31 12:01:41 2734676800 [ERROR] InnoDB: Trying to access page [space=6: page=12] but the tablespace does not exist or is just being dropped.
            ^ Found warnings in /mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/mysqld.1.err
            ok
             
             - saving '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/encryption.innodb_encryption-page-compression-innodb_plugin/' to '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/log/encryption.innodb_encryption-page-compression-innodb_plugin/'
             
            Retrying test encryption.innodb_encryption-page-compression, attempt(2/3)...
            

            Like I wrote some years ago, the division by zero was avoided in 10.2.4 by MDEV-11713.

            marko Marko Mäkelä added a comment - MariaDB Server 10.1 reached its end of life on October 17, 2020, and the last release ( mariadb-10.1.48 ) was a couple of weeks after that. In the cross-reference, I did not find any failures exactly like this, even for 10.1. The last test failure on 2020-08-31 was something harmless: 10.1 f69cc26757733724254ee37aec5a092f encryption.innodb_encryption-page-compression 'innodb_plugin' w2 [ fail ] Found warnings/errors in server log file! Test ended at 2020-08-31 12:01:48 line 2020-08-31 12:01:41 2734676800 [ERROR] InnoDB: Trying to access page [space=6: page=12] but the tablespace does not exist or is just being dropped. ^ Found warnings in /mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/mysqld.1.err ok   - saving '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/encryption.innodb_encryption-page-compression-innodb_plugin/' to '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/log/encryption.innodb_encryption-page-compression-innodb_plugin/'   Retrying test encryption.innodb_encryption-page-compression, attempt(2/3)... Like I wrote some years ago, the division by zero was avoided in 10.2.4 by MDEV-11713 .
            james.wang james wang added a comment -

            Marko, my version 10.2.27 and had the following error on Jan 3rd 2022, please shed some light, thanks:

            2022-01-03 3:57:31 140478454376192 [Note] InnoDB: Starting shutdown...
            2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Dumping buffer pool(s) to /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool
            2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Buffer pool(s) dump completed at 220103 3:57:31
            2022-01-03 03:57:38 0x7fa063d36700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/storage/innobase/buf/buf0flu.cc line 3415
            InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0
            InnoDB: We intentionally generate a memory trap.
            InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
            InnoDB: If you get repeated assertion failures or crashes, even
            InnoDB: immediately after the mysqld startup, there may be
            InnoDB: corruption in the InnoDB tablespace. Please refer to
            InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
            InnoDB: about forcing recovery.
            220103 3:57:38 [ERROR] mysqld got signal 6 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.2.27-MariaDB-log
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=1
            max_threads=3102
            thread_count=0
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3770738 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x49000
            /glide/mariadb/10.2.27snc2/bin/mysqld(my_print_stacktrace+0x2e)[0x5597774892ee]
            /glide/mariadb/10.2.27snc2/bin/mysqld(handle_fatal_signal+0x30d)[0x559776ecac7d]
            /lib64/libpthread.so.0(+0xf630)[0x7fc3fe694630]
            /lib64/libc.so.6(gsignal+0x37)[0x7fc3fd49e387]
            :0(__GI_raise)[0x7fc3fd49fa78]
            /glide/mariadb/10.2.27snc2/bin/mysqld(+0x48823d)[0x559776c7423d]
            ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x559777229ad4]
            /lib64/libpthread.so.0(+0x7ea5)[0x7fc3fe68cea5]
            /lib64/libc.so.6(clone+0x6d)[0x7fc3fd5669fd]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            Writing a core file...
            Working directory at /glide/mysqld/ntnewsub2_3402_ultra/data
            Resource Limits:
            Limit Soft Limit Hard Limit Units
            Max cpu time unlimited unlimited seconds
            Max file size unlimited unlimited bytes
            Max data size unlimited unlimited bytes
            Max stack size 8388608 unlimited bytes
            Max core file size 0 unlimited bytes
            Max resident set unlimited unlimited bytes
            Max processes 2314907 2314907 processes
            Max open files 32768 32768 files
            Max locked memory 65536 65536 bytes
            Max address space unlimited unlimited bytes
            Max file locks unlimited unlimited locks
            Max pending signals 2314907 2314907 signals
            Max msgqueue size 819200 819200 bytes
            Max nice priority 0 0
            Max realtime priority 0 0
            Max realtime timeout unlimited unlimited us
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Uses event mutexes
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Compressed tables use zlib 1.2.11
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using Linux native AIO
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Number of pools: 1
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using SSE2 crc32 instructions
            2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Initializing buffer pool, total size = 132G, instances = 12, chunk size = 128M
            2022-01-03 3:57:51 140140505790272 [Note] InnoDB: Completed initialization of buffer pool
            2022-01-03 3:57:52 139987587221248 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
            2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Highest supported file format is Barracuda.
            2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Starting crash recovery from checkpoint LSN=244424587382098
            2022-01-03 3:57:54 140140505790272 [Note] InnoDB: Starting final batch to recover 13266 pages from redo log.
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Last binlog file '/glide/mysqld/ntnewsub2_3402_ultra/log/binlog/mysql-binlog.000009', position 5482
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 128 out of 128 rollback segments are active.
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating shared tablespace for temporary tables
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Waiting for purge to start
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 5.7.27 started; log sequence number 244424602811341
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #1 encryption thread id 139991063611136 total threads 4.
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #2 encryption thread id 139991055218432 total threads 4.
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #3 encryption thread id 139991046825728 total threads 4.
            2022-01-03 3:57:57 139991072003840 [Note] InnoDB: Loading buffer pool(s) from /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool
            2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #4 encryption thread id 139990700914432 total threads 4.
            2022-01-03 3:57:57 140140505790272 [Note] Plugin 'FEEDBACK' is disabled.
            2022-01-03 3:57:57 140140505790272 [Note] Using encryption key id 1 for temporary files
            2022-01-03 3:57:57 140140505790272 [Note] Server socket created on IP: '0.0.0.0'.
            2022-01-03 3:57:57 140140505790272 [Note] Reading of all Master_info entries succeeded
            2022-01-03 3:57:57 140140505790272 [Note] Added new Master_info '' to hash table
            2022-01-03 3:57:57 140140505790272 [Note] /glide/mariadb/10.2.27snc2/bin/mysqld: ready for connections.
            Version: '10.2.27-MariaDB-log' socket: '/tmp/mysqld_ntnewsub2_3402.sock' port: 3402 MariaDB Server

            james.wang james wang added a comment - Marko, my version 10.2.27 and had the following error on Jan 3rd 2022, please shed some light, thanks: 2022-01-03 3:57:31 140478454376192 [Note] InnoDB: Starting shutdown... 2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Dumping buffer pool(s) to /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool 2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Buffer pool(s) dump completed at 220103 3:57:31 2022-01-03 03:57:38 0x7fa063d36700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/storage/innobase/buf/buf0flu.cc line 3415 InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ InnoDB: about forcing recovery. 220103 3:57:38 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.2.27-MariaDB-log key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=1 max_threads=3102 thread_count=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3770738 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x0 thread_stack 0x49000 /glide/mariadb/10.2.27snc2/bin/mysqld(my_print_stacktrace+0x2e) [0x5597774892ee] /glide/mariadb/10.2.27snc2/bin/mysqld(handle_fatal_signal+0x30d) [0x559776ecac7d] /lib64/libpthread.so.0(+0xf630) [0x7fc3fe694630] /lib64/libc.so.6(gsignal+0x37) [0x7fc3fd49e387] :0(__GI_raise) [0x7fc3fd49fa78] /glide/mariadb/10.2.27snc2/bin/mysqld(+0x48823d) [0x559776c7423d] ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5] ) [0x559777229ad4] /lib64/libpthread.so.0(+0x7ea5) [0x7fc3fe68cea5] /lib64/libc.so.6(clone+0x6d) [0x7fc3fd5669fd] The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. Writing a core file... Working directory at /glide/mysqld/ntnewsub2_3402_ultra/data Resource Limits: Limit Soft Limit Hard Limit Units Max cpu time unlimited unlimited seconds Max file size unlimited unlimited bytes Max data size unlimited unlimited bytes Max stack size 8388608 unlimited bytes Max core file size 0 unlimited bytes Max resident set unlimited unlimited bytes Max processes 2314907 2314907 processes Max open files 32768 32768 files Max locked memory 65536 65536 bytes Max address space unlimited unlimited bytes Max file locks unlimited unlimited locks Max pending signals 2314907 2314907 signals Max msgqueue size 819200 819200 bytes Max nice priority 0 0 Max realtime priority 0 0 Max realtime timeout unlimited unlimited us 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Uses event mutexes 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Compressed tables use zlib 1.2.11 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using Linux native AIO 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Number of pools: 1 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using SSE2 crc32 instructions 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Initializing buffer pool, total size = 132G, instances = 12, chunk size = 128M 2022-01-03 3:57:51 140140505790272 [Note] InnoDB: Completed initialization of buffer pool 2022-01-03 3:57:52 139987587221248 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority(). 2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Highest supported file format is Barracuda. 2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Starting crash recovery from checkpoint LSN=244424587382098 2022-01-03 3:57:54 140140505790272 [Note] InnoDB: Starting final batch to recover 13266 pages from redo log. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Last binlog file '/glide/mysqld/ntnewsub2_3402_ultra/log/binlog/mysql-binlog.000009', position 5482 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 128 out of 128 rollback segments are active. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1" 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating shared tablespace for temporary tables 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Waiting for purge to start 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 5.7.27 started; log sequence number 244424602811341 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #1 encryption thread id 139991063611136 total threads 4. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #2 encryption thread id 139991055218432 total threads 4. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #3 encryption thread id 139991046825728 total threads 4. 2022-01-03 3:57:57 139991072003840 [Note] InnoDB: Loading buffer pool(s) from /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #4 encryption thread id 139990700914432 total threads 4. 2022-01-03 3:57:57 140140505790272 [Note] Plugin 'FEEDBACK' is disabled. 2022-01-03 3:57:57 140140505790272 [Note] Using encryption key id 1 for temporary files 2022-01-03 3:57:57 140140505790272 [Note] Server socket created on IP: '0.0.0.0'. 2022-01-03 3:57:57 140140505790272 [Note] Reading of all Master_info entries succeeded 2022-01-03 3:57:57 140140505790272 [Note] Added new Master_info '' to hash table 2022-01-03 3:57:57 140140505790272 [Note] /glide/mariadb/10.2.27snc2/bin/mysqld: ready for connections. Version: '10.2.27-MariaDB-log' socket: '/tmp/mysqld_ntnewsub2_3402.sock' port: 3402 MariaDB Server

            james.wang, the assertion says that some modified pages have not been written back to the file system, like they are expected to on shutdown.

            Unfortunately the built-in stack trace output for the current thread looks like garbage. For this problem, I think that we would really need the stack traces of all threads, and to examine the buffer pool data structures.

            Starting with 10.3, the buffer pool would be excluded from core dumps in non-debug builds by default; see MDEV-22186 and MDEV-10814. But, you are on 10.2.

            If this is reasonably easily repeatable, I would suggest to attach a debugger to the process before it crashes, so that you can avoid the write of a huge core dump.

            I cannot give any specific advice before I have something like the output of thread apply all backtrace and a dump of each non-empty buf_pool->flush_list as a starting point.

            It would be interesting to know if this is repeatable with a more recent 10.2. Since 10.2.27, there have been many changes that could have fixed this, including MDEV-23776 and MDEV-20813.

            Starting with 10.5, there would be only one buf_pool and therefore only one buf_pool.flush_list. In 10.5 and 10.6, the encryption code was cleaned up a lot. It is possible that in older versions, encryption threads are accessing pages that have been marked as free. But, also this assertion was removed, and MDEV-25113 introduced lazy deletion from buf_pool.flush_list. The predicate there would be !buf_pool.get_oldest_modification(0). I think that a recent 10.5 or 10.6 should hang on shutdown in a scenario like this. (But I am not aware of any shutdown hangs in our internal testing.)

            marko Marko Mäkelä added a comment - james.wang , the assertion says that some modified pages have not been written back to the file system, like they are expected to on shutdown. Unfortunately the built-in stack trace output for the current thread looks like garbage. For this problem, I think that we would really need the stack traces of all threads, and to examine the buffer pool data structures. Starting with 10.3, the buffer pool would be excluded from core dumps in non-debug builds by default; see MDEV-22186 and MDEV-10814 . But, you are on 10.2. If this is reasonably easily repeatable, I would suggest to attach a debugger to the process before it crashes, so that you can avoid the write of a huge core dump. I cannot give any specific advice before I have something like the output of thread apply all backtrace and a dump of each non-empty buf_pool->flush_list as a starting point. It would be interesting to know if this is repeatable with a more recent 10.2. Since 10.2.27, there have been many changes that could have fixed this, including MDEV-23776 and MDEV-20813 . Starting with 10.5, there would be only one buf_pool and therefore only one buf_pool.flush_list . In 10.5 and 10.6, the encryption code was cleaned up a lot. It is possible that in older versions, encryption threads are accessing pages that have been marked as free. But, also this assertion was removed, and MDEV-25113 introduced lazy deletion from buf_pool.flush_list . The predicate there would be !buf_pool.get_oldest_modification(0) . I think that a recent 10.5 or 10.6 should hang on shutdown in a scenario like this. (But I am not aware of any shutdown hangs in our internal testing.)
            james.wang james wang added a comment -

            Thanks a lot, Marko

            james.wang james wang added a comment - Thanks a lot, Marko

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.