[MDEV-12630] InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0 or signal 8, encryption.innodb_encryption-page-compression failed in buildbot - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Fix
Affects Version/s: 10.1(EOL)
Fix Version/s: 10.2.4
Component/s: Encryption, Storage Engine - InnoDB, Tests
Labels:
None

Description

http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/7915/steps/test_5/logs/stdio

encryption.innodb_encryption-page-compression 'xtradb' w3 [ fail ]

        Test ended at 2017-04-28 23:18:55

CURRENT_TEST: encryption.innodb_encryption-page-compression

Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-ca=/mnt/buildbot/build/mariadb-10.1.23/mysql-test/std_data/cacert.pem'

Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-cert=/mnt/buildbot/build/mariadb-10.1.23/mysql-test/std_data/client-cert.pem'

Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-key=/mnt/buildbot/build/mariadb-10.1.23/mysql-test/std_data/client-key.pem'

Warning: /mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded: unknown option '--loose-skip-ssl'

innodb_open_files should not be greater than the open_files_limit.

2017-04-28 23:18:54 b05ffb40  InnoDB: Assertion failure in thread 2959080256 in file buf0flu.cc line 2881

InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0

InnoDB: We intentionally generate a memory trap.

InnoDB: Submit a detailed bug report to http://bugs.mysql.com.

InnoDB: If you get repeated assertion failures or crashes, even

InnoDB: immediately after the mysqld startup, there may be

InnoDB: corruption in the InnoDB tablespace. Please refer to

InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html

InnoDB: about forcing recovery.

mysqltest got signal 6

read_command_buf (0x91ff580):  while; end

conn->name (0x8f8f8f8f): Can't read from address 0x8f8f8f8f

conn->cur_query (0x8f8f8f8f): Can't read from address 0x8f8f8f8f

Attempting backtrace...

stack_bottom = 0x0 thread_stack 0x48400

/mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded(my_print_stacktrace+0x32)[0x81f2c05]

/mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded[0x81d5e83]

/mnt/buildbot/build/mariadb-10.1.23/libmysqld/examples/mysqltest_embedded[0x81d5ec1]

[0xb76e7400]

[0xb76e7424]

/lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xb71c31ef]

/lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xb71c6835]

mysys/stacktrace.c:267(my_print_stacktrace)[0x8a646bc]

/lib/i386-linux-gnu/libpthread.so.0(+0x6d4c)[0xb76cad4c]

/lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0xb727face]

Writing a core file...

Attachments

Issue Links

mentioned in: Page Loading...

Activity

Ascending order - Click to sort in descending order

View 4 older comments

Marko Mäkelä added a comment - 2017-08-23 07:44

I ran across this once when running 10.1 tests locally, in another encryption test:

encryption.create_or_replace 'cbc,innodb_plugin' w3 [ fail ]

        Test ended at 2017-08-23 10:19:31

CURRENT_TEST: encryption.create_or_replace

mysqltest: At line 101: Cannot reap on a connection without pending send

…

2017-08-23 10:18:39 140209668322880 [Note] /mariadb/10.1/build/sql/mysqld: ready for connections.

Version: '10.1.27-MariaDB-debug'  socket: '/mariadb/10.1/build/mysql-test/var/tmp/3/mysqld.1.sock'  port: 16140  Source distribution

2017-08-23 10:18:39 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 1.

2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 4.

2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #2 thread id 140209240880896 total threads 4.

2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #3 thread id 140209232488192 total threads 4.

2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #4 thread id 140209224095488 total threads 4.

170823 10:19:31 [ERROR] mysqld got signal 8 ;

This is clearly a division by zero in the debug output in fil_crypt_realloc_iops().

		DBUG_PRINT("ib_crypt",

			("thr_no: %u only waited %lu%% skip re-estimate.",

			state->thread_no,

			(100 * state->cnt_waited) / state->batch));

I believe that thanks to ~~MDEV-11713~~ in MariaDB 10.2.4 this will not trigger, because DBUG_PRINT would not evaluate the expression unless the label is active.

$1 = {thread_no = 0, first = false, space = 0x7f84cc0ae898, offset = 7,

  batch = 0, min_key_version_found = 0, end_lsn = 9366944,

  estimated_max_iops = 1, allocated_iops = 1, cnt_waited = 0,

  sum_waited_us = 0, crypt_stat = {pages_read_from_cache = 0,

    pages_read_from_disk = 0, pages_modified = 0, pages_flushed = 0,

    estimated_iops = 1}, scrub_data = {space = 166, scrubbing = false,

    compressed = false, current_table = 0x0, current_index = 0x0,

    savepoint = 0, scrub_stat = {page_reorganizations = 0, page_splits = 0,

      page_split_failures_underflow = 0,

      page_split_failures_out_of_filespace = 0,

      page_split_failures_missing_index = 0,

      page_split_failures_unknown = 0}}}

(gdb) p srv_alloc_time

$2 = 3

Based on the above information, it seems to me that either space->batch was never assigned after it was zero-initialized by rotate_thread_t::rotate_thread_t(uint), or space->batch was assigned to 0 in fil_crypt_find_page_to_rotate():

	bool found = crypt_data->rotate_state.max_offset >=

		crypt_data->rotate_state.next_offset;

	if (found) {

		state->offset = crypt_data->rotate_state.next_offset;

		ulint remaining = crypt_data->rotate_state.max_offset -

			crypt_data->rotate_state.next_offset;

		if (batch <= remaining) {

			state->batch = batch;

		} else {

			state->batch = remaining;

If remaining==0 is possible here, then state->batch would be assigned to 0 as well.

Even though the failure is in debug output only, I think that the logic should be reviewed carefully. Is there anywhere else an assumption that the rotate_thread_t::batch is always growing and never resetting to 0?

Marko Mäkelä added a comment - 2017-08-23 07:44 I ran across this once when running 10.1 tests locally, in another encryption test: encryption.create_or_replace 'cbc,innodb_plugin' w3 [ fail ] Test ended at 2017-08-23 10:19:31 CURRENT_TEST: encryption.create_or_replace mysqltest: At line 101: Cannot reap on a connection without pending send … 2017-08-23 10:18:39 140209668322880 [Note] /mariadb/10.1/build/sql/mysqld: ready for connections. Version: '10.1.27-MariaDB-debug' socket: '/mariadb/10.1/build/mysql-test/var/tmp/3/mysqld.1.sock' port: 16140 Source distribution 2017-08-23 10:18:39 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 1. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #1 thread id 140209249273600 total threads 4. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #2 thread id 140209240880896 total threads 4. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #3 thread id 140209232488192 total threads 4. 2017-08-23 10:18:44 140209416653568 [Note] InnoDB: Creating #4 thread id 140209224095488 total threads 4. 170823 10:19:31 [ERROR] mysqld got signal 8 ; This is clearly a division by zero in the debug output in fil_crypt_realloc_iops(). DBUG_PRINT("ib_crypt", ("thr_no: %u only waited %lu%% skip re-estimate.", state->thread_no, (100 * state->cnt_waited) / state->batch)); I believe that thanks to MDEV-11713 in MariaDB 10.2.4 this will not trigger, because DBUG_PRINT would not evaluate the expression unless the label is active. $1 = {thread_no = 0, first = false, space = 0x7f84cc0ae898, offset = 7, batch = 0, min_key_version_found = 0, end_lsn = 9366944, estimated_max_iops = 1, allocated_iops = 1, cnt_waited = 0, sum_waited_us = 0, crypt_stat = {pages_read_from_cache = 0, pages_read_from_disk = 0, pages_modified = 0, pages_flushed = 0, estimated_iops = 1}, scrub_data = {space = 166, scrubbing = false, compressed = false, current_table = 0x0, current_index = 0x0, savepoint = 0, scrub_stat = {page_reorganizations = 0, page_splits = 0, page_split_failures_underflow = 0, page_split_failures_out_of_filespace = 0, page_split_failures_missing_index = 0, page_split_failures_unknown = 0}}} (gdb) p srv_alloc_time $2 = 3 Based on the above information, it seems to me that either space->batch was never assigned after it was zero-initialized by rotate_thread_t::rotate_thread_t(uint), or space->batch was assigned to 0 in fil_crypt_find_page_to_rotate(): bool found = crypt_data->rotate_state.max_offset >= crypt_data->rotate_state.next_offset; if (found) { state->offset = crypt_data->rotate_state.next_offset; ulint remaining = crypt_data->rotate_state.max_offset - crypt_data->rotate_state.next_offset; if (batch <= remaining) { state->batch = batch; } else { state->batch = remaining; } } If remaining==0 is possible here, then state->batch would be assigned to 0 as well. Even though the failure is in debug output only, I think that the logic should be reviewed carefully. Is there anywhere else an assumption that the rotate_thread_t::batch is always growing and never resetting to 0?

Marko Mäkelä added a comment - 2021-10-26 07:13

MariaDB Server 10.1 reached its end of life on October 17, 2020, and the last release (mariadb-10.1.48) was a couple of weeks after that.

In the cross-reference, I did not find any failures exactly like this, even for 10.1. The last test failure on 2020-08-31 was something harmless:

10.1 f69cc26757733724254ee37aec5a092f
encryption.innodb_encryption-page-compression 'innodb_plugin' w2 [ fail ] Found warnings/errors in server log file!
Test ended at 2020-08-31 12:01:48
line
2020-08-31 12:01:41 2734676800 [ERROR] InnoDB: Trying to access page [space=6: page=12] but the tablespace does not exist or is just being dropped.
^ Found warnings in /mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/mysqld.1.err
ok

- saving '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/encryption.innodb_encryption-page-compression-innodb_plugin/' to '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/log/encryption.innodb_encryption-page-compression-innodb_plugin/'

Retrying test encryption.innodb_encryption-page-compression, attempt(2/3)...

Like I wrote some years ago, the division by zero was avoided in 10.2.4 by ~~MDEV-11713~~.

Marko Mäkelä added a comment - 2021-10-26 07:13 MariaDB Server 10.1 reached its end of life on October 17, 2020, and the last release ( mariadb-10.1.48 ) was a couple of weeks after that. In the cross-reference, I did not find any failures exactly like this, even for 10.1. The last test failure on 2020-08-31 was something harmless: 10.1 f69cc26757733724254ee37aec5a092f encryption.innodb_encryption-page-compression 'innodb_plugin' w2 [ fail ] Found warnings/errors in server log file! Test ended at 2020-08-31 12:01:48 line 2020-08-31 12:01:41 2734676800 [ERROR] InnoDB: Trying to access page [space=6: page=12] but the tablespace does not exist or is just being dropped. ^ Found warnings in /mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/mysqld.1.err ok - saving '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/2/log/encryption.innodb_encryption-page-compression-innodb_plugin/' to '/mnt/buildbot/build/mariadb-10.1.47/mysql-test/var/log/encryption.innodb_encryption-page-compression-innodb_plugin/' Retrying test encryption.innodb_encryption-page-compression, attempt(2/3)... Like I wrote some years ago, the division by zero was avoided in 10.2.4 by MDEV-11713 .

james wang added a comment - 2022-01-05 08:46

Marko, my version 10.2.27 and had the following error on Jan 3rd 2022, please shed some light, thanks:

2022-01-03 3:57:31 140478454376192 [Note] InnoDB: Starting shutdown...
2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Dumping buffer pool(s) to /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool
2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Buffer pool(s) dump completed at 220103 3:57:31
2022-01-03 03:57:38 0x7fa063d36700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/storage/innobase/buf/buf0flu.cc line 3415
InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
220103 3:57:38 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.2.27-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=1
max_threads=3102
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3770738 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/glide/mariadb/10.2.27snc2/bin/mysqld(my_print_stacktrace+0x2e)[0x5597774892ee]
/glide/mariadb/10.2.27snc2/bin/mysqld(handle_fatal_signal+0x30d)[0x559776ecac7d]
/lib64/libpthread.so.0(+0xf630)[0x7fc3fe694630]
/lib64/libc.so.6(gsignal+0x37)[0x7fc3fd49e387]
:0(__GI_raise)[0x7fc3fd49fa78]
/glide/mariadb/10.2.27snc2/bin/mysqld(+0x48823d)[0x559776c7423d]
ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x559777229ad4]
/lib64/libpthread.so.0(+0x7ea5)[0x7fc3fe68cea5]
/lib64/libc.so.6(clone+0x6d)[0x7fc3fd5669fd]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /glide/mysqld/ntnewsub2_3402_ultra/data
Resource Limits:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 2314907 2314907 processes
Max open files 32768 32768 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 2314907 2314907 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Uses event mutexes
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using Linux native AIO
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Number of pools: 1
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using SSE2 crc32 instructions
2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Initializing buffer pool, total size = 132G, instances = 12, chunk size = 128M
2022-01-03 3:57:51 140140505790272 [Note] InnoDB: Completed initialization of buffer pool
2022-01-03 3:57:52 139987587221248 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Highest supported file format is Barracuda.
2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Starting crash recovery from checkpoint LSN=244424587382098
2022-01-03 3:57:54 140140505790272 [Note] InnoDB: Starting final batch to recover 13266 pages from redo log.
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Last binlog file '/glide/mysqld/ntnewsub2_3402_ultra/log/binlog/mysql-binlog.000009', position 5482
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 128 out of 128 rollback segments are active.
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating shared tablespace for temporary tables
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Waiting for purge to start
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 5.7.27 started; log sequence number 244424602811341
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #1 encryption thread id 139991063611136 total threads 4.
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #2 encryption thread id 139991055218432 total threads 4.
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #3 encryption thread id 139991046825728 total threads 4.
2022-01-03 3:57:57 139991072003840 [Note] InnoDB: Loading buffer pool(s) from /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool
2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #4 encryption thread id 139990700914432 total threads 4.
2022-01-03 3:57:57 140140505790272 [Note] Plugin 'FEEDBACK' is disabled.
2022-01-03 3:57:57 140140505790272 [Note] Using encryption key id 1 for temporary files
2022-01-03 3:57:57 140140505790272 [Note] Server socket created on IP: '0.0.0.0'.
2022-01-03 3:57:57 140140505790272 [Note] Reading of all Master_info entries succeeded
2022-01-03 3:57:57 140140505790272 [Note] Added new Master_info '' to hash table
2022-01-03 3:57:57 140140505790272 [Note] /glide/mariadb/10.2.27snc2/bin/mysqld: ready for connections.
Version: '10.2.27-MariaDB-log' socket: '/tmp/mysqld_ntnewsub2_3402.sock' port: 3402 MariaDB Server

james wang added a comment - 2022-01-05 08:46 Marko, my version 10.2.27 and had the following error on Jan 3rd 2022, please shed some light, thanks: 2022-01-03 3:57:31 140478454376192 [Note] InnoDB: Starting shutdown... 2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Dumping buffer pool(s) to /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool 2022-01-03 3:57:31 140330246588160 [Note] InnoDB: Buffer pool(s) dump completed at 220103 3:57:31 2022-01-03 03:57:38 0x7fa063d36700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/storage/innobase/buf/buf0flu.cc line 3415 InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ InnoDB: about forcing recovery. 220103 3:57:38 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.2.27-MariaDB-log key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=1 max_threads=3102 thread_count=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3770738 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x0 thread_stack 0x49000 /glide/mariadb/10.2.27snc2/bin/mysqld(my_print_stacktrace+0x2e) [0x5597774892ee] /glide/mariadb/10.2.27snc2/bin/mysqld(handle_fatal_signal+0x30d) [0x559776ecac7d] /lib64/libpthread.so.0(+0xf630) [0x7fc3fe694630] /lib64/libc.so.6(gsignal+0x37) [0x7fc3fd49e387] :0(__GI_raise) [0x7fc3fd49fa78] /glide/mariadb/10.2.27snc2/bin/mysqld(+0x48823d) [0x559776c7423d] ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5] ) [0x559777229ad4] /lib64/libpthread.so.0(+0x7ea5) [0x7fc3fe68cea5] /lib64/libc.so.6(clone+0x6d) [0x7fc3fd5669fd] The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. Writing a core file... Working directory at /glide/mysqld/ntnewsub2_3402_ultra/data Resource Limits: Limit Soft Limit Hard Limit Units Max cpu time unlimited unlimited seconds Max file size unlimited unlimited bytes Max data size unlimited unlimited bytes Max stack size 8388608 unlimited bytes Max core file size 0 unlimited bytes Max resident set unlimited unlimited bytes Max processes 2314907 2314907 processes Max open files 32768 32768 files Max locked memory 65536 65536 bytes Max address space unlimited unlimited bytes Max file locks unlimited unlimited locks Max pending signals 2314907 2314907 signals Max msgqueue size 819200 819200 bytes Max nice priority 0 0 Max realtime priority 0 0 Max realtime timeout unlimited unlimited us 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Uses event mutexes 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Compressed tables use zlib 1.2.11 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using Linux native AIO 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Number of pools: 1 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Using SSE2 crc32 instructions 2022-01-03 3:57:47 140140505790272 [Note] InnoDB: Initializing buffer pool, total size = 132G, instances = 12, chunk size = 128M 2022-01-03 3:57:51 140140505790272 [Note] InnoDB: Completed initialization of buffer pool 2022-01-03 3:57:52 139987587221248 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority(). 2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Highest supported file format is Barracuda. 2022-01-03 3:57:52 140140505790272 [Note] InnoDB: Starting crash recovery from checkpoint LSN=244424587382098 2022-01-03 3:57:54 140140505790272 [Note] InnoDB: Starting final batch to recover 13266 pages from redo log. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Last binlog file '/glide/mysqld/ntnewsub2_3402_ultra/log/binlog/mysql-binlog.000009', position 5482 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 128 out of 128 rollback segments are active. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1" 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating shared tablespace for temporary tables 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Waiting for purge to start 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: 5.7.27 started; log sequence number 244424602811341 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #1 encryption thread id 139991063611136 total threads 4. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #2 encryption thread id 139991055218432 total threads 4. 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #3 encryption thread id 139991046825728 total threads 4. 2022-01-03 3:57:57 139991072003840 [Note] InnoDB: Loading buffer pool(s) from /glide/mysqld/ntnewsub2_3402_ultra/data/ib_buffer_pool 2022-01-03 3:57:57 140140505790272 [Note] InnoDB: Creating #4 encryption thread id 139990700914432 total threads 4. 2022-01-03 3:57:57 140140505790272 [Note] Plugin 'FEEDBACK' is disabled. 2022-01-03 3:57:57 140140505790272 [Note] Using encryption key id 1 for temporary files 2022-01-03 3:57:57 140140505790272 [Note] Server socket created on IP: '0.0.0.0'. 2022-01-03 3:57:57 140140505790272 [Note] Reading of all Master_info entries succeeded 2022-01-03 3:57:57 140140505790272 [Note] Added new Master_info '' to hash table 2022-01-03 3:57:57 140140505790272 [Note] /glide/mariadb/10.2.27snc2/bin/mysqld: ready for connections. Version: '10.2.27-MariaDB-log' socket: '/tmp/mysqld_ntnewsub2_3402.sock' port: 3402 MariaDB Server

Marko Mäkelä added a comment - 2022-01-05 09:21

james.wang, the assertion says that some modified pages have not been written back to the file system, like they are expected to on shutdown.

Unfortunately the built-in stack trace output for the current thread looks like garbage. For this problem, I think that we would really need the stack traces of all threads, and to examine the buffer pool data structures.

Starting with 10.3, the buffer pool would be excluded from core dumps in non-debug builds by default; see ~~MDEV-22186~~ and ~~MDEV-10814~~. But, you are on 10.2.

If this is reasonably easily repeatable, I would suggest to attach a debugger to the process before it crashes, so that you can avoid the write of a huge core dump.

I cannot give any specific advice before I have something like the output of thread apply all backtrace and a dump of each non-empty buf_pool->flush_list as a starting point.

It would be interesting to know if this is repeatable with a more recent 10.2. Since 10.2.27, there have been many changes that could have fixed this, including ~~MDEV-23776~~ and ~~MDEV-20813~~.

Starting with 10.5, there would be only one buf_pool and therefore only one buf_pool.flush_list. In 10.5 and 10.6, the encryption code was cleaned up a lot. It is possible that in older versions, encryption threads are accessing pages that have been marked as free. But, also this assertion was removed, and ~~MDEV-25113~~ introduced lazy deletion from buf_pool.flush_list. The predicate there would be !buf_pool.get_oldest_modification(0). I think that a recent 10.5 or 10.6 should hang on shutdown in a scenario like this. (But I am not aware of any shutdown hangs in our internal testing.)

Marko Mäkelä added a comment - 2022-01-05 09:21 james.wang , the assertion says that some modified pages have not been written back to the file system, like they are expected to on shutdown. Unfortunately the built-in stack trace output for the current thread looks like garbage. For this problem, I think that we would really need the stack traces of all threads, and to examine the buffer pool data structures. Starting with 10.3, the buffer pool would be excluded from core dumps in non-debug builds by default; see MDEV-22186 and MDEV-10814 . But, you are on 10.2. If this is reasonably easily repeatable, I would suggest to attach a debugger to the process before it crashes, so that you can avoid the write of a huge core dump. I cannot give any specific advice before I have something like the output of thread apply all backtrace and a dump of each non-empty buf_pool->flush_list as a starting point. It would be interesting to know if this is repeatable with a more recent 10.2. Since 10.2.27, there have been many changes that could have fixed this, including MDEV-23776 and MDEV-20813 . Starting with 10.5, there would be only one buf_pool and therefore only one buf_pool.flush_list . In 10.5 and 10.6, the encryption code was cleaned up a lot. It is possible that in older versions, encryption threads are accessing pages that have been marked as free. But, also this assertion was removed, and MDEV-25113 introduced lazy deletion from buf_pool.flush_list . The predicate there would be !buf_pool.get_oldest_modification(0) . I think that a recent 10.5 or 10.6 should hang on shutdown in a scenario like this. (But I am not aware of any shutdown hangs in our internal testing.)

james wang added a comment - 2022-01-05 09:29

Thanks a lot, Marko

james wang added a comment - 2022-01-05 09:29 Thanks a lot, Marko

MariaDB Server

InnoDB: Failing assertion: UT_LIST_GET_LEN(buf_pool->flush_list) == 0 or signal 8, encryption.innodb_encryption-page-compression failed in buildbot

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration