[MDEV-10283] SEGV on ANALYZE LOCAL TABLE __ PERSISTENT FOR ALL Created: 2016-06-24  Updated: 2021-04-28  Resolved: 2021-04-28

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.0.25
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Tom Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Environment:

Linux hv3 3.13.0-86-generic #131-Ubuntu SMP Thu May 12 23:33:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux


Issue Links:
Duplicate
duplicates MDEV-6529 optimize VARCHAR temp storage during ... Confirmed

 Description   

This happened on two different servers that ran the same command on copies of the same MyISAM table.

Command was: ANALYZE LOCAL TABLE ac_schematic PERSISTENT FOR ALL.

CREATE TABLE `ac_schematic` (
  `srcid` int(11) NOT NULL DEFAULT '0',
  `pid` int(11) NOT NULL DEFAULT '0',
  `ppid` int(11) NOT NULL DEFAULT '0',
  `aid` int(11) NOT NULL DEFAULT '0',
  `did` int(11) NOT NULL DEFAULT '0',
  `cid` int(11) NOT NULL DEFAULT '0',
  `sid` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`srcid`,`pid`),
  KEY `srcid_ppid_pid` (`srcid`,`ppid`,`pid`),
  KEY `cid` (`cid`),
  KEY `aid` (`aid`),
  KEY `did` (`did`),
  KEY `sid` (`sid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Table has 66126857 rows, 1.8GiB data length, 5.1GiB index length.

Same command on two other somewhat smaller tables did not crash.

$ cat /etc/mysql/my.cnf
[client]
default-character-set           = utf8
 
[mysql]
max_allowed_packet              = 1024M
 
[mysqldump]
max_allowed_packet              = 1024M
 
[mysqld_safe]
socket                          = /var/run/mysqld/mysqld.sock
nice                            = 0
open-files-limit                 = 40000
 
[mysqld]
user                            = mysql
pid_file                        = /var/run/mysqld/mysqld.pid
datadir                         = /home/mysql/datadir
tmpdir                          = /home/mysql/tmpdir
skip_symbolic_links
skip_name_resolve
skip_external_locking
innodb_status_file              = 1
bind_address                    = 0.0.0.0
character-set-server            = utf8
collation-server                = utf8_general_ci
character_set_server            = utf8
collation_server                = utf8_general_ci
myisam_recover_options          = BACKUP
 
#https://blog.mariadb.org/installing-mariadb-galera-cluster-on-debian-ubuntu/
# mysql settings
binlog_format                   = ROW
default_storage_engine          = innodb
innodb_autoinc_lock_mode        = 2
innodb_doublewrite              = 1
# Query cache makes Galera burn to the ground
query_cache_size                = 0
query_cache_type                = 0
# galera settings
wsrep_on                        = ON
wsrep_provider                  = /usr/lib/galera/libgalera_smm.so
wsrep_provider_options          = "gcs.fc_limit = 40; gcache.size = 512M"
wsrep_cluster_name              = "spinitron_galera"
wsrep_cluster_address           = "gcomm://10.1.1.5,10.1.1.3,10.1.1.4"
wsrep_node_address              = 10.1.1.3
wsrep_sst_method                = rsync
wsrep_retry_autocommit          = 3
 
# these are huge myisam dbs generated once a week by a batch sql script
# and indexed with sphinx. they are never written after initial generation
binlog_ignore_db                = autocomplete1
binlog_ignore_db                = autocomplete2
replicate_ignore_db             = autocomplete1
replicate_ignore_db             = autocomplete2
 
## TUNING
##
key_buffer_size                 = 8G
innodb_buffer_pool_size         = 16G
table_open_cache                = 5000
max_connections                 = 200
max_connect_errors              = 100000
net_write_timeout               = 3600
net_read_timeout                = 3600
max_allowed_packet              = 1024M
bulk_insert_buffer_size	        = 16M
tmp_table_size                  = 1G
max_heap_table_size             = 1G
innodb_use_native_aio           = 0
# http://www.percona.com/blog/2013/09/20/innodb-performance-optimization-basics-updated/
# We have very small transaction writes and less than one per second, so
#innodb_log_file_size           = use default 48MB
#innodb_log_buffer_size         = use default
innodb_flush_log_at_trx_commit  = 0
innodb_flush_method             = O_DIRECT
innodb_thread_concurrency       = 8
 
# for checking the autocomplete dbs
[isamchk]
key_buffer_size                 = 1G
read_buffer_size                = 64M
write_buffer_size               = 64M


 
Server version: 10.0.25-MariaDB-1~trusty-wsrep
key_buffer_size=8589934592
read_buffer_size=131072
max_used_connections=24
max_threads=202
thread_count=4
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8832222 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f00966ab008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f06ec08ddf0 thread_stack 0x48000
(my_addr_resolve failure: fork)
/usr/sbin/mysqld(my_print_stacktrace+0x2e) [0x7f0710b08cee]
/usr/sbin/mysqld(handle_fatal_signal+0x457) [0x7f0710638547]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330) [0x7f070edd7330]
/usr/sbin/mysqld(Field_long::cmp(unsigned char const*, unsigned char const*)+0xf) [0x7f071061463f]
/usr/sbin/mysqld(queue_insert+0x84) [0x7f0710b08454]
/usr/sbin/mysqld(merge_buffers(Sort_param*, st_io_cache*, st_io_cache*, unsigned char*, st_buffpek*, st_buffpek*, st_buffpek*, int)+0x1ed) [0x7f07106358cd]
/usr/sbin/mysqld(merge_many_buff(Sort_param*, unsigned char*, st_buffpek*, unsigned int*, st_io_cache*)+0x192) [0x7f07106362a2]
/usr/sbin/mysqld(Unique::merge(TABLE*, unsigned char*, bool)+0x114) [0x7f071056a834]
/usr/sbin/mysqld(Unique::walk(TABLE*, int (*)(void*, unsigned int, void*), void*)+0x251) [0x7f071056abe1]
/usr/sbin/mysqld(collect_statistics_for_table(THD*, TABLE*)+0xdf1) [0x7f0710523951]
/usr/sbin/mysqld(+0x4f782b) [0x7f071059482b]
/usr/sbin/mysqld(Sql_cmd_analyze_table::execute(THD*)+0xed) [0x7f071059622d]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0xada) [0x7f07104a8fda]
/usr/sbin/mysqld(+0x415c5b) [0x7f07104b2c5b]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x1fa1) [0x7f07104b5261]
/usr/sbin/mysqld(do_command(THD*)+0x28f) [0x7f07104b5f6f]
/usr/sbin/mysqld(do_handle_one_connection(THD*)+0x32b) [0x7f07105860cb]
/usr/sbin/mysqld(handle_one_connection+0x40) [0x7f07105861b0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f070edcf184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f070e4f237d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f0053908020): is an invalid pointer
Connection ID (thread ID): 11139929
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on



 Comments   
Comment by Elena Stepanova [ 2016-06-25 ]

So far I couldn't reproduce the problem with (mostly) the same config file, the same table structure and size, and random data, on a Galera 10.0.25 node.
So if you can answer any of the questions below, it would help. I understand that some questions presume experiments which could result in new crashes, which you might not be able to afford.

  • is it certainly the right table definition? ANALYZE is run on ac2_schematic, but CREATE is for ac_schematic;
  • is the crash reproducible on the same server where it happened before?
  • is the crash reproducible on the server when it is not a part of a Galera cluster (and better yet, started without any wsrep options at all, so it's not even a single-node cluster, just a standalone server)?
  • is there anything suspicious in the error log before the crash, especially anything that indicates that either ac_schematic or any tables in mysql schema are corrupted?
  • did you try to run CHECK TABLE on ac_schematic and statistics tables? If not, could you please do it?

    CHECK TABLE mysql.column_stats, mysql.index_stats, mysql.table_stats EXTENDED;
    -- the one below would also be better with EXTENDED, but it can take a lot of time:
    CHECK TABLE ac_schematic;
    

  • would it be possible to upload the .frm, .MYI and .MYD files of all four tables to our FTP server ftp.askmonty.org/private?
Comment by Elena Stepanova [ 2016-06-25 ]

Meanwhile, psergey, cvicentiu, could you please take a look at the stack trace? Even though it's not from a debug build, it's readable; maybe you'll have some ideas.
I'm adding cvicentiu to the loop because he fixed a somewhat similar one some time ago: MDEV-7912. That time it was a Windows-specific problem though.

Comment by Tom [ 2016-06-25 ]

Hi Elena,

I'll do my best to answer your questions. Some are easier than others. I'll start with the easy ones.

I was not able to reproduce the crash with the simple tests given below. But I want to reiterate that the same crash happened on two servers, hv3 and hv4, at the same point in a weekly process, about 10 minutes apart, which corresponds to the staggered update start times. There is more possibly relevant detail in 5) below.

1) The ac_schematic table definition is correct. I typed ac2_schematic into the report by mistake (now corrected). The software using the table is called ac2, but the table is called ac_schematic. There is no table called ac2_schematic.

2) No. However, after I recovered the mysqlds from the crash yesterday, I ran ANALYZE from MySQL Workbench. I ran this today:

MariaDB [autocomplete1]> ANALYZE LOCAL TABLE ac_schematic PERSISTENT FOR ALL;
+----------------------------+---------+----------+-----------------------------------------+
| Table                      | Op      | Msg_type | Msg_text                                |
+----------------------------+---------+----------+-----------------------------------------+
| autocomplete1.ac_schematic | analyze | status   | Engine-independent statistics collected |
| autocomplete1.ac_schematic | analyze | status   | Table is already up to date             |
+----------------------------+---------+----------+-----------------------------------------+
2 rows in set (4 min 23.83 sec)

3) This would be very nerve-wracking. I'm not sure it's worth the trouble given the result in 2) above.

4) There is nothing unusual in the error log. Every hour, node hv5 desyncs itself for about 4 minutes to make a backup. hv3 and hv4 (the two servers on which the same crash happened executing the same command) get a note in the error log every time this happens, which serves as a timestamp. It looks like this:

160624 12:35:02 [Note] WSREP: Member 1.0 (hv5) desyncs itself from group
160624 12:40:26 [Note] WSREP: Member 1.0 (hv5) resyncs itself to group
160624 12:40:26 [Note] WSREP: Member 1.0 (hv5) synced with group.
160624 13:35:03 [Note] WSREP: Member 1.0 (hv5) desyncs itself from group
160624 13:39:17 [Note] WSREP: Member 1.0 (hv5) resyncs itself to group
160624 13:39:17 [Note] WSREP: Member 1.0 (hv5) synced with group.
160624 14:35:02 [Note] WSREP: Member 1.0 (hv5) desyncs itself from group
160624 14:38:30 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

For a while I considered that the size of the error log on hv3 might be causing a problem. It was 4.7GiB. But then the same crash happened on hv4 a few minutes later and its error log was a reasonable size.

5) The MyISAM data and index files for this database are prepared elsewhere and copied to the server. After copying them to a temp dir outside datadir, the script runs "myisamchk --silent --force --update-state --key_buffer_size=1G --sort_buffer_size=1G --read_buffer_size=16M --write_buffer_size=16M *.MYI" in a shell. After this they are copied into datadir. It didn't log any errors.
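
The weekly refresh step described above can be sketched roughly as below. The STAGE and DATADIR paths are hypothetical (the report only says "a temp dir outside datadir"); the myisamchk flags are copied verbatim from the comment, and the script skips gracefully when the tool or staged files are absent:

```shell
# Sketch of the weekly staging + check + swap-in step (paths are assumptions)
STAGE="${STAGE:-/home/mysql/stage}"
DATADIR="${DATADIR:-/home/mysql/datadir/autocomplete1}"

if command -v myisamchk >/dev/null 2>&1 && ls "$STAGE"/*.MYI >/dev/null 2>&1; then
  # Check and rebuild indexes outside datadir, as the weekly script does
  myisamchk --silent --force --update-state \
    --key_buffer_size=1G --sort_buffer_size=1G \
    --read_buffer_size=16M --write_buffer_size=16M "$STAGE"/*.MYI
  # Only after a clean check are the files moved into datadir
  cp -p "$STAGE"/*.frm "$STAGE"/*.MYD "$STAGE"/*.MYI "$DATADIR"/
else
  echo "myisamchk or staged files not available; skipping"
fi
```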

Then the same script runs CHECK TABLE CHANGED on each table immediately prior to ANALYZE. It didn't log any errors.

MariaDB [(none)]> CHECK TABLE mysql.column_stats, mysql.index_stats, mysql.table_stats EXTENDED;
+--------------------+-------+----------+----------+
| Table              | Op    | Msg_type | Msg_text |
+--------------------+-------+----------+----------+
| mysql.column_stats | check | status   | OK       |
| mysql.index_stats  | check | status   | OK       |
| mysql.table_stats  | check | status   | OK       |
+--------------------+-------+----------+----------+
3 rows in set (0.00 sec)
 
MariaDB [(none)]> CHECK TABLE autocomplete1.ac_schematic;
+----------------------------+-------+----------+----------+
| Table                      | Op    | Msg_type | Msg_text |
+----------------------------+-------+----------+----------+
| autocomplete1.ac_schematic | check | status   | OK       |
+----------------------------+-------+----------+----------+
1 row in set (36.68 sec)

6) Yes, I'll upload those files if I can manage.

Comment by Tom [ 2016-06-25 ]

I want to draw attention to these lines in my.cnf:

binlog_ignore_db                = autocomplete1
binlog_ignore_db                = autocomplete2
replicate_ignore_db             = autocomplete1
replicate_ignore_db             = autocomplete2

The ac_* tables in the autocomplete1 and 2 databases are read-only as far as the app is concerned. They are recreated once a week double-buffer style. They are MyISAM because the process of creating the tables is extremely slow in InnoDB (10s of hours vs hours). The recreation and updates use massive DML queries that I don't dare replicate in Galera. I would prefer to have these tables in a separate mysqld from the rest of the app but haven't had time to do that yet.

I introduced the ANALYZE LOCAL TABLE command into the weekly process (as described in 5) above) for the first time this week on production servers. I did this because, after a MariaDB update a few weeks ago, Sphinx indexer sometimes selects far fewer rows from the autocomplete tables than it should. The error corresponded with incorrect table stats as shown in MySQL Workbench. And Sphinx indexer ran correctly after I analyzed the tables from MySQL Workbench.

My suspicion is that the problem could be related to how DDL like ANALYZE TABLE (statement replication) behaves with replicate_ignore_ and binlog_ignore in a Galera cluster. Just speculation.

For now I will remove "PERSISTENT FOR ALL" and see if "ANALYZE LOCAL TABLE ac_schematic" crashes any servers next week.

Comment by Tom [ 2016-06-25 ]

I believe I have uploaded the requested files via ftp. Please confirm.

Comment by Elena Stepanova [ 2016-06-25 ]

Yes, the files have arrived, thanks. Maybe we'll have some luck with them, although of course since it's not reliably reproducible on your side, the hope is rather optimistic.
Thanks for your answers to the questions. Regarding (3) I agree that there is no point in this experiment if the crash is not reproducible in (2).

Comment by Tom [ 2016-06-26 ]

ANALYZE LOCAL TABLE __ PERSISTENT FOR ALL can use a surprising amount of filesystem space.

While trying to recreate the crash, I had a test VM with 55.7GiB used and 42.6GiB free in the filesystem holding datadir and tmpdir. While the analyze command ran, usage gradually reached 98.3GiB used and 0 (zero) free, then returned to the initial state within the next 5 seconds.

In the command output there was a line

autocomplete2.ac_table analyze Error Error writing file '/home/mysql/tmpdir/MYfY0Vne' (Errcode: 28 "No space left on device")

The largest data table analyzed was 6.6GiB and the largest index file was 5.1GiB. So it's surprising that mysqld used more than 42GiB of additional file space to analyze the tables. The docs provide a warning: "Currently, engine-independent statistics is collected by doing full table and full index scans, and can be quite expensive." But this implies I/O and CPU expense, not disk space.
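
The observation above was made by watching the filesystem; a minimal one-second sampler like the following captures the peak tmpdir usage while the ANALYZE runs. The path and sample count are assumptions; adjust SAMPLE_DIR to match the tmpdir setting in my.cnf and SAMPLES to cover the statement's runtime:

```shell
# Hypothetical sampler: record peak disk usage of tmpdir, once per second
SAMPLE_DIR="${SAMPLE_DIR:-/home/mysql/tmpdir}"
SAMPLES="${SAMPLES:-3}"   # in practice, run for the full ANALYZE duration
peak=0
i=0
while [ "$i" -lt "$SAMPLES" ]; do
  # du fails harmlessly if the dir vanishes between samples
  used=$(du -sk "$SAMPLE_DIR" 2>/dev/null | awk '{print $1}')
  used="${used:-0}"
  if [ "$used" -gt "$peak" ]; then
    peak="$used"
  fi
  i=$((i + 1))
  sleep 1
done
echo "peak tmpdir usage: ${peak} KiB"
```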

Comment by Elena Stepanova [ 2016-06-26 ]

The note actually implies everything (maybe it's bad wording, though). We have a task about the tmp space problem: MDEV-6529.
I thought it could have been related to the crash, but I expect the error log would have some indication of it, and there is none.
If you have a reason to believe tmp could have gone full at the time of the crash, please let us know.

Comment by Sergei Petrunia [ 2016-06-26 ]

The stacktrace unfortunately doesn't give me any clue about what could be happening.

Comment by Tom [ 2016-06-26 ]

The test VM I used had equivalent tables, data, and filesystem dimensions. It used more than 42GiB to do the ANALYZE TABLE. The server that crashed had about 40GiB free. So I think I can infer that it did run out of space.

The test VM did not crash but it was also not running Galera with a steady stream of wsreps going both up and down.

Comment by Elena Stepanova [ 2016-06-26 ]

Thanks, it's helpful. I suppose it's possible that if the server runs out of space at an unfortunate moment, it can crash rather than complain about it as it did in your test VM and as it was happening in the other bug report. I will see if I can reproduce the crash this way.

Comment by Elena Stepanova [ 2016-06-29 ]

Looking further into it, I take back my words about it being the same problem as MDEV-6529.
MDEV-6529 is mostly about big VARCHAR columns (and unfortunate data layouts); but your table does not have a single VARCHAR column.

While with INT columns the disk usage is also somewhat more than desired, it should be nowhere near the values that you observe.

I ran the same ANALYZE on your table and, as expected, saw a peak tmp disk usage of ~300 MB (measurements were taken every second, so the peak couldn't have been missed).

So, if your server indeed takes over 40 GB executing the same query on the same table, it's very strange and worrisome.
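
A back-of-envelope check (an assumption about the mechanism, not a reading of the server source): if column values are spilled to tmp and merge-sorted one column at a time, the expected peak is roughly rows × value width for the widest column, which for this all-INT table lines up with the ~300 MB measurement above and makes the 40+ GB figure look anomalous:

```shell
# Rough model of expected tmp usage for engine-independent stats on
# this table, assuming columns are processed one at a time (an assumption)
rows=66126857   # row count from the bug report
width=4         # bytes per INT(11) value
peak_mib=$(( rows * width / 1024 / 1024 ))
echo "expected peak tmp usage: ~${peak_mib} MiB"   # ~252 MiB
```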

If you can give us temporary access to your test server, psergey is willing to investigate the disk usage problem "on site". If he discovers the reason, maybe it will turn out to be the root cause of the initial crash as well.

Comment by Elena Stepanova [ 2016-07-27 ]

thefsb, are you still experiencing the problem? And if you are, can you provide the access as suggested above?

Comment by Tom [ 2016-07-30 ]

Hi Elena,

I changed my app, removing the PERSISTENT FOR ALL clause and I no longer experience this problem. I cannot easily set up the access you would need. So that's all rather unhelpful. Sorry.

On the plus side, you can probably close this bug because I think I got some information wrong in the report. My app runs ANALYZE TABLE on about 10 tables in sequence, and I think I was wrong about which of them triggered this error. I now think it was the previous table, which does have VARCHARs.

Generated at Thu Feb 08 07:41:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.