[MCOL-487] too many binlog file in db directory Created: 2016-12-27  Updated: 2017-05-08  Resolved: 2017-05-08

Status: Closed
Project: MariaDB ColumnStore
Component/s: MariaDB Server
Affects Version/s: 1.0.6.1
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Si Tong Assignee: Unassigned
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

centos 6.5 x86_64


Attachments: File columnstoreSupportReport.columnstore-1.tar.gz     File mysql.err.tar.gz    

 Description   

env desc:
same with MCOL-486, but dbroot1 is handled by PM2, dbroot2 is handled by PM1.

reproduce step:
continue from MCOL-486, after restart machine of PM1 and mcs restored(with dbroot1 is handled by PM2, dbroot2 is handled by PM1), I left a.sh running for a whole night, on the next morning I came back only find the mcs can not logged in after swap size threshold(at about 10h) was reached and mcs performed a restartsystem but failed. it was similar with MCOL-486.

question:
1. BUT I failed to restore the system.
2. I found there were nearly 4000 binlog files in the mysql/db directory. every one is tiny, most of them contained 327 bytes, contents pasted below. why are there so many binlog files

[root@s237 db]# ./../bin/mysqlbinlog -vvv mysql-bin.003995
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#161226 21:13:20 server id 1  end_log_pos 249     Start: binlog v 4, server v 10.1.19-MariaDB created 161226 21:13:20 at startup
ROLLBACK/*!*/;
BINLOG '
QM5hWA8BAAAA9QAAAPkAAAAAAAQAMTAuMS4xOS1NYXJpYURCAGxvZwAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAABAzmFYEzgNAAgAEgAEBAQEEgAA3QAEGggAAAAICAgCAAAACgoKAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAEEwQAAJj+BD8=
'/*!*/;
# at 249
#161226 21:13:20 server id 1  end_log_pos 288     Gtid list [0-1-45]
# at 288
#161226 21:13:20 server id 1  end_log_pos 327     Binlog checkpoint mysql-bin.003995
# at 327
#161226 21:16:27 server id 1  end_log_pos 346     Stop
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

result file of columnstoreSupport and the log of mysql instance is attached.



 Comments   
Comment by David Thompson (Inactive) [ 2016-12-28 ]

i haven't seen this before, do the timestamps correlate to the failover time or are they spread over time? I assume your repeating query client stress test continued to run overnight?

Comment by Si Tong [ 2016-12-28 ]

yes i ran the query script all the night. nearly 15 hours.
I checked the log again, and got 3993 repeated log snippets like below in mariadb instance's log, this must be the reason for those binlog files.
but still not sure why the mariadb server fail to start again and again.
the mcs itself only restart once due to reach the swap threshold.

161226 21:13:16 mysqld_safe Number of processes running now: 0
161226 21:13:16 mysqld_safe mysqld restarted
2016-12-26 21:13:16 140623131330528 [Note] /usr/local/mariadb/columnstore/mysql//bin/mysqld (mysqld 10.1.19-MariaDB) starting as process 31725 ...
161226 21:13:16 Columnstore: Started; Version: 1.0.2-1
161226 21:13:16 InfiniDB: Started; Version: 1.0.2-1
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: The InnoDB memory heap is disabled
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: GCC builtin __sync_synchronize() is used for memory barrier
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: Compressed tables use zlib 1.2.3
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: Using generic crc32 instructions
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2016-12-26 21:13:16 140623131330528 [Note] InnoDB: Completed initialization of buffer pool
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Highest supported file format is Barracuda.
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: The log sequence numbers 1649465 and 1649465 in ibdata files do not match the log sequence number 1690575 in the ib_logfiles!
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Database was not shutdown normally!
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Starting crash recovery.
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Reading tablespace information from the .ibd files...
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Restoring possible half-written data pages
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Last MySQL binlog file position 0 9403, file name /usr/local/mariadb/columnstore/mysql/db/mysql-bin.000001
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: 128 rollback segment(s) are active.
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Waiting for purge to start
2016-12-26 21:13:17 140623131330528 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.32-79.0 started; log sequence number 1690575
2016-12-26 21:13:17 140621993998080 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-12-26 21:13:17 140623131330528 [Note] Plugin 'FEEDBACK' is disabled.
2016-12-26 21:13:17 140623131330528 [Note] Recovering after a crash using /usr/local/mariadb/columnstore/mysql/db/mysql-bin
2016-12-26 21:13:17 140623131330528 [Note] Starting crash recovery...
2016-12-26 21:13:17 140623131330528 [Note] Crash recovery finished.
2016-12-26 21:13:17 140623131330528 [Note] Server socket created on IP: '::'.
2016-12-26 21:13:17 140623131330528 [Note] /usr/local/mariadb/columnstore/mysql//bin/mysqld: ready for connections.
Version: '10.1.19-MariaDB' socket: '/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock' port: 3306 Columnstore 1.0.6-1
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 52 inet: 172.16.90.239 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 52 inet: 172.16.90.239 port: 8616
ControllerSegmentTable: RWLock() threw: No such file or directory
terminate called after throwing an instance of 'boost::interprocess::interprocess_exception'
what(): No such file or directory
161226 21:13:19 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.19-MariaDB
key_buffer_size=536870912
read_buffer_size=4194304
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1780777 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x388b4d8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fe524c4bd98 thread_stack 0x80000
/usr/local/mariadb/columnstore/mysql//bin/mysqld(my_print_stacktrace+0x29)[0xb33499]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(handle_fatal_signal+0x442)[0x71f292]
/lib64/libpthread.so.0[0x3bfde0f710]
/lib64/libc.so.6(gsignal+0x35)[0x3bfda32925]
/lib64/libc.so.6(abort+0x175)[0x3bfda34105]
/usr/lib64/libstdc++.so.6(ZN9gnu_cxx27_verbose_terminate_handlerEv+0x12d)[0x3c09abea5d]
/usr/lib64/libstdc++.so.6[0x3c09abcbe6]
/usr/lib64/libstdc++.so.6[0x3c09abcc13]
/usr/lib64/libstdc++.so.6[0x3c09abcc26]
/usr/lib64/libstdc++.so.6(__cxa_call_unexpected+0x43)[0x3c09abc2b3]
/usr/local/mariadb/columnstore/lib/libbrm.so.1(_ZN3BRM4DBRMC1Eb+0x3f5)[0x7fe536f91a75]
/usr/local/mariadb/columnstore/lib/libexecplan.so.1(_ZN8execplan14SessionManagerC2Ev+0x41)[0x7fe55d5f1581]
/usr/local/mariadb/columnstore/lib/libexecplan.so.1(_ZN8execplan20CalpontSystemCatalogC2Ev+0x4e5)[0x7fe55d57a135]
/usr/local/mariadb/columnstore/lib/libexecplan.so.1(_ZN8execplan20CalpontSystemCatalog24makeCalpontSystemCatalogEj+0x1cd)[0x7fe55d57a61d]
mysys/stacktrace.c:268(my_print_stacktrace)[0x7fe537d1a9e3]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN4JOIN13exec_infinidbEv+0x21f)[0x5fc8bf]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN4JOIN10exec_innerEv+0x98)[0x607ca8]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN4JOIN4execEv+0x4c)[0x60a25c]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x2be)[0x60a58e]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x276)[0x60b3d6]
sql/handler.h:2790(handler::ha_rnd_init(bool))[0x5bde69]
sql/sql_select.cc:2755(JOIN::exec_inner())[0x5be10e]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z16idb_parse_vtableP3THDR6StringNS_14infinidb_stateE+0x13c)[0x5be27c]
sql/sql_parse.cc:3464(mysql_execute_command(THD*))[0x5bf1ec]
sql/sql_parse.cc:1507(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x5c28be]
sql/sql_parse.cc:1128(do_command(THD*))[0x5c32fd]
sql/sql_connect.cc:1350(do_handle_one_connection(THD*))[0x67e0d4]
sql/sql_connect.cc:1264(handle_one_connection)[0x67e160]
/lib64/libpthread.so.0[0x3bfde079d1]
/lib64/libc.so.6(clone+0x6d)[0x3bfdae8b6d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fe4ec005f40): is an invalid pointer
Connection ID (thread ID): 3
Status: NOT_KILLED

Optimizer switch: index_merge=off,index_merge_union=off,index_merge_sort_union=off,index_merge_intersection=off,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=off,in_to_exists=on,semijoin=off,partial_match_rowid_merge=off,partial_match_table_scan=off,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=off,exists_to_in=on,orderby_uses_equalities=off

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

We think the query pointer is invalid, but we will try to print it anyway.
Query: create temporary table infinidb_vtable.$vtable_3 engine = aria as select * from t2 limit 1

Comment by David Thompson (Inactive) [ 2016-12-28 ]

I think this environment is the same as your other bugs? If so then almost certainly you are running into memory issues with 1gb vms. Columnstore by design will consume a lot more memory than a vanilla mariadb/innodb setup. Especially with the data and query volume you are using you should use more memory. Our prod recommendations are here: https://mariadb.com/kb/en/mariadb/columnstore-minimum-hardware-specification/, for development purposes i'd recommend a minimum of 4gb per server (with an absolute floor of 1.5gb based on our testing).

mysqld_safe is likely restarting the mysqld process due to memory issues. Similarly ProcMon will do the same to ExeMgr which will cause no queries to run.

Comment by David Thompson (Inactive) [ 2017-05-08 ]

This appears to be a symptom of an undersized environment so closing as not a bug.

Generated at Thu Feb 08 02:21:27 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.