[MDEV-14047] "Too many open files" Created: 2017-10-11  Updated: 2019-01-10

Status: Confirmed
Project: MariaDB Server
Component/s: Storage Engine - RocksDB
Affects Version/s: 10.2.9
Fix Version/s: 10.2

Type: Bug Priority: Minor
Reporter: Julien Muchembled Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: upstream-fixed

Issue Links:
Relates
relates to MDEV-14220 (draft) set global rocksdb_pause_back... Closed
relates to MDEV-13975 [out of disk space assert] I was uplo... Open

 Description   

Using default settings, I got the following error, once the DB reached 250 GB:

2017-10-10  7:31:56 140711626262272 [ERROR] LibRocksDB:[/opt/slapgrid/7c3e140d2be16dd0fc4001874fc875be/parts/mariadb__compile__/mariadb-10.2.9/storage/rocksdb/rocksdb/db/compaction_job.cc:1240] [default] [JOB 13239] OpenCompactionOutputFiles for table #28903 fails at NewWritableFile with status IO error: While open a file for appending: ./.rocksdb/028903.sst: Too many open files
2017-10-10  7:31:56 140711626262272 [ERROR] LibRocksDB:[/opt/slapgrid/7c3e140d2be16dd0fc4001874fc875be/parts/mariadb__compile__/mariadb-10.2.9/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1335] Waiting after background compaction error: IO error: While open a file for appending: ./.rocksdb/028903.sst: Too many open files, Accumulated background error counts: 1
mysqld: /opt/slapgrid/7c3e140d2be16dd0fc4001874fc875be/parts/mariadb__compile__/mariadb-10.2.9/storage/rocksdb/rocksdb/db/db_impl.cc:710: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
171010  7:31:57 [ERROR] mysqld got signal 6 ;

I increased limits so that it can reach 6 TB:

  • rocksdb_max_open_files = 32768
  • rocksdb_merge_buf_size = 256M

However, RocksDB continues to split the DB in files of 64 MB so I'm not sure I changed the correct variable.

Anyway, we think that RocksDB should adapt automatically as the DB grows.

edit: Increasing rocksdb_max_open_files does nothing if open_files_limit is not also increased. Maybe one of them should be calculated automatically from the other.



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2017-10-30 ]

I am able to reproduce the crash by generating write load with very low open_files_limit :

(requires 10.2.9 with RocksDB package installed in the system):

ulimit -n 60 || { echo "error: could not set open files limit ($?)" ; exit 1; }
 
env1=${ENVIRON:-m9-system2}
EXTRA_OPT=${EXTRA_OPT}
set -e
set -u
[ ! -d  mariadb-environs ] || cd mariadb-environs
 
if [ ! -e common.sh ] ; then
  git clone http://github.com/AndriiNikitin/mariadb-environs
  cd mariadb-environs
fi
 
./replant.sh $env1
./build_or_download.sh $env1
 
trap "exit" INT TERM
trap "read -n1 -r -p 'Test finished. Press any key to clean up processes...' key ; kill 0" EXIT
 
if ls $env1/configure_rocksdb_plugin.sh 2>/dev/null ; then
  EXTRA_OPT="configure_rocksdb_plugin=1 $EXTRA_OPT"
else
  EXTRA_OPT="plugin_load_add=ha_rocksdb $EXTRA_OPT"
fi
 
[[ $EXTRA_OPT =~ max_connections ]] || EXTRA_OPT="max_connections=250 $EXTRA_OPT"
 
$env1/gen_cnf.sh rocksdb_flush_log_at_trx_commit=2 $EXTRA_OPT
$env1/install_db.sh
$env1/startup.sh
 
$env1/sql.sh "create table tx (a int, b int, c varbinary(40), primary key(a,b,c) ) engine=RocksDB"
$env1/sql.sh "insert into tx select 1,1,uuid()"
$env1/sql.sh "insert into tx select 1,1,uuid() from tx a, tx b, tx c"
$env1/sql.sh "insert into tx select 1,1,uuid() from tx a, tx b, tx c"
$env1/sql.sh "insert into tx select 1,1,uuid() from tx a, tx b, tx c"
# should be ~1M rows without LIMIT
$env1/sql.sh "insert into tx select 1,1,uuid() from tx a, tx b LIMIT 100000"
 
# just try to load plenty of data from 10 connections in big chunks
for i in {0..100} ; do
  $env1/sql.sh "create table t$i (a int, b int, c varbinary(40), primary key(a,b,c) ) engine=RocksDB"
done
 
# insert loops
for i in {0..100} ; do
  $env1/sql_loop.sh "begin; set @N=0; insert into t$i select @N,@N:=@N+1,uuid() from tx limit 10000; commit" &
done
 
while :; do
# $env1/sql.sh 'show processlist; show global status like "Com_insert%"; show global status like "Handler_commit"'
$env1/status.sh
 
echo .rocksdb size:
du -sh $env1/dt/.rocksdb
echo log file count : $(ls -la $env1/dt/.rocksdb/*.log | wc -l)
echo sst file count : $(ls -la $env1/dt/.rocksdb/*.sst | wc -l)
 
sleep 10
done

Error log m9-system2/dt/error.log

2017-10-30 16:22:22 140464772102336 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.2.9-MariaDB-10.2.9+maria~zesty'  socket: '/home/a/bugs/mariadb-environs/m9-system2/dt/my.sock'  port: 3315  mariadb.org binary distribution
2017-10-30 16:22:41 140464772102336 [ERROR] Error in accept: Too many open files
2017-10-30 16:23:47 140463382181632 [ERROR] LibRocksDB:[/home/buildbot/buildbot/build/mariadb-10.2.9/storage/rocksdb/rocksdb/db/compaction_job.cc:1240] [default] [JOB 12] OpenCompactionOutputFiles for table #39 fails at NewWritableFile with status IO error: While open a file for appending: ./.rocksdb/000039.sst: Too many open files
2017-10-30 16:23:47 140463382181632 [ERROR] LibRocksDB:[/home/buildbot/buildbot/build/mariadb-10.2.9/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1335] Waiting after background compaction error: IO error: While open a file for appending: ./.rocksdb/000039.sst: Too many open files, Accumulated background error counts: 1
2017-10-30 16:23:47 140464447239936 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000039.sst: Too many open files
2017-10-30 16:23:47 140464447239936 [ERROR] MyRocks: aborting on WAL write error.
171030 16:23:47 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.2.9-MariaDB-10.2.9+maria~zesty
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=13
max_threads=252
thread_count=19
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 684736 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7fbfc40009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fc06d7b2cf8 thread_stack 0x49000
(my_addr_resolve failure: pipe(in))
/usr/sbin/mysqld(my_print_stacktrace+0x2e) [0x560b38ae875e]
/usr/sbin/mysqld(handle_fatal_signal+0x3cd) [0x560b385866cd]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11670) [0x7fc07f46c670]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x9f) [0x7fc07ea3877f]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7fc07ea3a37a]
/usr/lib/mysql/plugin/ha_rocksdb.so(myrocks::rdb_handle_io_error(rocksdb::Status, myrocks::RDB_IO_ERROR_TYPE)+0x6d) [0x7fc074f3778d]
/usr/lib/mysql/plugin/ha_rocksdb.so(myrocks::Rdb_transaction_impl::commit_no_binlog()+0x59) [0x7fc074f3d669]
/usr/lib/mysql/plugin/ha_rocksdb.so(+0x18acab) [0x7fc074f20cab]
/usr/sbin/mysqld(+0x65433e) [0x560b3858733e]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool)+0x3ae) [0x560b385899be]
/usr/sbin/mysqld(trans_commit(THD*)+0x5b) [0x560b384cd99b]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x3653) [0x560b383efa03]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool)+0x28a) [0x560b383f540a]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool)+0x1fb6) [0x560b383f8266]
/usr/sbin/mysqld(do_command(THD*)+0x167) [0x560b383f8ba7]
/usr/sbin/mysqld(do_handle_one_connection(CONNECT*)+0x25a) [0x560b384bf1ca]
/usr/sbin/mysqld(handle_one_connection+0x3d) [0x560b384bf33d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76da) [0x7fc07f4626da]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f) [0x7fc07eb0bd7f]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fbfc400ef70): is an invalid pointer
Connection ID (thread ID): 813
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway.
Query: commit

commit shouldn't crash on resource limit and just return user error.

Comment by Andrii Nikitin (Inactive) [ 2017-10-30 ]

jmuchemb I think you need to increase only global open_files_limit , leaving rocksdb_max_open_files at default value -1, unless you have reasons to have defined value for rocksdb_max_open_files.
could you provide outputs from problem system?

du -sh .rocksdb
echo log file count : $(ls -la .rocksdb/*.log | wc -l)
echo sst file count : $(ls -la .rocksdb/*.sst | wc -l)

Comment by Andrii Nikitin (Inactive) [ 2017-10-30 ]

To reproduce with mtr one can use:

EDIT: this command alone doesn't actually reproduce the problem, see further comments

./mtr bulk_load --suite=rocksdb --mysqld=--open-files-limit=60

~/env1/_depot/m-branch/m2-10.2-bld/mysql-test$ ./mtr bulk_load --suite=rocksdb --mysqld=--open-files-limit=60
Logging: /home/a/env1/m2-10.2/src/mysql-test/mysql-test-run.pl  bulk_load --suite=rocksdb --mysqld=--open-files-limit=60
vardir: /home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var'...
Checking supported features...
MariaDB Version 10.2.10-MariaDB
 - SSL connections supported
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rocksdb.bulk_load                        [ fail ]
        Test ended at 2017-10-30 17:04:01
 
CURRENT_TEST: rocksdb.bulk_load
mysqltest: In included file "/home/a/env1/_depot/m-branch/m2-10.2-src/storage/rocksdb/mysql-test/rocksdb/t/bulk_load.inc": 
included from /home/a/env1/_depot/m-branch/m2-10.2-src/storage/rocksdb/mysql-test/rocksdb/t/bulk_load.test at line 11:
At line 103: query 'LOAD DATA INFILE '$file' INTO TABLE t1' failed: 2013: Lost connection to MySQL server during query
 
The result from queries just before the failure was:
< snip >
CREATE TABLE t3(
pk CHAR(5),
a CHAR(30),
b CHAR(30),
PRIMARY KEY(pk) COMMENT "cf1",
KEY(a)
) COLLATE 'latin1_bin' PARTITION BY KEY() PARTITIONS 4;
connect  other,localhost,root,,;
set session transaction isolation level repeatable read;
select * from information_schema.rocksdb_dbstats where stat_type='DB_NUM_SNAPSHOTS';
STAT_TYPE	VALUE
DB_NUM_SNAPSHOTS	0
start transaction with consistent snapshot;
select * from information_schema.rocksdb_dbstats where stat_type='DB_NUM_SNAPSHOTS';
STAT_TYPE	VALUE
DB_NUM_SNAPSHOTS	1
connection default;
set rocksdb_bulk_load=1;
set rocksdb_bulk_load_size=100000;
LOAD DATA INFILE <input_file> INTO TABLE t1;
 
More results from queries before failure can be found in /home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/log/bulk_load.log
 
 
Server [mysqld.1 - pid: 10802, winpid: 10802, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2017-10-30 17:02:58 139724792207168 [Note] /home/a/env1/m2-10.2/bld/sql/mysqld (mysqld 10.2.10-MariaDB-log) starting as process 10803 ...
2017-10-30 17:02:58 139724792207168 [Warning] Could not increase number of max_open_files to more than 80 (request: 4162)
2017-10-30 17:02:58 139724792207168 [Note] RocksDB: Got ENOENT when listing column families
2017-10-30 17:02:58 139724792207168 [Note] RocksDB:   assuming that we're creating a new database
2017-10-30 17:02:58 139724792207168 [Note] RocksDB: Column Families at start:
2017-10-30 17:02:58 139724792207168 [Note]   cf=default
2017-10-30 17:02:58 139724792207168 [Note]     write_buffer_size=67108864
2017-10-30 17:02:58 139724792207168 [Note]     target_file_size_base=67108864
2017-10-30 17:02:58 139724792207168 [Note] RocksDB: creating a column family __system__
2017-10-30 17:02:58 139724792207168 [Note]     write_buffer_size=67108864
2017-10-30 17:02:58 139724792207168 [Note]     target_file_size_base=67108864
2017-10-30 17:02:58 139724792207168 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
2017-10-30 17:02:58 139724792207168 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
2017-10-30 17:02:58 139724792207168 [Note] MyRocks storage engine plugin has been successfully initialized.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'SEQUENCE' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'InnoDB' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_DATAFILES' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_LOCKS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_MUTEXES' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_CMPMEM' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_CMP_PER_INDEX' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_CMP' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_INDEXES' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_BUFFER_PAGE_LRU' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_COLUMNS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_FOREIGN_COLS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_TRX' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_FOREIGN' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_TABLES' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_TABLESPACES_SCRUBBING' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_SEMAPHORE_WAITS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_FIELDS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_METRICS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'user_variables' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_BUFFER_POOL_STATS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_SYS_VIRTUAL' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'INNODB_BUFFER_PAGE' is disabled.
2017-10-30 17:02:58 139724792207168 [Note] Plugin 'FEEDBACK' is disabled.
2017-10-30 17:02:58 139724792207168 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-feedback-debug-startup-interval=20'
2017-10-30 17:02:58 139724792207168 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-feedback-debug-first-interval=60'
2017-10-30 17:02:58 139724792207168 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-feedback-debug-interval=60'
2017-10-30 17:02:58 139724792207168 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown option '--loose-pam-debug'
2017-10-30 17:02:58 139724792207168 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown option '--loose-enable-rocksdb_lock_info'
2017-10-30 17:02:58 139724792207168 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-debug-sync-timeout=300'
2017-10-30 17:02:58 139724792207168 [Note] Server socket created on IP: '127.0.0.1'.
2017-10-30 17:02:58 139724792207168 [Note] /home/a/env1/m2-10.2/bld/sql/mysqld: ready for connections.
Version: '10.2.10-MariaDB-log'  socket: '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/tmp/mysqld.1.sock'  port: 16000  Source distribution
2017-10-30 17:02:58 139724567672576 [Note] RocksDB: Manual memtable flush.
2017-10-30 17:02:59 139724567672576 [Note] RocksDB: creating a column family cf1
2017-10-30 17:02:59 139724567672576 [Note]     write_buffer_size=67108864
2017-10-30 17:02:59 139724567672576 [Note]     target_file_size_base=67108864
2017-10-30 17:04:00 139724595984128 [ERROR] LibRocksDB:[/home/a/env1/m2-10.2/src/storage/rocksdb/rocksdb/db/compaction_job.cc:1275] [default] [JOB 7] OpenCompactionOutputFiles for table #26 fails at NewWritableFile with status IO error: While open a file for appending: ./.rocksdb/000026.sst: Too many open files
2017-10-30 17:04:00 139724595984128 [ERROR] LibRocksDB:[/home/a/env1/m2-10.2/src/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1359] Waiting after background compaction error: IO error: While open a file for appending: ./.rocksdb/000026.sst: Too many open files, Accumulated background error counts: 1
2017-10-30 17:04:00 139724567672576 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000026.sst: Too many open files
2017-10-30 17:04:00 139724567672576 [ERROR] MyRocks: aborting on WAL write error.
171030 17:04:00 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.2.10-MariaDB-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=2
max_threads=153
thread_count=3
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 62996 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f14040009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f142938ee78 thread_stack 0x49000
(my_addr_resolve failure: pipe(out))
/home/a/env1/m2-10.2/bld/sql/mysqld(my_print_stacktrace+0x29) [0x55e157cce9d9]
/home/a/env1/m2-10.2/bld/sql/mysqld(handle_fatal_signal+0x3bd) [0x55e15780be6d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11670) [0x7f14356a4670]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x9f) [0x7f1434a5977f]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7f1434a5b37a]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::rdb_handle_io_error(rocksdb::Status, myrocks::RDB_IO_ERROR_TYPE)+0x6e) [0x7f142bb0f63e]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::Rdb_transaction_impl::commit_no_binlog()+0xaf) [0x7f142bb1544f]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::ha_rocksdb::do_bulk_commit(myrocks::Rdb_transaction*)+0x56) [0x7f142bb05676]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::ha_rocksdb::update_write_row(unsigned char const*, unsigned char const*, bool)+0xf0) [0x7f142bb07290]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::ha_rocksdb::write_row(unsigned char*)+0x38) [0x7f142bb05a28]
/home/a/env1/m2-10.2/bld/sql/mysqld(handler::ha_write_row(unsigned char*)+0x92) [0x55e1578153b2]
/home/a/env1/m2-10.2/bld/sql/mysqld(write_record(THD*, TABLE*, st_copy_info*)+0x6a) [0x55e15767023a]
/home/a/env1/m2-10.2/bld/sql/mysqld(mysql_load(THD*, sql_exchange*, TABLE_LIST*, List<Item>&, List<Item>&, List<Item>&, enum_duplicates, bool, bool)+0x2485) [0x55e15792fea5]
/home/a/env1/m2-10.2/bld/sql/mysqld(mysql_execute_command(THD*)+0x196f) [0x55e15768c07f]
/home/a/env1/m2-10.2/bld/sql/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool)+0x27d) [0x55e15769332d]
/home/a/env1/m2-10.2/bld/sql/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool)+0x1fd4) [0x55e1576960f4]
/home/a/env1/m2-10.2/bld/sql/mysqld(do_command(THD*)+0x161) [0x55e157696a31]
/home/a/env1/m2-10.2/bld/sql/mysqld(do_handle_one_connection(CONNECT*)+0x234) [0x55e157751064]
/home/a/env1/m2-10.2/bld/sql/mysqld(handle_one_connection+0x34) [0x55e1577511d4]
/home/a/env1/m2-10.2/bld/sql/mysqld(+0xab5144) [0x55e157c97144]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76da) [0x7f143569a6da]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f) [0x7f1434b2cd7f]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f140400f010): is an invalid pointer
Connection ID (thread ID): 4
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway. 
Query: LOAD DATA INFILE '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/mysqld.1/data/test_loadfile.txt' INTO TABLE t1
 
Writing a core file
----------SERVER LOG END-------------
 
 
 - saving '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/log/rocksdb.bulk_load/' to '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/log/rocksdb.bulk_load/'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 69 seconds executing testcases
 
Failure: Failed 1/1 tests, 0.00% were successful.
 
Failing test(s): rocksdb.bulk_load
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

Of course open-files-limit=60 is kind of artificial limitation, but it is still good to demonstrate behavior of engine when limit is reached

Comment by Julien Muchembled [ 2017-10-30 ]

# du -sh .rocksdb
2.4T    .rocksdb
# echo log file count : $(ls -la .rocksdb/*.log | wc -l)
log file count : 5
# echo sst file count : $(ls -la .rocksdb/*.sst | wc -l)
sst file count : 38696
# { cd /proc/32224/fd; echo *; } |wc -w
26206

Comment by Andrii Nikitin (Inactive) [ 2017-10-30 ]

Thank you for the output. To avoid hitting the problem again - I think you should try to keep open_files_limit "high" and rocksdb_max_open_files "low".
Exact numbers may depend on many factors like number of tables in non-RocksDB engines, table_open_cache size, number of temporary tables, number of user connections, etc.
So, I'd try to stick with number rocksdb_max_open_files=10K, maybe up to 20K if you have related performance considerations . Then total file limit shouldn't hit 30K and problem shouldn't happen if open_files_limit is set more than 30K

I expect RocksDB should manage well 40K+ files with 20K limit for rocksdb_max_open_files, maybe even with 10K limit. E.g. when I add to mtr test `--mysqld=--rocksdb_max_open_files=20` together with `--mysqld=--open-files-limit=60` - no crash happens: <- EDIT: the crash still happens even with low rocksdb_max_open_files - see comment below

Comment by Andrii Nikitin (Inactive) [ 2017-10-31 ]

I couldn't reproduce crash with mtr command from earlier comment anymore, so realized that actual steps are:

ulimit -n 60
./mtr bulk_load --suite=rocksdb

Unfortunately providing --mysqld=--rocksdb_max_open_files=5 to the test doesn't help and the case still crashes exactly the same way.
Sometimes it has little different stack when meeting the same problem with .log file:

./mtr bulk_load --suite=rocksdb --mysqld=--rocksdb_max_open_files=5
Logging: /home/a/env1/m2-10.2/src/mysql-test/mysql-test-run.pl  bulk_load --suite=rocksdb --mysqld=--rocksdb_max_open_files=5
vardir: /home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var'...
Checking supported features...
MariaDB Version 10.2.10-MariaDB
 - SSL connections supported
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rocksdb.bulk_load                        [ fail ]
        Test ended at 2017-10-31 10:15:16
 
CURRENT_TEST: rocksdb.bulk_load
mysqltest: In included file "/home/a/env1/_depot/m-branch/m2-10.2-src/storage/rocksdb/mysql-test/rocksdb/t/bulk_load.inc": 
included from /home/a/env1/_depot/m-branch/m2-10.2-src/storage/rocksdb/mysql-test/rocksdb/t/bulk_load.test at line 11:
At line 105: query 'LOAD DATA INFILE '$file' INTO TABLE t2' failed: 2013: Lost connection to MySQL server during query
 
The result from queries just before the failure was:
< snip >
pk CHAR(5),
a CHAR(30),
b CHAR(30),
PRIMARY KEY(pk) COMMENT "cf1",
KEY(a)
) COLLATE 'latin1_bin' PARTITION BY KEY() PARTITIONS 4;
connect  other,localhost,root,,;
set session transaction isolation level repeatable read;
select * from information_schema.rocksdb_dbstats where stat_type='DB_NUM_SNAPSHOTS';
STAT_TYPE	VALUE
DB_NUM_SNAPSHOTS	0
start transaction with consistent snapshot;
select * from information_schema.rocksdb_dbstats where stat_type='DB_NUM_SNAPSHOTS';
STAT_TYPE	VALUE
DB_NUM_SNAPSHOTS	1
connection default;
set rocksdb_bulk_load=1;
set rocksdb_bulk_load_size=100000;
LOAD DATA INFILE <input_file> INTO TABLE t1;
LOAD DATA INFILE <input_file> INTO TABLE t2;
 
More results from queries before failure can be found in /home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/log/bulk_load.log
 
 
Server [mysqld.1 - pid: 6760, winpid: 6760, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2017-10-31 10:14:12 140398942340928 [Note] /home/a/env1/m2-10.2/bld/sql/mysqld (mysqld 10.2.10-MariaDB-log) starting as process 6761 ...
2017-10-31 10:14:12 140398942340928 [Warning] Could not increase number of max_open_files to more than 80 (request: 4162)
2017-10-31 10:14:12 140398942340928 [Note] RocksDB: Got ENOENT when listing column families
2017-10-31 10:14:12 140398942340928 [Note] RocksDB:   assuming that we're creating a new database
2017-10-31 10:14:12 140398942340928 [Note] RocksDB: Column Families at start:
2017-10-31 10:14:12 140398942340928 [Note]   cf=default
2017-10-31 10:14:12 140398942340928 [Note]     write_buffer_size=67108864
2017-10-31 10:14:12 140398942340928 [Note]     target_file_size_base=67108864
2017-10-31 10:14:12 140398942340928 [Note] RocksDB: creating a column family __system__
2017-10-31 10:14:12 140398942340928 [Note]     write_buffer_size=67108864
2017-10-31 10:14:12 140398942340928 [Note]     target_file_size_base=67108864
2017-10-31 10:14:13 140398942340928 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
2017-10-31 10:14:13 140398942340928 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
2017-10-31 10:14:13 140398942340928 [Note] MyRocks storage engine plugin has been successfully initialized.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'SEQUENCE' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'InnoDB' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_DATAFILES' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_LOCKS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_MUTEXES' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_CMPMEM' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_CMP_PER_INDEX' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_CMP' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_INDEXES' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_BUFFER_PAGE_LRU' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_COLUMNS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_FOREIGN_COLS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_TRX' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_FOREIGN' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_TABLES' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_TABLESPACES_SCRUBBING' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_SEMAPHORE_WAITS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_FIELDS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_METRICS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'user_variables' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_BUFFER_POOL_STATS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_SYS_VIRTUAL' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'INNODB_BUFFER_PAGE' is disabled.
2017-10-31 10:14:13 140398942340928 [Note] Plugin 'FEEDBACK' is disabled.
2017-10-31 10:14:13 140398942340928 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-feedback-debug-startup-interval=20'
2017-10-31 10:14:13 140398942340928 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-feedback-debug-first-interval=60'
2017-10-31 10:14:13 140398942340928 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-feedback-debug-interval=60'
2017-10-31 10:14:13 140398942340928 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown option '--loose-pam-debug'
2017-10-31 10:14:13 140398942340928 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown option '--loose-enable-rocksdb_lock_info'
2017-10-31 10:14:13 140398942340928 [Warning] /home/a/env1/m2-10.2/bld/sql/mysqld: unknown variable 'loose-debug-sync-timeout=300'
2017-10-31 10:14:13 140398942340928 [Note] Server socket created on IP: '127.0.0.1'.
2017-10-31 10:14:13 140398942340928 [Note] /home/a/env1/m2-10.2/bld/sql/mysqld: ready for connections.
Version: '10.2.10-MariaDB-log'  socket: '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/tmp/mysqld.1.sock'  port: 16000  Source distribution
2017-10-31 10:14:13 140398793119488 [Note] RocksDB: Manual memtable flush.
2017-10-31 10:14:13 140398793119488 [Note] RocksDB: creating a column family cf1
2017-10-31 10:14:13 140398793119488 [Note]     write_buffer_size=67108864
2017-10-31 10:14:13 140398793119488 [Note]     target_file_size_base=67108864
2017-10-31 10:15:16 140398793119488 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000028.log: Too many open files
2017-10-31 10:15:16 140398793119488 [ERROR] MyRocks: aborting on WAL write error.
171031 10:15:16 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.2.10-MariaDB-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=2
max_threads=153
thread_count=3
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 62996 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7fb1000009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fb12430ce78 thread_stack 0x49000
(my_addr_resolve failure: pipe(in))
/home/a/env1/m2-10.2/bld/sql/mysqld(my_print_stacktrace+0x29) [0x5602701b89d9]
/home/a/env1/m2-10.2/bld/sql/mysqld(handle_fatal_signal+0x3bd) [0x56026fcf5e6d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11670) [0x7fb12be4f670]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x9f) [0x7fb12b20477f]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7fb12b20637a]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::rdb_handle_io_error(rocksdb::Status, myrocks::RDB_IO_ERROR_TYPE)+0x6e) [0x7fb1262e463e]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::Rdb_transaction_impl::commit_no_binlog()+0xaf) [0x7fb1262ea44f]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::ha_rocksdb::do_bulk_commit(myrocks::Rdb_transaction*)+0x56) [0x7fb1262da676]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::ha_rocksdb::update_write_row(unsigned char const*, unsigned char const*, bool)+0xf0) [0x7fb1262dc290]
/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/plugins/ha_rocksdb.so(myrocks::ha_rocksdb::write_row(unsigned char*)+0x38) [0x7fb1262daa28]
/home/a/env1/m2-10.2/bld/sql/mysqld(handler::ha_write_row(unsigned char*)+0x92) [0x56026fcff3b2]
/home/a/env1/m2-10.2/bld/sql/mysqld(write_record(THD*, TABLE*, st_copy_info*)+0x6a) [0x56026fb5a23a]
/home/a/env1/m2-10.2/bld/sql/mysqld(mysql_load(THD*, sql_exchange*, TABLE_LIST*, List<Item>&, List<Item>&, List<Item>&, enum_duplicates, bool, bool)+0x2485) [0x56026fe19ea5]
/home/a/env1/m2-10.2/bld/sql/mysqld(mysql_execute_command(THD*)+0x196f) [0x56026fb7607f]
/home/a/env1/m2-10.2/bld/sql/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool)+0x27d) [0x56026fb7d32d]
/home/a/env1/m2-10.2/bld/sql/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool)+0x1fd4) [0x56026fb800f4]
/home/a/env1/m2-10.2/bld/sql/mysqld(do_command(THD*)+0x161) [0x56026fb80a31]
/home/a/env1/m2-10.2/bld/sql/mysqld(do_handle_one_connection(CONNECT*)+0x234) [0x56026fc3b064]
/home/a/env1/m2-10.2/bld/sql/mysqld(handle_one_connection+0x34) [0x56026fc3b1d4]
/home/a/env1/m2-10.2/bld/sql/mysqld(+0xab5144) [0x560270181144]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76da) [0x7fb12be456da]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f) [0x7fb12b2d7d7f]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fb10000f010): is an invalid pointer
Connection ID (thread ID): 4
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway. 
Query: LOAD DATA INFILE '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/mysqld.1/data/test_loadfile.txt' INTO TABLE t2
 
Writing a core file
----------SERVER LOG END-------------
 
 
 - saving '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/log/rocksdb.bulk_load/' to '/home/a/env1/_depot/m-branch/m2-10.2-bld/mysql-test/var/log/rocksdb.bulk_load/'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 69 seconds executing testcases
 
Failure: Failed 1/1 tests, 0.00% were successful.
 
Failing test(s): rocksdb.bulk_load
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

Comment by Andrii Nikitin (Inactive) [ 2017-10-31 ]

I've tried to reproduce problem with myrocks 5.6 master branch, but after some struggling couldn't make bulk_load tests pass there.

But the sh script with current trunk of MyRocks 5.6 installed and modifications below

$ diff ~/bugs/MDEV-14047.sh ~/bugs/MDEV-14047-myrocks.sh
19a20,21
> EXTRA_OPT="$EXTRA_OPT skip-innodb rocksdb default-storage-engine=rocksdb default-tmp-storage-engine=MyISAM"
> 
29c31,33
< $env1/install_db.sh
---
> # $env1/install_db.sh
> export PATH=$PATH:/usr/local/mysql/bin
> /usr/local/mysql/scripts/mysql_install_db --defaults-file=/home/a/env1/$env1/my.cnf --user=$(whoami) --basedir=/usr/local/mysql --force

shows the same crash after few minutes :

2017-10-31 11:50:56 11189 [Note] MySQL git hash: f7bbd1da8d6651a3017b6da4ca0d08d0f13df6f4 - 2017-10-27T20:38:07-07:00
2017-10-31 11:50:56 11189 [Note] RocksDB git hash: 6a9335dbbb3ecc87fb1cb8701be39b3ec16beb20 - 2017-10-27T15:56:35-07:00
2017-10-31 11:50:56 11189 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.6.35'  socket: '/home/a/env1/m9-system2/dt/my.sock'  port: 3315  Source distribution
...
2017-10-31 11:51:20 11189 [ERROR] /usr/local/mysql/bin/mysqld: Can't open file: './test/t73.frm' (errno: 24 - Too many open files)
2017-10-31 11:51:26 11189 [ERROR] RocksDB: Error detected in background, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000044.sst: Too many open files
2017-10-31 11:51:26 11189 [ERROR] RocksDB: BackgroundErrorReason: 0
2017-10-31 11:51:26 11189 [ERROR] LibRocksDB:[/home/a/env1/_depot/r-branch/master/rocksdb/db/db_impl_compaction_flush.cc:1292] Waiting after background flush error: IO error: While open a file for appending: ./.rocksdb/000044.sst: Too many open filesAccumulated background error counts: 1
2017-10-31 11:51:27 11189 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000044.sst: Too many open files
2017-10-31 11:51:27 11189 [ERROR] MyRocks: aborting on WAL write error.
 
The current stack traces:
/usr/local/mysql/bin/mysqld(my_pstack+0x37) [0x55eb99b1d657]
/usr/local/mysql/bin/mysqld(_ZN7myrocks19rdb_handle_io_errorEN7rocksdb6StatusENS_17RDB_IO_ERROR_TYPEE+0x6d) [0x55eb99d6fe5d]
/usr/local/mysql/bin/mysqld(_ZN7myrocks20Rdb_transaction_impl16commit_no_binlogEv+0x189) [0x55eb99d77519]
/usr/local/mysql/bin/mysqld(+0xbb8745) [0x55eb99d54745]
/usr/local/mysql/bin/mysqld(_Z13ha_commit_lowP3THDbb+0xaa) [0x55eb99736eba]
/usr/local/mysql/bin/mysqld(_ZN12TC_LOG_DUMMY6commitEP3THDbb+0x15) [0x55eb9980b475]
/usr/local/mysql/bin/mysqld(_Z15ha_commit_transP3THDbbb+0xdd) [0x55eb9973e1dd]
/usr/local/mysql/bin/mysqld(_Z12trans_commitP3THDb+0x4a) [0x55eb9996049a]
/usr/local/mysql/bin/mysqld(_Z21mysql_execute_commandP3THDPyS1_+0x13c8) [0x55eb998bfaa8]
/usr/local/mysql/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statePyS1_+0x63e) [0x55eb998c586e]
/usr/local/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjP11Srv_session+0x11a1) [0x55eb998c7041]
/usr/local/mysql/bin/mysqld(_Z10do_commandP3THD+0x15d) [0x55eb998c932d]
/usr/local/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x46e) [0x55eb998858de]
/usr/local/mysql/bin/mysqld(handle_one_connection+0x9) [0x55eb99885b89]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76da) [0x7ffb63a8e6da]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f) [0x7ffb62f20d7f]
 
All stack traces from pstack 11189:
Couldn't run "pstack 11189" by popen.
10:51:27 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=31
max_threads=250
thread_count=24
connection_count=24
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 108916 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x55eb9bd051d0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7ffb60152ad8 thread_stack 0x50000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x3b)[0x55eb99b1d5bb]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x406)[0x55eb9982c256]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11670)[0x7ffb63a98670]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x9f)[0x7ffb62e4d77f]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7ffb62e4f37a]
/usr/local/mysql/bin/mysqld(_ZN7myrocks19rdb_handle_io_errorEN7rocksdb6StatusENS_17RDB_IO_ERROR_TYPEE+0x72)[0x55eb99d6fe62]
/usr/local/mysql/bin/mysqld(_ZN7myrocks20Rdb_transaction_impl16commit_no_binlogEv+0x189)[0x55eb99d77519]
/usr/local/mysql/bin/mysqld(+0xbb8745)[0x55eb99d54745]
/usr/local/mysql/bin/mysqld(_Z13ha_commit_lowP3THDbb+0xaa)[0x55eb99736eba]
/usr/local/mysql/bin/mysqld(_ZN12TC_LOG_DUMMY6commitEP3THDbb+0x15)[0x55eb9980b475]
/usr/local/mysql/bin/mysqld(_Z15ha_commit_transP3THDbbb+0xdd)[0x55eb9973e1dd]
2017-10-31 11:51:27 11189 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000044.sst: Too many open files
2017-10-31 11:51:27 11189 [ERROR] MyRocks: aborting on WAL write error.
/usr/local/mysql/bin/mysqld(_Z12trans_commitP3THDb+0x4a)[0x55eb9996049a]
/usr/local/mysql/bin/mysqld(_Z21mysql_execute_commandP3THDPyS1_+0x13c8)[0x55eb998bfaa8]
2017-10-31 11:51:27 11189 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: While open a file for appending: ./.rocksdb/000044.sst: Too many open files
2017-10-31 11:51:27 11189 [ERROR] MyRocks: aborting on WAL write error.

Comment by Sergei Petrunia [ 2017-11-01 ]

Investigated how options are handled:

Ok, --open_files_limit takes the value from my.cnf (or mysqld argument) and then

  • uses setrlimit() adjust the OS limit of max file descriptors //
  • takes the value that we end up using and stores it in @@open_files_limit (so it's visible to the user)

After that --rocksdb_open_files_limit value is passed to RocksDB.

Its value is also clipped to the "original" max number of open descriptors (but NOT the value that was set at step ). Below is the stack trace of where the
clipping happens.

However, the user-visible value of @@rocksdb_open_files_limit() remains as it was specified in my.cnf file. The result of clipping is not visible.

  #0  rocksdb::SanitizeOptions (dbname=..., src=...) at /home/psergey/dev-git/10.2-mariarocks/storage/rocksdb/rocksdb/db/db_impl_open.cc:43
  #1  0x00007fffe7755bbf in rocksdb::DBImpl::DBImpl (this=0x555557bde100, options=..., dbname=...) at /home/psergey/dev-git/10.2-mariarocks/storage/rocksdb/rocksdb/db/db_impl.cc:194
  #2  0x00007fffe77cc5b9 in rocksdb::DB::Open (db_options=..., dbname=..., column_families=..., handles=0x7fffffffcdb0, dbptr=0x7fffffffc948) at /home/psergey/dev-git/10.2-mariarocks/storage/rocksdb/rocksdb/db/db_impl_open.cc:951
  #3  0x00007fffe79edc8c in rocksdb::TransactionDB::Open (db_options=..., txn_db_options=..., dbname=..., column_families=..., handles=0x7fffffffcdb0, dbptr=0x7fffe7fd7e08 <myrocks::rdb>) at /home/psergey/dev-git/10.2-mariarocks/storage/rocksdb/rocksdb/utilities/transactions/pessimistic_transaction_db.cc:201
  #4  0x00007fffe768b22c in myrocks::rocksdb_init_func (p=0x555557ba9490) at /home/psergey/dev-git/10.2-mariarocks/storage/rocksdb/ha_rocksdb.cc:4208
  #5  0x0000555555d7ab38 in ha_initialize_handlerton (plugin=0x555557991208) at /home/psergey/dev-git/10.2-mariarocks/sql/handler.cc:520
  #6  0x0000555555b1c888 in plugin_initialize (tmp_root=0x7fffffffd860, plugin=0x555557991208, argc=0x555557094530 <remaining_argc>, argv=0x555557931438, options_only=false) at /home/psergey/dev-git/10.2-mariarocks/sql/sql_plugin.cc:1411
  #7  0x0000555555b1d4d6 in plugin_init (argc=0x555557094530 <remaining_argc>, argv=0x555557931438, flags=2) at /home/psergey/dev-git/10.2-mariarocks/sql/sql_plugin.cc:1693
  #8  0x0000555555a292a0 in init_server_components () at /home/psergey/dev-git/10.2-mariarocks/sql/mysqld.cc:5279
  #9  0x0000555555a2a343 in mysqld_main (argc=19, argv=0x555557931438) at /home/psergey/dev-git/10.2-mariarocks/sql/mysqld.cc:5869

Questions:

  • Should @@rocksdb_open_files_limit be made to reflect the actual limit that is being used (like open_files_limit does)?
  • Should myrocks::rocksdb_init_func (that is, MyRocks) adjust main_opts.max_open_files to not be bigger than the clipped value of @@open_files_limit ?
    Actually, since file descriptors are used for other purposes besides MyRocks, it could be e.g. 0.9 *open_files_limit
    ?
Comment by Andrii Nikitin (Inactive) [ 2017-11-01 ]

In my understanding no matter how we try to auto-adjust rocksdb_open_files_limit - there will be practical situations where IO attempt will receive errno: 24.
And the Engine must survive that error, e.g. just returning user error the same way when Server fails to open .frm file.

So I suggest looking at auto-adjustment of rocksdb_open_files_limit in another ticket.
AFAIK current formula for adjusting table_open_cache and max_connections is something like :
open_files_limit > 9 + 2*table_open_cache + max_connections

That comes with assumption that for 100 open MyISAM tables and 50 simultaneous connections server will need at least 2*100+50 file descriptors. (both .MYI and .MYD files are counted).
In my understanding RocksDB may be hungry for file descriptors without correlation to neither table_open_cache nor max_connections. (E.g. huge single RocksDB table may be stored in many .sst files).
So I think we should try to investigate something like :

long_size_t requested_limit =  9 + 2*configured_table_open_cache + configured_max_connections + configured_rocksdb_open_files_limit ;
double koefficient = ( requested_limit < open_files_limit ) ? 1 : requested_limit / open_files_limit ;
table_open_cache=configured_table_open_cache / koefficient;
max_connections=configured_table_open_cache / koefficient;
rocksdb_open_files_limits=configured_table_open_cache / koefficient;

Comment by Sergei Petrunia [ 2017-11-02 ]

In my understanding no matter how we try to auto-adjust rocksdb_open_files_limit - there will be practical situations where IO attempt will receive errno: 24.

Well if rocksdb_open_files_limit is sufficiently low and all non-RocksDB file descriptor usage utilizes not more than open_files_limit - rocksdb_open_files_limit descriptors, we should be fine, right?

And the Engine must survive that error, e.g. just returning user error the same way when Server fails to open .frm file.

It could actually do better - it could try closing a few files it has open and retry.
I am not sure about how complex it is implement this, though.

In my understanding RocksDB may be hungry for file descriptors without correlation to neither table_open_cache nor max_connections. (E.g. huge single RocksDB table may be stored in many .sst files).

Correct.

So I think we should try to investigate something like :

Does this imply that the values of table_open_cache and max_connections should be "auto-magically" adjusted when MyRocks plugin is loaded ?

Comment by Sergei Petrunia [ 2017-11-03 ]

For the record, I was able to reproduce.

I was loading data (concurrently, without bulk load modes) until the datadir was about 64G, and there were 1048 SST files.

With that size, the server is not able to start when one is using ulimit -n 512. It fails with a message:

2017-11-03  9:13:56 140046930872128 [ERROR] LibRocksDB:[/home/ubuntu/mariadb-10.2/storage/
 rocksdb/rocksdb/db/version_set.cc:2624] [default] Error in committing version edit to MANIFEST: 
...
2017-11-03  9:13:56 140046930872128 [ERROR] RocksDB: Error opening instance, Status Code: 5, 
  Status: IO error: While open a file for appending: ./.rocksdb/019884.dbtmp: Too many open files

Then I've started mysqld again under ulimit -n 1200 and continued data loading. Datadir grew to 1108 SST files and 68 G and then the server crashed:

2017/11/03-12:16:50.546730 7f3cf2907700 [/home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/flush_job.cc:293] [default] [JOB 297] Level-0 flush table #21189: started
2017/11/03-12:16:50.546750 7f3cf2907700 [/home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/flush_job.cc:324] [default] [JOB 297] Level-0 flush table #21189: 0 bytes IO error: While open a file for appending: ./.rocksdb/021189.sst: Too many open files
2017/11/03-12:16:50.546769 7f3cf2907700 [ERROR] [/home/ubuntu/mariadb-10.2/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1279] Waiting after background flush error: IO error: While open a file for appending: ./.rocksdb/021189.sst: Too many open filesAccumulated background error counts: 1

Looking at what limits are there by default:

ubuntu@ip-172-31-26-163:~$ ulimit -n
1024
ubuntu@ip-172-31-26-163:~$ ulimit -n -H
1048576

Started mysqld and examined its /proc/$PID/limits :

Limit                     Soft Limit           Hard Limit           Units
Max open files            4162                 4162                 files

That number goes higher if one sets higher open_files_limit.

I think for the moment we can de-prioritize this bug
and just add a note to the tuning guide that those who which to have >200G of data need to increase their open_files_limit .
anikitin any objections?

Comment by Sergei Petrunia [ 2017-11-04 ]

Also asked on MyRocks-Dev: https://groups.google.com/forum/#!topic/myrocks-dev/jp40Sk--CY8

Comment by Daniel Black [ 2019-01-10 ]

MyRocks fix https://github.com/facebook/mysql-5.6/commit/aac18139d3cffa7b3b9c19d5f9ad2d6c1d6bbbff

Generated at Thu Feb 08 08:10:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.