Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20406

Rocksdb gets corrupted on OOM during ALTER

    XMLWordPrintable

Details

    Description

      I was adding a field on a very large table
      ALTER TABLE npadata ADD COLUMN name1 VARCHAR(64) NOT NULL DEFAULT '';
      ERROR 2013 (HY000): Lost connection to MySQL server during query
      Note: there is plenty of space available in the box

      tail /var/lib/mysql/*.err

      Version: '10.4.7-MariaDB-1:10.4.7+maria~bionic'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  mariadb.org binary distribution
      2019-08-19 15:32:00 419 [Note] Zerofilling moved table:  './mysql/proc'
      2019-08-20 14:19:50 901 [Note] mysqld: O_TMPFILE is not supported on /temp (disabling future attempts)
      2019-08-22  5:53:01 0 [Note] RocksDB: 2 column families found
      2019-08-22  5:53:01 0 [Note] RocksDB: Column Families at start:
      2019-08-22  5:53:01 0 [Note]   cf=default
      2019-08-22  5:53:01 0 [Note]     write_buffer_size=268435456
      2019-08-22  5:53:01 0 [Note]     target_file_size_base=67108864
      2019-08-22  5:53:01 0 [Note]   cf=__system__
      2019-08-22  5:53:01 0 [Note]     write_buffer_size=268435456
      2019-08-22  5:53:01 0 [Note]     target_file_size_base=67108864
      2019-08-22  5:53:44 0 [ERROR] RocksDB: Error opening instance, Status Code: 2, Status: Corruption: truncated header
      2019-08-22  5:53:44 0 [ERROR] Plugin 'ROCKSDB' init function returned error.
      2019-08-22  5:53:44 0 [ERROR] Plugin 'ROCKSDB' registration as a STORAGE ENGINE failed.
      2019-08-22  5:53:44 0 [ERROR] Unknown/unsupported storage engine: rocksdb
      2019-08-22  5:53:44 0 [ERROR] Aborting
      

      Note:
      If the table is very large, adding a fields takes literally days. But also, which is worse, if you need to add many fields at once, there is no way to do it in one single event. It has to copy the entire table many times. In my case, the table has 1.92BN records. One additional field took 3 days to add. That was las week. Now I need to add 5 fields. This is not acceptable even in case that this corruption can be resolved.

      The box

      df -H

      Filesystem      Size  Used Avail Use% Mounted on
      /dev/sdb1       1.1T  649G  425G  61% /
      none            504k     0  504k   0% /dev
      tmpfs           127G     0  127G   0% /dev/shm
      tmpfs           127G  1.0M  127G   1% /run
      tmpfs           5.3M     0  5.3M   0% /run/lock
      tmpfs           127G     0  127G   0% /sys/fs/cgroup
      tmpfs            26G     0   26G   0% /run/user/0
      

      free -g

                    total        used        free      shared  buff/cache   available
      Mem:            236          30         196           0           8         203
      Swap:             7           4           3
      

      my.cnf

      [mysqld]
      open_files_limit=65535
      log_warnings= 0
      datadir=/var/lib/mysql
      tmpdir=/temp
      slave_load_tmpdir=/data/temp
      plugin-load-add=ha_rocksdb.so
      port =3306 
      table_open_cache=3000
      table_open_cache_instances=24
      default-storage-engine = rocksdb
      skip_innodb=1
      max_allowed-packet=1G
      large_pages=1
      log_error=/var/lib/mysql/mariadb.err
      general_log=0
      general_log_file=mariadb.log
      log_output='FILE'
      lower_case_table_names=1
      max_heap_table_size=20G
      memlock=1
      query_cache_size=0
      query_cache_type=0
      skip_name_resolve=1
      sort_buffer_size=8M
      symbolic_links=0
      thread_cache_size=256
      thread_handling=pool-of-threads
      thread-pool-size=64
      thread_stack=192K
      tmp_table_size=64M
      general-log=0
      general-log-file=queries.log
      log-output=file
      join_buffer_size=8M
      event_scheduler = 1
      max_connections=1250
      extra_max_connections=10
      extra_port=9433
      connect_timeout=200
      interactive_timeout=3000000
      net_read_timeout=2000
      net_write_timeout=2000
      slave_net_timeout=3600
      wait_timeout=6000
      rocksdb_commit_in_the_middle=1
      rocksdb_allow_concurrent_memtable_write=1
      rocksdb_blind_delete_primary_key=1
      rocksdb_tmpdir=/temp
      rocksdb_bulk_load_size=10000
      rocksdb_max_total_wal_size=14G
      rocksdb_table_cache_numshardbits=19
      rocksdb_default_cf_options=write_buffer_size=256m;target_file_size_base=64m;max_bytes_for_level_base=512m;level0_file_num_compaction_trigger=4;level0_slowdown_writes_trigger=256;level0_stop_writes_trigger=256;max_write_buffer_number=16;compression_per_level=kNoCompression:kNoCompression:kNoCompression:kZlibCompression:kZlibCompression:kZlibCompression;bottommost_compression=kZlibCompression;compression_opts=-14:1:0;block_based_table_factory={cache_index_and_filter_blocks=1;filter_policy=bloomfilter:10:false;whole_key_filtering=1};level_compaction_dynamic_level_bytes=true;optimize_filters_for_hits=true
      rocksdb_override_cf_options=system={memtable=skip_list:16}
      rocksdb_write_disable_wal=0
      rocksdb_flush_log_at_trx_commit=1
      rocksdb_strict_collation_check=off
      sync_binlog=0
      join_buffer_size=256K
      sort_buffer_size=256K
       ##innodb_spin_wait_delay=96
      rocksdb_max_background_jobs=24
      rocksdb_compaction_sequential_deletes=199999
      rocksdb_compaction_sequential_deletes_window=200000
      rocksdb_max_subcompactions=16
      rocksdb_compaction_readahead_size=16m
      rocksdb_use_direct_reads=ON
      rocksdb_use_direct_io_for_flush_and_compaction=ON
      rocksdb_max_row_locks=1073741824
      rocksdb-table-stats-sampling-pct=15
       
      rocksdb_max_open_files=-1
      rocksdb_block_size=16384
      rocksdb_block_cache_size=62G
       
      rocksdb_bytes_per_sync=4194304
      rocksdb_wal_bytes_per_sync=4194304
      rocksdb_rate_limiter_bytes_per_sec=204857600 
       
      rocksdb_compaction_sequential_deletes_count_sd=1
      rocksdb_compaction_sequential_deletes=199999
      

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              philip_38 Philip orleans
              Votes:
              3 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.