Details

    Description

      EDIT: further analysis shows that the problem is repeatable without OS crash as well
      Use these scripts to crash your OS under load and try to recover later:
      https://github.com/AndriiNikitin/bugs/blob/master/MDEV-14103-crashOS.sh
      https://github.com/AndriiNikitin/bugs/blob/master/MDEV-14103-recoverOS.sh

      Prerequisites:

      • physical machine
      • git, m4, bash installed
      • MariaDB 10.2.9 with RocksDB plugin installed
      • ulimit -n 4000
      • up to 200Mb of disk space
      • ext3, ext4 or xfs or similar filesystem.

      1. Comment out "exit 1" at top of script MDEV-14103-crashOS.sh if you have recent backup
      2. Clone scripts with command below and make sure that cloned folder is your working directory for this case:

      git clone http://github.com/AndriiNikitin/mariadb-environs
      cd mariadb-environs
      

      3. Run command below, which will ask root password, generate helper scripts, initialize new datadir in current folder, start server on port 3313, put high load on it :

      ENGINE=RocksDB EXTRA_OPT='sync_binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON' bash MDEV-14103-crashOS.sh
      

      4. After ~1 min the terminal will ask to press any key to crash your OS
      5. Make sure that OS is not usable anymore, reboot
      6. cd to the working folder (mariadb-environs)
      7. Run command below, which will check consistency of datafiles in recovered data, and load binlog into new instance on port 3314 and compare content of tables in original recovered datadir and one which was rebuilt from binlog:

      bash MDEV-14103-recoverOS.sh
      

      Examine outcome in MDEV-14103.log
      When ENGINE=InnoDB together with sync_binlog=1 are used - all checks pass properly.
      Various inconsistencies observed in other cases, which is expected - except case with "RocksDB sync-binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON":

      ===========================================
      ===========================================
      InnoDB sync-binlog=1
      ===========================================
      consistency checks completed
      done
      ===========================================
      ===========================================
      InnoDB sync-binlog=0
      ===========================================
      consistency checks completed
      row count is different in d1 : (1706 1707) vs (1275 1275)
      row count is different in d2 : (1509 1509) vs (1308 1308)
      row count is different in d3 : (1367 1367) vs (1164 1164)
      row count is different in d4 : (1546 1546) vs (1184 1184)
      row count is different in d5 : (1491 1491) vs (1066 1066)
      row count is different in d6 : (1397 1397) vs (1164 1164)
      row count is different in d7 : (1698 1698) vs (1143 1143)
      row count is different in d8 : (1810 1810) vs (1307 1307)
      row count is different in d9 : (1336 1336) vs (989 989)
      row count is different in d10 : (1506 1506) vs (1100 1100)
      row count is different in d11 : (1762 1762) vs (1358 1358)
      row count is different in d12 : (1523 1523) vs (1251 1251)
      row count is different in d13 : (1622 1622) vs (1225 1225)
      row count is different in d14 : (1720 1720) vs (1187 1187)
      row count is different in d15 : (1535 1535) vs (1159 1159)
      row count is different in d16 : (1742 1742) vs (1256 1256)
      row count is different in d17 : (1852 1852) vs (1329 1329)
      row count is different in d18 : (1729 1729) vs (1263 1263)
      row count is different in d19 : (1746 1746) vs (1379 1379)
      row count is different in d20 : (1615 1615) vs (1289 1289)
      row count is different in d21 : (1250 1250) vs (1067 1067)
      row count is different in d22 : (1548 1548) vs (1144 1144)
      row count is different in d23 : (1717 1717) vs (1358 1358)
      row count is different in d24 : (1487 1487) vs (1199 1199)
      row count is different in d25 : (1410 1410) vs (1063 1063)
      row count is different in d26 : (1370 1370) vs (1096 1096)
      row count is different in d27 : (1458 1458) vs (1164 1164)
      row count is different in d28 : (1422 1422) vs (1118 1118)
      row count is different in d29 : (1638 1638) vs (1226 1226)
      row count is different in d30 : (1464 1464) vs (1083 1083)
      row count is different in d31 : (1827 1827) vs (1352 1352)
      row count is different in d32 : (1756 1756) vs (1298 1298)
      row count is different in d33 : (1450 1450) vs (1163 1163)
      row count is different in d34 : (1542 1542) vs (1149 1149)
      row count is different in d35 : (1716 1716) vs (1346 1346)
      row count is different in d36 : (1758 1758) vs (1269 1269)
      row count is different in d37 : (1660 1660) vs (1108 1108)
      row count is different in d38 : (1846 1846) vs (1331 1331)
      row count is different in d39 : (1529 1529) vs (1180 1180)
      row count is different in d40 : (1516 1516) vs (1274 1274)
      magic row not found
      done
      ===========================================
      ===========================================
      RocksDB sync-binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON
      ===========================================
      consistency checks completed
      row count is different in d3 : (286 286) vs (290 290)
      row count is different in d4 : (287 287) vs (290 290)
      row count is different in d5 : (285 285) vs (287 287)
      row count is different in d7 : (292 292) vs (293 293)
      row count is different in d8 : (288 288) vs (289 289)
      row count is different in d9 : (280 280) vs (281 281)
      row count is different in d10 : (283 283) vs (285 285)
      row count is different in d12 : (286 286) vs (287 287)
      row count is different in d15 : (280 280) vs (281 281)
      row count is different in d16 : (288 288) vs (289 289)
      row count is different in d18 : (283 283) vs (284 284)
      row count is different in d21 : (277 277) vs (278 278)
      row count is different in d22 : (285 285) vs (288 288)
      row count is different in d25 : (282 282) vs (284 284)
      row count is different in d26 : (283 283) vs (284 284)
      row count is different in d27 : (280 280) vs (282 282)
      row count is different in d28 : (286 286) vs (288 288)
      row count is different in d29 : (281 281) vs (283 283)
      row count is different in d30 : (280 280) vs (281 281)
      row count is different in d31 : (281 281) vs (282 282)
      row count is different in d32 : (284 284) vs (285 285)
      row count is different in d34 : (280 280) vs (281 281)
      row count is different in d37 : (280 280) vs (281 281)
      row count is different in d38 : (283 283) vs (285 285)
      row count is different in d39 : (282 282) vs (285 285)
      row count is different in d40 : (282 282) vs (284 284)
      done
      ===========================================
      ===========================================
      InnoDB sync-binlog=1 innodb_flush_log_at_trx_commit=0
      ===========================================
      consistency checks completed
      done
      

      Attachments

        Issue Links

          Activity

            I've discovered option rocksdb_use_fsync , which is OFF by default, should re-test with it.

            anikitin Andrii Nikitin (Inactive) added a comment - I've discovered option rocksdb_use_fsync , which is OFF by default, should re-test with it.

            OK, rocksdb_use_fsync doesn't seem to help:

            ===========================================
            ===========================================
            RocksDB sync_binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON rocksdb_use_fsync=1
            ===========================================
            consistency checks completed
            row count is different in d1 : (216 217) vs (218 219)
            row count is different in d2 : (216 216) vs (217 217)
            row count is different in d4 : (216 216) vs (217 217)
            row count is different in d5 : (209 209) vs (211 211)
            row count is different in d6 : (213 213) vs (215 215)
            row count is different in d9 : (216 216) vs (218 218)
            row count is different in d10 : (209 209) vs (210 210)
            row count is different in d11 : (215 215) vs (216 216)
            row count is different in d16 : (216 216) vs (217 217)
            row count is different in d17 : (217 217) vs (218 218)
            row count is different in d18 : (211 211) vs (213 213)
            row count is different in d19 : (211 211) vs (213 213)
            row count is different in d20 : (212 212) vs (214 214)
            row count is different in d21 : (211 211) vs (213 213)
            row count is different in d22 : (214 214) vs (216 216)
            row count is different in d24 : (211 211) vs (212 212)
            row count is different in d25 : (208 208) vs (209 209)
            row count is different in d26 : (210 210) vs (212 212)
            row count is different in d27 : (209 209) vs (211 211)
            row count is different in d29 : (210 210) vs (213 213)
            row count is different in d30 : (212 212) vs (213 213)
            row count is different in d31 : (211 211) vs (212 212)
            row count is different in d32 : (206 206) vs (209 209)
            row count is different in d33 : (212 212) vs (215 215)
            row count is different in d35 : (207 207) vs (208 208)
            row count is different in d36 : (212 212) vs (214 214)
            row count is different in d37 : (206 206) vs (207 207)
            row count is different in d38 : (212 212) vs (213 213)
            row count is different in d39 : (210 210) vs (211 211)
            row count is different in d40 : (211 211) vs (214 214)
            done
            
            

            anikitin Andrii Nikitin (Inactive) added a comment - OK, rocksdb_use_fsync doesn't seem to help: =========================================== =========================================== RocksDB sync_binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON rocksdb_use_fsync=1 =========================================== consistency checks completed row count is different in d1 : (216 217) vs (218 219) row count is different in d2 : (216 216) vs (217 217) row count is different in d4 : (216 216) vs (217 217) row count is different in d5 : (209 209) vs (211 211) row count is different in d6 : (213 213) vs (215 215) row count is different in d9 : (216 216) vs (218 218) row count is different in d10 : (209 209) vs (210 210) row count is different in d11 : (215 215) vs (216 216) row count is different in d16 : (216 216) vs (217 217) row count is different in d17 : (217 217) vs (218 218) row count is different in d18 : (211 211) vs (213 213) row count is different in d19 : (211 211) vs (213 213) row count is different in d20 : (212 212) vs (214 214) row count is different in d21 : (211 211) vs (213 213) row count is different in d22 : (214 214) vs (216 216) row count is different in d24 : (211 211) vs (212 212) row count is different in d25 : (208 208) vs (209 209) row count is different in d26 : (210 210) vs (212 212) row count is different in d27 : (209 209) vs (211 211) row count is different in d29 : (210 210) vs (213 213) row count is different in d30 : (212 212) vs (213 213) row count is different in d31 : (211 211) vs (212 212) row count is different in d32 : (206 206) vs (209 209) row count is different in d33 : (212 212) vs (215 215) row count is different in d35 : (207 207) vs (208 208) row count is different in d36 : (212 212) vs (214 214) row count is different in d37 : (206 206) vs (207 207) row count is different in d38 : (212 212) vs (213 213) row count is different in d39 : (210 210) vs (211 211) row count is different in d40 : (211 211) vs (214 214) done

            With small tweaks I did run the script with MyRocks binaries installed from current Master 5.6.
            The test passed with "sync_binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON rocksdb_use_fsync=1" showing identical checksums for 40 tables which were ongoing concurrent write during the crash. (I.e. instance after crash recovery has identical data to instance restored from binlog)/

            Btw the recovery log contains following, some of which I didn't see in MariaDB case :

            2017-10-30 23:46:54 4669 [Note] Plugin 'FEDERATED' is disabled.
            2017-10-30 23:46:54 4669 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release
            2017-10-30 23:46:54 4669 [Note] Plugin 'InnoDB' is disabled.
            2017-10-30 23:46:54 4669 [Note] RocksDB: 2 column families found
            2017-10-30 23:46:54 4669 [Note] RocksDB: Column Families at start:
            2017-10-30 23:46:54 4669 [Note]   cf=default
            2017-10-30 23:46:54 4669 [Note]     write_buffer_size=67108864
            2017-10-30 23:46:54 4669 [Note]     target_file_size_base=67108864
            2017-10-30 23:46:54 4669 [Note]   cf=__system__
            2017-10-30 23:46:54 4669 [Note]     write_buffer_size=67108864
            2017-10-30 23:46:54 4669 [Note]     target_file_size_base=67108864
            2017-10-30 23:46:57 4669 [Note] RocksDB: Table_store: loaded DDL data for 84 tables
            2017-10-30 23:46:58 4669 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
            2017-10-30 23:46:58 4669 [Note] MyRocks storage engine plugin has been successfully initialized.
            2017-10-30 23:46:58 4669 [Note] Recovering after a crash using blog
            2017-10-30 23:46:58 4669 [Note] Starting crash recovery...
            RocksDB: Last binlog file position 3067053, file name blog.000001
            2017-10-30 23:46:58 4669 [Note] Found 18 prepared transaction(s) in ROCKSDB
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\332\313\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\273\310\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\7\313\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\35\313\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\330\313\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0k\314\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0r\314\2\0\0\0\0\0'
            2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\214\314\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\270\313\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\314\313\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0]\314\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0y\314\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\33\314\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0^\310\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0b\314\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\12\314\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\343\313\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\366\313\2\0\0\0\0\0'
            2017-10-30 23:46:59 4669 [Note] Crash recovery finished.
            2017-10-30 23:46:59 4669 [Note] RSA private key file not found: /home/a/env1/m7-system2/dt//private_key.pem. Some authentication plugins will not work.
            2017-10-30 23:46:59 4669 [Note] RSA public key file not found: /home/a/env1/m7-system2/dt//public_key.pem. Some authentication plugins will not work.
            2017-10-30 23:46:59 4669 [Note] Server hostname (bind-address): '*'; port: 3313
            2017-10-30 23:46:59 4669 [Note] IPv6 is available.
            2017-10-30 23:46:59 4669 [Note]   - '::' resolves to '::';
            2017-10-30 23:46:59 4669 [Note] Server socket created on IP: '::'.
            2017-10-30 23:47:00 4669 [Note] Event Scheduler: Loaded 0 events
            2017-10-30 23:47:00 4669 [Note] MySQL git hash: f7bbd1da8d6651a3017b6da4ca0d08d0f13df6f4 - 2017-10-27T20:38:07-07:00
            2017-10-30 23:47:00 4669 [Note] RocksDB git hash: 6a9335dbbb3ecc87fb1cb8701be39b3ec16beb20 - 2017-10-27T15:56:35-07:00
            2017-10-30 23:47:00 4669 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
            Version: '5.6.35-log'  socket: '/home/a/env1/m7-system2/dt/my.sock'  port: 3313  Source distribution
            

            anikitin Andrii Nikitin (Inactive) added a comment - With small tweaks I did run the script with MyRocks binaries installed from current Master 5.6. The test passed with "sync_binlog=1 rocksdb_flush_log_at_trx_commit=1 rocksdb_enable_2pc=ON rocksdb_use_fsync=1" showing identical checksums for 40 tables which were ongoing concurrent write during the crash. (I.e. instance after crash recovery has identical data to instance restored from binlog)/ Btw the recovery log contains following, some of which I didn't see in MariaDB case : 2017-10-30 23:46:54 4669 [Note] Plugin 'FEDERATED' is disabled. 2017-10-30 23:46:54 4669 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release 2017-10-30 23:46:54 4669 [Note] Plugin 'InnoDB' is disabled. 2017-10-30 23:46:54 4669 [Note] RocksDB: 2 column families found 2017-10-30 23:46:54 4669 [Note] RocksDB: Column Families at start: 2017-10-30 23:46:54 4669 [Note] cf=default 2017-10-30 23:46:54 4669 [Note] write_buffer_size=67108864 2017-10-30 23:46:54 4669 [Note] target_file_size_base=67108864 2017-10-30 23:46:54 4669 [Note] cf=__system__ 2017-10-30 23:46:54 4669 [Note] write_buffer_size=67108864 2017-10-30 23:46:54 4669 [Note] target_file_size_base=67108864 2017-10-30 23:46:57 4669 [Note] RocksDB: Table_store: loaded DDL data for 84 tables 2017-10-30 23:46:58 4669 [Note] RocksDB: global statistics using get_sched_indexer_t indexer 2017-10-30 23:46:58 4669 [Note] MyRocks storage engine plugin has been successfully initialized. 2017-10-30 23:46:58 4669 [Note] Recovering after a crash using blog 2017-10-30 23:46:58 4669 [Note] Starting crash recovery... RocksDB: Last binlog file position 3067053, file name blog.000001 2017-10-30 23:46:58 4669 [Note] Found 18 prepared transaction(s) in ROCKSDB 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\332\313\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\273\310\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\7\313\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\35\313\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\330\313\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0k\314\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0r\314\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\214\314\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\270\313\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\314\313\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0]\314\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0y\314\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\33\314\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0^\310\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0b\314\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\12\314\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\343\313\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\366\313\2\0\0\0\0\0' 2017-10-30 23:46:59 4669 [Note] Crash recovery finished. 2017-10-30 23:46:59 4669 [Note] RSA private key file not found: /home/a/env1/m7-system2/dt//private_key.pem. Some authentication plugins will not work. 2017-10-30 23:46:59 4669 [Note] RSA public key file not found: /home/a/env1/m7-system2/dt//public_key.pem. Some authentication plugins will not work. 2017-10-30 23:46:59 4669 [Note] Server hostname (bind-address): '*'; port: 3313 2017-10-30 23:46:59 4669 [Note] IPv6 is available. 2017-10-30 23:46:59 4669 [Note] - '::' resolves to '::'; 2017-10-30 23:46:59 4669 [Note] Server socket created on IP: '::'. 2017-10-30 23:47:00 4669 [Note] Event Scheduler: Loaded 0 events 2017-10-30 23:47:00 4669 [Note] MySQL git hash: f7bbd1da8d6651a3017b6da4ca0d08d0f13df6f4 - 2017-10-27T20:38:07-07:00 2017-10-30 23:47:00 4669 [Note] RocksDB git hash: 6a9335dbbb3ecc87fb1cb8701be39b3ec16beb20 - 2017-10-27T15:56:35-07:00 2017-10-30 23:47:00 4669 [Note] /usr/local/mysql/bin/mysqld: ready for connections. Version: '5.6.35-log' socket: '/home/a/env1/m7-system2/dt/my.sock' port: 3313 Source distribution

            rocksdb_use_fsync means "Use fsync() call instead of fdatasync()". Both variants should provide persistence.

            The above RocksDB settings should provide data persistence (I have also checked this with Herman Lee @ FB). So there's something to look at, here.

            psergei Sergei Petrunia added a comment - rocksdb_use_fsync means "Use fsync() call instead of fdatasync()". Both variants should provide persistence. The above RocksDB settings should provide data persistence (I have also checked this with Herman Lee @ FB). So there's something to look at, here.

            I can reproduce the problem without system crash as well - by just replacing last line `echo c | sudo tee /proc/sysrq-trigger` with `kill -9 $(cat $env1/dt/p.id)` .
            Since:

            • recovered instance from binlog has more rows in the tables
            • it is suspicions that MariaDB error log doesn't have following messages, which present in MyRocks:

              RocksDB: Last binlog file position 3067053, file name blog.000001
              2017-10-30 23:46:58 4669 [Note] Found 18 prepared transaction(s) in ROCKSDB
              2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\332\313\2\0\0\0\0\0'
              2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\273\310\2\0\0\0\0\0'
              ...
              

            it looks that binlog recovery is working properly and it is RocksDB crash recovery to blame.

            anikitin Andrii Nikitin (Inactive) added a comment - I can reproduce the problem without system crash as well - by just replacing last line `echo c | sudo tee /proc/sysrq-trigger` with `kill -9 $(cat $env1/dt/p.id)` . Since: recovered instance from binlog has more rows in the tables it is suspicions that MariaDB error log doesn't have following messages, which present in MyRocks: RocksDB: Last binlog file position 3067053, file name blog.000001 2017-10-30 23:46:58 4669 [Note] Found 18 prepared transaction(s) in ROCKSDB 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\332\313\2\0\0\0\0\0' 2017-10-30 23:46:58 4669 [Note] commit xid 'MySQLXid\7\0\0\0\0\0\0\0\273\310\2\0\0\0\0\0' ... it looks that binlog recovery is working properly and it is RocksDB crash recovery to blame.
            anikitin Andrii Nikitin (Inactive) added a comment - - edited

            Indeed binary log has rows which are missing from the instance after crash recovery:

            # this is original instance recovered from kill -9
            $ mysql --defaults-file=~/env1/m7-system2/my.cnf -e 'select * from d2.a where a=68886'
            # this is instance restored from binlog
            $ mysql --defaults-file=~/env1/m8-system2/my.cnf -e 'select * from d2.a where a=68886'
            +-------+-------+-------+
            | a     | b     | c     |
            +-------+-------+-------+
            | 68886 | 71745 | 71746 |
            +-------+-------+-------+
            

            And this is the only reference of the row in binary log (right before process got killed):

            BINLOG '
            tc34WRMHAAAALgAAAGrWhAAAABUAAAAAAAEAAmQyAAFhAAMDAw8CKAAArZqFHw==
            tc34WRcHAAAAMAAAAJrWhAAAABUAAAAAAAEAA//4Fg0BAEEYAQAFNzE3NDbNkPRh
            '/*!*/;
            ### INSERT INTO `d2`.`a`
            ### SET
            ###   @1=68886 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2=71745 /* INT meta=0 nullable=0 is_null=0 */
            ###   @3='71746' /* VARSTRING(40) meta=40 nullable=0 is_null=0 */
            # at 8705690
            # at 8705753
            #171031 20:23:33 server id 7  end_log_pos 8705753 CRC32 0xdb21e336      Annotate_rows:
            #Q> insert into d2.b select @N,@N+@M,@N+@M+1
            #171031 20:23:33 server id 7  end_log_pos 8705799 CRC32 0xbe6b7d5c      Table_map: `d2`.`b` mapped to number 22
            # at 8705799
            #171031 20:23:33 server id 7  end_log_pos 8705847 CRC32 0x3382a56e      Write_rows: table id 22 flags: STMT_END_F
             
            BINLOG '
            tc34WRMHAAAALgAAAAfXhAAAABYAAAAAAAEAAmQyAAFiAAMDAw8CKAAEXH1rvg==
            tc34WRcHAAAAMAAAADfXhAAAABYAAAAAAAEAA//4Fg0BAEEYAQAFNzE3NDZupYIz
            '/*!*/;
            ### INSERT INTO `d2`.`b`
            ### SET
            ###   @1=68886 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2=71745 /* INT meta=0 nullable=0 is_null=0 */
            ###   @3='71746' /* VARSTRING(40) meta=40 nullable=1 is_null=0 */
            # at 8705847
            #171031 20:23:33 server id 7  end_log_pos 8705878 CRC32 0xbd418924      Xid = 362618
            COMMIT/*!*/;
            

            So I am changing title from "Recover from binlog is broken with RocksDB Engine after OS Crash" to "RocksDB сrash recovery is broken"

            anikitin Andrii Nikitin (Inactive) added a comment - - edited Indeed binary log has rows which are missing from the instance after crash recovery: # this is original instance recovered from kill -9 $ mysql --defaults-file=~/env1/m7-system2/my.cnf -e 'select * from d2.a where a=68886' # this is instance restored from binlog $ mysql --defaults-file=~/env1/m8-system2/my.cnf -e 'select * from d2.a where a=68886' +-------+-------+-------+ | a | b | c | +-------+-------+-------+ | 68886 | 71745 | 71746 | +-------+-------+-------+ And this is the only reference of the row in binary log (right before process got killed): BINLOG ' tc34WRMHAAAALgAAAGrWhAAAABUAAAAAAAEAAmQyAAFhAAMDAw8CKAAArZqFHw== tc34WRcHAAAAMAAAAJrWhAAAABUAAAAAAAEAA//4Fg0BAEEYAQAFNzE3NDbNkPRh '/*!*/; ### INSERT INTO `d2`.`a` ### SET ### @1=68886 /* INT meta=0 nullable=0 is_null=0 */ ### @2=71745 /* INT meta=0 nullable=0 is_null=0 */ ### @3='71746' /* VARSTRING(40) meta=40 nullable=0 is_null=0 */ # at 8705690 # at 8705753 #171031 20:23:33 server id 7 end_log_pos 8705753 CRC32 0xdb21e336 Annotate_rows: #Q> insert into d2.b select @N,@N+@M,@N+@M+1 #171031 20:23:33 server id 7 end_log_pos 8705799 CRC32 0xbe6b7d5c Table_map: `d2`.`b` mapped to number 22 # at 8705799 #171031 20:23:33 server id 7 end_log_pos 8705847 CRC32 0x3382a56e Write_rows: table id 22 flags: STMT_END_F   BINLOG ' tc34WRMHAAAALgAAAAfXhAAAABYAAAAAAAEAAmQyAAFiAAMDAw8CKAAEXH1rvg== tc34WRcHAAAAMAAAADfXhAAAABYAAAAAAAEAA//4Fg0BAEEYAQAFNzE3NDZupYIz '/*!*/; ### INSERT INTO `d2`.`b` ### SET ### @1=68886 /* INT meta=0 nullable=0 is_null=0 */ ### @2=71745 /* INT meta=0 nullable=0 is_null=0 */ ### @3='71746' /* VARSTRING(40) meta=40 nullable=1 is_null=0 */ # at 8705847 #171031 20:23:33 server id 7 end_log_pos 8705878 CRC32 0xbd418924 Xid = 362618 COMMIT/*!*/; So I am changing title from "Recover from binlog is broken with RocksDB Engine after OS Crash" to "RocksDB сrash recovery is broken"

            I can trigger the problem with steps above every time when 10.2.9 is installed http://ftp.hosteurope.de/mirror/archive.mariadb.org/mariadb-10.2.9/repo/ubuntu/pool/main/m/mariadb-10.2/ , but 10.2.10 and current 10.2 worked flawlessly. So I am OK to close this call .

            anikitin Andrii Nikitin (Inactive) added a comment - I can trigger the problem with steps above every time when 10.2.9 is installed http://ftp.hosteurope.de/mirror/archive.mariadb.org/mariadb-10.2.9/repo/ubuntu/pool/main/m/mariadb-10.2/ , but 10.2.10 and current 10.2 worked flawlessly. So I am OK to close this call .

            Closing as this seems to be no longer repeatable.

            Current code passes the test.

            psergei Sergei Petrunia added a comment - Closing as this seems to be no longer repeatable. Current code passes the test.

            People

              psergei Sergei Petrunia
              anikitin Andrii Nikitin (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.