Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22988

InnoDB: The B-tree of index GEN_CLUST_INDEX is corrupted

    XMLWordPrintable

    Details

      Description

      origin/bb-10.5-release 8394979aae274a19689102420ebb86580e26f92e 2020-06-22T12:10:06+02:00
      compiled without debug.
       
       
      The RQG log shows
      In short:
           The connection Thread1 hits the problem, reports it and exits.
           Than the connection Thread2 hits the problem too, reports it and exits.   This might be important for understanding the mysql.err better.
           The main process in RQG becomes aware of the issue and starts to kill processes for aborting the run.
      In detail
      # 2020-06-23T12:12:52 [17433] ERROR: The query 'CHECK TABLE t1 /* E_R Thread1 QNO 2363 CON_ID 13 */ ' passed but has a result set line
      # 2020-06-23T12:12:52 [17433] ERROR: ->test.t1 check Warning InnoDB: The B-tree of index GEN_CLUST_INDEX is corrupted.<-.
      # 2020-06-23T12:12:52 [17433] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
      # 2020-06-23T12:12:52 [17433] ERROR: The query 'CHECK TABLE t1 /* E_R Thread1 QNO 2363 CON_ID 13 */ ' passed but has a result set line
      # 2020-06-23T12:12:52 [17433] ERROR: ->test.t1 check Warning InnoDB: Index 'idx' contains 5 entries, should be 18446744073709551615.<-.
      # 2020-06-23T12:12:52 [17433] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
      # 2020-06-23T12:12:52 [17433] Thread1 in Mixer : Critical failure STATUS_DATABASE_CORRUPTION (106), Error 0 reported at dsn dbi:mysql:host=127.0.0.1:port=27840:user=root:database=test:mysql_local_infile=1
      # 2020-06-23T12:12:52 [17433] GenTest: Server crash or critical failure (STATUS_DATABASE_CORRUPTION) was reported.
      # 2020-06-23T12:12:52 [17433]          The child process for Thread1 will be stopped.
      # 2020-06-23T12:12:52 [17433] GenTest: child 17433 is being stopped with status STATUS_DATABASE_CORRUPTION
      # 2020-06-23T12:12:52 [17436] ERROR: The query 'CHECK TABLE t1 /* E_R Thread2 QNO 2444 CON_ID 12 */ ' passed but has a result set line
      # 2020-06-23T12:12:52 [17436] ERROR: ->test.t1 check Warning InnoDB: The B-tree of index GEN_CLUST_INDEX is corrupted.<-.
      # 2020-06-23T12:12:52 [17436] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
      # 2020-06-23T12:12:52 [17436] ERROR: The query 'CHECK TABLE t1 /* E_R Thread2 QNO 2444 CON_ID 12 */ ' passed but has a result set line
      # 2020-06-23T12:12:52 [17436] ERROR: ->test.t1 check Warning InnoDB: Index idx is marked as corrupted<-.
      # 2020-06-23T12:12:52 [17436] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
      # 2020-06-23T12:12:52 [17436] Thread2 in Mixer : Critical failure STATUS_DATABASE_CORRUPTION (106), Error 0 reported at dsn dbi:mysql:host=127.0.0.1:port=27840:user=root:database=test:mysql_local_infile=1
      # 2020-06-23T12:12:52 [17436] GenTest: Server crash or critical failure (STATUS_DATABASE_CORRUPTION) was reported.
      # 2020-06-23T12:12:52 [17436]          The child process for Thread2 will be stopped.
      # 2020-06-23T12:12:52 [17436] GenTest: child 17436 is being stopped with status STATUS_DATABASE_CORRUPTION
      # 2020-06-23T12:12:53 [10133] Process with pid 17433 for Thread1 ended with status STATUS_DATABASE_CORRUPTION
      # 2020-06-23T12:12:53 [10133] Killing (TERM) remaining worker process with pid 17436...
      # 2020-06-23T12:12:53 [10133] Process with pid 17436 for Thread2 ended with status STATUS_DATABASE_CORRUPTION
      # 2020-06-23T12:12:55 [17432] DEBUG: GenTest::ReporterManager::monitor: Will return the (maximum) status 0
      # 2020-06-23T12:13:05 [17432] DEBUG: GenTest::ReporterManager::monitor: Will return the (maximum) status 0
      # 2020-06-23T12:13:09 [10133] Killing periodic reporting process with pid 17432...
       
      mysql.err shows
      [rr 11138 509260]2020-06-23 12:12:51 12 [Note] InnoDB: Online DDL : Applying log to index
      [rr 11138 510517]2020-06-23 12:12:52 13 [Note] InnoDB: Online DDL : Start reading clustered index of the table and create temporary files
      [rr 11138 510565]2020-06-23 12:12:52 13 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
      [rr 11138 510569]2020-06-23 12:12:52 13 [Note] InnoDB: Online DDL : Applying log to index
      [rr 11138 510869]2020-06-23 12:12:52 13 [Warning] InnoDB: Wrong page type 17855
      [rr 11138 510873]2020-06-23 12:12:52 13 [ERROR] InnoDB: Apparent corruption in space 30 page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
      [rr 11138 510877]2020-06-23 12:12:52 13 [ERROR] InnoDB: In page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
      [rr 11138 510885]2020-06-23 12:12:52 13 [ERROR] InnoDB: Flagged corruption of `idx` in table `test`.`t1` in CHECK TABLE; Wrong count
      [rr 11138 511000]2020-06-23 12:12:52 13 [Warning] Aborted connection 13 to db: 'test' user: 'root' host: 'localhost' (Got an error reading communication packets)
      [rr 11138 511004]2020-06-23 12:12:52 13 [Warning] Aborted connection 13 to db: 'test' user: 'root' host: 'localhost' (This connection closed normally)
      [rr 11138 511021]2020-06-23 12:12:52 12 [Warning] InnoDB: Wrong page type 17855
      [rr 11138 511025]2020-06-23 12:12:52 12 [ERROR] InnoDB: Apparent corruption in space 30 page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
      [rr 11138 511029]2020-06-23 12:12:52 12 [ERROR] InnoDB: In page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
       
      RQG
      git clone https://github.com/mleich1/rqg --branch experimental RQG
      origin/experimental a4fb38dacc00ea901dbad55d72ec31d4514b8830 2020-06-19T17:19:27+02:00
       
      perl rqg.pl \
      --duration=300 \
      --queries=10000000 \
      --threads=2 \
      --no_mask \
      --seed=random \
      --rpl_mode=none \
      --gendata=conf/runtime/alter_online.zz \
      --engine=InnoDB \
      --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      --mysqld=--innodb-lock-wait-timeout=50 \
      --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
      --mysqld=--connect_timeout=60 \
      --mysqld=--net_read_timeout=30 \
      --mysqld=--innodb_stats_persistent=on \
      --mysqld=--innodb_page_size=32K \
      --mysqld=--log_bin_trust_function_creators=1 \
      --mysqld=--net_write_timeout=60 \
      --mysqld=--loose-max-statement-time=30 \
      --mysqld=--loose-idle_write_transaction_timeout=0 \
      --mysqld=--log-output=none \
      --mysqld=--lock-wait-timeout=86400 \
      --mysqld=--log-bin \
      --mysqld=--interactive_timeout=28800 \
      --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      --mysqld=--innodb-buffer-pool-size=256M \
      --mysqld=--slave_net_timeout=60 \
      --mysqld=--loose-table_lock_wait_timeout=50 \
      --mysqld=--wait_timeout=28800 \
      --mysqld=--loose-idle_transaction_timeout=0 \
      --reporters=Backtrace,Deadlock1,ErrorLog \
      --validators=None \
      --grammar=p00025.yy
      --workdir=<local settings> \
      --vardir=<local settings> \
      --mtr-build-thread=<local settings> \
      --basedir1=<local settings>
       
      Replaying the problem with rr enabled, two concurrent sessions and the uploaded simplified grammar
      - a non debug build is rather fast     Just a few RQG runs are required.
      - a debug build is  extreme slow      500 runs had already a "pass" when the first replay happened.
       
      This seems to explain why I have not observed that problem yesterday even though making > 10000 RQG runs.
      debug builds, ~ 30% of tests do not use a grammar which is capable to reveal the problem
       
      Features which are sometimes critical but not used in the test at all
      - Recovery after crash
      - FOREIGN KEYs, TRUNCATE, FLUSH, FULLTEXT
       
      No replay of the problem on
      origin/10.4 b80b52394d41a4c334642ae8b3af16f76f6fac57 2020-06-22T13:25:25+03:00
      build without debug with 2500 RQG runs.
      So it seems to be a 10.5 problem only.
      
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: