MDEV-39668

REPAIR TABLE ... USE_FRM QUICK silently corrupts Aria tables with stale bitmaps; REPAIR TABLE ... USE_FRM QUICK with multiple keys produces false "Found too many records" error

Details

    • Unexpected results

    Description

      Both bugs are related to code in ma_check.c.

      Bug A: Silent data loss on tables with stale bitmap (QUICK mode)

      Symptom: `REPAIR TABLE <tbl> USE_FRM QUICK` returns status: OK, but the `.MAI` file is truncated to 4096 bytes (header only). `SELECT COUNT(*)` returns 0. A subsequent `CHECK TABLE` reports "Record-count is not ok; found N Should be: 0".

      Root cause: In `sort_get_next_record`, the `BLOCK_RECORD` branch, when `fix_datafile == 0` (QUICK mode), `_ma_scan_block_record` is used. This function relies on bitmap pages to locate live records. A table needing repair may have a stale or inconsistent bitmap - pages marked free can still contain valid records. All records are skipped silently, the index is rebuilt as empty, and the engine reports success.

      Proposed fix: In QUICK mode (`!sort_param->fix_datafile`), use `_ma_safe_scan_block_record` instead, which reads every data page unconditionally regardless of bitmap state - consistent with the behaviour of the `info != sort_info->new_info` branch.
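The difference between the two scan strategies can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_table`, `toy_scan_with_bitmap`, `toy_scan_all_pages` and `toy_make_stale_table` are hypothetical names invented for illustration and are not Aria code; each "page" holds at most one record, and the bitmap is reduced to one in-use flag per page.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES 8

/* Hypothetical stand-in for an Aria data file (not real Aria structures). */
struct toy_table {
    bool has_record[TOY_PAGES];    /* what is really stored on each page */
    bool bitmap_in_use[TOY_PAGES]; /* what the (possibly stale) bitmap claims */
};

/* Models a bitmap-guided scan: pages the bitmap marks free are never read. */
size_t toy_scan_with_bitmap(const struct toy_table *t)
{
    size_t found = 0;
    for (size_t page = 0; page < TOY_PAGES; page++)
        if (t->bitmap_in_use[page] && t->has_record[page])
            found++;
    return found;
}

/* Models a safe scan: every page is read regardless of the bitmap. */
size_t toy_scan_all_pages(const struct toy_table *t)
{
    size_t found = 0;
    for (size_t page = 0; page < TOY_PAGES; page++)
        if (t->has_record[page])
            found++;
    return found;
}

/* Builds a table with 5 live records whose bitmap wrongly marks every page free. */
struct toy_table toy_make_stale_table(void)
{
    struct toy_table t = {0};      /* bitmap_in_use is all false: "stale" */
    for (size_t page = 0; page < 5; page++)
        t.has_record[page] = true;
    return t;
}
```

In this model the bitmap-guided scan finds no records on the stale table while the unconditional scan finds all five, which mirrors the empty index rebuild described above.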

      Bug B: False "Found too many records" error on tables with multiple keys

      Symptom: `REPAIR TABLE <tbl> USE_FRM QUICK` on a table with `>=2` keys reports "Key N - Found too many records; Can't continue" for all keys after the first. The engine then retries via keycache and eventually succeeds ("Number of rows changed from 0 to N", status: OK), but the spurious error causes callers that check for any error row to treat the repair as failed.
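To show why the spurious row matters to callers, here is a minimal sketch of two ways a client might interpret a `REPAIR TABLE` result set. The struct and function names are hypothetical, and the `Msg_type` values assigned to the sample rows are assumptions made for illustration (the report only quotes the message texts and the final status).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row of a REPAIR TABLE result set (only the columns used here). */
struct repair_row {
    const char *msg_type;  /* e.g. "error", "warning", "status" */
    const char *msg_text;
};

/* Illustrative rows for Bug B; the msg_type of each row is an assumption. */
static const struct repair_row toy_bug_b_rows[] = {
    { "error",   "Key 2 - Found too many records; Can't continue" },
    { "warning", "Number of rows changed from 0 to N" },
    { "status",  "OK" },
};

/* Strict policy: any "error" row means the repair failed. */
bool repair_failed_any_error(const struct repair_row *rows, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(rows[i].msg_type, "error") == 0)
            return true;
    return false;
}

/* Lenient policy: only the final row is considered. */
bool repair_failed_final_status(const struct repair_row *rows, size_t n)
{
    if (n == 0)
        return true;  /* no rows at all: treat as failure */
    const struct repair_row *last = &rows[n - 1];
    return !(strcmp(last->msg_type, "status") == 0 &&
             strcmp(last->msg_text, "OK") == 0);
}
```

With the sample rows, the strict policy reports failure and the lenient one reports success. This is not a suggestion that callers should switch policy; it only illustrates why an error row emitted during a repair that ultimately succeeds is harmful.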

      Root cause: In `maria_repair_by_sort`, the per-key loop calls `maria_scan_init` / `maria_scan_end` for each key, but `sort_info.page`, which tracks the last page read inside `_ma_safe_scan_block_record`, is never reset between iterations. After key 0 finishes, `sort_info.page` equals the EOF position. When key 1 starts, `_ma_safe_scan_block_record` immediately increments it past EOF -> `HA_ERR_END_OF_FILE` -> 0 records found -> record counter mismatch -> false error. The same stale-page issue reproduces without QUICK after `fix_datafile` rewrites the data file.

      Proposed fix: Add `sort_info.page = 0;` before each `maria_scan_init` call in the per-key loop in `maria_repair_by_sort`.
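The stale-cursor effect and the proposed reset can be sketched with a toy model. Everything here is hypothetical and simplified: `toy_sort_info`, `toy_next_record` and `toy_repair_keys` are invented names, the data file is modelled as 58 pages with one record each, and the cursor check is a stand-in for the real end-of-file handling. It is not the actual `maria_repair_by_sort` code.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_DATA_PAGES 58   /* one record per page, to keep the model simple */

/* Hypothetical stand-in for the shared sort state; only the cursor is modelled. */
struct toy_sort_info {
    size_t page;            /* last page read by the scan, shared across keys */
};

/* Models the scan: advance the cursor, or report end-of-file once it is exhausted. */
static bool toy_next_record(struct toy_sort_info *si)
{
    if (si->page >= TOY_DATA_PAGES)
        return false;       /* corresponds to the end-of-file result */
    si->page++;
    return true;
}

/* Scans the data once per key; returns how many keys saw a wrong record count. */
size_t toy_repair_keys(size_t n_keys, bool reset_page_per_key)
{
    struct toy_sort_info si = { .page = 0 };
    size_t mismatched_keys = 0;

    for (size_t key = 0; key < n_keys; key++) {
        if (reset_page_per_key)
            si.page = 0;    /* models the proposed one-line reset */

        size_t found = 0;
        while (toy_next_record(&si))
            found++;

        if (found != TOY_DATA_PAGES)
            mismatched_keys++;
    }
    return mismatched_keys;
}
```

Without the reset, the first key sees all 58 records and every later key sees none, so a three-key run reports two mismatches; with the reset, every key sees the full data set.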


      Steps to Reproduce

      Bug A: Create an Aria table with many rows. Manually invalidate the bitmap (e.g. by copying the table from another server without zerofill). Run `REPAIR TABLE <tbl> USE_FRM QUICK`. Observe status: OK but `.MAI` truncated to 4096 bytes.

      Bug B: Create an Aria table with >=2 keys and ~58+ rows (58 is the number of rows in the affected table I used for debugging). Run `REPAIR TABLE <tbl> USE_FRM QUICK`. Observe "Key 2 - Found too many records; Can't continue" in the result set despite eventual success.


      Expected Behaviour
      • `REPAIR TABLE ... USE_FRM QUICK` correctly rebuilds all keys regardless of bitmap state.
      • No spurious error rows are produced for secondary keys.

          People

            shipjain Shipra Jain
            VadimK Vadim Korolev