Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-39534

PURGE BINARY LOGS permanently no-ops when log_requests head has out-of-order LSN

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.11.16
    • None
    • Replication, XA
    • None

    Description

      On MariaDB 10.11.16, a single stuck entry at the head of InnoDB's internal log_requests list blocks all subsequent checkpoint-notification delivery, which in turn keeps one or more entries in binlog_xid_count_list pinned with non-zero xid_count. PURGE BINARY LOGS TO <target> returns clean with no error but removes no files. The stuck state is NOT resolved by continued workload, natural binlog rotation, or innodb_flush_log_at_timeout firings. The only observed resolution is an explicit SQL FLUSH LOGS issued after the stuck state occurs; this unblocks the head and a subsequent PURGE succeeds.

      This is closely related to (but not fixed by) MDEV-25611. The commit 35c732cdde that introduced ha_flush_logs() in reset_logs() fixes the RESET MASTER hang but does not address the PURGE path, and more fundamentally does not fix the underlying walk-order bug in log_flush_notify_and_unlock().

      MariaDB 10.11.16 built with debug symbols (release build reproduces equally well).

      • innodb_flush_log_at_trx_commit = 2
      • sync_binlog = 0
      • binlog_format = MIXED
      • transaction_isolation = READ-COMMITTED
      • innodb_flush_log_at_timeout = 1 (default)
      • max_binlog_size = 1048576 (1 MiB, accelerates rotations)
      • binlog_expire_logs_seconds = 0
        max_binlog_size=1 MiB is used for rapid reproduction; the bug reproduces at any size, it just takes longer to accumulate the stuck state.

      Reproducer script:

      #!/bin/bash
      # Session 5 extension: does a continuing workload (without explicit FLUSH
      # LOGS) drain a stuck PURGE state, or does it stay stuck while more files
      # accumulate?
      #
      # Flow:
      #   1. Clean baseline, production-default settings.
      #   2. Mixed-engine writes + rotations (same pattern as prior tests) to
      #      create a stuck xid_count state.
      #   3. FLUSH LOGS + PURGE -> confirm stuck at T+0 and T+60s.
      #   4. Start a continuous mixed-engine write workload for 300s.
      #      Writes do NOT explicitly FLUSH LOGS, but natural rotation at
      #      max_binlog_size=1 MiB will kick in.
      #   5. Sample binlog count + oldest file every 30s during the writes.
      #   6. Stop writes. Sample one more time to see post-quiescent state.
      #   7. If still stuck, kick with FLUSH LOGS + PURGE to confirm unblock
      #      still works.
      #
      # Usage: session5_purge_then_workload.sh
       
      set -u
       
      MB=/opt/mysql/bld10.11.16/bin/mariadb
      SOCK=/tmp/mysql_sandbox53717.sock
       
      mysql_do() {
          "$MB" --user=msandbox --password=msandbox --socket="$SOCK" -e "$@" 2>&1
      }
      mysql_q() {
          "$MB" --user=msandbox --password=msandbox --socket="$SOCK" -Ne "$@" 2>&1
      }
      count_binlogs() { mysql_q "SHOW BINARY LOGS" | wc -l; }
      oldest_binlog() { mysql_q "SHOW BINARY LOGS" | head -1 | awk '{print $1}'; }
       
      log_sample() {
          # label, count, oldest
          local label=$1
          local n=$(count_binlogs)
          local o=$(oldest_binlog)
          printf "%-15s  binlogs=%-6d oldest=%s\n" "$label" "$n" "$o"
      }
       
      echo "==============================================="
      echo "  session 5 extension: PURGE + continued workload"
      echo "  $(date -Is)"
      echo "==============================================="
       
      if ! mysql_q "SELECT 1" >/dev/null 2>&1; then
          echo "ERROR: cannot reach mariadbd at $SOCK" >&2
          exit 1
      fi
       
      # Baseline.
      echo ""
      echo "--- clean baseline ---"
      mysql_do "FLUSH LOGS;" > /dev/null
      cur=$(mysql_q "SHOW MASTER STATUS" | awk '{print $1}')
      mysql_do "PURGE BINARY LOGS TO '$cur';" > /dev/null
      sleep 1
      log_sample "baseline"
       
      # Ensure production-matching settings.
      echo ""
      echo "--- production-matching settings ---"
      mysql_do "SET GLOBAL innodb_flush_log_at_trx_commit = 2;" > /dev/null
      mysql_do "SET GLOBAL innodb_flush_log_at_timeout    = 1;" > /dev/null
      mysql_do "
        SELECT
          @@innodb_flush_log_at_trx_commit AS iflatx,
          @@innodb_flush_log_at_timeout    AS iflat_to,
          @@sync_binlog                    AS sbl,
          @@max_binlog_size                AS max_bl;
      "
       
      # Create the stuck state exactly like session5_purge_prod_defaults.sh.
      echo ""
      echo "--- generating the stuck state (5 rotations) ---"
      mysql_do "
        USE xlt;
        CREATE TABLE IF NOT EXISTS leak_marker (id INT) ENGINE=MyISAM;
      " > /dev/null
      for i in 1 2 3 4 5; do
          mysql_do "
            USE xlt;
            INSERT INTO xl_innodb_1 (id, k, c) VALUES (66000000 + $i, 999, 's$i');
            INSERT INTO leak_marker VALUES ($i);
            FLUSH LOGS;
          " > /dev/null
      done
      mysql_do "USE xlt; DROP TABLE leak_marker;" > /dev/null
      log_sample "after rotations"
       
      # Issue the PURGE (expected to leave files in place).
      echo ""
      echo "--- FLUSH LOGS + PURGE (expected: no-op) ---"
      mysql_do "FLUSH LOGS;" > /dev/null
      target=$(mysql_q "SHOW MASTER STATUS" | awk '{print $1}')
      echo "target: $target"
      mysql_do "PURGE BINARY LOGS TO '$target';" > /dev/null
      log_sample "right after PURGE"
       
      # Verify stuck for 60s of idle.
      echo ""
      echo "--- verifying stuck for 60s of idle (no workload, no FLUSH) ---"
      prev=0
      for t in 30 60; do
          sleep $(( t - prev ))
          prev=$t
          log_sample "idle T+${t}s"
      done
       
      stuck_at_60s_count=$(count_binlogs)
      stuck_at_60s_oldest=$(oldest_binlog)
      echo "stuck snapshot at T+60s: $stuck_at_60s_count binlogs, oldest=$stuck_at_60s_oldest"
       
      if [[ "$stuck_at_60s_count" -le 1 ]]; then
          echo ""
          echo "WARNING: did not reach the stuck state. Test cannot proceed."
          echo "Possible causes: first PURGE actually cleared (unexpected on our"
          echo "sandbox), or prior session state was different."
          exit 2
      fi
       
      # Now start a continuous workload. Run a background mariadb client that
      # pipes in SQL for 300 seconds. Pattern: one InnoDB insert + one MyISAM
      # insert per iteration. No explicit FLUSH LOGS. Natural rotation will
      # happen at max_binlog_size=1 MiB boundaries.
      echo ""
      echo "--- starting 300s continuous mixed-engine workload (NO FLUSH LOGS) ---"
      mysql_do "USE xlt; CREATE TABLE IF NOT EXISTS leak_marker2 (id INT) ENGINE=MyISAM;" > /dev/null
       
      WORKLOAD_PIPE=$(mktemp -u)
      mkfifo "$WORKLOAD_PIPE"
       
      # Writer script: drives SQL through the FIFO for a given duration.
      (
        id_base=55000000
        end=$(( $(date +%s) + 300 ))
        while [[ $(date +%s) -lt $end ]]; do
            id_base=$(( id_base + 1 ))
            # Simple two-inserts-per-txn pattern. We wrap in a txn so group
            # commit fires, ensuring trx_group_commit_leader's xid_count
            # increment path runs.
            cat <<EOF
      USE xlt;
      START TRANSACTION;
      INSERT INTO xl_innodb_1 (id, k, c) VALUES ($id_base, 999, 'w');
      INSERT INTO leak_marker2 (id) VALUES ($id_base);
      COMMIT;
      EOF
        done
        echo "QUIT;"
      ) > "$WORKLOAD_PIPE" &
      writer_pid=$!
       
      "$MB" --user=msandbox --password=msandbox --socket="$SOCK" --force \
            < "$WORKLOAD_PIPE" > /tmp/session5_workload_client.log 2>&1 &
      client_pid=$!
       
      # Sample binlog count every 30s for 300s during the workload.
      echo ""
      echo "--- sampling every 30s during continued workload ---"
      echo "T+<sec>  binlogs  oldest"
      for t in 30 60 90 120 150 180 210 240 270 300; do
          sleep 30
          n=$(count_binlogs)
          o=$(oldest_binlog)
          printf "T+%3ds   %5d    %s\n" "$t" "$n" "$o"
      done
       
      # Wait for writer + client to stop. The writer exits when its while loop
      # ends (after 300s). Give the client a chance to drain.
      wait "$writer_pid" 2>/dev/null || true
      sleep 2
      # If the client is still reading, we've hit the end-of-FIFO and it will
      # exit on QUIT.
      wait "$client_pid" 2>/dev/null || true
      rm -f "$WORKLOAD_PIPE"
       
      echo ""
      echo "--- workload finished, let bg thread settle 30s ---"
      sleep 30
      log_sample "30s post-workload"
       
      post_workload_count=$(count_binlogs)
      post_workload_oldest=$(oldest_binlog)
       
      echo ""
      echo "==============================================="
      echo "  observations"
      echo "==============================================="
      echo "stuck snapshot at T+60s idle:   $stuck_at_60s_count files, oldest=$stuck_at_60s_oldest"
      echo "end of 300s workload + settle:  $post_workload_count files, oldest=$post_workload_oldest"
       
      # Did the oldest advance during the workload?
      if [[ "$post_workload_oldest" != "$stuck_at_60s_oldest" ]]; then
          echo ""
          echo "OBSERVATION: oldest file advanced during the workload."
          echo "Continued writes (with natural rotation) DID drain the stuck state."
      else
          echo ""
          echo "OBSERVATION: oldest file UNCHANGED after 300s of continued writes."
          echo "This is the stronger pathology: writes alone do not drain the stuck"
          echo "state. The stuck head of binlog_xid_count_list is resistant even"
          echo "to continuous LSN advancement and natural binlog rotation."
          echo ""
          echo "Kicking with FLUSH LOGS + PURGE to confirm unblock still works..."
          mysql_do "FLUSH LOGS;" > /dev/null
          target=$(mysql_q "SHOW MASTER STATUS" | awk '{print $1}')
          mysql_do "PURGE BINARY LOGS TO '$target';" > /dev/null
          sleep 2
          log_sample "post-kick"
      fi
       
      # Cleanup marker table.
      mysql_do "USE xlt; DROP TABLE IF EXISTS leak_marker2;" > /dev/null 2>&1
       
      echo ""
      echo "done at: $(date -Is)"
      
      

      Example output:

      >   | tee MariaDB-build/session5_purge_then_workload_run1.log
      ===============================================
        session 5 extension: PURGE + continued workload
        2026-05-05T18:02:49+00:00
      ===============================================
       
      --- clean baseline ---
      baseline         binlogs=1      oldest=mysql-bin.000018
       
      --- production-matching settings ---
      iflatx  iflat_to        sbl     max_bl
      2       1       0       1048576
       
      --- generating the stuck state (5 rotations) ---
      after rotations  binlogs=6      oldest=mysql-bin.000018
       
      --- FLUSH LOGS + PURGE (expected: no-op) ---
      target: mysql-bin.000024
      right after PURGE  binlogs=7      oldest=mysql-bin.000018
       
      --- verifying stuck for 60s of idle (no workload, no FLUSH) ---
      idle T+30s       binlogs=7      oldest=mysql-bin.000018
      idle T+60s       binlogs=7      oldest=mysql-bin.000018
      stuck snapshot at T+60s: 7 binlogs, oldest=mysql-bin.000018
       
      --- starting 300s continuous mixed-engine workload (NO FLUSH LOGS) ---
       
      --- sampling every 30s during continued workload ---
      T+<sec>  binlogs  oldest
      T+ 30s      14    mysql-bin.000018
      T+ 60s      21    mysql-bin.000018
      T+ 90s      29    mysql-bin.000018
      T+120s      36    mysql-bin.000018
      T+150s      44    mysql-bin.000018
      T+180s      51    mysql-bin.000018
      T+210s      59    mysql-bin.000018
      T+240s      66    mysql-bin.000018
      T+270s      73    mysql-bin.000018
      T+300s      81    mysql-bin.000018
       
      --- workload finished, let bg thread settle 30s ---
      30s post-workload  binlogs=81     oldest=mysql-bin.000018
       
      ===============================================
        observations
      ===============================================
      stuck snapshot at T+60s idle:   7 files, oldest=mysql-bin.000018
      end of 300s workload + settle:  81 files, oldest=mysql-bin.000018
       
      OBSERVATION: oldest file UNCHANGED after 300s of continued writes.
      This is the stronger pathology: writes alone do not drain the stuck
      state. The stuck head of binlog_xid_count_list is resistant even
      to continuous LSN advancement and natural binlog rotation.
       
      Kicking with FLUSH LOGS + PURGE to confirm unblock still works...
      post-kick        binlogs=1      oldest=mysql-bin.000099
       
      done at: 2026-05-05T18:09:25+00:00
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            dotmanila Jervin R
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.