Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31688

Corrupt MyISAM Index with certain myisam_sort_buffer_size

    XMLWordPrintable

Details

    Description

      MyISM table index becomes corrupted and table unusable.
      You can either do "load data infile" and the resultant MYI index file will be corrupted or you can do myisamchk and the resultant MYI index file will become corrupted.

      I have found that different myisam_sort_buffer_size causes different results where some are just find and others are corrupted.

      I have written a bash script that demonstrates the problem and can be run inside the official Docker MariaDB.
      I have found that I need about 550,000,000 records in a table, a column with an int that has an index on it, and a large myisam_sort_buffer_size to produce the problem.

      I have not found a version of MariaDB where this works correctly, but I did not try very old versions.

      % cat a.bash
      {{#! /usr/bin/env bash
      #

      1. © 2023 REDX LLC
        #
      2. @author David McLaughlin <david.mclaughlin@redx.com>
        #

      mkfifo /tmp/somefifo

      MARIADB='mariadb -u root -pmy-secret-pw'

      1. 6G works
      2. $MARIADB -e 'set global myisam_sort_buffer_size=6000000000'
      3. 7G fails
        $MARIADB -e 'set global myisam_sort_buffer_size=7000000000'
        $MARIADB -e 'drop database if exists a'
        $MARIADB -e 'create database a'
        $MARIADB -e 'create table a.a(a int not null, key a(a)) engine=myisam default charset=latin1 collate=latin1_general_ci'

      lines=0
      while (($lines < 550000000))
      do
      echo -e '0\n1'
      ((lines+=2))
      done > /tmp/somefifo &

      $MARIADB -e 'load data infile "/tmp/somefifo" into table a.a'
      wait

      rm /tmp/somefifo

      $MARIADB -e 'select count, a from a.a group by a'
      $MARIADB -e 'drop table a.a'
      $MARIADB -e 'drop database a'}}

      The output for the failure is:
      {{------------+

      count a

      ------------+

      101 0
      177777780 1

      ------------+
      }}
      The output for the correct one:
      {{root@a94ad6d650ce:/# time bash a.bash
      ------------+

      count a

      ------------+

      275000000 0
      275000000 1

      ------------+

      real 48m48.884s
      user 33m28.604s
      sys 4m54.678s
      root@a94ad6d650ce:/#
      }}

      To reproduce the problem you can run build/run a docker instance using this command:
      docker run --detach --name some-mariadb --env MARIADB_USER=example-user --env MARIADB_PASSWORD=my_cool_secret --env MARIADB_ROOT_PASSWORD=my-secret-pw mariadb:latest

      Then I connnect to the container using
      docker exec -it some-mariadb bash
      put the bash script in with
      cat > a.bash
      past Bash script from above
      and then run it with
      bash a.bash

      Please contact me if you have any questions.
      David McLaughlin

      Attachments

        Activity

          People

            Unassigned Unassigned
            davidmclaughlin David McLaughlin
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.