Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17401

LOAD DATA from very big file into MyISAM table results in EOF error and corrupt index

Details

    Description

      A LOAD DATA statement is consistently failing with the following error:

      MariaDB [(none)]> LOAD DATA  LOCAL INFILE '/home/ec2-user/tab.txt' INTO TABLE db1.`tab` IGNORE 0 LINES;
      ERROR 9 (HY000): Unexpected end-of-file found when reading file '/tmp/STWraeLT' (Errcode: 175 "File too short; Expected more data in file")
      

      The resulting table is also corrupt, and cannot seem to be repaired:

      MariaDB [db1]> CHECK TABLE tab;
      +-------------------+-------+----------+------------------------------------------------------------------------+
      | Table             | Op    | Msg_type | Msg_text                                                               |
      +-------------------+-------+----------+------------------------------------------------------------------------+
      | db1.tab | check | error    | Can't read indexpage from filepos: -1                                  |
      | db1.tab | check | Error    | Index for table './db1/tab.MYI' is corrupt; try to repair it |
      | db1.tab | check | error    | Corrupt                                                                |
      +-------------------+-------+----------+------------------------------------------------------------------------+
      3 rows in set (1 min 51.569 sec)
       
      MariaDB [db1]> REPAIR TABLE tab;
      +-------------------+--------+----------+----------------------------------------------------------------------------------------------------------------------------+
      | Table             | Op     | Msg_type | Msg_text                                                                                                                   |
      +-------------------+--------+----------+----------------------------------------------------------------------------------------------------------------------------+
      | db1.tab | repair | Error    | Unexpected end-of-file found when reading file '/tmp/STjk9WiV' (Errcode: 175 "File too short; Expected more data in file") |
      | db1.tab | repair | Error    | Unexpected end-of-file found when reading file '/tmp/STpQXG9A' (Errcode: 175 "File too short; Expected more data in file") |
      | db1.tab | repair | status   | OK                                                                                                                         |
      +-------------------+--------+----------+----------------------------------------------------------------------------------------------------------------------------+
      3 rows in set (36 min 31.957 sec)
       
      MariaDB [db1]> CHECK TABLE tab;
      +-------------------+-------+----------+-------------------------------------------------------------+
      | Table             | Op    | Msg_type | Msg_text                                                    |
      +-------------------+-------+----------+-------------------------------------------------------------+
      | db1.tab | check | error    | Found key at page -1 that points to record outside datafile |
      | db1.tab | check | error    | Corrupt                                                     |
      +-------------------+-------+----------+-------------------------------------------------------------+
      2 rows in set (15.393 sec)
      

      The table has the following definition:

      MariaDB [(none)]> SHOW CREATE TABLE db1.tab\G
      *************************** 1. row ***************************
             Table: tab
      Create Table: CREATE TABLE `tab` (
        `tab_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
        `col2` int(10) unsigned NOT NULL,
        `col3` int(10) unsigned NOT NULL,
        `col4` int(10) unsigned NOT NULL,
        `col5` int(10) unsigned NOT NULL,
        `col6` tinyint(1) NOT NULL,
        `col7` double DEFAULT NULL,
        `col8` varchar(18) DEFAULT NULL,
        PRIMARY KEY (`tab_id`),
        UNIQUE KEY `unique_idx` (`col2`,`col3`,`col4`,`col6`),
        UNIQUE KEY `col8_idx` (`col8`),
        KEY `col3_idx` (`col3`,`col4`),
        KEY `col2_idx` (`col2`)
      ) ENGINE=MyISAM AUTO_INCREMENT=205163534 DEFAULT CHARSET=latin1
      1 row in set (0.005 sec)
      

      The file being loaded into the table is 14 GB large, and it has 205163533 rows in it.

      The only non-default values in the configuration file are:

      log_error=mysqld.err
      myisam_sort_buffer_size = 4G
      bulk_insert_buffer_size = 2G
      

      Attachments

        Issue Links

          Activity

            I also added here patch for 10.3 (can be helpful for merges from 5.5 up)

            sanja Oleksandr Byelkin added a comment - I also added here patch for 10.3 (can be helpful for merges from 5.5 up)

            Index checked in the new version:

            MariaDB [test]> explain select sum(col2) from tab;
            +------+-------------+-------+-------+---------------+----------+---------+------+-----------+-------------+
            | id   | select_type | table | type  | possible_keys | key      | key_len | ref  | rows      | Extra       |
            +------+-------------+-------+-------+---------------+----------+---------+------+-----------+-------------+
            |    1 | SIMPLE      | tab   | index | NULL          | col2_idx | 4       | NULL | 205163533 | Using index |
            +------+-------------+-------+-------+---------------+----------+---------+------+-----------+-------------+
            1 row in set (0.00 sec)
             
            MariaDB [test]> select sum(col2) from tab;
            +-------------+
            | sum(col2)   |
            +-------------+
            | 64923196368 |
            +-------------+
            1 row in set (4 min 34.69 sec)
             
            sanja@SanjaLaptop:~/Downloads$ cat  motif_feature.txt|perl count.perl 
            64923196368
            

            sanja Oleksandr Byelkin added a comment - Index checked in the new version: MariaDB [test]> explain select sum(col2) from tab; +------+-------------+-------+-------+---------------+----------+---------+------+-----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+----------+---------+------+-----------+-------------+ | 1 | SIMPLE | tab | index | NULL | col2_idx | 4 | NULL | 205163533 | Using index | +------+-------------+-------+-------+---------------+----------+---------+------+-----------+-------------+ 1 row in set (0.00 sec)   MariaDB [test]> select sum(col2) from tab; +-------------+ | sum(col2) | +-------------+ | 64923196368 | +-------------+ 1 row in set (4 min 34.69 sec)   sanja@SanjaLaptop:~/Downloads$ cat motif_feature.txt|perl count.perl 64923196368

            revision-id: f95616dc4774328381044b53263d6a23ae19bfb4 (mariadb-5.5.61-39-gf95616dc477)
            parent(s): 893ebb739e24637985892da1555f2e6f1a8ac5b6
            author: Oleksandr Byelkin
            committer: Oleksandr Byelkin
            timestamp: 2018-10-30 19:36:17 +0100
            message:

            MDEV-17401: LOAD DATA from very big file into MyISAM table results in EOF error and corrupt index

            my_read fixed as in higher versions.
            my_pread made as my_read aware of partial read of huge chunks of files
            MY_FULL_IO enabled for file operations

            sanja Oleksandr Byelkin added a comment - revision-id: f95616dc4774328381044b53263d6a23ae19bfb4 (mariadb-5.5.61-39-gf95616dc477) parent(s): 893ebb739e24637985892da1555f2e6f1a8ac5b6 author: Oleksandr Byelkin committer: Oleksandr Byelkin timestamp: 2018-10-30 19:36:17 +0100 message: MDEV-17401 : LOAD DATA from very big file into MyISAM table results in EOF error and corrupt index my_read fixed as in higher versions. my_pread made as my_read aware of partial read of huge chunks of files MY_FULL_IO enabled for file operations —

            Issue reoccured in 10.3.12

            Richard Richard Stracke added a comment - Issue reoccured in 10.3.12

            OK, the problem in 10.3 is that the flag MY_FULL_IO is not set in merge_buffers()

            sanja Oleksandr Byelkin added a comment - OK, the problem in 10.3 is that the flag MY_FULL_IO is not set in merge_buffers()

            People

              sanja Oleksandr Byelkin
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.