Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5248

Serious incompatibility and data corruption of DATETIME and DATE types due to get_innobase_type_from_mysql_type refactor combined with InnoDB Online DDL

    XMLWordPrintable

Details

    Description

      Note that this issue has been discussed somewhat in MDEV-4024, but this is a wider problem and not a duplicate.

      Per the discussion in MDEV-4024, get_innobase_type_from_mysql_type was refactored in the following commits:

      In addition to the incompatibilities in TINYINT UNSIGNED, ENUM, SET, and YEAR described in MDEV-4024, the DATETIME and DATE types are affected in a much more serious way.

      Using DATETIME as an example (and the MySQL 5.5 code, but it doesn't actually matter), Field_datetime returns HA_KEYTYPE_ULONGLONG as its key_type:

      And inside InnoDB in MySQL 5.5, it is mapped to DATA_INT in get_innobase_type_from_mysql_type:

      However, in MySQL 5.5, it is NOT marked as DATA_UNSIGNED despite it being treated by the handler as HA_KEYTYPE_ULONGLONG. This should perhaps be considered a bug in MySQL/InnoDB, but it is ancient history at this point, and cannot be changed without extensive consideration.

      In MariaDB since the refactor of get_innobase_type_from_mysql_type which occurred in the 5.3 series (and is now included in 5.5 and 10.0), DATETIME is now handled by the generic case for HA_KEYTYPE_ULONGLONG, which does set DATA_UNSIGNED (because this code is shared with BIGINT UNSIGNED, of course):

      InnoDB's on-disk integer format for all sizes reverse the meaning of the sign bit and flips it with XOR on write and read (to allow the big endian byte representation of integers to sort correctly as binary data):

      Nothing breaks immediately because of this, due to the fact that get_innobase_type_from_mysql_type is largely used only for table creation.

      Where this breaks badly, is if InnoDB Online DDL is ever exercised on such a table. In that case the new table will be created using MariaDB's definition of these columns rather than MySQL's. When the values are copied from one table to another, it will be done using buffers inside InnoDB without reinterpreting the values, which will cause them to become corrupted in the destination table.

      Additionally, the same problem exists in reverse with tables created on MariaDB during cross-grade to MySQL 5.6 – the exact same misinterpretation will occur.

      Unfortunately there is not a very good way to determine whether any given table or column may be corrupted or how to properly interpret its values. Furthermore once corruption has occurred, single tables may contain a mix of corrupted-sign-bit and uncorrupted records, making either upgrade or downgrade impossible to resolve the situation.

      How to repeat

      # On a version of MySQL or any older release prior to the refactoring in 5.3:
       
      set session time_zone="+00:00";
       
      drop table if exists t_dt;
      create table t_dt (
        vc varchar(100) not null,
        dt datetime not null,
        primary key (vc)
      ) engine=innodb; 
       
      insert into t_dt (vc, dt)
      values
        ("0000-00-00 00:00:00", "0000-00-00 00:00:00"),
        ("0450-12-31 12:34:56", "0450-12-31 12:34:56"),
        ("2013-11-04 12:34:56", "2013-11-04 12:34:56"),
        ("9876-11-04 12:34:56", "9876-11-04 12:34:56"),
        ("9999-12-31 23:59:59", "9999-12-31 23:59:59");
       
      drop table if exists t_d;
      create table t_d (
        vc varchar(100) not null,
        d date not null,
        primary key (vc)
      ) engine=innodb; 
       
      insert into t_d (vc, d)
      values
        ("0000-00-00", "0000-00-00"),
        ("0450-12-31", "0450-12-31"),
        ("2013-11-04", "2013-11-04"),
        ("9876-11-04", "9876-11-04"),
        ("9999-12-31", "9999-12-31");
       
      # Verify that all values are correct:
       
      set session time_zone="+00:00";
      select * from t_dt;
      select * from t_d;
       
          # +---------------------+---------------------+
          # | vc                  | dt                  |
          # +---------------------+---------------------+
          # | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
          # | 0450-12-31 12:34:56 | 0450-12-31 12:34:56 |
          # | 2013-11-04 12:34:56 | 2013-11-04 12:34:56 |
          # | 9876-11-04 12:34:56 | 9876-11-04 12:34:56 |
          # | 9999-12-31 23:59:59 | 9999-12-31 23:59:59 |
          # +---------------------+---------------------+
       
          # +------------+------------+
          # | vc         | d          |
          # +------------+------------+
          # | 0000-00-00 | 0000-00-00 |
          # | 0450-12-31 | 0450-12-31 |
          # | 2013-11-04 | 2013-11-04 |
          # | 9876-11-04 | 9876-11-04 |
          # | 9999-12-31 | 9999-12-31 |
          # +------------+------------+
       
      # Upgrade to MariaDB 10.0 (10.0.4 in this case), using normal process.
       
      ## Exercise for the reader.
       
      # Verify that all values are correct:
       
      set session time_zone="+00:00";
      select * from t_dt;
      select * from t_d;
       
      # Omitted.
       
      # Check the InnoDB internal type for the columns:
       
      select
        innodb_sys_tables.name as table_name,
        innodb_sys_columns.name as column_name,
        prtype & 0xff as mysql_type,
        mtype as innodb_type,
        prtype & 0x200 = 0x200 as is_unsigned
      from information_schema.innodb_sys_columns
      join information_schema.innodb_sys_tables using (table_id)
      where innodb_sys_tables.name like "test/t_%"
        and innodb_sys_columns.name not in ("vc", "x");
       
          # +------------+-------------+------------+-------------+-------------+
          # | table_name | column_name | mysql_type | innodb_type | is_unsigned |
          # +------------+-------------+------------+-------------+-------------+
          # | test/t_d   | d           |         10 |           6 |           0 |
          # | test/t_dt  | dt          |         12 |           6 |           0 |
          # +------------+-------------+------------+-------------+-------------+
       
      # Issue some ALTER that will use InnoDB Online DDL:
       
      alter table t_dt add x int;
      alter table t_d add x int;
       
      # Notice that values are corrupted:
       
      set session time_zone="+00:00";
      select * from t_dt;
      select * from t_d;
       
          # +---------------------+---------------------+------+
          # | vc                  | dt                  | x    |
          # +---------------------+---------------------+------+
          # | 0000-00-00 00:00:00 | 203-68-54 77:58:08 | NULL |
          # | 0450-12-31 12:34:56 | 653-80-85 89:92:64 | NULL |
          # | 2013-11-04 12:34:56 | 216-79-58 89:92:64 | NULL |
          # | 9876-11-04 12:34:56 | 79-79-58 89:92:64 | NULL |
          # | 9999-12-31 23:59:59 | 02-80-86 01:17:67 | NULL |
          # +---------------------+---------------------+------+
       
          # +------------+------------+------+
          # | vc         | d          | x    |
          # +------------+------------+------+
          # | 0000-00-00 | @384-00-00 | NULL |
          # | 0450-12-31 | @834-12-31 | NULL |
          # | 2013-11-04 | B397-11-04 | NULL |
          # | 9876-11-04 | J260-11-04 | NULL |
          # | 9999-12-31 | J383-12-31 | NULL |
          # +------------+------------+------+
       
      # Notice that the types have changed:
       
      select
        innodb_sys_tables.name as table_name,
        innodb_sys_columns.name as column_name,
        prtype & 0xff as mysql_type,
        mtype as innodb_type,
        prtype & 0x200 = 0x200 as is_unsigned
      from information_schema.innodb_sys_columns
      join information_schema.innodb_sys_tables using (table_id)
      where innodb_sys_tables.name like "test/t_%"
        and innodb_sys_columns.name not in ("vc", "x");
       
        # +------------+-------------+------------+-------------+-------------+
        # | table_name | column_name | mysql_type | innodb_type | is_unsigned |
        # +------------+-------------+------------+-------------+-------------+
        # | test/t_d   | d           |         10 |           6 |           1 |
        # | test/t_dt  | dt          |         12 |           6 |           1 |
        # +------------+-------------+------------+-------------+-------------+

      Attachments

        Issue Links

          Activity

            People

              serg Sergei Golubchik
              jeremycole Jeremy Cole
              Votes:
              2 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.