Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21230

Corrupt database with MariaDB 10.4.10 (see: MDEV-20987)

Details

    Description

      We are still experiencing the issue first reported with MDEV-20987, even with MariaDB 10.4.10. After MariaDB and/or the server is rebooted, we see errors like "Table 'relution.scheduler_task' doesn't exist in engine".

      Steps for reproduction
      Base installation

      • Create new virtual machine
        • "CentOS 7.x / MariaDB 10.4.x"
        • Linux
        • Red Hat (64-bit)
        • 4096 MB base memory
        • 20 GB SSD
      • Install CentOS
        • CentOS-7-x86_64-Minimal-1804.iso
        • Minimal Install
        • Automatic partitioning
        • Wired network
        • Create user during installation ("relution")
      • Reboot

      Post-Installation steps

      • yum -y update
      • lsb_release -a
        • LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
        • Distributor ID: CentOS
        • Description: CentOS Linux release 7.7.1908 (Core)
        • Release: 7.7.1908
        • Codename: Core
      • Reboot

      Install MariaDB

      • Create /etc/yum.repos.d/MariaDB.repo

        # MariaDB 10.4 CentOS repository list - created 2019-12-05 10:15 UTC
        # http://downloads.mariadb.org/mariadb/repositories/
        [mariadb]
        name = MariaDB
        baseurl = http://yum.mariadb.org/10.4/centos7-amd64
        gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
        gpgcheck=1
        

      • yum -y install mariadb-server mariadb
      • mariadb --version

        mariadb  Ver 15.1 Distrib 10.4.10-MariaDB, for Linux (x86_64) using readline 5.1
        

      Configure MariaDB for application

      • Create /etc/my.cnf.d/relution.cnf

        [mysqld]
        collation-server=utf8mb4_general_ci
        character-set-server=utf8mb4
        max_allowed_packet=1G
        innodb_file_per_table=1
        

      • systemctl start mariadb.service
      • systemctl enable mariadb.service

      Install application

      Create initial database

      • Create /opt/create_database.sh =>

        #!/bin/bash
        if [[ $# -ne 2 ]]; then
            echo "Usage: $0 <database> <password>"
            exit 1
        fi
         
        DATABASE=$1
        PASSWORD=$2
         
        mysql -u root <<EOF
        CREATE DATABASE $DATABASE CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_general_ci';
        CREATE USER 'relution'@'localhost' IDENTIFIED BY '$PASSWORD';
        GRANT ALL PRIVILEGES ON $DATABASE.* TO 'relution'@'localhost' IDENTIFIED BY '$PASSWORD';
        CREATE USER 'relution'@'%' IDENTIFIED BY '$PASSWORD';
        GRANT ALL PRIVILEGES ON $DATABASE.* TO 'relution'@'%' IDENTIFIED BY '$PASSWORD';
        FLUSH PRIVILEGES;
        EOF
        

      • chmod +x create_database.sh
      • ./create_database.sh relution password

      Minimal application configuration

      • Create /opt/relution/application.yml =>

        relution:
          server:
            externalURL: https://www.example.org
          database:
            type: mysql
            url: jdbc:mariadb://127.0.0.1:3306/relution?useServerPrepStmts=true
            username: relution
            password: password
        

      Start application

      • sudo -u relution ./bin/relution-foreground.sh
      • Wait until database has been initialized
      • "INFO 10033 [ main] vice.StandardLockService: Successfully released change log lock"
      • Stop application (Ctrl+C)

      You can now start and stop the application, the database is working as expected.

      Restart MariaDB

      • systemctl stop mariadb.service
      • systemctl start mariadb.service

      Attempt to start application again

      • sudo -u relution ./bin/relution-foreground.sh
        ==> The database is now broken

      2019-12-05 11:37:36.146 ERROR 10335 [        main] s.boot.SpringApplication: Application run failed []
      java.sql.SQLException: Table 'relution.scheduler_task' doesn't exist in engine
      Query is: SELECT t0.`uuid`, t1.`scheduler_task_uuid`, t0.`active`, t0.`clustered`, t0.`data`, t0.`description`, t0.`last_success_date`, t0.`name`, t0.`run_as_user`, t0.`schedulable_type`, t0.`schedulable_uuid`, t0.`version`, t1.`expression`, t1.`period_number`, t1.`period_type`, t1.`start` FROM `scheduler_task` t0, `scheduler_task_period` t1 WHERE (t1.`scheduler_task_uuid` = t0.`uuid`)
      <…>
      Wrapped by: org.eclipse.persistence.exceptions.DatabaseException: 
      Internal Exception: java.sql.SQLSyntaxErrorException: (conn=18) Table 'relution.scheduler_task' doesn't exist in engine
      Error Code: 1932
      

      Attachments

        1. docker-compose.yml
          1 kB
        2. mariadb.log
          3 kB
        3. relution.cnf
          0.1 kB

        Issue Links

          Activity

            This could be related to MDEV-21088 and MDEV-21148.

            marko Marko Mäkelä added a comment - This could be related to MDEV-21088 and MDEV-21148 .
            m.pfeiffer Markus added a comment -

            @marko, I used the Docker setup from #comment-140997 just now to produce the server log file I attached in my previous comment. So this was a "fresh install" of MariaDB 10.4.10 with a brand new database. The issue can be easily reproduced with the steps in the initial description or by using Docker for both MariaDB and our product.

            I know the initial description is pretty lengthy, but basically it boils down to:

            1. Install OS
            2. Install MariaDB
            3. Install our service
            4. Start our service for the first time
            5. Let it initialize its database (create tables, fill in some default content)
            6. Stop our service
            7. Restart database
              ==> Can no longer start our service because database is now corrupt

            The Docker setup does pretty much the same thing, only a lot faster, since there is no need to install anything. The issue occurs with both MariaDB 10.4.10 and 10.4.9, but does not occur with 10.4.11. Once corrupt, updating the MariaDB version does not fix the issue. So this is a "fresh install" each time, by using "docker-compose down --volumes" between tests to remove the containers and associated volumes.

            m.pfeiffer Markus added a comment - @marko, I used the Docker setup from #comment-140997 just now to produce the server log file I attached in my previous comment. So this was a "fresh install" of MariaDB 10.4.10 with a brand new database. The issue can be easily reproduced with the steps in the initial description or by using Docker for both MariaDB and our product. I know the initial description is pretty lengthy, but basically it boils down to: Install OS Install MariaDB Install our service Start our service for the first time Let it initialize its database (create tables, fill in some default content) Stop our service Restart database ==> Can no longer start our service because database is now corrupt The Docker setup does pretty much the same thing, only a lot faster, since there is no need to install anything. The issue occurs with both MariaDB 10.4.10 and 10.4.9, but does not occur with 10.4.11. Once corrupt, updating the MariaDB version does not fix the issue. So this is a "fresh install" each time, by using "docker-compose down --volumes" between tests to remove the containers and associated volumes.
            m.pfeiffer Markus added a comment -

            Alright, I've used our service's logs to get the SQL statements issued by Liquibase to set up the database. Using the statements for one of the affected tables, I'm now able to reproduce the issue manually.

            1. Start MariaDB (docker-compose up)
            2. Connect with a client
            3. Execute the SQL commands below
            4. Restart MariaDB (Ctrl+C, docker-compose stop, docker-compose up)
            5. The table is now inaccessible

            Repeating the same steps with MariaDB 10.4.11 does no longer see the same issue.

            docker-compose.yml

            Show all

            version: '2.1'
            services:
              mariadb:
                image: mariadb:10.4.10
                container_name: testbox_mariadb
                environment:
                  - MYSQL_DATABASE=relution
                  - MYSQL_ROOT_PASSWORD=S3creTM4r14
                  - MYSQL_USER=relution
                  - MYSQL_PASSWORD=relution
                expose:
                  - '3306'
                ports:
                  - '3307:3306'
                volumes:
                  - './mariadb.cnf:/etc/mysql/conf.d/mariadb.cnf'
            

            mariadb.cnf

            Show all

            [mysqld]
            collation-server=utf8mb4_general_ci
            character-set-server=utf8mb4
            max_allowed_packet=1G
            innodb_file_per_table=1
            

            SQL

            CREATE DATABASE `corruption`;
             
            CREATE TABLE `corruption`.`scheduler_task` (`uuid` CHAR(36) NOT NULL, `name` VARCHAR(255) NOT NULL, `description` TEXT NULL, `run_as_user` VARCHAR(255) NULL, `active` BIT(1) NULL, `clustered` BIT(1) NULL, `schedulable_type` VARCHAR(255) NOT NULL, `schedulable_uuid` VARCHAR(255) NOT NULL, `last_success_date` timestamp NULL, `data` BLOB NULL, `bundle` VARCHAR(255) NULL, `version` INT DEFAULT 1 NOT NULL, `application_uuid` CHAR(36) NULL, CONSTRAINT `PK_SCHEDULER_TASK` PRIMARY KEY (`uuid`)) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
            ALTER TABLE `corruption`.`scheduler_task` MODIFY `last_success_date` timestamp NULL;
            ALTER TABLE `corruption`.`scheduler_task` ROW_FORMAT = DYNAMIC;
            ALTER TABLE `corruption`.`scheduler_task` DROP COLUMN `application_uuid`;
            ALTER TABLE `corruption`.`scheduler_task` DROP COLUMN `bundle`;
            

            m.pfeiffer Markus added a comment - Alright, I've used our service's logs to get the SQL statements issued by Liquibase to set up the database. Using the statements for one of the affected tables, I'm now able to reproduce the issue manually. Start MariaDB (docker-compose up) Connect with a client Execute the SQL commands below Restart MariaDB (Ctrl+C, docker-compose stop, docker-compose up) The table is now inaccessible Repeating the same steps with MariaDB 10.4.11 does no longer see the same issue. docker-compose.yml Show all version: '2.1' services: mariadb: image: mariadb:10.4.10 container_name: testbox_mariadb environment: - MYSQL_DATABASE=relution - MYSQL_ROOT_PASSWORD=S3creTM4r14 - MYSQL_USER=relution - MYSQL_PASSWORD=relution expose: - '3306' ports: - '3307:3306' volumes: - './mariadb.cnf:/etc/mysql/conf.d/mariadb.cnf' mariadb.cnf Show all [mysqld] collation-server=utf8mb4_general_ci character- set -server=utf8mb4 max_allowed_packet=1G innodb_file_per_table=1 SQL CREATE DATABASE `corruption`;   CREATE TABLE `corruption`.`scheduler_task` (`uuid` CHAR(36) NOT NULL, `name` VARCHAR(255) NOT NULL, `description` TEXT NULL, `run_as_user` VARCHAR(255) NULL, `active` BIT(1) NULL, `clustered` BIT(1) NULL, `schedulable_type` VARCHAR(255) NOT NULL, `schedulable_uuid` VARCHAR(255) NOT NULL, `last_success_date` timestamp NULL, `data` BLOB NULL, `bundle` VARCHAR(255) NULL, `version` INT DEFAULT 1 NOT NULL, `application_uuid` CHAR(36) NULL, CONSTRAINT `PK_SCHEDULER_TASK` PRIMARY KEY (`uuid`)) ENGINE=InnoDB ROW_FORMAT=DYNAMIC; ALTER TABLE `corruption`.`scheduler_task` MODIFY `last_success_date` timestamp NULL; ALTER TABLE `corruption`.`scheduler_task` ROW_FORMAT = DYNAMIC; ALTER TABLE `corruption`.`scheduler_task` DROP COLUMN `application_uuid`; ALTER TABLE `corruption`.`scheduler_task` DROP COLUMN `bundle`;

            m.pfeiffer, it looks like your report could be a duplicate of MDEV-21088. The PRIMARY KEY is CHAR(36). If the default character set is UTF-8 or some other variable-length character set, then ROW_FORMAT=DYNAMIC would internally store that as variable length. In the case of utf8mb3, the internally stored length would be 36 to 3*36 bytes.

            Can you confirm the character set of the column uuid, for example by SHOW CREATE TABLE corruption.scheduler_task?

            Free support advice:

            1. Try to use as short keys as possible, especially for InnoDB PRIMARY KEY. Using a text string type for storing hexadecimal digits is an overkill.
            2. Be aware that when instant DROP COLUMN is being used (MDEV-15562), some storage overhead will remain. Today I was working on MDEV-20590 to help control that.
            marko Marko Mäkelä added a comment - m.pfeiffer , it looks like your report could be a duplicate of MDEV-21088 . The PRIMARY KEY is CHAR(36) . If the default character set is UTF-8 or some other variable-length character set, then ROW_FORMAT=DYNAMIC would internally store that as variable length. In the case of utf8mb3 , the internally stored length would be 36 to 3*36 bytes. Can you confirm the character set of the column uuid , for example by SHOW CREATE TABLE corruption.scheduler_task ? Free support advice: Try to use as short keys as possible, especially for InnoDB PRIMARY KEY . Using a text string type for storing hexadecimal digits is an overkill. Be aware that when instant DROP COLUMN is being used ( MDEV-15562 ), some storage overhead will remain. Today I was working on MDEV-20590 to help control that.
            m.pfeiffer Markus added a comment -

            It does indeed look like a duplicate of that issue. We use utf8mb4 for our database since we need to support emojis.

            We rarely drop columns for backwards compatibility reasons, which would explain why <2% of our tables were affected. We use strings for legacy/compatibility reasons, not sure how feasible it is to change that unfortunately. I suppose it might be possible to change the character set of our key columns though. I'll open an internal ticket to investigate that.

            m.pfeiffer Markus added a comment - It does indeed look like a duplicate of that issue. We use utf8mb4 for our database since we need to support emojis. We rarely drop columns for backwards compatibility reasons, which would explain why <2% of our tables were affected. We use strings for legacy/compatibility reasons, not sure how feasible it is to change that unfortunately. I suppose it might be possible to change the character set of our key columns though. I'll open an internal ticket to investigate that.

            People

              thiru Thirunarayanan Balathandayuthapani
              m.pfeiffer Markus
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.