[MDEV-26276] Significant data corruption after dropping/adding database - Jira

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.5.11
Fix Version/s: None
Component/s: Server
Labels:
- regression
Environment:

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic

dpkg -l | grep maria
ii libmariadb-dev 1:10.5.11+maria~bionic amd64 MariaDB database development files
ii libmariadb3:amd64 1:10.5.11+maria~bionic amd64 MariaDB database client library
ii libmariadbclient18 1:10.5.11+maria~bionic amd64 Virtual package to satisfy external libmariadbclient18 depends
ii mariadb-client 1:10.5.11+maria~bionic all MariaDB database client (metapackage depending on the latest version)
rc mariadb-client-10.1 1:10.1.48-0ubuntu0.18.04.1 amd64 MariaDB database client binaries
ii mariadb-client-10.5 1:10.5.11+maria~bionic amd64 MariaDB database client binaries
ii mariadb-client-core-10.5 1:10.5.11+maria~bionic amd64 MariaDB database core client binaries
ii mariadb-common 1:10.5.11+maria~bionic all MariaDB common configuration files
ii mariadb-server 1:10.5.11+maria~bionic all MariaDB database server (metapackage depending on the latest version)
ii mariadb-server-10.5 1:10.5.11+maria~bionic amd64 MariaDB database server binaries
ii mariadb-server-core-10.5 1:10.5.11+maria~bionic amd64 MariaDB database core server files

Description

I'm observing significant data corruption after dropping and loading a schema. This is so significant that I thought others would have reported it as well. However, I have searched and found nothing. I can reproduce the error using only SQL commands, so it seems legit.

Starting state: the database is up and running well. I have used the C/C++ API to add several entries to the database.
The process writing to the DB is then stopped (I tried both killing it and stopping it cleanly).
Here is one entry in the DB:

MariaDB [sigmaDB]> select * from path;
+-----+-------+------------------+--------------------+----------------------+-----------------+--------------+--------------+-------------------+--------------+----------+-----------+---------+------------+--------------+----------+
| idx | name  | uuid             | primaryCollectUuid | componentId          | sigmaId         | reference_id | wptspacing_m | distanceMarginMet | hdgMarginDeg | approved | completed | aborted | restricted | callbacksSet | priority |
+-----+-------+------------------+--------------------+----------------------+-----------------+--------------+--------------+-------------------+--------------+----------+-----------+---------+------------+--------------+----------+
|   1 | 1:Lot | t��~��N������a▒�            | ���4B#�S�����?            | 14481131123844579333 | 220964525205148 |            0 |          100 |                10 |          360 |        1 |         0 |       0 |          0 |            1 |        0 |
+-----+-------+------------------+--------------------+----------------------+-----------------+--------------+--------------+-------------------+--------------+----------+-----------+---------+------------+--------------+----------+
1 row in set (0.000 sec)

MariaDB [sigmaDB]> select count(*) from path;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.001 sec)

Then I drop the DB and it appears to be really gone

MariaDB [sigmaDB]> drop database sigmaDB;
Query OK, 37 rows affected (0.101 sec)

MariaDB [(none)]> use sigmaDB;
ERROR 1049 (42000): Unknown database 'sigmaDB'

MariaDB [(none)]> select * from path;
ERROR 1046 (3D000): No database selected

MariaDB [(none)]>

Then I reload the schema (without any inserted data)

root@c01045:/sigma# grep -i insert sigmaDB.sql
root@c01045:/sigma# cat sigmaDB.sql | mysql -u sigma -p
Enter password:
root@c01045:/sigma#

Now when I look at the path table in the db, I find data that existed before the previous drop database command:

MariaDB [sigmaDB]> select * from path;
+-----+-------+------------------+--------------------+----------------------+-----------------+--------------+--------------+-------------------+--------------+----------+-----------+---------+------------+--------------+----------+
| idx | name  | uuid             | primaryCollectUuid | componentId          | sigmaId         | reference_id | wptspacing_m | distanceMarginMet | hdgMarginDeg | approved | completed | aborted | restricted | callbacksSet | priority |
+-----+-------+------------------+--------------------+----------------------+-----------------+--------------+--------------+-------------------+--------------+----------+-----------+---------+------------+--------------+----------+
|   1 | 1:Lot | t��~��N������a▒�            | ���4B#�S�����?            | 14481131123844579333 | 220964525205148 |            0 |          100 |                10 |          360 |        1 |         0 |       0 |          0 |            1 |        0 |
+-----+-------+------------------+--------------------+----------------------+-----------------+--------------+--------------+-------------------+--------------+----------+-----------+---------+------------+--------------+----------+
1 row in set (0.000 sec)

Then I restart mariadb service

root@c01045:/sigma# systemctl restart mysql
root@c01045:/sigma#

After the restart of the service, the "phantom row" is now gone from the database

MariaDB [sigmaDB]> select * from path;
Empty set (0.002 sec)

MariaDB [sigmaDB]>

This really makes me think that the drop/reload isn't happening as it should. It seems like some amount of data is being held in memory and not destroyed by the drop and the subsequent schema load. Once the service is brought down and restarted, this cached data is no longer there and the queries work as expected. Note that rolling back to 10.3 removes this behavior.
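
For anyone trying to confirm the behavior, the drop/reload/check sequence above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, the `DB`, `SCHEMA` and `MYSQL` defaults are assumptions based on the transcript, credentials are assumed to come from a client option file rather than the interactive `-p` prompt, and at least one row is assumed to be in the path table before it runs.

```shell
#!/bin/sh
# Sketch only: DB, SCHEMA and MYSQL are assumptions taken from the report,
# and credentials are assumed to come from a client option file (~/.my.cnf).
DB="${DB:-sigmaDB}"
SCHEMA="${SCHEMA:-sigmaDB.sql}"
MYSQL="${MYSQL:-mysql -u sigma}"

# Print the number of rows currently in the path table.
count_path_rows() {
    # -N drops the column header, -B gives plain tab-separated output
    $MYSQL -N -B -e "SELECT COUNT(*) FROM path" "$DB"
}

# Drop the database, reload the schema, and report whether path is empty.
# Returns 0 if empty, 2 if rows survived the drop, 1 on client errors.
check_drop_reload() {
    $MYSQL -e "DROP DATABASE IF EXISTS $DB" || return 1
    $MYSQL < "$SCHEMA" || return 1
    rows=$(count_path_rows) || return 1
    if [ "$rows" -eq 0 ]; then
        echo "ok: path is empty after reload"
    else
        echo "unexpected: path has $rows row(s) after reload"
        return 2
    fi
}
```

Calling `check_drop_reload` once before and once after `systemctl restart mysql` should show whether the leftover row disappears only after the restart, as described above.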

Attachments

Activity

Daniel Black added a comment - 2021-07-30 02:30

Do you have a datadir preserved that you can upload for the private use by the mariadb developers to understand/resolve this issue?

Nathan Jensen added a comment - 2021-08-02 13:45

Daniel,

I think I can provide what you need, but can you be more specific about what you mean by datadir? I'd hate to give you only half of what you need and have to rinse/repeat this process. Do you mean all the files in /var/lib/mysql? Any other log files? At what state do you want the data dir? I could snapshot it before or after the observed corruption. Thanks for the help!

-Nate

Daniel Black added a comment - 2021-08-03 00:57

A snapshot before would be most useful if possible. Everything else can be derived. Yes all of /var/lib/mysql. If you have log files of mariadb that would be good too (from file or journalctl -u mariadb.service from recent restarts). Thanks njensen. Is there an indication that this is a filesystem out of space error?

Nathan Jensen added a comment - 2021-08-03 14:20

Daniel,

I think I have added a useful debug data set for you. Since the problem is trivial to reproduce, I took data dir snapshots at all stages. All of these snaps are contained in the uploaded archive: MDEV-26276_debug_data.tar. When you break these sections out, here is how to align them with the steps I show in the above ticket description:

beforeSchemaLoad.tar.gz – Totally clean/dropped DB. The name of the DB we will eventually work with is called sigmaDB
whileRunning.tar.gz – This is my C/C++ code which loads the schema and populates the table; note the single entry in the path table
afterKill.tar.gz – This is after I killed my C/C++ code. Note that the database seems sane at this point; queries return results that are expected
afterDbDrop.tar.gz – This is after I dropped the database via the command line client (i.e. "drop database sigmaDB"). After this, the DB appears to be gone in the command line client as expected
afterReloadCorrupted.tar.gz – This is after I reload the schema via command line client; note that there should be no data inserted in any table. However, when I query the path table I see a single result
afterRestartGood.tar.gz – This is after I restart the mysql service. At this point, the record in the path table is gone (as it should have been all along) and my DB appears to be back in a sane state
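
A per-stage snapshot like the ones listed above could be taken with a small helper along these lines. This is a sketch rather than the exact commands used for the uploaded archives: `snapshot_datadir` and its arguments are hypothetical names, and `/var/lib/mysql` is the directory mentioned earlier in this thread.

```shell
#!/bin/sh
# Sketch only: snapshot_datadir and its arguments are hypothetical names.
# Archive the contents of a data directory under a stage label.
snapshot_datadir() {
    stage="$1"                      # e.g. afterDbDrop
    datadir="${2:-/var/lib/mysql}"  # directory named in the comments above
    outdir="${3:-.}"                # where the archive is written
    tar -czf "$outdir/$stage.tar.gz" -C "$datadir" .
}
```

For example, `snapshot_datadir afterDbDrop` run as root right after the drop would write `afterDbDrop.tar.gz` to the current directory. Keep the output directory outside the data directory so the archive does not try to include itself.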

Also note that I uploaded the output of the journalctl command in file: MDEV-26276jounalctl

If there is anything else I can provide, please don't hesitate to ask! I appreciate the help.

-Nate

People

Assignee: Unassigned

Reporter: Nathan Jensen

Votes: 0

Watchers: 3

Dates

Created: 2021-07-29 14:40

Updated: 2021-08-03 14:20

MariaDB Server