MariaDB Server / MDEV-6605

Multiple Clients Inserting Causing Error: Failed to read auto-increment value from storage engine

Details

    Description

      I have a table partitioned by range, created with the TokuDB engine on MariaDB 10.0.12. With one client inserting into this table, everything works as expected. However, as soon as a second client starts inserting into the same table, I get the following error: Failed to read auto-increment value from storage engine. When I check the table status, the Auto_increment value is 2049. My auto-increment column is declared as "id bigint not null auto_increment".
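
      The auto-increment state can be checked like this (a sketch for reference, using the repro table defined below; error 1467, ER_AUTOINC_READ_FAILED, is the server error that carries this message):

      SHOW TABLE STATUS LIKE 'auto_inc_test';
      -- Auto_increment reads 2049 here, while the second client's INSERT fails with:
      -- ERROR 1467 (HY000): Failed to read auto-increment value from storage engine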

      I can reproduce this problem with the table and Java client code below. Another TokuDB user on the tokudb-user group ran the same test on Percona Server 5.6, where it did not fail: https://groups.google.com/forum/#!topic/tokudb-user/2pnjQxuuvUo

      CREATE TABLE `auto_inc_test` (
        `id` bigint(20) NOT NULL AUTO_INCREMENT,
        `time` bigint(20) NOT NULL,
        PRIMARY KEY (`time`,`id`)
      ) ENGINE=TokuDB AUTO_INCREMENT=1001 DEFAULT CHARSET=latin1 `compression`='tokudb_zlib';

      ALTER TABLE auto_inc_test PARTITION BY RANGE (time) (
          PARTITION p0 VALUES LESS THAN (1507495611880)
      );

      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.sql.SQLException;
      import java.util.concurrent.CountDownLatch;

      import javax.sql.DataSource;

      import com.mysql.jdbc.jdbc2.optional.MysqlDataSource;

      // Two threads share one DataSource and batch-insert concurrently;
      // with the partitioned TokuDB table above, the second client fails.
      public class MultipleClientsInsertingDataIntoPartitionedTokudbTable {
       
          public static final String INSERT_STATEMENT = "INSERT INTO test.auto_inc_test (time) values (?)";
       
          private final CountDownLatch countDownLatch;
       
          public MultipleClientsInsertingDataIntoPartitionedTokudbTable() {
              countDownLatch = new CountDownLatch(2);
          }
       
          public static void main(String[] args) throws InterruptedException {
              MultipleClientsInsertingDataIntoPartitionedTokudbTable multipleClientsInsertingDataIntoPartitionedTokudbTable = new MultipleClientsInsertingDataIntoPartitionedTokudbTable();
              multipleClientsInsertingDataIntoPartitionedTokudbTable.start();
          }
       
          private void start() throws InterruptedException {
              DataSource dataSource = createDataSource();
              new Thread(new InsertRunnable(dataSource)).start();
              new Thread(new InsertRunnable(dataSource)).start();
              countDownLatch.await();
          }
       
          private DataSource createDataSource() {
              String jdbcUrl = "jdbc:mysql://xxx.xxx.xxx.xxx/test?tcpNoDelay=true&tcpKeepAlive=true&rewriteBatchedStatements=true";
              String username = "user";
              String password = "password";
              MysqlDataSource mysqlDataSource = new MysqlDataSource();
              mysqlDataSource.setURL(jdbcUrl);
              mysqlDataSource.setUser(username);
              mysqlDataSource.setPassword(password);
              return mysqlDataSource;
          }
       
          private void insertData(DataSource dataSource) {
              try (Connection connection = dataSource.getConnection();
                      PreparedStatement preparedStatement = connection
                              .prepareStatement(INSERT_STATEMENT)) {
       
                  connection.setAutoCommit(false);
                  for (int i = 0; i < 1000; i++) {
                      preparedStatement.setLong(1, System.currentTimeMillis());
                      preparedStatement.addBatch();
                  }
                  preparedStatement.executeBatch();
                  // autocommit was disabled above, so commit explicitly;
                  // otherwise the transaction keeps its locks until the connection closes
                  connection.commit();
       
              } catch (SQLException e) {
                  e.printStackTrace();
              }
          }
       
          private class InsertRunnable implements Runnable {
       
              private final DataSource dataSource;
       
              private InsertRunnable(DataSource dataSource) {
                  this.dataSource = dataSource;
              }
       
              @Override
              public void run() {
                  for (int i = 0; i < 1000; i++) {
                      insertData(dataSource);
                  }
       
                  countDownLatch.countDown();
              }
          }
      }


        Activity

          serg Sergei Golubchik added a comment -

          prohaska7, thanks. Could you elaborate a bit, please? Where is the TokuDB lock on the PK taken, and on what code path is it taken before the partition auto-inc mutex?

          prohaska7 Rich Prohaska added a comment - edited

          Given: create table mdev6605 (t bigint not null, id bigint not null auto_increment, primary key (t,id)) engine=tokudb partition by range (t) (partition p0 values less than (1));

          t1: begin; insert into mdev6605 (t) values (0);
          t1 now holds a lock on the PK. The PK lock is taken via the call stack ha_tokudb::index_read, handler::index_read_map, handler::get_auto_increment, ha_partition::get_auto_increment.

          t2: begin; insert into mdev6605 (t) values (0);
          t2 holds the partition auto-inc mutex and is blocked on the TokuDB PK lock.

          t1: insert into mdev6605 (t) values (0);
          t1 is blocked on the partition auto-inc mutex.

          Eventually t2 times out in TokuDB's lock manager and rolls back, which breaks the deadlock.
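
          The same interleaving as an mtr-style script (a sketch of the interleaving only, not a complete test; it uses the mdev6605 table above and send so the blocking insert doesn't stall the script):

          connection default;
          start transaction;
          insert into mdev6605 (t) values (0);       -- t1 takes the tokudb pk lock
          connect (con2,localhost,root);
          start transaction;
          send insert into mdev6605 (t) values (0);  -- t2 grabs the partition auto-inc mutex, blocks on the pk lock
          connection default;
          insert into mdev6605 (t) values (0);       -- t1 blocks on the auto-inc mutex: the deadlock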


          serg Sergei Golubchik added a comment -

          Thanks. Simple test case without partitioning:

          create table t1 (a int auto_increment, b bigint(20), primary key (b,a)) engine=tokudb;
          start transaction;
          insert t1 (b) values (1);
          connect(con2,localhost,root);
          start transaction;
          insert t1 (b) values (1); -- crash!

          That means the partitioning auto-inc mutex is not the real issue here. handler::get_auto_increment() crashes because it doesn't expect an error, but, I believe, a lock timeout is a perfectly valid status and should be expected. Compare with

          create table t1 (a int auto_increment, b bigint(20), primary key (b,a)) engine=tokudb;
          start transaction;
          insert t1 select max(a)+1, 1 from t1 where b=1;
          connect(con2,localhost,root);
          start transaction;
          insert t1 select max(a)+1, 1 from t1 where b=1;

          This has the same semantics, and the second insert also times out on a lock wait.
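
          For either test case, the expected non-crashing outcome on the second connection is a lock timeout (a sketch of the likely error; the exact message depends on how TokuDB's lock manager maps the timeout, governed by tokudb_lock_timeout):

          insert t1 select max(a)+1, 1 from t1 where b=1;
          -- after the lock wait times out, something like:
          -- ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction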

          prohaska7 Rich Prohaska added a comment -

          The crash is a debug assertion; on a release build the test case just gets an error.


          serg Sergei Golubchik added a comment -

          True. And the deadlock is caused by the partitioning auto-inc mutex. But the underlying error will happen in debug or release builds, with or without partitioning; see the second test case, the one with insert ... select. The second thread needs to read the value from the index, and the first thread keeps it locked. There can be no workaround: if the first thread doesn't commit, the second will eventually time out.

          I'll fix the deadlock, though, so that the first thread is able to commit.
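
          Once that fix is in, committing the first transaction should let the second one proceed (a sketch, continuing the simple test case above):

          connection default;
          commit;                       -- releases the tokudb pk lock
          connection con2;
          insert t1 (b) values (1);     -- can now read the index and succeed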


          People

            Assignee: serg Sergei Golubchik
            Reporter: kp1dsq Can Can
            Votes: 0
            Watchers: 5
