MariaDB Server / MDEV-35592

ERROR 1062 (23000): Duplicate entry 'NULL' for key 'key0' on CONCAT in sub-query

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 11.5.2, 11.6.2
    • Fix Version/s: 11.8
    • None
    • Environment: Official docker image mariadb:11.6.2
      Also tested with mariadb 11.5.2

    Description

      For some SELECT queries with a CONCAT in a sub-query on a table with a lot of data, I get the "ERROR 1062 (23000): Duplicate entry 'NULL' for key 'key0'" error.

      I was able to get a minimal set of anonymised data to reproduce the issue.

      Steps to reproduce:

      • Execute `01_schema.sql`
      • Execute `02_data_partner.sql`
      • Execute `03_data_movement.sql`

      Run the following query:

      SELECT (SELECT CONCAT(partner.companyName, partner.lastName)
              FROM test_partner partner
              WHERE mvt.delivery_fk = partner.id) AS val
      FROM test_movement mvt
      WHERE mvt.documentId > 'a';
      

      You will get the error.

      The `WHERE mvt.documentId > 'a';` filter is necessary to trigger the bug.

      Using docker:

      # Setup a clean mariadb
      docker run --rm --name mariadb_test -d --env MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=1 mariadb:11.6
       
      # Create a test db
      docker exec mariadb_test mariadb -e 'CREATE DATABASE test'
       
      # Import schema
      docker exec -i mariadb_test mariadb test < 01_schema.sql
       
      # Import data
      docker exec -i mariadb_test mariadb test < 02_data_partner.sql
      docker exec -i mariadb_test mariadb test < 03_data_movement.sql
       
      # Execute the query to trigger the bug
      docker exec -i mariadb_test mariadb test < 04_query.sql
       
      # Cleanup
      docker stop mariadb_test
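
To tell a successful reproduction apart from an unrelated failure, the client output can be checked for the exact error text. The following is a minimal sketch, not part of the original report: `has_key0_duplicate_error` is a hypothetical helper name, and the pattern assumes that the client may add text such as "at line N" between the SQLSTATE and the message when it runs in batch mode.

```shell
#!/bin/sh
# Sketch only: has_key0_duplicate_error is a hypothetical helper, not a MariaDB tool.
# It returns 0 when the text passed as $1 contains the error from this report.
has_key0_duplicate_error() {
  # Allow arbitrary text (for example "at line 1") between the SQLSTATE
  # and the message, since batch mode may add it.
  printf '%s\n' "$1" | grep -q "ERROR 1062 (23000).*Duplicate entry 'NULL' for key 'key0'"
}

# Usage with the container from the steps above (commented out so that
# loading this file does not require Docker):
# output="$(docker exec -i mariadb_test mariadb test < 04_query.sql 2>&1)"
# if has_key0_duplicate_error "$output"; then echo "bug reproduced"; fi
```

Capturing stderr with `2>&1` matters here, because the client reports the error on stderr rather than in the result set.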
      

      Attachments

        1. final.7z
          7.45 MB
        2. MDEV-35592.7z
          6.65 MB

          Activity

            Congelli501 Colin GILLE created issue -
            Congelli501 Colin GILLE made changes -
            Field Original Value New Value
            Description Removed the sentence "This *seems* to be linked to the usage of filesort." that followed the first paragraph; the rest of the description is unchanged. In this revision the sub-query condition was still `WHERE delivery_fk = mvt.id` and the Docker image tag was `mariadb:11`.


            Congelli501 Colin GILLE made changes -
            Description Changed the sub-query condition from `WHERE delivery_fk = mvt.id` to `WHERE mvt.delivery_fk = partner.id`; the rest of the description is unchanged.


            Congelli501 Colin GILLE made changes -
            Attachment data.7z [ 74321 ]
            Congelli501 Colin GILLE made changes -
            Attachment final.7z [ 74322 ]
            Congelli501 Colin GILLE made changes -
            Description Changed the Docker image tag in the reproduction script from `mariadb:11` to `mariadb:11.6`; the rest of the description is unchanged.


            Congelli501 Colin GILLE made changes -
            Affects Version/s 11.6.2 [ 29908 ]
            Affects Version/s 11.5.2 [ 29838 ]
            alice Alice Sherepa made changes -
            Labels regression
            alice Alice Sherepa made changes -
            Fix Version/s 11.7 [ 29815 ]
            alice Alice Sherepa made changes -
            alice Alice Sherepa added a comment -

            Thanks! I reproduced this as described on 11.5+.
            It is caused by b9f5793176 or 865ef0f567 (b9f5793176 gave me compile errors, so I could not check which one exactly).
            I added MDEV-35592.test: nearly the same test, but in a single file and runnable with mtr.

            alice Alice Sherepa made changes -
            Attachment MDEV-35592.7z [ 74330 ]
            alice Alice Sherepa made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            alice Alice Sherepa made changes -
            Assignee Sergei Golubchik [ serg ]
            serg Sergei Golubchik made changes -
            Assignee Sergei Golubchik [ serg ] Michael Widenius [ monty ]
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            Congelli501 Colin GILLE added a comment -

            Hi, is there any news on this issue or a workaround?

            I tried setting the `max_tmp_session_space_usage` and `max_tmp_total_space_usage` variables to 0 to disable the feature causing this, but it does not seem to help.

            serg Sergei Golubchik made changes -
            Fix Version/s 11.8 [ 29921 ]
            Fix Version/s 11.7(EOL) [ 29815 ]

            monty Michael Widenius added a comment -

            I will take a look at this tomorrow.
            monty Michael Widenius made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]

            monty Michael Widenius added a comment -

            First some background for this issue (from the MTR test case):

            CREATE TABLE `t1` (
              `id` int(11) NOT NULL,
              PRIMARY KEY (`id`)
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_uca1400_ai_ci;

            CREATE TABLE `t2` (
              `id` int(11) NOT NULL,
              `documentId` varchar(24) DEFAULT NULL,
              `delivery_fk` int(11) DEFAULT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `documentId_index` (`documentId`),
              KEY `delivery_index` (`delivery_fk`)
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_uca1400_ai_ci;

            The EXPLAIN output for the query is:

            explain SELECT (SELECT 1 FROM t1 WHERE t2.delivery_fk = t1.id) AS val FROM t2
            WHERE t2.documentId > 'a';

            +----+--------------------+-------+--------+------------------+---------+---------+---------------------+--------+-------------+
            | id | select_type        | table | type   | possible_keys    | key     | key_len | ref                 | rows   | Extra       |
            +----+--------------------+-------+--------+------------------+---------+---------+---------------------+--------+-------------+
            | 1  | PRIMARY            | t2    | ALL    | documentId_index | NULL    | NULL    | NULL                | 743769 | Using where |
            | 2  | DEPENDENT SUBQUERY | t1    | eq_ref | PRIMARY          | PRIMARY | 4       | test.t2.delivery_fk | 1      | Using index |
            +----+--------------------+-------+--------+------------------+---------+---------+---------------------+--------+-------------+

            The JSON format is more informative in this case:

            explain format=json
            {
              "query_block": {
                "select_id": 1,
                "cost": 118.0823664,
                "nested_loop": [
                  {
                    "table": {
                      "table_name": "t2",
                      "access_type": "ALL",
                      "possible_keys": ["documentId_index"],
                      "loops": 1,
                      "rows": 743769,
                      "cost": 118.0823664,
                      "filtered": 100,
                      "attached_condition": "t2.documentId > 'a'"
                    }
                  }
                ],
                "subqueries": [
                  {
                    "expression_cache": {
                      "state": "uninitialized",
                      "query_block": {
                        "select_id": 2,
                        "cost": 0.000838227,
                        "outer_ref_condition": "t2.delivery_fk is not null",
                        "nested_loop": [
                          {
                            "table": {
                              "table_name": "t1",
                              "access_type": "eq_ref",
                              "possible_keys": ["PRIMARY"],
                              "key": "PRIMARY",
                              "key_length": "4",
                              "used_key_parts": ["id"],
                              "ref": ["test.t2.delivery_fk"],
                              "loops": 1,
                              "rows": 1,
                              "cost": 0.000838227,
                              "filtered": 100,
                              "using_index": true
                            }
                          }
                        ]
                      }
                    }
                  }
                ]
              }
            }
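
The `expression_cache` node in the JSON plan is what shows that the subquery is routed through the subquery cache. As a rough sketch, other queries could be screened for the same node by searching their `EXPLAIN FORMAT=JSON` output. The helper name `plan_uses_expression_cache` and the file name `05_explain.sql` are hypothetical, and a plain text match is assumed to be good enough for a quick check.

```shell
#!/bin/sh
# Sketch only: plan_uses_expression_cache is a hypothetical helper.
# It returns 0 when the EXPLAIN FORMAT=JSON text in $1 has an expression_cache node.
plan_uses_expression_cache() {
  # A plain text match on the quoted key; no JSON parser is involved.
  printf '%s\n' "$1" | grep -q '"expression_cache"'
}

# Usage (commented out so that loading this file does not require Docker);
# 05_explain.sql is a hypothetical file holding "EXPLAIN FORMAT=JSON <query>":
# plan="$(docker exec -i mariadb_test mariadb test < 05_explain.sql)"
# if plan_uses_expression_cache "$plan"; then echo "query uses the subquery cache"; fi
```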

            The issue is that the subquery_cache (expression cache) has a bug when
            one part of the cached keys contains a null and the used table is
            null-rejecting (there will be no matching rows if the compared key
            is null).

            In the test case, a lot of rows in t2 have NULL in delivery_fk,
            which causes the problem to appear.
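
Whether a given data set meets this precondition can be checked by counting the rows whose outer reference column is NULL. The sketch below only builds the counting statement; `null_count_sql` is a hypothetical helper name, and the table and column names are passed in by the caller (for the reporter's schema these would be `test_movement` and `delivery_fk`).

```shell
#!/bin/sh
# Sketch only: null_count_sql is a hypothetical helper.
# $1 = table name, $2 = column name; prints a statement counting NULL values.
null_count_sql() {
  # Identifiers are wrapped in backticks; they are assumed to be trusted input.
  printf 'SELECT COUNT(*) AS null_rows FROM `%s` WHERE `%s` IS NULL;\n' "$1" "$2"
}

# Usage (commented out so that loading this file does not require Docker):
# null_count_sql test_movement delivery_fk | docker exec -i mariadb_test mariadb test
```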

            I have created a patch for this. Now we only have to create a smaller
            test case for this.

            Until the next release, this issue can be avoided by turning the
            subquery cache off:

            SET optimizer_switch='subquery_cache=off';
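
For the Docker reproduction in this issue, one way to apply the workaround is to send the `SET` statement on the same connection as the query. This is a minimal sketch under that assumption: `with_subquery_cache_off` is a hypothetical helper name, and `SET SESSION` is used so that only the current connection is affected.

```shell
#!/bin/sh
# Sketch only: with_subquery_cache_off is a hypothetical helper.
# $1 = path to a SQL file; prints the workaround followed by the file contents.
with_subquery_cache_off() {
  # SET SESSION limits the change to the connection that runs the query.
  printf "SET SESSION optimizer_switch='subquery_cache=off';\n"
  cat -- "$1"
}

# Usage (commented out so that loading this file does not require Docker):
# with_subquery_cache_off 04_query.sql | docker exec -i mariadb_test mariadb test
```

Sending both statements over one connection is the point of the helper: running the `SET` in a separate `docker exec` call would open a different session and leave the query unaffected.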

            monty Michael Widenius made changes -
            Assignee Michael Widenius [ monty ] Oleksandr Byelkin [ sanja ]
            Congelli501 Colin GILLE added a comment -

            I can confirm the workaround is working fine, thanks!


            People

              Assignee: sanja Oleksandr Byelkin
              Reporter: Congelli501 Colin GILLE
              Votes: 0
              Watchers: 4
