Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5542

GROUP_CONCAT truncate output to 65.536 chars when using DISTINCT or ORDER BY

Details

    Description

      As in the summary, I try to GROUP_CONCAT a LONGTEXT field; result is fine if I don'use DISTINCT or ORDER BY as options for the GROUP BY function, otherwise the result is truncated to 65.536 chars (maybe the result is VARCHAR?).
      I've already set the variable group_concat_max_len to 100.000.000.
      However the query return a warning of the result will be truncated.

      Attachments

        Activity

          I don't see it happen, if it's still reproducible for you, can you paste or attach the output that demonstrates it?
          Mine is below.

          MariaDB [test]> select @@version;
          +----------------+
          | @@version      |
          +----------------+
          | 5.5.34-MariaDB |
          +----------------+
          1 row in set (0.00 sec)
           
          MariaDB [test]> set group_concat_max_len = 100000000;
          Query OK, 0 rows affected (0.00 sec)

          MariaDB [test]> create table t1 (f longtext) engine=MyISAM;Query OK, 0 rows affected (0.04 sec)
           
          insert into t1 values (repeat('1abcdefgh',8192)), (repeat('2abcdefgh',8192)), (repeat('3abcdefgh',8192)), (repeat('4abcdefgh',8192));
          Query OK, 4 rows affected (0.00 sec)
          Records: 4  Duplicates: 0  Warnings: 0
           
          MariaDB [test]> select group_concat(distinct f order by f) from t1;
          ....
          1 row in set (0.00 sec)
           
          MariaDB [test]> select length(group_concat(distinct f order by f)) from t1;
          +---------------------------------------------+
          | length(group_concat(distinct f order by f)) |
          +---------------------------------------------+
          |                                      262143 |
          +---------------------------------------------+
          1 row in set (0.00 sec)
           

          elenst Elena Stepanova added a comment - I don't see it happen, if it's still reproducible for you, can you paste or attach the output that demonstrates it? Mine is below. MariaDB [test]> select @@version; +----------------+ | @@version | +----------------+ | 5.5.34-MariaDB | +----------------+ 1 row in set (0.00 sec)   MariaDB [test]> set group_concat_max_len = 100000000; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> create table t1 (f longtext) engine=MyISAM;Query OK, 0 rows affected (0.04 sec)   insert into t1 values (repeat('1abcdefgh',8192)), (repeat('2abcdefgh',8192)), (repeat('3abcdefgh',8192)), (repeat('4abcdefgh',8192)); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0   MariaDB [test]> select group_concat(distinct f order by f) from t1; .... 1 row in set (0.00 sec)   MariaDB [test]> select length(group_concat(distinct f order by f)) from t1; +---------------------------------------------+ | length(group_concat(distinct f order by f)) | +---------------------------------------------+ | 262143 | +---------------------------------------------+ 1 row in set (0.00 sec)  

          The problem is still present. Here my steps, with MariaDB 10.0.19:

          Only 1 record:

          drop table if exists t1;create table t1 (f longtext) engine=Aria;
          insert into t1 values (repeat('abcdefgh',9000));
          select length(group_concat(distinct f order by f)) from t1;
          result: 72000 -OK

          With 2 records:

          drop table if exists t1;create table t1 (f longtext) engine=Aria;
          insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000));
          select length(group_concat(distinct f order by f)) from t1;
          result: 65535 -KO

          drop table if exists t1;create table t1 (f longtext) engine=Aria;
          insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000));
          select length(group_concat(f)) from t1;
          result: 144001 -OK

          drop table if exists t1;create table t1 (f longtext) engine=Aria;
          insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000));
          select length(group_concat(f order by f)) from t1;
          result: 131071 -KO

          The problem occur when using DISTINCT and/or ORDER BY.

          Thank you.

          ----Messaggio originale----
          Da: Elena Stepanova (JIRA) jira@mariadb.atlassian.net
          Inviato: martedì 2 giugno 2015 14:18
          A: Risato Stefano (IR SCR)
          Oggetto: [JIRA] (MDEV-5542) GROUP_CONCAT truncate output to 65.536 chars when using DISTINCT or ORDER BY

          [ https://mariadb.atlassian.net/browse/MDEV-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

          Elena Stepanova updated MDEV-5542:
          ----------------------------------
          Labels: need_feedback (was: )
          Fix Version/s: (was: 5.5)

          I don't see it happen, if it's still reproducible for you, can you paste or attach the output that demonstrates it?
          Mine is below.

          MariaDB [test]> select @@version;
          +----------------+
          +----------------+
          +----------------+
          1 row in set (0.00 sec)
           
          MariaDB [test]> set group_concat_max_len = 100000000; Query OK, 0 rows affected (0.00 sec) 

           MariaDB [test]> create table t1 (f longtext) engine=MyISAM;Query OK, 0 rows affected (0.04 sec)
           
          insert into t1 values (repeat('1abcdefgh',8192)), (repeat('2abcdefgh',8192)), (repeat('3abcdefgh',8192)), (repeat('4abcdefgh',8192)); Query OK, 4 rows affected (0.00 sec)
          Records: 4  Duplicates: 0  Warnings: 0
           
          MariaDB [test]> select group_concat(distinct f order by f) from t1; ....
          1 row in set (0.00 sec)
           
          MariaDB [test]> select length(group_concat(distinct f order by f)) from t1;
          +---------------------------------------------+
          +---------------------------------------------+
          +---------------------------------------------+
          1 row in set (0.00 sec)
           

          –
          This message was sent by Atlassian JIRA
          (v6.5-OD-05-041#65001)

          steris Risato Stefano added a comment - The problem is still present. Here my steps, with MariaDB 10.0.19: Only 1 record: drop table if exists t1;create table t1 (f longtext) engine=Aria; insert into t1 values (repeat('abcdefgh',9000)); select length(group_concat(distinct f order by f)) from t1; result: 72000 -OK With 2 records: drop table if exists t1;create table t1 (f longtext) engine=Aria; insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000)); select length(group_concat(distinct f order by f)) from t1; result: 65535 -KO drop table if exists t1;create table t1 (f longtext) engine=Aria; insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000)); select length(group_concat(f)) from t1; result: 144001 -OK drop table if exists t1;create table t1 (f longtext) engine=Aria; insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000)); select length(group_concat(f order by f)) from t1; result: 131071 -KO The problem occur when using DISTINCT and/or ORDER BY. Thank you. ---- Messaggio originale ---- Da: Elena Stepanova (JIRA) jira@mariadb.atlassian.net Inviato: martedì 2 giugno 2015 14:18 A: Risato Stefano (IR SCR) Oggetto: [JIRA] ( MDEV-5542 ) GROUP_CONCAT truncate output to 65.536 chars when using DISTINCT or ORDER BY [ https://mariadb.atlassian.net/browse/MDEV-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elena Stepanova updated MDEV-5542 : ---------------------------------- Labels: need_feedback (was: ) Fix Version/s: (was: 5.5) I don't see it happen, if it's still reproducible for you, can you paste or attach the output that demonstrates it? Mine is below. MariaDB [test]> select @@version; +----------------+ +----------------+ +----------------+ 1 row in set (0.00 sec)   MariaDB [test]> set group_concat_max_len = 100000000; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> create table t1 (f longtext) engine=MyISAM;Query OK, 0 rows affected (0.04 sec)   insert into t1 values (repeat('1abcdefgh',8192)), (repeat('2abcdefgh',8192)), (repeat('3abcdefgh',8192)), (repeat('4abcdefgh',8192)); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0   MariaDB [test]> select group_concat(distinct f order by f) from t1; .... 1 row in set (0.00 sec)   MariaDB [test]> select length(group_concat(distinct f order by f)) from t1; +---------------------------------------------+ +---------------------------------------------+ +---------------------------------------------+ 1 row in set (0.00 sec)   – This message was sent by Atlassian JIRA (v6.5-OD-05-041#65001)

          Thanks for clarification. Sorry I misunderstood your description before.
          So, it's not the resulting string that is truncated to 65535, it's each concatenated value.
          That's why with ORDER BY without DISTINCT it becomes 131071 (65535*2 + 1), and with DISTINCT it's 65535.

          Test case

          set group_concat_max_len = 100000000;
           
          drop table if exists t1;
          create table t1 (f longtext) engine=MyISAM;
          insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000)); 
          select length(group_concat(f)) from t1;
          select length(group_concat(f order by f)) from t1;
          select length(group_concat(distinct f order by f)) from t1;
          drop table t1;
           
          create table t1 (f longtext) engine=MyISAM;
          insert into t1 values (repeat('1abcdefg',9000)),(repeat('2abcdefg',9000)); 
          select length(group_concat(f)) from t1;
          select length(group_concat(f order by f)) from t1;
          select length(group_concat(distinct f order by f)) from t1;
          drop table t1;

          Expected result

          create table t1 (f longtext) engine=MyISAM;
          insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000));
          select length(group_concat(f)) from t1;
          length(group_concat(f))
          144001
          select length(group_concat(f order by f)) from t1;
          length(group_concat(f order by f))
          144001
          select length(group_concat(distinct f order by f)) from t1;
          length(group_concat(distinct f order by f))
          72000
          drop table t1;
          create table t1 (f longtext) engine=MyISAM;
          insert into t1 values (repeat('1abcdefg',9000)),(repeat('2abcdefg',9000));
          select length(group_concat(f)) from t1;
          length(group_concat(f))
          144001
          select length(group_concat(f order by f)) from t1;
          length(group_concat(f order by f))
          144001
          select length(group_concat(distinct f order by f)) from t1;
          length(group_concat(distinct f order by f))
          144001
          drop table t1;

          Actual result

          create table t1 (f longtext) engine=MyISAM;
          insert into t1 values (repeat('abcdefgh',9000)),(repeat('abcdefgh',9000));
          select length(group_concat(f)) from t1;
          length(group_concat(f))
          144001
          select length(group_concat(f order by f)) from t1;
          length(group_concat(f order by f))
          131071
          select length(group_concat(distinct f order by f)) from t1;
          length(group_concat(distinct f order by f))
          65535
          drop table t1;
          create table t1 (f longtext) engine=MyISAM;
          insert into t1 values (repeat('1abcdefg',9000)),(repeat('2abcdefg',9000));
          select length(group_concat(f)) from t1;
          length(group_concat(f))
          144001
          select length(group_concat(f order by f)) from t1;
          length(group_concat(f order by f))
          131071
          select length(group_concat(distinct f order by f)) from t1;
          length(group_concat(distinct f order by f))
          131071
          drop table t1;

          The problem exists in MySQL 5.5, but not in 5.6/5.7.

          elenst Elena Stepanova added a comment - Thanks for clarification. Sorry I misunderstood your description before. So, it's not the resulting string that is truncated to 65535, it's each concatenated value. That's why with ORDER BY without DISTINCT it becomes 131071 (65535*2 + 1), and with DISTINCT it's 65535. Test case set group_concat_max_len = 100000000;   drop table if exists t1; create table t1 (f longtext) engine=MyISAM; insert into t1 values (repeat( 'abcdefgh' ,9000)),(repeat( 'abcdefgh' ,9000)); select length(group_concat(f)) from t1; select length(group_concat(f order by f)) from t1; select length(group_concat( distinct f order by f)) from t1; drop table t1;   create table t1 (f longtext) engine=MyISAM; insert into t1 values (repeat( '1abcdefg' ,9000)),(repeat( '2abcdefg' ,9000)); select length(group_concat(f)) from t1; select length(group_concat(f order by f)) from t1; select length(group_concat( distinct f order by f)) from t1; drop table t1; Expected result create table t1 (f longtext) engine=MyISAM; insert into t1 values (repeat( 'abcdefgh' ,9000)),(repeat( 'abcdefgh' ,9000)); select length(group_concat(f)) from t1; length(group_concat(f)) 144001 select length(group_concat(f order by f)) from t1; length(group_concat(f order by f)) 144001 select length(group_concat( distinct f order by f)) from t1; length(group_concat( distinct f order by f)) 72000 drop table t1; create table t1 (f longtext) engine=MyISAM; insert into t1 values (repeat( '1abcdefg' ,9000)),(repeat( '2abcdefg' ,9000)); select length(group_concat(f)) from t1; length(group_concat(f)) 144001 select length(group_concat(f order by f)) from t1; length(group_concat(f order by f)) 144001 select length(group_concat( distinct f order by f)) from t1; length(group_concat( distinct f order by f)) 144001 drop table t1; Actual result create table t1 (f longtext) engine=MyISAM; insert into t1 values (repeat( 'abcdefgh' ,9000)),(repeat( 'abcdefgh' ,9000)); select length(group_concat(f)) from t1; length(group_concat(f)) 144001 select length(group_concat(f order by f)) from t1; length(group_concat(f order by f)) 131071 select length(group_concat( distinct f order by f)) from t1; length(group_concat( distinct f order by f)) 65535 drop table t1; create table t1 (f longtext) engine=MyISAM; insert into t1 values (repeat( '1abcdefg' ,9000)),(repeat( '2abcdefg' ,9000)); select length(group_concat(f)) from t1; length(group_concat(f)) 144001 select length(group_concat(f order by f)) from t1; length(group_concat(f order by f)) 131071 select length(group_concat( distinct f order by f)) from t1; length(group_concat( distinct f order by f)) 131071 drop table t1; The problem exists in MySQL 5.5, but not in 5.6/5.7.

          http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-6.html

          If a GROUP_CONCAT() result was calculated using intermediate results (for example, if ORDER BY or
          DISTINCT was present), individual intermediate results were each truncated to a maximum of 64K,
          even if the group_concat_max_len system variable was set to a larger value. Now the length of
          any intermediate result and the final result are controlled by the group_concat_max_len value.
          (Bug #13387020)

          psergei Sergei Petrunia added a comment - http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-6.html If a GROUP_CONCAT() result was calculated using intermediate results (for example, if ORDER BY or DISTINCT was present), individual intermediate results were each truncated to a maximum of 64K, even if the group_concat_max_len system variable was set to a larger value. Now the length of any intermediate result and the final result are controlled by the group_concat_max_len value. (Bug #13387020)
          psergei Sergei Petrunia added a comment - Seems like this one: https://github.com/mysql/mysql-server/commit/0d11edacff0c725ac945d28b8e46d78270f53b02

          revision-id: 6c414fcf89510215d6d3466eb9992d444eadae89 (mariadb-10.1.12-3-g6c414fc)
          parent(s): 66832b619510f5b9724d8db1eac48bdafb9225e9
          committer: Oleksandr Byelkin
          timestamp: 2016-03-01 21:10:59 +0100
          message:

          MDEV-5542: GROUP_CONCAT truncate output to 65.536 chars when using DISTINCT or ORDER BY

          port of mysql fix WL#6098

          —

          sanja Oleksandr Byelkin added a comment - revision-id: 6c414fcf89510215d6d3466eb9992d444eadae89 (mariadb-10.1.12-3-g6c414fc) parent(s): 66832b619510f5b9724d8db1eac48bdafb9225e9 committer: Oleksandr Byelkin timestamp: 2016-03-01 21:10:59 +0100 message: MDEV-5542 : GROUP_CONCAT truncate output to 65.536 chars when using DISTINCT or ORDER BY port of mysql fix WL#6098 —

          People

            sanja Oleksandr Byelkin
            steris Risato Stefano
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.