Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7521

CONNECT Engine Column names are not retrieved properly when field values are not ANCII characters

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.15, 10.0.16
    • 10.0.18
    • None
    • CentOS, MariaDB 10.0.16, Connect Engine 1.3

    Description

      I am trying to replicate example how to create pivot table like from MariaDB documentation: https://mariadb.com/kb/en/mariadb/documentation/storage-engines/connect/connect-table-types/connect-table-types-pivot-table-type/

      Everything works as fine as explained.
      However, if I replace in source table fields 'Beer' with '啤酒‘ and 'Car' with 'машина‘, I get the following CREATE TABLE statement:

      CREATE TABLE `pivex_cn`
      (
      `who` varchar(135) NOT NULL,
      `week` int(11) NOT NULL,
      `啤酒` decimal(6,2) NOT NULL `FLAG`=1,
      `Food` decimal(6,2) NOT NULL `FLAG`=1,
      `машина` decimal(6,2) NOT NULL `FLAG`=1
      )
      ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='pivot' `TABNAME`='topivot_cn'

      Which then in SELECT * FROM pivex_cn; gives the following error: "Error Code: 1296. Got error 122 'Cannot find matching column' from CONNECT"

      All tables are using utf8 as default. I also tried uft8mb4, the same result

      Bug was discovered first at 10.0.15 MariaDB version, but reproduced in 10.0.16 as well..

      Attachments

        1. source.csv
          0.1 kB
        2. src.fix
          0.2 kB

        Activity

          sent!

          eevvkk Evgeny Kosolapov added a comment - sent!

          With the help of Alexander Barkov we could spot the places where column names were converted to latin1. That was causing the bug.
          However, note that it fixes the case of general utf8 use but it would not take in account cases where several columns would be using different charsets.

          Columns names are shown as their utf8 code. It cannot be different because the column names belong to different codepages. For instance:

          MariaDB [(none)]> use test
          Database changed
          MariaDB [test]> set names utf8;
          Query OK, 0 rows affected (0.04 sec)
           
          MariaDB [test]> CREATE TABLE `tc1` (
            `who` varchar(16) NOT NULL,
            `what` varchar(16) NOT NULL,
            `amount` double(6,2) NOT NULL
          ) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`=CSV `FILE_NAME`='source.csv' `LRECL`=100 `DATA_CHARSET`=utf8 `ending`=1;
          affected rows 0
           
          MariaDB [test]> select * from tc1;
          +-------+--------------+--------+
          | who   | what         | amount |
          +-------+--------------+--------+
          | Beth  | Pizza        |  12.00 |
          | Janet | Bière       |   3.00 |
          | Ali   | ð£ð░Ðêð©ð¢ð░ |   3.00 |
          | Tom   | ÚØóÕîà       |   8.00 |
          +-------+--------------+--------+
          4 rows in set (0.11 sec)
           
          MariaDB [test]> CREATE TABLE tc2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=tc1;
          affected rows 0
           
          MariaDB [test]> show create table tc2;
           
          CREATE TABLE `tc2` (
            `who` varchar(48) NOT NULL,
            `Pizza` double(6,2) NOT NULL `FLAG`=1,
            `Bière` double(6,2) NOT NULL `FLAG`=1,
            `ð£ð░Ðêð©ð¢ð░` double(6,2) NOT NULL `FLAG`=1,
            `ÚØóÕîà` double(6,2) NOT NULL `FLAG`=1
          ) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='tc1';
           
          MariaDB [test]> select * from tc2;
          +-------+-------+--------+--------------+--------+
          | who   | Pizza | Bi├¿re | ð£ð░Ðêð©ð¢ð░ | ÚØóÕîà |
          +-------+-------+--------+--------------+--------+
          | Ali   |  0.00 |   0.00 |         3.00 |   0.00 |
          | Beth  | 12.00 |   0.00 |         0.00 |   0.00 |
          | Janet |  0.00 |   3.00 |         0.00 |   0.00 |
          | Tom   |  0.00 |   0.00 |         0.00 |   8.00 |
          +-------+-------+--------+--------------+--------+
          4 rows in set (0.54 sec)

          bertrandop Olivier Bertrand added a comment - With the help of Alexander Barkov we could spot the places where column names were converted to latin1. That was causing the bug. However, note that it fixes the case of general utf8 use but it would not take in account cases where several columns would be using different charsets. Columns names are shown as their utf8 code. It cannot be different because the column names belong to different codepages. For instance: MariaDB [(none)]> use test Database changed MariaDB [test]> set names utf8; Query OK, 0 rows affected (0.04 sec)   MariaDB [test]> CREATE TABLE `tc1` ( `who` varchar(16) NOT NULL, `what` varchar(16) NOT NULL, `amount` double(6,2) NOT NULL ) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`=CSV `FILE_NAME`='source.csv' `LRECL`=100 `DATA_CHARSET`=utf8 `ending`=1; affected rows 0   MariaDB [test]> select * from tc1; +-------+--------------+--------+ | who | what | amount | +-------+--------------+--------+ | Beth | Pizza | 12.00 | | Janet | Bi├¿re | 3.00 | | Ali | ð£ð░Ðêð©ð¢ð░ | 3.00 | | Tom | ÚØóÕîà | 8.00 | +-------+--------------+--------+ 4 rows in set (0.11 sec)   MariaDB [test]> CREATE TABLE tc2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=tc1; affected rows 0   MariaDB [test]> show create table tc2;   CREATE TABLE `tc2` ( `who` varchar(48) NOT NULL, `Pizza` double(6,2) NOT NULL `FLAG`=1, `Bi├¿re` double(6,2) NOT NULL `FLAG`=1, `ð£ð░Ðêð©ð¢ð░` double(6,2) NOT NULL `FLAG`=1, `ÚØóÕîà` double(6,2) NOT NULL `FLAG`=1 ) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='tc1';   MariaDB [test]> select * from tc2; +-------+-------+--------+--------------+--------+ | who | Pizza | Bi├¿re | ð£ð░Ðêð©ð¢ð░ | ÚØóÕîà | +-------+-------+--------+--------------+--------+ | Ali | 0.00 | 0.00 | 3.00 | 0.00 | | Beth | 12.00 | 0.00 | 0.00 | 0.00 | | Janet | 0.00 | 3.00 | 0.00 | 0.00 | | Tom | 0.00 | 0.00 | 0.00 | 8.00 | +-------+-------+--------+--------------+--------+ 4 rows in set (0.54 sec)

          Sorry, explanation is not clear. What should be done in order to fix the bug?

          I was trying to replicate your example on test machine, still got the error.

          eevvkk Evgeny Kosolapov added a comment - Sorry, explanation is not clear. What should be done in order to fix the bug? I was trying to replicate your example on test machine, still got the error.

          Nothing you can do now. The bug will be fixed in the next version 10.0.18.

          bertrandop Olivier Bertrand added a comment - Nothing you can do now. The bug will be fixed in the next version 10.0.18.

          Thanks a lot! I see it fixed.

          eevvkk Evgeny Kosolapov added a comment - Thanks a lot! I see it fixed.

          People

            bertrandop Olivier Bertrand
            eevvkk Evgeny Kosolapov
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.