|
When you say that in the original data file you replaced 'Car' with 'машина‘ I suppose it is coded in UTF8 in the data file.
The problem is that MariaDB unconditionally converts column names into UTF8. Therefore the corresponding column should be named literally, not by its UTF8 representation to avoid a double conversion. Unfortunately trying to create the pivot table by:
|
CREATE TABLE `pivex_cn`
|
(
|
`who` varchar(135) NOT NULL,
|
`week` int(11) NOT NULL,
|
`Beer` decimal(6,2) NOT NULL `FLAG`=1,
|
`Food` decimal(6,2) NOT NULL `FLAG`=1,
|
`машина` decimal(6,2) NOT NULL `FLAG`=1
|
)
|
ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='pivot' `TABNAME`='topivot_cn';
|
fails with the MariaDB message (1166) Incorrect column name ''.
That's unfortunate because for instance, replacing Beer with UTF8 Bière in the data file I can create the table:
CREATE TABLE `pivex_fr`
|
(
|
`who` varchar(135) NOT NULL,
|
`week` int(11) NOT NULL,
|
`Bière` decimal(6,2) NOT NULL `FLAG`=1,
|
`Food` decimal(6,2) NOT NULL `FLAG`=1,
|
`Car` decimal(6,2) NOT NULL `FLAG`=1
|
)
|
ENGINE=CONNECT TABLE_TYPE='pivot' `TABNAME`='topivot_fr';
|
It is accepted and works perfectly.
The issue is why is MariaDB refusing the create table with a quoted Cyrillic column name.
|
|
Yes, everything is coded in UTF8, its default in my database settings. It dosn't matter if I replace data within the table, or create the table from scratch. Let's do exercise with creating table from scratch:
Step 1:
CREATE TABLE `original_data` (
|
`who` VARCHAR(45) NOT NULL,
|
`what` VARCHAR(45) NOT NULL,
|
`amount` DECIMAL(4,2) NOT NULL);
|
|
INSERT INTO `original_data` (who, what, amount)
|
VALUES ('Beth', 'Pizza', 12), ('Janet', 'Beer', 3), ('Ali', 'Car', 3), ('Tom', 'Bread', 8);
|
Result:
MariaDB > select * from original_data;
------------------
------------------
| Beth |
Pizza |
12.00 |
| Janet |
Beer |
3.00 |
| Ali |
Car |
3.00 |
| Tom |
Bread |
8.00 |
------------------
4 rows in set (0.00 sec)
Step 2:
create table pivex
|
engine=connect table_type=pivot tabname=original_data;
|
Result:
MariaDB > select * from pivex;
-------------------------
-------------------------
| Ali |
0.00 |
0.00 |
3.00 |
0.00 |
| Beth |
12.00 |
0.00 |
0.00 |
0.00 |
| Janet |
0.00 |
3.00 |
0.00 |
0.00 |
| Tom |
0.00 |
0.00 |
0.00 |
8.00 |
-------------------------
4 rows in set (0.00 sec)
Everything is nice. Perfect! Lets check SHOW CREATE TABLE for pivex:
| pivex | CREATE TABLE `pivex` (
|
`who` varchar(135) NOT NULL,
|
`Pizza` decimal(6,2) NOT NULL `FLAG`=1,
|
`Beer` decimal(6,2) NOT NULL `FLAG`=1,
|
`Car` decimal(6,2) NOT NULL `FLAG`=1,
|
`Bread` decimal(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='pivot' `TABNAME`='original_data' |
|
Now, lets repeat all of this with French, Russian and Chinese:
Step 1:
drop table if exists original_data;
|
CREATE TABLE `original_data` (
|
`who` VARCHAR(45) NOT NULL,
|
`what` VARCHAR(45) NOT NULL,
|
`amount` DECIMAL(4,2) NOT NULL);
|
|
INSERT INTO `original_data` (who, what, amount)
|
VALUES ('Beth', 'Pizza', 12), ('Janet', 'Bière', 3), ('Ali', 'Машина', 3), ('Tom', '面包', 8);
|
Result:
MariaDB > select * from original_data;
-------------------------
-------------------------
| Beth |
Pizza |
12.00 |
| Janet |
Bière |
3.00 |
| Ali |
Машина |
3.00 |
| Tom |
面包 |
8.00 |
-------------------------
4 rows in set (0.00 sec)
Step 2:
drop table if exists pivex;
|
create table pivex
|
engine=connect table_type=pivot tabname=original_data;
|
Result:
MariaDB > select * from pivex;
ERROR 1296 (HY000): Got error 122 'Cannot find matching column' from CONNECT
Lets check SHOW CREATE TABLE for pivex:
| pivex | CREATE TABLE `pivex` (
|
`who` varchar(135) NOT NULL,
|
`Pizza` decimal(6,2) NOT NULL `FLAG`=1,
|
`Bière` decimal(6,2) NOT NULL `FLAG`=1,
|
`Машина` decimal(6,2) NOT NULL `FLAG`=1,
|
`é¢åŒ…` decimal(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='pivot' `TABNAME`='original_data' |
|
Table 'original_data' was also in utf8 all the time by default:
| original_data | CREATE TABLE `original_data` (
|
`who` varchar(45) NOT NULL,
|
`what` varchar(45) NOT NULL,
|
`amount` decimal(4,2) NOT NULL
|
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
|
So, basically, the point is that something wrong is happenning between Step 1 and Step 2, when the fields values are transmitted into column names. Of cource, we can re-create CREATE TABLE explicitely and manally (or with some programming around), but what the point then?
And, it doesn't matter if it is Russian, Chinese or only French. If I do
INSERT INTO `original_data` (who, what, amount)
|
VALUES ('Beth', 'Pizza', 12), ('Janet', 'Bière', 3), ('Ali', 'Car', 3), ('Tom', 'Bread', 8);
|
I have the same error.
|
|
any hope?
|
|
I shall work on it.
|
|
Well, I did try to reproduce your case but unsuccessfully. First of all, I could not execute the command:
INSERT INTO `original_data` (who, what, amount)
|
VALUES ('Beth', 'Pizza', 12), ('Janet', 'Bière', 3), ('Ali', 'Машина', 3), ('Tom', '面包', 8);
|
It was rejected, probably due to the russian and chinese values that are not recognised on my machine. So what I did, is to manually make a CONNECT table corresponding to the 'original_data' table using and editor
able to past the 'Машина' and '面包' strings and code the whole thing in UTF8.
I attached this file, "src.fix", to this case.
Then I could made a table on it:
CREATE OR REPLACE TABLE t1 (
|
`who` VARCHAR(16) NOT NULL ,
|
`what` VARCHAR(16) NOT NULL,
|
`amount` DOUBLE(6,2) NOT NULL)
|
ENGINE=CONNECT TABLE_TYPE=DOS FILE_NAME='src.fix' LRECL=128 DATA_CHARSET=UTF8;
|
but trying to see it resulted in:
MariaDB [test]> select * from t1;
|
+-------+--------+--------+
|
| who | what | amount |
|
+-------+--------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | ?????? | 3.00 |
|
| Tom | ?? | 8.00 |
|
+-------+--------+--------+
|
4 rows in set, 4 warnings (23 min 9.74 sec)
|
|
MariaDB [test]> show warnings;
|
+---------+------+----------------------------------------------------------------------------------+
|
| Level | Code | Message |
|
+---------+------+----------------------------------------------------------------------------------+
|
| Warning | 1366 | Incorrect string value: '\xD0\x9C\xD0\xB0\xD1\x88...' for column 'what' at row 3 |
|
| Warning | 1105 | Out of range value ?????? for column 'what' at row 3 |
|
| Warning | 1366 | Incorrect string value: '\xE9\x9D\xA2\xE5\x8C\x85' for column 'what' at row 4 |
|
| Warning | 1105 | Out of range value ?? for column 'what' at row 4 |
|
+---------+------+----------------------------------------------------------------------------------+
|
4 rows in set (0.00 sec)
|
However, suppressing "DATA_CHARSET=UTF8" from the t1 definition, it was seen as:
MariaDB [test]> select * from t1;
|
+-------+--------------+--------+
|
| who | what | amount |
|
+-------+--------------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | Машина | 3.00 |
|
| Tom | é¢åŒ… | 8.00 |
|
+-------+--------------+--------+
|
4 rows in set (11.10 sec)
|
Now I made the PIVOT table:
CREATE TABLE t2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=t1;
|
And when looking at it I got:
MariaDB [test]> select * from t2;
|
+-------+-------+--------+--------------+--------+
|
| who | Pizza | Bière | Ð?аÑ?ина | é?¢å?? |
|
+-------+-------+--------+--------------+--------+
|
| Ali | 0.00 | 0.00 | 3.00 | 0.00 |
|
| Beth | 12.00 | 0.00 | 0.00 | 0.00 |
|
| Janet | 0.00 | 3.00 | 0.00 | 0.00 |
|
| Tom | 0.00 | 0.00 | 0.00 | 8.00 |
|
+-------+-------+--------+--------------+--------+
|
4 rows in set (0.39 sec)
|
Forgetting the fact that cyrillic and chinese characters are not recognised on my machine, it apparently "worked"
and I got the PIVOT table displayed. The only difference with your sample is that I set the DEFAULT CHARSET of the PIVOT table to UTF8. You should try it and tell me what it does.
Note that this is not a real fix of this problem, because I think that the column names should be the original ones, not their UTF8 translation, as well in the PIVOT create table as in the result headers.
This in fact can be achieved starting from the first flavor of the t1 table, the one specifying "DATA_CHARSET=UTF8". This is interresting. After making the t2 PIVOT table on it we get:
show create table t2;
|
CREATE TABLE `t2` (
|
`who` varchar(16) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`??????` double(6,2) NOT NULL `FLAG`=1,
|
`??` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='t1';
|
And it is displayed as:
MariaDB [test]> select * from t2;
|
+-------+-------+-------+--------+------+
|
| who | Pizza | Bière | ?????? | ?? |
|
+-------+-------+-------+--------+------+
|
| Ali | 0.00 | 0.00 | 3.00 | 0.00 |
|
| Beth | 12.00 | 0.00 | 0.00 | 0.00 |
|
| Janet | 0.00 | 3.00 | 0.00 | 0.00 |
|
| Tom | 0.00 | 0.00 | 0.00 | 8.00 |
|
+-------+-------+-------+--------+------+
|
4 rows in set (0.02 sec)
|
Wonderful isn't it?
|
|
Well, this is really confusing why your machine cannot handle Russian and Chinese, I hope there is nothing political
We use CentOS on cloud server.
Anyway, when I was trying to replicate your example, I've got following strange results:
MariaDB [test]> CREATE OR REPLACE TABLE t1 (
|
-> `who` VARCHAR(16) NOT NULL ,
|
-> `what` VARCHAR(16) NOT NULL,
|
-> `amount` DOUBLE(6,2) NOT NULL)
|
-> ENGINE=CONNECT TABLE_TYPE=DOS FILE_NAME='/home/src.fix' LRECL=128 DATA_CHARSET=UTF8;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [test]> select * from t1;
|
+------------------+------+--------+
|
| who | what | amount |
|
+------------------+------+--------+
|
| Beth | | 0.00 |
|
| Janet | | 0.00 |
|
| Ali | | 0.00 |
|
| Tom | | 0.00 |
|
+------------------+------+--------+
|
4 rows in set, 8 warnings (0.01 sec)
|
As we can see, there are more warnings here, but for all rows, even for English one:
MariaDB [test]> show warnings;
|
+---------+------+-------------------------------------------------------------------------------------+
|
| Level | Code | Message |
|
+---------+------+-------------------------------------------------------------------------------------+
|
| Warning | 1265 | Data truncated for column 'who' at row 1 |
|
| Warning | 1105 | Out of range value Beth Pizza 12.00 for column 'who' at row 1 |
|
| Warning | 1265 | Data truncated for column 'who' at row 2 |
|
| Warning | 1105 | Out of range value Janet Bière 3.00 for column 'who' at row 2 |
|
| Warning | 1265 | Data truncated for column 'who' at row 3 |
|
| Warning | 1105 | Out of range value Ali Машина 3.00 for column 'who' at row 3 |
|
| Warning | 1265 | Data truncated for column 'who' at row 4 |
|
| Warning | 1105 | Out of range value Tom 面包 8.00 for column 'who' at row 4 |
|
+---------+------+-------------------------------------------------------------------------------------+
|
8 rows in set (0.00 sec)
|
Okay, maybe something wrong with actual src.fix file, created on your machine. So, I have created my own CSV file with similar format. Attached here - source.csv
It works much better at first step!
MariaDB [test]> CREATE OR REPLACE TABLE t1 (
|
-> `who` VARCHAR(16) NOT NULL ,
|
-> `what` VARCHAR(16) NOT NULL,
|
-> `amount` DOUBLE(6,2) NOT NULL)
|
-> ENGINE=CONNECT TABLE_TYPE=CSV FILE_NAME='/home/source.csv' LRECL=128 DATA_CHARSET=UTF8;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [test]> select * from t1;
|
+-------+--------------+--------+
|
| who | what | amount |
|
+-------+--------------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | Машина | 3.00 |
|
| Tom | 面包 | 8.00 |
|
+-------+--------------+--------+
|
4 rows in set (0.00 sec)
|
However, when doing second step creating pivot table:
MariaDB [test]> CREATE TABLE t2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=t1;
|
Query OK, 0 rows affected (0.07 sec)
|
I get the following error:
MariaDB [test]> select * from t2;
|
ERROR 1296 (HY000): Got error 122 'Cannot find matching column' from CONNECT
|
So, basically it seems that issue is not how we create the table t1, but how we pivot it to table t2.
|
|
Excellent. Your example probably shows where is the issue... except that it is missing one important item: The SHOW CREATE TABLE of the PIVOT table.
Here is what I get starting from your source.csv file:
CREATE OR REPLACE TABLE tc1 (
|
`who` VARCHAR(16) NOT NULL ,
|
`what` VARCHAR(16) NOT NULL,
|
`amount` DOUBLE(6,2) NOT NULL)
|
ENGINE=CONNECT DEFAULT CHARSET latin1 TABLE_TYPE=CSV FILE_NAME='source.csv' LRECL=100 DATA_CHARSET=utf8 ending=1;
|
|
select * from tc1;
|
| who |
what |
amount |
| Beth |
Pizza |
12.00 |
| Janet |
Bière |
3.00 |
| Ali |
?????? |
3.00 |
| Tom |
?? |
8.00 |
| Level |
Code |
Message |
| Warning |
1366 |
Incorrect string value: '\xD0\x9C\xD0\xB0\xD1\x88...' for column 'what' at row 3 |
| Warning |
1105 |
Out of range value ?????? for column 'what' at row 3 |
| Warning |
1366 |
Incorrect string value: '\xE9\x9D\xA2\xE5\x8C\x85' for column 'what' at row 4 |
| Warning |
1105 |
Out of range value ?? for column 'what' at row 4 |
These warnings you don't get because on your system, the russian and chinese values are accepted.
Now creating the PIVOT table:
CREATE TABLE tc2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=tc1;
|
|
show create table tc2;
|
Table Create Table
|
tc2 CREATE TABLE `tc2` (
|
`who` varchar(16) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`??????` double(6,2) NOT NULL `FLAG`=1,
|
`??` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='tc1';
|
The columns names for these values are
.
CONNECT is able to retrieve them in the source table because when reading it, these values are also changed the same way, therefore:
returns:
| who |
Pizza |
Bière |
?????? |
?? |
| Ali |
0.00 |
0.00 |
3.00 |
0.00 |
| Beth |
12.00 |
0.00 |
0.00 |
0.00 |
| Janet |
0.00 |
3.00 |
0.00 |
0.00 |
| Tom |
0.00 |
0.00 |
0.00 |
8.00 |
What happen in your case is that your PIVOT table is created with column names that are Машина and 面包 instead of
.Then, for some reason, these names are not handled correctly and cannot be matched within the source table.
Unfortunately, I cannot reproduce this on my machine and this makes me unable to trace what happens and to find why it fails. I don't know either whether this is a CONNECT bug or a MariaDB issue.
Now you can also try this:
CREATE OR REPLACE TABLE tcx1 (
|
`who` VARCHAR(16) NOT NULL ,
|
`what` VARCHAR(16) NOT NULL,
|
`amount` DOUBLE(6,2) NOT NULL)
|
ENGINE=CONNECT DEFAULT CHARSET latin1 TABLE_TYPE=CSV FILE_NAME='source.csv' LRECL=100;
|
That is the same thing but dropping the DATA_CHARSET=utf8 so the table text is no more converted.
| who |
what |
amount |
| Beth |
Pizza |
12.00 |
| Janet |
Bière |
3.00 |
| Ali |
Машина |
3.00 |
| Tom |
é¢åŒ… |
8.00 |
No more warnings, now let's make the PIVOT table:
CREATE TABLE tcx2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=tcx1;
|
|
show create table tcx2;
|
CREATE TABLE `tcx2` (
|
`who` varchar(16) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`Машина` double(6,2) NOT NULL `FLAG`=1,
|
`é¢åŒ…` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='tcx1';
|
|
select * from tcx2;
|
| who |
Pizza |
Bière |
Машина |
é¢åŒ… |
| Ali |
0.00 |
0.00 |
3.00 |
0.00 |
| Beth |
12.00 |
0.00 |
0.00 |
0.00 |
| Janet |
0.00 |
3.00 |
0.00 |
0.00 |
| Tom |
0.00 |
0.00 |
0.00 |
8.00 |
This is rather ugly but perhaps can work on your machine.
|
|
hmm.. nope. Strangely, your latest example didn't work on my machine. The steps are the following:
MariaDB [test]> CREATE OR REPLACE TABLE tcx1 (
|
-> `who` VARCHAR(16) NOT NULL ,
|
-> `what` VARCHAR(16) NOT NULL,
|
-> `amount` DOUBLE(6,2) NOT NULL)
|
-> ENGINE=CONNECT DEFAULT CHARSET latin1 TABLE_TYPE=CSV FILE_NAME='/home/source.csv' LRECL=100;
|
Query OK, 0 rows affected (0.08 sec)
|
|
MariaDB [test]> select * from tcx1;
|
+-------+--------------------------+--------+
|
| who | what | amount |
|
+-------+--------------------------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | Машина | 3.00 |
|
| Tom | é¢åŒ… | 8.00 |
|
+-------+--------------------------+--------+
|
4 rows in set (0.01 sec)
|
MariaDB [test]> CREATE TABLE tcx2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=tcx1;
|
Query OK, 0 rows affected (0.02 sec)
|
|
MariaDB [test]> show create table tcx2;
|
| tcx2 | CREATE TABLE `tcx2` (
|
`who` varchar(48) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`Üðшøýð` double(6,2) NOT NULL `FLAG`=1,
|
`é¢包` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='tcx1'
|
but when I try to do select:
MariaDB [test]> select * from tcx2;
|
ERROR 1296 (HY000): Got error 122 'Cannot find matching column' from CONNECT
|
in my example above (t1->t2) the SHOW CREATE TABLE t2 gives:
| t2 | CREATE TABLE `t2` (
|
`who` varchar(48) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`Машина` double(6,2) NOT NULL `FLAG`=1,
|
`é¢åŒ…` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='t1'
|
|
|
I see. The difference is that in the process of making the pivot table column definitions (by Discovery) the columns names are converted to utf8 while they are not on my machine. This is probably due to the setting of system variables. On my machine this is:
Variable_name Value
|
character_set_client latin1
|
character_set_connection latin1
|
character_set_database latin1
|
character_set_filesystem binary
|
character_set_results latin1
|
character_set_server latin1
|
character_set_system utf8
|
What you can do is to play with these settings or alternatively to manually define the columns in your PIVOT CREATE TABLE statement to see what happens.
|
|
Variables on my machine:
MariaDB [(none)]> SHOW VARIABLES LIKE 'char%';
|
+--------------------------+----------------------------+
|
| Variable_name | Value |
|
+--------------------------+----------------------------+
|
| character_set_client | utf8 |
|
| character_set_connection | utf8 |
|
| character_set_database | utf8 |
|
| character_set_filesystem | binary |
|
| character_set_results | utf8 |
|
| character_set_server | utf8 |
|
| character_set_system | utf8 |
|
| character_sets_dir | /usr/share/mysql/charsets/ |
|
+--------------------------+----------------------------+
|
8 rows in set (0.00 sec)
|
If I manually define the columns in PIVOT CREATE TABLE it perfectly works. But isn't it the whole point of convenience of having it automatic? )
|
|
You're right. But the issue is in the Discovery process when character_set_results is set to utf8. And this happen after returning from the CONNECT connect_assisted_discovery function.
You could try:
set character_set_results=latin1;
|
CREATE TABLE tcx2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT;
|
set character_set_results=utf8;
|
I'd be interrested to see if it works.
|
|
What I said previously is not true. I was fooled because when character_set_results is set to utf8, this applies to the display of the result of the SHOW CREATE TABLE statement. However, the table is normally created with original column names.
What happens in your case is that for the Russian and Chinese characters, something causes the conversion to be wrong, although it is correct for French ones. But I cannot see where it happens on my machines that does not behave like yours.
Maybe what you could do is to display the SHOW CREATE TABLE of the table created via Discovery after setting the character_set_results to another charset, one allowing to see all your specific characters.
What I want to check is whether there is a difference in the definition of the table via Discovery (without the columns) and the definition of the table you made manually.
If they are identical (same output from SHOW CREATE TABLE) they should behave identically. If not, there is something hidden somewhere that should be discovered.
|
|
Hi Olivier, I was away due to Chinese New Year, but I didn't forget about the issue 
Meanwhile, I have launched free instance on Amazon, so that we can both try the issue on the same machine.
So, if we launch fresh new Ubuntu and install fresh new MariaDB (10.0.17 now), add CONNECT Engine, the default Charsets would be:
MariaDB [(none)]> show variables like "%character%"; show variables like "%collation%";
|
+--------------------------+----------------------------+
|
| Variable_name | Value |
|
+--------------------------+----------------------------+
|
| character_set_client | utf8 |
|
| character_set_connection | utf8 |
|
| character_set_database | latin1 |
|
| character_set_filesystem | binary |
|
| character_set_results | utf8 |
|
| character_set_server | latin1 |
|
| character_set_system | utf8 |
|
| character_sets_dir | /usr/share/mysql/charsets/ |
|
+--------------------------+----------------------------+
|
8 rows in set (0.01 sec)
|
|
+----------------------+-------------------+
|
| Variable_name | Value |
|
+----------------------+-------------------+
|
| collation_connection | utf8_general_ci |
|
| collation_database | latin1_swedish_ci |
|
| collation_server | latin1_swedish_ci |
|
+----------------------+-------------------+
|
3 rows in set (0.00 sec)
|
I have also placed into /home/ubuntu/ two source files: source_en.csv and source_cn.csv
If I try to make Pivot tables from English only source (source_en.csv) everything works as perfect as in manuals. But if I try to make Pivot table from multilingual file (source_cn.cvs) I can replicate something similar what happens on your machine:
MariaDB [test]> CREATE OR REPLACE TABLE t1 (who VARCHAR(16) NOT NULL, what VARCHAR(16) NOT NULL, amount DOUBLE(6,2) NOT NULL) ENGINE=CONNECT TABLE_TYPE=CSV FILE_NAME='/home/ubuntu/source_cn.csv' LRECL=128 DATA_CHARSET=UTF8;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [test]> select * from t1;
|
+-------+--------+--------+
|
| who | what | amount |
|
+-------+--------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | ?????? | 3.00 |
|
| Tom | ?? | 8.00 |
|
+-------+--------+--------+
|
4 rows in set, 4 warnings (0.00 sec)
|
|
MariaDB [test]> show warnings;
|
+---------+------+----------------------------------------------------------------------------------+
|
| Level | Code | Message |
|
+---------+------+----------------------------------------------------------------------------------+
|
| Warning | 1366 | Incorrect string value: '\xD0\x9C\xD0\xB0\xD1\x88...' for column 'what' at row 3 |
|
| Warning | 1105 | Out of range value Машина for column 'what' at row 3 |
|
| Warning | 1366 | Incorrect string value: '\xE9\x9D\xA2\xE5\x8C\x85' for column 'what' at row 4 |
|
| Warning | 1105 | Out of range value 面包 for column 'what' at row 4 |
|
+---------+------+----------------------------------------------------------------------------------+
|
4 rows in set (0.00 sec)
|
|
MariaDB [test]> CREATE OR REPLACE TABLE t2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=t1;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [test]> select * from t2;
|
+-------+-------+--------+--------+------+
|
| who | Pizza | Bière | ?????? | ?? |
|
+-------+-------+--------+--------+------+
|
| Ali | 0.00 | 0.00 | 3.00 | 0.00 |
|
| Beth | 12.00 | 0.00 | 0.00 | 0.00 |
|
| Janet | 0.00 | 3.00 | 0.00 | 0.00 |
|
| Tom | 0.00 | 0.00 | 0.00 | 8.00 |
|
+-------+-------+--------+--------+------+
|
4 rows in set (0.00 sec)
|
Well, lets try to change Charsets to all uft8. I am modifying /etc/mysql/my.cnf as following:
[mysqld]
|
init_connect=‘SET collation_connection = utf8_unicode_ci’
|
character-set-server = utf8
|
collation-server = utf8_unicode_ci
|
|
[client]
|
default-character-set = utf8
|
Now:
MariaDB [(none)]> show variables like "%character%"; show variables like "%collation%";
|
+--------------------------+----------------------------+
|
| Variable_name | Value |
|
+--------------------------+----------------------------+
|
| character_set_client | utf8 |
|
| character_set_connection | utf8 |
|
| character_set_database | utf8 |
|
| character_set_filesystem | binary |
|
| character_set_results | utf8 |
|
| character_set_server | utf8 |
|
| character_set_system | utf8 |
|
| character_sets_dir | /usr/share/mysql/charsets/ |
|
+--------------------------+----------------------------+
|
8 rows in set (0.00 sec)
|
|
+----------------------+-----------------+
|
| Variable_name | Value |
|
+----------------------+-----------------+
|
| collation_connection | utf8_general_ci |
|
| collation_database | utf8_unicode_ci |
|
| collation_server | utf8_unicode_ci |
|
+----------------------+-----------------+
|
3 rows in set (0.00 sec)
|
In order to make pure fresh test, I create DB 'test2' and now trying to make pivot table:
MariaDB [test2]> CREATE OR REPLACE TABLE t1 (who VARCHAR(16) NOT NULL, what VARCHAR(16) NOT NULL, amount DOUBLE(6,2) NOT NULL) ENGINE=CONNECT TABLE_TYPE=CSV FILE_NAME='/home/ubuntu/source_cn.csv' LRECL=128 DATA_CHARSET=UTF8;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [test2]> select * from t1;
|
+-------+--------------+--------+
|
| who | what | amount |
|
+-------+--------------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | Машина | 3.00 |
|
| Tom | 面包 | 8.00 |
|
+-------+--------------+--------+
|
4 rows in set (0.01 sec)
|
|
MariaDB [test2]> CREATE OR REPLACE TABLE t2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=t1;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [test2]> select * from t2;
|
*ERROR 1296 (HY000): Got error 122 'Cannot find matching column' from CONNECT
|
*
|
MariaDB [test2]> show create table t2;
|
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| Table | Create Table |
|
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| t2 | CREATE TABLE `t2` (
|
`who` varchar(16) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`??????` double(6,2) NOT NULL `FLAG`=1,
|
`??` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='t1' |
|
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
1 row in set (0.00 sec)
|
Where should I send you login information for this server? Should I use your email I see here at Jira on gmail?
|
|
Yes you can use my email.
|
|
sent!
|
|
With the help of Alexander Barkov we could spot the places where column names were converted to latin1. That was causing the bug.
However, note that it fixes the case of general utf8 use but it would not take in account cases where several columns would be using different charsets.
Columns names are shown as their utf8 code. It cannot be different because the column names belong to different codepages. For instance:
MariaDB [(none)]> use test
|
Database changed
|
MariaDB [test]> set names utf8;
|
Query OK, 0 rows affected (0.04 sec)
|
|
MariaDB [test]> CREATE TABLE `tc1` (
|
`who` varchar(16) NOT NULL,
|
`what` varchar(16) NOT NULL,
|
`amount` double(6,2) NOT NULL
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`=CSV `FILE_NAME`='source.csv' `LRECL`=100 `DATA_CHARSET`=utf8 `ending`=1;
|
affected rows 0
|
|
MariaDB [test]> select * from tc1;
|
+-------+--------------+--------+
|
| who | what | amount |
|
+-------+--------------+--------+
|
| Beth | Pizza | 12.00 |
|
| Janet | Bière | 3.00 |
|
| Ali | ð£ð░Ðêð©ð¢ð░ | 3.00 |
|
| Tom | ÚØóÕîà | 8.00 |
|
+-------+--------------+--------+
|
4 rows in set (0.11 sec)
|
|
MariaDB [test]> CREATE TABLE tc2 ENGINE=CONNECT DEFAULT CHARSET utf8 TABLE_TYPE=PIVOT TABNAME=tc1;
|
affected rows 0
|
|
MariaDB [test]> show create table tc2;
|
|
CREATE TABLE `tc2` (
|
`who` varchar(48) NOT NULL,
|
`Pizza` double(6,2) NOT NULL `FLAG`=1,
|
`Bière` double(6,2) NOT NULL `FLAG`=1,
|
`ð£ð░Ðêð©ð¢ð░` double(6,2) NOT NULL `FLAG`=1,
|
`ÚØóÕîà` double(6,2) NOT NULL `FLAG`=1
|
) ENGINE=CONNECT DEFAULT CHARSET=utf8 `TABLE_TYPE`='PIVOT' `TABNAME`='tc1';
|
|
MariaDB [test]> select * from tc2;
|
+-------+-------+--------+--------------+--------+
|
| who | Pizza | Bi├¿re | ð£ð░Ðêð©ð¢ð░ | ÚØóÕîà |
|
+-------+-------+--------+--------------+--------+
|
| Ali | 0.00 | 0.00 | 3.00 | 0.00 |
|
| Beth | 12.00 | 0.00 | 0.00 | 0.00 |
|
| Janet | 0.00 | 3.00 | 0.00 | 0.00 |
|
| Tom | 0.00 | 0.00 | 0.00 | 8.00 |
|
+-------+-------+--------+--------------+--------+
|
4 rows in set (0.54 sec)
|
|
|
Sorry, explanation is not clear. What should be done in order to fix the bug?
I was trying to replicate your example on test machine, still got the error.
|
|
Nothing you can do now. The bug will be fixed in the next version 10.0.18.
|
|
Thanks a lot! I see it fixed.
|