[MDEV-16979] Inconsistent data stored via INSERT INTO and .csv file when using Russian letters Created: 2018-08-14 Updated: 2018-08-16 Resolved: 2018-08-16 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Affects Version/s: | 10.2.14 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Martin Landhage | Assignee: | Unassigned |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | tests | ||
| Environment: |
NAME=Fedora |
||
| Attachments: |
|
| Description |
|
I have two tables that I merge through a Python program: the selection is done using Teltec.Prislista.Typ, and the data is merged into Teltec.Ror. This has been working fine for both Swedish and Russian letters when the data is inserted with the INSERT INTO command for both tables. Then I loaded data using a csv file. After that, my Python 2.7 sorting program did not select types with Russian and Swedish letters, due to a mismatch between the tables. The data stored are simply different! I see this as a MariaDB inconsistency. I have enclosed: Compare prislista.txt and lager.txt: the rows containing Russian and Swedish letters do not look OK. This is why the Python program fails to get a match. Using MySQL 5.7, I got the data sorting to work for Russian and Swedish letters by adding the following lines in prislista.sql: |
| Comments |
| Comment by Alice Sherepa [ 2018-08-16 ] |
|
I can reproduce the difference, and I got the same results with MySQL 5.7.
1.csv :
|
| Comment by Alice Sherepa [ 2018-08-16 ] |
|
"The character set indicated by the character_set_database system variable is used to interpret the information in the file. SET NAMES and the setting of character_set_client do not affect interpretation of input. If the contents of the input file use a character set that differs from the default, it is usually preferable to specify the character set of the file by using the CHARACTER SET clause, which is available." (https://mariadb.com/kb/en/library/load-data-infile/) So the difference was caused by character_set_database (= latin1); after specifying CHARACTER SET utf8 in the LOAD DATA statement, the results are correct.
|
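The effect described in the last comment can be illustrated outside the server with a minimal Python sketch. It rests on several assumptions: the csv file is UTF-8 encoded, the sample value is hypothetical, and Python's `latin-1` codec is used as a simplified stand-in for the server's latin1 (which does not treat every byte identically). The two helper functions are invented for illustration and are not part of any MariaDB or Python API.

```python
# Sketch only: simulates how the same UTF-8 file bytes are interpreted
# under two different character set assumptions.

def load_with_wrong_charset(file_bytes):
    """Interpret the file bytes as latin1, as when character_set_database=latin1
    (simplified: Python's latin-1 maps every byte to one character)."""
    return file_bytes.decode("latin-1")

def load_with_declared_charset(file_bytes):
    """Interpret the file bytes as UTF-8, as when CHARACTER SET utf8 is given."""
    return file_bytes.decode("utf-8")

# Hypothetical value containing Russian and Swedish letters.
value = "Привет Åsa"

# Assumption: this is what the csv file contains on disk.
file_bytes = value.encode("utf-8")

via_insert = value                                  # INSERT INTO over a utf8 connection
via_csv_latin1 = load_with_wrong_charset(file_bytes)
via_csv_utf8 = load_with_declared_charset(file_bytes)

print("INSERT vs csv read as latin1 match:", via_insert == via_csv_latin1)
print("INSERT vs csv read as utf8 match:  ", via_insert == via_csv_utf8)
```

Under these assumptions, the value read as latin1 has one character per file byte and no longer compares equal to the inserted value, which is consistent with the failed match in the reporter's program.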