[MDEV-9368] Improper result returned when using escaped backslash in query using like Created: 2016-01-05  Updated: 2016-06-20  Resolved: 2016-06-20

Status: Closed
Project: MariaDB Server
Component/s: Character Sets
Affects Version/s: 5.3.13, 5.5, 10.0, 10.1
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Christian Schmid Assignee: Alexander Barkov
Resolution: Won't Fix Votes: 0
Labels: None
Environment:

Windows 7


Sprint: 10.0.26

 Description   

When using the LIKE operator on a varchar column an inproper result is returned if the data contains non standard ASCII chars (e.g. german umlauts).

Steps to reproduce
Create dummy table:

CREATE TABLE ESCAPE_TEST (
    id int primary key AUTO_INCREMENT,
    content varchar(500)
);

Insert dummy data:

INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Foo');
INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Lorem\\Ipsum');
INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Bar');
INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Fübär');
commit;

Perform query:

SELECT * FROM ESCAPE_TEST WHERE CONTENT LIKE '%\\\\%';

Actual Result

+----+-------------+
| id | content     |
+----+-------------+
|  2 | Lorem\Ipsum |
|  4 | Fübär       |
+----+-------------+

Expected Result

+----+-------------+
| id | content     |
+----+-------------+
|  2 | Lorem\Ipsum |
+----+-------------+

Further Information / Hints

  • Apparently lines containing some none ASCII chars are returned although they don't contain a backslash.
  • Problem occures with both InnoDB and MyISAM engine.
  • Current version of mysql returns expected result.


 Comments   
Comment by Elena Stepanova [ 2016-01-10 ]

Thanks for the report and the test case.

In fact, it works the same way on current MySQL. Maybe you had different character set values on that server, so it wasn't reproducible (apparently, it happens with utf8, but not latin1).

MySQL [test]> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
 
MySQL [test]> 
MySQL [test]> CREATE TABLE ESCAPE_TEST (
    ->     id int primary key AUTO_INCREMENT,
    ->     content varchar(500)
    -> );
Query OK, 0 rows affected (0.26 sec)
 
MySQL [test]> 
MySQL [test]> INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Foo');
Query OK, 1 row affected (0.02 sec)
 
MySQL [test]> INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Lorem\\Ipsum');
Query OK, 1 row affected (0.00 sec)
 
MySQL [test]> INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Bar');
Query OK, 1 row affected (0.00 sec)
 
MySQL [test]> INSERT INTO ESCAPE_TEST (CONTENT) VALUES ('Fübär');
Query OK, 1 row affected (0.01 sec)
 
MySQL [test]> 
MySQL [test]> SELECT * FROM ESCAPE_TEST WHERE CONTENT LIKE '%\\\\%';
+----+-------------+
| id | content     |
+----+-------------+
|  2 | Lorem\Ipsum |
|  4 | Fübär       |
+----+-------------+
2 rows in set (0.00 sec)
 
MySQL [test]> select @@version;
+-----------+
| @@version |
+-----------+
| 5.7.10    |
+-----------+
1 row in set (0.00 sec)

Comment by Alexander Barkov [ 2016-06-16 ]

This script:

SET NAMES utf8;
SELECT CONVERT('Fübär' USING latin1) LIKE '%\\\\%';

returns

+-----------------------------------------------+
| CONVERT('Fübär' USING latin1) LIKE '%\\\\%'   |
+-----------------------------------------------+
|                                             1 |
+-----------------------------------------------+

This is wrong.

Comment by Alexander Barkov [ 2016-06-16 ]

The problem is repeatable with the equality operator as well:

SET NAMES utf8;
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET latin1);
INSERT INTO t1 VALUES ('\\');
SELECT * FROM t1 WHERE a='ä';

+------+
| a    |
+------+
| \    |
+------+
1 row in set (0.00 sec)

The reason for that is that latin1_swedish_ci is defined the way that '\' is equal to 'ä'.
See here for details:
http://collation-charts.org/mysql60/mysql604.latin1_swedish_ci.html

We cannot change latin1_swedish_ci at this point, as it will bring heavy upgrade problems.

Please use this workaround:

SELECT * FROM t1 WHERE CONTENT LIKE BINARY '%\\\\%';

It correctly returns the expected result:

+----+-------------+
| id | content     |
+----+-------------+
|  2 | Lorem\Ipsum |
+----+-------------+

Generated at Thu Feb 08 07:34:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.