[MDEV-29095] REGEXP_REPLACE treats empty strings different than REPLACE in ORACLE mode Created: 2022-07-13  Updated: 2024-01-24  Resolved: 2024-01-24

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.3.35, 10.4.25, 10.5.16, 10.6.8, 10.7.4, 10.8.3
Fix Version/s: 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2

Type: Bug Priority: Critical
Reporter: Hartmut Holzgraefe Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MDEV-27744 LPAD in vcol created in ORACLE mode m... Closed
Relates
relates to MDEV-13003 Oracle compatibility : Replace functi... Closed

 Description   

The REPLACE function has special treatment for cases where the 3rd parameter is NULL in ORACLE sql_mode, REGEXP_REPLACE on the other hand always returns NULL when the 3rd parameter is NULL, or an empty string converted to NULL due to empty_string_is_null sql_mode.

When calling both functions with parameters ('x-x', '-', empty string or NULL) the results are as follows

3rd param  REPLACE  REGEXP_REPLACE  sql_mode
---------------------------------------------------------------
''           xx           xx        ORACLE
NULL         xx          NULL       ORACLE
''           xx          NULL       ORACLE,empty_string_is_null
NULL         xx          NULL       ORACLE,empty_string_is_null



 Comments   
Comment by Alexander Barkov [ 2023-04-24 ]

Oracle 11g works as follows:

CREATE TABLE t1 (subject CLOB, pattern CLOB, replacement CLOB);
INSERT INTO t1 VALUES ('x-x','-','');
INSERT INTO t1 VALUES ('x-x','-',NULL);
 
SELECT
  subject, pattern, replacement,
  REPLACE(subject,pattern,replacement) AS rpl,
  REGEXP_REPLACE(subject,pattern,replacement) AS rrpl
FROM t1;

| SUBJECT | PATTERN | REPLACEMENT | RPL | RRPL |
|---------|---------|-------------|-----|------|
|     x-x |       - |      (null) |  xx |   xx |
|     x-x |       - |      (null) |  xx |   xx |

SELECT REPLACE('x-x','-','') AS rpl, REGEXP_REPLACE('x-x','-','') AS rrpl FROM DUAL;

| RPL | RRPL |
|-----|------|
|  xx |   xx |
        

SELECT REPLACE('x-x','-',NULL) AS rpl, REGEXP_REPLACE('x-x','-',NULL) AS rrpl FROM DUAL;

| RPL | RRPL |
|-----|------|
|  xx |   xx |

Comment by Alexander Barkov [ 2023-04-24 ]

MariaDB-10.4-28:

DELIMITER $$
CREATE OR REPLACE PROCEDURE p1(mode TEXT)
BEGIN
  SET sql_mode='';
  CREATE OR REPLACE TABLE t1 (subject TEXT, pattern TEXT, replacement TEXT);
  INSERT INTO t1 VALUES ('x-x','-','');
  INSERT INTO t1 VALUES ('x-x','-',NULL);
 
  SET sql_mode=mode;
  -- EXECUTE IMMEDIATE is needed to reparse the statements using the given sql_mode
 
  EXECUTE IMMEDIATE
  'SELECT
    subject, pattern, replacement,
    REPLACE(subject,pattern,replacement) AS rpl,
    REGEXP_REPLACE(subject,pattern,replacement) AS rrpl
  FROM t1';
  EXECUTE IMMEDIATE
  "SELECT REPLACE('x-x','-','') AS rpl, REGEXP_REPLACE('x-x','-','') AS rrpl";
  EXECUTE IMMEDIATE
  "SELECT REPLACE('x-x','-',NULL) AS rpl, REGEXP_REPLACE('x-x','-',NULL) AS rrpl";
END;
$$
DELIMITER ;

CALL p1('ORACLE');

+---------+---------+-------------+------+------+
| subject | pattern | replacement | rpl  | rrpl |
+---------+---------+-------------+------+------+
| x-x     | -       |             | xx   | xx   |
| x-x     | -       | NULL        | xx   | NULL | <-- expect 'xx'
+---------+---------+-------------+------+------+
2 rows in set (0.055 sec)
 
+-----+------+
| rpl | rrpl |
+-----+------+
| xx  | xx   |
+-----+------+
1 row in set (0.055 sec)
 
+------+------+
| rpl  | rrpl |
+------+------+
| xx   | NULL | <-- expect 'xx'
+------+------+
1 row in set (0.055 sec)

CALL p1('ORACLE,empty_string_is_null');

+---------+---------+-------------+------+------+
| subject | pattern | replacement | rpl  | rrpl |
+---------+---------+-------------+------+------+
| x-x     | -       |             | xx   | xx   |
| x-x     | -       | NULL        | xx   | NULL | <-- expect 'xx'
+---------+---------+-------------+------+------+
2 rows in set (0.032 sec)
 
+------+------+
| rpl  | rrpl |
+------+------+
| xx   | NULL | <-- expect 'xx'
+------+------+
1 row in set (0.032 sec)
 
+------+------+
| rpl  | rrpl |
+------+------+
| xx   | NULL | <-- expect 'xx'
+------+------+
1 row in set (0.032 sec)

Obvervation:

  • REPLACE consistently returns a not NULL result for sql_mode='ORACLE' - this is correct
  • REGEXP_REPLACE returns NULL for NULL in the third argument if sql_mode='ORACLE' - this is wrong
  • REGEXP_REPLACE returns NULL for empty string literal in the third argument if sql_mode='ORACLE,empty_string_is_null' - this is wrong
Comment by Sergei Golubchik [ 2023-06-27 ]

Because it depends on MDEV-27744.

Instead of adding yet another function with the _ORACLE suffix, REGEX_REPLACE_ORACLE, MDEV-27744 introduces a uniform handling of all oracle-specific functions without adding more keywords with suffixes. We'd like to avoid the hack of adding more names with _ORACLE (or _MARIADB) suffix, as we'll never be able to take them out.

Comment by Alexander Barkov [ 2023-11-24 ]

Hello, serg,

please review this patch:

https://github.com/MariaDB/server/commit/9d6de9212d54b586b3c3434201abf833512190dc

Thanks.

Comment by Alexander Barkov [ 2024-01-16 ]

Hello serg,

Please review a new patch version:

https://github.com/MariaDB/server/commit/94fff9671daff70137bc806723363e9b507a5034

Also, comments sent by email.

Thanks.

Comment by Sergei Golubchik [ 2024-01-23 ]

94fff9671daff70137bc806723363e9b507a5034 is ok to push

Generated at Thu Feb 08 10:05:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.