[MDEV-21640] REGEXP_REPLACE 'pcre_exec: null argument passed' from regexp Created: 2020-02-03  Updated: 2023-04-27

Status: Confirmed
Project: MariaDB Server
Component/s: Character Sets, Data types
Affects Version/s: 10.4.10, 10.4.11, 10.3, 10.4
Fix Version/s: 10.4

Type: Bug Priority: Minor
Reporter: Chris Hewett Assignee: Alexander Barkov
Resolution: Unresolved Votes: 2
Labels: None
Environment:

CentOS 7



 Description   

I am experiencing a bug when updating a field with a REGEX_REPLACE() when the field is a blank string and it is passed through TRIM() first.

To Replicate:

CREATE TABLE `test` (
 `input` varchar(10) COLLATE utf8_unicode_ci NOT NULL,
 `output` varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
 PRIMARY KEY (`input`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; 

INSERT INTO `test` (`input`, `output`) VALUES ('', NULL);

UPDATE `test` SET `output` = REGEXP_REPLACE(TRIM(`input`), '[^a-z]', '');

Expected Result:

`output` is set to ''

Actual Result:

Got error 'pcre_exec: null argument passed' from regexp

It seems to be something to do with the combination between TRIM() and a field with a blank string as:

UPDATE `test` SET `output` = REGEXP_REPLACE(`input`, '[^a-z]', '');

no error

Also:

UPDATE `test` SET `output` = REGEXP_REPLACE(TRIM(''), '[^a-z]', '');

no error



 Comments   
Comment by Elena Stepanova [ 2020-03-07 ]

Thanks for the report.
Reproducible on 10.3-10.5, not reproducible on 10.2.
It has something to do with the column being utf8:

10.3 a662cb9b

MariaDB [test]> CREATE TABLE t (a varchar(10) CHARSET utf8);
Query OK, 0 rows affected (0.226 sec)
 
MariaDB [test]> INSERT INTO t VALUES ('');
Query OK, 1 row affected (0.041 sec)
 
MariaDB [test]> SELECT REGEXP_REPLACE(TRIM(a), '[^a-z]', '') FROM t;
+---------------------------------------+
| REGEXP_REPLACE(TRIM(a), '[^a-z]', '') |
+---------------------------------------+
|                                       |
+---------------------------------------+
1 row in set, 1 warning (0.000 sec)
 
MariaDB [test]> SHOW WARNINGS;
+---------+------+---------------------------------------------------------+
| Level   | Code | Message                                                 |
+---------+------+---------------------------------------------------------+
| Warning | 1139 | Got error 'pcre_exec: null argument passed' from regexp |
+---------+------+---------------------------------------------------------+
1 row in set (0.000 sec)

Not reproducible with latin1:

MariaDB [test]> CREATE TABLE t (a varchar(10) CHARSET latin1);
Query OK, 0 rows affected (0.195 sec)
 
MariaDB [test]> INSERT INTO t VALUES ('');
Query OK, 1 row affected (0.033 sec)
 
MariaDB [test]> SELECT REGEXP_REPLACE(TRIM(a), '[^a-z]', '') FROM t;
+---------------------------------------+
| REGEXP_REPLACE(TRIM(a), '[^a-z]', '') |
+---------------------------------------+
|                                       |
+---------------------------------------+
1 row in set (0.000 sec)

Comment by Edouard [ 2020-03-20 ]

Hi all,

Just to add informations:

  • I observe there is no error on the first call, but then all subsequent fail. Crazy !!!
  • It's not specific to TRIM. I also observe it with REPLACE. So maybe they are other situations with this trouble.

Workaround in the meanwhile (update in buster stable may be really long) :

  • IGNORE keyword: a bit too strong according to me
  • cast '' to NULL and then recast NULL to '':

    IFNULL(REGEXP_REPLACE(NULLIF(TRIM(a),''), '[^a-z]', ''),'')
    

Sure we can do it better, so do not hesitate to share,

Regards, Edouard

Comment by Christian Proust [ 2020-11-20 ]

I have MariaDB 10.5.5.

I cannot reproduce the bug with the code in the first comment comment-146036 (it does not issue any warning for me).

I can reproduce the bug with the code from the summary. The error message is for me:

SQL Error (1139): Regex error 'NULL argument passed'

I notice that the bug does not appear with utf8mb4_unicode_ci, only with utf8_unicode_ci. Maybe it's a good workaround?

Generated at Thu Feb 08 09:08:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.