[MDEV-31748] Reuse Simple_tokenizer - Jira

Details

Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Fix Version/s: None
Component/s: Variables
Labels: None

Description

MDEV-30164 introduced a class Simple_tokenizer. It supports loose parsing of a list of name=value pairs:

,,name1=value1, name2 =   value2 ,,, name3=value3,,

Any number of spaces is allowed anywhere between tokens.
Empty name=value pairs are allowed, i.e. several commas can appear in a row. This makes it convenient to "edit" the value of @@character_set_collations: for example, pass it to REGEXP_REPLACE() and cut out a fragment using a simpler regular expression, one that does not have to remove the adjacent comma.

SET @@character_set_collations='big5=big5_bin,latin1=latin1_bin,utf8mb4=utf8mb4_bin';

SELECT REGEXP_REPLACE(@@character_set_collations,'big5=[a-z0-9_]*','');

+-----------------------------------------------------------------+
| REGEXP_REPLACE(@@character_set_collations,'big5=[a-z0-9_]*','') |
+-----------------------------------------------------------------+
|                          ,latin1=latin1_bin,utf8mb4=utf8mb4_bin |
+-----------------------------------------------------------------+

Note that REGEXP_REPLACE() left an empty pair at the beginning of the result, but SET still accepts it:

SET @@character_set_collations=REGEXP_REPLACE(@@character_set_collations,'big5=[a-z0-9_]*','');

SELECT @@character_set_collations;

+---------------------------------------+
| @@character_set_collations            |
+---------------------------------------+
| latin1=latin1_bin,utf8mb4=utf8mb4_bin |
+---------------------------------------+
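The loose parsing rules described above can be illustrated with a short Python sketch. This is not the server's C++ code: the function name `parse_pairs` is hypothetical, and treating only the ASCII space as skippable whitespace is an assumption, since the issue does not spell out which characters Simple_tokenizer skips.

```python
def parse_pairs(text):
    """Loosely parse comma-separated name=value pairs (illustrative only)."""
    # Simple_tokenizer is described as charset-unaware and ASCII-only,
    # so the sketch rejects anything else up front.
    if not text.isascii():
        raise ValueError("ASCII input expected")
    pairs = []
    for item in text.split(","):
        # Assumption: only ASCII spaces are skipped around tokens.
        item = item.strip(" ")
        if not item:
            # Empty pair: leading, trailing, or repeated commas.
            continue
        name, sep, value = item.partition("=")
        name, value = name.strip(" "), value.strip(" ")
        if not sep or not name or not value:
            raise ValueError(f"malformed pair: {item!r}")
        pairs.append((name, value))
    return pairs


if __name__ == "__main__":
    print(parse_pairs(",,name1=value1, name2 =   value2 ,,, name3=value3,,"))
    print(parse_pairs(",latin1=latin1_bin,utf8mb4=utf8mb4_bin"))
```

Under these assumptions, the leading comma produced by REGEXP_REPLACE() is skipped as an empty pair, which matches the behaviour that SET shows in the transcript above.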

Simple_tokenizer is charset-unaware: it expects only ASCII data. It can be reused for other system variables whose values are pure ASCII data in a complex format, such as lists:

sql_mode
optimizer_switch
log_slow_filter
myisam_recover_options
slave_transaction_retry_errors

For now, every variable implements its own tokenizer, so behaviour varies from one variable to another:

-- A value consisting only of spaces is not allowed

SET optimizer_switch=' ';

ERROR 1231 (42000): Variable 'optimizer_switch' can't be set to the value of ' '

-- Empty pairs are not allowed

SET optimizer_switch=',';

ERROR 1231 (42000): Variable 'optimizer_switch' can't be set to the value of ','

-- Spaces before commas are allowed

SET optimizer_switch='index_merge=on ,index_merge_union=on';

Query OK, 0 rows affected (0.000 sec)

-- However, spaces after commas are not allowed

SET optimizer_switch='index_merge=on, index_merge_union=on';

ERROR 1231 (42000): Variable 'optimizer_switch' can't be set to the value of ' index_merge_union=on'

Let's reuse the class Simple_tokenizer to:

make all variables work in the same style
reduce duplicate/similar code
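A rough Python sketch of what such reuse could look like: one shared splitting step, with thin per-variable parsers on top of it. All names here (`split_items`, `parse_flag_list`, `parse_switch_list`) are hypothetical, the accepted values are simplified to on/off, and the issue does not say whether empty or spaces-only values should become legal for every variable; the sketch simply accepts them because the shared step drops empty items.

```python
def split_items(text):
    """Shared step: split on commas, trim ASCII spaces, drop empty items."""
    if not text.isascii():
        raise ValueError("ASCII input expected")
    items = (item.strip(" ") for item in text.split(","))
    return [item for item in items if item]


def parse_flag_list(text, known):
    """Hypothetical parser for a plain list of names (sql_mode style)."""
    flags = split_items(text)
    for flag in flags:
        if flag not in known:
            raise ValueError(f"unknown flag: {flag!r}")
    return flags


def parse_switch_list(text, known):
    """Hypothetical parser for name=on|off pairs (optimizer_switch style)."""
    result = {}
    for item in split_items(text):
        name, sep, value = (part.strip(" ") for part in item.partition("="))
        # Simplification: only on/off are accepted as values in this sketch.
        if sep != "=" or name not in known or value not in ("on", "off"):
            raise ValueError(f"bad switch: {item!r}")
        result[name] = (value == "on")
    return result


if __name__ == "__main__":
    switches = {"index_merge", "index_merge_union"}
    # Spaces before and after commas are treated the same way.
    print(parse_switch_list("index_merge=on ,index_merge_union=on", switches))
    print(parse_switch_list("index_merge=on, index_merge_union=on", switches))
```

Because both parsers go through the same splitting step, whitespace and empty-item handling cannot drift apart between variables, which is the consistency this task asks for.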

Issue Links

relates to: MDEV-30164 System variable for default collations (Closed)

Activity

There are no comments yet on this issue.

People

Assignee: Alexander Barkov

Reporter: Alexander Barkov

Votes: 0

Watchers: 3

Dates

Created: 2023-07-20 01:50

Updated: 2024-06-21 09:07

MariaDB Server