Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25015

Custom formatting of strings in MariaDB queries

Details

    Description

      Formatting more complex strings in a SELECT statement can get awkward when there are many concat(), format(), etc calls involved.
      It would be very cool and helpful to have a function that takes an input string and a formatting specification and returns string formatted using the rules the user passed in the specification.

      A great example for such a function is the classic C printf function, which, in this context, would look something like:

      SELECT printf('%s %s, %s', first_name, last_name, job_title) from employees;

      But it doesn't necessarily need to look this way, an alternative syntax could be Python-ish, which would leverage the fact that the server already knows the datatype of each field used in the formatting scheme:

      SELECT sformat('arg1: {}, arg2: {}', col1, col2) from table;

      In that syntax one passes formatting options within the curly braces:

      -- Print 'arg1: col1, arg2: col2'  where col1 from table is of datetime type and should be printed as: 'Sunday November 2021'
      SELECT sformat('arg1: {%W %M %Y}, arg2: {}', col1, col2) from table;
      

      Ideally, this new function should use, behind the scenes, the existing builtin formatting functions in MariaDB (e.g. date_format(), format()) and even future formatting functions (e.g. MySQL's format_bytes(), format_pico_time()), so the syntax has to be designed in a smart way to accommodate easily future additions.

      Attachments

        Issue Links

          Activity

            is there anything comparable in other DBMS-es?

            serg Sergei Golubchik added a comment - is there anything comparable in other DBMS-es?

            Didn't find in Oracle, DB2, or SQL Server.
            PostgreSQL has https://www.postgresql.org/docs/9.3/functions-string.html#FUNCTIONS-STRING-FORMAT

            serg Sergei Golubchik added a comment - Didn't find in Oracle, DB2, or SQL Server. PostgreSQL has https://www.postgresql.org/docs/9.3/functions-string.html#FUNCTIONS-STRING-FORMAT

            Perhaps, suggesting {%W %M %Y} wasn't such a great idea, one can always do

            SELECT sformat('arg1: {}', date_format(col1, '%W %M %Y')) from table
            

            May be it'd be more consistent to stick to python-ish format and not allow completely different syntax inside curly braces for different types.

            Having PostgreSQL-like "sql literal" and "identifier" (also like %`s in my_vsnprinf()) modifiers would be quite useful, though.

            serg Sergei Golubchik added a comment - Perhaps, suggesting { %W %M %Y } wasn't such a great idea, one can always do SELECT sformat( 'arg1: {}' , date_format(col1, '%W %M %Y' )) from table May be it'd be more consistent to stick to python-ish format and not allow completely different syntax inside curly braces for different types. Having PostgreSQL-like "sql literal" and "identifier" (also like %`s in my_vsnprinf() ) modifiers would be quite useful, though.
            serg Sergei Golubchik added a comment - - edited

            just to record the conclusion. We ended up using fmt library.

            Changes/limitation:

            • temporal data are handled as strings (no native support yet)
            • decimal data are handled as doubles (no native support yet)
            • row values are syntax errors (the parser cannot do it)
            • {:Ñ€} is not supported, obviously
            • {:L} uses separators as in FORMAT() function, and does not look at the OS locale
            serg Sergei Golubchik added a comment - - edited just to record the conclusion. We ended up using fmt library. Changes/limitation: temporal data are handled as strings (no native support yet) decimal data are handled as doubles (no native support yet) row values are syntax errors (the parser cannot do it) { :Ñ€ } is not supported, obviously { :L } uses separators as in FORMAT() function, and does not look at the OS locale

            serg Thank you for the solid MTR testcase main/func_sformat.test

            Roel Roel Van de Paar added a comment - serg Thank you for the solid MTR testcase main/func_sformat.test

            serg Hi! There is a compiler error when installing libfmt-dev on Ubuntu 20.04.3 LTS and compiling:

            [ 91%] Built target ma_test_loghandler_first_lsn-t
            [ 91%] Building CXX object sql/CMakeFiles/sql.dir/lock.cc.o
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc: In member function ‘virtual String* Item_func_sformat::val_str(String*)’:
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1392:27: error: ‘fmt::detail’ has not been declared
             1392 |       vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_int());
                  |                           ^~~~~~
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1392:47: error: expected primary-expression before ‘>’ token
             1392 |       vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_int());
                  |                                               ^
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1396:27: error: ‘fmt::detail’ has not been declared
             1396 |       vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_real());
                  |                           ^~~~~~
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1396:47: error: expected primary-expression before ‘>’ token
             1396 |       vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_real());
                  |                                               ^
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1405:29: error: ‘fmt::detail’ has not been declared
             1405 |         vargs[carg-1]= fmt::detail::make_arg<ctx>(parg->ptr()[0]);
                  |                             ^~~~~~
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1405:49: error: expected primary-expression before ‘>’ token
             1405 |         vargs[carg-1]= fmt::detail::make_arg<ctx>(parg->ptr()[0]);
                  |                                                 ^
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1407:29: error: ‘fmt::detail’ has not been declared
             1407 |         vargs[carg-1]= fmt::detail::make_arg<ctx>(*parg);
                  |                             ^~~~~~
            /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1407:49: error: expected primary-expression before ‘>’ token
             1407 |         vargs[carg-1]= fmt::detail::make_arg<ctx>(*parg);
                  |                                                 ^
            [ 91%] Building CXX object sql/CMakeFiles/sql.dir/log_event.cc.o
            make[2]: *** [sql/CMakeFiles/sql.dir/build.make:418: sql/CMakeFiles/sql.dir/item_strfunc.cc.o] Error 1
            make[2]: *** Waiting for unfinished jobs....
            make[1]: *** [CMakeFiles/Makefile2:11022: sql/CMakeFiles/sql.dir/all] Error 2
            make: *** [Makefile:163: all] Error 2
            

            Compiling with that package not installed or purged works fine (and SFORMAT works in that case too).

            Roel Roel Van de Paar added a comment - serg Hi! There is a compiler error when installing libfmt-dev on Ubuntu 20.04.3 LTS and compiling: [ 91%] Built target ma_test_loghandler_first_lsn-t [ 91%] Building CXX object sql/CMakeFiles/sql.dir/lock.cc.o /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc: In member function ‘virtual String* Item_func_sformat::val_str(String*)’: /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1392:27: error: ‘fmt::detail’ has not been declared 1392 | vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_int()); | ^~~~~~ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1392:47: error: expected primary-expression before ‘>’ token 1392 | vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_int()); | ^ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1396:27: error: ‘fmt::detail’ has not been declared 1396 | vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_real()); | ^~~~~~ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1396:47: error: expected primary-expression before ‘>’ token 1396 | vargs[carg-1]= fmt::detail::make_arg<ctx>(args[carg]->val_real()); | ^ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1405:29: error: ‘fmt::detail’ has not been declared 1405 | vargs[carg-1]= fmt::detail::make_arg<ctx>(parg->ptr()[0]); | ^~~~~~ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1405:49: error: expected primary-expression before ‘>’ token 1405 | vargs[carg-1]= fmt::detail::make_arg<ctx>(parg->ptr()[0]); | ^ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1407:29: error: ‘fmt::detail’ has not been declared 1407 | vargs[carg-1]= fmt::detail::make_arg<ctx>(*parg); | ^~~~~~ /test/MDEV-25015-10.7_dbg/sql/item_strfunc.cc:1407:49: error: expected primary-expression before ‘>’ token 1407 | vargs[carg-1]= fmt::detail::make_arg<ctx>(*parg); | ^ [ 91%] Building CXX object sql/CMakeFiles/sql.dir/log_event.cc.o make[2]: *** [sql/CMakeFiles/sql.dir/build.make:418: sql/CMakeFiles/sql.dir/item_strfunc.cc.o] Error 1 make[2]: *** Waiting for unfinished jobs.... make[1]: *** [CMakeFiles/Makefile2:11022: sql/CMakeFiles/sql.dir/all] Error 2 make: *** [Makefile:163: all] Error 2 Compiling with that package not installed or purged works fine (and SFORMAT works in that case too).
            Roel Roel Van de Paar added a comment - - edited

            serg Looking at TODO-3121, it could be the minimum version? The standard version (from sudo apt install libfmt-dev on 20.04.3 LTS) is 6.1 (libfmt-dev_6.1.2+ds-2_amd64.deb).

            And, version 7 is not available by default E: Unable to locate package libfmt7-dev

            Roel Roel Van de Paar added a comment - - edited serg Looking at TODO-3121, it could be the minimum version? The standard version (from sudo apt install libfmt-dev on 20.04.3 LTS) is 6.1 ( libfmt-dev_6.1.2+ds-2_amd64.deb ). And, version 7 is not available by default E: Unable to locate package libfmt7-dev
            Roel Roel Van de Paar added a comment - - edited

            Testing (without libfmt-dev installed, which works, ref question in TODO-3121) shows;

            • Feature looks stable and well implemented.
            • Created MTR test is extensive and also stable, even when run in parallel + many times.
            • When the same SQL is taken and run in 1k parallel threads using pquery, feature also remains stable.
            • Idem for CLI (2k threads) and still performant (SFORMAT).
            • Format rendering is ultra fast.
            • No ASAN nor UBSAN errors were detected.
            Roel Roel Van de Paar added a comment - - edited Testing (without libfmt-dev installed, which works, ref question in TODO-3121) shows; Feature looks stable and well implemented. Created MTR test is extensive and also stable, even when run in parallel + many times. When the same SQL is taken and run in 1k parallel threads using pquery, feature also remains stable. Idem for CLI (2k threads) and still performant (SFORMAT). Format rendering is ultra fast. No ASAN nor UBSAN errors were detected.

            I consider the feature OK to push.

            Roel Roel Van de Paar added a comment - I consider the feature OK to push.
            serg Sergei Golubchik added a comment - - edited

            Roel, you should use preview-10.7-MDEV-25015-sformat branch, as listed in the TODO-3118. bb-10.7-MDEV-25015-sformat is old and leaks memory every time SFORMAT is called.

            Yes, our implementation needs libfmt at least 7.0, Ubuntu's 6.1 is too old. Thanks, for pointing it out, I'll fix it.

            By the way, it's not my code or test cases, it's a GSoC'21 project implemented by Alan Cueva

            serg Sergei Golubchik added a comment - - edited Roel , you should use preview-10.7- MDEV-25015 -sformat branch, as listed in the TODO-3118. bb-10.7- MDEV-25015 -sformat is old and leaks memory every time SFORMAT is called. Yes, our implementation needs libfmt at least 7.0, Ubuntu's 6.1 is too old. Thanks, for pointing it out, I'll fix it. By the way, it's not my code or test cases, it's a GSoC'21 project implemented by Alan Cueva

            serg Thank you. I went back to double check the logs, and that was indeed the branch I used for testing.

            git clone --depth=1 --recurse-submodules -j8 --branch=preview-10.7-MDEV-25015-sformat https://github.com/MariaDB/server.git MDEV-25015-10.7
            

            I see. Thank you Alan Cueva for a well done testcase. Ack on libmft.

            Roel Roel Van de Paar added a comment - serg Thank you. I went back to double check the logs, and that was indeed the branch I used for testing. git clone --depth=1 --recurse-submodules -j8 --branch=preview-10.7-MDEV-25015-sformat https://github.com/MariaDB/server.git MDEV-25015-10.7 I see. Thank you Alan Cueva for a well done testcase. Ack on libmft.
            Roel Roel Van de Paar added a comment - - edited

            Logged MDEV-26646 SFORMAT Does not allow @variable use (Warning 4183 SFORMAT error: invalid type specifier)
            Logged MDEV-26648 Feature: SFORMAT auto-typing
            Logged MDEV-26649 SFORMAT: Temporal type support (DATE) missing

            Roel Roel Van de Paar added a comment - - edited Logged MDEV-26646 SFORMAT Does not allow @variable use (Warning 4183 SFORMAT error: invalid type specifier) Logged MDEV-26648 Feature: SFORMAT auto-typing Logged MDEV-26649 SFORMAT: Temporal type support (DATE) missing
            Roel Roel Van de Paar added a comment - - edited

            serg Could you please add this to the testcase (subqueries were not included yet):

            CREATE TABLE t1 (c1 FLOAT);
            INSERT INTO t1 VALUES (1.1),(2.2);
            SELECT SFORMAT ('Float 1.1: {:+f}', (SELECT c1 FROM t1 LIMIT 1));
            SELECT SFORMAT ('Error ERROR 1242 (21000): Subquery returns more than 1 row: {:+f}', (SELECT c1 FROM t1));
            SELECT SFORMAT ('Float 2.2: {:+f}', (SELECT c1 FROM t1 WHERE c1>2));
            SELECT SFORMAT ('Float 1.1: {:+f}', (SELECT * FROM t1 LIMIT 1));
            SELECT SFORMAT ('Error ERROR 1242 (21000): Subquery returns more than 1 row: {:+f}', (SELECT * FROM t1));
            DROP TABLE t1;
            CREATE TABLE t1 (c1 FLOAT,c2 FLOAT);
            INSERT INTO t1 VALUES (1.1,2.2);
            SELECT SFORMAT ('Float 1.1: {:+f}', (SELECT c1 FROM t1 LIMIT 1));
            SELECT SFORMAT ('ERROR 1241 (21000): Operand should contain 1 column(s) {:+f}', (SELECT c1,c2 FROM t1 LIMIT 1));
            SELECT SFORMAT ('ERROR 1241 (21000): Operand should contain 1 column(s) {:+f}', (SELECT * FROM t1 LIMIT 1));
            DROP TABLE t1;
            

            Thank you

            Roel Roel Van de Paar added a comment - - edited serg Could you please add this to the testcase (subqueries were not included yet): CREATE TABLE t1 (c1 FLOAT); INSERT INTO t1 VALUES (1.1),(2.2); SELECT SFORMAT ('Float 1.1: {:+f}', (SELECT c1 FROM t1 LIMIT 1)); SELECT SFORMAT ('Error ERROR 1242 (21000): Subquery returns more than 1 row: {:+f}', (SELECT c1 FROM t1)); SELECT SFORMAT ('Float 2.2: {:+f}', (SELECT c1 FROM t1 WHERE c1>2)); SELECT SFORMAT ('Float 1.1: {:+f}', (SELECT * FROM t1 LIMIT 1)); SELECT SFORMAT ('Error ERROR 1242 (21000): Subquery returns more than 1 row: {:+f}', (SELECT * FROM t1)); DROP TABLE t1; CREATE TABLE t1 (c1 FLOAT,c2 FLOAT); INSERT INTO t1 VALUES (1.1,2.2); SELECT SFORMAT ('Float 1.1: {:+f}', (SELECT c1 FROM t1 LIMIT 1)); SELECT SFORMAT ('ERROR 1241 (21000): Operand should contain 1 column(s) {:+f}', (SELECT c1,c2 FROM t1 LIMIT 1)); SELECT SFORMAT ('ERROR 1241 (21000): Operand should contain 1 column(s) {:+f}', (SELECT * FROM t1 LIMIT 1)); DROP TABLE t1; Thank you

            Note sure if this could be considered a bug?

            10.7.0 53b2c1f4664a3cb90f583979d9aa2771b7e1c98d (Debug)

            10.7.0-dbg>CREATE TABLE t1 (c1 FLOAT);
            Query OK, 0 rows affected (0.013 sec)
             
            10.7.0-dbg>INSERT INTO t1 VALUES (1.1);
            Query OK, 1 row affected (0.003 sec)
             
            10.7.0-dbg>SELECT * FROM t1;
            +------+
            | c1   |
            +------+
            |  1.1 |
            +------+
            1 row in set (0.001 sec)
             
            10.7.0-dbg>SELECT SFORMAT('{}',(SELECT 1.1));
            +----------------------------+
            | SFORMAT('{}',(SELECT 1.1)) |
            +----------------------------+
            | 1.1                        |
            +----------------------------+
            1 row in set (0.000 sec)
             
            10.7.0-dbg>SELECT SFORMAT('{}',(SELECT c1 FROM t1));
            +-----------------------------------+
            | SFORMAT('{}',(SELECT c1 FROM t1)) |
            +-----------------------------------+
            | 1.100000023841858                 |
            +-----------------------------------+
            1 row in set (0.001 sec)
            

            Roel Roel Van de Paar added a comment - Note sure if this could be considered a bug? 10.7.0 53b2c1f4664a3cb90f583979d9aa2771b7e1c98d (Debug) 10.7.0-dbg>CREATE TABLE t1 (c1 FLOAT); Query OK, 0 rows affected (0.013 sec)   10.7.0-dbg>INSERT INTO t1 VALUES (1.1); Query OK, 1 row affected (0.003 sec)   10.7.0-dbg>SELECT * FROM t1; +------+ | c1 | +------+ | 1.1 | +------+ 1 row in set (0.001 sec)   10.7.0-dbg>SELECT SFORMAT('{}',(SELECT 1.1)); +----------------------------+ | SFORMAT('{}',(SELECT 1.1)) | +----------------------------+ | 1.1 | +----------------------------+ 1 row in set (0.000 sec)   10.7.0-dbg>SELECT SFORMAT('{}',(SELECT c1 FROM t1)); +-----------------------------------+ | SFORMAT('{}',(SELECT c1 FROM t1)) | +-----------------------------------+ | 1.100000023841858 | +-----------------------------------+ 1 row in set (0.001 sec)

            yes, let's consider it a bug. libfmt can print floats just fine, so we should pass down floats as floats without upcasting them to double.

            serg Sergei Golubchik added a comment - yes, let's consider it a bug. libfmt can print floats just fine, so we should pass down floats as floats without upcasting them to double.

            Logged as MDEV-26691 SFORMAT: Pass down FLOAT as FLOAT, without upcast to DOUBLE

            Roel Roel Van de Paar added a comment - Logged as MDEV-26691 SFORMAT: Pass down FLOAT as FLOAT, without upcast to DOUBLE
            Roel Roel Van de Paar added a comment - - edited

            Though it's covered in the testcase (input (5,5,5)) should this give a warning ("extraneous inputs/parameter found") - or the existing ERROR 1582 (42000): Incorrect parameter count in the call to native function 'sformat'?

            10.7.0 53b2c1f4664a3cb90f583979d9aa2771b7e1c98d (Debug)

            10.7.0-dbg>SELECT SFORMAT(1,2,3);
            +----------------+
            | SFORMAT(1,2,3) |
            +----------------+
            | 1              |
            +----------------+
            1 row in set (0.000 sec)
            

            Roel Roel Van de Paar added a comment - - edited Though it's covered in the testcase ( input (5,5,5) ) should this give a warning ("extraneous inputs/parameter found") - or the existing ERROR 1582 (42000): Incorrect parameter count in the call to native function 'sformat' ? 10.7.0 53b2c1f4664a3cb90f583979d9aa2771b7e1c98d (Debug) 10.7.0-dbg>SELECT SFORMAT(1,2,3); +----------------+ | SFORMAT(1,2,3) | +----------------+ | 1 | +----------------+ 1 row in set (0.000 sec)

            Further testing has not revealed any additional issues save the one above.

            Re-assigning back to serg for push to trunk if not done already.

            Roel Roel Van de Paar added a comment - Further testing has not revealed any additional issues save the one above. Re-assigning back to serg for push to trunk if not done already.
            f_razzoli Federico Razzoli added a comment - - edited

            Trying to use named parameters ('{name}') doesn't return a specific error, though I couldn't find a way to make it work. Please clarify how to use named parameters in the docs, or produce an error that clarifies that this is not supported.

            f_razzoli Federico Razzoli added a comment - - edited Trying to use named parameters ('{name}') doesn't return a specific error, though I couldn't find a way to make it work. Please clarify how to use named parameters in the docs, or produce an error that clarifies that this is not supported.

            This is the error message that libfmt returns. And yes named parameters don't work. It'd open a very special can of worms, consider, for example,

            select sformat(concat("Column value: {", col, "}")) from t1;
            

            which shoud show the value of a column which name is stored in the column col.

            I'm not saying it's impossible to make it work, but it'd be very complex with tons of vague corner cases. We decided not to go there for now.

            serg Sergei Golubchik added a comment - This is the error message that libfmt returns. And yes named parameters don't work. It'd open a very special can of worms, consider, for example, select sformat(concat( "Column value: {" , col, "}" )) from t1; which shoud show the value of a column which name is stored in the column col . I'm not saying it's impossible to make it work, but it'd be very complex with tons of vague corner cases. We decided not to go there for now.

            I think there are ways to write obscure queries both with named and positional arguments. But it's reasonable to omit a feature from libfmt, I just think that it should be more clear.

            f_razzoli Federico Razzoli added a comment - I think there are ways to write obscure queries both with named and positional arguments. But it's reasonable to omit a feature from libfmt, I just think that it should be more clear.

            People

              serg Sergei Golubchik
              robertbindar Robert Bindar
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.