Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4742

A function for native natural sorting / natural compare

    XMLWordPrintable

Details

    Description

      Hi guys, could be nice a natural sorting inside mariadb
      today some guys use SQL procedures/functions to execute this cpu intensive task
      php released a nice function to do this job, maybe we could port part of it and implement in udf function, and release as a default udf function?

      functions:
      1) natural sort, used in ORDER BY / GROUP BY
      2) natural sort compare, could be used as an operator, like the "SOUNDS LIKE" operator, in other words, the function that return a canonical form of the string could be used to compare, something that could be rewrite "field NATURAL LIKE value" to "natual(field)=natural(value)"

      i didn't found (2) in internet, but it's used in php, and the (1) should use it in some place to order/group correctly
      --------------
      example of (2):
      http://www.php.net/manual/en/function.strnatcmp.php
      https://github.com/php/php-src/blob/master/ext/standard/strnatcmp.c
      --------------

      example of (1)
      Here a example in STORED PROCEDURE/FUNCTIONS for ORDER BY, but not exaust tested:
      source: http://stackoverflow.com/questions/153633/natural-sort-in-mysql

      DROP FUNCTION IF EXISTS `udf_FirstNumberPos`;
      DELIMITER ;;
      CREATE FUNCTION `udf_FirstNumberPos` (`instring` varchar(4000)) 
      RETURNS int
      LANGUAGE SQL
      DETERMINISTIC
      NO SQL
      SQL SECURITY INVOKER
      BEGIN
          DECLARE position int;
          DECLARE tmp_position int;
          SET position = 5000;
          SET tmp_position = LOCATE('0', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF; 
          SET tmp_position = LOCATE('1', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('2', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('3', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('4', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('5', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('6', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('7', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('8', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
          SET tmp_position = LOCATE('9', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
       
          IF (position = 5000) THEN RETURN 0; END IF;
          RETURN position;
      END
      ;;
       
      DROP FUNCTION IF EXISTS `udf_NaturalSortFormat`;
      DELIMITER ;;
      CREATE FUNCTION `udf_NaturalSortFormat` (`instring` varchar(4000), `numberLength` int, `sameOrderChars` char(50)) 
      RETURNS varchar(4000)
      LANGUAGE SQL
      DETERMINISTIC
      NO SQL
      SQL SECURITY INVOKER
      BEGIN
          DECLARE sortString varchar(4000);
          DECLARE numStartIndex int;
          DECLARE numEndIndex int;
          DECLARE padLength int;
          DECLARE totalPadLength int;
          DECLARE i int;
          DECLARE sameOrderCharsLen int;
       
          SET totalPadLength = 0;
          SET instring = TRIM(instring);
          SET sortString = instring;
          SET numStartIndex = udf_FirstNumberPos(instring);
          SET numEndIndex = 0;
          SET i = 1;
          SET sameOrderCharsLen = LENGTH(sameOrderChars);
       
          WHILE (i <= sameOrderCharsLen) DO
              SET sortString = REPLACE(sortString, SUBSTRING(sameOrderChars, i, 1), ' ');
              SET i = i + 1;
          END WHILE;
       
          WHILE (numStartIndex <> 0) DO
              SET numStartIndex = numStartIndex + numEndIndex;
              SET numEndIndex = numStartIndex;
       
              WHILE (udf_FirstNumberPos(SUBSTRING(instring, numEndIndex, 1)) = 1) DO
                  SET numEndIndex = numEndIndex + 1;
              END WHILE;
       
              SET numEndIndex = numEndIndex - 1;
       
              SET padLength = numberLength - (numEndIndex + 1 - numStartIndex);
       
              IF padLength < 0 THEN
                  SET padLength = 0;
              END IF;
       
              SET sortString = INSERT(sortString, numStartIndex + totalPadLength, 0, REPEAT('0', padLength));
       
              SET totalPadLength = totalPadLength + padLength;
              SET numStartIndex = udf_FirstNumberPos(RIGHT(instring, LENGTH(instring) - numEndIndex));
          END WHILE;
       
          RETURN sortString;
      END
      ;;

      Attachments

        1. mariadb_nat_comp.c
          13 kB
          roberto spadim
        2. num.01.diff
          7 kB
          Alexander Barkov

        Issue Links

          Activity

            People

              wlad Vladislav Vaintroub
              rspadim roberto spadim
              Votes:
              2 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.