Details
-
Task
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
Description
Hi guys, could be nice a natural sorting inside mariadb
today some guys use SQL procedures/functions to execute this cpu intensive task
php released a nice function to do this job, maybe we could port part of it and implement in udf function, and release as a default udf function?
functions:
1) natural sort, used in ORDER BY / GROUP BY
2) natural sort compare, could be used as an operator, like the "SOUNDS LIKE" operator, in other words, the function that return a canonical form of the string could be used to compare, something that could be rewrite "field NATURAL LIKE value" to "natual(field)=natural(value)"
i didn't found (2) in internet, but it's used in php, and the (1) should use it in some place to order/group correctly
--------------
example of (2):
http://www.php.net/manual/en/function.strnatcmp.php
https://github.com/php/php-src/blob/master/ext/standard/strnatcmp.c
--------------
example of (1)
Here a example in STORED PROCEDURE/FUNCTIONS for ORDER BY, but not exaust tested:
source: http://stackoverflow.com/questions/153633/natural-sort-in-mysql
DROP FUNCTION IF EXISTS `udf_FirstNumberPos`;
|
DELIMITER ;;
|
CREATE FUNCTION `udf_FirstNumberPos` (`instring` varchar(4000))
|
RETURNS int
|
LANGUAGE SQL
|
DETERMINISTIC
|
NO SQL
|
SQL SECURITY INVOKER
|
BEGIN
|
DECLARE position int;
|
DECLARE tmp_position int;
|
SET position = 5000;
|
SET tmp_position = LOCATE('0', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('1', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('2', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('3', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('4', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('5', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('6', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('7', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('8', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
SET tmp_position = LOCATE('9', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
|
|
IF (position = 5000) THEN RETURN 0; END IF;
|
RETURN position;
|
END
|
;;
|
|
DROP FUNCTION IF EXISTS `udf_NaturalSortFormat`;
|
DELIMITER ;;
|
CREATE FUNCTION `udf_NaturalSortFormat` (`instring` varchar(4000), `numberLength` int, `sameOrderChars` char(50))
|
RETURNS varchar(4000)
|
LANGUAGE SQL
|
DETERMINISTIC
|
NO SQL
|
SQL SECURITY INVOKER
|
BEGIN
|
DECLARE sortString varchar(4000);
|
DECLARE numStartIndex int;
|
DECLARE numEndIndex int;
|
DECLARE padLength int;
|
DECLARE totalPadLength int;
|
DECLARE i int;
|
DECLARE sameOrderCharsLen int;
|
|
SET totalPadLength = 0;
|
SET instring = TRIM(instring);
|
SET sortString = instring;
|
SET numStartIndex = udf_FirstNumberPos(instring);
|
SET numEndIndex = 0;
|
SET i = 1;
|
SET sameOrderCharsLen = LENGTH(sameOrderChars);
|
|
WHILE (i <= sameOrderCharsLen) DO
|
SET sortString = REPLACE(sortString, SUBSTRING(sameOrderChars, i, 1), ' ');
|
SET i = i + 1;
|
END WHILE;
|
|
WHILE (numStartIndex <> 0) DO
|
SET numStartIndex = numStartIndex + numEndIndex;
|
SET numEndIndex = numStartIndex;
|
|
WHILE (udf_FirstNumberPos(SUBSTRING(instring, numEndIndex, 1)) = 1) DO
|
SET numEndIndex = numEndIndex + 1;
|
END WHILE;
|
|
SET numEndIndex = numEndIndex - 1;
|
|
SET padLength = numberLength - (numEndIndex + 1 - numStartIndex);
|
|
IF padLength < 0 THEN
|
SET padLength = 0;
|
END IF;
|
|
SET sortString = INSERT(sortString, numStartIndex + totalPadLength, 0, REPEAT('0', padLength));
|
|
SET totalPadLength = totalPadLength + padLength;
|
SET numStartIndex = udf_FirstNumberPos(RIGHT(instring, LENGTH(instring) - numEndIndex));
|
END WHILE;
|
|
RETURN sortString;
|
END
|
;;
|
Attachments
Issue Links
- causes
-
MDEV-26786 Inserting NULL into base column breaks NATURAL_SORT_KEY column
- Closed
-
MDEV-26787 Document that NATURAL_SORT_KEY virtual column should be longer than the base column (and other notes)
- Closed
-
MDEV-26796 Natural sort does not work for utf32/utf16/ucs2
- Closed
-
MDEV-26806 Server crash in Charset::charset / Item_func_natural_sort_key::val_str
- Closed
- relates to
-
MDEV-23400 Add UCA case sensitive accent sensitive collations for Unicode character sets
- Closed
- links to