Details

    • New Feature
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • None
    • 22.08.1
    • PrimProc
    • None

    Description

      Our current implementation of the aggregate MODA supports all numeric types. Implementations by other databases support string types as well. I would expect some users to want such functionality.

      Attachments

        Issue Links

          Activity

            This functionality was incorporated in MCOL-5092

            David.Hall David Hall (Inactive) added a comment - This functionality was incorporated in MCOL-5092

            QA: Moda() takes the input column (after filters) and finds the value with the most occurrences. If there's a tie for numerical types it grabs the one closest to avg(). If there's still a tie, it uses the smallest one.

            For string types, there is no concept of avg(), so in case of a tie, it should choose the one with the lowest collation order.

            We should test for both CHAR with lengths < 9 and lengths > 8 and VARCHAR with lengths < 8 and lengths > 7. Note that this is byte lengths. We should also test with various collations and character sets with multi-byte characters.

            David.Hall David Hall (Inactive) added a comment - QA: Moda() takes the input column (after filters) and finds the value with the most occurrences. If there's a tie for numerical types it grabs the one closest to avg(). If there's still a tie, it uses the smallest one. For string types, there is no concept of avg(), so in case of a tie, it should choose the one with the lowest collation order. We should test for both CHAR with lengths < 9 and lengths > 8 and VARCHAR with lengths < 8 and lengths > 7. Note that this is byte lengths. We should also test with various collations and character sets with multi-byte characters.

            Build tested: 22.08-1 (#5290)

            Tested the following test case using different data types char(5), char(13), varchar(5), and varchar(13).

            Using numeric strings

            MariaDB [mytest]> CREATE TABLE tchar (C1 char(5)) ENGINE=COLUMNSTORE
                -> ;
            Query OK, 0 rows affected (1.119 sec)
             
            MariaDB [mytest]> INSERT tchar VALUES ('1'),('2'),('2'),('2'),('3'),('4'),('8'),('8'),('8');
            Query OK, 9 rows affected (0.246 sec)
            Records: 9  Duplicates: 0  Warnings: 0
             
            MariaDB [mytest]> SELECT MODA(C1) FROM tchar;
            +----------+
            | MODA(C1) |
            +----------+
            | 8        |
            +----------+
            1 row in set (0.028 sec)
             
            MariaDB [mytest]> select * from tchar order by c1;
            +------+
            | C1   |
            +------+
            | 1    |
            | 2    |
            | 2    |
            | 2    |
            | 3    |
            | 4    |
            | 8    |
            | 8    |
            | 8    |
            +------+
            9 rows in set (0.004 sec)
             
            MariaDB [mytest]> insert into tchar values('9');
            Query OK, 1 row affected (1.188 sec)
             
            MariaDB [mytest]> SELECT MODA(C1) FROM tchar;
            +----------+
            | MODA(C1) |
            +----------+
            | 8        |
            +----------+
            1 row in set (0.014 sec)
            

            Using strings

            MariaDB [mytest]> CREATE TABLE tvchar (C1 varchar(13)) ENGINE=COLUMNSTORE
                -> ;
            Query OK, 0 rows affected (1.112 sec)
             
            MariaDB [mytest]> INSERT tvchar VALUES ('t1'),('t2'),('t2'),('t2'),('t3'),('t4'),('t8'),('t8'),('t8');
            Query OK, 9 rows affected (0.145 sec)
            Records: 9  Duplicates: 0  Warnings: 0
             
            MariaDB [mytest]> insert into tvchar values('t9'),('t9'), ('t9');
            Query OK, 3 rows affected (0.171 sec)
            Records: 3  Duplicates: 0  Warnings: 0
             
            MariaDB [mytest]> SELECT MODA(C1) FROM tvchar;
            +----------+
            | MODA(C1) |
            +----------+
            | t8       |
            +----------+
            1 row in set (0.028 sec)
            

            dleeyh Daniel Lee (Inactive) added a comment - Build tested: 22.08-1 (#5290) Tested the following test case using different data types char(5), char(13), varchar(5), and varchar(13). Using numeric strings MariaDB [mytest]> CREATE TABLE tchar (C1 char(5)) ENGINE=COLUMNSTORE -> ; Query OK, 0 rows affected (1.119 sec)   MariaDB [mytest]> INSERT tchar VALUES ('1'),('2'),('2'),('2'),('3'),('4'),('8'),('8'),('8'); Query OK, 9 rows affected (0.246 sec) Records: 9 Duplicates: 0 Warnings: 0   MariaDB [mytest]> SELECT MODA(C1) FROM tchar; +----------+ | MODA(C1) | +----------+ | 8 | +----------+ 1 row in set (0.028 sec)   MariaDB [mytest]> select * from tchar order by c1; +------+ | C1 | +------+ | 1 | | 2 | | 2 | | 2 | | 3 | | 4 | | 8 | | 8 | | 8 | +------+ 9 rows in set (0.004 sec)   MariaDB [mytest]> insert into tchar values('9'); Query OK, 1 row affected (1.188 sec)   MariaDB [mytest]> SELECT MODA(C1) FROM tchar; +----------+ | MODA(C1) | +----------+ | 8 | +----------+ 1 row in set (0.014 sec) Using strings MariaDB [mytest]> CREATE TABLE tvchar (C1 varchar(13)) ENGINE=COLUMNSTORE -> ; Query OK, 0 rows affected (1.112 sec)   MariaDB [mytest]> INSERT tvchar VALUES ('t1'),('t2'),('t2'),('t2'),('t3'),('t4'),('t8'),('t8'),('t8'); Query OK, 9 rows affected (0.145 sec) Records: 9 Duplicates: 0 Warnings: 0   MariaDB [mytest]> insert into tvchar values('t9'),('t9'), ('t9'); Query OK, 3 rows affected (0.171 sec) Records: 3 Duplicates: 0 Warnings: 0   MariaDB [mytest]> SELECT MODA(C1) FROM tvchar; +----------+ | MODA(C1) | +----------+ | t8 | +----------+ 1 row in set (0.028 sec)

            2 or t2 were expected from the tests above.

            dleeyh Daniel Lee (Inactive) added a comment - 2 or t2 were expected from the tests above.

            Reopen per the last test result.

            dleeyh Daniel Lee (Inactive) added a comment - Reopen per the last test result.

            Build verified: 22.08-1 (#5303)

            dleeyh Daniel Lee (Inactive) added a comment - Build verified: 22.08-1 (#5303)

            People

              dleeyh Daniel Lee (Inactive)
              David.Hall David Hall (Inactive)
              Daniel Lee Daniel Lee (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.