Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 10.3.11
    • Fix Version/s: N/A
    • Component/s: Optimizer
    • Labels: None
    • Environment: Linux

    Description

      I checked this with MS SQL and it behaves correctly.
      I have a very large table with a single varchar column, which is the primary key.
      These two statements take forever:
      SET @strvbig:='9544447408'; SELECT 1 FROM large_table WHERE did = CAST(@strvbig AS CHAR) LIMIT 1;
      SET @strvbig:=9544447408; SELECT 1 FROM large_table WHERE did = CAST(@strvbig AS CHAR) LIMIT 1;
      while
      SELECT 1 FROM large_table WHERE did = '9544447408' LIMIT 1;
      returns immediately.
      The optimizer is executing the cast on the right side of the WHERE clause for each row, which makes no sense: no columns are involved, so it is a constant expression.
      In MS SQL, this works perfectly:
      declare @strvbig bigint=9544447408;
      select 1 from large_table where did=cast(@strvbig as varchar);
      This problem adds an enormous amount of inefficiency to MariaDB stored procedures that contain statements like SELECT ... FROM large_table WHERE column = variable, whenever the variable is of a different datatype than the column.
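
      For reference, a sketch of the forms that avoid the per-row conversion (table and column names as in the report; whether the un-cast variable comparison can use the index may still depend on the connection vs. column character sets, cf. MDEV-14698):

      ```sql
      -- String literal: the predicate is a string constant of the column's
      -- type, so the primary-key lookup is used (this is the fast form above).
      SELECT 1 FROM large_table WHERE did = '9544447408' LIMIT 1;

      -- Assigning the value as a string avoids the number-to-string CAST in
      -- the WHERE clause; whether this form can use the index may still
      -- depend on implicit character set conversion (see MDEV-14698).
      SET @strvbig := '9544447408';
      SELECT 1 FROM large_table WHERE did = @strvbig LIMIT 1;
      ```

      These require a running MariaDB server with the table in question, so they are shown here only as a sketch.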

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment

            I think it's the same implicit charset conversion issue already discussed in another bug report, MDEV-14698. I'll leave it to alice to confirm that.
            philip_38 Philip orleans added a comment (edited)

            There is a further problem when using RocksDB. Take this query on a very large table. Please note that all three fields are boolean, i.e. numeric, and each of them is indexed.

             explain select count(*) from asterisk.npadata where reachable=1 and port_type=2 and tested=0;
            +------+-------------+---------+-------------+------------------------------------------+------------------------------------------+---------+------+------------+------------------------------------------------------------------------+
            | id   | select_type | table   | type        | possible_keys                            | key                                      | key_len | ref  | rows       | Extra                                                                  |
            +------+-------------+---------+-------------+------------------------------------------+------------------------------------------+---------+------+------------+------------------------------------------------------------------------+
            |    1 | SIMPLE      | npadata | index_merge | IDX_npadata_tested,IDX_npadata_reachable | IDX_npadata_tested,IDX_npadata_reachable | 2,2     | NULL | 1139455527 | Using intersect(IDX_npadata_tested,IDX_npadata_reachable); Using where |
            +------+-------------+---------+-------------+------------------------------------------+------------------------------------------+---------+------+------------+------------------------------------------------------------------------+
            

            I had to stop the query after 12 hours. It is basically useless for big data. This query in MS SQL might take 15 minutes at most. I could try InnoDB, but it would require several times the disk space of RocksDB. Should I try TokuDB, or is this an issue with the optimizer?
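
            An aside on the plan above: the EXPLAIN shows an index_merge intersect over two single-column indexes spanning roughly 1.1 billion rows, which is expensive. A composite index over all three predicate columns (the index name below is illustrative, not from the report) would normally let the optimizer do a single range scan instead:

            ```sql
            -- Hypothetical composite index covering all three equality
            -- predicates; the optimizer can then resolve the WHERE clause
            -- with one index range scan instead of intersecting two huge
            -- single-column indexes.
            ALTER TABLE asterisk.npadata
              ADD INDEX idx_npadata_reach_port_tested (reachable, port_type, tested);

            -- COUNT(*) can then typically be answered from the index alone:
            SELECT COUNT(*) FROM asterisk.npadata
            WHERE reachable = 1 AND port_type = 2 AND tested = 0;
            ```

            This is a sketch of a standard indexing technique, not something proposed in the ticket; building such an index on a table of this size is itself a long operation.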

            I want to emphasize that anybody can do a faster job on the same query using a CSV file and plain old AWK. It is trivial.

            The entire data fits in memory:

            +---------+---------+----------+----------+------------+
            | ENGINE  | Data MB | Index MB | Total MB | Num Tables |
            +---------+---------+----------+----------+------------+
            | NULL    |    NULL |     NULL |     NULL |          4 |
            | Aria    |     0.0 |      0.0 |      0.0 |          1 |
            | CSV     |     0.0 |      0.0 |      0.0 |          3 |
            | InnoDB  | 62981.1 |    609.4 |  63590.5 |         37 |
            | MyISAM  |     0.8 |      0.1 |      0.9 |         25 |
            | ROCKSDB | 14842.9 |  42893.4 |  57736.3 |          2 |
            +---------+---------+----------+----------+------------+
            6 rows in set (0.017 sec)
             
            MariaDB [asterisk]> show variables like 'rocksdb%cache';
            +-------------------------------------------------+-------+
            | Variable_name                                   | Value |
            +-------------------------------------------------+-------+
            | rocksdb_no_block_cache                          | OFF   |
            | rocksdb_pin_l0_filter_and_index_blocks_in_cache | ON    |
            | rocksdb_skip_fill_cache                         | OFF   |
            | rocksdb_use_clock_cache                         | OFF   |
            +-------------------------------------------------+-------+
            4 rows in set (0.002 sec)
             
            MariaDB [asterisk]> show variables like 'rocksdb%cache%';
            +-------------------------------------------------+-------------+
            | Variable_name                                   | Value       |
            +-------------------------------------------------+-------------+
            | rocksdb_block_cache_size                        | 96636764160 |
            | rocksdb_cache_index_and_filter_blocks           | ON          |
            | rocksdb_force_compute_memtable_stats_cachetime  | 60000000    |
            | rocksdb_no_block_cache                          | OFF         |
            | rocksdb_persistent_cache_path                   |             |
            | rocksdb_persistent_cache_size_mb                | 0           |
            | rocksdb_pin_l0_filter_and_index_blocks_in_cache | ON          |
            | rocksdb_sim_cache_size                          | 0           |
            | rocksdb_skip_fill_cache                         | OFF         |
            | rocksdb_table_cache_numshardbits                | 19          |
            | rocksdb_use_clock_cache                         | OFF         |
            +-------------------------------------------------+-------------+
            


            Take a table with a single field, varchar(10), primary key.
            It has 700 MM records.
            If I do this, it returns immediately:
            select * from table where field='9999999999';
            but had I chosen to use a number,
            select * from table where field=9999999999;
            it takes hours.
            You would think that the optimizer would be smart enough to convert my bigint to a varchar only once and then do the search. It is not.
            No commercial database has this issue.


            serg Sergei Golubchik added a comment

            Consider this:

            create table t1 (a varchar(100));
            insert t1 values ('9'), ('0009'), ('     9');
            select * from t1 where a=9;
            

            as you can see, one cannot just convert bigint to a varchar once.
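
            To spell the point out, a sketch extending the same example (expected results follow the documented MariaDB conversion-on-comparison rules):

            ```sql
            CREATE TABLE t1 (a varchar(100));
            INSERT INTO t1 VALUES ('9'), ('0009'), ('     9');

            -- Both sides are converted to numbers per row, so all three
            -- rows match:
            SELECT * FROM t1 WHERE a = 9;

            -- Converting 9 to the string '9' once and comparing as strings
            -- would match only the first row, i.e. a different result set:
            SELECT * FROM t1 WHERE a = '9';
            ```

            So rewriting the numeric comparison as a single string comparison is not a behavior-preserving optimization under these rules.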

            philip_38 Philip orleans added a comment (edited)

            I disagree: you take the bigint and convert it to its varchar representation, which is '9' in this case, and retrieve any record that matches '9'.
            The customer has already decided to pull that record; he is not looking for '     9' or '0009'. He knows what he is looking for; otherwise he would have used a varchar variable. Please do not assume the customer doesn't know what he or she is doing. Also, as I mentioned, every other commercial database works that way.

            serg Sergei Golubchik added a comment (edited)

            This is a different issue.

            My example shows that if the comparison rules are as documented here: https://mariadb.com/kb/en/type-conversion/#rules-for-conversion-on-comparison then the example above cannot use an index or convert the bigint to a varchar once.

            You're saying the comparison rules should be different. Perhaps, but they are very difficult to change: MariaDB has always worked like this, since 2010, and MySQL has always worked like this, since 1998. That's a lot of history to break.

            Besides, the SQL standard says that one cannot compare a number to a string at all, so any conforming database implementation should simply throw an error if you tried to do that.


            philip_38 Philip orleans added a comment

            Then close the issue.

            People

              Assignee: Unassigned
              Reporter: philip_38 Philip orleans
              Votes: 0
              Watchers: 3

