[MDEV-26232] Performance Regression in queries against mixed BIGINT values Created: 2021-07-24  Updated: 2023-04-27

Status: Confirmed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.4.20
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Major
Reporter: Josh Bello Assignee: Oleksandr Byelkin
Resolution: Unresolved Votes: 2
Labels: SELECT, performance, regression
Environment:

CentOS 7



 Description   

While testing an upgrade from MariaDB 10.2.39 to 10.4.20, we are seeing what appears to be a regression in the query planner.

The table definition we are running the query against is as follows:

Create Table: CREATE TABLE `token_data` (
  `uid` int(10) unsigned NOT NULL,
  `token` bigint(20) unsigned NOT NULL,
  `value1` bigint(20) unsigned NOT NULL,
  `value2` bigint(20) unsigned NOT NULL,
  `last_update` date NOT NULL,
  UNIQUE KEY `id_token_data_01` (`uid`,`token`),
  KEY `last_update` (`last_update`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The query that I am running is as follows:

EXPLAIN EXTENDED SELECT token,value1,value2 FROM token_data WHERE uid=12001 AND token IN (14454172572971265821,6745279665170062986,5976548953849996655,8277955528833064934,17940379197342480268,14869208320558067485,17315553045409243299,17053444543038773962,7801999317187194854,9774973577495866952,6139029933980675674,5975793561981298031,15780788265856007991,6710543574905126912,8056624067400366231,5372330405299365231,1731820755350078667,10337750861957591287,18179962722013403225,4032846873084559360,15507949777456787353,14864294813815369501,16190638042948088801,9743340660943945179,4310889774226046499,8654011433740205778,8701519266898721160,1770783118357761330,15057675132202217245,277070117901065582,540659882387707353,2205831429349578201,3457869911869787254,14980716260267224861,3754261088098005618,14052870096204695077,10943173851576400023,8654011433585790674,12993243060260250460,14979329550618816285,16850241364708797108,8583000031085752646,12264907527857408362,5618407151415731567,4703440132903012219,9920933646715251360,10410078769612255479,9859952648301078679,8583000031664820550,6007773530648094063,15819724772544679200,14387761156831755502,5620458545859667311,16245102814004570404,15079799121045188381,1121062113675701369,6715486898116100096,6463300224464549874,5499253815180141935,15718037849488072027,17592402927639996588,3569614194670960609,11469729274248341082,15053603698089750301,10489091348942158279,3961032953135919073,8947867702334725879,9774973577950588488,1282350908471679870,6820046171247287274,15543245596512385203);

The query returns 49 rows in 0.00 sec on MariaDB 10.2, but consistently takes 8-9 seconds to return the same 49 rows on MariaDB 10.4. We have been running similar queries for many years on MySQL 5.5 and MariaDB 10.2, against even much longer lists of unsigned BIGINT values, and the results have always been nearly instant.

Here is the EXPLAIN EXTENDED output on the query on MariaDB 10.2:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: token_data
         type: range
possible_keys: id_token_data_01
          key: id_token_data_01
      key_len: 12
          ref: NULL
         rows: 71
     filtered: 76.06
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Here is the EXPLAIN EXTENDED output on the query on MariaDB 10.4:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: token_data
         type: ref
possible_keys: id_token_data_01
          key: id_token_data_01
      key_len: 4
          ref: const
         rows: 9950548
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.002 sec)

As you see in the schema above, the value for 'token' is an unsigned BIGINT.

In the query we are running, 24 of the values are below the signed BIGINT maximum (9223372036854775807), and 25 are above it.
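That split can be checked mechanically. A minimal Python sketch (not part of the report; `split_tokens` is just an illustrative helper, and the threshold 9223372036854775807 is 2**63 - 1, the signed BIGINT maximum):

```python
SIGNED_BIGINT_MAX = 2**63 - 1  # 9223372036854775807

def split_tokens(tokens):
    """Partition an IN-list into values that fit signed BIGINT and values that do not."""
    low = [t for t in tokens if t <= SIGNED_BIGINT_MAX]
    high = [t for t in tokens if t > SIGNED_BIGINT_MAX]
    return low, high

# Three values taken from the IN-list above, for illustration:
sample = [14454172572971265821, 6745279665170062986, 17940379197342480268]
low, high = split_tokens(sample)
print(len(low), len(high))  # 1 fits in the signed range, 2 do not
```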

Here is where it gets interesting...

When we run the same query on MariaDB 10.4 but only include the 24 values below the signed BIGINT max size, the results return in 0.002 seconds, with the following EXPLAIN EXTENDED output:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: token_data
         type: range
possible_keys: id_token_data_01
          key: id_token_data_01
      key_len: 12
          ref: NULL
         rows: 24
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.002 sec)

Running the same query again, but this time against the 25 values that are above the signed BIGINT max size, the results return in 0.001 seconds, with the following EXPLAIN EXTENDED output:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: token_data
         type: range
possible_keys: id_token_data_01
          key: id_token_data_01
      key_len: 12
          ref: NULL
         rows: 25
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.002 sec)

Finally, running the same query against 24 values that straddle the signed BIGINT maximum, the query takes 3.305 seconds and EXPLAIN EXTENDED reports:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: token_data
         type: ref
possible_keys: id_token_data_01
          key: id_token_data_01
      key_len: 4
          ref: const
         rows: 9950548
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.002 sec)

We have tried similar tests against table structures that use a simple primary key on 'token' instead of the composite key above, and the performance characteristics are the same: fast when the values do not straddle the signed BIGINT maximum, very slow when they do.

The only workaround we have found is to use OR rather than IN, but that is not a solution that works for us in all use cases.
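Given that the non-mixed lists above use the fast range plan, another possible workaround (a sketch, untested against the server) would be to split the IN-list at the signed BIGINT maximum and combine two single-sided queries with UNION ALL, so neither IN-list mixes values from both sides. A hypothetical Python helper that builds such a statement:

```python
SIGNED_BIGINT_MAX = 2**63 - 1

def build_split_query(uid, tokens):
    """Build a UNION ALL of two queries whose IN-lists each stay on one side
    of the signed BIGINT maximum (hypothetical workaround, not from the report)."""
    low = [t for t in tokens if t <= SIGNED_BIGINT_MAX]
    high = [t for t in tokens if t > SIGNED_BIGINT_MAX]
    parts = []
    for subset in (low, high):
        if subset:
            in_list = ",".join(str(t) for t in subset)
            parts.append(
                "SELECT token,value1,value2 FROM token_data "
                f"WHERE uid={uid} AND token IN ({in_list})"
            )
    return " UNION ALL ".join(parts)

print(build_split_query(12001, [277070117901065582, 14454172572971265821]))
```

Whether the optimizer chooses the range plan for each branch would still need to be verified with EXPLAIN.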



 Comments   
Comment by Josh Bello [ 2021-07-26 ]

For more complete info as we dig into this, please see the following JSON explains:

First, a query for integers that span the range of signed and unsigned BIGINT values:

EXPLAIN FORMAT=JSON SELECT token,value1,value2 FROM token_data WHERE uid=12001 AND token IN (6139029933980675674,  6710543574905126912,  6715486898116100096,  7801999317187194854,  8056624067400366231,  8277955528833064934,  8583000031085752646,  8583000031664820550,  8654011433585790674,  8654011433740205778,  8701519266898721160,  8947867702334725879,    9743340660943945179,  9774973577495866952,  9774973577950588488,  9859952648301078679,  9920933646715251360, 10337750861957591287, 10410078769612255479, 10943173851576400023, 11469729274248341082, 12264907527857408362, 14052870096204695077, 14387761156831755502) \G;
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "token_data",
      "access_type": "ref",
      "possible_keys": ["id_token_data_01"],
      "key": "id_token_data_01",
      "key_length": "4",
      "used_key_parts": ["uid"],
      "ref": ["const"],
      "rows": 9950548,
      "filtered": 100,
      "attached_condition": "token_data.token in (6139029933980675674,6710543574905126912,6715486898116100096,7801999317187194854,8056624067400366231,8277955528833064934,8583000031085752646,8583000031664820550,8654011433585790674,8654011433740205778,8701519266898721160,8947867702334725879,9743340660943945179,9774973577495866952,9774973577950588488,9859952648301078679,9920933646715251360,10337750861957591287,10410078769612255479,10943173851576400023,11469729274248341082,12264907527857408362,14052870096204695077,14387761156831755502)"
    }
  }
}

Now a similar quantity of integers, but ones that do not cross the signed BIGINT boundary:

EXPLAIN FORMAT=JSON SELECT token,value1,value2 FROM token_data WHERE uid=12001 AND token IN (   277070117901065582,  1731820755350078667,  3569614194670960609,  3754261088098005618,  3961032953135919073,  5372330405299365231,  5499253815180141935,  5618407151415731567,  5620458545859667311,  5975793561981298031,  5976548953849996655,  6007773530648094063,   6139029933980675674,  6710543574905126912,  6715486898116100096,  7801999317187194854,  8056624067400366231,  8277955528833064934,  8583000031085752646,  8583000031664820550,  8654011433585790674,  8654011433740205778,  8701519266898721160,  8947867702334725879) \G;
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "token_data",
      "access_type": "range",
      "possible_keys": ["id_token_data_01"],
      "key": "id_token_data_01",
      "key_length": "12",
      "used_key_parts": ["uid", "token"],
      "rows": 24,
      "filtered": 100,
      "attached_condition": "token_data.uid = 12001 and token_data.token in (277070117901065582,1731820755350078667,3569614194670960609,3754261088098005618,3961032953135919073,5372330405299365231,5499253815180141935,5618407151415731567,5620458545859667311,5975793561981298031,5976548953849996655,6007773530648094063,6139029933980675674,6710543574905126912,6715486898116100096,7801999317187194854,8056624067400366231,8277955528833064934,8583000031085752646,8583000031664820550,8654011433585790674,8654011433740205778,8701519266898721160,8947867702334725879)"
    }
  }
}

Comment by Sergei Golubchik [ 2021-08-03 ]

This is the effect of bugs like MDEV-11554. When you have both signed (negative) and unsigned (big) integers in IN, the comparison is done using a type that can fit them all, that is, they're all converted to a DECIMAL.

Small positive integers are considered signed, so when you mix small and big positive integers you get DECIMAL too.

I think we can implement a fix where small positive integers can be treated as "neither signed nor unsigned" and do not contribute to the decision "oh, we have a signed/unsigned mix, need to use decimal". Like this:

--- a/sql/item_cmpfunc.cc
+++ b/sql/item_cmpfunc.cc
@@ -137,7 +137,7 @@ Type_handler_hybrid_field_type::aggregate_for_comparison(const char *funcname,
   */
   if (int_uint_as_dec &&
       cmp_type() == INT_RESULT &&
-      unsigned_count != nitems && unsigned_count != 0)
+      unsigned_count + weak_count != nitems && unsigned_count != 0)
     set_handler(&type_handler_newdecimal);
   return 0;
 }
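The effect of that one-line change can be modeled outside the server. A toy Python model of the aggregation rule (not the actual server code; "weak" here stands for the proposed category of small positive integers that fit both signed and unsigned BIGINT):

```python
def comparison_type(kinds):
    """Toy model of aggregate_for_comparison: return 'INT' or 'DECIMAL'
    for a list of item kinds ('signed', 'unsigned', or 'weak')."""
    nitems = len(kinds)
    unsigned_count = sum(k == "unsigned" for k in kinds)
    weak_count = sum(k == "weak" for k in kinds)
    # Current rule: any signed/unsigned mix forces DECIMAL.
    # Proposed rule: 'weak' items do not contribute to the mix decision.
    if unsigned_count + weak_count != nitems and unsigned_count != 0:
        return "DECIMAL"
    return "INT"

# Small positives mixed with big unsigned values stay INT under the fix:
print(comparison_type(["weak", "weak", "unsigned"]))  # INT
# A genuinely negative value mixed with unsigned still needs DECIMAL:
print(comparison_type(["signed", "unsigned"]))        # DECIMAL
```

Under this model, the reporter's mixed IN-list (small positives plus values above 2**63 - 1) would compare as INT and keep the range plan.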
