[MDEV-6738] use_stat_table + histograms crashing optimizer Created: 2014-09-12  Updated: 2014-10-10  Resolved: 2014-10-10

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0.13
Fix Version/s: 10.0.15

Type: Bug Priority: Major
Reporter: VAROQUI Stephane Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: eits, optimizer

Issue Links:
Relates
relates to MDEV-6519 Assertion `join->best_read < double(.... Closed
relates to MDEV-6442 Assertion `join->best_read < double(.... Closed

 Description   

use_stat_tables='preferably'
optimizer_use_condition_selectivity=4
histogram_size=255

Thread pointer: 0x0x7f187cf92008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f1cb5654d10 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb7054b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x727588]
/lib64/libpthread.so.0(+0xf710)[0x7f1cb531a710]
/usr/sbin/mysqld(_ZN9Field_bit5storeEPKcjPK15charset_info_st+0x10f)[0x713caf]
/usr/sbin/mysqld(_Z28get_column_range_cardinalityP5FieldP12st_key_rangeS2_j+0x135)[0x63f605]
/usr/sbin/mysqld(_Z36calculate_cond_selectivity_for_tableP3THDP5TABLEP4Item+0x836)[0x80bd76]
/usr/sbin/mysqld[0x528cce]
/usr/sbin/mysqld(_ZN4JOIN14optimize_innerEv+0x6de)[0x61c8de]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xd8)[0x61efb8]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x28d)[0x62284d]
/usr/sbin/mysqld[0x5ca456]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4c4f)[0x5d51af]
/usr/sbin/mysqld[0x5d6cc2]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1b20)[0x5d8e80]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x453)[0x696993]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x696a62]
/lib64/libpthread.so.0(+0x79d1)[0x7f1cb53129d1]
/lib64/libc.so.6(clone+0x6d)[0x7f1cb3a2fb5d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f1859c20020): SELECT id, reportName, scheduleName, reportId, reportScheduleId, reportType, status, reportFormat, generationStartTime, executionTime, fileUrl, fileSize, retainUntil, users, userGroups, recurrenceType, recurrenceIntervalType, recurrenceIntervalValue, recurrenceParam1, recurrenceParam2, recurrenceParam3, reportSource, userId, isGloballyViewable, retainLast FROM report_schedule_logs WHERE ((`retainUntil` < 20140912114319) AND (`retainLast` != true))
Connection ID (thread ID): 4
Status: NOT_KILLED

 CREATE TABLE `report_schedule_logs` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `reportName` varchar(128) DEFAULT NULL,
  `scheduleName` varchar(128) DEFAULT NULL,
  `reportId` varchar(255) DEFAULT NULL,
  `reportScheduleId` varchar(255) DEFAULT NULL,
  `reportType` int(10) unsigned DEFAULT NULL,
  `reportFormat` tinyint(3) unsigned DEFAULT NULL,
  `generationStartTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `executionTime` varchar(128) DEFAULT NULL,
  `reportSource` tinyint(3) unsigned DEFAULT NULL,
  `status` tinyint(3) unsigned DEFAULT NULL,
  `fileUrl` varchar(1024) DEFAULT NULL,
  `fileSize` bigint(20) DEFAULT NULL,
  `retainUntil` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `retainLast` bit(1) DEFAULT b'0',
  `users` longblob,
  `userGroups` longblob,
  `recurrenceType` tinyint(3) unsigned DEFAULT NULL,
  `recurrenceIntervalType` tinyint(3) unsigned DEFAULT NULL,
  `recurrenceIntervalValue` int(11) DEFAULT NULL,
  `recurrenceParam1` varchar(128) DEFAULT NULL,
  `recurrenceParam2` varchar(128) DEFAULT NULL,
  `recurrenceParam3` varchar(128) DEFAULT NULL,
  `userId` varchar(128) DEFAULT NULL,
  `isGloballyViewable` bit(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `report_schedule_logs_idx1` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

explain  SELECT id, reportName, scheduleName, reportId, reportScheduleId, reportType, status, reportFormat, generationStartTime, executionTime, fileUrl, fileSize, retainUntil, users, userGroups, recurrenceType, recurrenceIntervalType, recurrenceIntervalValue, recurrenceParam1, recurrenceParam2, recurrenceParam3, reportSource, userId, isGloballyViewable, retainLast FROM report_schedule_logs WHERE ((`retainUntil` < 20140912114319) AND (`retainLast` != true));

changing the configuration to default instanlty solve the issue .

 show variables like 'optimizer_use_condition_selectivity';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| optimizer_use_condition_selectivity | 4     |
+-------------------------------------+-------+
1 row in set (0.00 sec)
 
MariaDB [sa]> explain  SELECT id, reportName, scheduleName, reportId, reportScheduleId, reportType, status, reportFormat, generationStartTime, executionTime, fileUrl, fileSize, retainUntil, users, userGroups, recurrenceType, recurrenceIntervalType, recurrenceIntervalValue, recurrenceParam1, recurrenceParam2, recurrenceParam3, reportSource, userId, isGloballyViewable, retainLast FROM report_schedule_logs WHERE ((`retainUntil` < 20140912114319) AND (`retainLast` != true));
ERROR 2013 (HY000): Lost connection to MySQL server during query

MariaDB [sa]> show variables like 'optimizer_use_condition_selectivity';ERROR 2006 (HY000): MySQL server has gone awayNo connection. Trying to reconnect...
Connection id:    6
Current database: sa
 
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| optimizer_use_condition_selectivity | 1     |
+-------------------------------------+-------+
1 row in set (0.46 sec)
 
MariaDB [sa]> explain  SELECT id, reportName, scheduleName, reportId, reportScheduleId, reportType, status, reportFormat, generationStartTime, executionTime, fileUrl, fileSize, retainUntil, users, userGroups, recurrenceType, recurrenceIntervalType, recurrenceIntervalValue, recurrenceParam1, recurrenceParam2, recurrenceParam3, reportSource, userId, isGloballyViewable, retainLast FROM report_schedule_logs WHERE ((`retainUntil` < 20140912114319) AND (`retainLast` != true));
+------+-------------+----------------------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table                | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+----------------------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | report_schedule_logs | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where |
+------+-------------+----------------------+------+---------------+------+---------+------+------+-------------+

The context is that our client under support benefit from histograms on some query plan. We activated it but after a process of reseting the data . it started to crash on this query run every 5 minutes. not that after the reset we did no run analyze table but the table was readable and and was empty.

the content of the stats tables because are empty ! due to restart



 Comments   
Comment by Elena Stepanova [ 2014-09-15 ]

You are missing ANALYZE TABLE after table creation, so the client wouldn't benefit from histograms even without a crash.

If you add ANALYZE TABLE, the crash is gone.

We have a bug report about a failure without statistics in the stat tables (MDEV-6442), but there we are getting an assertion failure, and here we have a crash, so I'm assigning it to psergey to either make sure it's the same bug, or to fix it separately.

Comment by Sergei Petrunia [ 2014-10-08 ]

Still repeatable after fix for MDEV-6442.

Comment by Sergei Petrunia [ 2014-10-08 ]

We crash in Field_bit::store on this line

    set_rec_bits((1 << bit_len) - 1, bit_ptr, bit_ofs, bit_len);

because of this

(gdb) print bit_ptr
  $34 = (uchar *) 0x36000 <Address 0x36000 out of bounds>

Field_bit refers to the "retainLast" field.

Comment by Sergei Petrunia [ 2014-10-08 ]

The first guess is that create_min_max_fields_for_table[share] create invalid field objects is incorrect.

Running

  select retainLast FROM report_schedule_logs

shows that "regular" field objects also have bit_ptr pointing to nowhere:

  Breakpoint 7, sub_select (join=0x7fffcac20978, join_tab=0x7fffcac218b0, end_of_records=false) at /home/psergey/dev2/10.0/sql/sql_select.cc:17587
(gdb) p join_tab->table->field[14]
  $125 = (Field_bit_as_char *) 0x7fffcac3e8e8
(gdb) p join_tab->table->field[14]->bit_ptr
  $126 = (uchar *) 0x36000 <Address 0x36000 out of bounds>

Comment by Sergei Petrunia [ 2014-10-09 ]

A smaller testcase:

set use_stat_tables='preferably';
set optimizer_use_condition_selectivity=4;
set histogram_size=255;
 
create table t1(col1 int, col2 bit(1) DEFAULT NULL) engine=innodb;
 
select * from t1 where col2 != true;

Comment by Sergei Petrunia [ 2014-10-10 ]

  Program received signal SIGSEGV, Segmentation fault.
  0x00000000008473c0 in Field_bit::store (this=0x7ffe9c010130, from=0x7ffff7f80ba7 "\001@\307\301", length=2, cs=0x17d7900) at /home/psergey/dev2/10.0/sql/field.cc:8526
(gdb) wher
  #0  0x00000000008473c0 in Field_bit::store (this=0x7ffe9c010130, from=0x7ffff7f80ba7 "\001@\307\301", length=2, cs=0x17d7900) at /home/psergey/dev2/10.0/sql/field.cc:8526
  #1  0x0000000000851133 in Field_bit::set_key_image (this=0x7ffe9c010130, buff=0x7ffff7f80ba7 "\001@\307\301", length=2) at /home/psergey/dev2/10.0/sql/field.h:2753
  #2  0x0000000000973501 in store_key_image_to_rec (field=0x7ffe9c010130, ptr=0x7ffff7f80ba7 "\001@\307\301", len=2) at /home/psergey/dev2/10.0/sql/opt_range.cc:3765
  #3  0x00000000006fffee in get_column_range_cardinality (field=0x7ffe9c010130, min_endp=0x7ffff7f7f920, max_endp=0x7ffff7f7f940, range_flag=12) at /home/psergey/dev2/10.0/sql/sql_statistics.cc:3567
  #4  0x0000000000972651 in records_in_column_ranges (param=0x7ffff7f7fb50, idx=0, tree=0x7ffe9c065290) at /home/psergey/dev2/10.0/sql/opt_range.cc:3421
  #5  0x0000000000972e35 in calculate_cond_selectivity_for_table (thd=0xe6857a0, table=0x7ffe9c00d5d0, cond=0x7ffe9c006f98) at /home/psergey/dev2/10.0/sql/opt_range.cc:3628
  #6  0x00000000006a33e4 in make_join_statistics (join=0x7ffe9c0071d8, tables_list=..., conds=0x7ffe9c006f98, keyuse_array=0x7ffe9c0074e0) at /home/psergey/dev2/10.0/sql/sql_select.cc:3949

Field_bit::store() crashes only when the parameter is length=2. If it is
length==1, then there is no crash. I still think there should be no crash
when length==2, either, however when I tried to get other code to call this
function with length=2, I did not succeed.

length=2 because the first byte is NULL-flag, and the next byte holds the bit
value.

store_key_image_to_rec looks like this:

void store_key_image_to_rec(Field *field, uchar *ptr, uint len)
{
  /* Do the same as print_key() does */ 
  my_bitmap_map *old_map;
 
  if (field->real_maybe_null())
  {
    if (*ptr)
    {
      field->set_null();
      return;
    }
    field->set_notnull();
    ptr++;
  }    
  old_map= dbug_tmp_use_all_columns(field->table,
                                    field->table->write_set);
  field->set_key_image(ptr, len); 

Note: we skip the NULL-byte in the ptr, but we don't decrease 'len'.
Searching for how other code calls store_key_image_to_rec, one can see:

    store_key_image_to_rec(sel_arg->field, sel_arg->min_value,
                           parts[sel_arg->part].length);

    store_key_image_to_rec(part->field, arg->min_value, part->length);

and KEY_PART_INFO::length is defined as:

  /* Length of key part in bytes, excluding NULL flag and length bytes */
  uint16 length;

Comment by Sergei Petrunia [ 2014-10-10 ]

So, a possible fix is to make EITS code substract HA_KEY_NULL_LENGTH before calling store_key_image_to_rec.

Generated at Thu Feb 08 07:14:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.