[MDEV-26710] Histogram field in mysql.column_stats is too short, JSON histograms get corrupt
Created: 2021-09-28  Updated: 2022-06-20  Resolved: 2021-10-01

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: N/A
Fix Version/s: 10.8.1

Type: Bug
Priority: Critical
Reporter: Elena Stepanova
Assignee: Sergei Petrunia
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Problem/Incident
is caused by MDEV-21130 Histograms: use JSON as on-disk format Closed
is caused by MDEV-26519 JSON Histograms: improve histogram co... Closed
Relates
relates to MDEV-28866 mariadb-upgrade to 10.8 mysql.column_... Closed

 Description   

The mysql.column_stats.histogram field is a standard BLOB. A few long values easily overfill it; the JSON then gets truncated and becomes invalid.

create or replace table t (a varchar(8192));
insert into t values
  (repeat('A',8192)),
  (repeat('B',8192)),
  (repeat('C',8192)),
  (repeat('D',8192)),
  (repeat('E',8192)),
  (repeat('F',8192)),
  (repeat('G',8192)),
  (repeat('H',8192)),
  ('I');
 
set histogram_type= JSON_HB;
analyze table t persistent for all;
select * from t where a = 'foo';
 
# Cleanup
drop table t;
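
As a rough size check on this test case, the raw value bytes alone (8 * 8192 = 65536) already exceed the 65535-byte maximum of a BLOB before any JSON syntax is counted. This is only a sketch: it assumes each of the eight long values ends up stored in full in the histogram, which the report does not spell out.

```sql
-- Eight values of 8192 single-byte characters each, versus the BLOB maximum.
-- Assumption: every long value is stored untruncated in the histogram.
SELECT 8 * 8192         AS raw_value_bytes,
       65535            AS blob_max_bytes,
       8 * 8192 > 65535 AS exceeds_limit;
```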

preview-10.7-MDEV-26519-json-histograms da8bb4b4

MariaDB [test]> select * from t where a = 'foo';
ERROR 4183 (HY000): Failed to parse histogram: Root JSON element must be a JSON object at offset 0.
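
To check that the stored histogram itself is what fails to parse, one could inspect the statistics row before running the cleanup step. This is a sketch under assumptions: the table is taken to live in the test schema, and the BLOB is converted to utf8mb4 purely so that JSON_VALID receives a character string.

```sql
-- Inspect the persisted statistics for column t.a (run before "drop table t").
-- Assumption: the schema name is 'test', as suggested by the prompt above.
SELECT column_name,
       hist_type,
       LENGTH(histogram) AS stored_bytes,
       JSON_VALID(CONVERT(histogram USING utf8mb4)) AS is_valid_json
FROM mysql.column_stats
WHERE db_name = 'test'
  AND table_name = 't'
  AND column_name = 'a';
```

A stored_bytes value at or near the BLOB maximum together with is_valid_json = 0 would be consistent with the truncation described above.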



 Comments   
Comment by Sergei Petrunia [ 2021-10-01 ]

This is because mysql.column_stats.histogram is defined as

  `histogram` blob DEFAULT NULL,

which has a maximum length of 64K (65535 bytes).
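
The declared type and its limit can be read back from information_schema. This is a sketch rather than part of the original analysis; for a plain BLOB column the reported maximum length is expected to be 65535.

```sql
-- Look up the declared type and maximum byte length of the histogram column.
SELECT column_type,
       character_maximum_length
FROM information_schema.columns
WHERE table_schema = 'mysql'
  AND table_name   = 'column_stats'
  AND column_name  = 'histogram';
```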

If the maximum number of buckets is 255, this gives 257 bytes to represent one bucket.

In utf8mb4, that is room for 64 four-byte characters.
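
The per-bucket budget above can be reproduced with integer arithmetic, taking 65535 bytes as the BLOB maximum and 255 as the bucket count used in the comment:

```sql
-- Bytes available per bucket, and how many 4-byte utf8mb4 characters fit.
SELECT 65535 DIV 255         AS bytes_per_bucket,          -- 257
       (65535 DIV 255) DIV 4 AS utf8mb4_chars_per_bucket;  -- 64
```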

One could argue that JSON syntax parts like field names, quotes, etc. are not 4-byte characters.
Also, we wanted to truncate the values that are too long.

Still, the size limit is close enough to be hit in practice. I don't see any argument against raising it.
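
For illustration only, raising the limit could amount to widening the column type. The statement below is a sketch and not necessarily the change made for this issue: LONGBLOB is an assumed choice, and system tables are normally altered by the server's upgrade scripts rather than by hand.

```sql
-- Hypothetical: widen the histogram column so long JSON histograms fit.
-- Try this only on a disposable test instance.
ALTER TABLE mysql.column_stats
  MODIFY histogram LONGBLOB DEFAULT NULL;
```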
