[MDEV-4369] column_stats.histogram contents don't make sense Created: 2013-04-04  Updated: 2013-04-07  Resolved: 2013-04-07

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: mwl#253

Attachments: PNG File mdev4369-plot.png    
Issue Links:
Relates
relates to MDEV-4145 Take into account the selectivity of ... Closed

 Description   

I used DBT-3 data, scale=10, with the InnoDB engine. I created the statistical
tables as specified in scripts/mysql_system_tables.sql.

Then I use the following to generate the histogram:

MariaDB [dbt3sf10]> set histogram_size=200;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [dbt3sf10]> set histogram_type='single_prec_hb';
Query OK, 0 rows affected (0.00 sec)
MariaDB [dbt3sf10]> analyze table customer persistent for all;

Then I want to take a look at the histogram:

MariaDB [dbt3sf10]> select *,hex(histogram),length(histogram) from mysql.column_stats where column_name='c_acctbal' and table_name='customer'\G
*************************** 1. row ***************************
          db_name: dbt3sf10
       table_name: customer
      column_name: c_acctbal
        min_value: -999.99
        max_value: 9999.99
      nulls_ratio: 0.0000
       avg_length: 8.0000
    avg_frequency: 1.8319
        hist_size: 200
        hist_type: 
        histogram: (raw binary data, not printable; see hex(histogram) below)
   hex(histogram): 020406090B0D10121417191B1E20222527292C2E303335373A3C3E414345484A4C4F515356585A5C5F616366686A6D6F717476787B7D7F828486898B8D90929497999B9EA0A2A5A7A9ACAEB0B2B5B7B9BCBEC0C3C5C7CACCCED1D3D5D8DADCDFE1E3E6E8EAEDEFF1F4F6F8FBFD00000001000000000000000300000000000000000000000B00000003500000000000000100000000000000000000000000F03F00000000000000000000000000000000F0770298C67F00000000000000000000010000006D6D656E

Look at the hex(histogram) data. Roughly the first half of it has increasing
values: 02,04,06,09,0B,0D, ..., F1,F4,F6,F8,FB,FD. Then the values drop back to
zero and mostly stay at zero, with a few spikes.

As far as I understand the meaning of the values in a histogram, they should
never decrease. That is, histogram[i+1] >= histogram[i] for every i. This is
not the case for this histogram.
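
The histogram[i+1] >= histogram[i] expectation can be checked mechanically
against the hex dump. The following is only a minimal sketch, assuming one
byte per bucket bound (which is how the dump above is read for
single_prec_hb); the helper name first_decrease is hypothetical and not part
of MariaDB. The sample string is a short excerpt of the hex(histogram) output
around the point where the values fall back to zero.

```python
def first_decrease(hex_str):
    """Return the index of the first byte that is smaller than its
    predecessor, or None if the byte sequence never decreases.

    Assumption: each byte of the histogram is one bucket bound.
    """
    data = bytes.fromhex(hex_str)
    for i in range(len(data) - 1):
        if data[i + 1] < data[i]:
            return i + 1
    return None


# Excerpt copied from the hex(histogram) output in this report.
excerpt = "F1F4F6F8FBFD00000001"
print(first_decrease(excerpt))
```

For this excerpt the check reports index 6, the first 00 byte after FD. A
histogram that satisfies the expectation would make the function return None.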



 Comments   
Comment by Sergei Petrunia [ 2013-04-04 ]

Graphic plot of values in the histogram

Comment by Igor Babaev [ 2013-04-07 ]

A fix for the bug has been pushed into maria-10.0-mwl253.
