[MDEV-21130] Histograms: use JSON as on-disk format
Created: 2019-11-23 | Updated: 2023-03-21 | Resolved: 2022-01-21

| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Fix Version/s: | 10.8.1 |
| Type: | Task | Priority: | Critical |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | eits, gsoc20, gsoc21 |
| Description |
Currently, histograms are stored as an array of 1-byte bucket bounds (SINGLE_PREC_HB) or 2-byte bucket bounds (DOUBLE_PREC_HB). The table storing the histograms supports different histogram formats, but limits them to 256 bytes (hist_size is a tinyint).
This prevents us from supporting other kinds of histograms. The first low-hanging fruit would be to store the histogram bucket bounds precisely (as MySQL and PostgreSQL do, for example). The idea of this MDEV is to switch to JSON as the storage format for histograms.
Milestone-1: Let histogram_type have another possible value, tentative name "JSON", so that collecting statistics with histogram_type=JSON works and produces a placeholder histogram, {"hello":"world"}.
Milestone-2: produce JSON with the histogram, of the form:
[
  "value1",
  "value2",
  ...
]
Milestone-2, part#2: make mysql.column_stats.histogram a blob.
Figure out how to use the JSON parser. Parse the JSON data produced in Milestone-2 back; for now, just print the parsed values to stderr.
(Additional input on parsing valid/invalid JSON histograms was provided on Zulip.)
Currently, there's only one type of histogram.
Smaller issue: histogram lookup functions assume the histogram stores fractions, not values.
Bigger issue: memory allocation for histograms is decoupled from reading the histograms. See alloc_statistics_for_table, read_histograms_for_table. The histogram object lives in a data structure that is bzero'ed first and filled later. (IIRC there was a bug, since fixed, where the optimizer attempted to use a bzero'ed histogram.)
Can histograms be collected or loaded in parallel by several threads? This was an (unintentional?) possibility, but it was disabled (see the TABLE_STATISTICS_CB object and its use).
Here's the commit:
https://github.com/MariaDB/server/commit/3ac32917ab6c42a5a0f9ed817dd8d3c7e20ce34d
Need to introduce:

class Histogram        -- interface, no data members
class Histogram_binary : public Histogram
class Histogram_json   : public Histogram

and a factory function:

Histogram *create_histogram(Histogram_type)
for now, let Histogram_json::point_selectivity() and Histogram_json::range_selectivity() return 0.1 and 0.5, respectively.
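A sketch of how this split could look (a minimal illustration, not the server code — the enum values and method signatures are assumed; only the class and function names come from the ticket):

```cpp
#include <cassert>

// Assumed histogram-type enum; the real one lives in the server sources.
enum Histogram_type { SINGLE_PREC_HB, DOUBLE_PREC_HB, JSON_HB };

// Interface class: no data members, as the ticket requires.
class Histogram
{
public:
  virtual double point_selectivity(double pos) = 0;
  virtual double range_selectivity(double min_pos, double max_pos) = 0;
  virtual ~Histogram() = default;
};

class Histogram_binary : public Histogram
{
public:
  // Stand-in: the real implementation interpolates over the
  // 1- or 2-byte bucket bounds.
  double point_selectivity(double pos) override { return pos; }
  double range_selectivity(double min_pos, double max_pos) override
  { return max_pos - min_pos; }
};

class Histogram_json : public Histogram
{
public:
  // Placeholder estimates from the ticket: 0.1 for points, 0.5 for ranges.
  double point_selectivity(double) override { return 0.1; }
  double range_selectivity(double, double) override { return 0.5; }
};

// Factory: pick the concrete class from the stored histogram type.
Histogram *create_histogram(Histogram_type type)
{
  switch (type)
  {
  case JSON_HB: return new Histogram_json();
  default:      return new Histogram_binary();
  }
}
```

The factory keeps callers independent of the concrete histogram class, so adding further formats later only touches create_histogram().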
The code can now parse the histogram JSON, though the parse function at the moment only prints to stderr. However, we should catch parse errors and make sure they are reported to the client.
The test may look like this:

INSERT INTO mysql.column_stats VALUES('test','t1','column1', .... '[invalid, json, data']);
FLUSH TABLES;
# this should print some descriptive text
--error NNNN
select * from test.t1;
The structure is std::vector<std::string>, and it holds the data in KeyTupleFormat. (See the comments for the reasoning. There was a suggestion to use in_vector, which is what IN subqueries use, but it didn't work out.)
Make a function to estimate selectivity using the data structure specified in the previous milestone. (Currently, the selectivity functions accept fractions, which is only suitable for binary histograms.) This means Histogram_binary will need access to min_value and max_value to compute the fractions.
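A minimal sketch of such an estimation function, assuming the sorted std::vector<std::string> of bucket endpoints from the previous milestone (the function name and the equal-height assumption are mine, not from the patch):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Estimate the fraction of rows with value < key from a sorted array of
// bucket endpoints.  With N equal-height buckets, each endpoint accounts
// for roughly 1/N of the rows.  Plain std::string stands in for
// KeyTupleFormat byte strings, which compare bytewise just like
// std::string's operator<.
double estimate_lt_selectivity(const std::vector<std::string> &bounds,
                               const std::string &key)
{
  if (bounds.empty())
    return 0.0;
  // Number of bucket endpoints strictly below the key.
  size_t n_below= std::lower_bound(bounds.begin(), bounds.end(), key) -
                  bounds.begin();
  return double(n_below) / bounds.size();
}
```

Counting endpoints below the key with std::lower_bound gives the estimate for "col < key" in O(log N), which is the point of keeping the array ordered.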
| Comments |
| Comment by Dhruv Aggarwal [ 2020-03-25 ] |
Hi there! I would like to work on this issue and then submit my proposal.
| Comment by Sergei Golubchik [ 2020-03-26 ] |
Did you see https://mariadb.org/get-involved/getting-started-for-developers/get-code-build-test/ ?
| Comment by Sergei Petrunia [ 2021-05-18 ] |
OK, this task is selected for Google Summer of Code 2021. The student is Michael Okoko; the mentor is me.
| Comment by Sergei Petrunia [ 2021-06-08 ] |
Discussion:
| Comment by Sergei Petrunia [ 2021-06-21 ] |
Code for Milestone-1 is here: https://github.com/MariaDB/server/pull/1854 . It still fails the tests.
| Comment by Sergei Petrunia [ 2021-07-09 ] |
Code for Milestone-2: https://github.com/MariaDB/server/pull/1871 . It has a memory leak in class Histogram, but this will be fixed in the subsequent milestones.
| Comment by Sergei Petrunia [ 2021-07-09 ] |
On the question of how "col IN (const1, const2, ...)" stores its lookup array: it is in item_cmpfunc.h, class in_vector, which has type-specific descendants such as in_string and in_longlong. The storage format is specific to each datatype and is not explicitly defined. The interface for populating the array and making lookups uses Item objects.
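As a rough illustration of that interface shape (a hypothetical sketch — the real in_vector in item_cmpfunc.h has different signatures, and the Item stub here is invented):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for the server's Item class; only what the sketch needs.
struct Item
{
  std::string str_val;
  explicit Item(std::string s) : str_val(std::move(s)) {}
  const std::string &val_str() const { return str_val; }
};

// Hypothetical in_string-like class: values are both stored and looked up
// through Item objects, so callers never see the type-specific storage.
class in_string_sketch
{
  std::vector<std::string> base;   // type-specific storage, kept private
public:
  // Evaluate the Item and append its value.
  void set(Item *item) { base.push_back(item->val_str()); }
  // Sort once after population, enabling binary-search lookups.
  void sort() { std::sort(base.begin(), base.end()); }
  // Evaluate the Item and binary-search the sorted array for its value.
  bool find(Item *item) const
  {
    return std::binary_search(base.begin(), base.end(), item->val_str());
  }
};
```

Because both population and lookup go through Item pointers, the internal storage format stays an implementation detail, which is why it never had to be explicitly defined.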
| Comment by Sergei Petrunia [ 2021-07-09 ] |
One thing in_vector doesn't have: there is no way to get the values back out of the array.
| Comment by Sergei Petrunia [ 2021-07-23 ] |
Some notes from debugging.

=== Histogram lifecycle ===

=== Loading histograms from mysql.column_stats ===
alloc_statistics_for_table_share() allocates the stats data structures; read_statistics_for_table() loads the statistics. Note that this is done in two steps: first, the column stats are loaded; then read_histograms_for_table(), called after read_statistics_for_table(), reads the histograms. The synchronization is done through TABLE_STATISTICS_CB (only one thread does the loading). TABLE_STATISTICS_CB also ensures that histograms are only read once.

=== Collecting histograms ===
alloc_statistics_for_table() is used to allocate the statistical data structures; collect_statistics_for_table() is used to collect the statistics. (Note: it looks like it's possible for different threads to collect statistics concurrently.) Then, Column_stat::store_stat_fields() saves the data into the table, and the histogram is read back from disk before it is used.
| Comment by Sergei Petrunia [ 2021-07-23 ] |
Patch for Milestone-3: https://github.com/MariaDB/server/pull/1875
| Comment by Sergei Petrunia [ 2021-07-29 ] |
Patch for Milestone-4.1: https://github.com/idoqo/server/pull/2/
| Comment by Sergei Petrunia [ 2021-08-05 ] |
More on the in-memory data structure for lookups. It doesn't have to have anything to do with the collection-time data structure. It is basically an ordered array (either of bucket endpoints or of some other per-bucket values), used to make lookups so that we can find which bucket a search endpoint falls into. It is likely that $SEARCH_RANGE partially overlaps with a bucket, that is, $BUCKET_MIN < $SEARCH_ENDP < $BUCKET_MAX. In this case, we'll follow the existing histograms and assume that $SEARCH_RANGE covers the part of the bucket proportional to the position of $SEARCH_ENDP within it. Currently, the fraction is calculated with pos_in_interval(), where min_value and max_value hold the interval bounds. There are two implementations; pos_in_interval_val_str() gets the min and max values as strings and then operates on their 8-byte prefixes as integers.
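The prefix trick can be sketched as follows (a simplified illustration, not the server's Field code; the function names are made up):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Interpret the first 8 bytes of a string as a big-endian integer, padding
// short strings with zero bytes.  Big-endian packing preserves memcmp order
// for the first 8 bytes, which is what makes integer arithmetic on the
// prefixes meaningful.
static uint64_t prefix_as_uint(const std::string &s)
{
  unsigned char buf[8]= {0, 0, 0, 0, 0, 0, 0, 0};
  memcpy(buf, s.data(), s.size() < 8 ? s.size() : 8);
  uint64_t v= 0;
  for (int i= 0; i < 8; i++)
    v= (v << 8) | buf[i];
  return v;
}

// Relative position of val inside [min_val, max_val], clipped to [0, 1].
double pos_in_interval_sketch(const std::string &min_val,
                              const std::string &max_val,
                              const std::string &val)
{
  uint64_t lo= prefix_as_uint(min_val);
  uint64_t hi= prefix_as_uint(max_val);
  uint64_t v=  prefix_as_uint(val);
  if (hi <= lo)              // degenerate or inverted interval
    return 0.5;
  if (v <= lo) return 0.0;
  if (v >= hi) return 1.0;
  return double(v - lo) / double(hi - lo);
}
```

The integer interpolation approximates the value's relative position in the interval; values that differ only beyond the 8th byte collapse to the same position, which is the accepted imprecision of the prefix approach.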
| Comment by Sergei Petrunia [ 2021-08-06 ] |
Arguments against the in_vector data structure: 1. Its methods to populate the array and make lookups take "Item *item" parameters; this would need to change to also allow Field*. Arguments for using it: we would be able to reuse existing code. Not much, on balance.
| Comment by Sergei Petrunia [ 2021-08-10 ] |
Leaning towards using the std::vector<KeyTupleFormat> data structure. Here is a patch that shows how to do all the needed operations: https://github.com/MariaDB/server/commit/4d3c434028649a0f14471ed4a1a4c10b97b78eb8
Patch usage: https://gist.github.com/spetrunia/a09bd451f650737ba5ea7a6e51bc797a
| Comment by Michael Okoko [ 2021-08-23 ] |
The progress so far is documented as part of the GSoC final report at https://gist.github.com/idoqo/ac520943e53f64034beaed4258b62ba5
| Comment by Sergei Petrunia [ 2021-08-31 ] |
The candidate code is at: https://github.com/MariaDB/server/tree/bb-10.7-mdev21130
| Comment by Sergei Petrunia [ 2021-08-31 ] |
elenst, documentation for testing: the histogram_type variable specifies what kind of histogram is collected. After this patch, it can be set to a new value, JSON_HB; the EITS code will then create JSON histograms. Old histogram types remain supported, and the optimizer will use whatever histogram is available. The histograms are kept in mysql.column_stats, in the histogram-related columns.
The function DECODE_HISTOGRAM also "supports" JSON_HB by returning the histogram unmodified, so one can get a readable representation of a histogram of any kind.
Things to test: there is only one column, col1, which is not indexed. The interesting parts of the output are filtered (the prediction from the histogram) and r_filtered (the actual value). There will be differences in estimates compared with binary histograms. The estimates should generally be better; rounding-error-level regressions are acceptable, but an estimate that is significantly worse than the one from the binary histogram warrants an investigation.
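A possible flow for exercising this (an assumed invocation — the table name is made up; DECODE_HISTOGRAM takes the histogram type and the histogram data):

```sql
-- Collect a JSON histogram, then decode it for inspection.
SET histogram_type='JSON_HB';
ANALYZE TABLE test.t1 PERSISTENT FOR ALL;
SELECT hist_type, DECODE_HISTOGRAM(hist_type, histogram)
FROM mysql.column_stats
WHERE table_name='t1';
```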
| Comment by Sergei Petrunia [ 2021-09-01 ] |
The issue in MySQL that I was mentioning: https://bugs.mysql.com/bug.php?id=104789
| Comment by Sergei Petrunia [ 2021-09-05 ] |
Pushed more cleanup patches to https://github.com/MariaDB/server/tree/bb-10.7-mdev21130 . Buildbot is clean.
| Comment by Ian Gilfillan [ 2021-09-22 ] |
The server's descriptions of system variables need updating as well. For example, histogram_type still states "Specifies type of the histograms created by ANALYZE. Possible values are: SINGLE_PREC_HB - single precision height-balanced, DOUBLE_PREC_HB - double precision height-balanced." even though JSON_HB is valid as well.
| Comment by Sergei Petrunia [ 2022-01-21 ] |
Pushed into 10.8. JSON_HB histograms are not enabled by default.