[MDEV-33305] Testing for MDEV-21829: Packed keys in Unique Created: 2024-01-24  Updated: 2024-02-07

Status: In Progress
Project: MariaDB Server
Component/s: Tests
Fix Version/s: 10.6

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Lena Startseva
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File tempfile-usage3.bt    
Issue Links:
Relates
relates to MDEV-21829 Use packed sort keys in Unique objects In Review

 Description   

This is about doing testing for MDEV-21829: Use packed sort keys in Unique objects

Queries affected by MDEV-21829

Packed Unique has these uses:

USE1: Histogram collection in EITS
USE2: agg_func(DISTINCT col,,, )
USE3: Index_merge stores rowids.

We need to test the scenario where the data doesn't fit into the in-memory Unique and is spilled on disk.

The focus is on the scenario where Unique stores variable-sized data, although it's nice to have a test for fixed-size data, too.

There is no way to turn off the new behavior. You will need to compare with the old server.

There is a bpftrace script to watch tempfile usage at MDEV-32472.

Variable-sized data

This is VARCHAR(n) columns.
(TODO: what happens with CHAR(n)?) cvicentiu ?

Controlling amount of memory that Unique uses

How much memory Unique will use for COUNT(DISTINCT): The logic is here:

  size_t Item_sum::ram_limitation(THD *thd)
  {     
    return MY_MAX(1024,
             (size_t)MY_MIN(thd->variables.tmp_memory_table_size,
                            thd->variables.max_heap_table_size));

How much memory Unique will use in EITS ANALYZE:

  void Column_statistics_collected::init(THD *thd, Field *table_field)
  {
    size_t max_heap_table_size= (size_t)thd->variables.max_heap_table_size;

Plan for test#1:

Create a table with VARCHAR column.

  • Try different charsets (latin, utf8mb3, utf8mb4 if you have the data, ucs2)
  • Try a few different collations (There's no need to try all collations)

Populate the table with enough random data of different lengths and chars.

Set @@max_heap_table_size low enough to force Unique to flush to disk.

Run

ANALYZE TABLE t PERSISTENT FOR COLUMNS (col) INDEXES ();

Compare the output of

select * from mysql.column_stats where db_name=datababase() and table_name='t'

with the server before the patch.

Also compare bpftrace tempfile usage output for both servers.

Testing for agg_func(DISTINCT ...)

TODO



 Comments   
Comment by Sergei Petrunia [ 2024-02-04 ]

lstartseva, followup to previous talk:

Attached script indeed counts writes to tempfile, not tempfile size. If tempfile is rewritten, its size is actually smaller.

Please find attached tempfile-usage3.bt which catches lseek calls, and also tracks read to see how large the file was. Example output.

XX Lseek on fd=57
Lseek on fd=57
^C
 
@lseek_in_progress[15244]: 57
 
 
@tempfile_max_read[58]: 5954293
@tempfile_max_read[57]: 5968974
 
@tempfile_size[58]: 5954293
@tempfile_size[57]: 11920744

Note that tempfile_max_read[57] < tempfile_size[57]: 11920744, and tempfile_max_read is how large the file actually was.

Comment by Sergei Petrunia [ 2024-02-04 ]

I've also discovered this piece of code in:

bool Unique::walk(TABLE *table, tree_walk_action action, void *walk_action_arg)

  if (buff_sz < full_size * (file_ptrs.elements + 1UL))
    res= merge(table, merge_buffer, buff_sz,
               buff_sz >= full_size * MERGEBUFF2) ;
 
  if (!res)
  {
    res= merge_walk(merge_buffer, buff_sz, full_size,
                    (Merge_chunk *) file_ptrs.buffer,

file_ptrs.elements is "how many unique objects were dumped".
buff_sz is the amount of buffer space we have
full_size is one element size.

This means if the buffer is so so small that we can't have one element from each file in the buffer, then... we use merge_walk() procedure that merges all the buffers at once.

This involves doing a lot of small read calls:

@tempfile_read_size: 
[8, 16)             2297 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16, 32)             495 |@@@@@@@@@@@                                         |
[32, 64)             458 |@@@@@@@@@@                                          |
[64, 128)            465 |@@@@@@@@@@                                          |
[128, 256)           463 |@@@@@@@@@@                                          |
[256, 512)           491 |@@@@@@@@@@@                                         |
[512, 1K)            517 |@@@@@@@@@@@                                         |
[1K, 2K)             511 |@@@@@@@@@@@                                         |
[2K, 4K)             329 |@@@@@@@                                             |
[4K, 8K)            1198 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                         |

this is an edge case. It would be nice if testing involved this edge case but mostly "normal" case.
One can recognize which case occurred by checking whether lseek call was made (see the trace script).

Comment by Lena Startseva [ 2024-02-06 ]

There are usually two temporary files generated. To understand when it happens see the stacktraces below:
Stacktrace of the creation of the first temporary file:

#0  create_temp_file (to=0x7f99afefc570 "\001", dir=0x556bfef55838 ".../MariaDB/10.6/bld/mysql-test/var/tmp/mysqld.1", prefix=0x7f998401bf00 "MY", mode=0, MyFlags=80)
    at .../MariaDB/10.6/mysys/mf_tempfile.c:64
#1  0x0000556bfb918003 in real_open_cached_file (cache=0x7f998401bdc8) at .../MariaDB/10.6/mysys/mf_cache.c:61
#2  0x0000556bfb91d387 in my_b_flush_io_cache (info=0x7f998401bdc8, need_append_buffer_lock=0) at .../MariaDB/10.6/mysys/mf_iocache.c:1706
#3  0x0000556bfb91aba7 in _my_b_write (info=0x7f998401bdc8, Buffer=0x7f9984ae6774 "p%dJ&zX(^#2p`m&n|F&)=66KXZ:Sre\"uLZaYyv\217\217\217\217\217\217\370\253\235\204\231\177", Count=38)
    at .../MariaDB/10.6/mysys/mf_iocache.c:559
#4  0x0000556bfad83099 in my_b_write (info=0x7f998401bdc8, Buffer=0x7f9984ae6700 "\226", Count=154) at .../MariaDB/10.6/include/my_sys.h:522
#5  0x0000556bfad83154 in Unique::unique_write_to_file (key=0x7f9984ae6700 "\226", count=1, unique=0x7f998401bd90) at .../MariaDB/10.6/sql/uniques.cc:45
#6  0x0000556bfb94e8a9 in tree_walk_left_root_right (tree=0x7f998401bf38, element=0x7f9984ae66e8, action=0x556bfad830f8 <Unique::unique_write_to_file(unsigned char*, unsigned int, Unique*)>, 
    argument=0x7f998401bd90) at .../MariaDB/10.6/mysys/tree.c:590
#7  0x0000556bfb94e85a in tree_walk_left_root_right (tree=0x7f998401bf38, element=0x7f9984992a20, action=0x556bfad830f8 <Unique::unique_write_to_file(unsigned char*, unsigned int, Unique*)>, 
    argument=0x7f998401bd90) at .../MariaDB/10.6/mysys/tree.c:588
#8  0x0000556bfb94e8ce in tree_walk_left_root_right (tree=0x7f998401bf38, element=0x7f9984ad4928, action=0x556bfad830f8 <Unique::unique_write_to_file(unsigned char*, unsigned int, Unique*)>, 
    argument=0x7f998401bd90) at .../MariaDB/10.6/mysys/tree.c:593
#9  0x0000556bfb94e8ce in tree_walk_left_root_right (tree=0x7f998401bf38, element=0x7f9984a7a838, action=0x556bfad830f8 <Unique::unique_write_to_file(unsigned char*, unsigned int, Unique*)>, 
    argument=0x7f998401bd90) at .../MariaDB/10.6/mysys/tree.c:593
#10 0x0000556bfb94e85a in tree_walk_left_root_right (tree=0x7f998401bf38, element=0x7f99849d4730, action=0x556bfad830f8 <Unique::unique_write_to_file(unsigned char*, unsigned int, Unique*)>, 
    argument=0x7f998401bd90) at .../MariaDB/10.6/mysys/tree.c:588
#11 0x0000556bfb94e8ce in tree_walk_left_root_right (tree=0x7f998401bf38, element=0x7f9984a701f0, action=0x556bfad830f8 <Unique::unique_write_to_file(unsigned char*, unsigned int, Unique*)>, 
    argument=0x7f998401bd90) at .../MariaDB/10.6/mysys/tree.c:593

Stacktrace of the creation of the second temporary file:

#0  create_temp_file (to=0x7f99afefc090 "m\034\357\373kU", dir=0x556bfef55838 ".../MariaDB/10.6/bld/mysql-test/var/tmp/mysqld.1", prefix=0x7f99afefc708 "MY", mode=0, MyFlags=80)
    at .../MariaDB/10.6/mysys/mf_tempfile.c:64
#1  0x0000556bfb918003 in real_open_cached_file (cache=0x7f99afefc5d0) at .../MariaDB/10.6/mysys/mf_cache.c:61
#2  0x0000556bfb91d387 in my_b_flush_io_cache (info=0x7f99afefc5d0, need_append_buffer_lock=0) at .../MariaDB/10.6/mysys/mf_iocache.c:1706
#3  0x0000556bfb91aba7 in _my_b_write (info=0x7f99afefc5d0, 
    Buffer=0x7f9984bc1769 "P_ju12GM0CCdN:vf{Z1$^2x-pnYTvwpNrw{%Eh`'?H,@>6AXyvg#r8`Y|)RA/g9'TS#RW<gr*t/+))4h2\\Y*d:48X7F:2)v:", '\245' <repeats 104 times>..., Count=45)
    at .../MariaDB/10.6/mysys/mf_iocache.c:559
#4  0x0000556bfafa86c2 in my_b_write (info=0x7f99afefc5d0, Buffer=0x7f9984bc172c "f", Count=106) at .../MariaDB/10.6/include/my_sys.h:522
#5  0x0000556bfafaf45a in merge_buffers (param=0x7f99afefc7d0, from_file=0x7f998401bdc8, to_file=0x7f99afefc5d0, sort_buffer=..., lastbuff=0x7f9984a10118, Fb=0x7f9984a10118, Tb=0x7f9984a10268, flag=0)
    at .../MariaDB/10.6/sql/filesort.cc:1980
#6  0x0000556bfafae290 in merge_many_buff (param=0x7f99afefc7d0, sort_buffer=..., buffpek=0x7f9984a10118, maxbuffer=0x7f99afefc7a4, t_file=0x7f998401bdc8)
    at .../MariaDB/10.6/sql/filesort.cc:1657
#7  0x0000556bfad84d8b in Unique::merge (this=0x7f998401bd90, table=0x7f99849bbf20, buff=0x7f9984ac1f88 "i", buff_size=1046436, without_last_merge=true)
    at .../MariaDB/10.6/sql/uniques.cc:769
#8  0x0000556bfad84a06 in Unique::walk (this=0x7f998401bd90, table=0x7f99849bbf20, action=0x556bfb0ccf82 <item_sum_distinct_walk_for_count(void*, element_count, void*)>, walk_action_arg=0x7f998401bbd8)
    at .../MariaDB/10.6/sql/uniques.cc:689
#9  0x0000556bfb0ce3ef in Aggregator_distinct::endup (this=0x7f998401bbd8) at .../MariaDB/10.6/sql/item_sum.cc:1100
#10 0x0000556bfb0d2022 in Item_sum_count::val_int (this=0x7f99840184f8) at .../MariaDB/10.6/sql/item_sum.cc:1946
#11 0x0000556bfae95a3b in Type_handler::Item_send_longlong (this=0x556bfc6aa7c0 <type_handler_slonglong>, item=0x7f99840184f8, protocol=0x7f99840013e0, buf=0x7f99afefcaa0)
    at .../MariaDB/10.6/sql/sql_type.cc:7514

Generated at Thu Feb 08 10:37:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.