[MCOL-406] Stored procedures required for I_S tables Created: 2016-11-15  Updated: 2016-12-07  Resolved: 2016-12-07

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.0.6

Type: New Feature Priority: Major
Reporter: Andrew Hutchings (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MCOL-309 Support method to report on data set ... Closed
Sprint: 2016-22, 2016-23, 2016-24

 Description   

Stored procedures are required for the new I_S tables created in MCOL-309. At the very least:

1. a full disk size report
2. a usage report per table

We need somewhere to put these as I_S cannot have stored procedures. I recommend something new such as columnstoresys so that it doesn't have any of the legacy names our other schemas use.



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2016-11-15 ]

My initial query was wrong for a number of reasons. Most of these down to one join's results skewing the other join. These are the three queries:

Extent file sizes:

select c.table_name, sum(e1.file_size) extent_file_size from columnstore_columns c join columnstore_extents e1 on c.object_id = e1.object_id and e1.block_offset=0 group by c.table_name;

Dict file sizes:

select c.table_name, sum(e1.fsize) dict_file_size from columnstore_columns c join (select object_id, sum(file_size) fsize from columnstore_extents where block_offset=0 group by object_id) e1 on c.dictionary_object_id = e1.object_id group by c.table_name;

Total file sizes:

select sum(file_size) from columnstore_extents where block_offset=0;

Comment by Andrew Hutchings (Inactive) [ 2016-11-15 ]

Things to note:

  • It will be slightly different to 'du' because 'du' includes things like versionbuffer and the file size of the directory entries.
  • Extent+Dict != total. This is because there are small hidden tables that cannot be exposed to columnstore_columns
Comment by Andrew Hutchings (Inactive) [ 2016-11-15 ]

Total data size:

select sum(dsize) from (select object_id, max(block_offset), sum(data_size) as dsize from columnstore_extents group by object_id) ce;

Need to use this to work out estimated compression ratio

Comment by Andrew Hutchings (Inactive) [ 2016-11-15 ]

also requested for this feature: a shell wrapper script

Comment by Andrew Hutchings (Inactive) [ 2016-11-21 ]

Compression ratio (for compressed tables):

select sum(data_size) / sum(compressed_data_size) * 100 from columnstore_extents where block_offset=0 and uncompressed_file_size is not null;

Not quite right, need mac(block_offset)

Comment by Andrew Hutchings (Inactive) [ 2016-11-23 ]

New patch does the following:

  • Fix a few minor bugs
  • Move the filename/size to a new table called information_schema.columnstore_files
  • Adds a new schema called columnstore_info
  • Add three new stored procedures in columnstore_info:
    • total_usage() - gives total disk/data usage for all columnstore tables (excluding system tables)
    • table_usage() - gives total disk/data usage for a given table in single quotes or NULL to get usage for all tables
    • compression_ration() - gives the overall compression ratio for the compressed columns in ColumnStore

Done everything apart from a script to access it. Getting a script to work if a MySQL password is changed could be difficult and it should be easy to access the stored procedures.

Moving to stalled until tree is unfrozen

Comment by Andrew Hutchings (Inactive) [ 2016-11-24 ]

Documentation updates to follow after merge

Comment by Andrew Hutchings (Inactive) [ 2016-11-28 ]

Documentation draft available at: https://mariadb.com/kb/en/mariadb/columnstore-information-schema-tables/+r/50150/

Comment by Daniel Lee (Inactive) [ 2016-12-02 ]

Build tested: Source from Github.

[root@localhost mariadb-columnstore-server]# git show
commit 3795bd4cf42d59b792c473101703911fb53e9297
Merge: 570184c 84714c9
Author: dhall-InfiniDB <david.hall@mariadb.com>
Date: Wed Nov 30 11:42:14 2016 -0600

Merge pull request #18 from mariadb-corporation/MCOL-424

MCOL-424 Disable indexes for cross-engine

[root@localhost mariadb-columnstore-server]# cd mariadb-columnstore-engine/
[root@localhost mariadb-columnstore-engine]# git show
commit 63266938716bd933b80eb03fc8aa4ef10d25eab6
Merge: 691c52c 4efd58d
Author: david hill <david.hill@mariadb.com>
Date: Fri Dec 2 10:03:11 2016 -0600

merge mcol-421 branch

It is nicely implemented and the information that it provides is very useful. The following are few issues that I found:

1)
MariaDB [columnstore_info]> call columnstore_info.table_usage();
ERROR 1318 (42000): Incorrect number of arguments for PROCEDURE columnstore_info.table_usage; expected 1, got 0

The documentation says this:

The table_usage() procedure gives a the total data disk usage, dictionary disk usage and grand total disk usage per-table. It can be called one of two ways, the first gives a total for each table:

> call columnstore_info.table_usage();

Or for a specific table, my_table in this example:

> call columnstore_info.table_usage('my_table');

2)
MariaDB [tpch1c]> call columnstore_info.table_usage('lineitem');
-----------------------------------------------------+

TABLE_NAME DATA_DISK_USAGE DICT_DISK_USAGE TOTAL_USAGE

-----------------------------------------------------+

lineitem 5.34 GB 4.50 GB 9.84 GB

-----------------------------------------------------+
1 row in set (0.09 sec)

The lineitem table exist in different schemas. How do I specify a schema to identify a specific table. Is this usage for all lineitem tables?

3) mysqld crashed when calling the table_usage() procedure, but I could not reproduce it after.

MariaDB [tpch1c]> call columnstore_info.table_usage('tpch10c.lineitem');
ERROR 2013 (HY000): Lost connection to MySQL server during query
MariaDB [tpch1c]> call columnstore_info.table_usage('tpch10c.lineitem');
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 3
Current database: tpch1c

Info from the /var/log/mariadb/mariadb.log file

161202 22:14:34 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.19-MariaDB
key_buffer_size=536870912
read_buffer_size=4194304
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1780837 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7f1de1a90908
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f1db0720d98 thread_stack 0x80000
/usr/local/mariadb/columnstore/mysql//bin/mysqld(my_print_stacktrace+0x29)[0x7f1ddff7d639]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(handle_fatal_signal+0x2dd)[0x7f1ddfb3a74d]
/lib64/libpthread.so.0(+0xf100)[0x7f1ddf168100]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(+0x4686b5)[0x7f1ddfa016b5]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN4JOIN14optimize_innerEv+0xed4)[0x7f1ddfa1a294]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN4JOIN8optimizeEv+0x2f)[0x7f1ddfa1c35f]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x1bb)[0x7f1ddfa1c5cb]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x234)[0x7f1ddfa1d184]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(+0x428e09)[0x7f1ddf9c1e09]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z21mysql_execute_commandP3THD+0x5e52)[0x7f1ddf9cdca2]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x35)[0x7f1ddfc37d75]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN13sp_lex_keeper23reset_lex_and_exec_coreEP3THDPjbP8sp_instr+0x7d)[0x7f1ddfc3e2bd]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x204)[0x7f1ddfc3e864]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN7sp_head7executeEP3THDb+0x767)[0x7f1ddfc3aa97]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x5a7)[0x7f1ddfc3bf47]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(+0x428afe)[0x7f1ddf9c1afe]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z21mysql_execute_commandP3THD+0x159c)[0x7f1ddf9c93ec]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x33b)[0x7f1ddf9d110b]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z18idb_vtable_processP3THDyP9Statement+0x10c8)[0x7f1ddf9d3318]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1f02)[0x7f1ddf9d7bd2]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z10do_commandP3THD+0x161)[0x7f1ddf9d8a41]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(_Z24do_handle_one_connectionP3THD+0x194)[0x7f1ddfa8f0a4]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(handle_one_connection+0x37)[0x7f1ddfa8f277]
/lib64/libpthread.so.0(+0x7dc5)[0x7f1ddf160dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f1ddd7a2ced]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f1d70135e88): is an invalid pointer
Connection ID (thread ID): 12
Status: NOT_KILLED

Optimizer switch: index_merge=off,index_merge_union=off,index_merge_sort_union=off,index_merge_intersection=off,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=off,in_to_exists=on,semijoin=off,partial_match_rowid_merge=off,partial_match_table_scan=off,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=off,exists_to_in=on,orderby_uses_equalities=off

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

We think the query pointer is invalid, but we will try to print it anyway.
Query: SELECT TABLE_NAME, format_filesize(sum(cf.file_size)) DATA_DISK_USAGE, format_filesize(sum(IFNULL(ccf.file_size, 0))) DICT_DISK_USAGE, format_filesize(sum(cf.file_size) + sum(IFNULL(ccf.file_size, 0))) TOTAL_USAGE FROM INFORMATION_SCHEMA.COLUMNSTORE_COLUMNS cc
JOIN INFORMATION_SCHEMA.COLUMNSTORE_FILES cf ON cc.object_id = cf.object_id
LEFT JOIN INFORMATION_SCHEMA.COLUMNSTORE_FILES ccf ON cc.dictionary_object_id = ccf.object_id
WHERE table_name = NAME_CONST('t_name',_latin1'tpch10c.lineitem' COLLATE 'latin1_swedish_ci') GROUP BY table_name

161202 22:14:35 mysqld_safe Number of processes running now: 0
161202 22:14:35 mysqld_safe mysqld restarted

Comment by Andrew Hutchings (Inactive) [ 2016-12-03 ]

For point 1, it is supposed to be "call columnstore_info.table_usage(NULL);".

For the rest, I'll look into this early next week.

Comment by Andrew Hutchings (Inactive) [ 2016-12-05 ]

Segfault was very likely to be MCOL-441

Comment by Andrew Hutchings (Inactive) [ 2016-12-06 ]

Review for fixes based on Daniel's comments.

To answer the specific problems:

1) A 'NULL' needs to be provided for the parameter since optional parameters are not possible. I will make sure to update the documentation appropriately

2) Which you can only use the table name in the query (schema.table format won't work) with the latest pull request it will give the results broken down by schema and table.

3) I'm pretty sure MCOL-441 will fix this.

Comment by David Thompson (Inactive) [ 2016-12-06 ]

Agreed, lets go forward with justins suggestion for 1.0.6.

Comment by Andrew Hutchings (Inactive) [ 2016-12-06 ]

Another pull request implemented the discussed change to table_usage()

Comment by Andrew Hutchings (Inactive) [ 2016-12-07 ]

Another new pull request, fixes the table_usage('schema', NULL) use case and the hang dleeyh experienced.

Comment by Daniel Lee (Inactive) [ 2016-12-07 ]

Build verified: Github source

[root@localhost mariadb-columnstore-server]# git show
commit 8592d353c5477940f9600566639302de9fa994c7
Merge: 3795bd4 7af4e57
Author: dhall-InfiniDB <david.hall@mariadb.com>
Date: Tue Dec 6 09:49:03 2016 -0600

Merge pull request #20 from mariadb-corporation/MCOL-441

MCOL-441 Fix segfault on SP error

[root@localhost mariadb-columnstore-server]# cd mariadb-columnstore-engine/
[root@localhost mariadb-columnstore-engine]# git show
commit 7a8322dc28471b830aca243698cd7fce5bc4401c
Merge: 5c0ced8 9b6beb4
Author: dhall-InfiniDB <david.hall@mariadb.com>
Date: Wed Dec 7 10:10:22 2016 -0600

Merge pull request #73 from mariadb-corporation/MCOL-435

Mcol 435

Verified the two fixes.

Generated at Thu Feb 08 02:20:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.