Big column compressed (MDEV-11371)

[MDEV-13342] Testing for MDEV-11371 (Big column compressed) Created: 2017-07-18  Updated: 2017-12-16  Resolved: 2017-12-16

Status: Closed
Project: MariaDB Server
Component/s: Tests
Affects Version/s: 10.3
Fix Version/s: N/A

Type: Technical task Priority: Major
Reporter: Elena Stepanova Assignee: Alice Sherepa
Resolution: Fixed Votes: 0
Labels: None

Attachments: File column_compression_parts.result     File column_compression_parts.test     File column_compression_rpl.inc     File column_compression_rpl.result     File column_compression_rpl.test    
Issue Links:
Blocks
is blocked by MDEV-13539 Latest revision of bb-10.3-svoj (big ... Closed
PartOf
includes MDEV-13665 EXCHANGE PARTITION does not work with... Closed
Relates
relates to MDEV-13378 Mysqld crashes with COMPRESSED data t... Closed
relates to MDEV-13540 Server crashes in copy or Assertion `... Closed
relates to MDEV-13541 Server crashes in next_breadth_first_... Closed
Sprint: 10.3.1-2

 Description   

Development tasks: MDEV-11371, MDEV-11381
Development tree (as of July 18th): bb-10.3-svoj
Tentative patch (as of July 18th): https://github.com/MariaDB/server/commit/79e055f407d34f195e3fde20401f39033dfce51d

Request before the second round of review (received by email):

On Thu, Jun 29, 2017 at 12:22:15PM +0400, Sergey Vojtovich wrote:

Elena: I added a decent test for this feature, but it would be great if you could
extend it; replication testing, especially, is missing.

Note: Since there is no documentation for the feature, need to explore first.
Note: Make sure it's documented before the release.
Note: Have 'innodb' removed from MDEV-11371 subject.

Initial exploration:

  • server restart
  • base for virt col
  • storage for dyncol
  • 2nd part of index
  • change column, alter column
  • views
  • analyze, optimize, check
  • create table .. select with an unsupported engine (haven't found an unsupported engine yet)
  • partitions, partition by compressed col
  • timestamps, sets, enums (N/A)
  • zerofill, unsigned (N/A)
  • null/not null
  • default
  • charsets
  • compressed column + row_format=compressed
  • + table compressed
  • + encryption
  • aria, tokudb, rocksdb, connect, heap, federated
  • binary log, replication
  • mysqldump


A mark here does not mean "tested"; it just means the case appears to be supported and does not fail right away.

Extra MTR tests needed:

  • with partitions
    • PARTITION BY KEY + SELECT .. WHERE col = 'something' etc. – crashes
  • with binlog in row format (easy to check with replication)
    • some strange '\x00foo' value shows up, and the length increases


 Comments   
Comment by Alice Sherepa [ 2017-07-24 ]

1) If we have compressed data types on both master and slave, we get wrong data on the slave for all BLOB and TEXT data types (TINY, ...); it works correctly with VARCHAR and VARBINARY.

The same happens if we have compressed on master but not compressed on slave.
If we have compressed data on slave but not compressed on master, then the results on the slave are correct.

2) Tables with partitions and compressed data types crash on

select/update/delete ... where a="...";     # column a is compressed (varchar or varbinary)

There is no crash, and the result is correct, if we use the compressed column with functions like LIKE, NOT LIKE, LENGTH(), ...
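A minimal sketch of a repro, based on the query shape above and the DELETE statement visible in the stack trace (table name and values are hypothetical; the full test is in the attached column_compression_parts.test):

```sql
-- Hypothetical minimal repro of the partition crash (sketch only).
CREATE TABLE t1 (a VARCHAR(1000) COMPRESSED)
  PARTITION BY KEY(a) PARTITIONS 4;
INSERT INTO t1 VALUES (REPEAT('a',100));
-- An equality comparison on the compressed column triggers partition
-- pruning, which asserts in Field_varstring_compressed::get_key_image():
DELETE FROM t1 WHERE a = 'a';
```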
example of stack trace

Thread 1 (Thread 0x7fec02075700 (LWP 26424)):
#0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
#1  0x0000556749cc8ea6 in my_write_core (sig=6) at /home/alice/git/10.3.sv/mysys/stacktrace.c:477
#2  0x0000556749553ca5 in handle_fatal_signal (sig=6) at /home/alice/git/10.3.sv/sql/signal_handler.cc:299
#3  <signal handler called>
#4  0x00007fec07dba428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#5  0x00007fec07dbc02a in __GI_abort () at abort.c:89
#6  0x00007fec07db2bd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x556749e85028 "0", file=file@entry=0x556749e84fe0 "/home/alice/git/10.3.sv/sql/field.h", line=line@entry=3213, function=function@entry=0x556749e8a840 <Field_varstring_compressed::get_key_image(unsigned char*, unsigned int, Field::imagetype)::__PRETTY_FUNCTION__> "virtual uint Field_varstring_compressed::get_key_image(uchar*, uint, Field::imagetype)") at assert.c:92
#7  0x00007fec07db2c82 in __GI___assert_fail (assertion=0x556749e85028 "0", file=0x556749e84fe0 "/home/alice/git/10.3.sv/sql/field.h", line=3213, function=0x556749e8a840 <Field_varstring_compressed::get_key_image(unsigned char*, unsigned int, Field::imagetype)::__PRETTY_FUNCTION__> "virtual uint Field_varstring_compressed::get_key_image(uchar*, uint, Field::imagetype)") at assert.c:101
#8  0x0000556749547aa9 in Field_varstring_compressed::get_key_image (this=0x7febf0036d40, buff=0x7febf0054f91 '\245' <repeats 200 times>..., length=1000, type_arg=Field::itRAW) at /home/alice/git/10.3.sv/sql/field.h:3213
#9  0x00005567496acdac in Item_bool_func::get_mm_leaf (this=0x7febf0015168, param=0x7fec02072a90, field=0x7febf0036d40, key_part=0x7febf00546f8, type=Item_func::EQ_FUNC, value=0x7febf00150d8) at /home/alice/git/10.3.sv/sql/opt_range.cc:8123
#10 0x00005567496abdfc in Item_bool_func::get_mm_parts (this=0x7febf0015168, param=0x7fec02072a90, field=0x7febf0036d40, type=Item_func::EQ_FUNC, value=0x7febf00150d8) at /home/alice/git/10.3.sv/sql/opt_range.cc:7841
#11 0x00005567492450f5 in Item_bool_func2_with_rev::get_func_mm_tree (this=0x7febf0015168, param=0x7fec02072a90, field=0x7febf0036d40, value=0x7febf00150d8) at /home/alice/git/10.3.sv/sql/item_cmpfunc.h:442
#12 0x00005567496aaca0 in Item_bool_func::get_full_func_mm_tree (this=0x7febf0015168, param=0x7fec02072a90, field_item=0x7febf0014fd8, value=0x7febf00150d8) at /home/alice/git/10.3.sv/sql/opt_range.cc:7520
#13 0x0000556749244e0c in Item_bool_func::get_full_func_mm_tree_for_args (this=0x7febf0015168, param=0x7fec02072a90, item=0x7febf0014fd8, value=0x7febf00150d8) at /home/alice/git/10.3.sv/sql/item_cmpfunc.h:189
#14 0x0000556749245274 in Item_bool_func2_with_rev::get_mm_tree (this=0x7febf0015168, param=0x7fec02072a90, cond_ptr=0x7fec020729e8) at /home/alice/git/10.3.sv/sql/item_cmpfunc.h:470
#15 0x00005567496a12be in prune_partitions (thd=0x7febf0000b00, table=0x7febf003c7d0, pprune_cond=0x7febf0015168) at /home/alice/git/10.3.sv/sql/opt_range.cc:3483
#16 0x00005567496f3b44 in mysql_delete (thd=0x7febf0000b00, table_list=0x7febf00149c0, conds=0x7febf0015168, order_list=0x7febf0005020, limit=18446744073709551615, options=0, result=0x0) at /home/alice/git/10.3.sv/sql/sql_delete.cc:392
#17 0x00005567492b1e1e in mysql_execute_command (thd=0x7febf0000b00) at /home/alice/git/10.3.sv/sql/sql_parse.cc:4726
#18 0x00005567492bc290 in mysql_parse (thd=0x7febf0000b00, rawbuf=0x7febf00148d8 "DELETE FROM t1 where a=\"a\"", length=26, parser_state=0x7fec02074210, is_com_multi=false, is_next_command=false) at /home/alice/git/10.3.sv/sql/sql_parse.cc:7897
#19 0x00005567492a9d41 in dispatch_command (command=COM_QUERY, thd=0x7febf0000b00, packet=0x7febf0165e01 "DELETE FROM t1 where a=\"a\"", packet_length=26, is_com_multi=false, is_next_command=false) at /home/alice/git/10.3.sv/sql/sql_parse.cc:1814
#20 0x00005567492a86de in do_command (thd=0x7febf0000b00) at /home/alice/git/10.3.sv/sql/sql_parse.cc:1377
#21 0x00005567493f40e2 in do_handle_one_connection (connect=0x55674bdf78a0) at /home/alice/git/10.3.sv/sql/sql_connect.cc:1354
#22 0x00005567493f3e62 in handle_one_connection (arg=0x55674bdf78a0) at /home/alice/git/10.3.sv/sql/sql_connect.cc:1260
#23 0x000055674978ba52 in pfs_spawn_thread (arg=0x55674bdffc70) at /home/alice/git/10.3.sv/storage/perfschema/pfs.cc:1862
#24 0x00007fec089f76ba in start_thread (arg=0x7fec02075700) at pthread_create.c:333
#25 0x00007fec07e8c3dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109


column_compression_parts.test column_compression_rpl.test

Comment by Sergey Vojtovich [ 2017-07-25 ]

Partitioning should be fixed now, please pull.
Another thing that may need extensive testing is the optimiser when it has to create a temporary table.

Comment by Sergey Vojtovich [ 2017-07-25 ]

In replication test please also make sure that data is stored in compressed form in binlog (whenever applicable).

Comment by Sergey Vojtovich [ 2017-07-27 ]

Replication of BLOB COMPRESSED to BLOB COMPRESSED should be fixed now.

Replication of COMPRESSED<->UNCOMPRESSED is not yet solved. We have to extend binlog format to transfer COMPRESSED flag, which has to be discussed with monty.

Comment by Alice Sherepa [ 2017-07-27 ]

I added a new test for replication; the binlog is printed at the end.

So, for now (274ab73a89573554d07a6de1deb846ad47eb1004):
compressed to compressed works correctly,
compressed to not compressed: tables on slave return wrong results,
not compressed to compressed: tables on slave return 0 (before, it worked for all types except varchar and varbinary, which failed with error "1259: ZLIB: Input data corrupted")

Comment by Sergey Vojtovich [ 2017-07-27 ]

As I mentioned above, not compressed to compressed is not ready yet. Otherwise this change is intentional and brings things in line with varchar.

Comment by Sergey Vojtovich [ 2017-08-08 ]

This change is now in bb-10.3-svoj: Fixed replication of compressed->uncompressed and uncompressed->compressed data. Depending on slave_type_conversions it is either rejected or converted properly.
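For reference, a sketch of toggling the conversion behavior on the slave (slave_type_conversions is a standard replication variable; the exact set of accepted conversions here is an assumption based on the comment above):

```sql
-- On the slave: with slave_type_conversions empty, a
-- COMPRESSED<->UNCOMPRESSED mismatch should be rejected (error 1677,
-- ER_SLAVE_CONVERSION_FAILED); with ALL_NON_LOSSY / ALL_LOSSY it
-- should be converted instead. Sketch only.
STOP SLAVE;
SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY';
START SLAVE;
```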

To make it work currently, libmariadb has to be patched:

diff --git a/include/mariadb_com.h b/include/mariadb_com.h
index cdf7db8..31550cb 100644
--- a/include/mariadb_com.h
+++ b/include/mariadb_com.h
@@ -327,6 +327,8 @@ enum enum_field_types { MYSQL_TYPE_DECIMAL, MYSQL_TYPE_TINY,
                         MYSQL_TYPE_TIMESTAMP2,
                         MYSQL_TYPE_DATETIME2,
                         MYSQL_TYPE_TIME2,
+                        MYSQL_TYPE_BLOB_COMPRESSED= 140,
+                        MYSQL_TYPE_VARCHAR_COMPRESSED= 141,
                         /* --------------------------------------------- */
                         MYSQL_TYPE_JSON=245,
                         MYSQL_TYPE_NEWDECIMAL=246,

Comment by Elena Stepanova [ 2017-08-15 ]

Ignore my previous comment (now removed), it belongs to a different task.

Comment by Alice Sherepa [ 2017-08-21 ]

The replication test failed for the compressed to not compressed part (Column 0 of table 'test.t1' cannot be converted from type 'tinyblob compressed' to type 'blob') with row-based replication.

Comment by Sergey Vojtovich [ 2017-08-21 ]

Did you set slave_type_conversions?

Comment by Elena Stepanova [ 2017-08-21 ]

alice,
1) column_compression_parts is not stable; please try to record a result file and run the test several times with it, you'll see.
2) I think we should have both MyISAM and InnoDB in the test. Alternatively, we can have combinations based on the engine, but that will require an rdiff file for the result, because there is a lot of engine-specific content in the result output.
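If we go the combinations route, the MTR setup might look roughly like this (a sketch; the file name follows MTR conventions, and the engine-specific rdiff would still have to be recorded):

```
# column_compression_parts.combinations (hypothetical)
[innodb]
--default-storage-engine=innodb

[myisam]
--default-storage-engine=myisam
```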

The replication test fails as alice said above. svoj, setting slave_type_conversions to either ALL_NON_LOSSY or ALL_LOSSY doesn't seem to make a difference.

Comment by Sergey Vojtovich [ 2017-08-23 ]

Back to the discussion about

DELETE FROM t1 where a=(REPEAT('a',100));

It does indeed compress during partition pruning. I'd say it is a performance bug. It shouldn't block the initial version, though.

In the parts test, data is spread differently depending on the COMPRESSED flag setting (see the output from I_S.PARTITIONS). I don't really like it.

Comment by Sergey Vojtovich [ 2017-08-23 ]

Also note that I pushed a rebased version to bb-10.3-MDEV-11371. I suggest switching to that branch unless it fails badly.

Comment by Sergey Vojtovich [ 2017-08-23 ]

I confirm there's something wrong with replication. I'll try to get this fixed asap.

The fact that it says "'tinyblob compressed' to type 'blob'" is a bug as such. It should say "'blob compressed' to type 'blob'". Reported as MDEV-13629.

If I set --slave-type-conversions=ALL_NON_LOSSY it accepts blobs just fine, but then fails with "'varchar(10001) compressed' to type 'varchar(10000)'". That's something that I'll have to fix.

Comment by Sergey Vojtovich [ 2017-08-25 ]

The replication test should be extended to cover replicating a tinyblob with data length 255 to a tinyblob COMPRESSED with column_compression_threshold > 255.
It should be an error.

Comment by Sergey Vojtovich [ 2017-08-25 ]

Also VARCHAR(10000) to VARCHAR(9999) COMPRESSED should be an error.
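Both scenarios might be sketched in the replication test roughly as follows (hypothetical table names; which error code is raised depends on the fix):

```sql
-- Scenario 1: tinyblob -> tinyblob COMPRESSED with threshold > 255.
-- On the slave:  CREATE TABLE t1 (a TINYBLOB COMPRESSED);
--                SET column_compression_threshold = 300;
-- On the master: CREATE TABLE t1 (a TINYBLOB);
INSERT INTO t1 VALUES (REPEAT('a',255));
-- A 255-byte value fits an uncompressed tinyblob, but not a compressed
-- tinyblob stored with its header byte, so replication should stop
-- with an error.

-- Scenario 2: VARCHAR(10000) -> VARCHAR(9999) COMPRESSED should
-- likewise be rejected on the slave as a lossy conversion.
```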

Comment by Sergey Vojtovich [ 2017-08-25 ]

Replication should now be fixed in bb-10.3-MDEV-11371.

Comment by Sergey Vojtovich [ 2017-08-25 ]

Data distribution in partitioned tables is now fixed in bb-10.3-MDEV-11371.

Comment by Alice Sherepa [ 2017-08-29 ]

Replication does not work for the largest value in blob and text types (tiny, medium, ...) with all combinations (compressed to compressed, compressed to not compressed, not compressed to compressed). It returns an error, e.g. for tinyblob:

INSERT INTO t1(a)  VALUES(REPEAT('a',255))' failed: 1406: Data too long for column 'a' at row 1

Comment by Sergey Vojtovich [ 2017-08-29 ]

It is a bug; it should only fail for nc->c.
I wonder how you made it fail for c->c, though?

Comment by Alice Sherepa [ 2017-08-29 ]

It works for 254, but not for 255.

connection slave;
CREATE TABLE t1  (a tinyblob COMPRESSED);
connection master;
CREATE TABLE IF NOT EXISTS t1 (a tinyblob COMPRESSED);
INSERT INTO t1(a)  VALUES(REPEAT('a',255));
main.1dd 'mix'                           [ fail ]
        Test ended at 2017-08-29 12:45:34
 
CURRENT_TEST: main.1dd
mysqltest: At line 15: query 'INSERT INTO t1(a)  VALUES(REPEAT('a',255))' failed: 1406: Data too long for column 'a' at row 1

It indeed does not show an error with blobs while ALL_NON_LOSSY is set;
with varchars there is an error. In the case of varchar(1000) -> varchar(1000) compressed:

Last_Errno	1677
Last_Error	Column 0 of table 'test.t2' cannot be converted from type 'varchar(10001) compressed' to type 'varchar(9999) /*!100301 COMPRESS' 

Comment by Sergey Vojtovich [ 2017-08-29 ]

It happens on the master, which is kind of expected. The maximum data length for compressed blobs is 1 byte shorter than for regular blobs. There's probably a way to fix it, but it was decided not to bother with this in the first implementation.

Comment by Sergey Vojtovich [ 2017-08-29 ]

If you want to try 255 bytes compressed blob you should do something like this:

SET column_compression_threshold=255;
INSERT INTO t1(a)  VALUES(REPEAT('a',254));

Comment by Sergey Vojtovich [ 2017-08-29 ]

Are you testing a recent bb-10.3-MDEV-11371? VARCHAR should have been fixed there.

Comment by Alice Sherepa [ 2017-09-21 ]

column_compression_parts.result column_compression_parts.test column_compression_rpl.inc column_compression_rpl.result column_compression_rpl.test

Generated at Thu Feb 08 08:04:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.