[MDEV-18586] Clarify benefits of compression methods (ROW_FORMAT=COMPRESSED vs PAGE_COMPRESSED=1) Created: 2019-02-14  Updated: 2020-05-14  Resolved: 2019-10-09

Status: Closed
Project: MariaDB Server
Component/s: Documentation, Storage Engine - InnoDB
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Nuno Assignee: Kenneth Dyer (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-20670 Add status variable that lists suppor... Closed
relates to MDEV-20672 Usage message for innodb_compression_... Closed

 Description   

I am very confused at the moment in relation to how my tables are compressed, and how should I proceed now to ensure I won't have a problem in the future.

I have been compressing a number of tables with ROW_FORMAT=COMPRESSED.

I read the doc page about Compression ( https://mariadb.com/kb/en/library/compression/ ), but I'm still not clear what's the difference between ROW_FORMAT=COMPRESSED and PAGE_COMPRESSED=1.

Can I use both? Is it recommended to use both if I want to benefit from compression?

Is it possible to know what compression algorithms are installed in MariaDB already? (or is it a system-level installation that MariaDB uses?) It's not clear in the documentation.



 Comments   
Comment by Geoff Montee (Inactive) [ 2019-09-25 ]

I read the doc page about Compression ( https://mariadb.com/kb/en/library/compression/ ), but I'm still not clear what's the difference between ROW_FORMAT=COMPRESSED and PAGE_COMPRESSED=1.

I added some details about the differences here:

https://mariadb.com/kb/en/library/compression/#differences-compared-to-the-compressed-row-format

Can I use both? Is it recommended to use both if I want to benefit from compression?

No, you cannot use both. I added some details about that here:

InnoDB page compression is also only supported if the table's row format is COMPACT or DYNAMIC.

https://mariadb.com/kb/en/library/compression/#using-innodb-page-compression

Is it possible to know what compression algorithms are installed in MariaDB already?

I added some details about that too. You can do something like this:

MariaDB [(none)]> SHOW GLOBAL STATUS WHERE Variable_name IN ('Innodb_have_lz4', 'Innodb_have_lzo', 'Innodb_have_lzma', 'Innodb_have_bzip2', 'Innodb_have_snappy');
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Innodb_have_lz4    | OFF   |
| Innodb_have_lzo    | OFF   |
| Innodb_have_lzma   | ON    |
| Innodb_have_bzip2  | OFF   |
| Innodb_have_snappy | OFF   |
+--------------------+-------+
5 rows in set (0.001 sec)

These details are now here:

https://mariadb.com/kb/en/library/compression/#checking-which-page-compression-algorithms-are-supported

Comment by Nuno [ 2019-09-26 ]

@GeoffMontee - thank you so much for this. You're very helpful.

The documentation is much better now.

1) Regarding this:

"With InnoDB page compression, only uncompressed pages are stored in the buffer pool. In contrast, with InnoDB's COMPRESSED row format, both uncompressed and compressed pages can be in the buffer pool."

When I read this first time, I understood this as "whatever is compressed NEVER goes to buffer pool", but I think this means that the pages are decompressed before they go to the buffer pool.

What clarified this to me was this sentence:

"With InnoDB page compression, pages are decompressed right after they are read from tablespace file, but before being placed in the buffer pool."

Someone that reads only the first sentence might get confused, but reading everything clears things up.

2) Is it right to say that it is recommended to use Page Compression instead of ROW_FORMAT=COMPRESSED ?

3) Under Persistent TRIM,

innodb_use_fallocate=ON

Might be worth putting a small note saying that this variable was removed in MariaDB 10.3.0

Thank you very much!

Comment by Geoff Montee (Inactive) [ 2019-09-26 ]

Hi nunop,

thank you so much for this. You're very helpful.

I'm glad to help!

When I read this first time, I understood this as "whatever is compressed NEVER goes to buffer pool", but I think this means that the pages are decompressed before they go to the buffer pool.
What clarified this to me was this sentence:
...
Someone that reads only the first sentence might get confused, but reading everything clears things up.

Good point. That was worded in a confusing way. I just combined those two bullets into a single bullet.

With InnoDB page compression, compressed pages are immediately decompressed after being read from the tablespace file, and only uncompressed pages are stored in the buffer pool. In contrast, with InnoDB's COMPRESSED row format, compressed pages are decompressed immediately after they are read from the tablespace file, and both the uncompressed and compressed pages are stored in the buffer pool. This means that the COMPRESSED row format uses more space in the buffer pool than InnoDB page compression does.

https://mariadb.com/kb/en/library/compression/#differences-compared-to-the-compressed-row-format

Is it right to say that it is recommended to use Page Compression instead of ROW_FORMAT=COMPRESSED ?

Yeah, exactly.

Under Persistent TRIM,

innodb_use_fallocate=ON

Might be worth putting a small note saying that this variable was removed in MariaDB 10.3.0

That whole section needed a heavy edit. I went ahead and edited it, and then renamed it.

https://mariadb.com/kb/en/library/compression/#saving-storage-space-by-punching-holes-in-compressed-pages

Comment by Nuno [ 2019-09-26 ]

Very good improvements. Thanks very much!

I guess this Issue can now be closed, if there's no further work to do here

Regards.

Comment by Geoff Montee (Inactive) [ 2019-09-26 ]

Thanks!

jacob.moorman may want the changes to be reviewed. I'll let him decide whether this can be closed, or whether he would like other actions to be taken on this first.

Comment by Nuno [ 2020-05-14 ]

@Geoff, @Jacob,

Finally had the opportunity to test PAGE_COMPRESSED.

I have a table that is currently 14GB big (in the file system).
ENGINE=InnoDB ROW_FORMAT=COMPRESSED

I am testing converting to PAGE_COMPRESSED=1, with maximum compression (zlib level 9),
ALTER TABLE mytable ENGINE=InnoDB ROW_FORMAT=DYNAMIC KEY_BLOCK_SIZE=0 PAGE_COMPRESSED=1 PAGE_COMPRESSION_LEVEL=9;

But the file ended up being almost 20GB.

By rebuilding it, defragmenting, and using maximum compression, shouldn't it have become smaller?

EDIT - It might be ok.
The sparse file is 20GB, but I see that the real size of it has reduced to 8GB.

Comment by Geoff Montee (Inactive) [ 2020-05-14 ]

Hi nunop,

By rebuilding it, defragmenting, and using maximum compression, shouldn't it have become smaller?

EDIT - It might be ok.
The sparse file is 20GB, but I see that the real size of it has reduced to 8GB.

Right. Since the file is a sparse file, some tools might still report the size of the file to be the full size of the file--not just the size of the allocated parts of the file. There is some information about that here:

https://mariadb.com/kb/en/innodb-page-compression/#saving-storage-space-with-sparse-files

If you are on Linux, then ls -s can report the actual size:

The ls utility will report the non-sparse size of the tablespace file when executed with default behavior, but ls -s will report the actual amount of storage allocated for the tablespace file.

Generated at Thu Feb 08 08:45:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.