[MDEV-22367] Remove write support for ROW_FORMAT=COMPRESSED Created: 2020-04-24 Updated: 2023-10-30 Resolved: 2022-02-02 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Fix Version/s: | N/A |
| Type: | Task | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | performance | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||
| Description |
|
Between 2004 and 2007, when I designed and implemented ROW_FORMAT=COMPRESSED in the InnoDB Plugin for MySQL 5.1 based on Heikki Tuuri’s rough idea, it might have been a good idea to trade some CPU cycles for I/O bandwidth. Nowadays, with fast solid-state storage being a commodity, it is less so. ROW_FORMAT=COMPRESSED is introducing quite a bit of complexity to the buffer pool and crash recovery. We could make the server much faster if we removed write support. The format is also too rigid to support innodb_page_size larger than 16 KiB or things like instant ADD COLUMN ( To allow users to upgrade from old databases, we must retain read support at least for a few major versions. |
| Comments |
| Comment by Nuno [ 2021-01-01 ] | |
|
@Marko - will we at least be provided with a way to easily convert ONLINE these tables to the alternative? (PAGE_COMPRESSED=1) I have large compressed tables on my 24-hour online write-intensive website. This change seems to mean to me that I won't be able to upgrade my MariaDB version to 10.7+, once the COMPRESSED feature is completely removed... Also, as discussed in MDEV-22839, PAGE_COMPRESSED doesn't really seem to work well for me. It even increased the non-sparse size of the data & indexes, when I tested... I'm really not happy with this change. > "Nowadays, with fast solid-state storage being a commodity," I would be careful in saying this. | |
| Comment by Sergei Golubchik [ 2021-01-28 ] | |
|
nunop, if this will be implemented, tables will continue to be readable, so ALTER TABLE or CREATE ... SELECT will work. | |
| Comment by Nuno [ 2021-01-28 ] | |
|
serg Thank you. However, this means that I have to go through a huge downtime to convert all the large (and critical, 24h online) tables into non-Compressed, to be able to continue upgrading MariaDB. And buy more storage. I'd like to ask you, please, what's the final decision in relation to compressing tables in MariaDB. ROW_FORMAT=COMPRESSED is being obsoleted/removed, so I definitely need to go away from it. So, I just want to confirm with you – there are no better options native of MariaDB planned, is that correct? Thank you very much. | |
| Comment by Ian Gilfillan [ 2021-01-28 ] | |
|
nunop A third option is storage-engine independent compression: https://mariadb.com/kb/en/storage-engine-independent-column-compression/. A note on timings is that if 10.6 is the last version to support ROW_FORMAT=COMPRESSED, it will still be supported for five years after its probable GA release later this year. | |
| Comment by Nuno [ 2021-01-28 ] | |
|
greenman Thank you for your suggestion. I have actually tested column compression in the past. I think I wasn't very happy with the result at the time. While the "average row length" decreased, the Index increased significantly for some reason: EDIT - Apologies. Didn't realize this would add to the attachments of this Ticket. It may be a bit misleading, thinking the Test Results are for the Ticket's original subject! | |
| Comment by Sergei Golubchik [ 2021-01-28 ] | |
|
nunop, I meant if this will be implemented. May be it won't be, if it'll be too difficult for our users. May be there will be non-blocking online ALTER TABLE by that time. There is no final decision yet. At the moment of writing this comment, this MDEV is more like a wish or a request to remove the old, slow, and overcomplicated code. With the goal to make InnoDB faster and easier to maintain (= less bugs). | |
| Comment by Manuel Arostegui [ 2021-03-21 ] | |
|
@Ian Gilfillan @Sergei Golubchik what's the suggested workaround from MariaDB for those who have most of their tables on ROW_FORMAT=COMPRESSED | |
| Comment by Nuno [ 2021-03-21 ] | |
|
marostegui Please have a look at MDEV-22839. As per MDEV-22839, Page Compression didn't give me good results. Column Compression didn't really work well for me, because of the error reported in For now, I migrated my databases to a server with a lot more storage, and disabled compression completely for now, until they decide what to do in terms of compression. Until then, my costs are increased because I'm using way more storage, for fast-growing databases... | |
| Comment by Manuel Arostegui [ 2021-03-21 ] | |
|
Thanks nunop, that's definitely a big deal for us too. Not to mention the fact that we'd need to rebuilt thousands of tables to remove the ROW_FORMAT=COMPRESSED | |
| Comment by Nuno [ 2021-03-21 ] | |
|
I understand their side, regarding ROW_FORMAT=COMPRESSED. It's messy code that they'd like to get rid of, and there's a big disadvantage that it has to have both the Compressed and Decompressed rows in memory. The only problem is that at the moment there's no alternative I can use, that I'm confident of. | |
| Comment by Marko Mäkelä [ 2021-03-23 ] | |
|
Based on the strong feedback, we do not have to remove the write support in 10.7. I think that the deprecation warning that was introduced in 10.6 ( It would be nice to simplify our buffer pool code further. Would it be acceptable if the buffer pool memory usage or the write performance of ROW_FORMAT=COMPRESSED degraded further? I am thinking that we could keep only the uncompressed page image in the buffer pool and allocate and update the compressed copy on demand (when anything is being modified). I think that write performance could have been already hit in 10.5 due to | |
| Comment by Manuel Arostegui [ 2021-03-24 ] | |
|
Thanks for the comment Marko, much appreciated. There are of course tradeoffs as you mention, the ADD COLUMN is definitely something we could benefit from. However, in our case (again, just talking only from our infra point of view) the benefits of being able to compress tables and save footprint is definitely larger than being able to change innodb_page_size or adding columns instantly. But again, we have a very specific workload and very unique requirements. Probably for smaller environments not compressing tables is something that might be easier to deal with and might not be a huge difference, for us, it would be a big pain as we have some pretty massive tables, not only cause we'd need to rethink our footprint strategy, but also case we'd need to rebuilt lots of tables, perform switchovers etc... The approach of having just the uncompressed page doesn't sound bad to me, it can probably work well on most of read-bound environments (which is our case). I'd be happy to do early benchmarking for you with our data if that helps in any case, just ping me about it | |
| Comment by Olaf Buitelaar [ 2021-04-05 ] | |
|
I concur with above comment. Having good compressions is much more important for us, than (for example) the instant addition of a column. Maybe it's an idea to see how a good alternative compression can be implemented before start dropping this feature? | |
| Comment by Rick James [ 2021-05-13 ] | |
|
ADD COLUMN on a COMPRESSed table – Would it help to simply say that that requires ALGORITHM=COPY? Each feature is rarely used; the combination is probably extremely rare. That is this copout would hurt very few people. | |
| Comment by Rick Pizzi [ 2021-08-02 ] | |
|
marko Main reason for using ROW_FORMAT=COMPRESSED is to save disk space, we have a lot of customers actively using that option and deprecating the feature without an alternative that actually works the same, would be a big issue for several shops. I strongly want to descourage this idea. Thanks | |
| Comment by Wagner Bianchi (Inactive) [ 2021-08-02 ] | |
|
I recommend not removing this feature as I see many customers using the InnoDB COMPRESSED row format to save space. It does not sound like a good idea. | |
| Comment by Andrea Ponzo (Inactive) [ 2021-08-02 ] | |
|
Hello Marko, | |
| Comment by Federico Razzoli [ 2021-08-02 ] | |
|
I agree with the comments above. | |
| Comment by Rick Pizzi [ 2021-08-02 ] | |
|
It would be easy to revert the default to the safer and more conservative value innodb_read_only_compressed=OFF in 10.6 | |
| Comment by Marko Mäkelä [ 2021-08-02 ] | |
|
rpizzi, f_razzoli, I wish you had expressed your objection in Anyone who needs ROW_FORMAT=COMPRESSED tables can update the configuration file to include the following in their configuration file:
In MDEV-22839, I suspect that the logical file size was misinterpreted as physical file size. Sorry, I should have checked those numbers more carefully when it was originally posted. I think that we will have to make an informed decision on this. | |
| Comment by Rick Pizzi [ 2021-08-02 ] | |
|
I was only informed of all this today, or of course I would have expressed my opinion earlier. | |
| Comment by Federico Razzoli [ 2021-08-02 ] | |
|
I apology for not speaking out before, in this case. But keep in mind that community members don't necessarily see the tasks in time to express their concerns. | |
| Comment by Reinis Rozitis [ 2021-08-29 ] | |
|
So in the end what alternative remains for data compression? Tokudb was dropped, now InnoDB loses this feature, storage independent compression has only zlib (which probably isn't very optimal), with MyRocks we have somewhat bad experience and compared to rest it's way more complicated to configure. (properly), | |
| Comment by Manuel Arostegui [ 2021-08-30 ] | |
|
After all the comments from this thread my understanding is that it won't be removed, it is "simply" disabled by default on 10.6 but it can be enabled if you just enable it on your my.cnf by adding:
| |
| Comment by Marko Mäkelä [ 2021-10-07 ] | |
|
While there are good technical reasons for removing the ROW_FORMAT=COMPRESSED specific parts of the buffer pool code, that might be done in a way that would allow such tables to be written (just wasting even more memory and CPU). A naïve implementation could be to simply try recompressing the pages on every modification, to ensure that when we would actually write out the page from the buffer pool, recompressing would be guaranteed to succeed. In this way, we would only hold the uncompressed page in the buffer pool. I am afraid that without caching the previous contents of the page in the we would end up basically copying the entire ROW_FORMAT=COMPRESSED page to the redo log on every modification. If writes are going to be relatively rare, that could still be acceptable. An option could be implemented to make the log records cover changes to the uncompressed page. That would require recovery to decompress and recompress pages. In | |
| Comment by Marko Mäkelä [ 2021-11-04 ] | |
|
In If we moved the fields that are related to the adaptive hash index ( | |
| Comment by Nuno [ 2022-01-20 ] | |
|
marko Are you saying that I had to go through all the pain of migrating from ROW_FORMAT=COMPRESSED to DYNAMIC for all my tables, have some unfortunate downtime, and after all, COMPRESSED is going to stay because your main issue with the code is already resolved? It's fine anyway - I didn't like much that COMPRESSED requires double memory usage. I'm just sad that PAGE_COMPRESSED and the COMPRESSED columns either don't work well ( Happy new year! (by the way - seems that if I try to add an Emoji here, the comment submission fails! looks like JIRA isn't using UTF8MB4 | |
| Comment by Marko Mäkelä [ 2022-02-02 ] | |
|
Based on community feedback to | |
| Comment by Eloy Lafuente (stronk7) [ 2022-02-02 ] | |
|
Hi, idiot question but... does this imply that ROW_FORMAT=COMPRESSED is going to be writeable by default in the future or the setting will continue being needed for that forever? Ciao PS, edited: I'm just trying to imagine what to do in the future, to continue with COMPRESSED (with or without the setting) for new installations or switch to any other, better / recommended alternative to it. | |
| Comment by Eloy Lafuente (stronk7) [ 2022-02-13 ] | |
|
Ok, so now the 10.6 release notes page says:
And also, from InnoDB's COMPRESSED Row format:
Thanks! | |
| Comment by Marko Mäkelä [ 2022-02-14 ] | |
|
stronk7, thank you. I have now also updated | |
| Comment by Eloy Lafuente (stronk7) [ 2022-02-14 ] | |
|
Thanks for confirming, and for everything else, marko ! |