Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 10.6.15, 10.6.16, 10.6.17
- Fix Version/s: None
- Environment: Ubuntu 20.04
Description
I'm in a situation with a WordPress site where the wp_options table keeps increasing in filesystem size very fast, even though it only has about 10000 records, and we have to constantly optimize it to reduce the usage.
It's my understanding that when records are constantly deleted / inserted, they are not actually deleted from the .ibd file but are instead kind of soft-deleted.
Thus the file's size on disk keeps increasing constantly, reaching as much as 500G within 2 days.
In order to fix the issue I need to actually see what is being written to / deleted from the table constantly, to find possible plugins that might be causing it.
I can safely assume it's WordPress transients; however, I can't say with certainty which type of transient is causing this bloat without examining the data.
So in order to find this out, I'm looking for ways to read the soft-deleted data from the bloated .ibd file.
I've tried a couple of tools like undrop-for-innodb, but they're not yielding any results.
Any ideas how I can read the mass of deleted data that's occupying the filesystem space?
I also think that there should be a limit to the expansion of the table with deleted data. Like, why would a table of 20MB in size need 500GB of deleted data in the filesystem?
Attachments
- blob.zip
- 11 kB
- historylenth.zip
- 90 kB
- historylenth-1.zip
- 301 kB
- history-list-length.zip
- 842 kB
- sample.zip
- 152 kB
Issue Links
- is blocked by
  - MDEV-20094 InnoDB blob allocation allocates extra extents (Closed)
  - MDEV-29823 Secondary index records may never be purged after rollback (Confirmed)
- relates to
  - MDEV-28499 wordpress wp_options tables cannot be optimized (Open)
  - MDEV-28038 Why is there a big gap in the actual Table space? (calculated vs information_schema vs file size) (Open)
  - MDEV-34097 Extremely slow performance on specific to Wordpress/Woocommerce joins (Open)
- links to
Activity
With the table at 50GB in size after a single day:
History list length 24897
The history list is measured in transactions, not rows. If the transactions are updating indexed columns, they could cause the secondary indexes to grow a lot.
Note that InnoDB never shrinks .ibd files, except when the table is being rebuilt due to an operation like OPTIMIZE TABLE, ALTER TABLE, or TRUNCATE TABLE. So, even if the purge of history caught up, the file size would likely remain at 50GB until it needs to be extended again.
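For reference, a minimal sketch of the kind of rebuild that does return the file to a compact size (the table name is taken from this report; either statement rebuilds the table and may block writes for part of the operation):
OPTIMIZE TABLE wp_options;
-- or, equivalently for InnoDB, which maps OPTIMIZE to a rebuild anyway:
ALTER TABLE wp_options ENGINE=InnoDB;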
I'm not sure how to respond to this, as from a user's point of view, having the table grow to 500GB from deleted data, or from indexes which are oversized compared to the actually necessary data, appears unreasonable. Basically, as a user, when I delete data I expect the data to be deleted and the space reclaimed if all went properly.
Does the table description help in deducing why the space is never reclaimed? If not how can I assist in troubleshooting this?
CREATE TABLE `wp_options` (
  `option_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `option_name` varchar(191) NOT NULL DEFAULT '',
  `option_value` longtext NOT NULL,
  `autoload` varchar(20) NOT NULL DEFAULT 'yes',
  PRIMARY KEY (`option_id`),
  UNIQUE KEY `option_name` (`option_name`),
  KEY `autoload` (`autoload`)
) ENGINE=InnoDB AUTO_INCREMENT=12987461 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
To understand why this is as it is, you could watch these presentations:
https://mariadb.org/fest2022/how-innodb-multi-version-concurrency-control-mvcc-works/
https://mariadb.org/fest2023/innodb-undo-log-and-purge-history/
OK sure, but what is the end-user supposed to do if that's the case? 3 facts remain at the very least:
1. Obviously we need the many reads / writes / deletes, otherwise the app wouldn't do them in the first place.
2. The database size increases by more than 50GB in a day on one site; that's 17.8TB in a year. We cannot keep this much space free, especially in shared hosting environments where we have, for example, 50 customers with WordPress sites. That would be a 2.5TB increase in a day.
3. We can't keep running OPTIMIZE or ALTER TABLE statements constantly, as they are blocking operations and for the time they take to execute the site goes down temporarily.
How is a user supposed to tackle this?
From the settings
MariaDB [(none)]> show variables like '%undo%';
+--------------------------+----------+
| Variable_name            | Value    |
+--------------------------+----------+
| innodb_max_undo_log_size | 10485760 |
| innodb_undo_directory    | ./       |
| innodb_undo_log_truncate | OFF      |
| innodb_undo_tablespaces  | 0        |
+--------------------------+----------+
By default it's turned off, but does enabling innodb_undo_log_truncate mean the space would actually be reclaimed?
The setting innodb_undo_log_truncate is related to truncating dedicated innodb_undo_tablespaces. The space that is occupied by undo log records in the system tablespace can only be reclaimed in very recent versions that include MDEV-14795.
The end user is expected to upgrade to MariaDB 10.6.16 once it is released, and report if the problem was fixed by that.
OK I see, it sounds reasonable.
Until then, is it possible to mitigate this?
Hi @Marko, I'm now on Server version: 10.6.16-MariaDB-1:10.6.16+maria~ubu2004 mariadb.org binary distribution, and the issue's impact is reduced but not eliminated.
I still get a bloated filesystem within a few days. At least it's not 500G, which is progress, but the growth is still there for an indeterminate time and resources are not being released.
While I understand the concept you shared in the videos, having ever-growing tables in the filesystem holding deleted data doesn't help. Constantly optimizing the DB tables doesn't help with performance or downtime either.
Before optimize
-rw-rw---- 1 mysql mysql 6.8G Nov 22 08:21 /var/lib/mysql/db/wp_options.ibd
After optimize
-rw-rw---- 1 mysql mysql 140M Nov 22 08:24 /var/lib/mysql/db/wp_options.ibd
I'll provide the history list length when it increases again as after optimize it's
> History list length 25
ioweb.gr, thank you for the update. It seems to me that after the fix of MDEV-32050, the remaining problem is that the .ibd file gets fragmented, or pages are not being merged when the index trees are being shrunk when the history is being deleted.
The remaining growth could also be related to the way InnoDB allocates pages within the data file. That may not have been improved enough in MDEV-13013.
Can you try to get some page-level statistics from the data file? To force all data to be written back from the buffer pool to the data file, you might execute
FLUSH TABLES wp_options FOR EXPORT; UNLOCK TABLES;
(or alternatively tune innodb_max_dirty_pages_pct and wait some time as noted in MDEV-30000) and then execute the innochecksum utility on the wp_options.ibd file, to determine which indexes are consuming the most space within the data files. The INDEX_ID should correspond to INFORMATION_SCHEMA.INNODB_SYS_INDEXES.INDEX_ID.
There is an attribute COMMENT 'MERGE_THRESHOLD=50' that was introduced in MySQL 5.7 and MariaDB Server 10.2, for tables as well as individual indexes. You could try to experiment with it.
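A minimal sketch of such an experiment (the threshold value 40 is only an example, not a recommendation):
-- Table-level COMMENT applies the merge threshold to all indexes of the table:
ALTER TABLE wp_options COMMENT='MERGE_THRESHOLD=40';
-- Alternatively, it can be set per index via an index comment when (re)creating the index:
ALTER TABLE wp_options DROP KEY `autoload`, ADD KEY `autoload` (`autoload`) COMMENT 'MERGE_THRESHOLD=40';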
FLUSH TABLES wp_options FOR EXPORT; UNLOCK TABLES;
This didn't work. The size remains at over 1.3G
+--------------------------------+-----------+
| Variable_name                  | Value     |
+--------------------------------+-----------+
| innodb_max_dirty_pages_pct     | 90.000000 |
| innodb_max_dirty_pages_pct_lwm | 0.000000  |
+--------------------------------+-----------+
What should I put for innodb_max_dirty_pages_pct?
ioweb.gr, the only purpose of the SQL statements was to ensure that innochecksum can access a reasonably recent version of the data in the .ibd file. We’d want to see what the data pages inside the .ibd file are being used for.
I see, and please excuse me if I'm not being easy to assist, but I'm not well versed in database administration tasks.
I'm assuming I need to run
$ innochecksum -S wp_options.ibd
File::wp_options.ibd
================PAGE TYPE SUMMARY==============
#PAGE_COUNT PAGE_TYPE
===============================================
     324 Index page
       0 Undo log page
       1 Inode page
       0 Insert buffer free list page
    1404 Freshly allocated page
       6 Insert buffer bitmap
       0 System page
       0 Transaction system page
       1 File Space Header
       5 Extent descriptor page
   81971 BLOB page
       0 Compressed BLOB page
       0 Page compressed page
       0 Page compressed encrypted page
       0 Other type of page
===============================================
Additional information:
Undo page type: 0
Undo page state: 0 active, 0 cached, 0 to_purge, 0 prepared, 0 other
index_id   #pages   #leaf_pages   #recs_per_page   #bytes_per_page
 31538956      234           233               48             13039
 31538957       60            59              189             11942
 31538958       14            13              799             12985
index_id   page_data_bytes_histgram(empty,...,oversized)
 31538956   0 2 2 6 6 12 8 17 19 49 113 0
 31538957   0 0 0 1 0 10 16 5 1 1 26 0
 31538958   0 1 0 1 0 1 0 0 0 1 10 0
And then check, for example, index_id 31538956, which seems to consume the most space.
Which yields
mysql -e 'SELECT * FROM information_schema.INNODB_SYS_INDEXES WHERE INDEX_ID = 31538956;'
+----------+---------+----------+------+----------+---------+----------+-----------------+
| INDEX_ID | NAME    | TABLE_ID | TYPE | N_FIELDS | PAGE_NO | SPACE    | MERGE_THRESHOLD |
+----------+---------+----------+------+----------+---------+----------+-----------------+
| 31538956 | PRIMARY | 18614158 |    3 |        1 |       3 | 14808244 |              50 |
+----------+---------+----------+------+----------+---------+----------+-----------------+
Thank you, this is better.
I don’t think that the innochecksum tool currently parses the allocation bitmap pages. That is, some of the pages that it identifies as index or BLOB pages could actually be marked as free in the tablespace. Can you post the results of the following:
SELECT COUNT(*), SUM(LENGTH(option_value)) FROM wp_options;
If you are using the default innodb_page_size=16k, you would seem to have 81971*16384 = 1,343,012,864 bytes in current or former BLOB data pages inside the file, which is almost 10 times the file size after OPTIMIZE TABLE.
Based on the information gathered so far, the bug would seem to be the way how BLOB pages are allocated in InnoDB. It would be great if you can write a script that would simulate your workload. Using such a script we should be able to reproduce and fix the anomaly.
SELECT COUNT(*), SUM(LENGTH(option_value)) FROM wp_options;
10450   21169372
I see. Unfortunately I'm not able to provide info about the workload that is causing it. I understand that something is writing and deleting temporary data in the wp_options table, which is common to a lot of plugins in WordPress as well as WordPress itself; in the WordPress world these are called transients and act as a form of object cache.
It's certain that it's some type of transient that's repeating the write/delete process too often.
However, to be able to reproduce this, I need to see what actual data is stored in the deleted space; otherwise I can't even guess what type of data is being created and deleted constantly to find the pattern.
It feels like chasing my tail. In order to find the reason I need to see the data to start tracking which query writes it and when. In order to find the bug you need to see the workload that causes it.
Can you give any hints about the workload? Would option_id ever be changed in an UPDATE? For InnoDB BLOBs, that is handled as a special case: the newly inserted record with the updated PRIMARY KEY will inherit the BLOBs from the old record that will be delete-marked, and the old record will ‘disown’ the BLOBs. When BLOBs themselves are updated, a new copy is always allocated and written, and the old copy should be freed as part of purging the undo log records. There could be some bug in this area.
Can you provide copies of the allocation bitmap pages for the file? That is, if you are using the default innodb_page_size=16384, we would need copies of the files produced by the following:
dd bs=16384 skip=0 count=1 if=wp_options.ibd of=page0.bin
dd bs=16384 skip=16384 count=1 if=wp_options.ibd of=page16384.bin
dd bs=16384 skip=32768 count=1 if=wp_options.ibd of=page32768.bin
dd bs=16384 skip=49152 count=1 if=wp_options.ibd of=page49152.bin
dd bs=16384 skip=65536 count=1 if=wp_options.ibd of=page65536.bin
These pages contain bitmaps that identify which pages inside the file are allocated and not garbage. Please also post the innochecksum output for the file at the same point of time.
If most of the BLOB pages are actually marked as allocated, then this would be a bug in the purge of BLOBs. If they are marked as free, then the failure to reuse the freed pages would be a bug of the page allocator.
As far as I know option_id is never changed. I've done a grep in code and I don't see this key anywhere in SQL update / insert commands.
Regarding the files request, do they contain sensitive information? If so is there a secure way to share them where the link won't be publicly accessible?
For BLOBs and PRIMARY KEY updates, we have the open bug MDEV-31441. But, it should not be applicable here.
The allocation bitmap pages do not contain any sensitive information or any user data. Because your data is mostly ASCII text (except for the option_id which would be stored as a big endian 64-bit integer), you could run the strings utility on the files to ensure that there is no legible text stored there. Index pages should have the strings infimum and supremum on them.
Page 0 contains the size of the file in pages, as well as information on which page numbers between 0 and innodb_page_size-1 are allocated. There also are some linked lists of allocated file segments or extents which span all the allocation bitmap pages.
Hi,
Anything else needed on my end to help reproduce this issue?
For the record we still get extreme filesystem sizes daily
Hi, sorry, I was occupied with other tasks. In the file innochecksum_result.txt inside blob.zip (as well as in some earlier comments by you) we can see the following:
index_id   #pages   #leaf_pages   #recs_per_page   #bytes_per_page
 31538956      192           191               58             13399
 31538957       75            74              178             11534
 31538958       27            26              649             10536
The first one corresponds to the PRIMARY KEY index, and the next two should correspond to the UNIQUE KEY and the non-unique KEY. BLOB pages will be allocated from the same "file segment" as primary key leaf pages (BTR_SEG_LEAF). According to the #leaf_pages output above, there only are 191 primary key leaf pages or BLOB pages in total. There are at least 2 index leaf pages, because we see that there is one non-leaf page (the total number of pages is one larger).
Inside the file we also have statistics saying that there are 480,609 BLOB pages. It looks like at most 189 of them will actually be allocated for BLOBs.
The allocation bitmap page format is a little cumbersome to parse manually from a hexdump, because the bitmap bytes are interleaved with something else. There are two bits per page. One is always set, so 0xaa would refer to 4 allocated pages and 0xff would refer to 4 free pages. In page65536.bin I see 0xfaffff…. The first byte 0xfa means that the first 2 pages (page 65536 and 65537) are allocated by the system (for the allocation bitmap itself, and for a change buffer bitmap). The files page32768.bin and page49152.bin look similar. In the file page16384.bin the bitmap starts with 0xfaffff… but there are some scattered allocations: 0xabfffffbfffffbfaffbfaaaaaaaaaa. Based on this quick look, it seems that only some of the first 20,000 pages (or 320 MiB) of the file may actually be used, while the file is much larger. In total, only 294 pages (4.6 MiB) inside the file are used by actual payload, according to innochecksum. Actually, there should be 30 allocation bitmap pages in total.
This looks like a possible bug in the page allocation algorithm. I say "possible", because this would be an expected situation if there had been a huge purge lag at some point, and those BLOB pages were only recently freed. InnoDB never shrinks .ibd files. Possibly the secondary indexes will contribute to the purge lag; this would eventually be fixed by a file format change (MDEV-17598).
Can you monitor the history list length over time (once per second or minute), to see if there still are some bursts?
Would the .ibd file keep growing further from its current size, or would the former BLOB pages be eventually reused when the next burst of workload arrives?
Can you try to write a script that simulates the updates to this table? I am afraid that without such a script, it is hard to do anything.
My real problem is determining what updates are running.
Any ideas on how to log all queries for a specific database and a specific table? Perhaps if we parse a log file that logs everything, we can track the query that executes most often and try to simulate it.
I can write a script that monitors the history list length of course; it will run every minute (cron limitation) and log it. Do you also need me to output the file size of the table in the same log?
It would output something like
<list length> - <size>
PS: no worries I totally understand
I’m not familiar with any query filtering, but maybe danblack can help with that.
Yes, the history size and the wp_options.ibd file size would be useful to sample together.
The collection of not-so-great options includes:
- audit plugin, server_audit_events=QUERY, no per-table option
- system-versioned tables:
ALTER TABLE wp_options
ADD start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START INVISIBLE,
ADD end_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW END INVISIBLE,
ADD PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid),
ADD SYSTEM VERSIONING;
(remove with
set statement system_versioning_alter_history=1 for alter table wp_options DROP COLUMN start_trxid, DROP end_trxid, DROP SYSTEM VERSIONING;
).
The better option is probably:
- log_bin=1, binlog_do_db=wp (no table option)
- mariadb-dump --dump-master=1 --single-transaction wp wp_options
- possibly even with replicate_do_table, or the mariadb-binlog --table option for filtering
- If you are willing to upload these: https://mariadb.com/kb/en/meta/mariadb-ftp-server/
There also is a MyISAM bug MDEV-28499 that mentions the table wp_options. As far as I understand, the .MYD heap file would keep growing until TRUNCATE or OPTIMIZE.
ioweb.gr, do you have any results that you could share? If the majority of read-write transactions is modifying something else than the wp_options table, then the history list length (which is measured in transactions, not rows) would not necessarily correlate with the growth of the file.
Hi,
sorry for the delay, I had to contact my customer about these options to get permission as they could potentially impede the normal functioning of the website
Would this option be good enough to test this ?
ALTER TABLE wp_options
ADD start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START INVISIBLE,
ADD end_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW END INVISIBLE,
ADD PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid),
ADD SYSTEM VERSIONING;
What would you need me to provide after adding this to the table?
I would not recommend enabling system versioning on the table on a production server. With system versioning, any UPDATE statements will be converted to INSERT and any DELETE to UPDATE. That is, the table will preserve the entire history from the time when system versioning was enabled.
It would be better to set up replication on this table only, and to enable the system versioning on the replica. It could be very helpful to have the history that corresponds to a time frame where the file is growing on the primary server. This of course together with the monitoring of the history list length on the primary server. On the replica, we would only be interested in the history of this table, nothing else.
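A rough sketch of what that replica-side setup could look like, assuming replication to a separate server is already configured, the schema is named wp (an assumption), and the replication filter variable can be set dynamically on this version:
-- On the replica only:
STOP SLAVE;
SET GLOBAL replicate_do_table = 'wp.wp_options';  -- replicate only this table
START SLAVE;
-- Keep the full change history of the table on the replica only:
ALTER TABLE wp.wp_options ADD SYSTEM VERSIONING;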
Hi,
I've asked our server admin to see if it's doable; however, he says it's not in his scope of work or field of expertise. Unfortunately I don't have anyone who can set up replication for only this table to help out.
We'll have to try and think of easier potential workarounds that are feasible to perform on my end without breaking our production server.
Any other ideas on how to approach this?
The reason for the table growth is system versioning. This means that nothing is ever deleted from the table and any change will just grow it indefinitely until the versioning data is truncated.
See
https://mariadb.com/kb/en/system-versioned-tables/#removing-old-history
Hi, in my case it's not system versioning, because simply:
ALTER TABLE wp_options DROP SYSTEM VERSIONING;
ERROR 4124 (HY000): Table `wp_options` is not system-versioned
Recently I've been tracking this in more tables as well, besides wp_options, for which it's pretty much impossible to narrow down the type of queries being run.
I noticed it on Magento websites as well, when they are set to store sessions in the database, and with the WordPress scheduler for cron jobs.
So tables like the following will start bloating over time:
1. core_session (OpenMage)
2. session (Magento 2)
3. wp_actionscheduler_actions (cron data for WordPress)
Here the data is more strictly controlled as it stores specific information about the user session and the cron tasks.
For example on one core_session table due to a bug, I had to delete 3.5 million rows manually using
`delete from core_session where session_expires = 4294967295`
The size of the table never dropped from 8GB until I manually issued the optimize command.
So this is a controlled reproduction of the issue.
I can replicate this on two more tables with absurdly large sizes, where I can delete the excess session data and then see that the size never decreases.
Does this help? Can we do any actions to see what is happening between these operations without breaking our production database server?
I can get confirmation to delete the sessions on those websites' tables as they're very interested in solving this (it will reduce disk usage and thus their subscription cost)
PS: I'm now on 10.6.17-MariaDB-1:10.6.17+maria~ubu2004-log
One maybe stupid question, but is it possible this is happening because in the past we installed the debug symbols, after a suggestion by @Daniel Black for my question https://dba.stackexchange.com/questions/330629/how-can-i-find-out-whats-causing-the-deadlock-in-mariadb, which led to our other issue report here: https://jira.mariadb.org/browse/MDEV-32033
We installed the debug symbols then to find what was causing the lock but never removed them, because even after solving that issue we noticed this one happening, so I kept them in case we needed them again.
PS: In reality, for that issue it was the optimizer occasionally screwing things up when the statements included a join twice (https://github.com/magento/magento2/issues/36667#issuecomment-1725549906), causing the number of rows examined to randomly reach trillions.
Hi ioweb.gr, I am interested in any setup where this bug can be reproduced. It would be ideal if you could reduce it to something as simple as "create and initialize a table like this" and "execute these SQL statements on it." You can obfuscate the data or omit columns or indexes, as long as you adjust the SQL accordingly so that the bug remains reproducible.
I do not think that installing debug symbols will have any effect on the server execution, other than by consuming some space in the file system. Typically the executable is installed as /usr/sbin/mariadbd and the information for debugging resides in /usr/lib/debug.
Hi @Marko Mäkelä
For me the issue is replicable like this
CREATE TABLE `session` (
  `session_id` varchar(255) NOT NULL COMMENT 'Session Id',
  `session_expires` int(10) unsigned NOT NULL DEFAULT 0 COMMENT 'Date of Session Expiration',
  `session_data` mediumblob NOT NULL COMMENT 'Session Data',
  PRIMARY KEY (`session_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci COMMENT='Database Sessions Storage'

# create a function to generate a random hex string
DROP FUNCTION IF EXISTS `RandomHexString`;

DELIMITER $$

CREATE FUNCTION RandomHexString(desiredLength INT) RETURNS VARCHAR(255)
BEGIN
  DECLARE hexChars VARCHAR(16) DEFAULT '0123456789ABCDEF';
  DECLARE randomHexString VARCHAR(255) DEFAULT '';
  DECLARE i INT DEFAULT 0;
  DECLARE randomChar CHAR(1);
  DECLARE countIndex INT;

  WHILE i < desiredLength DO
    SET countIndex = FLOOR(1 + (RAND() * 16));
    SET randomChar = SUBSTRING(hexChars, countIndex, 1);
    SET randomHexString = CONCAT(randomHexString, randomChar);
    SET i = i + 1;
  END WHILE;

  RETURN randomHexString;
END$$

DELIMITER ;

# create a procedure to insert a large number of rows
DROP PROCEDURE IF EXISTS `RepeatInsert`;

DELIMITER $$

CREATE PROCEDURE RepeatInsert(IN times INT)
BEGIN
  DECLARE i INT DEFAULT 0;

  WHILE i < times DO
    INSERT INTO db_madund.session (session.session_id, session_expires, session_data) VALUES (UUID(), 1707724303, RandomHexString(255));
    SET i = i + 1;
  END WHILE;
END$$

DELIMITER ;

# populate the table
CALL RepeatInsert(1000000);

# check current table size in the filesystem

# run some delete / reinsert operations many times, 10 for example
delete from session limit 100000;
CALL RepeatInsert(100000);

# check current table size in the filesystem and watch it's increased

# run optimize on the table and watch it decrease in size
Because this is a session table, the number of reads/writes depends heavily on the traffic of the website and the lifetime of a session. Bots can create excessive traffic with short-lived sessions, for example.
Thank you, ioweb.gr. I simplified your test case a little, as follows:
--source include/have_innodb.inc
--source include/have_sequence.inc

CREATE TABLE s (id varchar(255) PRIMARY KEY, d mediumblob NOT NULL)
ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;

INSERT INTO s SELECT UUID(),repeat('x',255) FROM seq_1_to_1000000;
let $datadir=`SELECT @@datadir`;
--exec wc -c $datadir/test/s.ibd

DELETE FROM s LIMIT 100000;
INSERT INTO s SELECT UUID(),repeat('x',255) FROM seq_1_to_100000;
--exec wc -c $datadir/test/s.ibd

OPTIMIZE TABLE s;
--exec wc -c $datadir/test/s.ibd

DROP TABLE s;
The constant non-indexed column session_expires INT NOT NULL should not matter much. For the MEDIUMBLOB column, we can use the constant string repeat('x',255) instead of a random string of the same length, because the column is not indexed.
I was able to reproduce the problem on a CMAKE_BUILD_TYPE=RelWithDebInfo build of mariadb-10.6.15. In fact, I am seeing some increase of the file size after OPTIMIZE TABLE, which I think could be due to MDEV-26740:
mariadb-10.6.15
CREATE TABLE s (id varchar(255) PRIMARY KEY, d mediumblob NOT NULL)
ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;
SET STATEMENT foreign_key_checks=0,unique_checks=0 FOR
INSERT INTO s SELECT UUID(),repeat('x',255) FROM seq_1_to_1000000;
356515840 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
DELETE FROM s LIMIT 100000;
INSERT INTO s SELECT UUID(),repeat('x',255) FROM seq_1_to_100000;
390070272 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
OPTIMIZE TABLE s;
Table	Op	Msg_type	Msg_text
test.s	optimize	note	Table does not support optimize, doing recreate + analyze instead
test.s	optimize	status	OK
402653184 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
DROP TABLE s;
On the current 10.6 branch I get similar numbers:
10.6 466069b184f318aafc0e87cc6b06633d1fa9eb4b
356515840 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
…
390070272 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
…
394264576 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
Adding SET GLOBAL innodb_max_purge_lag_wait=0 (MDEV-16952) right after the DELETE statement would seem to fix this, except for the OPTIMIZE TABLE:
10.6 466069b184f318aafc0e87cc6b06633d1fa9eb4b
…
356515840 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
DELETE FROM s LIMIT 100000;
SET GLOBAL innodb_max_purge_lag_wait=0;
INSERT INTO s SELECT UUID(),repeat('x',255) FROM seq_1_to_100000;
356515840 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
…
394264576 /dev/shm/10.6/mysql-test/var/mysqld.1/data//test/s.ibd
This seems to confirm it as purge related. I think that MDEV-32050 (fixed in 10.6.16) and MDEV-33213 (fixed in the recently released 10.6.17) could help with your situation.
Can you check if upgrading to MariaDB 10.6.17 would help?
I'm actually in 10.6.17 and the problem persists
Server version: 10.6.17-MariaDB-1:10.6.17+maria~ubu2004-log mariadb.org binary distribution
ioweb.gr, thank you. Are you monitoring the history list length periodically (say, once per minute)?
ioweb.gr, my best hypothesis is that this space usage is correlating with the amount of garbage in the data file. If the history of a committed massive DELETE is purged without too much delay, then a subsequent huge INSERT could reuse the space, as demonstrated by my simplified version of your test case.
Increasing innodb_purge_batch_size and possibly also innodb_purge_threads could help keep the history list length low.
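A minimal sketch of that tuning, with placeholder values; whether each variable can be changed at runtime depends on the server version, so verify that first:
-- Larger purge batches let each purge pass remove more history (the value is only an example):
SET GLOBAL innodb_purge_batch_size = 1000;
-- innodb_purge_threads may only be configurable at startup on some releases, e.g. in my.cnf:
--   innodb_purge_threads = 4
-- Observe the effect over time:
SHOW GLOBAL STATUS LIKE 'Innodb_history_list_length';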
If there is no correlation between the history list length and the unnecessarily large file size, then we would be looking at a different bug and would likely need a different test case in order to understand what is going on.
ioweb.gr, did you start monitoring the history list length together with the file sizes, to see if there is a correlation between them? If yes, this is a known problem and not fixable (other than by trying to configure the server so that the purge lag is minimized). If not, we’d need something that would reproduce the anomaly.
I've actually just finished tweaking the logging script
#!/bin/bash

# Define log file path
LOG_FILE="/var/log/history-list-length.log"

for i in {1..15}
do
  # Get the current timestamp
  TIMESTAMP=$(date +'%Y-%m-%d %H:%M:%S')

  # Get InnoDB history list length
  HISTORY_LENGTH=$(mysql -e "SHOW ENGINE INNODB STATUS\G;" | grep -i 'history' | sed 's/.*history list length //')

  # Get disk usage of the wp_options.ibd file
  DISK_USAGE=$(du -csh /var/lib/mysql/****/wp_options.ibd | head -n 1 | cut -f1)

  # Append both metrics as a single line to the log file
  echo "${TIMESTAMP} - History List Length: ${HISTORY_LENGTH}, Disk Usage: ${DISK_USAGE}" >> ${LOG_FILE}

  # Sleep for 4 seconds
  sleep 4
done
It will provide an entry every 4 seconds approximately
2024-02-21 12:16:10 - History List Length: History list length 8, Disk Usage: 113M
2024-02-21 12:16:15 - History List Length: History list length 9, Disk Usage: 113M
2024-02-21 12:16:19 - History List Length: History list length 23, Disk Usage: 113M
2024-02-21 12:16:23 - History List Length: History list length 40, Disk Usage: 113M
2024-02-21 12:16:27 - History List Length: History list length 178, Disk Usage: 113M
2024-02-21 12:16:31 - History List Length: History list length 82, Disk Usage: 113M
2024-02-21 12:16:35 - History List Length: History list length 17, Disk Usage: 113M
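As a side note, assuming the Innodb_history_list_length status variable is available on this server version, the same number could be sampled without parsing SHOW ENGINE INNODB STATUS:
SHOW GLOBAL STATUS LIKE 'Innodb_history_list_length';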
I hope by the end of the day the tables will increase enough to see this in effect
I'll be monitoring the size of a single table which increases usually a lot within the day
historylenth.zip
From what I can see, the history list length is increasing / decreasing normally but the size constantly increases. I've uploaded the log entries from yesterday until today.
After optimizing the table it dropped to 73M from 1.1G
Thank you. Do you have any closer insight on what happened in the following time frame?
2024-02-21 13:06:54 - History list length 142, Disk Usage: 505M
2024-02-21 13:06:58 - History list length 269, Disk Usage: 505M
2024-02-21 13:07:02 - History list length 351, Disk Usage: 537M
2024-02-21 13:07:06 - History list length 519, Disk Usage: 641M
2024-02-21 13:07:10 - History list length 625, Disk Usage: 749M
2024-02-21 13:07:14 - History list length 703, Disk Usage: 845M
2024-02-21 13:07:18 - History list length 797, Disk Usage: 945M
2024-02-21 13:07:22 - History list length 903, Disk Usage: 1.1G
2024-02-21 13:07:26 - History list length 152, Disk Usage: 1.1G
InnoDB data files never shrink unless they are being rebuilt, so we can say that the biggest ‘damage’ was done here, within half a minute. This was also almost the maximum purge lag. Of course, some transactions could be really small (updating one non-indexed column) and some very big, so we can’t infer much from these numbers. An earlier growth phase coincided with even more purge lag:
2024-02-21 12:59:50 - History list length 0, Disk Usage: 241M
2024-02-21 12:59:54 - History list length 2, Disk Usage: 241M
2024-02-21 12:59:58 - History list length 4, Disk Usage: 241M
2024-02-21 13:00:02 - History list length 58, Disk Usage: 241M
2024-02-21 13:00:06 - History list length 132, Disk Usage: 241M
2024-02-21 13:00:10 - History list length 725, Disk Usage: 241M
2024-02-21 13:00:14 - History list length 1017, Disk Usage: 345M
2024-02-21 13:00:18 - History list length 759, Disk Usage: 397M
2024-02-21 13:00:22 - History list length 986, Disk Usage: 397M
2024-02-21 13:00:26 - History list length 673, Disk Usage: 397M
2024-02-21 13:00:30 - History list length 633, Disk Usage: 409M
2024-02-21 13:00:34 - History list length 796, Disk Usage: 409M
2024-02-21 13:00:38 - History list length 970, Disk Usage: 433M
2024-02-21 13:00:42 - History list length 1155, Disk Usage: 433M
2024-02-21 13:00:46 - History list length 1048, Disk Usage: 433M
2024-02-21 13:00:50 - History list length 156, Disk Usage: 433M
2024-02-21 13:00:54 - History list length 597, Disk Usage: 433M
2024-02-21 13:00:58 - History list length 90, Disk Usage: 433M
2024-02-21 13:01:02 - History list length 165, Disk Usage: 433M
2024-02-21 13:01:06 - History list length 33, Disk Usage: 433M
2024-02-21 13:01:10 - History list length 113, Disk Usage: 433M
2024-02-21 13:01:14 - History list length 16, Disk Usage: 433M
I’m quoting also some "tail" to demonstrate that the history list is shrinking soon after the peak was reached. Can you say what kind of write transactions could have been executed during the 48 seconds between 12:59:50 and 13:00:38?
Based on this data, it looks plausible to me that the file size growth correlates with the purge lag. One thing that you could try (while keeping the monitoring enabled) is:
SET GLOBAL innodb_max_purge_lag=100;
SET GLOBAL innodb_max_purge_lag_delay=1000;
The first parameter refers to the history list length, and the second parameter is the maximum delay for DML operations, in microseconds. In MDEV-29401 you can find some comments how these parameters work.
I would like to see if this artificial slowing down of the workload would prevent the file growth. If not, then we would need a workload script that reproduces the problem.
Unfortunately, that's why I was initially looking for a way to see the type of data, to understand what was making it grow.
Regarding the request, I executed
SET GLOBAL innodb_max_purge_lag=100;
SET GLOBAL innodb_max_purge_lag_delay=1000;
let's see what we'll gather up
Good luck! You may want to set a longer maximum delay if this is not enough.
Did you try a smaller innodb_max_purge_lag or longer innodb_max_purge_lag_delay? In historylenth-1.zip it seems that the growth of the file size is correlated with the purge lag. The last significant growth occurred here:
2024-02-24 06:00:01 - History list length 16, Disk Usage: 689M
2024-02-24 06:00:06 - History list length 1234, Disk Usage: 689M
2024-02-24 06:00:10 - History list length 1690, Disk Usage: 689M
2024-02-24 06:00:14 - History list length 2089, Disk Usage: 689M
2024-02-24 06:00:18 - History list length 2328, Disk Usage: 689M
2024-02-24 06:00:22 - History list length 2581, Disk Usage: 689M
2024-02-24 06:00:26 - History list length 2783, Disk Usage: 789M
2024-02-24 06:00:30 - History list length 2973, Disk Usage: 913M
2024-02-24 06:00:34 - History list length 3221, Disk Usage: 913M
2024-02-24 06:00:38 - History list length 1825, Disk Usage: 913M
2024-02-24 06:00:42 - History list length 934, Disk Usage: 913M
2024-02-24 06:00:46 - History list length 1443, Disk Usage: 913M
2024-02-24 06:00:50 - History list length 119, Disk Usage: 913M
2024-02-24 06:00:54 - History list length 56, Disk Usage: 913M
It looks like there was a huge burst of transactions within this one-minute interval. Maybe one of them was a massive DELETE or INSERT on this table? We can see that the purge of history is able to make some progress after 06:00:34, while some workload is still arriving.
One idea that might work better would be to modify the application in such a way that right after committing a transaction that did a massive DELETE on wp_options, it would issue the statement SET GLOBAL innodb_max_purge_lag_wait=0; so that all history of committed transactions will be purged. This should ensure that the freed space can be reused by a future INSERT.
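A sketch of that application-side pattern; the DELETE condition below is only a placeholder:
-- Hypothetical cleanup that the application performs periodically:
DELETE FROM wp_options WHERE option_name LIKE '_transient_%';  -- placeholder condition
COMMIT;
-- Wait until the history of committed transactions has been purged, so the freed
-- pages become reusable before the next burst of inserts:
SET GLOBAL innodb_max_purge_lag_wait = 0;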
According to what you say, I would expect the free space to be reused sooner or later, whether the purge lag is small or large, but what differs in my case is that the free space is never reused. Please bear with my simplified thoughts, because perhaps I can't explain it in such a technical way.
Let's say I do the following without the system having enough time to purge:
1. Run a huge insert statement (to use 500MB) (no purge) -> Filesize 500MB
2. Run a huge delete statement to delete 200MB of data (no purge) -> Filesize 500MB
3. Run a huge insert statement to insert 500MB (no purge) -> Since it cannot reuse the free space, the filesize will be 1000MB
4. Run a huge delete statement to delete 300MB of data (then purge) -> Now 500MB should be reusable and the filesize is 1000MB
5a. Run a huge insert statement to insert 500MB -> I should have a filesize of 1000MB again, which is my full data.
5b. Don't run any insert and run optimize -> I should have a filesize of 500MB, as 500MB is marked as reusable
6a. If I run optimize (after 5a) I would expect to see a filesize of 1000MB
No matter what the purge lag is, I should be seeing a file size equal to the data that is actually used plus the space that should be reusable, since the history list length has dropped to 0. However, in my case I see something like 100-200MB of real data, and the file can reach 500GB of supposedly reusable free space.
Just the correlation between the history list length and the growth doesn't explain why the table keeps growing infinitely.
There's definitely a bug here as I see it, in the infinite expansion; the problem does not appear to be the application, as the application won't stray from doing what is expected of a database table, which means running inserts / deletes / updates.
The application itself shouldn't have to worry about the purge lag or anything else. It should be able to expect the DB system to handle file size growth, cleanup and defragmentation. At least that's how I see it.
You have a point: Depending on the allocation patterns, there could also be a failure to reuse previously freed space. However, in a pathological case I think that it is possible that the purge lag alone will cause significant bloat.
thiru has analyzed and fixed MDEV-13013 and similar bugs in the past. I’m assigning this to him for further analysis.
A side finding of this ticket is MDEV-33542, which shows that also OPTIMIZE TABLE may be wasting some space.
Of course anything within my capacity to provide let me know to get to the bottom of this.
I decided to try to extract some queries that occur at the times when I record the history list length, to see if we can piece everything together so you can understand the type of workload better. Perhaps it will help in understanding where the bug lies as well. The problem is that in just a couple of days the log has become huge, like 23GB; zipped, it's going to be more than 600MB.
Do you have any upload tool to send it to you securely @Marko Mäkelä @Thirunarayanan Balathandayuthapani ?
I think that https://mariadb.com/kb/en/meta/mariadb-ftp-server/ can be used for this. Please use file names that start with MDEV-32663. I have found xz to compress well and fast, making use of multiple threads.
Thank you, I followed the instructions on the guide
Can you verify if it got uploaded because curl reported 426 instead of OK
curl -T MDEV-32663_historylengthlogs.tar.gz --ssl-reqd ftp://ftp.mariadb.org/private/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  727M    0     0  100  727M      0 39.8M  0:00:18  0:00:18 --:--:-- 38.9M
curl: (18) server did not report OK, got 426
But later attempts say
curl -T MDEV-32663_historylengthlogs.tar.gz --ssl-reqd ftp://ftp.mariadb.org/private/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (25) Failed FTP upload: 553
Thank you. According to thiru, the file is accessible to him on the server.
Is there an estimate about the release of the patch so I can re-test after upgrading?
ioweb.gr, we have scheduled maintenance releases 4 times per year. I see on the release page that the upcoming 10.6.18 release is scheduled for April 19.
But, we do not need to wait for that long. The fix of MDEV-33542 was pushed to the 10.6 development branch earlier this week. The packages for the currently latest 10.6 development snapshot are available at https://ci.mariadb.org/44684. Can you test that in some non-production environment?
Because the release date is close enough and we can't simulate the workload properly to actually get meaningful results, I think it's best to wait for the official release. I'll keep it in mind to inform you if it's fixed or not. Thanks
ioweb.gr, the quarterly MariaDB Server releases are finally out, including mariadb-10.6.18. Can you check if the problem is fixed by that?
I've upgraded and just optimized the wp_options table which dropped down to 216MB. I'll monitor and let you know how it goes
Thanks for the update
Hi, I have been monitoring for quite some time now. There is a steady increase in the file size, but nothing like before.
Definitely the update helped a lot.
However, my wp_options table is now 10GB in size, whereas if I run optimize on it, it drops to around 200MB.
I think there is still an issue here, but it's not in the extent it was before when it was reaching 500GB in a couple of days.
The remaining bloat could be explained by MDEV-29823. Does CHECK TABLE…EXTENDED report many orphan records in the secondary indexes?
Currently it's at 2.2GB again after the optimize but here's what I got
MariaDB [db_sexstores]> check table wp_options extended;
+-------------------------+-------+----------+----------+
| Table                   | Op    | Msg_type | Msg_text |
+-------------------------+-------+----------+----------+
| db_sexstores.wp_options | check | status   | OK       |
+-------------------------+-------+----------+----------+
1 row in set (0.053 sec)
I'll wait for it to bloat more and check again.
Hi, following up here the filesize is now 9.2GB
Mon Jul 01 14:08:35 [root@d1.iohosting.eu]
~$ mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 63397420
Server version: 10.6.18-MariaDB-ubu2004-log mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use db_sexstores;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [db_sexstores]> check table wp_options extended;
+-------------------------+-------+----------+----------+
| Table                   | Op    | Msg_type | Msg_text |
+-------------------------+-------+----------+----------+
| db_sexstores.wp_options | check | status   | OK       |
+-------------------------+-------+----------+----------+
1 row in set (0.226 sec)
ioweb.gr, thank you for the update. Is the table at least growing a little slower now?
If I understood it correctly, wp_options is a table that WordPress uses as some kind of a key-value store. It is not good when combined with the InnoDB copy-on-write BLOBs. The bug can be in the page allocator, or the purge of history is still lagging and therefore the old BLOB pages are not being freed as soon as they could. Are you still monitoring the history list length?
I am afraid that we need some way of reproducing this in a development system. One way could be replaying a binlog file if you can supply one. There is an option to upload files privately, and this Jira would limit attachments (which are public to everyone) to about 10 megabytes anyway.
Hi,
Actually no, I stopped monitoring the history list length a while back. wp_options works like that, storing configuration values and also data of a transient nature, so that sounds about right.
The file size increases a lot slower, so the huge issue of it growing 500GB in a day is gone, but it still grows indefinitely unless I optimize the tables. I'm assuming that across a shared hosting environment of 10-20 websites, that amounts to a lot of space that's actually freeable in total.
I'll discuss with our legal department how feasible it is to provide a binlog and get back to you, as GDPR is a concern: the binlog will contain data of all customers, not only the one with the issue where we have grounds to provide it.
It would help if you monitored the history list length. It is possible that MDEV-34515 and MDEV-34520 play a role if the history list length is not shrinking.
I've added the monitoring again, as our company was closed in August. I'll report back once the table starts growing and send the log file.
OK, my table has already grown to 528M from 40M; most likely it has been optimized a few times already, whenever it exceeded 1GB.
Here's the log
Thank you. The history list length seems to coincide with some file growth. Here is an excerpt:
2024-02-23 15:59:06 - History list length 0, Disk Usage: 345M
2024-02-23 15:59:10 - History list length 4, Disk Usage: 345M
2024-02-23 15:59:14 - History list length 30, Disk Usage: 345M
2024-02-23 15:59:18 - History list length 47, Disk Usage: 345M
2024-02-23 15:59:22 - History list length 37, Disk Usage: 345M
2024-02-23 15:59:26 - History list length 22, Disk Usage: 345M
2024-02-23 15:59:30 - History list length 67, Disk Usage: 345M
2024-02-23 15:59:34 - History list length 24, Disk Usage: 345M
2024-02-23 15:59:38 - History list length 17, Disk Usage: 345M
2024-02-23 15:59:43 - History list length 94, Disk Usage: 345M
2024-02-23 15:59:47 - History list length 33, Disk Usage: 345M
2024-02-23 15:59:51 - History list length 184, Disk Usage: 345M
2024-02-23 15:59:55 - History list length 42, Disk Usage: 345M
2024-02-23 15:59:59 - History list length 15, Disk Usage: 345M
2024-02-23 16:00:02 - History list length 30, Disk Usage: 345M
2024-02-23 16:00:06 - History list length 1092, Disk Usage: 345M
2024-02-23 16:00:11 - History list length 1751, Disk Usage: 345M
2024-02-23 16:00:15 - History list length 1554, Disk Usage: 345M
2024-02-23 16:00:19 - History list length 1779, Disk Usage: 365M
2024-02-23 16:00:23 - History list length 960, Disk Usage: 409M
…
2024-02-23 16:01:18 - History list length 87, Disk Usage: 409M
2024-02-23 16:01:22 - History list length 165, Disk Usage: 409M
2024-02-23 16:01:26 - History list length 380, Disk Usage: 409M
2024-02-23 16:01:31 - History list length 259, Disk Usage: 409M
2024-02-23 16:01:35 - History list length 781, Disk Usage: 409M
2024-02-23 16:01:39 - History list length 1118, Disk Usage: 501M
2024-02-23 16:01:43 - History list length 1427, Disk Usage: 617M
Another one:
2024-02-24 06:00:01 - History list length 16, Disk Usage: 689M
2024-02-24 06:00:06 - History list length 1234, Disk Usage: 689M
2024-02-24 06:00:10 - History list length 1690, Disk Usage: 689M
2024-02-24 06:00:14 - History list length 2089, Disk Usage: 689M
2024-02-24 06:00:18 - History list length 2328, Disk Usage: 689M
2024-02-24 06:00:22 - History list length 2581, Disk Usage: 689M
2024-02-24 06:00:26 - History list length 2783, Disk Usage: 789M
2024-02-24 06:00:30 - History list length 2973, Disk Usage: 913M
But a long history length does not always result in the file being extended:
2024-02-24 03:00:02 - History list length 4, Disk Usage: 689M
2024-02-24 03:00:06 - History list length 968, Disk Usage: 689M
2024-02-24 03:00:10 - History list length 1285, Disk Usage: 689M
2024-02-24 03:00:14 - History list length 1392, Disk Usage: 689M
2024-02-24 03:00:18 - History list length 1440, Disk Usage: 689M
2024-02-24 03:00:22 - History list length 1490, Disk Usage: 689M
2024-02-24 03:00:26 - History list length 1617, Disk Usage: 689M
2024-02-24 03:00:30 - History list length 1861, Disk Usage: 689M
2024-02-24 03:00:34 - History list length 2658, Disk Usage: 689M
2024-02-24 03:00:38 - History list length 3160, Disk Usage: 689M
2024-02-24 03:00:42 - History list length 2001, Disk Usage: 689M
2024-02-24 03:00:46 - History list length 714, Disk Usage: 689M
2024-02-24 03:00:50 - History list length 80, Disk Usage: 689M
One thing that I recently noticed while working on MDEV-34515 is that btr_cur_pessimistic_delete() may actually end up extending the data file, even though it should be highly unlikely that it needs to allocate a new index page while shrinking the tree. An upgrade to the upcoming next release (scheduled within 2 months) would include MDEV-34515 and could help keep the history list smaller.
If that upgrade will not help, then I think that we must turn our attention to invocations of fsp_reserve_free_extents(). Some of those calls could possibly be removed and some file extension avoided.
ioweb.gr, can you confirm that around the following point of time (or some other point of time with a similar pattern) there were no INSERT or any modification (other than DELETE) of any BLOB data in the table?
2024-08-28 05:01:14 - History list length 5893, Disk Usage: 437M
2024-08-28 05:01:15 - History list length 5987, Disk Usage: 437M
2024-08-28 05:01:16 - History list length 6047, Disk Usage: 437M
2024-08-28 05:02:01 - History list length 11, Disk Usage: 521M
2024-08-28 05:02:02 - History list length 291, Disk Usage: 521M
2024-08-28 05:02:04 - History list length 870, Disk Usage: 521M
If there were none, it would seem that the purge itself is the culprit for unnecessarily extending the file, that is, btr_cur_pessimistic_delete() needs to avoid unnecessary calls to fsp_reserve_free_extents().
I've managed to get a good look at the queries running in the DB by collecting a huge log of queries on the wp_options table.
It seems the bulk consists of queries like those in the attached sample.zip, or queries like this:
starting UPDATE `wp_options` SET `option_value` = 'a:3:{i:0;a:36:{s:12:\\"channel_hash\\";s:32:\\"d4e5a1ae056299a65ce0d2b72b83ac00\\";s:4:
\\"name\\";s:7:\\"Skroutz\\";s:6:\\"fields\\";s:7:\\"skroutz\\";s:8:\\"taxonomy\\";s:4:\\"none\\";s:10:\\"utm_source\\";s:9:\\"Bestprice\\";s:4:\\"type\\";s:26:\\"Comparison shopping engine\\";s:11:\\"projectn
ame\\";s:9:\\"Bestprice\\";s:9:\\"countries\\";s:6:\\"Greece\\";s:10:\\"fileformat\\";s:3:\\"xml\\";s:4:\\"cron\\";s:6:\\"hourly\\";s:4:\\"step\\";s:3:\\"100\\";s:12:\\"project_hash\\";s:32:\\"hFlvg33s2QXLeYvd
1ozmtfHQvqMZK2DO\\";s:8:\\"filename\\";s:32:\\"hFlvg33s2QXLeYvd1ozmtfHQvqMZK2DO\\";s:13:\\"external_file\\";s:105:\\"https://www.mydb.gr/wp-content/uploads/woo-product-feed-pro/xml/hFlvg33s2QXLeYvd1ozmtfHQvqMZ
K2DO.xml\\";s:9:\\"query_log\\";s:79:\\"https://www.mydb.gr/wp-content/uploads/woo-product-feed-pro/logs/query.log\\";s:16:\\"query_output_log\\";s:86:\\"https://www.mydb.gr/wp-content/uploads/woo-product-feed
-pro/logs/query_output.log\\";s:10:\\"attributes\\";a:14:{i:0;a:5:{s:8:\\"rowCount\\";s:1:\\"0\\";s:9:\\"attribute\\";s:2:\\"id\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:2:\\"id\\";s:6:\\"suffix\\";s:
0:\\"\\";}i:1;a:5:{s:8:\\"rowCount\\";s:1:\\"1\\";s:9:\\"attribute\\";s:4:\\"name\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:5:\\"title\\";s:6:\\"suffix\\";s:0:\\"\\";}i:2;a:5:{s:8:\\"rowCount\\";s:1:\
\"2\\";s:9:\\"attribute\\";s:4:\\"link\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:4:\\"link\\";s:6:\\"suffix\\";s:0:\\"\\";}i:3;a:5:{s:8:\\"rowCount\\";s:1:\\"3\\";s:9:\\"attribute\\";s:5:\\"image\\";s
:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:5:\\"image\\";s:6:\\"suffix\\";s:0:\\"\\";}i:4;a:5:{s:8:\\"rowCount\\";s:1:\\"4\\";s:9:\\"attribute\\";s:8:\\"category\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfr
om\\";s:21:\\"category_path_skroutz\\";s:6:\\"suffix\\";s:0:\\"\\";}i:5;a:5:{s:8:\\"rowCount\\";s:1:\\"5\\";s:9:\\"attribute\\";s:14:\\"price_with_vat\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:5:\\"pr
ice\\";s:6:\\"suffix\\";s:0:\\"\\";}i:6;a:5:{s:8:\\"rowCount\\";s:1:\\"6\\";s:9:\\"attribute\\";s:12:\\"manufacturer\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:8:\\"pa_brand\\";s:6:\\"suffix\\";s:0:\\"
\\";}i:7;a:5:{s:8:\\"rowCount\\";s:1:\\"7\\";s:9:\\"attribute\\";s:3:\\"mpn\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:3:\\"sku\\";s:6:\\"suffix\\";s:0:\\"\\";}i:9;a:5:{s:8:\\"rowCount\\";s:1:\\"9\\";s
:9:\\"attribute\\";s:4:\\"size\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:7:\\"pa_size\\";s:6:\\"suffix\\";s:0:\\"\\";}i:10;a:5:{s:8:\\"rowCount\\";s:2:\\"10\\";s:9:\\"attribute\\";s:5:\\"color\\";s:6:
\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:8:\\"pa_color\\";s:6:\\"suffix\\";s:0:\\"\\";}i:11;a:5:{s:8:\\"rowCount\\";s:2:\\"11\\";s:9:\\"attribute\\";s:15:\\"additionalimage\\";s:6:\\"prefix\\";s:0:\\"\\";s:
7:\\"mapfrom\\";s:7:\\"image_1\\";s:6:\\"suffix\\";s:0:\\"\\";}i:12;a:5:{s:8:\\"rowCount\\";s:2:\\"12\\";s:9:\\"attribute\\";s:15:\\"additionalimage\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:7:\\"imag
e_2\\";s:6:\\"suffix\\";s:0:\\"\\";}i:13;a:5:{s:8:\\"rowCount\\";s:2:\\"13\\";s:9:\\"attribute\\";s:15:\\"additionalimage\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:7:\\"image_3\\";s:6:\\"suffix\\";s:0
:\\"\\";}i:15;a:5:{s:8:\\"rowCount\\";s:2:\\"15\\";s:9:\\"attribute\\";s:12:\\"availability\\";s:6:\\"prefix\\";s:0:\\"\\";s:7:\\"mapfrom\\";s:15:\\"pa_availability\\";s:6:\\"suffix\\";s:0:\\"\\";}}s:6:\\"addr
ow\\";s:1:\\"4\\";s:6:\\"utm_on\\";s:2:\\"on\\";s:10:\\"utm_medium\\";s:3:\\"cpc\\";s:12:\\"utm_campaign\\";s:9:\\"Bestprice\\";s:8:\\"utm_term\\";s:2:\\"id\\";s:11:\\"utm_content\\";s:0:\\"\\";s:11:\\"woosea_
page\\";s:13:\\"filters_rules\\";s:6:\\"active\\";b:1;s:12:\\"last_updated\\";s:17:\\"27 Feb 2024 03:03\\";s:7:\\"running\\";s:10:\\"processing\\";s:11:\\"nr_products\\";s:4:\\"2890\\";s:21:\\"nr_products_proc
essed\\";d:723;s:16:\\"history_products\\";a:20939:{s:17:\\"18 Oct 2019 09:04\\";i:662;s:17:\\"18 Oct 2019 09:18\\";i:662;s:17:\\"18 Oct 2019 10:05\\";i:662;s:17:\\"18 Oct 2019 11:05\\";i:662;s:17:\\"18 Oct 20
19 12:11\\";i:662;s:17:\\"18 Oct 2019 14:05\\";i:662;s:17:\\"18 Oct 2019 14:50\\";i:662;s:17:\\"18 Oct 2019 15:09\\";i:662;s:17:\\"18 Oct 2019 16:28\\";i:660;s:17:\\"18 Oct 2019 17:53\\";i:660;s:17:\\"18 Oct 2
019 18:08\\";i:660;s:17:\\"18 Oct 2019 19:10\\";i:660;s:17:\\"18 Oct 2019 20:08\\";i:660;s:17:\\"18 Oct 2019 21:05\\";i:660;s:17:\\"18 Oct 2019 22:02\\";i:660;s:17:\\"18 Oct 2019 22:59\\";i:660;s:17:\\"19 Oct
2019 00:05\\";i:660;s:17:\\"19 Oct 2019 01:05\\";i:660;s:17:\\"19 Oct 2019 02:02\\";i:660;s:17:\\"19 Oct 2019 03:05\\";i:660;s:17:\\"19 Oct 2019 04:05\\";i:660;s:17:\\"19 Oct 2019 05:03\\";i:660;s:17:\\"19 Oct
2019 07:12\\";i:660;s:17:\\"19 Oct 2019 08:33\\";i:660;s:17:\\"19 Oct 2019 09:19\\";i:660;s:17:\\"19 Oct 2019 09:59\\";i:660;s:17:\\"19 Oct 2019 11:03\\";i:660;s:17:\\"19 Oct 2019 12:08\\";i:660;s:17:\\"19 Oc
t 2019 13:10\\";i:660;s:17:\\"19 Oct 2019 14:05\\";i:660;s:17:\\"19 Oct 2019 15:00\\";i:660;s:17:\\"19 Oct 2019 18:05\\";i:660;s:17:\\"19 Oct 2019 19:18\\";i:660;s:17:\\"19 Oct 2019 20:52\\";i:660;s:17:\\"19 O
ct 2019 21:07\\";i:660;s:17:\\"19 Oct 2019 23:05\\";i:660;s:17:\\"19 Oct 2019 23:37\\";i:660;s:17:\\"20 Oct 2019 00:16\\";i:660;s:17:\\"20 Oct 2019 02:00\\";i:660;s:17:\\"20 Oct 2019 02:17\\";i:660;s:17:\\"20
Oct 2019 03:36\\";i:660;s:17:\\"20 Oct 2019 04:34\\";i:660;s:17:\\"20 Oct 2019 05:10\\";i:660;s:17:\\"20 Oct 2019 06:07\\";i:660;s:17:\\"20 Oct 2019 07:07\\";i:660;s:17:\\"20 Oct 2019 09:40\\";i:660;s:17:\\"20
Oct 2019 10:19\\";i:660;s:17:\\"20 Oct 2019 11:19\\";i:660;s:17:\\"20 Oct 2019 12:30\\";i:660;s:17:\\"20 Oct 2019 13:09\\";i:660;........
Which are very large.
-rw-r--r-- 1 root root 128G Oct 29 20:58 wp_options_query.log
-rw-r--r-- 1 root root  23G Feb 29  2024 wp_options_query_real.log
-rw-r--r-- 1 root root  14M Oct 29 20:56 wp_options_query_real_notransient.log
Out of the full query log of 128G,
23G were queries actually writing to the wp_options table,
out of which only 14M were not of the same type as sample.zip.
Out of the 23G file I found these types of queries
18 fornext
6 forupdate
5 itemsUPDATE
1 permissionsUPDATE
60 tablesUPDATE
4 tonet
3295 UPDATE`wp_options`
So I can deduce that consecutively running queries like this reproduces the growth issue
ioweb.gr, thank you for the update. As far as I understand, the column wp_options.option_value is basically a key/value store, a database within a database. This could be a good design when the contents are relatively small or updated infrequently.
InnoDB will primarily try to store an entire record in a B-tree leaf page. If this is not possible, then some of the longest columns will be stored elsewhere in the tablespace, in a singly-linked list of BLOB pages. Updating a BLOB always involves making a copy. The old copy would be deleted when the history of the committed UPDATE transaction is purged.
For what it is worth, in the most recent quarterly releases (MariaDB Server 10.6.20, 10.11.10 and so on), MDEV-34515 was one more improvement to the performance of the purge subsystem. It might allow the old values of the UPDATE to be freed soon enough so that the wp_options.ibd file would have fewer chances to grow in size.
In any case, it would be great if you could provide a self-contained SQL test case for reproducing the anomaly so that we can investigate if anything could be improved further. You may anonymize some data. In particular, if the option_value column is not indexed, then only the length of the column should matter, and we might just execute statements like UPDATE wp_options SET option_value=REPEAT('x',12345) WHERE ….
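For example, a rough, anonymized workload sketch along those lines (the table layout is the one posted earlier in this ticket; the option_name values, the lengths, and the idea of looping these statements are assumptions, not the captured workload):
-- Repeatedly overwrite a few large, non-indexed BLOB values, similar to the captured UPDATEs:
UPDATE wp_options SET option_value = REPEAT('x', 500000) WHERE option_name = 'dummy_feed_config';
UPDATE wp_options SET option_value = REPEAT('y', 700000) WHERE option_name = 'dummy_feed_config';
-- Interleave with transient-style churn (insert and delete of medium-sized rows):
INSERT INTO wp_options (option_name, option_value, autoload)
  VALUES (CONCAT('_transient_dummy_', UUID()), REPEAT('z', 1000), 'no');
DELETE FROM wp_options WHERE option_name LIKE '_transient_dummy_%';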
Just an update, @Marko Mäkelä: I still haven't given up on finding a way to replicate this consistently, but the nature of the queries in the system itself is weird. I'm not even sure which routine generates them and I'm looking into that. Perhaps checking the PHP code that generates them will also hint at a good way to consistently replicate the issue.
As far as the wp_options table goes, WordPress indeed uses it as a key/value store, but it most likely wasn't intended to store blobs like that, and WooCommerce is abusing this table without thinking about the possible consequences.
But still, the space should at some point be reclaimed fully, while it isn't.
Also in my current mariadb version the history list length keeps growing substantially
mariadb Ver 15.1 Distrib 10.6.20-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
The file can be downloaded here; it's over 10MB so I can't upload it via the system:
https://plik.yourstaging.eu/file/g3gVhth1WMVtrna0/EYQdUPCpO7Ya1KZW/listlength.zip
ioweb.gr, I think that we could try to repeat this issue by running a workload that updates long BLOBs. I suppose that the WHERE condition is on the PRIMARY KEY. Can you post the result of the following?
SELECT MIN(LENGTH(option_value)),MAX(LENGTH(option_value)),AVG(LENGTH(option_value)) FROM wp_options;
This would give us some idea of what kind of workload to run.
Hi, of course here it is
SELECT MIN(LENGTH(option_value)),MAX(LENGTH(option_value)),AVG(LENGTH(option_value)) FROM wp_options;
+---------------------------+---------------------------+---------------------------+
| MIN(LENGTH(option_value)) | MAX(LENGTH(option_value)) | AVG(LENGTH(option_value)) |
+---------------------------+---------------------------+---------------------------+
|                         0 |                    758101 |                  999.0166 |
+---------------------------+---------------------------+---------------------------+
1 row in set (0.019 sec)
I think that this could be a duplicate of MDEV-32050. Can you please check the InnoDB history list length from the output of SHOW ENGINE INNODB STATUS? Furthermore, MariaDB Server 10.6.16 with a fix of MDEV-32050 should be released soon. Can you check if the situation would improve after upgrading?