Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-28038

Why is there a big gap in the actual Table space? (calculated vs information_schema vs file size)

Details

    Description

      I have a InnoDB table with the following columns:

      `BIGINT` -> 8 bytes
      `BIGINT` -> 8 bytes
      `ENUM(0,1)` -> 1 byte
      `MEDIUMINT` -> 3 bytes
      `INT` -> 4 bytes

      = 24 bytes per row

      It contains 10454004 rows (based on COUNT), so I calculate the Data size as *250MB*.

      This table's `PRIMARY` key is `BIGINT`, `BIGINT`, `ENUM`.
      And has an `INDEX` on the `INT`.

      Based on `information_schema`,
      `DATA_LENGTH` = *530.8MiB*
      `INDEX_LENGTH` = 272.0MiB
      `DATA_FREE` = 6.0MiB
      = 808.8MiB = *848 MB*

      Also, `AVG_ROW_LENGTH` = 87 instead of 24

      And then I look at the actual `.ibd` file size, which is reported as:
      1451229184 bytes = *1451.23 MB*

      I understand there is fragmentation and sparse files in question (the records on this table are regularly expired, but never had a bigger retention time than currently is).
      But I'm not understanding why do I see 3 different sizes, with such big gap each.

      Goes from 250MB of real calculable data, to 530MiB of Data.
      And `DATA_FREE` is only 6MiB.

      And 848 MB reported by `information_schema` (which per my understanding includes "free space" as well), much lower than the actual `.ibd` file.

      *What am I missing or not considering?*

      Attachments

        Issue Links

          Activity

            nunop, thank you for the report. This could be related to MDEV-13013 and MDEV-26782, which report that InnoDB is extending data files even though they already contain plenty of free pages.

            Can you provide some SQL for reproducing this? It can be smaller, and I think that it can mostly be dummy data. What could be relevant is the number of pages, so it could be useful to set innodb_page_size=4k for reproducing this inconsistency while using less data.

            You could use MDEV-26782 as a starting point. Yes, it is about temporary tables, and the temporary tablespace also contains undo log pages, so it is not an exact match of this report. I would recommend using innochecksum -S tablename.ibd for gathering statistics on the page allocation within the file.

            marko Marko Mäkelä added a comment - nunop , thank you for the report. This could be related to MDEV-13013 and MDEV-26782 , which report that InnoDB is extending data files even though they already contain plenty of free pages. Can you provide some SQL for reproducing this? It can be smaller, and I think that it can mostly be dummy data. What could be relevant is the number of pages, so it could be useful to set innodb_page_size=4k for reproducing this inconsistency while using less data. You could use MDEV-26782 as a starting point. Yes, it is about temporary tables, and the temporary tablespace also contains undo log pages, so it is not an exact match of this report. I would recommend using innochecksum -S tablename.ibd for gathering statistics on the page allocation within the file.
            nunop Nuno added a comment - - edited

            marko - thank you for the answer and references.

            I said above that the records on this table are regularly expired. Every day, there are new records being inserted all the time, and then once a day, old records deleted.
            Am I right to assume that the space "reclaimed" from the old records should be used for new records?
            Note also that the PRIMARY KEY is not an autoincrement. It's based on two foreign IDs (e.g. User ID <-> Post ID)
            I wonder if this could somehow be the reason for bigger fragmentation, since the records aren't added in a consistent order.

            CREATE TABLE `mytable` (
              `uid` bigint(18) unsigned NOT NULL,
              `tid` bigint(18) unsigned NOT NULL,
              `fst` enum('0','1') COLLATE utf8mb4_unicode_ci NOT NULL,
              `sn` mediumint(8) unsigned NOT NULL DEFAULT 1,
              `ls` int(10) unsigned NOT NULL,
              PRIMARY KEY (`uid`,`tid`,`fst`) USING BTREE,
              KEY `ls` (`ls`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
            

            > Can you provide some SQL for reproducing this?

            I'm not sure if I can easily provide some SQL that will visibly reproduce this.

            Basically we have LOADS of INSERT queries every day, adding random: UID, TID, fst=1 or 2, ls=UNIX_TIMESTAMP()
            Then a number of UPDATE queries incrementing "sn"

            Once a day, DELETE from this table where ls < 5 months.

            Hope this makes sense.

            Thank you very much.

            nunop Nuno added a comment - - edited marko - thank you for the answer and references. I said above that the records on this table are regularly expired. Every day, there are new records being inserted all the time, and then once a day, old records deleted. Am I right to assume that the space "reclaimed" from the old records should be used for new records? Note also that the PRIMARY KEY is not an autoincrement. It's based on two foreign IDs (e.g. User ID <-> Post ID) I wonder if this could somehow be the reason for bigger fragmentation, since the records aren't added in a consistent order. CREATE TABLE `mytable` ( `uid` bigint (18) unsigned NOT NULL , `tid` bigint (18) unsigned NOT NULL , `fst` enum( '0' , '1' ) COLLATE utf8mb4_unicode_ci NOT NULL , `sn` mediumint(8) unsigned NOT NULL DEFAULT 1, `ls` int (10) unsigned NOT NULL , PRIMARY KEY (`uid`,`tid`,`fst`) USING BTREE, KEY `ls` (`ls`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE =utf8mb4_unicode_ci > Can you provide some SQL for reproducing this? I'm not sure if I can easily provide some SQL that will visibly reproduce this. Basically we have LOADS of INSERT queries every day, adding random: UID, TID, fst=1 or 2, ls=UNIX_TIMESTAMP() Then a number of UPDATE queries incrementing "sn" Once a day, DELETE from this table where ls < 5 months. Hope this makes sense. Thank you very much.

            MDEV-32663 is a similar report and possibly already fixed in the upcoming quarterly releases.

            marko Marko Mäkelä added a comment - MDEV-32663 is a similar report and possibly already fixed in the upcoming quarterly releases.

            People

              Unassigned Unassigned
              nunop Nuno
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.