Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-5041

Table data bloat EMPTY records removal tool

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 1.2.5, 5.6.5, 6.3.1
    • Icebox
    • ?
    • Red Hat Enterprise Linux Server release 7.6
      Mariadb 10.3 CS 1.2.5

    Description

      Hello,

      is it somehow possible to maintain bloat tables?
      One of our Servers needed 1,8T for data, after we reinstalled mariadb and after a fresh ETL the datasize is about 350Gb.
      So we assume that the database is bloat somehow over time.
      mysqlcheck --optimize is not working with Engine ColumnStore.
      ty in advance.
      br Andreas

      Attachments

        Activity

          aeae81 andreas eschbacher created issue -
          toddstoffel Todd Stoffel (Inactive) made changes -
          Field Original Value New Value
          Rank Ranked higher
          toddstoffel Todd Stoffel (Inactive) made changes -
          Affects Version/s Icebox [ 22302 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Affects Version/s 1.2.5 [ 23613 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Fix Version/s Icebox [ 22302 ]
          Affects Version/s 6.2.3 [ 27102 ]
          Affects Version/s Icebox [ 22302 ]
          drrtuy Roman added a comment - - edited

          Greetings aeae81.
          In our columnar format EMPTY records are represented by a dedicated number(depends on the underlying data type). When one deletes a record EMPTY records replaces actual values to tell the record has been removed. However these records are not reused by cpimport or INSERT..SELECT only by INSERT so in most cases they are waste disk space.
          No, we don't have bloat removal tool yet but we have for the tool in mind. The only way ATM is to re-ingest the data.

          drrtuy Roman added a comment - - edited Greetings aeae81 . In our columnar format EMPTY records are represented by a dedicated number(depends on the underlying data type). When one deletes a record EMPTY records replaces actual values to tell the record has been removed. However these records are not reused by cpimport or INSERT..SELECT only by INSERT so in most cases they are waste disk space. No, we don't have bloat removal tool yet but we have for the tool in mind. The only way ATM is to re-ingest the data.
          drrtuy Roman made changes -
          Affects Version/s 6.3.1 [ 25801 ]
          Affects Version/s 5.6.5 [ 27515 ]
          Affects Version/s 1.2.5 [ 23613 ]
          Affects Version/s 6.2.3 [ 27102 ]
          drrtuy Roman made changes -
          Summary Optimize Tables Table data bloat EMPTY records removal tool

          would https://jira.mariadb.org/browse/MCOL-5021 solve the issue with bloated tables?

          aeae81 andreas eschbacher added a comment - would https://jira.mariadb.org/browse/MCOL-5021 solve the issue with bloated tables?

          Re: MCOL-5021. Unfortunately No. This change is to speed up the process of DELETE operations, but it doesn't address the file bloat issue.

          There is a statement to remove entire partitions. But that is tricky as you would have to know that all data in the partition is no longer needed. Generally, if you are deleting random parts of the table, this is very difficult to ascertain.

          David.Hall David Hall (Inactive) added a comment - Re: MCOL-5021 . Unfortunately No. This change is to speed up the process of DELETE operations, but it doesn't address the file bloat issue. There is a statement to remove entire partitions. But that is tricky as you would have to know that all data in the partition is no longer needed. Generally, if you are deleting random parts of the table, this is very difficult to ascertain.

          HI David,

          We found an acceptable solution,
          we created a script that finds bloated tables and creates them again using insert into select.

          br andreas

          aeae81 andreas eschbacher added a comment - HI David, We found an acceptable solution, we created a script that finds bloated tables and creates them again using insert into select. br andreas

          People

            Unassigned Unassigned
            aeae81 andreas eschbacher
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.