MariaDB ColumnStore / MCOL-6432

ASan/LSan memory leak in anyNullInTheColumn(): RowGroup never freed on ALTER TABLE ... NOT NULL

Details

    • Bug
    • Status: In Testing
    • Major
    • Resolution: Unresolved
    • 2026-7

    Description

      Summary

      AddressSanitizer / LeakSanitizer detected a memory leak in the ColumnStore plugin during the CI regression run. A rowgroup::RowGroup object (and its internal buffers) allocated in anyNullInTheColumn() is never freed on any code path. The leak occurs every time an ALTER TABLE ... CHANGE/MODIFY <col> ... NOT NULL is executed against a ColumnStore table (i.e. whenever a NOT NULL constraint is added to an existing column).

      • Detected by: ASan/LSan, CI build #3840, stage #22 (stable-23.10 ubuntu:24.04 amd64 10.6-enterprise ASanRegr)
      • Build: https://ci.columnstore.mariadb.net/mariadb-corporation/mariadb-columnstore-engine/3840 (a refactored CI pipeline that splits the sanitizer tests so that they fit within the time limits)
      • Component build: ColumnStore 25.10.5, MariaDB 10.6.26.22, Ubuntu 24.04, gcc-13
      • Total leaked: 1738 byte(s) in 32 allocation(s) (deterministic — see below)

      Where / when it was caught

      The leak is reported by LeakSanitizer at MariaDB server shutdown (LSan runs its leak check at process exit), after the last test (test500). The process that produced the report is the server itself: asan.mariadb.755159.

      The leaked allocations are introduced by test005 — "Working DML Test", specifically the query file mysql/queries/working_dml/misc/notnullconstraint.negative.sql, which contains the two statements that drive the leaking code path:

      create table notnulltest5 (col_1 bigint) engine=columnstore;
      alter table notnulltest5 change column col_1 col_1 bigint not null;   -- triggers anyNullInTheColumn()
       
      create table notnulltest6 (col_1 bigint) engine=columnstore;
      ...
      alter table notnulltest6 change column col_2 col_2 bigint not null;   -- triggers anyNullInTheColumn()
      

      This matches the report exactly: every leak block in the report reads "in 2 object(s)", i.e. the leaking path ran exactly twice during the whole regression run, once per ALTER ... NOT NULL above.

      Root cause

      anyNullInTheColumn() in storage/columnstore/columnstore/dbcon/mysql/ha_mcs_ddl.cpp allocates a RowGroup on a raw pointer and never deletes it:

      // ha_mcs_ddl.cpp:667
      rowgroup::RowGroup* rowGroup = 0;
      ...
      // ha_mcs_ddl.cpp:717-727
      if (!rowGroup)
      {
        // This is mete data
        rowGroup = new rowgroup::RowGroup();   // <-- line 720: leaked
        rowGroup->deserialize(msg);            // <-- allocates internal vectors / shared arrays (indirect leaks)
        ...
        continue;
      }
      ...
      return anyRow;                           // <-- line 748: returns without delete rowGroup
      

      There is no delete rowGroup anywhere in the function (confirmed by grep). The function also exits via several throw runtime_error(...) paths, all of which leak the same object once it has been allocated. The indirect leaks in the report (RowGroup::RowGroup() / RowGroup::deserialize() allocating std::vector<uint32_t>, std::vector<charset_info_st const*>, the boost::shared_ptr<bool[]>, etc.) are all owned by this leaked RowGroup.
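      The pattern can be reproduced in isolation. The following is a minimal sketch, not the ColumnStore code: FakeRowGroup, leakyScan, leakedBy and reclaimAll are hypothetical names introduced for illustration, and FakeRowGroup only tracks its live instances instead of deserializing anything. It shows that both the normal return and a throw exit leave one object behind per call.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>

// Hypothetical stand-in for rowgroup::RowGroup: it only tracks live instances.
struct FakeRowGroup
{
  static std::set<FakeRowGroup*> alive;
  FakeRowGroup() { alive.insert(this); }
  ~FakeRowGroup() { alive.erase(this); }
};
std::set<FakeRowGroup*> FakeRowGroup::alive;

// Same shape as the function described above: raw new, no delete on any exit.
bool leakyScan(bool failMidway)
{
  FakeRowGroup* rowGroup = nullptr;
  if (!rowGroup)
    rowGroup = new FakeRowGroup();                 // never deleted
  if (failMidway)
    throw std::runtime_error("simulated failure"); // throw exit: object leaks
  return rowGroup == nullptr;                      // normal exit: object leaks
}

// Returns how many objects a single call leaves behind.
std::size_t leakedBy(bool failMidway)
{
  const std::size_t before = FakeRowGroup::alive.size();
  try
  {
    leakyScan(failMidway);
  }
  catch (const std::runtime_error&)
  {
  }
  return FakeRowGroup::alive.size() - before;
}

// Frees the survivors so the demo exits clean. This is only possible here
// because the stand-in tracks its instances; the real code has no such hook.
void reclaimAll()
{
  while (!FakeRowGroup::alive.empty())
    delete *FakeRowGroup::alive.begin();
}

// Usage: one leaked object per call, on either exit path.
void demoLeak()
{
  assert(leakedBy(false) == 1);
  assert(leakedBy(true) == 1);
  reclaimAll();
}
```

      Two calls leave two objects behind, which mirrors the "in 2 object(s)" count produced by the two ALTER statements in the test.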

      Call path (from the report):

      operator new
        anyNullInTheColumn                ha_mcs_ddl.cpp:720
        ProcessDDLStatement               ha_mcs_ddl.cpp:2046   (DDL_NOT_NULL branch on ALTER)
        ha_mcs_impl_create_               ha_mcs_ddl.cpp:2636
        ha_mcs_impl_create                ha_mcs_impl.cpp:2812
        ha_mcs::create                    ha_mcs.cpp:1078
        ha_mcs_cache::create              ha_mcs.cpp:1440
        handler::ha_create -> ha_create_table -> mysql_alter_table -> Sql_cmd_alter_table::execute
      

      (ALTER TABLE adding a NOT NULL constraint goes through the create path on the rebuilt table; ProcessDDLStatement at line ~2037 detects DDL_NOT_NULL and calls anyNullInTheColumn() to verify no existing rows are NULL.)

      A few minor secondary leaks from the DDL parser (ddlparse() in ddl-gram.cpp:2487/2563, reached via ProcessDDLStatement line 767) appear in the same report and should be reviewed at the same time, though the dominant/clear-cut leak is the RowGroup.

      Suggested fix

      Make rowGroup an owning smart pointer (e.g. std::unique_ptr<rowgroup::RowGroup>) or add delete rowGroup on all exit paths (the return and every throw site). A unique_ptr (or wrapping the body in RAII) is preferred since the function has multiple throw exits. The raw new allocations of SimpleFilter/ConstantColumn/SimpleColumn earlier in the function should also be reviewed for ownership while in the area.
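      A minimal sketch of the unique_ptr variant, under stated assumptions: FakeRowGroup, ownedScan and liveAfter are hypothetical names used for illustration only, and the real function body (message loop, deserialize() call, error handling) is not reproduced. The point is only that the owning pointer releases the object on the return path and on a throw path alike.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for rowgroup::RowGroup: it only counts live instances.
struct FakeRowGroup
{
  static int live;
  FakeRowGroup() { ++live; }
  ~FakeRowGroup() { --live; }
  FakeRowGroup(const FakeRowGroup&) = delete;
  FakeRowGroup& operator=(const FakeRowGroup&) = delete;
};
int FakeRowGroup::live = 0;

// Same control flow as the leaking shape, but the pointer owns the object.
bool ownedScan(bool failMidway)
{
  std::unique_ptr<FakeRowGroup> rowGroup;
  if (!rowGroup)
    rowGroup = std::make_unique<FakeRowGroup>();   // owned from here on
  if (failMidway)
    throw std::runtime_error("simulated failure"); // unwinding destroys rowGroup
  return rowGroup == nullptr;                      // normal exit destroys it too
}

// Returns the number of live objects after one call.
int liveAfter(bool failMidway)
{
  try
  {
    ownedScan(failMidway);
  }
  catch (const std::runtime_error&)
  {
  }
  return FakeRowGroup::live;
}

// Usage: nothing survives on either exit path.
void demoFix()
{
  assert(liveAfter(false) == 0);
  assert(liveAfter(true) == 0);
}
```

      Compared with adding delete at each exit, this needs no change at the individual throw sites, so a throw added later cannot reintroduce the leak.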

      Reproduction (manual)

      On an ASan-instrumented ColumnStore build:

      CREATE TABLE t (a BIGINT) ENGINE=ColumnStore;
      ALTER TABLE t CHANGE COLUMN a a BIGINT NOT NULL;   -- leaks one RowGroup per execution
      

      Shut the server down cleanly; LeakSanitizer reports the leak rooted at anyNullInTheColumn.

      ASan report

      Full LeakSanitizer report attached (asan.mariadb.755159). Durable copy on the CI artifact store:
      https://cspkg.s3.amazonaws.com/stable-23.10/pull_request/3840/10.6-enterprise/amd64/ubuntu24.04_ASanRegr/core/asan.mariadb.755159

      Report tail:

      SUMMARY: AddressSanitizer: 1738 byte(s) leaked in 32 allocation(s).
      

      Related

      • MCOL-6018 — ASAN memory leaks and UBSAN in plugin (broader ASan cleanup)

          People

            vasily.kozhukhovskiy Vasily Kozhukhovskiy
              Time Tracking

                Original Estimate: 2d
                Remaining Estimate: 2d
                Time Spent: Not Specified
