MariaDB ColumnStore / MCOL-5366

Research TPC-DS queries failing due to memory/resource constraints

Details

    • Bug
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • Icebox

    Description

      The following TPC-DS queries need further investigation because they fail with memory-related errors:

      query2:
      On a 64GB RAM, 16 cores, 256GB SSD system:

        [root@tntnatbry-rockylinux8 queries]# mariadb tpc_ds < query2.sql 
        ERROR 1815 (HY000) at line 2: Internal error: (437) MCS-2001: Join or subselect exceeds memory limit.
      

      On a 128GB RAM, 32 cores, 256GB SSD system:

        [root@tntnatbry-rockylinux8-2 queries]# mariadb tpc_ds < query2.sql
        ERROR 1815 (HY000) at line 2: Internal error: MCS-2003: Aggregation/Distinct memory limit is exceeded.
      

      Enabling disk-based aggregation and joins resulted in the query running for over 2.5 hours, at which point it was killed with Ctrl+C.
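
The report does not record how disk-based aggregation and joins were switched on. A minimal sketch of one common way, assuming the `mcsSetConfig` and `mcsGetConfig` utilities and the `RowAggregation`/`AllowDiskBasedAggregation` and `HashJoin`/`AllowDiskBasedJoin` settings are available in the installed ColumnStore version (the commands and values below are not taken from this report):

```shell
# Assumption: run as a user allowed to modify the ColumnStore configuration.

# Allow aggregation to spill to disk when it exceeds the memory limit.
mcsSetConfig RowAggregation AllowDiskBasedAggregation Y

# Allow hash joins to spill to disk when they exceed the memory limit.
mcsSetConfig HashJoin AllowDiskBasedJoin Y

# Read the values back to confirm they were applied.
mcsGetConfig RowAggregation AllowDiskBasedAggregation
mcsGetConfig HashJoin AllowDiskBasedJoin

# A restart of the ColumnStore services may be needed before the
# new values take effect, depending on the version.
```

Recording the exact settings used alongside each failing run would make the results above easier to reproduce.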

      query14a:
      On a 64GB RAM, 16 cores, 256GB SSD system:

        [root@tntnatbry-rockylinux8 queries]# mariadb tpc_ds < query14a.sql

      ^ Query froze

      On a 128GB RAM, 32 cores, 256GB SSD system:

      [root@tntnatbry-rockylinux8-2 queries]# mariadb tpc_ds < query14a.sql
      ERROR 1815 (HY000) at line 2: Internal error: InetStreamSocket::readToMagic: Remote is closed
       
      Dec 14 20:44:56 tntnatbry-rockylinux8-2 env[229238]: Too much memory allocated!
      Dec 14 20:44:56 tntnatbry-rockylinux8-2 env[229238]: ExeMgr[229238]: 56.449486 |0|0|0| C 16 CAL0044: FATAL ERROR: ExeMgr has allocated too much memory! Percent allocation-96, allowed-95. ExeMgr is restarting.
      Dec 14 20:44:58 tntnatbry-rockylinux8-2 ExeMgr[229238]: 56.449486 |0|0|0| C 16 CAL0044: FATAL ERROR: ExeMgr has allocated too much memory! Percent allocation-96, allowed-95. ExeMgr is restarting.
      Dec 14 20:45:39 tntnatbry-rockylinux8-2 env[229238]: Warning: 2072 bytes lost at 0x7f3f6b9d7270, allocated by T@0 at 0x7f5e1027d7a6, 0x7f5e10275b24, 0x7f5e1027b947, 0x7f5e10274d07, 0x7f5e10275296, 0x7f5e0ec3012e, 0x7f5e0ec30161, 0x7f5e10212545
      Dec 14 20:46:10 tntnatbry-rockylinux8-2 kernel: Unspecified invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
      

      ^ PrimProc crash

      With disk-based aggregation and joins enabled:

      [root@tntnatbry-rockylinux8-2 queries]# mariadb tpc_ds < query14a.sql
      ERROR 1815 (HY000) at line 2: Internal error: InetStreamSocket::readToMagic: Remote is closed
       
      Dec 16 23:02:35 tntnatbry-rockylinux8-2 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mcs-primproc.service,task=PrimProc,pid=261137,uid=994
      Dec 16 23:02:35 tntnatbry-rockylinux8-2 kernel: Out of memory: Killed process 261137 (PrimProc) total-vm:276619940kB, anon-rss:129170584kB, file-rss:0kB, shmem-rss:0kB, UID:994 pgtables:334836kB oom_score_adj:0
      Dec 16 23:02:35 tntnatbry-rockylinux8-2 kernel: oom_reaper: reaped process 261137 (PrimProc), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      Dec 16 23:02:35 tntnatbry-rockylinux8-2 env[255052]: Warning: 2072 bytes lost at 0x7f99e79ce870, allocated by T@0 at ??:0, ??:0, ??:0, ??:0, ??:0, 0x7fb3ea57d12e, ??:0, Printing to addr2line failed
      Dec 16 23:02:35 tntnatbry-rockylinux8-2 env[255052]: 0x7fb3ebb5f545
      Dec 16 23:02:38 tntnatbry-rockylinux8-2 kernel: PrimProc invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
      

      ^ The kernel OOM killer killed PrimProc

      query72:
      On a 64GB RAM, 16 cores, 256GB SSD system:

      [root@tntnatbry-rockylinux8 queries]# mariadb tpc_ds < query72.sql
      ERROR 1815 (HY000) at line 2: Internal error: InetStreamSocket::readToMagic: Remote is closed
      

      ^ PrimProc crash

      On a 128GB RAM, 32 cores, 256GB SSD system, both with and without disk-based aggregation and disk-based joins:

      [root@tntnatbry-rockylinux8-2 queries]# mariadb tpc_ds < query72.sql

      ^ Query hung for 1 hr 45 min

      query95:

      [root@tntnatbry-rockylinux8 queries]# mariadb tpc_ds < query95.sql
      ERROR 1815 (HY000) at line 2: Internal error: (437) MCS-2001: Join or subselect exceeds memory limit.
      

      query67a:

      [root@tntnatbry-rockylinux8-2 queries]# mariadb tpc_ds < query67a.sql
      ERROR 1815 (HY000) at line 2: Internal error: TupleAggregateStep::threadedAggregateRowGroups()[19] MCS-2003: Aggregation/Distinct memory limit is exceeded.
      

            People

              Unassigned
              tntnatbry Gagan Goel (Inactive)
              Votes: 1
              Watchers: 6
