Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-3379

join happened on the UM instead of being pushed down to the PM

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Stalled (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.4
    • Fix Version/s: 22.08.3
    • Component/s: Connector
    • Labels:
    • Environment:
      2um 2pm with local query

      Description

      Reported and ask by customer:

      Inquiry, it seems that the optimizer should have pushed this join to the PM instead of leaving it on the UM, unless somehow this is a CTE requirement ? Can we make those trace more clear about where a join is happening and when data is transfered back and forth ? Also Once on the UM, it could be pushed down to the PM by block of PmMaxMemorySmallSide ??

      I am playing with the query in this ticket, and removed the limit to transform all the data (could go to 15Billion rows), and with the query as it is we are hitting a um memory join limit (on the local pm).

      the trace looks like this, and it wasn't very clear to me why and where there was a UM join, the last HJS happens on the PM...but the actual join happens on the UM side, also a hint is the TNS step before on the UM transferring 1836811 rows to the UM :

      Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows
      BPS PM b 14516 (batch_name) 0 68 0 0.002 675
      BPS PM c 16955 (batch_name,num_sample_id,sample_id) 1347 2012 0 0.058 155273
      HJS PM c-b 16955 - - - - ----- -
      TNS UM - - - - - - 0.038 100
      BPS PM b 14516 (batch_name,start_rid) 123 81 0 0.060 675
      BPS PM c 16955 (batch_name,num_sample_id) 1678 1431 0 0.097 155273
      HJS PM c-b 16955 - - - - ----- -
      TNS UM - - - - - - 0.066 100
      TAS UM - - - - - - 0.000 1
      TNS UM - - - - - - 0.000 1
      BPS PM b 14516 (batch_name,end_rid) 59 78 0 0.047 675
      BPS PM c 16955 (batch_name,num_sample_id) 0 1419 0 0.028 155273
      HJS PM c-b 16955 - - - - ----- -
      TNS UM - - - - - - 0.023 100
      TAS UM - - - - - - 0.000 1
      TNS UM - - - - - - 0.000 1
      BPS PM a 14505 (allele1,allele2,batch_name,platform_id,row_id,sample_id,snp_name) 26627 27862 2986068 0.638 1836811
      HJS PM a-b 14505 - - - - ----- -
      TNS UM - - - - - - 0.472 1836811
      BPS PM d 16123 (assay_id,platform_id,snp_name) 6899 6396 0 1.089 53983
      HJS PM d-a 16123 - - - - ----- -
      TNS UM - - - - - - 1.730 1795249
      TNS UM - - - - - - 4.008 1795249

      after some rework, I managed to pushdown that join to the PM only avoid the memory limitation. now I can transform the data without limit of rows (memory) and this trace all happens on the PM at the end. :

      Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows
      BPS PM b 14516 (batch_name) 0 68 0 0.002 675
      BPS PM c 16955 (batch_name,num_sample_id,sample_id) 1347 2012 0 0.137 155273
      HJS PM c-b 16955 - - - - ----- -
      TNS UM - - - - - - 0.045 100
      BPS PM b 14516 (batch_name,start_rid) 123 81 0 0.072 675
      BPS PM c 16955 (batch_name,num_sample_id) 1678 1435 0 0.128 155273
      HJS PM c-b 16955 - - - - ----- -
      TNS UM - - - - - - 0.035 100
      TAS UM - - - - - - 0.000 1
      TNS UM - - - - - - 0.000 1
      BPS PM b 14516 (batch_name,end_rid) 59 78 0 0.020 675
      BPS PM c 16955 (batch_name,num_sample_id) 0 1419 0 0.028 155273
      HJS PM c-b 16955 - - - - ----- -
      TNS UM - - - - - - 0.023 100
      TAS UM - - - - - - 0.000 1
      TNS UM - - - - - - 0.000 1
      BPS PM a 14505 (allele1,allele2,batch_name,platform_id,row_id,sample_id,snp_name) 22019 21725 746519 1.161 1836811
      HJS PM a-b 14505 - - - - ----- -
      BPS PM d 16123 (assay_id,platform_id,snp_name) 6899 6398 0 1.350 53983
      HJS PM d-a 16123 - - - - ----- -
      TNS UM - - - - - - 0.673 1795249
      TNS UM - - - - - - 1.634 1795249
      TNS UM - - - - - - 3.726 1795249

        Attachments

          Activity

            People

            Assignee:
            Unassigned Unassigned
            Reporter:
            hill David Hill (Inactive)
            Votes:
            1 Vote for this issue
            Watchers:
            6 Start watching this issue

              Dates

              Created:
              Updated:

                Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.