  1. MariaDB ColumnStore
  2. MCOL-4794

Combine all processes into a single one. Phase 1.

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.1.1
    • Fix Version/s: 22.08
    • Component/s: None
    • Labels:
      None

      Description

      As of 6.1.1, MCS runs up to 9 processes:

      1. StorageManager (with S3 only)
      2. mcs-loadbrm.py logic
      3. workernode
      4. controllernode (on primary node only)
      5. ExeMgr
      6. PrimProc
      7. WriteEngineServer
      8. DMLProc (on primary node only)
      9. DDLProc (on primary node only)

      The separation:

      • makes the startup process fragile and error-prone.
      • limits the way MCS accounts for and limits its resource consumption.
      • forces the use of shared memory to make metadata visible across processes.
      • forces a single-node installation to copy significant amounts of data multiple times, sending it over the loopback network hop.

      The last two are more important than the first.
      The solution is to combine all processes into one. In phase 1, all services must run in a single runtime.

      The startup procedure will become a sequential, partially conditional startup:

      1. Start StorageManager if needed
      2. Either call mcs-loadbrm.py or move its logic into C++
      3. Depending on whether the node is primary, start workernode 1 or 2 (calls for additional research)
      4. If the node is primary, start controllernode
      5. Start ExeMgr, PrimProc, WriteEngineServer
      6. If the node is primary, start DMLProc and DDLProc
        Further information about the sequence can be deduced from the systemd unit interdependencies.
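
The conditional startup sequence above can be sketched as a function that builds an ordered plan. This is only an illustration under stated assumptions: the function name `startup_plan`, the flags `is_primary` and `uses_s3`, and the step name `load_brm` (standing in for the mcs-loadbrm.py logic) are hypothetical and not part of MCS. Which workernode instance to start is left open, as in the description.

```python
from typing import List


def startup_plan(is_primary: bool, uses_s3: bool) -> List[str]:
    """Return the ordered list of services to start inside the single process."""
    plan: List[str] = []
    if uses_s3:
        # StorageManager is needed with S3 only.
        plan.append("StorageManager")
    # Placeholder for the mcs-loadbrm.py logic (hypothetical step name).
    plan.append("load_brm")
    # Whether this is workernode 1 or 2 still calls for additional research.
    plan.append("workernode")
    if is_primary:
        plan.append("controllernode")
    plan += ["ExeMgr", "PrimProc", "WriteEngineServer"]
    if is_primary:
        plan += ["DMLProc", "DDLProc"]
    return plan


if __name__ == "__main__":
    for step in startup_plan(is_primary=True, uses_s3=False):
        print(step)
```

Keeping the plan as plain data makes the order easy to compare against the systemd unit dependencies before any service is actually started.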

      The shutdown procedure must take into account that, with S3, shared memory has to be saved before StorageManager is shut down inside the overall columnstore process.
      The shutdown must follow the reverse sequence:

      1. DMLProc (on primary node only)
      2. DDLProc (on primary node only)
      3. WriteEngineServer
      4. PrimProc
      5. ExeMgr
      6. controllernode (on primary node only)
      7. workernode
      8. mcs-savebrm.py logic
      9. StorageManager (with S3 only)
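
A minimal sketch of the shutdown order listed above, assuming the same hypothetical flags `is_primary` and `uses_s3`. The names `shutdown_plan`, `run_shutdown` and `save_brm` (standing in for the mcs-savebrm.py logic) are illustrative only. The point it demonstrates is that the BRM save step always precedes the StorageManager stop.

```python
from typing import Callable, Dict, List


def shutdown_plan(is_primary: bool, uses_s3: bool) -> List[str]:
    """Return the ordered list of services to stop, following the list above."""
    plan: List[str] = []
    if is_primary:
        plan += ["DMLProc", "DDLProc"]
    plan += ["WriteEngineServer", "PrimProc", "ExeMgr"]
    if is_primary:
        plan.append("controllernode")
    plan.append("workernode")
    # Placeholder for the mcs-savebrm.py logic (hypothetical step name).
    plan.append("save_brm")
    if uses_s3:
        # StorageManager goes last so that the saved state can still be written.
        plan.append("StorageManager")
    return plan


def run_shutdown(plan: List[str], stoppers: Dict[str, Callable[[], None]]) -> List[str]:
    """Call each stop handler in plan order and return the names that were stopped."""
    stopped: List[str] = []
    for name in plan:
        stoppers[name]()
        stopped.append(name)
    return stopped


if __name__ == "__main__":
    plan = shutdown_plan(is_primary=True, uses_s3=True)
    handlers = {name: (lambda n=name: print("stopping", n)) for name in plan}
    run_shutdown(plan, handlers)
```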

      Phase 1 doesn't affect the way the MCS build process produces and uses .so libraries for different facilities. A future cleanup should reduce the number of .so libraries produced for release builds. It is still wise to use .so objects in day-to-day development because splitting the code into multiple .so files saves recompilation time.

      There are a number of shared or overlapping symbols in different services now, e.g.
      ExeMgr (EM) and PrimProc (PP) both use different kinds of ThreadPools.

      As of 6.1.1, CMAPI delays the controllernode start on the primary until all workernodes' sockets are available. It also delays the DMLProc/DDLProc start until the controllernode socket is available. CMAPI must be changed to allow starting the overall columnstore process without these checks.
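
For readers unfamiliar with this kind of readiness check, the sketch below shows one generic way to wait until a TCP socket accepts connections. It is not CMAPI's actual code: the function name `wait_for_socket`, its parameters, and the polling approach are assumptions made for illustration only.

```python
import socket
import time


def wait_for_socket(host: str, port: int, timeout: float = 5.0,
                    interval: float = 0.1) -> bool:
    """Poll host:port until a TCP connection succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Inside a single combined process, the start order itself can provide this guarantee, which is why such external checks would no longer be needed.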

              People

              Assignee:
              toddstoffel Todd Stoffel
              Reporter:
              drrtuy Roman