Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16249

CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default

Details

    Description

      When doing 'CHECKSUM TABLE t' on a partitioned spider table it will fetch all rows from the different data nodes sequentially and store the result on the spider head. On very large tables the mysqld process will be killed due to OOM (without a trace in the error log).

      One suggested workaround is to set spider_quick_mode = 3 before running such statement, but we would prefer that the command is sent to each data node and executed in parallel and then aggregate (xor?) the result on the spider head.

      This appears to be a specific case of the more general issue that a large result may cause an Out-Of-Memory on the spider head. This should never be the case, and thus we would prefer that spider have an upper limit on how much results it can cache on the spider head, or some other way to avoid a valid query causing a server crash due to out of memory.

      Attachments

        Issue Links

          Activity

            mattiasjonsson Mattias Jonsson created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Fix Version/s 10.3 [ 22126 ]
            Assignee Jacob Mathew [ jacob-mathew ]
            mattiasjonsson Mattias Jonsson made changes -
            Attachment heap_profile.pdf [ 45673 ]

            The attached heap_profile.pdf was created by:
            export HEAPPROFILE=/tmp/mybin.hprof
            export LD_PRELOAD="/usr/lib64/libtcmalloc_and_profiler.so.4"

            start mysqld

            ran the CHECKSUM TABLE big_table and then generated a pdf by:
            sudo pprof --base=/tmp/mybin.hprof.0554.heap --pdf /usr/local/mysql/bin/mysqld /tmp/mybin.hprof.0632.heap > ~/heap_profile.pdf

            mattiasjonsson Mattias Jonsson added a comment - The attached heap_profile.pdf was created by: export HEAPPROFILE=/tmp/mybin.hprof export LD_PRELOAD="/usr/lib64/libtcmalloc_and_profiler.so.4" start mysqld ran the CHECKSUM TABLE big_table and then generated a pdf by: sudo pprof --base=/tmp/mybin.hprof.0554.heap --pdf /usr/local/mysql/bin/mysqld /tmp/mybin.hprof.0632.heap > ~/heap_profile.pdf
            Eric_Herman Eric Herman added a comment -

            Checksum per-partition has been requested for Oracle MySQL as well: https://bugs.mysql.com/bug.php?id=78733

            Eric_Herman Eric Herman added a comment - Checksum per-partition has been requested for Oracle MySQL as well: https://bugs.mysql.com/bug.php?id=78733

            Probably the same issue as the more generic queries in MDEV-16520?

            mattiasjonsson Mattias Jonsson added a comment - Probably the same issue as the more generic queries in MDEV-16520 ?

            Also the server crashes if using spider_quick_mode = 3: MDEV-16521

            mattiasjonsson Mattias Jonsson added a comment - Also the server crashes if using spider_quick_mode = 3: MDEV-16521
            jacob-mathew Jacob Mathew (Inactive) made changes -
            jacob-mathew Jacob Mathew (Inactive) made changes -
            ratzpo Rasmus Johansson (Inactive) made changes -
            Assignee Jacob Mathew [ jacob-mathew ] Kentoku [ kentoku ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            Kentoku Kentoku Shiba (Inactive) added a comment - - edited

            Add handler methods for pushing down checksum table with options.
            This changes are included small server side changes, please review about this.
            60ed52f

            Kentoku Kentoku Shiba (Inactive) added a comment - - edited Add handler methods for pushing down checksum table with options. This changes are included small server side changes, please review about this. 60ed52f
            Kentoku Kentoku Shiba (Inactive) made changes -
            Assignee Kentoku [ kentoku ] Michael Widenius [ monty ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            julien.fritsch Julien Fritsch made changes -
            Priority Major [ 3 ] Critical [ 2 ]

            I'd suggest a different approach. Make existing handler::checksum() method to calculate the checksum in the engine, the slow way. Move current implementation from sql_table.cc to handler::checksum().

            And to get the fast checksum value from the engine — use handler::info(), not handler::checksum().

            serg Sergei Golubchik added a comment - I'd suggest a different approach. Make existing handler::checksum() method to calculate the checksum in the engine, the slow way. Move current implementation from sql_table.cc to handler::checksum(). And to get the fast checksum value from the engine — use handler::info(), not handler::checksum().
            serg Sergei Golubchik made changes -
            Fix Version/s 10.4 [ 22408 ]

            serg

            > And to get the fast checksum value from the engine — use handler::info(), not handler::checksum().

            Currently, hsndler::info() does not return checksum value. Should I add checksum value into ha_statistics class?
            Also, I think it requires to move logic from handler::checksum() to handler::info() of all storage engines. Would it possible to I avoid it?

            Kentoku Kentoku Shiba (Inactive) added a comment - serg > And to get the fast checksum value from the engine — use handler::info(), not handler::checksum(). Currently, hsndler::info() does not return checksum value. Should I add checksum value into ha_statistics class? Also, I think it requires to move logic from handler::checksum() to handler::info() of all storage engines. Would it possible to I avoid it?
            serg Sergei Golubchik made changes -
            Assignee Michael Widenius [ monty ] Sergei Golubchik [ serg ]

            Yes, that's what I mean, moving checksum to handler::info(), ha_statistics looks like a good place.

            Only myisam and aria support live checksum, there are no other engines that are affected, as far as I can see.

            serg Sergei Golubchik added a comment - Yes, that's what I mean, moving checksum to handler::info(), ha_statistics looks like a good place. Only myisam and aria support live checksum, there are no other engines that are affected, as far as I can see.
            serg Sergei Golubchik made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            serg Sergei Golubchik made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            serg Sergei Golubchik made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]

            checksum API refactoring is pushed into 10.3, commit ffb83ba6502

            serg Sergei Golubchik added a comment - checksum API refactoring is pushed into 10.3, commit ffb83ba6502
            serg Sergei Golubchik made changes -
            Assignee Sergei Golubchik [ serg ] Kentoku [ kentoku ]
            Kentoku Kentoku Shiba (Inactive) added a comment - - edited

            serg
            I added checksum_null parameter to ha_statistics, because Spider have to set null value if Spider get null from data nodes. Would you please review this again?
            a252067

            Kentoku Kentoku Shiba (Inactive) added a comment - - edited serg I added checksum_null parameter to ha_statistics, because Spider have to set null value if Spider get null from data nodes. Would you please review this again? a252067
            Kentoku Kentoku Shiba (Inactive) made changes -
            Assignee Kentoku [ kentoku ] Sergei Golubchik [ serg ]
            Status Stalled [ 10000 ] In Review [ 10002 ]
            serg Sergei Golubchik made changes -
            Assignee Sergei Golubchik [ serg ] Kentoku [ kentoku ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            Kentoku Kentoku Shiba (Inactive) made changes -
            issue.field.resolutiondate 2019-06-10 15:26:34.0 2019-06-10 15:26:34.834
            Kentoku Kentoku Shiba (Inactive) made changes -
            Fix Version/s 10.4.6 [ 23412 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            GeoffMontee Geoff Montee (Inactive) made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 87390 ] MariaDB v4 [ 154405 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 177946

            People

              Kentoku Kentoku Shiba (Inactive)
              mattiasjonsson Mattias Jonsson
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.