Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31535

optimize directory listing for information_schema tables based on privileges

    XMLWordPrintable

Details

    Description

      Usually when INFORMATION_SCHEMA.TABLES (or any other table that is implemented via get_all_tables() function) is queried, it creates a list of all schemas first, then for every schema it creates a list of all files in that schema.

      In certain cases the above is optimized:

      • when a specific table is requested via TABLE_SCHEMA=xxx AND TABLE_NAME=yyy in the WHERE clase — in this case only that one table is opened
      • when a specific schema is requested via TABLE_SCHEMA=xxx — tables for only that schema are listed, the list of all schemas is not created
      • when privileges only allow access to certain schemas — the list of all schemas is created, but tables are listed only for those schemas that pass the privilege check

      Note that in the last case the server still creates a list of all schemas. This can be expensive, if there're thousands of them and the privileges only allow access to one specific schema. It makes sense to treat this case as if the schema name was explicitly specified on the WHERE clause. Almost, because the user will also have access to the INFORMATION_SCHEMA itself, but it's already treated specially anyway.

      That is:

      • if the user does not have global grants that allow to see all schemas, then
      • for every schema-level (and table-level?) grant:
        • if the schema name is not a pattern (does not contain wildcards), directly append this schema name to the list, if the schema exists
      • append "INFORMATION_SCHEMA"

      if the above isn't true — fallback to the directory listing.

      One details: if a user has two or three exact-schema grants, the algorithm above will directly append them all to the list and won't scan. This is all good, but what if there are many thousands of such grants, but they all are for non-existent schemas and only few schemas actually exist? In this (admittedly, artificial) case it'd be faster to scan. I suggest we draw a line at about 10 exact schema grants, and if there're more — fallback to scanning. The practical use case is just one grant anyway. If there will be practical use cases with many exact schema grants and few actually existing schemas — it'll be a separate MDEV, possible solutions will be listed there.

      Attachments

        Issue Links

          Activity

            People

              TheLinuxJedi Andrew Hutchings
              serg Sergei Golubchik
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.