MariaDB Server, MDEV-22572

Unable to access Zip table from multiple Zip files

Details

    Description

      I am not able to query a ZIP table defined over multiple zip files. This is needed when reference data is split across several .zip files to limit their size or to group files by functional domain.

      Creating the TestZip2 table succeeds:

      create table TestZip2 (
        fn varchar(256) not null,          -- name of the zipped file
        cmpsize bigint not null flag=1,    -- compressed size
        uncsize bigint not null flag=2,    -- uncompressed size
        method int not null flag=3,        -- compression method
        date datetime not null flag=4)     -- entry date
      engine=connect table_type=ZIP multiple=1
      file_name='/tmp/CCAMArchiveLatest/CCAM06300_DBF_PART*.zip';

      But any attempt to query the created table fails with an error such as:
      #1296 - Got error 174 'Zipfile open error' from CONNECT

      Examples of the files placed in that folder:
      https://www.ameli.fr/fileadmin/user_upload/documents/CCAM06300_DBF_PART1.zip
      https://www.ameli.fr/fileadmin/user_upload/documents/CCAM06300_DBF_PART2.zip
      https://www.ameli.fr/fileadmin/user_upload/documents/CCAM06300_DBF_PART3.zip

      FYI, fixing this would also help to implement the "Self Referencing Zip Table" epic.

        Activity

          bertrandop Olivier Bertrand added a comment:

          Multiple is not implemented for ZIP tables. Actually, it generally makes no sense: a ZIP table is meant to return the descriptions of the files that are zipped into the zip file. If the multiple zip files contain different internal files, as in your sample files, what should be returned? And if you know that they all contain the same internal files, it suffices to test only one of them.

          bjb Jean-Baptiste BUGEAUD added a comment:

          Hello Olivier,

          It actually makes a lot of sense in an MDM interface context.

          The CCAM reference data (the action codes of the French Assurance Maladie that people are required to use to be reimbursed) is a perfect example of this pattern. The CCAM is updated regularly, and we do not know how many zip files we will get or which zip will hold which table. All we know is where to get the data and that the DBF files containing all the required tables will arrive in a bunch of ZIP files.

          In short, it makes perfect sense when the files are provided as an unknown number of ZIP files, split for whatever reason by the source team. This will be the case in most MDM interfaces, where source teams tend to act as kings of their castle.

          As the consuming system, we have to digest them as a whole, and we can use them as CONNECT DBF tables through, say, a scheduled table creation.

          Having all the metadata at once lets us get the whole list of tables by iterating over the ZIP table, reading the data file entries, and creating the tables with CONNECT (here table TDB). This is really a great feature; thanks to the CONNECT engine, it works like a charm. But since multiple does not work, we have to put an extra shell cron script in front to aggregate the files. That is not so clean, I think, as it adds another skill to the stack and makes maintenance less smooth.

          As multiple works in other cases in CONNECT, I expected it to work for ZIP as well, since the aggregation for other table types looked quite similar to me.

          Can you reopen this?

          bertrandop Olivier Bertrand added a comment:

          Seems to be useful.

          bertrandop Olivier Bertrand added a comment:

          This will list all the zipped files in all the zip files. It is mainly useful when the special column giving the zip file names is added.
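
To make the row set described in that comment concrete, here is a minimal sketch written outside the database in Python. It is not the CONNECT implementation. It only emulates the kind of result a multiple ZIP table could return: one row per zipped file across every archive matching a pattern, with the archive name alongside. The function name `list_zip_entries` and the `zipname` key are hypothetical; the other keys mirror the columns of TestZip2.

```python
import glob
import zipfile
from datetime import datetime


def list_zip_entries(pattern):
    """Return one row per zipped file found in every archive matching pattern.

    Hypothetical helper for illustration only; it does not reflect how the
    CONNECT engine reads ZIP files internally.
    """
    rows = []
    # Sort the matches so the rows come back in a stable archive order.
    for zip_path in sorted(glob.glob(pattern)):
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                rows.append({
                    "zipname": zip_path,              # which archive holds the entry
                    "fn": info.filename,              # name of the zipped file
                    "cmpsize": info.compress_size,    # compressed size in bytes
                    "uncsize": info.file_size,        # uncompressed size in bytes
                    "method": info.compress_type,     # numeric compression method
                    "date": datetime(*info.date_time),
                })
    return rows


if __name__ == "__main__":
    # Path taken from the table definition in the description.
    for row in list_zip_entries("/tmp/CCAMArchiveLatest/CCAM06300_DBF_PART*.zip"):
        print(row["zipname"], row["fn"], row["uncsize"])
```

Keeping the archive name in each row is what lets a consumer tell which zip file a given DBF entry came from, which is the point made about the special column.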

          bertrandop Olivier Bertrand added a comment:

          This fix was not included in MariaDB 10.5.4.

          People

            bertrandop Olivier Bertrand
            bjb Jean-Baptiste BUGEAUD