Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Won't Do
    • None
    • N/A
    • N/A
    • None

    Description

      When running the query:

      MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-01-19' limit 10;

      and this is what I got back:

      ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.
      I expected this amount of records to come back (well only 10 as I limited it but at least they would travel from PM to UM and I would know that it is OK):

      MariaDB [tradealert]> SELECT count(1) FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-01-19';
      ----------

      count(1)

      ----------

      1501056

      ----------
      1 row in set (3.22 sec)

      In the stretch of all market dates of Q1 2018 it happened on Jan 19., Feb 5., Feb. 21. and from Mar 5. all the way through Mar 29. All other queries for all other days in this stretch (across the Q1) returned valid and full resultsets.

      Attached is zipped support report file.

      We also ran redistributeData but there was not much to distribute because of the configuration. So it ran for 10 ms and it did not help.
      after that we created 2 extra DBROOTS and then re-run redistributeData - this time successfully but it did not help resolving an issue.

      Attachments

        Issue Links

          Activity

            The problem here is that you are using NFS mounts and particularly S3 NFS mounts do something very odd with file times that break ColumnStore. I highly recommend you do not use NFS for any database storage backend.

            For eng team:

            Config::checkReload() is using mtime which S3 NFS mounts often set to zero. Therefore the config never gets loaded in and the DBRoot part is empty upon file access. We should maybe do a checksum based test instead? Or at least don't rely on mtime being correct on a shared/distributed FS.

            LinuxJedi Andrew Hutchings (Inactive) added a comment - The problem here is that you are using NFS mounts and particularly S3 NFS mounts do something very odd with file times that break ColumnStore. I highly recommend you do not use NFS for any database storage backend. For eng team: Config::checkReload() is using mtime which S3 NFS mounts often set to zero. Therefore the config never gets loaded in and the DBRoot part is empty upon file access. We should maybe do a checksum based test instead? Or at least don't rely on mtime being correct on a shared/distributed FS.

            So that we may move forward, I have done away with the EFS file systems and setup EBS (io1) volumes for dbroots 1,3 (pm1) and dbroots 2,4 (pm2).

            jdutch John Dutchover added a comment - So that we may move forward, I have done away with the EFS file systems and setup EBS (io1) volumes for dbroots 1,3 (pm1) and dbroots 2,4 (pm2).
            madamec Martin Adamec added a comment -

            Andrew,
            Please help me figuring out what could be still wrong with our installation. We rebuilt the storage, I removed databases running drop database xyz from UM1, recreated them, recreated tables and started loading data from scratch.

            There was one "weirdness" I noticed at the very initial loading. mcsimport reported in few cases quite a lot of invalid records. here is a copy of the reported result:
            Execution time: 0.817828s
            Rows inserted: 10064
            Truncation count: 0
            Saturated count: 0
            Invalid count: 0
            cat calendar-datadock_2018_Q1.csv | wc -l
            10064

            Execution time: 0.161977s
            Rows inserted: 2077
            Truncation count: 0
            Saturated count: 0
            Invalid count: 1
            cat groups-datadock_2018_Q1.csv | wc -l
            2077

            Execution time: 4281.18s
            Rows inserted: 87483118
            Truncation count: 0
            Saturated count: 0
            Invalid count: 32809561
            cat onelots-datadock_2018_Q1.csv | wc -l
            87483118

            Execution time: 1043.61s
            Rows inserted: 53818130
            Truncation count: 0
            Saturated count: 0
            Invalid count: 38551020
            cat open_interest-datadock_2018_Q1.csv | wc -l
            53818130

            Execution time: 0.481345s
            Rows inserted: 17387
            Truncation count: 0
            Saturated count: 0
            Invalid count: 159
            cat optimal_ex-datadock_2018_Q1.csv | wc -l
            17387

            Execution time: 2718.69s
            Rows inserted: 101144576
            Truncation count: 0
            Saturated count: 0
            Invalid count: 146538555
            cat rates-datadock_2018_Q1.csv | wc -l
            101144576

            Execution time: 91.0303s
            Rows inserted: 4278121
            Truncation count: 0
            Saturated count: 0
            Invalid count: 34791
            cat securities-datadock_2018_Q1.csv | wc -l
            4278121

            Execution time: 38.2951s
            Rows inserted: 264387
            Truncation count: 0
            Saturated count: 0
            Invalid count: 17
            cat underlyings-datadock_2018_Q1.csv | wc -l
            264387

            Execution time: 26.1987s
            Rows inserted: 494710
            Truncation count: 61
            Saturated count: 0
            Invalid count: 1940652
            cat usec_hist-datadock_2018_Q1.csv | wc -l
            494710

            I was looking for some detail LOG similar to csimport but did not find any. could you point me to the place I should be able to find these detail logs? I tried to check loaded data and ran few queries that counted rows also did come conditional counting and it all looked OK to me. Obviously that was not enough to be 100% sure that everything loaded correctly. If there are no logs it would be probably enough to know what triggeres that invalid count to grow when loading using mcsimport.

            What is more scary is my next finding:
            We had a case of missing files exactly the same as we had before when I opened this ticket for the first time. There is only a little difference between then and now. Not all dates between march 5 and 29 are with missing files. We have these files missing:

            {{MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-01-19';
            ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

            MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-02-05';
            ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

            MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-02-21';
            ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

            MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-03-09' limit 10;
            ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

            MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-03-27' limit 10;
            ------------------------------------------------------------------------------------------------------------

            option_id market_date time size price spot open_int theo ivol delta vega theta secid

            ------------------------------------------------------------------------------------------------------------

            55344 2018-03-27 15:45:41.313 1 87.7 1002.01 1180 88.8097 0.268411 -0.432299 3.50937 -0.141225 20393363
            5897 2018-03-27 15:45:41.325 1 59.05 1489 3039 60.4389 0.35868 0.374017 2.64117 -0.617602 23051456
            1270919 2018-03-27 15:45:41.331 1 0.26 220.71 4681 0.261891 0.828654 0.047409 0.016148 -0.233111 25047193
            1493877 2018-03-27 15:45:41.333 1 12.25 220.71 1201 11.9327 0.631092 0.620184 0.139078 -0.453351 25269065
            1653542 2018-03-27 15:45:41.375 1 2.55 277.91 17 2.44285 0.742268 0.277963 0.069004 -1.39039 25430466
            1548574 2018-03-27 15:45:41.388 1 9.85 213.22 60 9.61333 0.316144 -0.662974 0.162157 -0.154413 25314047
            228803 2018-03-27 15:45:41.401 2 6.61 1002.01 5553 6.70226 0.336513 0.152676 0.606164 -0.430907 23298935
            1697220 2018-03-27 15:45:41.444 1 0.26 6.905 668 0.238412 0.594428 0.29706 0.009021 -0.005256 25507298
            1697220 2018-03-27 15:45:41.445 2 0.26 6.905 668 0.238412 0.594428 0.29706 0.009021 -0.005256 25507298
            1697220 2018-03-27 15:45:41.445 1 0.26 6.905 668 0.238412 0.594428 0.29706 0.009021 -0.005256 25507298

            ------------------------------------------------------------------------------------------------------------
            10 rows in set (3.59 sec)

            MariaDB [tradealert]> SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-03-27';
            ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.
            }}

            I ran the query one time with limit 10 and another time without and I had 2 different results for March 27.

            We need to know what is causing this issue to come back. Without knowing the reason it makes the whole environment unpredictable for us and that is very risky proposition.

            I attached a support report file (labeled with today's date for easier recognition) to document what we currently have configured.
            columnstoreSupportReport_20181104.csdb-pg1.tar.gz

            madamec Martin Adamec added a comment - Andrew, Please help me figuring out what could be still wrong with our installation. We rebuilt the storage, I removed databases running drop database xyz from UM1, recreated them, recreated tables and started loading data from scratch. There was one "weirdness" I noticed at the very initial loading. mcsimport reported in few cases quite a lot of invalid records. here is a copy of the reported result: Execution time: 0.817828s Rows inserted: 10064 Truncation count: 0 Saturated count: 0 Invalid count: 0 cat calendar-datadock_2018_Q1.csv | wc -l 10064 Execution time: 0.161977s Rows inserted: 2077 Truncation count: 0 Saturated count: 0 Invalid count: 1 cat groups-datadock_2018_Q1.csv | wc -l 2077 Execution time: 4281.18s Rows inserted: 87483118 Truncation count: 0 Saturated count: 0 Invalid count: 32809561 cat onelots-datadock_2018_Q1.csv | wc -l 87483118 Execution time: 1043.61s Rows inserted: 53818130 Truncation count: 0 Saturated count: 0 Invalid count: 38551020 cat open_interest-datadock_2018_Q1.csv | wc -l 53818130 Execution time: 0.481345s Rows inserted: 17387 Truncation count: 0 Saturated count: 0 Invalid count: 159 cat optimal_ex-datadock_2018_Q1.csv | wc -l 17387 Execution time: 2718.69s Rows inserted: 101144576 Truncation count: 0 Saturated count: 0 Invalid count: 146538555 cat rates-datadock_2018_Q1.csv | wc -l 101144576 Execution time: 91.0303s Rows inserted: 4278121 Truncation count: 0 Saturated count: 0 Invalid count: 34791 cat securities-datadock_2018_Q1.csv | wc -l 4278121 Execution time: 38.2951s Rows inserted: 264387 Truncation count: 0 Saturated count: 0 Invalid count: 17 cat underlyings-datadock_2018_Q1.csv | wc -l 264387 Execution time: 26.1987s Rows inserted: 494710 Truncation count: 61 Saturated count: 0 Invalid count: 1940652 cat usec_hist-datadock_2018_Q1.csv | wc -l 494710 I was looking for some detail LOG similar to csimport but did not find any. could you point me to the place I should be able to find these detail logs? I tried to check loaded data and ran few queries that counted rows also did come conditional counting and it all looked OK to me. Obviously that was not enough to be 100% sure that everything loaded correctly. If there are no logs it would be probably enough to know what triggeres that invalid count to grow when loading using mcsimport. What is more scary is my next finding: We had a case of missing files exactly the same as we had before when I opened this ticket for the first time. There is only a little difference between then and now. Not all dates between march 5 and 29 are with missing files. We have these files missing: {{MariaDB [tradealert] > SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-01-19'; ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information. MariaDB [tradealert] > SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-02-05'; ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information. MariaDB [tradealert] > SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-02-21'; ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information. MariaDB [tradealert] > SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-03-09' limit 10; ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information. MariaDB [tradealert] > SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-03-27' limit 10; ---------- ----------- ------------ ---- ----- ------- -------- -------- -------- --------- -------- --------- --------- option_id market_date time size price spot open_int theo ivol delta vega theta secid ---------- ----------- ------------ ---- ----- ------- -------- -------- -------- --------- -------- --------- --------- 55344 2018-03-27 15:45:41.313 1 87.7 1002.01 1180 88.8097 0.268411 -0.432299 3.50937 -0.141225 20393363 5897 2018-03-27 15:45:41.325 1 59.05 1489 3039 60.4389 0.35868 0.374017 2.64117 -0.617602 23051456 1270919 2018-03-27 15:45:41.331 1 0.26 220.71 4681 0.261891 0.828654 0.047409 0.016148 -0.233111 25047193 1493877 2018-03-27 15:45:41.333 1 12.25 220.71 1201 11.9327 0.631092 0.620184 0.139078 -0.453351 25269065 1653542 2018-03-27 15:45:41.375 1 2.55 277.91 17 2.44285 0.742268 0.277963 0.069004 -1.39039 25430466 1548574 2018-03-27 15:45:41.388 1 9.85 213.22 60 9.61333 0.316144 -0.662974 0.162157 -0.154413 25314047 228803 2018-03-27 15:45:41.401 2 6.61 1002.01 5553 6.70226 0.336513 0.152676 0.606164 -0.430907 23298935 1697220 2018-03-27 15:45:41.444 1 0.26 6.905 668 0.238412 0.594428 0.29706 0.009021 -0.005256 25507298 1697220 2018-03-27 15:45:41.445 2 0.26 6.905 668 0.238412 0.594428 0.29706 0.009021 -0.005256 25507298 1697220 2018-03-27 15:45:41.445 1 0.26 6.905 668 0.238412 0.594428 0.29706 0.009021 -0.005256 25507298 ---------- ----------- ------------ ---- ----- ------- -------- -------- -------- --------- -------- --------- --------- 10 rows in set (3.59 sec) MariaDB [tradealert] > SELECT sio.`id` AS option_id, o.`date` AS market_date, o.`time`, o.`size`, o.`price`, o.`spot`, o.open_int, o.theo, o.ivol, o.delta, o.vega, o.theta, o.`secid` FROM onelots AS o LEFT JOIN securities_occ AS so ON o.secid = so.secid LEFT JOIN dds_main.`sec_instrument_option` AS sio ON so.`occ_identifier` = sio.`occ_identifier` WHERE o.date = '2018-03-27'; ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information. }} I ran the query one time with limit 10 and another time without and I had 2 different results for March 27. We need to know what is causing this issue to come back. Without knowing the reason it makes the whole environment unpredictable for us and that is very risky proposition. I attached a support report file (labeled with today's date for easier recognition) to document what we currently have configured. columnstoreSupportReport_20181104.csdb-pg1.tar.gz

            How did you move from one storage type to the other? It looks like you may have just moved the data files over to the new storage and started it? In which case you are trying to start with already damaged files. I highly recommend a clean install.

            As for the invalid count. Please file a new ticket for this and attach an example CSV file and schema for it so we can take a look.

            LinuxJedi Andrew Hutchings (Inactive) added a comment - How did you move from one storage type to the other? It looks like you may have just moved the data files over to the new storage and started it? In which case you are trying to start with already damaged files. I highly recommend a clean install. As for the invalid count. Please file a new ticket for this and attach an example CSV file and schema for it so we can take a look.

            The data had been retained when moved to EBS. It was a misunderstanding on my part.

            A clean installation has been completed. I generated a support file columnstoreSupportReport.cstore-pg1c.tar.gz before turning over to @madamec. Note the new system name is cstore-pg1c to distinguish from previous install.

            jdutch John Dutchover added a comment - The data had been retained when moved to EBS. It was a misunderstanding on my part. A clean installation has been completed. I generated a support file columnstoreSupportReport.cstore-pg1c.tar.gz before turning over to @madamec. Note the new system name is cstore-pg1c to distinguish from previous install.

            People

              Unassigned Unassigned
              madamec Martin Adamec
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.