Here is how the corresponding chapter could be re-written:
Random Access of ODBC Tables
Note: In CONNECT version 1.03 ODBC tables are not indexable. Version 1.04 adds remote indexing facility to the ODBC table type.
However, some queries require random access to an ODBC table; for instance, when it is joined to another table or used in an order by queries applied to a long column or large tables.
There are several ways to enable random (position) access to a CONNECT ODBC table. They are depending on eventual indexing or from the following table options:
Option |
Type |
Used For |
Block_Size |
Integer |
Specifying the rowset size. |
Memory* |
Integer |
Storing the result set in memory. |
Scrollable* |
Boolean |
Using a scrollable cursor. |
: To be specified in the option_list.
Using memory means that the data retrieved from the remote table will be locally memory stored. It is not always possible, for instance with large tables, but is the faster way to deal with random access.
When dealing with small tables, the simpler way to enable random access is to specify a rowset size equal or larger than the table size (or the result set size if a push down where clause is used). This means that the whole result is in memory on the first FETCH and CONNECT will use it for further positional accesses.
Another way to have the result set in memory is to use the MEMORY option. This option can be set to the following values:
0 - No memory used (the default). Best when the table is read sequentially as in SELECT statements with only eventual WHERE clauses.
1 - Memory size required is calculated during the first sequential table read. The allocated memory is filled during the second sequential read. Then the table rows are retrieved from the memory. This to be used when the table will be accessed many times randomly such as in sub-select or being the target table of a join.
2 - A first query is executed to get the result set size and the needed memory is allocated. It is filled on the first sequential reading. Then random access of the table is possible. This can be used in case of ORDER BY clauses when MariaDB uses position reading.
Note that the best way to handle ORDER BY is to set the max_length_for_sort_data variable to a larger value (its default value is 1024 that is pretty small). Indeed, it requires less memory to be used, particularly when a WHERE clause limits the retrieved data set. This is because in the case of an ORDER BY query, MariaDB firstly retrieves sequentially the result set and the position of each records. Often the sort can be done from that result set if it is not too big. But if it is too big, or if it implies some “long” columns, only the positions are sorted and MariaDB retrieves the final result from the table read in random order. If setting the max_length_for_sort_data variable is not feasible or does not work, to be able to retrieve table data from memory after the first sequential read, the MEMORY option must be set to 2.
For tables too large to be stored in memory another possibility is to make your table to use a scrollable cursor. In this case each randomly accessed row can be retrieved from the data source specifying its cursor position, which is reasonably fast. However, scrollable cursors are not supported by all data sources.
From CONNECT version 1.04, another way to provide random access is to specify some columns to be indexed. This should be done only when the corresponding column of the source table is also indexed. This can be used for tables too large to be stored in memory and is similar to the remote indexing used by the MYSQL table type and by the FEDERATED engine.
Then remains the possibility to sequentially extract the requested data from the external table and to construct another local table from the data source. For instance, to construct a table containing the CUSTOMER table data, create the table as:
create table Custloc as select * from customer;
|
Now you can use custloc for fast database operations on the copied customer table data, eventually restricted by a WHERE clause.
Sorry. I don't know why, but the online documentation has shifted numbers from the original document. You can find it attach here. Connect-1.6.8.docx
It should say:
0 - No memory used.
1 - Memory size required is calculated during the first sequential table read. The allocated memory is filled during the second sequential read. Then the table rows are retrieved from memory.
2 - A first query is executed to get the result set size and the needed memory is allocated. It is filled on the first sequential reading. Then random access of the table is possible.
Let me know whether it is still unclear.