[MDEV-26151] Documentation on spider_casual_read is insufficient/non-existing Created: 2021-07-15 Updated: 2023-10-19 Resolved: 2023-08-02 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Documentation, Storage Engine - Spider |
| Fix Version/s: | N/A |
| Type: | Task | Priority: | Critical |
| Reporter: | Valerii Kravchuk | Assignee: | Anne Strasser (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
It is not clear from this document: https://mariadb.com/docs/reference/mdb/cli/mariadbd/spider-casual-read/ or from the KB: https://mariadb.com/kb/en/spider-server-system-variables/#spider_casual_read what a "connection channel" is, or what settings should be used, and at what level, for typical use cases:
1. Four data nodes overall, with two partitions pointing to the same data node.
2. Four data nodes overall, with three partitions pointing to the same data node.
3. Four data nodes overall, with every partition pointing to a different data node.
4. All partitions pointing to a local table.
So, can you please clarify in the documentation (Enterprise, KB, or both) what value to set for spider_casual_read, and at what level (table or global), for each of these 4 cases? |
| Comments |
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-07-27 ] | ||||||||||||||||||
|
The term "connection channel" refers to a database connection to a data node (the Spider SE may have multiple channels to a single data node). With more partitions on a Spider table, there are more opportunities to exploit parallelism. As the documentation says, this is true even if multiple shards are stored on the same physical server. Note that the benefit depends on how much parallel search reduces the response time, so it is best to measure it in the user's environment. Setting spider_casual_read=X (X in 2, 3, ..., 63) makes the Spider storage engine use the specific channel No. X to each data node. This setting appears to be intended for fine-tuning and can easily result in over-tuning. So, if one wants to benefit from parallel reads, I recommend setting spider_casual_read = 1 in most cases. | ||||||||||||||||||
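For illustration, the recommendation above could be applied on the Spider node roughly as follows. This is a minimal sketch, not a tested configuration: the choice of session versus global scope and the value used for spider_bgs_mode are assumptions made here. A later comment in this ticket notes that casual read only takes effect when bgs mode is on (bgs_mode > 0), which is why that variable is set as well.

```sql
-- Sketch only; scope and the bgs mode value are assumptions, not a verified recipe.

-- Assumption: turn on background search for this session, since casual read
-- is reported in this ticket to take effect only when bgs mode is > 0.
SET SESSION spider_bgs_mode = 1;

-- Let Spider choose the connection channel automatically (value 1),
-- as recommended above for most cases.
SET SESSION spider_casual_read = 1;

-- Assumption: use GLOBAL instead to make this the default for new sessions.
-- SET GLOBAL spider_casual_read = 1;

-- Inspect the values in effect for the current session.
SHOW VARIABLES LIKE 'spider_casual_read';
SHOW VARIABLES LIKE 'spider_bgs_mode';
```

As the comment above says, whether this actually reduces response time should be measured in the user's own environment.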
| Comment by Yuchen Pei [ 2023-06-15 ] | ||||||||||||||||||
|
The best way to know for sure the behaviour of this parameter is to find or create a test covering it. I will be working on this. | ||||||||||||||||||
| Comment by Yuchen Pei [ 2023-07-26 ] | ||||||||||||||||||
|
As I suspected, none of the tests cover the case where To make this task more "interesting", | ||||||||||||||||||
| Comment by Yuchen Pei [ 2023-07-27 ] | ||||||||||||||||||
|
After looking into the code, and creating, running and inspecting 1. it only takes effect when bgs mode is on (bgs_mode > 0), and
3. Whenever casual read takes effect, a bg thread has been created for
While working on this ticket, I also did some code cleanup, which I I will also open a ticket to deprecate and remove this sysvar / Disclaimer: this is not a mathematical proof, and rests on the two Update on [2023-07-31 Mon]: Regarding the assumption * above, I did 1. A relaxed assumption does not hold: we have that the same So what do we do? I think there are two possible approaches: 1. We take casual read at face value, in which case it is pretty I still think 1 is much better, as spider code is too legacy and [1] https://github.com/MariaDB/server/commit/fe5ca4a3f58 Also holyfoot, feel free to chime in if you have thoughts about | ||||||||||||||||||
| Comment by Yuchen Pei [ 2023-08-02 ] | ||||||||||||||||||
|
Following up on discussions with valerii, let me wrap up this Basically, my understanding after testing and analysing the code is Whenever the casual read value is used, a background thread has Take the test spider/bg.direct_aggregate_part[1] for example. This [1] https://github.com/MariaDB/server/blob/11.2/storage/spider/mysql-test/spider/bg/t/direct_aggregate_part.test For more detailed analysis, see my previous comments in this ticket. | ||||||||||||||||||
| Comment by Valerii Kravchuk [ 2023-08-26 ] | ||||||||||||||||||
|
Should I create a new MDEV/task to change the KB (https://mariadb.com/kb/en/spider-server-system-variables/#spider_casual_read) according to the results presented in this MDEV? The current text there still seems misleading. | ||||||||||||||||||
| Comment by Yuchen Pei [ 2023-08-27 ] | ||||||||||||||||||
|
> Should I create a new MDEV/task to change KB
That's fine by me. |