[MDEV-17255] New optimizer defaults and ANALYZE TABLE - Jira

Sergei Petrunia created issue - 2018-09-20 14:51

Sergei Petrunia made changes - 2018-09-20 14:51

Field	Original Value	New Value
Link		This issue is part of ~~MDEV-15253~~ [ ~~MDEV-15253~~ ]

Sergei Petrunia made changes - 2018-09-20 14:52

Description

We have ~~MDEV-15253~~, which changes optimizer defaults to include using the histograms and their selectivity.

One of the effects of the new settings is that
{code:sql}
ANALYZE TABLE t1
{code}

will collect EITS stats after ~~MDEV-15253~~.
This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).
However, it is not appropriate for production uses: If {{ANALYZE TABLE t1}} collects EITS stats, it will take much more time (my measurement: 10x time the full table scan). This is BAD.

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to rollback all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have few test coverage.

h2. Solution 2: introduce

We have ~~MDEV-15253~~, which changes optimizer defaults to include using the histograms and their selectivity.

One of the effects of the new settings is that
{code:sql}
ANALYZE TABLE t1
{code}

will collect EITS stats after ~~MDEV-15253~~.
This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).
However, it is not appropriate for production uses: If {{ANALYZE TABLE t1}} collects EITS stats, it will take much more time (my measurement: 10x time the full table scan). This is BAD.

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to rollback all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have few test coverage.

h2. Solution 2: introduce

h2. Solution 3:
Wait until Vicentiu and Teodor are done with EITS-via-sampling.
This is bad as it creates a dependency between these two tasks.

Sergei Petrunia made changes - 2018-09-20 14:59

Description

We have ~~MDEV-15253~~, which changes optimizer defaults to include using the histograms and their selectivity.

One of the effects of the new settings is that
{code:sql}
ANALYZE TABLE t1
{code}

will collect EITS stats after ~~MDEV-15253~~.
This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).
However, it is not appropriate for production uses: If {{ANALYZE TABLE t1}} collects EITS stats, it will take much more time (my measurement: 10x time the full table scan). This is BAD.

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to rollback all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have few test coverage.

h2. Solution 2: Make {{ANALYZE TABLE t1}} collect EITS stats for MTR but not users.
* Let {{use_stat_tables=preferably}} remain what it currently is: {{ANALYZE TABLE t1}} collects EITS stats. MTR will run with this setting.
* Introduce another value of {{use_stat_tables=preferably_for_reads}} (name is tentative). This will be the default for the users. It will mean that {{ANALYZE TABLE t1}} does not collect EITS stats.

(One may argue that this is bad as MTR will run in an environment that's not like the users have. On the other hand, MTR will run with predictable statistical data. MTR used to run with sampled, non-predicatable stats which made `rows` column and query plans unstable)

h2. Solution 3:
Wait until Vicentiu and Teodor are done with EITS-via-sampling.
This is bad as it creates a dependency between these two tasks.

Sergei Petrunia made changes - 2018-09-20 14:59

Description

We have ~~MDEV-15253~~, which changes optimizer defaults to include using the histograms and their selectivity:

{noformat}
optimizer_use_condition_selectivity=4
use_stat_tables=PREFERABLY
{noformat}

One of the effects of the new settings is that
{code:sql}
ANALYZE TABLE t1
{code}

will collect EITS stats after ~~MDEV-15253~~.
This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).
However, it is not appropriate for production uses: If {{ANALYZE TABLE t1}} collects EITS stats, it will take much more time (my measurement: 10x time the full table scan). This is BAD.

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to rollback all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have few test coverage.

h2. Solution 2: Make {{ANALYZE TABLE t1}} collect EITS stats for MTR but not users.
* Let {{use_stat_tables=preferably}} remain what it currently is: {{ANALYZE TABLE t1}} collects EITS stats. MTR will run with this setting.
* Introduce another value of {{use_stat_tables=preferably_for_reads}} (name is tentative). This will be the default for the users. It will mean that {{ANALYZE TABLE t1}} does not collect EITS stats.

(One may argue that this is bad as MTR will run in an environment that's not like the users have. On the other hand, MTR will run with predictable statistical data. MTR used to run with sampled, non-predicatable stats which made `rows` column and query plans unstable)

h2. Solution 3:
Wait until Vicentiu and Teodor are done with EITS-via-sampling.
This is bad as it creates a dependency between these two tasks.

Sergei Petrunia made changes - 2018-09-20 15:00

Description

We have ~~MDEV-15253~~, which changes optimizer defaults to include using the histograms and their selectivity:

{noformat}
optimizer_use_condition_selectivity=4
use_stat_tables=PREFERABLY
{noformat}

One of the effects of the new settings is that statement like
{code:sql}
ANALYZE TABLE t1
{code}

after ~~MDEV-15253~~ will start to collect EITS stats.

This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).

However, it is not appropriate for production uses: If {{ANALYZE TABLE t1}} collects EITS stats, it takes much more time (my measurement: 10x time the full table scan). This is BAD.

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to rollback all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have few test coverage.

h2. Solution 2: Make {{ANALYZE TABLE t1}} collect EITS stats for MTR but not users.
* Let {{use_stat_tables=preferably}} remain what it currently is: {{ANALYZE TABLE t1}} collects EITS stats. MTR will run with this setting.
* Introduce another value of {{use_stat_tables=preferably_for_reads}} (name is tentative). This will be the default for the users. It will mean that {{ANALYZE TABLE t1}} does not collect EITS stats.

(One may argue that this is bad as MTR will run in an environment that's not like the users have. On the other hand, MTR will run with predictable statistical data. MTR used to run with sampled, non-predicatable stats which made `rows` column and query plans unstable)

h2. Solution 3:
Wait until Vicentiu and Teodor are done with EITS-via-sampling.
This is bad as it creates a dependency between these two tasks.

Sergei Petrunia made changes - 2018-09-20 15:01

Description

We have ~~MDEV-15253~~, which changes optimizer defaults to include using the histograms and their selectivity:

{noformat}
optimizer_use_condition_selectivity=4
use_stat_tables=PREFERABLY
{noformat}

One of the effects of the new settings is that statement like
{code:sql}
ANALYZE TABLE t1
{code}

after ~~MDEV-15253~~ will start to collect EITS stats.

This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).

However, it is not appropriate for production uses: If {{ANALYZE TABLE t1}} collects EITS stats, it takes much more time (my measurement: 10x time the full table scan). This is BAD.

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to rollback all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have few test coverage.

h2. Solution 2: Make {{ANALYZE TABLE t1}} collect EITS stats for MTR but not users.
* Let {{use_stat_tables=preferably}} remain what it currently is: {{ANALYZE TABLE t1}} collects EITS stats. MTR will run with this setting.
* Introduce another value of {{use_stat_tables=preferably_for_reads}} (name is tentative). This will be the default for the users. It will mean that {{ANALYZE TABLE t1}} does not collect EITS stats.

(One may argue that this is bad as MTR will run in an environment that's not like the users have. On the other hand, MTR will run with predictable statistical data. MTR used to run with sampled, non-predictable stats which made `rows` column and query plans unstable)

h2. Solution 3:
Wait until Vicentiu and Teodor are done with EITS-via-sampling.
This is bad as it creates a dependency between these two tasks.

Sergei Petrunia made changes - 2018-09-20 15:04

Description

~~MDEV-15253~~ changes the optimizer defaults so that histograms and condition selectivity are used:

{noformat}
optimizer_use_condition_selectivity=4
use_stat_tables=PREFERABLY
{noformat}
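
For reference, a minimal sketch of how these two settings could be inspected and applied on a test server. Session scope is an assumption made here so that other connections are not affected; the issue itself is about the server defaults.

```sql
-- Show the values currently in effect for this connection.
SHOW VARIABLES LIKE 'optimizer_use_condition_selectivity';
SHOW VARIABLES LIKE 'use_stat_tables';

-- Apply the values quoted above to the current session only.
SET SESSION optimizer_use_condition_selectivity = 4;
SET SESSION use_stat_tables = 'PREFERABLY';
```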

One effect of the new settings is that a statement like
{code:sql}
ANALYZE TABLE t1
{code}

will start to collect EITS stats after ~~MDEV-15253~~.

This was enabled in MTR and was instrumental in finding a lot of bugs related to EITS (Good).

However, it is not appropriate for production use: if {{ANALYZE TABLE t1}} collects EITS stats, it takes much more time (my measurement: 10x the time of a full table scan). This is BAD.
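
A minimal sketch for observing this on a test server. The definition and contents of {{t1}} are hypothetical and chosen only for illustration, and the account is assumed to have read access to the {{mysql}} schema.

```sql
-- Use the setting under which ANALYZE TABLE collects EITS stats.
SET SESSION use_stat_tables = 'PREFERABLY';

-- Hypothetical table, for illustration only.
CREATE TABLE t1 (a INT, b VARCHAR(32), KEY (a));
INSERT INTO t1 VALUES (1, 'x'), (2, 'y'), (2, 'z');

-- Plain ANALYZE TABLE, as in the description.
ANALYZE TABLE t1;

-- If EITS stats were collected, rows for t1 appear in the statistics tables.
SELECT db_name, table_name, cardinality
  FROM mysql.table_stats
 WHERE table_name = 't1';

SELECT column_name, min_value, max_value, nulls_ratio
  FROM mysql.column_stats
 WHERE table_name = 't1';
```

On a large table, timing the same {{ANALYZE TABLE}} with and without EITS collection is one way to reproduce the slowdown described above.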

Possible ways out:

h2. Solution 1: Make {{ANALYZE TABLE t1}} not collect EITS stats
* We will need to roll back all of the changes to .result files in ~~MDEV-15253~~.
* EITS will have little test coverage.

h2. Solution 2: Make {{ANALYZE TABLE t1}} collect EITS stats for MTR but not users.
* Let {{use_stat_tables=preferably}} remain what it currently is: {{ANALYZE TABLE t1}} collects EITS stats. MTR will run with this setting.
* Introduce another value of {{use_stat_tables=preferably_for_reads}} (name is tentative). This will be the default for the users. It will mean that {{ANALYZE TABLE t1}} does not collect EITS stats.

(One may argue that this is bad, as MTR will run in an environment unlike the one users have. On the other hand, MTR will run with predictable statistical data. MTR used to run with sampled, non-predictable stats, which made the `rows` column and query plans unstable.)
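
Under this solution, users who still want EITS stats for a particular table could presumably ask for them explicitly. A sketch, assuming the {{PERSISTENT FOR}} clause of {{ANALYZE TABLE}} requests EITS collection independently of the default described here; the column name {{a}} is hypothetical.

```sql
-- Explicitly request engine-independent statistics for all columns and indexes.
ANALYZE TABLE t1 PERSISTENT FOR ALL;

-- Or limit collection to selected columns and no indexes (column name is hypothetical).
ANALYZE TABLE t1 PERSISTENT FOR COLUMNS (a) INDEXES ();
```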

h2. Solution 3:
Wait until Vicentiu and Teodor are done with EITS-via-sampling.
This is bad as it creates a dependency between these two tasks.
We do not want to push the defaults change late in the release cycle.

Elena Stepanova made changes - 2018-09-20 23:37

Issue Type	Bug [ 1 ]	Task [ 3 ]

Sergei Petrunia made changes - 2018-09-27 19:45

Assignee

Varun Gupta [ varun ]

Sergei Petrunia made changes - 2018-09-27 19:46

Fix Version/s

10.4 [ 22408 ]

Varun Gupta (Inactive) made changes - 2018-11-16 18:26

Status	Open [ 1 ]	In Progress [ 3 ]

Varun Gupta (Inactive) made changes - 2018-11-16 18:26

Assignee	Varun Gupta [ varun ]	Sergei Petrunia [ psergey ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Sergei Petrunia made changes - 2018-11-24 18:49

Status	In Review [ 10002 ]	Stalled [ 10000 ]

Sergei Petrunia made changes - 2018-11-24 18:49

Assignee	Sergei Petrunia [ psergey ]	Varun Gupta [ varun ]

Varun Gupta (Inactive) made changes - 2018-12-04 11:46

Status	Stalled [ 10000 ]	In Progress [ 3 ]

Varun Gupta (Inactive) made changes - 2018-12-05 14:10

Assignee	Varun Gupta [ varun ]	Sergei Petrunia [ psergey ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Sergei Petrunia made changes - 2018-12-06 19:15

Assignee	Sergei Petrunia [ psergey ]	Varun Gupta [ varun ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Varun Gupta (Inactive) made changes - 2018-12-06 21:27

Status	Stalled [ 10000 ]	In Progress [ 3 ]

Varun Gupta (Inactive) made changes - 2018-12-06 21:27

Assignee	Varun Gupta [ varun ]	Sergei Petrunia [ psergey ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Sergei Petrunia made changes - 2018-12-08 12:41

Assignee	Sergei Petrunia [ psergey ]	Varun Gupta [ varun ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Varun Gupta (Inactive) made changes - 2018-12-11 13:48

Component/s		Optimizer [ 10200 ]
Fix Version/s		10.4.1 [ 23228 ]
Fix Version/s	10.4 [ 22408 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Sergei Golubchik made changes - 2021-12-06 21:23

Workflow	MariaDB v3 [ 89690 ]	MariaDB v4 [ 133684 ]

MariaDB Server

New optimizer defaults and ANALYZE TABLE
