[MDEV-7635] update defaults and simplify mysqld config parameters Created: 2015-02-25  Updated: 2017-07-07  Resolved: 2017-02-13

Status: Closed
Project: MariaDB Server
Component/s: Server
Fix Version/s: 10.2.4

Type: Task Priority: Blocker
Reporter: Richard Bensley Assignee: Nirbhay Choubey (Inactive)
Resolution: Fixed Votes: 3
Labels: 10.2-rc

Issue Links:
Duplicate
is duplicated by MDEV-13247 innodb_log_compressed_pages=OFF break... Closed
PartOf
includes MDEV-7261 Backport the default autosized value ... Stalled
includes MDEV-8429 Change binlog_checksum default to mat... Closed
includes MDEV-8923 [PATCH] port innodb_buffer_pool_dump_... Closed
includes MDEV-9999 Change the default value of optimizer... Closed
Relates
relates to MDEV-6066 Merge new defaults from 5.6 and 5.7 Closed
relates to MDEV-7845 update date column with erroneous dat... Closed
relates to MDEV-10244 Change default to use optimistic mode... Closed
relates to MDEV-11062 Different result for time type with o... Closed
relates to MDEV-11082 mysql_client_test: test_ps_query_cach... Closed
relates to MDEV-11092 Assertion `!writer.checksum_len || wr... Closed
relates to MDEV-11094 Blackhole table updates on slave fail... Closed
relates to MDEV-11095 rpl.rpl_row_mysqlbinlog test fails if... Closed
relates to MDEV-11235 LOAD DATA does not honor strict modes Closed
relates to MDEV-11269 In NO_ZERO_DATE mode, multiple timest... Closed
relates to MDEV-11527 Virtual columns do not get along well... Closed
relates to MDEV-11592 REPLACE IGNORE Open
relates to MDEV-11666 rpl.rpl_row_index_choice fails with h... Confirmed
relates to MDEV-11685 sql_mode can't be set with non-ascii ... Closed
relates to MDEV-11691 CSV: Table marked as crashed and shou... Confirmed
relates to MDEV-11819 NO_ZERO_IN_DATE: Incorrect generated ... Closed
relates to MDEV-11822 Virtual columns: No warning on data t... Open
relates to MDEV-11908 New default configuration produces wa... Closed
relates to MDEV-11995 ALTER TABLE proceeds despite reportin... Closed
relates to MDEV-9474 group_concat_max_len not being read f... Closed
relates to MDEV-10981 Check for side effects on switching o... Closed
relates to MDEV-11040 Different checksum values Confirmed
relates to MDEV-11060 sql/protocol.cc:532: void Protocol::e... Closed
relates to MDEV-11874 "Data too long for column" instead of... Closed
relates to MDEV-11875 Inconsistent behavior of CREATE TABLE... Open
relates to MDEV-11876 CONVERT produces a warning and result... Open
relates to MDEV-11888 SELECT produces more warnings in non-... Open
relates to MDEV-11890 Effect of NO_ZERO_DATE and NO_ZERO_IN... Stalled
relates to MDEV-11891 Warnings on wrong function parameters... Open
relates to MDEV-11892 Warning caused by NO_ZERO_DATE is dif... Open
relates to MDEV-11893 Query execution plan depends on stric... Open
relates to MDEV-11895 NO_ZERO_DATE affects timestamp values... Closed
relates to MDEV-11907 Bogus 'Division by 0' warnings upon u... Closed
relates to MDEV-13272 Document default change on parameter ... Closed
Sprint: 5.5.51 & 10.2.2, 10.2.2-1, 10.2.3-1, 10.2.3-2, 10.2.4-4, 10.2.4-1, 10.2.4-5, 10.0.29

 Description   

Poor MySQL has received some negative feedback over the years 1 2. As entertaining as it is watching and listening to developers struggle and whine, I think it's about time we changed some of the defaults and made the config easier to understand.

Oracle are working on changing more defaults in 5.7 3 4 5, I think we should too.

Something else they have done, is to radically change the simplicity of the default server config file, /etc/my.cnf:
https://raw.githubusercontent.com/mysql/mysql-server/5.7/support-files/my-default.cnf.sh

It's quite clear there, the sql_mode has been set, which will satisfy most developers and users looking for an ACID compliant database with little tweaking, and a rough note on increasing or decreasing RAM needed for InnoDB. MariaDB now probably needs something similar for Galera, CONNET and TokuDB.

  1. http://sql-info.de/mysql/gotchas.html
  2. http://blog.ionelmc.ro/2014/12/28/terrible-choices-mysql/
  3. http://www.tocker.ca/2015/01/23/proposal-to-change-additional-defaults-in-mysql-5-7.html
  4. http://www.tocker.ca/2015/01/14/proposal-to-change-replication-and-innodb-settings-in-mysql-5-7.html
  5. http://www.tocker.ca/2015/02/24/proposal-to-change-additional-defaults-in-mysql-5-7-february-edition.html

Proposed new defaults:

innodb_autoinc_lock_mode            = 2         (was: 1)
innodb_buffer_pool_dump_at_shutdown = ON        (was: OFF)
innodb_buffer_pool_dump_pct         = 25        (was: 100)
innodb_buffer_pool_load_at_startup  = ON        (was: OFF)
innodb_checksum_algorithm           = CRC32     (was: INNODB)
innodb_file_format                  = Barracuda (was: Antelope)
innodb_file_format_max              = Barracuda (was: Antelope)
innodb_large_prefix                 = ON        (was: OFF)
innodb_log_compressed_pages         = ON        (was: OFF)
innodb_purge_threads                = 4         (was: 1)
innodb_strict_mode                  = ON        (was: OFF)
 
binlog_annotate_row_events          = ON        (was: OFF)
binlog_format                       = MIXED     (was: STATEMENT)
group_concat_max_len                = 1M        (was: 1025)
lock_wait_timeout                   = 86400(1day)(was: 1year)
log_slow_admin_statements           = ON        (was: OFF)
log_slow_slave_statements           = ON        (was: OFF)
log_warnings                        = 2         (was: 1)
max_allowed_packet                  = 16M       (was: 4M)
replicate_annotate_row_events       = ON        (was: OFF)
slave_net_timeout                   = 60        (was: 3600)
sync_binlog                         = 1         (was: 0)
aria_recover                        = FORCE, BACKUP (was: NORMAL)
myisam_recover_options              = FORCE, BACKUP (was: OFF)



 Comments   
Comment by Daniel Black [ 2015-02-26 ]

Morgan Tocker's defaults, for everything actually in MariaDB, seem quite sane to me. Ref 1 - bit dated.

Ref 2 interesting rant:

bad points:

  • seems to oversimplify transactions by assuming the API are the same as the data
  • character collations - rtfm and stop making assumptions.
  • transaction isolation level - wrong reason to change and MySQL backed out of this change
  • doesn't get that null isn't a value

on the valid good points:

  • sql_mode - http://dev.mysql.com/doc/refman/5.7/en/server-options.html#option_mysqld_sql-mode - invalid dates part of STRICT_TRANS_TABLES - part of Tocker's changes
  • utf8 - i think where there, +innodb_large_prefix will help with indexes as Tocker point out
  • we have computed columns
  • conditional index - i'm sure there is a feature request somewhere though probably low priority
  • non-transaction DDL/slow alter table - fair point - hard problem - if their south/migration implementer didn't hate mysql could implement pt-online-schema-change in migrations but they won't (with probably good reason)
Comment by Daniel Black [ 2015-09-20 ]

As pointed out on irc, sql_mode=STRICT_TRANS_TABLES would of helped prevent https://www.bugzilla.org/security/4.2.14/ .

A major release is the only time to change these defaults. 10.2 is a long way off.

So many of these values are already out of date addressing some would be a good start.

So to give this issue some concrete values (from mysql-5.7 changes):

  • sql_mode = ONLY_FULL_GROUP_BY STRICT_TRANS_TABLES NO_ZERO_IN_DATE NO_ZERO_DATE ERROR_FOR_DIVISION_BY_ZERO NO_AUTO_CREATE_USER NO_ENGINE_SUBSTITUTION
  • innodb_large_prefix = yes
  • log_slow_admin_statements ON
  • log_slow_slave_statements ON
  • long-query-time 2
  • log-queries-not-using-indexes ON
  • min-examined-row-limit 1000
  • group_concat_max_len 1M
  • max_allowed_packet 16M
  • binlog_format ROW (require for galera)
  • innodb_autoinc_lock_mode 2 (required for galera anyway)
  • innodb_checksum_algorithm crc32
  • innodb_purge_threads 4
  • innodb_strict_mode ON
  • innodb_log_file_size 128M
  • innodb_buffer_pool_dump_at_shutdown ON
  • innodb_buffer_pool_load_at_startup ON
  • innodb_buffer_pool_dump_pct 25 (Added in MDEV-8923)
  • innodb_file_format Barracuda
  • sync_binlog =1 ( or at most 10)
  • slave_net_timeout 60
  • performance schema - Enable events_statements_history and events_transactions_history consumers by default
  • mysql client prompt="\u@\h [\d] > "

My additions:

  • skip-name-resolve (packages can enable on upgrade if host based usernames exist)
  • binlog_annotate_row_events ON
  • replicate_annotate_row_events ON
  • innodb_stats_traditional FALSE
  • use_stat_tables complementary
  • histogram_size 255
  • aria_recover QUICK,BACKUP
  • myisam-recover QUICK, BACKUP
  • table_definition_cache autosize to max(2000, 400+table_open_cache / 2) like 5.7
  • query_cache_strip_comments = ON

If label=order-by-optimisations are resolved:

  • optimizer_switch = mrr=on,mrr_sort_keys=on,optimize_join_buffer_size=on
  • optimizer_use_condition_selectivity =4
  • join_cache_level = 8
Comment by Jan Steinman [ 2016-01-27 ]

I would REALLY like to see group_concat_max_len default to something much higher than 1k, particularly since (as I note in https://mariadb.atlassian.net/browse/MDEV-9474) you apparently cannot set this value in either /etc/my.cnf nor as a command line argument to mysqld. Neither is this value stored in the binlog, which means potential synch problems with slaves (https://mariadb.atlassian.net/browse/MDEV-8292)

Comment by Federico Razzoli [ 2016-05-19 ]

I don't agree with --skip-name-resolve, because some people rely on hostnames.
For aria_recover and myisam_recover, I suggest FORCE,BACKUP (not just QUICK,BACKUP).

My proposals:

  • disable query cache, it's not useful for common OLTP workloads
  • lock_wait_timeout - ALTER TABLEs can be long, but having transactions waiting for 1 year is a potential disaster
  • lower_case_table_names = 1 - default value is annoying and useless
  • updatable_views_with_limit - having YES by default seems to me very dangerous

I also hope that this task will include more checks during startup. In a couple cases, inconsistent settings cause warnings, but more checks could be done, particularly when read_only = 1.

Comment by Richard Bensley [ 2016-08-24 ]

Agree with FORCE,BACKUP for myisam recover. Out of the box MariaDB should be a reliable database to cater for the majority of installations and encourage best practices (the entire point of this ticket). Just because a server is "on" does not mean it's right.

Comment by Sergei Golubchik [ 2016-08-31 ]

My comments:

  1. innodb-*, slave_net_timeout — no opinion
  2. min_examined_row_limit — not quite. On the other hand, it goes directly against other suggested changes (that increase slow log verbosity), and after all a slow query is a slow query, no matter how many rows were examined. On the other hand, if log_queries_not_using_indexes is ON, then without min_examined_row_limit it'll log too much. A good idea might be to have a setting, similar to min_examined_row_limit, that only applies to log_queries_not_using_indexes. And practically it can be done by changing log_queries_not_using_indexes from ON/OFF boolean, to an integer, meaning “log queries not using indexes, if they examine at least that number of rows”. Because ON is boolean TRUE, that is 1, and OFF is 0, one can still use =ON and =OFF and they'll work as expected.
  3. query_cache_strip_comments — not sure. Query cache was supposed to be as low overhead as possible, stripping comments make it more expensive by default. Also, the query cache is most useful for an application using identical generated queries, and in that case all queries will typically have identical comments.
  4. default_tmp_storage_engine, updatable_views_with_limit — why?
  5. lower_case_table_names — not quite agree. May be 2? It'll give us standard-compliant case-insensitive behaviour, but will preserve user specified letter case.
  6. other suggested changes I agree with
Comment by Federico Razzoli [ 2016-08-31 ]

updatable_views_with_limit: Because I can't think a case when running a DELETE or UPDATE against a view with LIMIT is reliable. Instead, I assume that usually it is a bug. In fact the number of affected rows depends on LIMIT and the number of rows in the base table. Even if you delete by id, the number of affected rows could be 0 or 1 depending if underlying data changed.

If the user is aware of this and considers this a good practice, he can still change updatable_views_with_limit.

Comment by Daniel Black [ 2016-08-31 ]

serg,

  1. min_examined_row_limit - main purpose I saw was not logging too much. I'm happy with your "log queries not using indexes, if they examine at least that number of rows" variant.
  2. query_cache_strip_comments - quite right - happy to drop that one.

An alternate point of view for innodb_checksum_algorithm=crc32 is none. Just a gut feel but it seems that many forms storage media itself have forms of error detection/correction. The number of modern filesystems that supports data checksum/parity (based on wikipedia: filesystems) seems to include only ZFS/Btrfs/GPFS, which while generally not the greatest choice for databases, provide an indication of the low importance of this feature at a filesystem level. Its unclear if filesystems don't solve this because a) its a storage problem to solve or b) its a application level problem to solve. Provided there is sufficient bounds checking on the values read perhaps checksums on pages isn't needed? Happy to hear other opinions more researched than my simplified analysis here.

Comment by Michiel Hazelhof [ 2016-09-01 ]

Is there a big advantage to sql_mode = ONLY_FULL_GROUP_BY? If not I think this would cause more harm than good (especially for users migrating to MariaDB only to find out some default setting breaks their app).

Also I'd like to object against lower_case_table_names, case insensitive compares are slower and it doesn't encourage proper coding styles.

Regarding optimizer_use_condition_selectivity we should be extra cautious, we have been tried this setting and anything higher than 2 resulted in infinite hanging queries (only large queries on large tables, we have yet to debug this further).

On a semi related note, wouldn't it be better to stick to a default dash or underscore? The dash and underscore are now alternately used which neither promotes good standards nor makes it easier to read.

Comment by Daniel Black [ 2016-10-08 ]

Noted in the MariaDB dev meetup Amsterdam 2016

character set

default-set-server
collation-server = utf8mb4-general-ci

default-character-set = utf8mb4

socket authentication

  • in rpm and debs but not tarballs
  • new installs only
  • in mysql_secure_install to do it?

expose plugin name in error message for access denied to help people that do (user) mysql -u root

possibly too late?

Comment by Elena Stepanova [ 2016-10-15 ]

My opinion on values that I saw in the description as of Oct 14.

Part 1: VERY STRONGLY DISAGREE

Note: if there is such thing as a QA veto, consider it such.

lower_case_table_names=1,2

I don't know how it's meant to work. With either value, all existing tables which have upper-case letters in their names will become unaccessible?
For lower_case_table_names=2, currently server does not even allow to set it, switching to 0, as it says, "to avoid future problems". AI knows better this time.
If there is a fix coming with this change which will certainly allow to handle existing tables, I'm ready to reconsider.

use_stat_tables=COMPLEMENTARY + histogram_size=255 + optimizer_use_condition_selectivity=4

These are awful defaults, highly dangerous, and will cause a lot of pain to most users without any gain whatsoever.

Table statistics and histograms are fine tools, but they are not for everyone, and cannot be used as defaults. They should be used as tuning, when people experience problems, and when their setup is thoroughly examined to make sure it won't make more harm than good. Basically, it's a prescription drug, not over-the-counter vitamins.

The idea of good defaults is either to make sure that the configuration works "out-of-the-box" for most users, or it simplifies runtime tuning (e.g. not all options can be changed without server startup). This configuration does neither of that.

Let's consider different scenarios.

1) Users who do not run ANALYZE TABLE

  • gain: none, since there are no stats
  • loss:
    • regressions (e.g. MDEV-9628, MDEV-6727, MDEV-9187, and probably many more);
    • likely hidden problems, as the configuration with optimizer_use_condition_selectivity>1 but without the statistical data was considered a corner case with special treatment (there have been a bunch of bugs about this combination initially, and they were getting fixed, but the configuration was claimed not to be a priority because it's meaningless – which in itself says a lot about proposed defaults).
    • unnecessary extra logic (attempts to search for the statistics upon every query) which will possibly affect performance, even if only slightly.

2) Users who run ANALYZE TABLE on their own reasons, on their own schedule

  • gain: possible performance improvement for some queries in some cases, when engine statistics is off
  • loss:
    • since the users don't have their ANALYZE schedule adjusted specifically to EITS needs, they are likely to have wrong statistics in the tables most of the time, so plans can quite possibly get worse rather than better;
    • ANALYZE with statistics has a huge problem with disk space (MDEV-6529). In his experiment, SPetrunia used a table with 1M rows and got disk consumption of 300M upon ANALYZE. Our users (not just biggest customers) already routinely have tables with billions rows. So, it can easily be hundreds of gigabytes.
    • ANALYZE with the new configuration will have disastrous performance. When it was taking less than a second before, it will easily take hours. E.g. on my machine, ANALYZE for dbt3 table lineitem (sf1, 6 mln rows) takes 0.15 sec without statistics, and 40 seconds with statistics. ANALYZE on a table with 600 mln rows will take over an hour, possibly much more. Existing maintenance jobs will all be broken, they simply won't fit into the scheduled window.
    • ANALYZE is replicated, even if nobody needs statistics on the slave, and thus will make the slave lag dramatically. OM=>NS setups which are used for upgrades will be most affected.
    • regressions (see scenario 1)

3) Users who run ANALYZE TABLE specifically to collect EITS data

  • gain: none, since they already have it configured and used at their discretion
  • loss: possibly same as in scenario 2, especially if they are smart enough to run ANALYZE with explicit PERSISTENT FOR ... when they need statistics to be recollected, and run "normal" ANALYZE otherwise.

log_queries_not_using_indexes=1

It's a good tool, but a crazy default.
It's only useful when somebody wants to make a special effort to analyze and fine-tune their workflow, they can always turn it on temporarily.
Otherwise, in the form that it exists, most people don't need it.
If the query is really slow, it will be reflected in the log anyway as a general slow query. If it's not slow, nobody cares – it's fine to query 10-row tables without indexes, and people do it all the time, the log will be huge. Apparently it was meant to be controlled by min-examined-row-limit, but it's not suitable for that, see below.
Until this option can be properly configured, it mustn't become default.
It's not surprising that MySQL introduced log_throttle_queries_not_using_indexes (which I also don't like though), and even with it, they were smart enough not to turn this logging ON.

min-examined-row-limit=N

It's yet another tool for analyzing and fine-tuning a special workflow, it mustn't be used as a default. What if a simple DML is slow because it's blocked for no good reason? What if ANALYZE is slow – it won't be logged at all, regardless the number of rows in the table? What if you forgot a long sleep somewhere deep in your query or function? All of these should go to the slow log, but with this option enabled, they won't.

Part 2: DISAGREE

default_tmp_storage_engine=InnoDB

Pointless nuissance. It works much better now with 'none' and temporary tables defaulting to @@default_storage_engine. Vast majority of users have default engine either InnoDB, or MyISAM, or Aria, and maybe in future RocksDB, and would want their tmp tables be of the same engine as normal tables, right? That's what 'none' does. It's hard to imagine that somebody would want default engine MyISAM, but default tmp engine InnoDB. Setting it by default will only make non-InnoDB users have to remember to modify it separately. Worse still, people who run without InnoDB on whatever reason will start getting failures on startup until they reconfigure the instance.
Those who run by default with very special engines which can't be used for tmp tables already specify the engine for temporary tables explicitly anyway.

long_query_time=2

I think it's excessive, it will create more noise than useful output, and will be all false-positives. Instances grow in size and concurrency faster than they grow in overall speed, many queries are bound to be longer than 2 seconds even if there is nothing wrong with them. Besides, any fluctuations in IO can make it breach the limit.
When somebody reaches the point in their performance tuning that queries longer than 2 seconds really bother them, I think they'll be happy to decrease the value. But it won't be the majority.

updatable_views_with_limit=OFF

I don't see any reason to tighten this setting. AFAIR we haven't had a single complaint from users regarding effects of updates which are controlled by this option. Making it forbidden by default will most likely cause some existing 3rd-party tools stop working properly (at least MySQL manual says that some GUI tools tend to generate such queries), and thus will lower the adoption without any good reason – people will try installing the new version, see it doesn't work well on some obscure reason, and will switch back.

innodb_log_file_size =128M

According to MySQL manual, it's meaningless. "Sensible values range from 4MB to 1/N-th of the size of the buffer pool, where N is the number of log files in the group". Currently default buffer pool size is 128M, and there are 2 log files in the group.

query_cache_strip_comments=1

Has it ever been performance-tested? I'm not sure it's very valuable, but rather suspicious. Yes, there is a specific scenario when comments are painful for query cache, but I think it's mostly a test scenario (e.g. when queries are numerated or otherwise get unique identifiers in the comments – I myself use it a lot in tests).

OTHER NOTES

lock_wait_timeout

I also vote for lowering the value to something that people can realistically wait for (even 1 day would be much better than 1 year). Currently there is no practical difference between "locked" and "frozen forever".

innodb_buffer_pool_dump_at_shutdown=ON and innodb_buffer_pool_load_at_startup=ON

I'm not very happy about it, not because these values are wrong by themselves, but because of our assorted issues with startup/shutdown timeouts in scripts. I expect a considerable growth of complaints.

sync_binlog=1

It would be good to have and publish at least some numbers of our performance sacrifice. Is it more like 1% or more like 50%?

binlog_format=MIXED

While I support this change, I think we must make a clear statement about it in a form of a blog, preferably before the GA release, to give people time to prepare to it. The change can be absolutely disastrous for setups with insufficient indexes. People need to either make sure they switch back to STATEMENT or add indexes before they get into a trouble and start loudly complaining about the regression MariaDB introduced.

Also, it should be noted that in some cases MBR is not just "unsafe DML is logged as ROW, everything else STMT". There are some subtleties related to temporary tables and MBR, there have always been some bugs around it. Unfortunately, we can't see MySQL bug reports anymore, but the fact that they switched right to RBR rather than MBR makes me a little nervous about MIXED as a new default. I'll try to find time to run some tests around it.

log-warnings=2

Fine, but please make sure InnoDB notifications go to a level higher than 2.

table_definition_cache = autosized

Have we actually merged the logic which autosizes it? Did anyone review it?

innodb_stats_traditional=off

Has it been performance-tested (more or less recently)?

Comment by Daniel Black [ 2016-10-17 ]

Thanks elenst, happy with QA Vetos - yes I'm happy with that idea.

log_queries_not_using_indexes, its meaning has changed to reflect a number of rows, commit d0064c6 - per https://lists.launchpad.net/maria-developers/msg09996.html I don't think its 100% right yet. min-examined-row-limit, partially agree that this is quite a brute force instrument - mentioned alternates in mailing list post.

binlog_format - I don't think a proper examination of pros/cons of MIXED/ROW has been done. Cases like galera need to be considered and the edge cases elenst mentions. Also consider that package script detection of missing primary keys might be a good option to add a config file of binlog_format=MIXED(or keep STATEMENT) for those cases significantly affected by a ROW default. Other negatives or RBR - DML statements affecting large number of rows - can/do parallel slave options process each RBR component separately (they are in blocks of some size now anyway). innodb_force_primary_key=ON is appealing to me however apparently dragging users into modern SQL seems unpopular and letting them fail on edge cases is ok (hence sql_mode ONLY_FULL_GROUP_BY STRICT_TRANS_TABLES NO_ZERO_IN_DATE NO_ZERO_DATE ERROR_FOR_DIVISION_BY_ZERO NO_AUTO_CREATE_USER NO_ENGINE_SUBSTITUTION).

A proper pros/cons of socket authentication could also be done rather than a 1/2 prepared 10mins at a dev meetup.

innodb_buffer_pool_size, also a horrible default because of the need to support those running on desktops or choosing to install on very small VMs. Is a packaging prompt to choose between "dedicated DB" and "shared server" (or a scale in-between) to configure between 80 and 20% of free RAM an option? Perhaps choose 50% of free RAM for server distros like RHEL/Centos. innodb_log_file_size could flow from this value. Making a better default for users with decent hardware and limited experience would be good.

skip-name-resolve could also be installed based on a packaging script if DNS usernames aren't being used. A warning based on creation should also exist if this option is enabled.

innodb_buffer_pool_dump_at_shutdown - agree timeout is an issue which is only slightly mitigated by innodb_buffer_pool_dump_pct. I think closing MDEV-9202 without action was a mistake.
edit: load is in background while active so really only shutdown time is important here.

log-slow-verbosity=query_plan,explain hasn't been mentioned and I think it would be good to log more details since its available.

Comment by Nirbhay Choubey (Inactive) [ 2016-10-17 ]

I have updated the defaults list in the description to reflect arguments made by elenst and danblack.
Also, based on the feedbacks, existing open issues and their current values in 5.7, I think its logical to keep
the current defaults for the following variables (no changes):

  • innodb_stats_traditional
  • histogram_size
  • log_queries_not_using_indexes
  • long_query_time
  • lower_case_table_names
  • min_examined_row_limit,
  • optimizer_use_condition_selectivity
  • updatable_views_with_limit
  • use_stat_tables
  • query_cache_strip_comments

There are still some variables, for which I am not clear about (and thus I have not changed them yet):

  • innodb_log_file_size = (proposed) 128M : Based on the rule that elenst pointed, should we change it to 64M or leave it to 48M?
  • long_query_time = (proposed) 2 : I agree that the proposed value could be pretty low. Should we change it to 5secs or leave it to 10secs?
  • table_definition_cache = (proposed) autosized : I believe 'autosized' is not implemented yet.
  • log_slow_verbosity = query_plan,explain : Proposed by danblack. @elena : any comments on this?
  • sql_mode : 5.7 has following additional mode ONLY_FULL_GROUP_BY STRICT_TRANS_TABLES NO_ZERO_IN_DATE NO_ZERO_DATE ERROR_FOR_DIVISION_BY_ZERO NO_AUTO_CREATE_USER NO_ENGINE_SUBSTITUTION
  • optimizer_switch : The following switches are currently off in 10.2 while they are on in 5.7: engine_condition_pushdown=on,mrr=on,mrr_cost_based=on.
  • Charsets: Should we make changes to the default client/server character sets as was discussed during the meetup?
Comment by Elena Stepanova [ 2016-10-17 ]

innodb_log_file_size = (proposed) 128M : Based on the rule that Elena Stepanova pointed, should we change it to 64M or leave it to 48M?

IMO, just keep it the same as it is in MySQL. Increasing to 64M is not a big gain anyway, and even though resizing is not a critical problem anymore, there is no reason to do it when unnecessary. I'd leave it to jplindst to decide.

long_query_time = (proposed) 2 : I agree that the proposed value could be pretty low. Should we change it to 5secs or leave it to 10secs?

It doesn't matter much – as long as the value is not too low; it's totally arbitrary anyway. I'd keep it to 10 seconds on one simple reason – as soon as people upgrade and start getting their slow log growing rapidly, they'll assume it's a performance degradation, and before they figure out it's really not, there might be some FUD and/or rollbacks to previous versions.

log_slow_verbosity = query_plan,explain : Proposed by Daniel Black. @elena : any comments on this?

I lean towards "no", but I don't have a strong enough opinion on the subject.
There have been some bugs around verbosity=explain, e.g. MDEV-6439, although it's old, it might have already gone away (since it's about charsets which have changed a lot). Also, performance impact should be considered.
In general, I don't quite agree we should have everything in the log just because it's available. As a tester, I'd love that, but it's not always in users' best interest. If a user has a query which shows up in the slow log every now and then, they can raise the verbosity at runtime and get the additional info. One case when they could actually benefit from the default is if the slow query was a one-time thing, but they are still desperate to find out what happened there (rare, but possible).

sql_mode : 5.7 has following additional mode ONLY_FULL_GROUP_BY STRICT_TRANS_TABLES NO_ZERO_IN_DATE NO_ZERO_DATE ERROR_FOR_DIVISION_BY_ZERO NO_AUTO_CREATE_USER NO_ENGINE_SUBSTITUTION

I ran MTR tests with ERROR_FOR_DIVISION_BY_ZERO upon Monty's request. It will require some test updates (see MDEV-10981), but it's doable, and I didn't see anything alarming during these tests.
ONLY_FULL_GROUP_BY – I actually like it myself, but we have surprisingly many users who routinely run queries which don't satisfy the requirements, so there will be lots of angry screams.
I have no opinion on the other values, but I think Monty does, please check with him so that we don't have to rollback at the last moment.

optimizer_switch : The following switches are currently off in 10.2 while they are on in 5.7: engine_condition_pushdown=on,mrr=on,mrr_cost_based=on.

  • engine_condition_pushdown is deprecated in MariaDB, so it certainly doesn't go to defaults.
  • mrr has always seemed shaky to me in terms of quality, but I personally didn't do a lot of testing for it, only in early 5.3 – it was implemented before my time and hasn't much changed since. If optimizer team really wants to turn it on and is willing to address possible bugs in a timely manner, I can run some tests for it.
Comment by Nirbhay Choubey (Inactive) [ 2016-10-17 ]

Also, with current set of defaults the following warning would pop up on server start :

2016-10-17 16:23:50 139906085902144 [Warning] options --log-slow-admin-statements, --log-queries-not-using-indexes and --log-slow-slave-statements have no effect if --log_slow_queries is not set

Comment by Daniel Black [ 2016-10-17 ]

Case for innodb_stats_traditional=false follows from MDEV-7084 where it was introduced. The ability of the optimizer to make really poor decisions based on too little information for large tables results can result in large table scans. The cost is that occasionally when innodb stats are pulled more pages are pulled to generate the statistics, (log2(rows)*innodb_stats_sample_pages), and this is significantly less impacting than the wrong query plan. This was extensively deployed at my previous employer across many clients resulting in no similar spurious queries suddenly doing a range/ALL scan on multimillion row tables because a more accurate sampling realised a consistent predictable query using an alternative index.

Rational for log_slow_verbosity = query_plan,explain. when a query has taken 2-10 seconds already the overhead of putting an extra 3-10 lines in log file is rather small. Disk space shouldn't be an issue for this amount and if they are going to do an explain anyway to debug it may as well include it at the time of the query.

ONLY_FULL_GROUP_BY – I actually like it myself, but we have surprisingly many users who routinely run queries which don't satisfy the requirements, so there will be lots of angry screams.

Lets mitigate this with a clear notice in the release notes/blog, like binlog_format, long_query_time!=10 as it is impacting, but in a way that will help users generally, even if they need to change their query, indexes, or add a primary key. It is a major version upgrade, expecting users to read release notes and do basic testing I think is a reasonable expectation.

I have no opinion on the other values, but I think Monty does, please check with him so that we don't have to rollback at the last moment.

While I'm in favour of a QA veto utilizing the experience of multiple avenues of bug reports to gauge the relative stability of features, I don't think a Monty veto based on "I wrote the code" or whatever reason appropriately keeps with the ideals of a community rough consensus for technical decisions. There where a few hallway/pub comments at Percona Live / MariaDB dev meetup specifically around Monty vetos/code gatekeeper, which I defended (perhaps prematurely), and as such having a strong option and proclaiming "for the good of the users" shouldn't equate to veto because community raise changes for the same reason.

Comment by Elena Stepanova [ 2016-10-17 ]

I don't think a Monty veto based on "I wrote the code" or whatever reason appropriately keeps with the ideals of a community rough consensus for technical decisions.

I wasn't talking about a veto. In this issue, opinion of all interested parties is considered. In regard to sql_mode nobody so far has expressed particularly strong opinion. It makes all sense to consult with and listen to the reasoning from somebody who a) does have an opinion; b) does actually speak to different users and customers; and c) does actually know the code, regardless of writing or not writing it.

He does not watch issues which don't belong to him unless specifically asked to, so he might come up with his comments quite late, which will make things more complicated should others agree with his reasoning.

Comment by Igor Babaev [ 2016-10-18 ]

Nirbhay,

What is the exact reason not to have by default:
optimizer_use_condition_selectivity=4
?

Comment by Nirbhay Choubey (Inactive) [ 2016-10-18 ]

igor Please check Elena's comments above. Also, we recently found a bug around it MDEV-11062.

Comment by Vladislav Vaintroub [ 2016-10-19 ]

sql_mode = ONLY_FULL_GROUP_BY STRICT_TRANS_TABLES NO_ZERO_IN_DATE NO_ZERO_DATE ERROR_FOR_DIVISION_BY_ZERO NO_AUTO_CREATE_USER NO_ENGINE_SUBSTITUTION

No zero in dates is really useful for connectors. dates with zeroes are nonsense especially any languages that have a type for dates and timestamps, so connectors go great length to map invalid dates to exception or to null or to special date in (.NET uses year 1601 if I recall correctly)

STRICT_TRANS_TABLES, NO_ENGINE_SUBSTITUTION - this one actually in use by the Windows installer since 5.2 (i.e sql_mode set to this in my.ini during installation). I do not recall any complains, this seems very safe. NO_ENGINE_SUBSTITUTION a must in my opinion.

Comment by Sergei Golubchik [ 2016-10-27 ]

My comments based on what is in bb-10.2-mdev7635 branch, as of

commit 6f36f28
Author: Nirbhay Choubey
Date:   Tue Oct 25 16:37:21 2016 -0400
 
    MDEV-7635: Fix expected warning regex (should not prefix process)

--- a/mysql-test/suite/sys_vars/r/sysvars_aria.result
+++ b/mysql-test/suite/sys_vars/r/sysvars_aria.result
@@ -213,9 +213,9 @@ READ_ONLY	NO
 COMMAND_LINE_ARGUMENT	REQUIRED
 VARIABLE_NAME	ARIA_RECOVER_OPTIONS
 SESSION_VALUE	NULL
-GLOBAL_VALUE	NORMAL
+GLOBAL_VALUE	BACKUP,FORCE

I'd do BACKUP,QUICK. Unexpected multi-hour repair or automatic data loss
could be quite a WTF.
Same for MyISAM.

--- a/mysql-test/suite/sys_vars/r/sysvars_innodb.result
+++ b/mysql-test/suite/sys_vars/r/sysvars_innodb.result
@@ -161,9 +161,9 @@ READ_ONLY	NO
 COMMAND_LINE_ARGUMENT	REQUIRED
 VARIABLE_NAME	INNODB_AUTOINC_LOCK_MODE
 SESSION_VALUE	NULL
-GLOBAL_VALUE	1
+GLOBAL_VALUE	2

This is unsafe in SBR, so I don't think we can change this now.

--- a/mysql-test/suite/sys_vars/r/sysvars_server_embedded.result
+++ b/mysql-test/suite/sys_vars/r/sysvars_server_embedded.result
@@ -1746,10 +1746,10 @@ ENUM_VALUE_LIST	innodb,query_plan,explain
 READ_ONLY	NO
 COMMAND_LINE_ARGUMENT	REQUIRED
 VARIABLE_NAME	LOG_WARNINGS
-SESSION_VALUE	1
-GLOBAL_VALUE	1
+SESSION_VALUE	2
+GLOBAL_VALUE	2
 GLOBAL_VALUE_ORIGIN	COMPILE-TIME
-DEFAULT_VALUE	1
+DEFAULT_VALUE	2

I'm not sure it's a good idea, error log is contains too much noise as it is

--- a/mysql-test/r/index_merge_innodb.result
+++ b/mysql-test/r/index_merge_innodb.result
@@ -106,6 +106,7 @@ insert into t1 (key1a, key1b, key2a, key2b, key3a, key3b)
 select key1a, key1b, key2a, key2b, key3a, key3b from t1;
 analyze table t1;
 Table	Op	Msg_type	Msg_text
+test.t1	analyze	status	Engine-independent statistics collected

where does that come from?

--- a/mysql-test/r/statistics.result
+++ b/mysql-test/r/statistics.result
@@ -925,36 +925,36 @@ db_name	table_name	cardinality
 test	t1	40
 SELECT * FROM mysql.column_stats;
 db_name	table_name	column_name	min_value	max_value	nulls_ratio	avg_length	avg_frequency	hist_size	hist_type	histogram
-test	t1	c	aaaa	dddddddd	0.1250	6.6571	7.0000	0	NULL	NULL
-test	t1	e	0.01	0.112	0.2250	8.0000	6.2000	0	NULL	NULL
 test	t1	b	vvvvvvvvvvvvv	zzzzzzzzzzzzzzzzzz	0.2000	17.1250	6.4000	0	NULL	NULL
+test	t1	e	0.01	0.112	0.2250	8.0000	6.2000	0	NULL	NULL
+test	t1	c	aaaa	dddddddd	0.1250	6.6571	7.0000	0	NULL	NULL

where does that come from?
(same reason as above, I suspect)

Comment by Nirbhay Choubey (Inactive) [ 2016-10-28 ]

serg:

I'd do BACKUP,QUICK. Unexpected multi-hour repair or automatic data loss
could be quite a WTF.
Same for MyISAM.

OK, I will update them.

This is unsafe in SBR, so I don't think we can change this now.

Good point. will revert it.

I'm not sure it's a good idea, error log is contains too much noise as it is

hmm.. I did not notice a lot of change in the test results after setting --log-warnings=2. But, the logs could get flooded with warnings like "Deadlock found when trying to get lock".

where does that come from?

oh.. its leftover from the commit where I had use_stat_tables enabled and later the test always skipped due to "Requires: have_xtradb" so it was never caught.

where does that come from?
(same reason as above, I suspect)

No, its due to the change in binlog_format. For some reason changing the binlog format caused the row order to change.

Comment by Nirbhay Choubey (Inactive) [ 2016-10-30 ]

@serg: Pushed the changes to bb-10.2-mdev7635 branch.
1) https://github.com/MariaDB/server/commit/c67e1e3e4fd40cb100ea3a9da1af53ec33a94817 (defaults changes)
2) https://github.com/MariaDB/server/commit/9912b1420d1212de971c3b953a54031f6a7982d0 (s/standards_compliant_cte/standard_compliant_cte)

Comment by Daniel Black [ 2016-12-21 ]

expire_log_days, 0 (aka infinity) is an exceptionally large number of days. With a default max_binlog_size of 1G and general storage size I suspect 20 may be a better default value.

Comment by Elena Stepanova [ 2016-12-22 ]

I disagree, rather strongly. Keeping binary logs can be crucially important for various reasons. Removing them automatically by default can be a disaster, which the user will only find out when it's too late. Size does not matter here – the activity can be low, the disk can be large, and any reasonable real-life setup will monitor the disk status anyway, and the admin will make the decision about binary logs if it comes to that.

Comment by Daniel Black [ 2016-12-22 ]

Fair call. disk monitoring is a much better approach.

Comment by Vladislav Vaintroub [ 2016-12-22 ]

I propose (discussed with elenst in the past while analyzing MDEV-11226)

thread_pool_max_threads=65535

Usually, there should not really be an upper bound to thread count by default, but DBAs who know their workload well, might want set it.
had a patch for this variable myself, but since you're already at it, and chance the same places, maybe it should go into this one.

Comment by Elena Stepanova [ 2017-02-06 ]

nirbhay_c,

As requested, I've looked at bb-10.2-mdev7635, diff between fd0479ce592e0b7c13d67b5deda62e9090956309 and the current e2d94b684a070b94519cd45e923a3c41267f66c4.

There are a few suites that haven't been fixed. I've noticed

  • mroonga/* (most important, since it's a part of the default set)
  • jp (also runs in buildbot on fulltest)
  • engines/*

I expect you're going to adjust them as well.

Also I have some doubts about innodb_storedproc* tests. Unlike in other tests, here you've kept the strict mode without ignore close, so that it didn't allow data to be loaded, which naturally further caused differences in result sets which were re-recorded. Unless you did it on purpose, maybe it makes sense to restore the test logic.

Apart from those, it looks mostly fine. I have some technical suggestions/fixes, all of which are pushed into bb-10.2-mdev7635.elenst, see
https://github.com/MariaDB/server/commits/bb-10.2-mdev7635.elenst

I intentionally created very itemized commits, so that it was clear which is doing what, and you could pick them up separately (they are mostly independent, except for maybe one or two).
You only need those that start with "Additional test fixes for MDEV-7635 #N". There are 13 of them now.
You don't need two others, one of which is to exclude mroonga from defaults (you don't need to merge it as it has to be fixed anyway), and one to disable a test which currently fails on 10.2 (you don't need to merge it either, it's already disabled in 10.2 and will be re-enabled when the problem is fixed).

Not related to tests directly, I still have some concerns about sync-binlog (see MDEV-11900), and about the new combination of slow-log options which ends up with a warning (see MDEV-11908), I don't think it makes any sense.

Comment by Nirbhay Choubey (Inactive) [ 2017-02-10 ]

thread_pool_max_threads=65535
https://github.com/MariaDB/server/commit/def258061b884df91624562c27c818f9d1fa2d5c

Generated at Thu Feb 08 07:21:06 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.