[MDEV-19229] Allow innodb_undo_tablespaces to be changed after database creation Created: 2019-04-10  Updated: 2023-12-09  Resolved: 2022-10-25

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: 10.11.1

Type: Task Priority: Critical
Reporter: Thirunarayanan Balathandayuthapani Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 3
Labels: None

Issue Links:
Blocks
blocks MDEV-29986 Set innodb_undo_tablespaces=3 by default Closed
is blocked by MDEV-21216 InnoDB performs dirty read of TRX_SYS... Closed
Problem/Incident
causes MDEV-30122 mariabackup.skip_innodb crashes when ... Closed
causes MDEV-30158 InnoDB fails to start ther server 10.... Closed
causes MDEV-30311 system-wide max transaction id corrup... Closed
Relates
relates to MDEV-27121 mariabackup incompatible with disable... Closed
relates to MDEV-29983 Deprecate innodb_file_per_table Closed
relates to MDEV-14481 Execute InnoDB crash recovery in the ... Closed
relates to MDEV-14795 InnoDB system tablespace cannot be sh... Closed
relates to MDEV-21952 ibdata1 file size growing in MariaDB Closed
relates to MDEV-31488 Restart on backupped data fails with ... Closed

 Description   

Currently, changing the number of undo tablespaces requires reinitializing the data directory (dump + restore).
InnoDB should allow the number of undo tablespaces to be changed as part of the restart process.
The system tablespace contains the InnoDB dictionary tables, the doublewrite buffer, the change buffer tree,
and undo log pages. Allowing this change lets InnoDB move undo logs into dedicated undo tablespaces and so reduce the workload on the system tablespace.

Requirements

Before undo tablespaces are added, the server must have been shut down with SET GLOBAL innodb_fast_shutdown=0 (a slow shutdown).

This is because the undo logs must be empty (no incomplete or XA PREPARE transactions, nothing to be purged) so that the old undo tablespaces can be discarded and new ones created.

Steps

  1. Check whether any undo log exists. If one exists, issue a warning that a slow shutdown is required and start the server normally.
  2. If no undo log exists, go on to free the system tablespace rollback segment header pages; otherwise, issue the message about the slow shutdown and continue with the normal server startup
  3. Free the system tablespace rollback segment header pages of slots 1...127
  4. Reset the TRX_SYS page, reinitialize the system tablespace rollback segment slot (slot 0) and the doublewrite information,
    and update the binlog info and WSREP info in the system rollback segment header page (the slot 0 page).
    Steps (3) and (4) should happen within a single mini-transaction
  5. Delete the old undo tablespaces, if any
  6. Make a checkpoint to get rid of the redo log records of the old undo tablespaces
  7. Read the latest MAX_SPACE_ID from the dictionary header page
  8. Create the specified number of new undo tablespaces and initialize page 0 of each of them.
    Steps (7) and (8) should happen within a single mini-transaction
  9. Make a checkpoint again to make sure that the next startup or backup reads the undo tablespaces before
    opening the redo log records
  10. Create the rollback segments, assigning them to the undo tablespaces in a round-robin fashion
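
The round-robin distribution in step 10 can be illustrated with a minimal Python sketch. It is not the InnoDB implementation. It assumes 128 rollback segment slots (slot 0 plus slots 1...127 from the steps above), that slot 0 stays in the system tablespace, and that the undo tablespaces are simply labelled 1..N. The function name and the labels are hypothetical.

```python
N_RSEG_SLOTS = 128        # assumption: slot 0 plus slots 1..127
SYSTEM_TABLESPACE = 0     # hypothetical label for the system tablespace


def assign_rollback_segments(n_undo_tablespaces):
    """Map each rollback segment slot to a tablespace label.

    Slot 0 is kept in the system tablespace. Slots 1..127 are spread
    over the undo tablespaces (labelled 1..N) in round-robin order.
    With no undo tablespaces, every slot falls back to the system
    tablespace.
    """
    if n_undo_tablespaces < 0:
        raise ValueError("n_undo_tablespaces must be non-negative")
    assignment = {0: SYSTEM_TABLESPACE}
    for slot in range(1, N_RSEG_SLOTS):
        if n_undo_tablespaces == 0:
            assignment[slot] = SYSTEM_TABLESPACE
        else:
            # Cycle through labels 1, 2, ..., N, 1, 2, ...
            assignment[slot] = 1 + (slot - 1) % n_undo_tablespaces
    return assignment
```

With three undo tablespaces, assign_rollback_segments(3) places slots 1, 2, 3, 4 in tablespaces 1, 2, 3, 1 respectively.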


 Comments   
Comment by Thirunarayanan Balathandayuthapani [ 2019-04-10 ]

I don't see a reason for the innodb_rollback_segments variable to have a value smaller than 128. Hopefully, we can remove it in the future.

Comment by Matthias Leich [ 2022-09-01 ]

origin/bb-10.6-MDEV-19229 23ab688cec4b3031262bc93921592a04749a4341 2022-08-29T11:58:22+05:30
performed well in RQG testing.

Comment by Marko Mäkelä [ 2022-09-05 ]

thiru, please update the Description to say what will happen when the requirements are not met, and to document each failure scenario.

To my understanding, it was always possible to start InnoDB with a smaller number of innodb_undo_tablespaces so that some of the previously created undo tablespaces would be left unused. The main change here is that we allow new undo tablespaces to be created if the specified innodb_undo_tablespaces exceeds the number of undo tablespace files.

Comment by Marko Mäkelä [ 2022-09-05 ]

If it is possible to implement the following, I would suggest that we do it like this:

  1. If innodb_undo_tablespaces matches the number of undo tablespace files, the database will start up normally.
  2. Else, if the undo logs are not empty, InnoDB will refuse to start up.
  3. Else, undo tablespace files will be created or deleted to match the specified number of innodb_undo_tablespaces.

This would imply a notable change of existing behaviour: If innodb_undo_tablespaces is specified to be smaller than the number of undo tablespace files, InnoDB could refuse to start up (instead of just ignoring the ‘extra’ files).
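
The three-way rule suggested in this comment can be written down as a small decision function. This is only a sketch of the proposal as worded here, not of the code that was eventually pushed. The enum, the function name, and the parameters are hypothetical.

```python
from enum import Enum


class StartupAction(Enum):
    START_NORMALLY = "start normally"
    REFUSE_STARTUP = "refuse to start"
    REBUILD_UNDO_TABLESPACES = "create or delete undo tablespace files"


def decide_startup(configured, files_found, undo_logs_empty):
    """Apply the proposed rule in order.

    configured      -- the specified innodb_undo_tablespaces value
    files_found     -- number of undo tablespace files detected on disk
    undo_logs_empty -- True if nothing is left to roll back or purge
    """
    if configured == files_found:
        return StartupAction.START_NORMALLY
    if not undo_logs_empty:
        return StartupAction.REFUSE_STARTUP
    return StartupAction.REBUILD_UNDO_TABLESPACES
```

Under this rule, decide_startup(2, 3, False) refuses to start, which is the behaviour change noted above: the mismatch is no longer ignored when fewer tablespaces are configured than exist on disk.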

Comment by Marko Mäkelä [ 2022-09-26 ]

Thank you. This looks feature-complete, including testing that incremental backup will be refused if the undo tablespaces have been reinitialized.

I posted some review comments.

Comment by Marko Mäkelä [ 2022-10-04 ]

I posted some more review comments, mostly minor, mainly about the fault injection.

Comment by Marko Mäkelä [ 2022-10-10 ]

Before this change, trx_assign_rseg_low() contained some code that would avoid using anything other than the first rollback segment in the system tablespace when the server is started with innodb_undo_tablespaces=0.

With this change, we should rebuild the undo tablespaces whenever the detected number of undo tablespace files disagrees with the specified number. That is, the special handling in trx_assign_rseg_low() can be simplified.

Comment by Matthias Leich [ 2022-10-19 ]

origin/bb-10.6-MDEV-19229 162cccee73bc04cf089d35cf457eb109596aec4e 2022-10-17T21:51:01+05:30
performed well in RQG testing. The bad effects observed also occur in origin/10.6 1feccb505f9ec5cada8f8e2c544f736c1a533633 2022-10-13T09:09:03+03:00.

Comment by Marko Mäkelä [ 2022-10-20 ]

To reduce the risk of potentially breaking anything in stable releases, a decision was made to only fix this in the 10.11 series for now. Technically, the same change should work in any version starting with 10.6 where it was developed and tested.

Comment by Marko Mäkelä [ 2022-10-20 ]

I posted some review comments for the 10.11 version.

Comment by Marko Mäkelä [ 2022-10-21 ]

The revised patch for 10.11 is OK to push, as soon as it has passed the stress tests.
