rw_trx_list contains ACTIVE, PREPARED and recovered COMMITTED transactions.
Move recovered COMMITTED transactions to purge_list.
ACTIVE and PREPARED transactions are already available through rw_trx_hash.
Also removed a hack from lock_trx_release_locks(). Instead, let the recovery
rollback thread skip committed XA transactions.
Sergey Vojtovich added a comment:
Pushed to 10.3:
commit ec32c050726bad8c0504c6b6b74a0fa3f8f8acbb
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Fri Jan 19 23:03:18 2018 +0400
Get rid of trx->read_view pointer juggling
trx->read_view|= 1 was done in a silly attempt to fix a race condition
where trx->read_view was closed without the trx_sys.mutex lock by read-only
transactions.
This just made the problem less likely to happen. In fact there was a race
condition in the const version of trx_get_read_view(): the pointer may change to
garbage at any moment after the MVCC::is_view_active(trx->read_view) check and
before this function returns.
This patch doesn't fix this race condition, but rather makes its
consequences less destructive.
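The "pointer juggling" described above is low-bit pointer tagging. The following is only a minimal sketch of that technique, not the InnoDB code: ReadViewSketch, tag_closed(), is_closed() and untag() are hypothetical names, and the sketch assumes the pointed-to type has an alignment of at least 2 so that the lowest bit of a valid pointer is always zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a read view; alignment >= 2 is assumed.
struct ReadViewSketch { int placeholder; };

// Mark a view pointer as "closed" by setting its lowest bit (ptr |= 1).
inline ReadViewSketch* tag_closed(ReadViewSketch* v) {
  return reinterpret_cast<ReadViewSketch*>(
      reinterpret_cast<std::uintptr_t>(v) | 1);
}

// A tagged pointer has its lowest bit set.
inline bool is_closed(const ReadViewSketch* v) {
  return (reinterpret_cast<std::uintptr_t>(v) & 1) != 0;
}

// Recover the original pointer by clearing the tag bit.
inline ReadViewSketch* untag(ReadViewSketch* v) {
  return reinterpret_cast<ReadViewSketch*>(
      reinterpret_cast<std::uintptr_t>(v) & ~static_cast<std::uintptr_t>(1));
}
```

A reader that first calls is_closed() and only then uses the pointer is still exposed to the race described in the commit message: nothing stops another thread from changing the pointer between the check and the use.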
commit 95070bf93977bf42b157819edea7573edaf1369e
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Fri Jan 19 21:42:33 2018 +0400
MVCC simplifications
Simplified away MVCC::get_oldest_view()
Simplified away MVCC::get_view()
Removed unused MVCC::view_release()
commit 90bf55673e63bf7c6633598abe52217e42516447
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Fri Jan 19 19:05:43 2018 +0400
Misc trx_sys scalability fixes
trx_erase_lists(): trx->read_view is owned by the current thread and thus
doesn't need trx_sys.mutex protection for reading its value. Moved the
trx->read_view check out of the mutex.
trx_start_low(): moved assertion out of mutex.
Call ReadView::creator_trx_id() directly: this allows the one-line
method to be inlined.
commit 64048bafe0fe5a7e73244929d6e4cae8eebb9a00
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Fri Jan 19 19:11:16 2018 +0400
Removed purge_trx_id_age and purge_view_trx_id_age
These were unused status variables available in debug builds only.
Also removed trx_sys.rw_max_trx_id: not used anymore.
commit db5bb785f9b989b4ca6b1087b77a06b31b5ddf71
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Wed Jan 17 19:43:08 2018 +0400
Allocate trx_sys.mvcc at link time
trx_sys.mvcc was allocated dynamically for no good reason.
commit f8882cce93f9828ec4a5474134d893f8c68d28db
Author: Marko Mäkelä <marko.makela@mariadb.com>
Date: Fri Dec 22 16:15:41 2017 +0200
Replace trx_sys_t* trx_sys with trx_sys_t trx_sys
There is only one transaction system object in InnoDB.
Allocate the storage for it at link time, not at runtime.
lock_rec_fetch_page(): Use the correct fetch mode BUF_GET.
Pages may never be deallocated from a tablespace while
record locks are pointing to them.
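The difference between the two allocation styles (a pointer filled in at startup versus an object with static storage duration) can be sketched as follows. This is an illustration only: trx_sys_sketch_t, trx_sys_ptr, trx_sys_obj and the create/close helpers are hypothetical names, not the InnoDB definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily reduced stand-in for the transaction system object.
struct trx_sys_sketch_t {
  std::uint64_t max_trx_id = 0;
};

// Before: a global pointer; storage is allocated at runtime during startup
// and every access goes through one extra indirection (trx_sys->member).
trx_sys_sketch_t* trx_sys_ptr = nullptr;

void trx_sys_create_old() { trx_sys_ptr = new trx_sys_sketch_t(); }

void trx_sys_close_old() {
  delete trx_sys_ptr;
  trx_sys_ptr = nullptr;
}

// After: a global object; its storage is reserved when the program is
// linked, and members are accessed directly (trx_sys.member).
trx_sys_sketch_t trx_sys_obj;
```

With the object form there is no window in which the global is a null pointer, which is why call sites change from `->` to `.` in the later commits.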
commit 7078203389b04e742de660d78c36034a3a4deb59
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Wed Dec 27 20:07:20 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Use atomic operations when accessing trx_sys_t::max_trx_id. We can't yet
move trx_sys_t::get_new_trx_id() out of the mutex because max_trx_id must be
updated atomically along with trx_sys_t::rw_trx_ids.
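A minimal sketch of the constraint described above, assuming a std::atomic counter and a std::vector standing in for rw_trx_ids. TrxSysSketch and its member functions are hypothetical and do not reproduce the InnoDB implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical reduced model of the transaction system.
struct TrxSysSketch {
  std::atomic<std::uint64_t> max_trx_id{1};
  std::mutex mutex;                        // stands in for trx_sys.mutex
  std::vector<std::uint64_t> rw_trx_ids;   // ids of registered transactions

  // Readers may load the counter without taking the mutex.
  std::uint64_t get_max_trx_id() const {
    return max_trx_id.load(std::memory_order_relaxed);
  }

  // Assigning an id still happens under the mutex, because the counter
  // increment and the rw_trx_ids update must appear as one step.
  std::uint64_t assign_new_trx_id() {
    std::lock_guard<std::mutex> guard(mutex);
    std::uint64_t id = max_trx_id.fetch_add(1, std::memory_order_relaxed);
    rw_trx_ids.push_back(id);
    return id;
  }
};
```

The atomic type only makes the unlocked reads well defined; it does not by itself remove the need for the mutex on the write path.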
commit c6d2842d9a6f46c592d7dfe465bb2b647ebb4d19
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Wed Dec 27 16:23:53 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Remove rw_trx_list.
commit a447980ff3ba000968d89e0c0c16239addeaf438
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Wed Dec 27 15:38:23 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Let lock_print_info_all_transactions() iterate rw_trx_hash instead of
rw_trx_list.
When printing lock information for transactions, the InnoDB monitor no longer
attempts to read the relevant page from disk. That code was prone
to race conditions.
Note that TrxListIterator didn't work as advertised: it iterated
rw_trx_list only.
commit 886af392d301dda720a1585a0e4e550c4d9cef69
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Wed Dec 27 14:24:34 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Let trx_rollback_recovered() iterate rw_trx_hash instead of rw_trx_list.
commit 02270b44d07b78336e0f0d6afe9934587281e056
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Sun Dec 24 21:23:10 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Let lock_validate_table_locks(), lock_rec_other_trx_holds_expl(),
lock_table_locks_lookup(), trx_recover_for_mysql(), trx_get_trx_by_xid(),
trx_roll_must_shutdown(), fetch_data_into_cache() iterate rw_trx_hash
instead of rw_trx_list.
commit d8c0caad320827d4f92a769ece707e2f5d373b98
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Sun Dec 24 19:57:11 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Removed trx_sys_validate_trx_list(): with rw_trx_hash elements are not
required to be ordered by transaction id. Transaction state is now guarded
by asserts in rw_trx_hash_t.
commit 900b07908bf9dbd2c79c3a66fc471e6be4cf0d13
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Wed Dec 27 01:04:08 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Removed trx_sys_t::n_prepared_recovered_trx: never used.
Removed trx_sys_t::n_prepared_trx: it was used only at shutdown, and this
value can be obtained from rw_trx_hash.
commit a0b385ea2b00734b3e06e217abaafd6f9e13f91e
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Tue Dec 26 23:53:38 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Determine the minimum transaction id by iterating rw_trx_hash, not rw_trx_list.
This is more expensive than the previous implementation since it does a linear
search, especially if there are many concurrent transactions running. But in
that case the mutex is a much bigger evil, and since the new approach doesn't
require trx_sys->mutex protection, it scales better.
For low concurrency the performance difference is negligible.
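The trade-off above (a linear scan without a global mutex) can be illustrated with a small fixed-size table of atomic slots. This is only a sketch: RwTrxSlots, its capacity of 8, the use of 0 as the "empty" marker and the fallback parameter are all assumptions made for the example, and the real rw_trx_hash is a different data structure.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical table of active read-write transaction ids; 0 means "empty".
struct RwTrxSlots {
  std::array<std::atomic<std::uint64_t>, 8> ids;

  RwTrxSlots() {
    for (auto& slot : ids) slot.store(0, std::memory_order_relaxed);
  }

  // Claim the first empty slot; returns false if the table is full.
  bool insert(std::uint64_t id) {
    for (auto& slot : ids) {
      std::uint64_t expected = 0;
      if (slot.compare_exchange_strong(expected, id)) return true;
    }
    return false;
  }

  // Release the slot holding the given id, if present.
  void erase(std::uint64_t id) {
    for (auto& slot : ids) {
      std::uint64_t expected = id;
      if (slot.compare_exchange_strong(expected, 0)) return;
    }
  }

  // Linear scan over all slots, no global mutex. Returns the fallback
  // when no transaction is registered.
  std::uint64_t min_id(std::uint64_t fallback) const {
    std::uint64_t result = fallback;
    for (const auto& slot : ids) {
      std::uint64_t v = slot.load(std::memory_order_acquire);
      if (v != 0 && v < result) result = v;
    }
    return result;
  }
};
```

The scan costs time proportional to the table size and is not an atomic snapshot of the table, but concurrent callers never serialize on a shared lock.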
commit 868c77df3ea0dfb5bd9263cf01df731ab147a8b3
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Thu Dec 21 17:20:14 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Replaced UT_LIST_GET_LEN(trx_sys->rw_trx_list) with
trx_sys->rw_trx_hash.size().
Moved freeing of trx objects at shutdown to rw_trx_hash destructor.
Small clean-up in trx_rollback_recovered().
commit d09f14693406ea7612a7010917b39b895d77593f
Author: Sergey Vojtovich <svoj@mariadb.org>
Date: Thu Dec 21 15:45:40 2017 +0400
MDEV-14756 - Remove trx_sys_t::rw_trx_list
Reduce divergence between trx_sys_t::rw_trx_hash and trx_sys_t::rw_trx_list
by not adding recovered COMMITTED transactions to trx_sys_t::rw_trx_list.
Such transactions are discarded immediately without creating trx object.
This also required splitting the rollback and cleanup phases of recovery. To
reflect these changes, the following renames were made:
trx_rollback_or_clean_all_recovered() -> trx_rollback_all_recovered()
trx_rollback_or_clean_is_active -> trx_rollback_is_active
trx_rollback_or_clean_recovered() -> trx_rollback_recovered()
trx_cleanup_at_db_startup() -> trx_cleanup_recovered()
Also removed a hack from lock_trx_release_locks(). Instead, let the recovery
rollback thread skip committed XA transactions.