Details

    Description

      Original finding: there is a history length regression in 10.6.

      Testcase setup:

      sudo apt install python3-pymysql
      

      Then start a fresh server instance with the following options:

      --innodb_flush_method=O_DIRECT --innodb_change_buffering=none --innodb_buffer_pool_size=17179869184
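For reference, the server might be launched as follows (a sketch only; the mariadbd binary location, datadir and port are assumptions for a local build):

```shell
# Assumed layout of a local build; adjust paths and port to your environment.
./bin/mariadbd --no-defaults --datadir=./data --port=10238 \
  --innodb_flush_method=O_DIRECT \
  --innodb_change_buffering=none \
  --innodb_buffer_pool_size=17179869184 &
```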
      

      Start a client and add this table:

      CREATE TABLE `catalog` (
        `uid` BIGINT UNSIGNED NOT NULL,
        `security_uid` INT UNSIGNED,
        `owner` varchar(255) binary NOT NULL default '',
        `viewable_owner` varchar(255) binary NOT NULL default '',
        `path` varchar(255) NOT NULL default '',
        `relative_url` varchar(255) NOT NULL default '',
        `parent_uid` BIGINT UNSIGNED default '0',
        `id` varchar(255) default '',
        `description` text,
        `title` varchar(255) default '',
        `meta_type` varchar(255) default '',
        `portal_type` varchar(255) default '',
        `opportunity_state` varchar(255) default '',
        `corporate_registration_code` varchar(255),
        `ean13_code` varchar(255),
        `validation_state` varchar(255) default '',
        `simulation_state` varchar(255) default '',
        `causality_state` varchar(255) default '',
        `invoice_state` varchar(255) default '',
        `payment_state` varchar(255) default '',
        `event_state` varchar(255) default '',
        `immobilisation_state` varchar(255) default '',
        `reference` varchar(255) binary default '',
        `grouping_reference` varchar(255) default '',
        `grouping_date` datetime,
        `source_reference` varchar(255) default '',
        `destination_reference` varchar(255) default '',
        `string_index` varchar(255),
        `int_index` INT,
        `float_index` real,
        `has_cell_content` bool,
        `creation_date` datetime,
        `modification_date` datetime,
        `indexation_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        PRIMARY KEY  (`uid`),
        KEY `security_uid` (`security_uid`),
        KEY `owner` (`owner`),
        KEY `viewable_owner` (`viewable_owner`),
        KEY `Parent` (`parent_uid`),
        KEY `Path` (`path`),
        KEY `Title` (`title`),
        KEY `Reference` (`reference`),
        KEY `relative_url` (`relative_url`),
        KEY `Portal Type` (`portal_type`, `reference`),
        KEY `opportunity_state` (`opportunity_state`),
        KEY `validation_state_portal_type` (`validation_state`, `portal_type`),
        KEY `simulation_state_portal_type` (`simulation_state`, `portal_type`),
        KEY `causality_state_portal_type` (`causality_state`, `portal_type`),
        KEY `invoice_state` (`invoice_state`),
        KEY `payment_state` (`payment_state`),
        KEY `event_state` (`event_state`)
      ) ENGINE=InnoDB;
      

      Next, find the port of the server (ps -ef | grep $PWD | grep port, or similar) and change the port = setting at the head of test.py, which otherwise reads as follows:

      import threading
      import time
      import pymysql
      import random
       
      host = "127.0.0.1"
      port = 10238
      user = "root"
      password = ""
      database = "test"
      num_threads = 512
      num_items = 2
      query_list = [
        [
          "BEGIN",
          "SELECT uid, path from catalog WHERE path IN (" + ", ".join("'%s'" for _ in range(num_items)) + ")",
          "SELECT uid, path from catalog WHERE uid IN (" + ", ".join("%s" for _ in range(num_items)) + ")",
          "REPLACE INTO catalog (`uid`, `path`) VALUES " + ", ".join("(%s, '%s')" for _ in range(num_items)),
          "COMMIT",
        ],
        [
          "BEGIN",
          "SELECT uid, path from catalog WHERE path IN (" + ", ".join("'%s'" for _ in range(num_items)) + ")",
          "SELECT uid, path from catalog WHERE uid IN (" + ", ".join("%s" for _ in range(num_items)) + ")",
          "COMMIT",
        ],
      ]
       
      connection_list = [
        pymysql.connect(host=host, port=port, user=user, password=password, database=database, cursorclass=pymysql.cursors.DictCursor)
        for _ in range(num_threads + 1)
      ]
       
      def worker(thread_id):
        """
        Worker function for each thread.
        """
        conn = connection_list[thread_id]
        cursor = conn.cursor()
        try:
          for _ in range(100):
            uid_list = [random.randint(1, 10**12) for _ in range(num_items)]
            if thread_id % 2 == 0:
              query = query_list[0]
              args = sum(([e, e, e, e] for e in uid_list), [])
            else:
              query = query_list[1]
              args = sum(([e, e] for e in uid_list), [])
            for q in query:
              num_args = q.count('%s')
              args_ = args[:num_args]
              cursor.execute(q, args_)
              args = args[num_args:]
        except Exception as e:
          print(f"Thread {thread_id}: Error: {e}")
          raise
        finally:
          cursor.close()
       
      if __name__ == "__main__":
        conn = connection_list[-1]
        cursor = conn.cursor()
        for i in range(100):
          threads = []
          start_time = time.time()
          for thread_id in range(num_threads):
              thread = threading.Thread(target=worker, args=(thread_id,))
              threads.append(thread)
              thread.start()
       
          for thread in threads:
              thread.join()
          end_time = time.time()
          print(f"Iteration {i}: executed in {end_time - start_time:.4f} seconds", flush=True)
          cursor.execute("SHOW TABLE STATUS LIKE 'catalog'")
          print(cursor.fetchall(), flush=True)
          cursor.execute("SHOW ENGINE INNODB STATUS")
          print([e for e in cursor.fetchone()['Status'].splitlines() if 'History' in e][0], flush=True)
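
The way worker() distributes one flat argument list across the statements of a transaction, by counting %s placeholders per statement, can be seen in isolation (a standalone sketch, not part of the testcase itself):

```python
# Standalone sketch of the placeholder-counting argument split used in worker().
num_items = 2
query = [
    "BEGIN",
    "SELECT uid, path from catalog WHERE path IN (" + ", ".join("'%s'" for _ in range(num_items)) + ")",
    "SELECT uid, path from catalog WHERE uid IN (" + ", ".join("%s" for _ in range(num_items)) + ")",
    "REPLACE INTO catalog (`uid`, `path`) VALUES " + ", ".join("(%s, '%s')" for _ in range(num_items)),
    "COMMIT",
]
uid_list = [11, 22]
# Each uid appears four times in total across the transaction.
args = sum(([e, e, e, e] for e in uid_list), [])
slices = []
for q in query:
    num_args = q.count('%s')   # placeholders consumed by this statement
    slices.append(args[:num_args])
    args = args[num_args:]
print(slices)  # [[], [11, 11], [11, 11], [22, 22, 22, 22], []]
```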
      

      Note that the script was updated from the original source version. Run it as follows:

      python3 test.py | tee log.txt
      

      You will see per-iteration output. To extract the key metrics from the log, you can use:

      cat log.txt | grep Histor | grep -o '[0-9]\+' | head -n23  # History length
      cat log.txt | grep exec | sed 's|.*in||' | grep -o '[\.0-9]\+' | head -n23  # Execution time
      cat log.txt | grep -o 'Rows.: [0-9]\+' | grep -o '[\.0-9]\+' | head -n23  # Number of rows as per SHOW TABLE STATUS
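
Equivalently, the three metrics can be extracted with a short Python helper (a sketch mirroring the grep commands above; the sample lines follow the formats printed by test.py):

```python
import re

def parse_metrics(log_text):
    """Extract per-iteration execution times, history lengths and row counts
    from the output of test.py as captured in log.txt."""
    exec_times = [float(m) for m in re.findall(r"executed in ([0-9.]+) seconds", log_text)]
    history = [int(m) for m in re.findall(r"History list length (\d+)", log_text)]
    rows = [int(m) for m in re.findall(r"'Rows': (\d+)", log_text)]
    return exec_times, history, rows

# Synthetic sample in the format printed by test.py:
sample = (
    "Iteration 0: executed in 12.3456 seconds\n"
    "[{'Name': 'catalog', 'Engine': 'InnoDB', 'Rows': 51200}]\n"
    "History list length 70000\n"
)
print(parse_metrics(sample))  # ([12.3456], [70000], [51200])
```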
      

      Attached (MDEV-36472_test_results_2.png) is an overview of the first 23 iterations, which shows a significant increase in history length in 10.6.

      Versions used

      CS 10.5.29 c43d0a015f974c5a0142e6779332089a7a979853 (Optimized) Build 15/02/2025
      CS 10.6.22 f1d7e0c17e33f77278e6226dd94aeb30fc856bf0 (Optimized) Build 15/02/2025
      

      The increase in history length appears to correlate with a longer per-iteration duration and fewer rows processed. Based on results observed during test tuning, it is expected that additional tuning of the workload can increase the history length and the other deltas even further.

          Activity

            marko Marko Mäkelä added a comment:

            I think that this will be more interesting to check in a branch that includes a fix of MDEV-21923.
            Roel Roel Van de Paar added a comment (edited):

            10.11-MDEV-21923 yields worse results overall.

            Additional results uploaded as MDEV-36472_test_results_2.png (the previously uploaded results were removed, as they are fully included in the new image).


            marko Marko Mäkelä added a comment:

            The table contains a large number of secondary indexes. In MDEV-34515 I noted that this problem would likely be fixed by implementing MDEV-17598.

            marko Marko Mäkelä added a comment:

            In MDEV-33099, I had written the following:

            Before MDEV-26465 was fixed, some purging of committed transaction history could have been missed if the undo log page had meanwhile been reused for something else and the TRX_UNDO_NEEDS_PURGE field was no longer being skipped. So, you would have been left with some extra garbage in any secondary indexes; https://mariadb.org/fest2022/how-innodb-multi-version-concurrency-control-mvcc-works/ can help understand this.

            Further correctness fixes MDEV-30671 and MDEV-31355 caused more lag to the purge of history. Even though in our internal testing we only reproduced failures when using innodb_undo_log_truncate=ON, I believe that these bugs affect also the "default" case where the undo tablespaces are not being truncated, or the undo logs are being stored in the InnoDB system tablespace. In retrospect, I think that MDEV-14407 was only "fixed", but its actual root cause could have been fixed by MDEV-30671 or MDEV-31355. Had we had https://rr-project.org in use back then, it should have been possible to properly analyze and fix the race condition.


            marko Marko Mäkelä added a comment:

            Apart from implementing MDEV-17598, there could be a simpler solution: implement a parameter such as

            SET GLOBAL innodb_purge_unsafe=100;
            

            which would give the purge permission to remove records that may be accessible to read views in old transactions, while preserving the specified amount of history. The default value could be 0, which would have the special meaning that purge_sys.view will never advance beyond the oldest active read view.

            Upon accessing prematurely purged records, user transactions would return DB_MISSING_HISTORY inside InnoDB, which could be converted to ER_CHECKREAD in the SQL layer. InnoDB uses this old error code when the parameter innodb_snapshot_isolation is enabled (MDEV-35124).
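
            As an illustration only (innodb_purge_unsafe does not exist; the function and variable names below are assumptions based on the description above), the proposed policy could behave roughly like this:

```python
def purge_view_limit(oldest_view_trx_no, newest_committed_trx_no, purge_unsafe):
    """Hypothetical: return the transaction number up to which purge may
    remove history under the proposed innodb_purge_unsafe parameter.

    0 keeps the current behaviour: purge_sys.view never advances beyond the
    oldest active read view. A positive value lets purge run ahead of old
    read views while preserving that many transactions of history; a reader
    that then hits purged history would get DB_MISSING_HISTORY.
    """
    if purge_unsafe == 0:
        return oldest_view_trx_no
    # Run ahead of stale read views, but keep `purge_unsafe` transactions.
    return max(oldest_view_trx_no, newest_committed_trx_no - purge_unsafe)

# A long-running read view pinned at trx 1000 while commits reached trx 5000:
print(purge_view_limit(1000, 5000, 0))    # 1000: stay at the oldest view
print(purge_view_limit(1000, 5000, 100))  # 4900: purge may run ahead
```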


            People

              marko Marko Mäkelä
              Roel Roel Van de Paar