[MXS-4656] Setting session_track_trx_state=true leads to OOM kiled. Created: 2023-06-26  Updated: 2023-07-13  Resolved: 2023-07-13

Status: Closed
Project: MariaDB MaxScale
Component/s: readwritesplit
Affects Version/s: 6.4.7, 22.08.6, 23.02.2
Fix Version/s: 2.5.27, 6.4.8, 22.08.7, 23.02.3, 23.08.0

Type: Bug Priority: Major
Reporter: Alex Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None
Environment:

Debian buster
Linux 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux


Sprint: MXS-SPRINT-186

 Description   

invoking
maxctrl alter service Read-Write-Service session_track_trx_state=true

Leads to OOM kill very fast

Jun 26 18:47:19 MaxScale kernel: [31388711.925490] monitor-task invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jun 26 18:47:20 MaxScale kernel: [31388711.995785] monitor-task cpuset=/ mems_allowed=0
Jun 26 18:47:20 MaxScale kernel: [31388712.030204] CPU: 0 PID: 16447 Comm: monitor-task Not tainted 4.19.0-16-amd64 #1 Debian 4.19.181-1
Jun 26 18:47:20 MaxScale kernel: [31388712.065459] Hardware name: Hetzner vServer, BIOS 20171111 11/11/2017
Jun 26 18:47:20 MaxScale kernel: [31388712.099954] Call Trace:
Jun 26 18:47:20 MaxScale kernel: [31388712.131637]  dump_stack+0x66/0x81
Jun 26 18:47:20 MaxScale kernel: [31388712.162844]  dump_header+0x6b/0x283
Jun 26 18:47:20 MaxScale kernel: [31388712.193261]  ? do_try_to_free_pages+0x2ec/0x370
Jun 26 18:47:20 MaxScale kernel: [31388712.223902]  oom_kill_process.cold.30+0xb/0x1cf
Jun 26 18:47:20 MaxScale kernel: [31388712.254715]  ? oom_badness+0x23/0x140
Jun 26 18:47:20 MaxScale kernel: [31388712.285741]  out_of_memory+0x1a5/0x450
Jun 26 18:47:20 MaxScale kernel: [31388712.316769]  __alloc_pages_slowpath+0xbd8/0xcb0
Jun 26 18:47:20 MaxScale kernel: [31388712.347638]  __alloc_pages_nodemask+0x28b/0x2b0
Jun 26 18:47:20 MaxScale kernel: [31388712.377605]  filemap_fault+0x333/0x780
Jun 26 18:47:20 MaxScale kernel: [31388712.407313]  ? alloc_set_pte+0x49e/0x560
Jun 26 18:47:20 MaxScale kernel: [31388712.436717]  ? filemap_map_pages+0x139/0x3a0
Jun 26 18:47:20 MaxScale kernel: [31388712.466949]  ext4_filemap_fault+0x2c/0x40 [ext4]
Jun 26 18:47:20 MaxScale kernel: [31388712.497334]  __do_fault+0x36/0x130
Jun 26 18:47:20 MaxScale kernel: [31388712.526640]  __handle_mm_fault+0xdf9/0x11f0
Jun 26 18:47:20 MaxScale kernel: [31388712.556075]  ? __sys_recvfrom+0x120/0x180
Jun 26 18:47:20 MaxScale kernel: [31388712.585989]  handle_mm_fault+0xd6/0x200
Jun 26 18:47:20 MaxScale kernel: [31388712.615090]  __do_page_fault+0x249/0x4f0
Jun 26 18:47:20 MaxScale kernel: [31388712.643594]  ? async_page_fault+0x8/0x30
Jun 26 18:47:20 MaxScale kernel: [31388712.672372]  async_page_fault+0x1e/0x30
Jun 26 18:47:20 MaxScale kernel: [31388712.700973] RIP: 0033:0x7f634712c6c0
Jun 26 18:47:20 MaxScale kernel: [31388712.729198] Code: Bad RIP value.
Jun 26 18:47:20 MaxScale kernel: [31388712.757742] RSP: 002b:00007f633cff8288 EFLAGS: 00010206
Jun 26 18:47:20 MaxScale kernel: [31388712.786399] RAX: 0000000000000001 RBX: 00007f633cff8480 RCX: 0000000000000000
Jun 26 18:47:20 MaxScale kernel: [31388712.816569] RDX: 00007f632c008a60 RSI: 00007f632c008a96 RDI: 0000564b44246db8
Jun 26 18:47:20 MaxScale kernel: [31388712.845854] RBP: 00007f633cff85e0 R08: 0000000000000128 R09: 00007f63240131d0
Jun 26 18:47:20 MaxScale kernel: [31388712.875862] R10: 0000000000000000 R11: 0000000000000020 R12: 00007f633cff8390
Jun 26 18:47:20 MaxScale kernel: [31388712.905495] R13: 0000564b44246c00 R14: 0000564b44246db8 R15: 0000000000000000
Jun 26 18:47:20 MaxScale kernel: [31388712.935999] Mem-Info:
Jun 26 18:47:20 MaxScale kernel: [31388712.965715] active_anon:338717 inactive_anon:114404 isolated_anon:0
Jun 26 18:47:20 MaxScale kernel: [31388712.965715]  active_file:250 inactive_file:362 isolated_file:0
Jun 26 18:47:20 MaxScale kernel: [31388712.965715]  unevictable:0 dirty:0 writeback:0 unstable:0
Jun 26 18:47:20 MaxScale kernel: [31388712.965715]  slab_reclaimable:6263 slab_unreclaimable:13930
Jun 26 18:47:20 MaxScale kernel: [31388712.965715]  mapped:583 shmem:95 pagetables:4222 bounce:0
Jun 26 18:47:20 MaxScale kernel: [31388712.965715]  free:13183 free_pcp:353 free_cma:0
Jun 26 18:47:20 MaxScale kernel: [31388713.151059] Node 0 active_anon:1354868kB inactive_anon:457428kB active_file:1000kB inactive_file:1532kB unevictable:0kB isolated(anon):128kB isolated(file):0kB mapped:2332kB dirty:0kB writeback:0kB shmem:380kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1163264kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Jun 26 18:47:20 MaxScale kernel: [31388713.255297] Node 0 DMA free:7908kB min:364kB low:452kB high:540kB active_anon:7232kB inactive_anon:420kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:16kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Jun 26 18:47:20 MaxScale kernel: [31388713.363711] lowmem_reserve[]: 0 1898 1898 1898 1898
Jun 26 18:47:20 MaxScale kernel: [31388713.398590] Node 0 DMA32 free:44320kB min:44688kB low:55860kB high:67032kB active_anon:1347672kB inactive_anon:457196kB active_file:924kB inactive_file:2008kB unevictable:0kB writepending:0kB present:2031472kB managed:1977848kB mlocked:0kB kernel_stack:1552kB pagetables:16872kB bounce:0kB free_pcp:1496kB local_pcp:920kB free_cma:0kB
Jun 26 18:47:20 MaxScale kernel: [31388713.507171] lowmem_reserve[]: 0 0 0 0 0
Jun 26 18:47:20 MaxScale kernel: [31388713.542637] Node 0 DMA: 3*4kB (UM) 13*8kB (UME) 13*16kB (ME) 17*32kB (UE) 16*64kB (UE) 5*128kB (UE) 7*256kB (UE) 3*512kB (UM) 2*1024kB (UE) 0*2048kB 0*4096kB = 7908kB
Jun 26 18:47:20 MaxScale kernel: [31388713.627606] Node 0 DMA32: 588*4kB (ME) 235*8kB (ME) 226*16kB (UME) 240*32kB (UMEH) 183*64kB (UMEH) 93*128kB (UMEH) 20*256kB (UME) 1*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 44776kB
Jun 26 18:47:20 MaxScale kernel: [31388713.721939] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jun 26 18:47:20 MaxScale kernel: [31388713.763180] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 26 18:47:20 MaxScale kernel: [31388713.803222] 3893 total pagecache pages
Jun 26 18:47:20 MaxScale kernel: [31388713.842965] 3193 pages in swap cache
Jun 26 18:47:20 MaxScale kernel: [31388713.882622] Swap cache stats: add 70109477, delete 70106273, find 616847832/632948864
Jun 26 18:47:20 MaxScale kernel: [31388713.924556] Free swap  = 0kB
Jun 26 18:47:20 MaxScale kernel: [31388713.966208] Total swap = 1048572kB
Jun 26 18:47:20 MaxScale kernel: [31388714.006961] 511866 pages RAM
Jun 26 18:47:20 MaxScale kernel: [31388714.048038] 0 pages HighMem/MovableOnly
Jun 26 18:47:20 MaxScale kernel: [31388714.089976] 13427 pages reserved
Jun 26 18:47:20 MaxScale kernel: [31388714.130501] 0 pages hwpoisoned
Jun 26 18:47:20 MaxScale kernel: [31388714.170720] Tasks state (memory values in pages):
Jun 26 18:47:20 MaxScale kernel: [31388714.211255] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Jun 26 18:47:20 MaxScale kernel: [31388714.253719] [    445]     0   445     2372       26    61440      310             0 dhclient
Jun 26 18:47:20 MaxScale kernel: [31388714.296312] [    643]     0   643     5011       55    73728      336             0 systemd-logind
Jun 26 18:47:20 MaxScale kernel: [31388714.339332] [    646]   104   646     2374      145    53248      113          -900 dbus-daemon
Jun 26 18:47:20 MaxScale kernel: [31388714.383225] [    648]     0   648     2110       27    49152       44             0 cron
Jun 26 18:47:20 MaxScale kernel: [31388714.426111] [    671]     0   671     1372        7    45056       39             0 atd
Jun 26 18:47:20 MaxScale kernel: [31388714.468748] [    674]     0   674     3963        3    69632      213         -1000 sshd
Jun 26 18:47:20 MaxScale kernel: [31388714.510998] [    676]     0   676     1337        0    49152       37             0 agetty
Jun 26 18:47:20 MaxScale kernel: [31388714.553148] [    678]     0   678      662        0    45056       32             0 agetty
Jun 26 18:47:20 MaxScale kernel: [31388714.595144] [    685]     0   685     2372       62    53248      263             0 dhclient
Jun 26 18:47:20 MaxScale kernel: [31388714.637283] [    707]     0   707     5135      369    90112     2045             0 munin-node
Jun 26 18:47:20 MaxScale kernel: [31388714.679708] [  22890]     0 22890     5256        0    81920      326             0 systemd
Jun 26 18:47:20 MaxScale kernel: [31388714.722330] [  22891]     0 22891    42921        1   110592      688             0 (sd-pam)
Jun 26 18:47:20 MaxScale kernel: [31388714.765188] [  24853]     0 24853    10921      115   114688      664             0 systemd-journal
Jun 26 18:47:20 MaxScale kernel: [31388714.806430] [  26755]     0 26755     1607       21    49152       39             0 qemu-ga
Jun 26 18:47:20 MaxScale kernel: [31388714.847445] [  26840]     0 26840    55943       45    77824      317             0 rsyslogd
Jun 26 18:47:20 MaxScale kernel: [31388714.888919] [  27643]     0 27643     5454       14    65536      191         -1000 systemd-udevd
Jun 26 18:47:20 MaxScale kernel: [31388714.930191] [   4313]   106  4313    19201       44    61440      110             0 ntpd
Jun 26 18:47:20 MaxScale kernel: [31388714.972298] [  19053]     0 19053     5925      282    77824      301             0 mosh-server
Jun 26 18:47:20 MaxScale kernel: [31388715.014363] [  19054]     0 19054     2108      101    57344      303             0 bash
Jun 26 18:47:20 MaxScale kernel: [31388715.055897] [  16437]   998 16437  2198207   447987 16392192   250588             0 maxscale
Jun 26 18:47:20 MaxScale kernel: [31388715.096734] [  18199]     0 18199     1390       48    57344        0             0 less
Jun 26 18:47:20 MaxScale kernel: [31388715.135826] Out of memory: Kill process 16437 (maxscale) score 923 or sacrifice child
Jun 26 18:47:20 MaxScale kernel: [31388715.174160] Killed process 16437 (maxscale) total-vm:8792828kB, anon-rss:1791948kB, file-rss:0kB, shmem-rss:0kB
Jun 26 18:47:20 MaxScale kernel: [31388715.266777] oom_reaper: reaped process 16437 (maxscale), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Jun 26 18:47:20 MaxScale systemd[1]: maxscale.service: Main process exited, code=killed, status=9/KILL
Jun 26 18:47:20 MaxScale systemd[1]: maxscale.service: Failed with result 'signal'.
Jun 26 18:47:20 MaxScale systemd[1]: maxscale.service: Service RestartSec=100ms expired, scheduling restart.
Jun 26 18:47:20 MaxScale systemd[1]: maxscale.service: Scheduled restart job, restart counter is at 12.
Jun 26 18:47:20 MaxScale systemd[1]: Stopped MariaDB MaxScale Database Proxy.
Jun 26 18:47:20 MaxScale systemd[1]: Starting MariaDB MaxScale Database Proxy...
Jun 26 18:47:20 MaxScale systemd[1]: Started MariaDB MaxScale Database Proxy.

Using readwritesplit

│ Parameters          │ {                                                           │
│                     │     "auth_all_servers": true,                               │
│                     │     "causal_reads": "global",                               │
│                     │     "causal_reads_timeout": "2000ms",                       │
│                     │     "connection_keepalive": "0ms",                          │
│                     │     "connection_timeout": "0ms",                            │
│                     │     "delayed_retry": true,                                  │
│                     │     "delayed_retry_timeout": "30000ms",                     │
│                     │     "disable_sescmd_history": false,                        │
│                     │     "enable_root_user": false,                              │
│                     │     "idle_session_pool_time": "-1ms",                       │
│                     │     "lazy_connect": false,                                  │
│                     │     "localhost_match_wildcard_host": true,                  │
│                     │     "log_auth_warnings": true,                              │
│                     │     "log_debug": false,                                     │
│                     │     "log_info": false,                                      │
│                     │     "log_notice": false,                                    │
│                     │     "log_warning": false,                                   │
│                     │     "master_accept_reads": false,                           │
│                     │     "master_failure_mode": "fail_on_write",                 │
│                     │     "master_reconnection": true,                            │
│                     │     "max_connections": 0,                                   │
│                     │     "max_sescmd_history": 1500,                             │
│                     │     "max_slave_connections": 255,                           │
│                     │     "max_slave_replication_lag": "2000ms",                  │
│                     │     "multiplex_timeout": "60000ms",                         │
│                     │     "net_write_timeout": "0ms",                             │
│                     │     "optimistic_trx": false,                                │
│                     │     "password": "*****",                                    │
│                     │     "prune_sescmd_history": true,                           │
│                     │     "rank": "primary",                                      │
│                     │     "retain_last_statements": -1,                           │
│                     │     "retry_failed_reads": true,                             │
│                     │     "reuse_prepared_statements": false,                     │
│                     │     "router": "readwritesplit",                             │
│                     │     "session_trace": false,                                 │
│                     │     "session_track_trx_state": false,                       │
│                     │     "slave_connections": 1,                                 │
│                     │     "slave_selection_criteria": "LEAST_CURRENT_OPERATIONS", │
│                     │     "strict_multi_stmt": true,                              │
│                     │     "strict_sp_calls": true,                                │
│                     │     "strip_db_esc": true,                                   │
│                     │     "transaction_replay": true,                             │
│                     │     "transaction_replay_attempts": 20,                      │
│                     │     "transaction_replay_checksum": "full",                  │
│                     │     "transaction_replay_max_size": 10485760,                │
│                     │     "transaction_replay_retry_on_deadlock": true,           │
│                     │     "transaction_replay_retry_on_mismatch": false,          │
│                     │     "transaction_replay_timeout": "0ms",                    │
│                     │     "type": "service",                                      │
│                     │     "use_sql_variables_in": "master",                       │
│                     │     "user": "maxscale",                                     │
│                     │     "user_accounts_file": null,                             │
│                     │     "user_accounts_file_usage": "add_when_load_ok",         │
│                     │     "version_string": null                                  │
│                     │ }                                                           │



 Comments   
Comment by markus makela [ 2023-06-27 ]

What version of MariaDB are you using and how have you configured it?

Comment by markus makela [ 2023-06-27 ]

Would it be possible for you to try to profile the memory usage of MaxScale? The instructions for it can be found here: https://mariadb.com/kb/en/maxscale-troubleshooting/#profiling-memory-usage

Comment by markus makela [ 2023-06-27 ]

The value of "max_sescmd_history": 1500 might be a bit on the high side. That means 1500 SQL statements are kept in memory and executed whenever a reconnection occurs. Are you using a connection pool of some sorts? It's likely that you'd be okay with a much smaller value. The default 50 values is probably a good starting point but I'd still recommend looking at the traffic and figuring out how many SET statements there are.

Comment by Alex [ 2023-06-28 ]

MariaDB 10.11.4

Comment by Alex [ 2023-06-28 ]

I am not qualified to do memory profiling. And it is a production system. I do not want to mess around with it.

Comment by markus makela [ 2023-06-28 ]

Have you seen this happen with any other combination of MaxScale and MariaDB?

Comment by Alex [ 2023-06-28 ]

I did not tried.

Comment by Alex [ 2023-06-28 ]

Changed max_sescmd_history to 200 will observe

Comment by Alex [ 2023-06-28 ]

Did not help. Same story. Memory consumption rises rapidly after setting session_track_trx_state=false.

Comment by markus makela [ 2023-07-04 ]

What are the values of the following MariaDB variables?

  • session_track_system_variables
  • session_track_transaction_info
  • session_track_state_change
  • session_track_schema
Comment by markus makela [ 2023-07-04 ]

Does the memory usage keep growing if you disable transaction_replay in readwritesplit?

Comment by markus makela [ 2023-07-04 ]

I think I managed to reproduce this with transaction_replay=true and session_track_trx_state=true and an infinite loop of SET autocommit=0; SET autocommit=1.

Comment by Alex [ 2023-07-04 ]

MariaDB [meshok]> show global variables like 'session_track%';
+--------------------------------+----------------------------------------------------------------------------------------------------+
| Variable_name                  | Value                                                                                              |
+--------------------------------+----------------------------------------------------------------------------------------------------+
| session_track_schema           | ON                                                                                                 |
| session_track_state_change     | ON                                                                                                 |
| session_track_system_variables | autocommit,character_set_client,character_set_connection,character_set_results,time_zone,last_gtid |
| session_track_transaction_info | CHARACTERISTICS                                                                                    |
+--------------------------------+----------------------------------------------------------------------------------------------------+

Comment by Alex [ 2023-07-04 ]

Will test transaction replay later today

Comment by markus makela [ 2023-07-04 ]

Based on the maxctrl output, you have transaction_replay_max_size=10Mi set as the maximum transaction size. If you lower this to transaction_replay_max_size=256Ki, does the memory usage stop growing?

Comment by markus makela [ 2023-07-04 ]

This appears to be caused by some internal changes done in 22.08 that make the size calculations of the stored transaction sizes inaccurate. However the root cause of the problem is in how session_track_trx_state interacts with readwritesplit and how the SET autocommit=0; SELECT 1; SET autocommit=1 cycle is treated as one long contiguous transaction.

Comment by markus makela [ 2023-07-05 ]

I'd recommend disabling session_track_trx_state for now, it does not seem to result in the same transaction state as the query classification based one that's used by default.

Comment by Alex [ 2023-07-06 ]

OK

Comment by markus makela [ 2023-07-06 ]

The next patch releases of all versions will now automatically disable transaction_replay and optimistic_trx if session_track_trx_state is enabled. This is done as they are not compatible with each other and result in other problems than excessive memory usage.

Starting with 23.08.0, the session_track_trx_state parameter will no longer be needed and it will be deprecated. The transaction state will always be updated correctly even if something that's not seen by MaxScale (e.g. stored procedure) causes a transaction to open. The speedup mentioned in the old documentation is probably no longer true as the parsed queries are cached in the query classification cache and the results are reused to track the transaction state. With readwritesplit there never was a performance improvement as it had to do extra work to parse the query.

Generated at Thu Feb 08 04:30:12 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.