[MXS-4620] Investigate excessive GWBUF moves Created: 2023-05-22  Updated: 2023-06-05  Resolved: 2023-06-05

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: None
Fix Version/s: 22.08.7, 23.02.3, 23.08.0

Type: Task Priority: Major
Reporter: Johan Wikman Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None

Sprint: MXS-SPRINT-183

 Comments   
Comment by markus makela [ 2023-05-22 ]

I compiled MaxScale with -O2 and investigated the moves:

  • An EPOLLOUT event is always delivered together with an EPOLLIN event. This is because epoll reports the state of the socket, and a socket that is not blocked is always reported as writable, even if no earlier write had blocked. As a result, the post-write code always runs, even if m_writeq was empty when the function was entered. This can be avoided by executing the post-write code only if data was actually written.
  • DCB::read_impl has two moves, one from m_readq into the return value and one when the return value tuple is constructed. Both mariadb::read_protocol_packet() and MariaDBClientConnection::read_protocol_packet() also recreate the return value tuples. The extra tuple constructions are straightforward to fix but the move from m_readq into the variable isn't that simple.
  • There's one move in RWSplitSession::handle_got_target(), but it's just m_current_query moving the temporary returned by GWBUF::shallow_clone() into itself. Maybe a GWBUF::clone_from(const GWBUF& other) would solve it?
  • The buffer is eventually moved into DCB::m_writeq. This can be avoided if the writeq draining is changed so that only the remainder is moved into m_writeq, and only when there is data left over.
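
The first and last points can be sketched roughly as follows. This is not the DCB code: SketchDcb, FakeSocket and Buffer are hypothetical stand-ins (a std::string instead of a GWBUF, a capacity counter instead of a non-blocking socket), and the two counters exist only to make the behaviour visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for GWBUF: just an owned byte string.
using Buffer = std::string;

// Hypothetical stand-in for a non-blocking socket that accepts at most
// `capacity` more bytes before it would block.
struct FakeSocket
{
    std::size_t capacity = 0;
    std::string received;

    // Returns the number of bytes accepted, like a short write().
    std::size_t write(const char* data, std::size_t len)
    {
        assert(data != nullptr || len == 0);
        std::size_t n = std::min(len, capacity);
        received.append(data, n);
        capacity -= n;
        return n;
    }
};

struct SketchDcb
{
    FakeSocket sock;
    Buffer     writeq;              // plays the role of m_writeq
    int        post_write_runs = 0; // how often the post-write code ran
    int        writeq_moves = 0;    // how often a buffer was moved into writeq

    // Called for every EPOLLOUT. Returns early if nothing was queued, so
    // the post-write code only runs when data was actually written.
    void on_writable()
    {
        if (writeq.empty())
        {
            return;
        }

        std::size_t n = sock.write(writeq.data(), writeq.size());
        writeq.erase(0, n);

        if (n > 0)
        {
            ++post_write_runs;
        }
    }

    // Writes directly from the caller's buffer and moves only the unsent
    // remainder into writeq.
    void write(Buffer&& buf)
    {
        if (!writeq.empty())
        {
            // Earlier data is still pending: append to keep the ordering.
            writeq.append(buf);
            return;
        }

        std::size_t n = sock.write(buf.data(), buf.size());

        if (n < buf.size())
        {
            buf.erase(0, n);
            writeq = std::move(buf);
            ++writeq_moves;
        }
    }
};
```

With a socket that accepts three bytes, writing "hello" sends "hel" and queues "lo" with a single move. A later write that fits completely never touches writeq, and an EPOLLOUT that arrives while writeq is empty does no post-write work.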

I also spotted that the current GWBUF move constructor generates this assembly:

0000000000131130 <GWBUF::GWBUF(GWBUF&&)>:
  131130:       55                      push   %rbp
  131131:       48 89 e5                mov    %rsp,%rbp
  131134:       41 54                   push   %r12
  131136:       49 89 f4                mov    %rsi,%r12
  131139:       53                      push   %rbx
  13113a:       48 89 fb                mov    %rdi,%rbx
  13113d:       e8 9e ab fc ff          call   fbce0 <GWBUF::GWBUF()@plt>
  131142:       4c 89 e6                mov    %r12,%rsi
  131145:       48 89 df                mov    %rbx,%rdi
  131148:       5b                      pop    %rbx
  131149:       41 5c                   pop    %r12
  13114b:       5d                      pop    %rbp
  13114c:       e9 bf 78 fc ff          jmp    f8a10 <GWBUF::move_helper(GWBUF&&)@plt>
  131151:       66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
  131158:       00 00 00 00 
  13115c:       0f 1f 40 00             nopl   0x0(%rax)

Removing the GWBUF() call and manually assigning the debug-only m_owner variable results in this:

0000000000131170 <GWBUF::GWBUF(GWBUF&&)>:
  131170:       48 8d 47 50             lea    0x50(%rdi),%rax
  131174:       c6 47 50 00             movb   $0x0,0x50(%rdi)
  131178:       66 0f ef c0             pxor   %xmm0,%xmm0
  13117c:       48 89 47 40             mov    %rax,0x40(%rdi)
  131180:       48 8d 47 70             lea    0x70(%rdi),%rax
  131184:       48 c7 47 30 00 00 00    movq   $0x0,0x30(%rdi)
  13118b:       00 
  13118c:       48 c7 47 38 00 00 00    movq   $0x0,0x38(%rdi)
  131193:       00 
  131194:       48 c7 47 48 00 00 00    movq   $0x0,0x48(%rdi)
  13119b:       00 
  13119c:       48 89 47 60             mov    %rax,0x60(%rdi)
  1311a0:       48 c7 47 68 00 00 00    movq   $0x0,0x68(%rdi)
  1311a7:       00 
  1311a8:       c6 47 70 00             movb   $0x0,0x70(%rdi)
  1311ac:       48 c7 87 90 00 00 00    movq   $0x0,0x90(%rdi)
  1311b3:       00 00 00 00 
  1311b7:       0f 11 07                movups %xmm0,(%rdi)
  1311ba:       0f 11 47 10             movups %xmm0,0x10(%rdi)
  1311be:       0f 11 47 20             movups %xmm0,0x20(%rdi)
  1311c2:       0f 11 87 80 00 00 00    movups %xmm0,0x80(%rdi)
  1311c9:       e9 52 78 fc ff          jmp    f8a20 <GWBUF::move_helper(GWBUF&&)@plt>
  1311ce:       66 90                   xchg   %ax,%ax

GWBUF::move_helper adds roughly another 300 instructions on top of either of these two versions. Move-initializing all the member variables without first value-initializing them generates about 170 instructions in total on my machine. This looks like a worthwhile change and should make move-constructing a GWBUF a lighter operation.
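
As a rough illustration of what move-initializing without value-initializing first means, here is a minimal sketch. SketchBuf and its members are hypothetical and do not reflect GWBUF's actual layout. The point is only that each member is initialized once, straight from the source object, in the member initializer list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical buffer type; the members are illustrative and are not
// GWBUF's actual layout.
class SketchBuf
{
public:
    explicit SketchBuf(std::string payload)
        : m_data(std::make_shared<std::vector<uint8_t>>(payload.begin(), payload.end()))
        , m_sql(std::move(payload))
    {
        m_start = m_data->data();
        m_end = m_start + m_data->size();
    }

    // Every member is move-initialized directly in the initializer list,
    // so nothing is value-initialized first and then overwritten.
    SketchBuf(SketchBuf&& rhs) noexcept
        : m_data(std::move(rhs.m_data))
        , m_sql(std::move(rhs.m_sql))
        , m_start(std::exchange(rhs.m_start, nullptr))
        , m_end(std::exchange(rhs.m_end, nullptr))
        , m_id(std::exchange(rhs.m_id, 0))
    {
    }

    std::size_t length() const
    {
        assert(m_end >= m_start);
        return static_cast<std::size_t>(m_end - m_start);
    }

    bool empty() const
    {
        return m_start == m_end;
    }

    const std::string& sql() const
    {
        return m_sql;
    }

private:
    // Members are initialized in declaration order, so m_data is built
    // from the payload before m_sql takes ownership of it.
    std::shared_ptr<std::vector<uint8_t>> m_data;
    std::string m_sql;
    uint8_t*    m_start = nullptr;
    uint8_t*    m_end = nullptr;
    uint32_t    m_id = 0;
};
```

Because the initializer list names every member, the default member initializers are skipped in the move constructor, and the moved-from object is left empty by the std::exchange calls.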
