I compiled MaxScale with -O2 and investigated the moves:
- There's always an EPOLLOUT event whenever there's an EPOLLIN event. This is because epoll reports the state of the socket, and a socket that isn't blocked is always writable, even if it had not been blocked before. As a result, the post-write code always gets executed, even if m_writeq was empty when the function was entered. This can be avoided by executing the post-write code only if data was actually written.
- DCB::read_impl has two moves, one from m_readq into the return value and one when the return value tuple is constructed. Both mariadb::read_protocol_packet() and MariaDBClientConnection::read_protocol_packet() also recreate the return value tuples. The extra tuple constructions are straightforward to fix but the move from m_readq into the variable isn't that simple.
- There's one in RWSplitSession::handle_got_target(), but it's just m_current_query being move-assigned from the temporary returned by GWBUF::shallow_clone(). Maybe a GWBUF::clone_from(const GWBUF& other) would solve it?
- The buffer is eventually moved into DCB::m_writeq. This can be avoided if the writeq draining is changed to move the remainder into m_writeq only if there is data left over.
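As a rough sketch of the first and last points: the write path could try the socket first and touch m_writeq only when something is left over, and the post-write code could run only when bytes actually went out. FakeSocket, Writer, on_writable() and post_write() are made-up names for illustration, and std::string stands in for GWBUF; this is not the actual DCB code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a non-blocking socket that accepts at most
// `capacity` more bytes before it would block.
struct FakeSocket
{
    std::size_t capacity;
    std::string sent;

    // Returns the number of bytes accepted, like a non-blocking write().
    std::size_t write(const char* data, std::size_t len)
    {
        std::size_t n = len < capacity ? len : capacity;
        sent.append(data, n);
        capacity -= n;
        return n;
    }
};

// Hypothetical, simplified DCB-like writer (assumption, not MaxScale's class).
class Writer
{
public:
    explicit Writer(FakeSocket& sock)
        : m_sock(sock)
    {
    }

    // Tries the socket first and buffers only the unwritten tail.
    void write(std::string&& buf)
    {
        if (!m_writeq.empty())
        {
            m_writeq += buf;    // older data is still pending, keep the order
            return;
        }

        std::size_t n = m_sock.write(buf.data(), buf.size());

        if (n < buf.size())
        {
            buf.erase(0, n);
            m_writeq = std::move(buf);  // moved only if data was left over
        }

        if (n > 0)
        {
            post_write();
        }
    }

    // Handler for an EPOLLOUT event.
    void on_writable()
    {
        if (m_writeq.empty())
        {
            return;     // EPOLLOUT that came along with EPOLLIN: nothing to do
        }

        std::size_t n = m_sock.write(m_writeq.data(), m_writeq.size());
        assert(n <= m_writeq.size());
        m_writeq.erase(0, n);

        if (n > 0)
        {
            post_write();   // runs only when data was actually written
        }
    }

    const std::string& writeq() const
    {
        return m_writeq;
    }

    int post_write_calls = 0;

private:
    void post_write()
    {
        ++post_write_calls;
    }

    FakeSocket& m_sock;
    std::string m_writeq;
};
```

With this shape a write that fits into the socket never touches m_writeq at all, and an EPOLLOUT that arrives with an empty queue returns immediately.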
I also spotted that the current GWBUF move constructor generates this assembly:
0000000000131130 <GWBUF::GWBUF(GWBUF&&)>:
  131130: 55                    push %rbp
  131131: 48 89 e5              mov %rsp,%rbp
  131134: 41 54                 push %r12
  131136: 49 89 f4              mov %rsi,%r12
  131139: 53                    push %rbx
  13113a: 48 89 fb              mov %rdi,%rbx
  13113d: e8 9e ab fc ff        call fbce0 <GWBUF::GWBUF()@plt>
  131142: 4c 89 e6              mov %r12,%rsi
  131145: 48 89 df              mov %rbx,%rdi
  131148: 5b                    pop %rbx
  131149: 41 5c                 pop %r12
  13114b: 5d                    pop %rbp
  13114c: e9 bf 78 fc ff        jmp f8a10 <GWBUF::move_helper(GWBUF&&)@plt>
  131151: 66 66 2e 0f 1f 84 00  data16 cs nopw 0x0(%rax,%rax,1)
  131158: 00 00 00 00
  13115c: 0f 1f 40 00           nopl 0x0(%rax)
Removing the GWBUF() call and manually assigning the debug-only m_owner variable results in this:
0000000000131170 <GWBUF::GWBUF(GWBUF&&)>:
  131170: 48 8d 47 50           lea 0x50(%rdi),%rax
  131174: c6 47 50 00           movb $0x0,0x50(%rdi)
  131178: 66 0f ef c0           pxor %xmm0,%xmm0
  13117c: 48 89 47 40           mov %rax,0x40(%rdi)
  131180: 48 8d 47 70           lea 0x70(%rdi),%rax
  131184: 48 c7 47 30 00 00 00  movq $0x0,0x30(%rdi)
  13118b: 00
  13118c: 48 c7 47 38 00 00 00  movq $0x0,0x38(%rdi)
  131193: 00
  131194: 48 c7 47 48 00 00 00  movq $0x0,0x48(%rdi)
  13119b: 00
  13119c: 48 89 47 60           mov %rax,0x60(%rdi)
  1311a0: 48 c7 47 68 00 00 00  movq $0x0,0x68(%rdi)
  1311a7: 00
  1311a8: c6 47 70 00           movb $0x0,0x70(%rdi)
  1311ac: 48 c7 87 90 00 00 00  movq $0x0,0x90(%rdi)
  1311b3: 00 00 00 00
  1311b7: 0f 11 07              movups %xmm0,(%rdi)
  1311ba: 0f 11 47 10           movups %xmm0,0x10(%rdi)
  1311be: 0f 11 47 20           movups %xmm0,0x20(%rdi)
  1311c2: 0f 11 87 80 00 00 00  movups %xmm0,0x80(%rdi)
  1311c9: e9 52 78 fc ff        jmp f8a20 <GWBUF::move_helper(GWBUF&&)@plt>
  1311ce: 66 90                 xchg %ax,%ax
GWBUF::move_helper adds roughly 300 instructions on top of either of these two listings. Move-initializing all the member variables directly, without value-initializing them first, generates about 170 instructions on my machine. This seems like a worthwhile change and should make move-constructing a GWBUF a lighter operation.
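To illustrate what I mean by move-initializing the members directly, here is a minimal sketch. Buf and its members (m_data, m_sql, m_id) are invented for the example and do not reflect the real GWBUF layout; the point is only that the move constructor's initializer list takes each member straight from the source object instead of default-constructing first and then calling a helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for GWBUF; the members are assumptions made for
// illustration only.
class Buf
{
public:
    using Data = std::vector<uint8_t>;

    Buf() = default;

    Buf(std::shared_ptr<Data> data, std::string sql, uint32_t id)
        : m_data(std::move(data))
        , m_sql(std::move(sql))
        , m_id(id)
    {
        assert(m_data);     // a buffer built this way must have storage
    }

    // Each member is move-initialized exactly once. Nothing is
    // value-initialized first and then overwritten by a helper.
    Buf(Buf&& rhs) noexcept
        : m_data(std::move(rhs.m_data))
        , m_sql(std::move(rhs.m_sql))
        , m_id(std::exchange(rhs.m_id, 0))
    {
    }

    std::size_t size() const
    {
        return m_data ? m_data->size() : 0;
    }

    const std::string& sql() const
    {
        return m_sql;
    }

    uint32_t id() const
    {
        return m_id;
    }

private:
    std::shared_ptr<Data> m_data;
    std::string           m_sql;
    uint32_t              m_id = 0;
};
```

Scalar members use std::exchange so the moved-from object is left in a defined state, while the shared_ptr is guaranteed to be null after the move. How much this saves for the real class would have to be checked from the generated assembly, as above.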