Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25764

SELECT binlog_gtid_pos takes a very long time when binary log is encrypted

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.2.27
    • 11.4.1
    • Encryption, Replication
    • None
    • AWS t2.xlarge w/30,000 IOPS io2 provisioned SSD running CentOS 7 w/mariadb-10.2.27-bintar-centos7-amd64-128indexes-openssl-2020-06-19 installed

    Description

      When connecting a replica to a primary with an encrypted binary log (MDEV-25709 means this happens even with GTID disabled) or when running SELECT binlog_gtid_pos for other reasons, if the target event position is not near the start of the target binary log, the query will take a very long time to run.

      The query seems to spend a lot of time setting-up, using, and then destroying a cryptographic context for each event parsed in the binary log. For example-

      Thread 2 (Thread 0x7fe973c6d700 (LWP 13890)):
      #0  0x000055bfbfa141e8 in MyCTX_nopad::finish (this=0x7fe973c694f0, dst=0x7fe9900b75bc "", dlen=0x7fe973c69730) at /home/buildbot/buildbot/build/mysys_ssl/my_crypt.cc:134
              i = 14
              buf = 0x7fe973c69530 "\005>\240\066\366l\016\265\320T\211\033\240", <incomplete sequence \325>
              mask = "\005=\243\071\344o#\265\320V\177\342\240nl0"
              mlen = 16
      #1  0x000055bfbfa13990 in my_aes_crypt_finish (ctx=0x7fe973c694f0, dst=<optimized out>, dlen=<optimized out>) at /home/buildbot/buildbot/build/mysys_ssl/my_crypt.cc:278
              res = <optimized out>
      #2  0x000055bfbf5ae53d in encryption_crypt (src=<optimized out>, slen=slen@entry=46, dst=dst@entry=0x7fe9900b759c "\023d", dlen=dlen@entry=0x7fe973c697bc, key=key@entry=0x7fe9900662c8 "\253\225x", klen=<optimized out>, iv=iv@entry=0x7fe973c697c0 "\373\235F\272\031ا\036SK\230Q\362\064\342\004e\207\247}G]\212\265Z2", key_version=1, key_id=1, flags=2, ivlen=16) at /home/buildbot/buildbot/build/include/mysql/service_encryption.h:120
              ctx = 0x7fe973c694f0
              res2 = <optimized out>
              d2 = 64
              res1 = 0
              d1 = 32
      #3  0x000055bfbf5b7b06 in Log_event::read_log_event (file=file@entry=0x7fe973c69a00, packet=packet@entry=0x7fe973c698d0, fdle=fdle@entry=0x7fe9900661f8, checksum_alg_arg=BINLOG_CHECKSUM_ALG_OFF) at /home/buildbot/buildbot/build/sql/log_event.cc:1845
              dstlen = 84
              dst = 0x7fe9900b7598 "\031\336h[\023d"
              iv = "\373\235F\272\031ا\036SK\230Q\362\064\342\004"
              newpkt = 0x7fe9900b7598 "\031\336h[\023d"
              src = <optimized out>
              data_len = 50
              buf = "e\207\247}G]\212\265Z2\000\000\000Fu\256\002\340", <incomplete sequence \317>
              ev_offset = 0 '\000'
              max_allowed_packet = <optimized out>
      #4  0x000055bfbf33c071 in gtid_state_from_pos (gtid_state=0x7fe973c698f0, offset=1020005351, name=0x7fe973c69cf0 "/var/lib/mysql/mysql-binlog.000003") at /home/buildbot/buildbot/build/sql/sql_repl.cc:1384
              typ = <optimized out>
              cur_pos = <optimized out>
              errormsg = 0x0
              found_gtid_list_event = true
              current_checksum_alg = BINLOG_CHECKSUM_ALG_CRC32
              err = <optimized out>
              packet = {<Sql_alloc> = {<No data fields>}, Ptr = 0x7fe99005e208 "e\207\247}G]\212\265Ze\207\247}Fu\256\002\340ϱpG\333]*2\276\301\271\020=!Qʂ\355\005>\240\066\366l\016\265\320T\211\033\240", <incomplete sequence \325>, str_length = 50, Alloced_length = 89, extra_alloc = 0, alloced = true, thread_specific = false, str_charset = 0x55bfc029ee60 <my_charset_bin>}
              fdev = 0x7fe9900661f8
              file = 45
              valid_pos = <optimized out>
              cache = {pos_in_file = 81928192, end_of_file = 18446744073709551615, read_pos = 0x7fe9900579fc "\350\357\254ӫ\320\376?\301Y", read_end = 0x7fe9900584d8 "\021\027", buffer = 0x7fe9900564d8 "\276O*\246\200\353\\\302\"&ƳEXG\234\017\r\266H$Y", request_pos = 0x7fe9900564d8 "\276O*\246\200\353\\\302\"&ƳEXG\234\017\r\266H$Y", write_buffer = 0x7fe9900564d8 "\276O*\246\200\353\\\302\"&ƳEXG\234\017\r\266H$Y", append_read_pos = 0x55bfc0321960 <mdl_locks> "\340\223\321\300\277U", write_pos = 0x7fe9900564d8 "\276O*\246\200\353\\\302\"&ƳEXG\234\017\r\266H$Y", write_end = 0x0, append_buffer_lock = {m_mutex = {__data = {__lock = 0, __count = 0, __owner = 1942395872, __nusers = 32745, __kind = 1942395616, __spins = 32745, __elision = 0, __list = {__prev = 0x7fe973c69ad0, __next = 0x7fe973c69ac0}}, __size = "\000\000\000\000\000\000\000\000\340\233\306s\351\177\000\000\340\232\306s\351\177\000\000К\306s\351\177\000\000\300\232\306s\351\177\000", __align = 0}, m_psi = 0x7fe973c69ab0}, share = 0x0, read_function = 0x55bfbfa1b060 <_my_b_cache_read>, write_function = 0x55bfbfa1a990 <_my_b_cache_write>, type = READ_CACHE, disk_writes = 0, file_name = 0x73c6a160 <Address 0x73c6a160 out of bounds>, dir = 0x7fe990150f68 "", prefix = "\000\000", file = 45, next_file_user = 0x0, seek_not_done = 0, error = 0, buffer_length = 8192, read_length = 8192, myflags = 528, alloced_buffer = 1 '\001'}
              found_format_description_event = true
      #5  gtid_state_from_binlog_pos (in_name=<optimized out>, pos=pos@entry=1020005351, out_str=out_str@entry=0x7fe973c6a070) at /home/buildbot/buildbot/build/sql/sql_repl.cc:1543
              gtid_state = {static START_OWN_SLAVE_POS = 1, static START_ON_EMPTY_DOMAIN = 2, hash = {key_offset = 0, key_length = 4, blength = 2, records = 1, flags = 1, array = {buffer = 0x7fe99013b3f8 "\377\377\377\377\001", elements = 1, max_element = 32, alloc_increment = 64, size_of_element = 16, malloc_flags = 0}, get_key = 0x0, hash_function = 0x55bfbfa19010 <my_hash_sort>, free = 0x55bfbfa38010 <my_free>, charset = 0x55bfc029ee60 <my_charset_bin>}, gtid_sort_array = {buffer = 0x7fe99013b608 "x", elements = 0, max_element = 8, alloc_increment = 8, size_of_element = 16, malloc_flags = 0}}
              lookup_name = <optimized out>
              name_buf = "/var/lib/mysql/mysql-binlog.000003\000\000\000\000\000\000\000\000\000-v\243\267A\000\000\000\000\000\000\000\000\b\027\025\220\351\177\000\000@\035\025\220\351\177\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\b\201\177,\352\177\000\000P\360\065\300\277U\000\000\020\201\177,\352\177\000\000\b\201\177,\352\177\000\000\000\000\000\000\000\000\000\000\340\233\306s\351\177\000\000F]{\277\277U", '\000' <repeats 26 times>, "@\360\065\300\277U\000\000\000\234\306s\351\177\000\000\000\000\000-v\243\267A\001\000\000\000>\000\000\000"...
              linfo = {log_file_name = "/var/lib/mysql/mysql-binlog.000003", '\000' <repeats 30 times>, ">\000\000\000\000\000\000\000h\017\025\220\351\177\000\000\060\237\306s\351\177\000\000,#5\277\277U\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\b\027\025\220\351\177\000\000\377\377\377\377\000\000\000\000p\037\025\220\351\177\000\000\000\000\000\000\376\377\377\377", '\000' <repeats 24 times>, "\001\000\000\000\000\000\000\000\300\235\306s\351\177\000\000\300\235\306s\351\177\000\000\370\001", '\000' <repeats 22 times>..., index_file_offset = 105, index_file_start_offset = 70, pos = 0, fatal = false, lock = {m_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 3, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\003", '\000' <repeats 22 times>, __align = 0}, m_psi = 0x0}}
      #6  0x000055bfbf5562b0 in Item_func_binlog_gtid_pos::val_str (this=0x7fe990150850, str=0x7fe973c6a070) at /home/buildbot/buildbot/build/sql/item_strfunc.cc:3112
              name_str = {<Sql_alloc> = {<No data fields>}, Ptr = 0x0, str_length = 0, Alloced_length = 0, extra_alloc = 0, alloced = false, thread_specific = false, str_charset = 0x55bfc029ee60 <my_charset_bin>}
              name = 0x7fe9901506e0
              pos = <optimized out>
      #7  0x000055bfbf4d017f in Item::send (this=0x7fe990150850, protocol=0x7fe99005ea48, buffer=0x7fe973c6a070) at /home/buildbot/buildbot/build/sql/item.cc:6904
              res = <optimized out>
              result = <optimized out>
              f_type = MYSQL_TYPE_LONG_BLOB
      #8  0x000055bfbf295ffb in Protocol::send_result_set_row (this=this@entry=0x7fe99005ea48, row_items=row_items@entry=0x7fe9900627c0) at /home/buildbot/buildbot/build/sql/protocol.cc:990
              item = <optimized out>
              buffer = "\376\000\000\002\000-3400017\000\000\000\001\000\000\000\351\177\000\000\000\000\000\000\000\177\000\000\315r\324\061\352\177", '\000' <repeats 18 times>, "\377\377\377\377\377\377\377\377\315r\324\061\352\177\000\000\001\000\000\000\352\177\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000 \241\306s\351\177\000\000\000\000\000\000\000\000\000\000\034j\302\060\352\177\000\000\000\000\000\000\000\000\000\000 ,\000\000\000\000\000\000p\r\000\000\000\000\000\000!j\302\060\352\177\000\000P\b\025\220\351\177\000\000\000,\000\000\000\000\000\000\020\t\025\220\351\177\000\000P\241\306s\351\177\000\000\000\000\000\000\000\000\000\000\030\345\005\220\351\177", '\000' <repeats 18 times>...
              str_buffer = {<Sql_alloc> = {<No data fields>}, Ptr = 0x7fe973c6a090 "\376", str_length = 766, Alloced_length = 766, extra_alloc = 0, alloced = false, thread_specific = false, str_charset = 0x55bfc029ee60 <my_charset_bin>}
              it = {<base_list_iterator> = {list = 0x7fe9900627c0, el = 0x7fe990150910, prev = 0x0, current = 0x0}, <No data fields>}
      #9  0x000055bfbf2ef463 in select_send::send_data (this=0x7fe9901509e8, items=...) at /home/buildbot/buildbot/build/sql/sql_class.cc:2726
              protocol = 0x7fe99005ea48
      #10 0x000055bfbf37a5a4 in JOIN::exec_inner (this=this@entry=0x7fe990150a08) at /home/buildbot/buildbot/build/sql/sql_select.cc:3503
              columns_list = 0x7fe9900627c0
              __FUNCTION__ = "exec_inner"
      #11 0x000055bfbf37a753 in JOIN::exec (this=this@entry=0x7fe990150a08) at /home/buildbot/buildbot/build/sql/sql_select.cc:3422
      No locals.
      #12 0x000055bfbf378c6a in mysql_select (thd=thd@entry=0x7fe99005e518, tables=0x0, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=result@entry=0x7fe9901509e8, unit=unit@entry=0x7fe990061f58, select_lex=select_lex@entry=0x7fe990062698) at /home/buildbot/buildbot/build/sql/sql_select.cc:3822
              err = 0
              free_join = true
              join = 0x7fe990150a08
              __FUNCTION__ = "mysql_select"
      #13 0x000055bfbf3797c4 in handle_select (thd=thd@entry=0x7fe99005e518, lex=lex@entry=0x7fe990061e90, result=result@entry=0x7fe9901509e8, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/buildbot/buildbot/build/sql/sql_select.cc:377
              unit = 0x7fe990061f58
              res = <optimized out>
              select_lex = 0x7fe990062698
      #14 0x000055bfbf254c29 in execute_sqlcom_select (thd=thd@entry=0x7fe99005e518, all_tables=0x0) at /home/buildbot/buildbot/build/sql/sql_parse.cc:6226
              save_protocol = 0x0
              lex = 0x7fe990061e90
              result = 0x7fe9901509e8
              res = <optimized out>
      #15 0x000055bfbf31eb2a in mysql_execute_command (thd=thd@entry=0x7fe99005e518) at /home/buildbot/buildbot/build/sql/sql_parse.cc:3533
              privileges_requested = <optimized out>
              res = <optimized out>
              lex = 0x7fe990061e90
              orig_binlog_format = <optimized out>
              up_result = 0
              rpl_filter = <optimized out>
              orig_current_stmt_binlog_format = <optimized out>
              __FUNCTION__ = "mysql_execute_command"
              select_lex = 0x7fe990062698
              first_table = <optimized out>
              all_tables = 0x0
              unit = 0x7fe990061f58
              have_table_map_for_update = <optimized out>
      #16 0x000055bfbf32512e in mysql_parse (thd=0x7fe99005e518, rawbuf=<optimized out>, length=<optimized out>, parser_state=0x7fe973c6c100, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /home/buildbot/buildbot/build/sql/sql_parse.cc:7760
              found_semicolon = <optimized out>
              lex = 0x7fe990061e90
              err = <optimized out>
              error = <optimized out>
              __FUNCTION__ = "mysql_parse"
      #17 0x000055bfbf32828d in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7fe99005e518, packet=packet@entry=0x7fe990159fc9 "", packet_length=packet_length@entry=56, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1832
              packet_end = 0x7fe9901505d8 ""
              parser_state = {m_input = {m_compute_digest = false}, m_lip = {m_thd = 0x7fe99005e518, yylineno = 1, yylval = 0x7fe973c6b260, lookahead_token = -1, lookahead_yylval = 0x0, m_ptr = 0x7fe9901505d9 "", m_tok_start = 0x7fe9901505d9 "", m_tok_end = 0x7fe9901505d9 "", m_end_of_query = 0x7fe9901505d8 "", m_tok_start_prev = 0x7fe9901505d8 "", m_buf = 0x7fe9901505a0 "SELECT binlog_gtid_pos('mysql-binlog.000003',1020005351)", m_buf_length = 56, m_echo = true, m_echo_saved = false, m_cpp_buf = 0x7fe990150638 "SELECT binlog_gtid_pos('mysql-binlog.000003',1020005351)", m_cpp_ptr = 0x7fe990150670 "", m_cpp_tok_start = 0x7fe990150670 "", m_cpp_tok_start_prev = 0x7fe990150670 "", m_cpp_tok_end = 0x7fe990150670 "", m_body_utf8 = 0x0, m_body_utf8_ptr = 0x0, m_cpp_utf8_processed_ptr = 0x0, next_state = MY_LEX_END, found_semicolon = 0x0, tok_bitmap = 127 '\177', ignore_space = false, stmt_prepare_mode = false, multi_statements = true, in_comment = NO_COMMENT, in_comment_saved = (unknown: 1942405792), m_cpp_text_start = 0x7fe990150665 "1020005351)", m_cpp_text_end = 0x7fe99015066f ")", m_underscore_cs = 0x0, m_digest = 0x0}, m_yacc = {yacc_yyss = 0x0, yacc_yyvs = 0x0, m_set_signal_info = {m_item = {0x0 <repeats 12 times>}}, m_lock_type = TL_READ_DEFAULT, m_mdl_type = MDL_SHARED_READ}, m_digest_psi = 0x0}
              net = 0x7fe99005e778
              do_end_of_statement = true
              __FUNCTION__ = "dispatch_command"
              error = false
              drop_more_results = false
      #18 0x000055bfbf328ea5 in do_command (thd=0x7fe99005e518) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1386
              return_value = <optimized out>
              packet = 0x7fe990159fc8 "\001"
              packet_length = 57
              net = 0x7fe99005e778
              command = COM_QUERY
      #19 0x000055bfbf3f87b1 in do_handle_one_connection (connect=connect@entry=0x55bfc145fb18) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1336
              create_user = true
              thr_create_utime = <optimized out>
              thd = 0x7fe99005e518
      #20 0x000055bfbf3f887d in handle_one_connection (arg=0x55bfc145fb18) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1241
              connect = 0x55bfc145fb18
      #21 0x00007fea33842ea5 in start_thread () from /lib64/libpthread.so.0
      No symbol table info available.
      #22 0x00007fea31df89fd in clone () from /lib64/libc.so.6
      No symbol table info available.
      

      Due to MDEV-4991, when the event needed is near the back of the binary log, everything from the start to almost the back must be read, making for a lot of cryptographic overhead. On OpenSSL builds running version v1.0.2 and earlier (common because OpenSSL v1.0.2 is the last build from the OpenSSL foundation which has a FIPS compliance module- the next one will be for OpenSSL v3.x which is currently in alpha), this behavior seems to heavily exacerbate problematic OpenSSL "locking" behavior, which this blog posts expands upon in a data-in-transit context-
      https://christarazi.me/blog/impact-locks-and-lockless/

      Running this query on an idle MariaDB server with an event towards the end of the target, encrypted binary log can already take 18+ seconds. When even low-volume, simple SELECT activity occurs alongside this where such activity uses data-in-transit encryption, the time to complete this query can balloon exponentially- from dissecting thread dumps, this ballooning seems to be due to a large number of OpenSSL contending with each other.

      Running this query against an unencrypted binary log completes, on the same VM, in 0.3 seconds, and shows no, significant increase due to other activity in MariaDB.

      While explicit testing has not yet been done, the problem is hypothesized to not occur on YaSSL builds. Likewise, testing is planned to be done on OpenSSL v1.1.1 as well to see if that improves the situation (current effort was thwarted as MariaDB Community RPMs for CentOS 7 are compiled on OpenSSL v1.0.2 systems and initiate OpenSSL v1.0.2 fallback mode when running on an OpenSSL v1.1.1 system, accordingly not taking advantage of OpenSSL v1.1.1 codepaths; also of note that OpenSSL v1.1.1 is not FIPS-compliant for CentOS 7 users and is therefore not a viable solution for many users).

      This bug is being opened as even if different SSL libraries can alleviate the worst of this issue, there is strong likelihood a bug or underlying performance problem in MariaDB is also contributing to this. And again, for CentOS 7 users, there is no way around OpenSSL v1.0.2 for FIPS compliance, so we need a solution or work-around for that use-case.

      Larger context of why this is a stability problem

      Even if MDEV-25709 gets fixed, users who require GTID will still have SELECT binlog_gtid_pos be run when connecting/reconnecting a replica (and until MDEV-25709 gets fixed, even users who are not using GTID are impacted). Even when the primary is idle, it can still take 1/3 the default setting of slave_net_timeout to complete just this query, let alone other connection/reconnection operations. When the primary is anything but idle, this situation only worsens, to the point where already users have needed to increase slave_net_timeout to get around this. However, the upper limit on what slave_net_timeout needs to be is unknown and could be indefinite and impossible to satisfy. Secondly, causing such long stalls upfront can also cause a cascade of other issues regarding high availability (especially as the long establishment times are not due to the replica doing things on its end, but rather because a long-running, contending query is occurring on the primary, which means the act of hooking-up a replica to improve high-availability/disaster recovery can actually induce a disaster recovery-necessitating scenario if the long-running query overloads or stalls the primary).

      Attachments

        Issue Links

          Activity

            People

              bnestere Brandon Nesterenko
              rob.schwyzer@mariadb.com Rob Schwyzer
              Votes:
              1 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.