Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26293

InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started

Details

    Description

      MariaDB has crashed due to the following error. Is it possible to know the exact reason for this crash? Attached mariadbd dump and database error file for reference.

      2021-08-01 04:09:31 0x11d8  InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488
      InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      210801  4:09:31 [ERROR] mysqld got exception 0x80000003 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.5.9-MariaDB-log
      key_buffer_size=33554432
      read_buffer_size=8388608
      max_used_connections=64
      max_threads=65537
      thread_count=56
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 52371 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x274c52f9038
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      server.dll!my_parameter_handler()
      ucrtbase.dll!raise()
      ucrtbase.dll!abort()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!json_valid()
      server.dll!?ha_open@handler@@QEAAHPEAUTABLE@@PEBDHIPEAUst_mem_root@@PEAV?$List@VString@@@@@Z()
      server.dll!?open_table_from_share@@YA?AW4open_frm_error@@PEAVTHD@@PEAUTABLE_SHARE@@PEBUst_mysql_const_lex_string@@IIIPEAUTABLE@@_NPEAV?$List@VString@@@@@Z()
      server.dll!?open_table@@YA_NPEAVTHD@@PEAUTABLE_LIST@@PEAVOpen_table_context@@@Z()
      server.dll!?open_and_lock_tables@@YA_NPEAVTHD@@AEBUDDL_options_st@@PEAUTABLE_LIST@@_NIPEAVPrelocking_strategy@@@Z()
      server.dll!?open_tables@@YA_NPEAVTHD@@AEBUDDL_options_st@@PEAPEAUTABLE_LIST@@PEAIIPEAVPrelocking_strategy@@@Z()
      server.dll!?mysqld_show_create_get_fields@@YA_NPEAVTHD@@PEAUTABLE_LIST@@PEAV?$List@VItem@@@@PEAVString@@@Z()
      server.dll!?mysqld_show_create@@YA_NPEAVTHD@@PEAUTABLE_LIST@@@Z()
      server.dll!?mysql_execute_command@@YAHPEAVTHD@@@Z()
      server.dll!?mysql_parse@@YAXPEAVTHD@@PEADIPEAVParser_state@@_N3@Z()
      server.dll!?dispatch_command@@YA_NW4enum_server_command@@PEAVTHD@@PEADI_N3@Z()
      server.dll!?do_command@@YA_NPEAVTHD@@@Z()
      server.dll!?pool_of_threads_scheduler@@YAXPEAUscheduler_functions@@PEAKPEAV?$Atomic_counter@I@@@Z()
      server.dll!?tp_callback@@YAXPEAUTP_connection@@@Z()
      KERNEL32.DLL!TermsrvOpenRegEntry()
      ntdll.dll!RtlReleasePebLock()
      ntdll.dll!RtlReleaseSRWLockExclusive()
      KERNEL32.DLL!BaseThreadInitThunk()
      ntdll.dll!RtlUserThreadStart()
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x274c2c215b0): show create table `iv_crl_ca_details`
      Connection ID (thread ID): 39444
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
       
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Writing a core file at C:\Program Files\McAfee\Network Security Manager\MariaDB\data\
      Minidump written to C:\Program Files\McAfee\Network Security Manager\MariaDB\data\mariadbd.dmp
      

      Attachments

        1. crashreport.txt
          86 kB
        2. image-2021-08-11-16-01-07-223.png
          image-2021-08-11-16-01-07-223.png
          583 kB
        3. mariadbd.dmp
          243 kB
        4. mariadbd.exe
          24 kB
        5. mariadbd.pdb
          716 kB
        6. mariadbd104.dmp
          223 kB
        7. mariadbd182.dmp
          163 kB
        8. MDEV-26293.cfg
          43 kB
        9. MDEV-26293.yy
          0.6 kB
        10. out.txt
          24 kB
        11. screenshot-1.png
          screenshot-1.png
          580 kB
        12. serverdll.zip
          5.52 MB
        13. serverpdb.zip.001
          5.00 MB
        14. serverpdb.zip.002
          5.00 MB
        15. serverpdb.zip.003
          5.00 MB
        16. serverpdb.zip.004
          2.39 MB
        17. VK-REL-NSM.err
          24 kB
        18. WIN104.err
          10 kB
        19. WIN-G57FI8VPMSD.err
          42 kB

        Issue Links

          Activity

            schandol srinivas chandolu created issue -
            schandol srinivas chandolu made changes -
            Field Original Value New Value
            Attachment mariadbd.dmp [ 58427 ]
            schandol srinivas chandolu made changes -
            Attachment WIN-G57FI8VPMSD.err [ 58428 ]

            There is a mysqld.pdb or mariadbd.pdb depending on the version, that is distributed in all MariaDB packages by us.
            schandol, it would be nice if you put that into your McAfee distribution, next to the executable, so that we do not always have to decipher the stacktrace

            wlad Vladislav Vaintroub added a comment - There is a mysqld.pdb or mariadbd.pdb depending on the version, that is distributed in all MariaDB packages by us. schandol , it would be nice if you put that into your McAfee distribution, next to the executable, so that we do not always have to decipher the stacktrace
            marko Marko Mäkelä made changes -
            Labels need_feedback
            wlad Vladislav Vaintroub made changes -
            Attachment out.txt [ 58429 ]
            wlad Vladislav Vaintroub added a comment - - edited

            schandol, this does not seem to be our 10.5.9 either. At least both Visual Studio and cdb debugger detect mismatches. there is some noisy symbol loading information in out.txt that says that.

            DBGHELP: C:\bug\server.dll - file not found
            DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\server.dll - mismatched timestamp
            DBGHELP: server.dll not found in c:\bug\mariadb-10.5.9-winx64\bin
            DBGENG:  C:\Program Files\McAfee\Network Security Manager\MariaDB\bin\server.dll - Couldn't map image from disk.
            DBGENG:  server.dll - Partial symbol image load missing image info
            DBGHELP: Module is not fully loaded into memory.
            DBGHELP: Searching for symbols using debugger-provided data.
            DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\server.pdb - mismatched pdb
            DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\dll\server.pdb - file not found
            DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\symbols\dll\server.pdb - file not found
            SYMSRV:  BYINDEX: 0xA
            

            What's interesting is that the paths n assertion are correct (D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc - this is where it is built from in our build)
            So I do not really know, looks like executable was tampered by someone else, or maybe .dmp is somehow damaged.

            wlad Vladislav Vaintroub added a comment - - edited schandol , this does not seem to be our 10.5.9 either. At least both Visual Studio and cdb debugger detect mismatches. there is some noisy symbol loading information in out.txt that says that. DBGHELP: C:\bug\server.dll - file not found DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\server.dll - mismatched timestamp DBGHELP: server.dll not found in c:\bug\mariadb-10.5.9-winx64\bin DBGENG: C:\Program Files\McAfee\Network Security Manager\MariaDB\bin\server.dll - Couldn't map image from disk. DBGENG: server.dll - Partial symbol image load missing image info DBGHELP: Module is not fully loaded into memory. DBGHELP: Searching for symbols using debugger-provided data. DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\server.pdb - mismatched pdb DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\dll\server.pdb - file not found DBGHELP: c:\bug\mariadb-10.5.9-winx64\bin\symbols\dll\server.pdb - file not found SYMSRV: BYINDEX: 0xA What's interesting is that the paths n assertion are correct (D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc - this is where it is built from in our build) So I do not really know, looks like executable was tampered by someone else, or maybe .dmp is somehow damaged.
            schandol srinivas chandolu added a comment - - edited

            Vladislav Vaintroub, Thanks for the update. However, we are not bundling all the debug files. As suggested for deciphering will include mariadbd.pdb file also. Just curious to know whether any dependency files also needs to be bundled? If so can you please update.

            We are using mariadb named binaries downloaded from https://downloads.mariadb.org/mariadb/10.5.9/. For the actual problem mentioned i.e., "that the paths are correct .... but looks like executable was tampered by someone else". Also as a note, we are able to restart the service and login to database. Whether this tampering has happened by some process while database is running?

            schandol srinivas chandolu added a comment - - edited Vladislav Vaintroub, Thanks for the update. However, we are not bundling all the debug files. As suggested for deciphering will include mariadbd.pdb file also. Just curious to know whether any dependency files also needs to be bundled? If so can you please update. We are using mariadb named binaries downloaded from https://downloads.mariadb.org/mariadb/10.5.9/ . For the actual problem mentioned i.e., "that the paths are correct .... but looks like executable was tampered by someone else". Also as a note, we are able to restart the service and login to database. Whether this tampering has happened by some process while database is running?
            wlad Vladislav Vaintroub added a comment - - edited

            I do not know, I can't load symbols for this dmp with the released version. I have no other clues than "mismatched binary" by cdb and Visual Studio. Maybe you have some better clues than I have, and I do something wrong, and your output of cdb command is different?

            cdb -lines -z  C:\bug\mariadb-10.5.9-winx64\bin\mariadbd.dmp -i  C:\bug\mariadb-10.5.9-winx64\bin -y  C:\bug\mariadb-10.5.9-winx64\bin -logo out.txt -c "!sym noisy;!sym prompts;.reload;lml;!uniqstack -p;q"
            

            Of course with fixed path, the "bin" directory (in my case is the C:\bug\mariadb-10.5.9-winx64\bin) is the one that contains the crashdump mariadbd.dmp, server.dll, server.pdb (also mariadbd.exe, but this one is no more important, as server.dll now has all the server functionality)

            wlad Vladislav Vaintroub added a comment - - edited I do not know, I can't load symbols for this dmp with the released version. I have no other clues than "mismatched binary" by cdb and Visual Studio. Maybe you have some better clues than I have, and I do something wrong, and your output of cdb command is different? cdb -lines -z C:\bug\mariadb-10.5.9-winx64\bin\mariadbd.dmp -i C:\bug\mariadb-10.5.9-winx64\bin -y C:\bug\mariadb-10.5.9-winx64\bin -logo out.txt -c "!sym noisy;!sym prompts;.reload;lml;!uniqstack -p;q" Of course with fixed path, the "bin" directory (in my case is the C:\bug\mariadb-10.5.9-winx64\bin) is the one that contains the crashdump mariadbd.dmp, server.dll, server.pdb (also mariadbd.exe, but this one is no more important, as server.dll now has all the server functionality)

            Vladislav Vaintroub, we have hit this issue on two systems with similar error

            InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488
            InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
            InnoDB: We intentionally generate a memory trap.

            Also, the two systems uses windows 2016 as OS.

            Attached the two crash dumps with err files for reference.

            Can we have more priority for this one to know exact reason?

            schandol srinivas chandolu added a comment - Vladislav Vaintroub, we have hit this issue on two systems with similar error InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488 InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started InnoDB: We intentionally generate a memory trap. Also, the two systems uses windows 2016 as OS. Attached the two crash dumps with err files for reference. Can we have more priority for this one to know exact reason?
            schandol srinivas chandolu made changes -
            Attachment mariadbd104.dmp [ 58499 ]
            schandol srinivas chandolu made changes -
            Attachment WIN104.err [ 58500 ]
            schandol srinivas chandolu made changes -
            Attachment mariadbd182.dmp [ 58501 ]
            schandol srinivas chandolu made changes -
            Attachment VK-REL-NSM.err [ 58502 ]
            wlad Vladislav Vaintroub added a comment - - edited

            schandol, Well, analyzing a minidump is not a big deal. It is well documented, and it should be possible to do that with the exact command that I wrote in previous entry (cdb.exe should be installed). It does not work for me for your dumps, neither the old, nor the new.

            Thus I alleged, your executable must be different somehow from what we ship. If I'm wrong, and you are right, you should easily execute a couple of cdb commands, as above https://jira.mariadb.org/browse/MDEV-26293?focusedCommentId=195859&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-195859 , or Visual Studio commands, and prove me wrong. However, the stackdumping code in the program itself, and cdb , and Visual Studio , none of those think they can do anything with that dump, and I'm relying on one of them to work, so there is nothing I can do.

            wlad Vladislav Vaintroub added a comment - - edited schandol , Well, analyzing a minidump is not a big deal. It is well documented, and it should be possible to do that with the exact command that I wrote in previous entry (cdb.exe should be installed). It does not work for me for your dumps, neither the old, nor the new. Thus I alleged, your executable must be different somehow from what we ship. If I'm wrong, and you are right, you should easily execute a couple of cdb commands, as above https://jira.mariadb.org/browse/MDEV-26293?focusedCommentId=195859&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-195859 , or Visual Studio commands, and prove me wrong. However, the stackdumping code in the program itself, and cdb , and Visual Studio , none of those think they can do anything with that dump, and I'm relying on one of them to work, so there is nothing I can do.
            wlad Vladislav Vaintroub made changes -
            Assignee Marko Mäkelä [ marko ]
            wlad Vladislav Vaintroub made changes -
            Attachment screenshot-1.png [ 58503 ]
            wlad Vladislav Vaintroub added a comment - - edited

            See, "no matching binary found" for mariadbd.exe, for server.dll , and (you can't see it) for server_audit.dll . The symbol path includes 10.5.9 bin directory, and there is pdb for server.dll and for mariadbd.exe in there. Proof :

            wlad Vladislav Vaintroub added a comment - - edited See, "no matching binary found" for mariadbd.exe, for server.dll , and (you can't see it) for server_audit.dll . The symbol path includes 10.5.9 bin directory, and there is pdb for server.dll and for mariadbd.exe in there. Proof :
            wlad Vladislav Vaintroub made changes -
            Attachment image-2021-08-11-16-01-07-223.png [ 58504 ]
            wlad Vladislav Vaintroub added a comment - - edited

            Ok I know what exactly happened. Some releases were rebuilt due to https://jira.mariadb.org/browse/MDEV-24944 , and downloads were fixed, so that they contained newer builds .

            If you downloaded just after release, the timestamp of the executable is Feb.19
            This is what cdb says for your .dmp

            00007ff7`f0580000 00007ff7`f058a000   mariadbd   (deferred)             
                Image path: C:\Program Files\McAfee\Network Security Manager\MariaDB\bin\mariadbd.exe
                Image name: mariadbd.exe
                Timestamp:        Fri Feb 19 11:36:22 2021 (602F94A6)
             ...
            

            the one I got downloaded now has the timestamp of March,22. Output of dumpbin /headers contains

            FILE HEADER VALUES
                        8664 machine (x64)
                           7 number of sections
                    6058F60E time date stamp Mon Mar 22 20:54:54 2021
            ... 
            

            Both files are built from the same source revision 3a8ca9096ea82ca61811450775511533d6cb1bb4, but I can't use debug symbols of the new build on the old exe.
            Thus schandol, if you did not save the server.dll,server.pdb and mariadbd.exe,mariadbd.pdb somewhere (which are in the ZIP distro that you downloaded) , if you have not got them, we are out of luck, because I don't that those files either. I would suggest to use the executables from the current download, or better yet, 10.5.12 or whatever is current. I would suggest not throw away the pdbs in the future, they can be useful.

            If you saved them, can you please attach them to the ticket.

            wlad Vladislav Vaintroub added a comment - - edited Ok I know what exactly happened. Some releases were rebuilt due to https://jira.mariadb.org/browse/MDEV-24944 , and downloads were fixed, so that they contained newer builds . If you downloaded just after release, the timestamp of the executable is Feb.19 This is what cdb says for your .dmp 00007ff7`f0580000 00007ff7`f058a000 mariadbd (deferred) Image path: C:\Program Files\McAfee\Network Security Manager\MariaDB\bin\mariadbd.exe Image name: mariadbd.exe Timestamp: Fri Feb 19 11:36:22 2021 (602F94A6) ... the one I got downloaded now has the timestamp of March,22. Output of dumpbin /headers contains FILE HEADER VALUES 8664 machine (x64) 7 number of sections 6058F60E time date stamp Mon Mar 22 20:54:54 2021 ... Both files are built from the same source revision 3a8ca9096ea82ca61811450775511533d6cb1bb4, but I can't use debug symbols of the new build on the old exe. Thus schandol , if you did not save the server.dll,server.pdb and mariadbd.exe,mariadbd.pdb somewhere (which are in the ZIP distro that you downloaded) , if you have not got them, we are out of luck, because I don't that those files either. I would suggest to use the executables from the current download, or better yet, 10.5.12 or whatever is current. I would suggest not throw away the pdbs in the future, they can be useful. If you saved them, can you please attach them to the ticket.
            schandol srinivas chandolu made changes -
            Attachment mariadbd.exe [ 58505 ]
            Attachment mariadbd.pdb [ 58506 ]

            schandol, thanks, and server.dll/pdb also, if possible , please

            wlad Vladislav Vaintroub added a comment - schandol , thanks, and server.dll/pdb also, if possible , please
            schandol srinivas chandolu made changes -
            Attachment serverdll.zip [ 58507 ]

            Vladislav Vaintroub, Thanks for the update. Added three files server.dll, mariadbd.exe,mariadbd.pdb. The server.dll file has been added as a zip file since uploading file limit is 10 MB. However, the server.pdb file size is 55.9 MB, Exceeding 10 MB for all the windows/linux zip formats. Is there any alternate way to upload this file?

            Also, mentioned that the binary set of files have been updated for 10.5.9 in downloads section of mariadb. Curious to know the fixes that went into with the latest 10.5.9 when compared to that one used at our end.

            schandol srinivas chandolu added a comment - Vladislav Vaintroub, Thanks for the update. Added three files server.dll, mariadbd.exe,mariadbd.pdb. The server.dll file has been added as a zip file since uploading file limit is 10 MB. However, the server.pdb file size is 55.9 MB, Exceeding 10 MB for all the windows/linux zip formats. Is there any alternate way to upload this file? Also, mentioned that the binary set of files have been updated for 10.5.9 in downloads section of mariadb. Curious to know the fixes that went into with the latest 10.5.9 when compared to that one used at our end.
            wlad Vladislav Vaintroub added a comment - - edited

            schandol, thanks. you could just share the remaining file on a google/azure drive, and add the link here. Or, you can use 7zip to split large archives like here https://www.techulator.com/resources/6601-How-split-join-files-using-7-zip-free-tool.aspx

            new 10.5.9 is built from the very same sources, the binaries and MSI was just signed differently, with timestamping. only MDEV-24944 was fixed, nothing else.

            wlad Vladislav Vaintroub added a comment - - edited schandol , thanks. you could just share the remaining file on a google/azure drive, and add the link here. Or, you can use 7zip to split large archives like here https://www.techulator.com/resources/6601-How-split-join-files-using-7-zip-free-tool.aspx new 10.5.9 is built from the very same sources, the binaries and MSI was just signed differently, with timestamping. only MDEV-24944 was fixed, nothing else.
            schandol srinivas chandolu made changes -
            Attachment serverpdb.zip.004 [ 58508 ]
            schandol srinivas chandolu made changes -
            Attachment serverpdb.zip.003 [ 58509 ]
            schandol srinivas chandolu made changes -
            Attachment serverpdb.zip.002 [ 58510 ]
            schandol srinivas chandolu made changes -
            Attachment serverpdb.zip.001 [ 58511 ]

            Vladislav Vaintroub, Uploaded the file by splitting into 4 files using 7 zip.

            schandol srinivas chandolu added a comment - Vladislav Vaintroub, Uploaded the file by splitting into 4 files using 7 zip.
            wlad Vladislav Vaintroub made changes -
            Attachment crashreport.txt [ 58517 ]
            wlad Vladislav Vaintroub added a comment - - edited

            So, marko.
            One of the crashes is this

             	server.dll!ut_dbg_assertion_failed(const char * expr, const char * file, unsigned int line) Line 60	C++
             	server.dll!fil_node_t::prepare_to_close_or_detach() Line 487	C++
             	server.dll!fil_node_t::close() Line 466	C++
             	server.dll!fil_space_t::try_to_close(bool print_info) Line 110	C++
             	server.dll!fil_node_open_file(fil_node_t * node) Line 438	C++
             	server.dll!fil_space_t::prepare(bool have_mutex) Line 666	C++
             	server.dll!fil_space_t::get(unsigned __int64 id) Line 1446	C++
             	server.dll!buf_flush_space(const unsigned int id) Line 1226	C++
             	server.dll!buf_do_flush_list_batch(unsigned __int64 max_n, unsigned __int64 lsn) Line 1450	C++
             	server.dll!buf_flush_lists(unsigned __int64 max_n, unsigned __int64 lsn) Line 1580	C++
             	server.dll!buf_flush_page_cleaner(void * __formal) Line 2195	C++
            

            another one is this

            >	server.dll!my_parameter_handler(const wchar_t * expression, const wchar_t * function, const wchar_t * file, unsigned int line, unsigned __int64 pReserved) Line 278	C
             	ucrtbase.dll!raise()	Unknown
             	ucrtbase.dll!abort()	Unknown
             	server.dll!ut_dbg_assertion_failed(const char * expr, const char * file, unsigned int line) Line 60	C++
             	server.dll!fil_node_t::prepare_to_close_or_detach() Line 487	C++
             	server.dll!fil_node_t::close() Line 466	C++
             	server.dll!fil_space_t::try_to_close(bool print_info) Line 110	C++
             	server.dll!fil_node_open_file(fil_node_t * node) Line 438	C++
             	server.dll!fil_space_t::prepare(bool have_mutex) Line 666	C++
             	server.dll!fil_space_t::acquire() Line 535	C++
             	server.dll!btr_cur_prefetch_siblings(const buf_block_t * block, const dict_index_t * index) Line 3341	C++
             	server.dll!btr_cur_optimistic_delete_func(btr_cur_t * cursor, mtr_t * mtr) Line 5620	C++
             	server.dll!row_purge_remove_clust_if_poss_low(purge_node_t * node, unsigned __int64 mode) Line 138	C++
             	server.dll!row_purge_remove_clust_if_poss(purge_node_t * node) Line 185	C++
             	server.dll!row_purge_record_func(purge_node_t * node, unsigned char * undo_rec, bool updated_extern) Line 1047	C++
             	server.dll!row_purge(purge_node_t * node, unsigned char * undo_rec, que_thr_t * thr) Line 1112	C++
             	server.dll!row_purge_step(que_thr_t * thr) Line 1159	C++
             	server.dll!que_thr_step(que_thr_t * thr) Line 946	C++
             	server.dll!que_run_threads_low(que_thr_t * thr) Line 1011	C++
             	server.dll!que_run_threads(que_thr_t * thr) Line 1050	C++
             	server.dll!srv_task_execute() Line 1872	C++
             	server.dll!purge_worker_callback(void * __formal) Line 2034	C++
             	server.dll!tpool::task_group::execute(tpool::task * t) Line 57	C++
             	ntdll.dll!TppSimplepExecuteCallback()	Unknown
             	ntdll.dll!TppWorkerThread()	Unknown
            

            There is a third one, with all stacks , it is in crashreport.txt, look around STACK_TEXT. There are also all threads stacktraces in this file

            wlad Vladislav Vaintroub added a comment - - edited So, marko . One of the crashes is this server.dll!ut_dbg_assertion_failed(const char * expr, const char * file, unsigned int line) Line 60 C++ server.dll!fil_node_t::prepare_to_close_or_detach() Line 487 C++ server.dll!fil_node_t::close() Line 466 C++ server.dll!fil_space_t::try_to_close(bool print_info) Line 110 C++ server.dll!fil_node_open_file(fil_node_t * node) Line 438 C++ server.dll!fil_space_t::prepare(bool have_mutex) Line 666 C++ server.dll!fil_space_t::get(unsigned __int64 id) Line 1446 C++ server.dll!buf_flush_space(const unsigned int id) Line 1226 C++ server.dll!buf_do_flush_list_batch(unsigned __int64 max_n, unsigned __int64 lsn) Line 1450 C++ server.dll!buf_flush_lists(unsigned __int64 max_n, unsigned __int64 lsn) Line 1580 C++ server.dll!buf_flush_page_cleaner(void * __formal) Line 2195 C++ another one is this > server.dll!my_parameter_handler(const wchar_t * expression, const wchar_t * function, const wchar_t * file, unsigned int line, unsigned __int64 pReserved) Line 278 C ucrtbase.dll!raise() Unknown ucrtbase.dll!abort() Unknown server.dll!ut_dbg_assertion_failed(const char * expr, const char * file, unsigned int line) Line 60 C++ server.dll!fil_node_t::prepare_to_close_or_detach() Line 487 C++ server.dll!fil_node_t::close() Line 466 C++ server.dll!fil_space_t::try_to_close(bool print_info) Line 110 C++ server.dll!fil_node_open_file(fil_node_t * node) Line 438 C++ server.dll!fil_space_t::prepare(bool have_mutex) Line 666 C++ server.dll!fil_space_t::acquire() Line 535 C++ server.dll!btr_cur_prefetch_siblings(const buf_block_t * block, const dict_index_t * index) Line 3341 C++ server.dll!btr_cur_optimistic_delete_func(btr_cur_t * cursor, mtr_t * mtr) Line 5620 C++ server.dll!row_purge_remove_clust_if_poss_low(purge_node_t * node, unsigned __int64 mode) Line 138 C++ server.dll!row_purge_remove_clust_if_poss(purge_node_t * node) Line 185 C++ server.dll!row_purge_record_func(purge_node_t * node, unsigned char * undo_rec, bool updated_extern) Line 1047 C++ server.dll!row_purge(purge_node_t * node, unsigned char * undo_rec, que_thr_t * thr) Line 1112 C++ server.dll!row_purge_step(que_thr_t * thr) Line 1159 C++ server.dll!que_thr_step(que_thr_t * thr) Line 946 C++ server.dll!que_run_threads_low(que_thr_t * thr) Line 1011 C++ server.dll!que_run_threads(que_thr_t * thr) Line 1050 C++ server.dll!srv_task_execute() Line 1872 C++ server.dll!purge_worker_callback(void * __formal) Line 2034 C++ server.dll!tpool::task_group::execute(tpool::task * t) Line 57 C++ ntdll.dll!TppSimplepExecuteCallback() Unknown ntdll.dll!TppWorkerThread() Unknown There is a third one, with all stacks , it is in crashreport.txt , look around STACK_TEXT. There are also all threads stacktraces in this file
            wlad Vladislav Vaintroub made changes -
            Labels need_feedback

            wlad, can you find all threads that are accessing the same tablespace object? Is the value of fil_space_t::n_pending available?

            marko Marko Mäkelä added a comment - wlad , can you find all threads that are accessing the same tablespace object? Is the value of fil_space_t::n_pending available?
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Vladislav Vaintroub [ wlad ]
            wlad Vladislav Vaintroub made changes -
            Assignee Vladislav Vaintroub [ wlad ] Marko Mäkelä [ marko ]
            wlad Vladislav Vaintroub added a comment - - edited

            fil_space_t::n_pending is not available. The crashreport.txt can give an idea of what threads are doing. I'm not sure how to automate finding threads using the same object there.

            wlad Vladislav Vaintroub added a comment - - edited fil_space_t::n_pending is not available. The crashreport.txt can give an idea of what threads are doing. I'm not sure how to automate finding threads using the same object there.

            Can we reproduce this with a small value of innodb_open_files? What is the value of that parameter, and roughly how many distinct .ibd files or InnoDB tablespaces are being accessed concurrently?

            marko Marko Mäkelä added a comment - Can we reproduce this with a small value of innodb_open_files ? What is the value of that parameter, and roughly how many distinct .ibd files or InnoDB tablespaces are being accessed concurrently?

            Vladislav Vaintroub / Marko Mäkelä , Just curious to know if we found out the reason for the mentioned abrupt shutdowns. Can you please provide an update.

            schandol srinivas chandolu added a comment - Vladislav Vaintroub / Marko Mäkelä , Just curious to know if we found out the reason for the mentioned abrupt shutdowns. Can you please provide an update.

            schandol, I guess the last marko's question was for your.

            Also after producing the stacktrace, which marko asked for, I resort to uninvolved observer's role

            wlad Vladislav Vaintroub added a comment - schandol , I guess the last marko 's question was for your. Also after producing the stacktrace, which marko asked for, I resort to uninvolved observer's role

            Vladislav Vaintroub, I was in the impression that the chat was between You and Marko Mäkelä. We are using innodb_open_files parameter with value as 300. There are 752 tables in the application. Also the configuration has been done using innodb_file_per_table format. So with this there will be 752 ibd files. Also, there will be one tablespace with a default value of 100M which will be autoextended.
            Marko Mäkelä, Can you suggest a value for this parameter based on crash dumps or analysis that has been done from your end. Also, I think we are accessing roughly around 10 distinct .ibd files as per the database log or slowquery log file during this crash. Is there any way to find exact access count?

            schandol srinivas chandolu added a comment - Vladislav Vaintroub, I was in the impression that the chat was between You and Marko Mäkelä. We are using innodb_open_files parameter with value as 300. There are 752 tables in the application. Also the configuration has been done using innodb_file_per_table format. So with this there will be 752 ibd files. Also, there will be one tablespace with a default value of 100M which will be autoextended. Marko Mäkelä, Can you suggest a value for this parameter based on crash dumps or analysis that has been done from your end. Also, I think we are accessing roughly around 10 distinct .ibd files as per the database log or slowquery log file during this crash. Is there any way to find exact access count?

            schandol, based on the information that is available so far, I cannot suggest a safe value for this parameter. I would not think that 300 is unreasonably small. Because of several layers of caching, there is no sharp correlation between SQL statements being executed and the number of files being open. It might be turn out that innodb_open_files is useless on modern operating systems, and having a large number of open file descriptors is not an actual problem anymore. If that is the case, setting the parameter to the maximum possible value should work around the problem.

            wlad is our Windows expert, and he can perform tests on Windows. Because the file system APIs work differently between Windows and (say) Linux, and because we have not observed this problem on Linux, where most of our internal testing takes place, I think that our best chances to repeat this problem would be on Microsoft Windows.

            What kind of storage is being used? HDD or SSD? How big are the tables?

            marko Marko Mäkelä added a comment - schandol , based on the information that is available so far, I cannot suggest a safe value for this parameter. I would not think that 300 is unreasonably small. Because of several layers of caching, there is no sharp correlation between SQL statements being executed and the number of files being open. It might be turn out that innodb_open_files is useless on modern operating systems, and having a large number of open file descriptors is not an actual problem anymore. If that is the case, setting the parameter to the maximum possible value should work around the problem. wlad is our Windows expert, and he can perform tests on Windows. Because the file system APIs work differently between Windows and (say) Linux, and because we have not observed this problem on Linux, where most of our internal testing takes place, I think that our best chances to repeat this problem would be on Microsoft Windows. What kind of storage is being used? HDD or SSD? How big are the tables?

            The errors are usually portable, and reproducible elsewhere if reasonable efforts are made. The exception from there is real Windows APIs , like asynchronous IO.

            Having said that, you can have innodb_open_files easily as million, the number of open handles in a Windows program is limited by 2^24.

            wlad Vladislav Vaintroub added a comment - The errors are usually portable, and reproducible elsewhere if reasonable efforts are made. The exception from there is real Windows APIs , like asynchronous IO. Having said that, you can have innodb_open_files easily as million, the number of open handles in a Windows program is limited by 2^24.
            Roel Roel Van de Paar made changes -
            Description MariaDB has crashed due to the following error. Is it possible to know the exact reason for this crash? Attached mariadbd dump and database error file for reference.

            2021-08-01 04:09:31 0x11d8 InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488
            InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
            InnoDB: We intentionally generate a memory trap.
            InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
            InnoDB: If you get repeated assertion failures or crashes, even
            InnoDB: immediately after the mysqld startup, there may be
            InnoDB: corruption in the InnoDB tablespace. Please refer to
            InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
            InnoDB: about forcing recovery.
            210801 4:09:31 [ERROR] mysqld got exception 0x80000003 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.5.9-MariaDB-log
            key_buffer_size=33554432
            read_buffer_size=8388608
            max_used_connections=64
            max_threads=65537
            thread_count=56
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 52371 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x274c52f9038
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            server.dll!my_parameter_handler()
            ucrtbase.dll!raise()
            ucrtbase.dll!abort()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!?ha_open@handler@@QEAAHPEAUTABLE@@PEBDHIPEAUst_mem_root@@PEAV?$List@VString@@@@@Z()
            server.dll!?open_table_from_share@@YA?AW4open_frm_error@@PEAVTHD@@PEAUTABLE_SHARE@@PEBUst_mysql_const_lex_string@@IIIPEAUTABLE@@_NPEAV?$List@VString@@@@@Z()
            server.dll!?open_table@@YA_NPEAVTHD@@PEAUTABLE_LIST@@PEAVOpen_table_context@@@Z()
            server.dll!?open_and_lock_tables@@YA_NPEAVTHD@@AEBUDDL_options_st@@PEAUTABLE_LIST@@_NIPEAVPrelocking_strategy@@@Z()
            server.dll!?open_tables@@YA_NPEAVTHD@@AEBUDDL_options_st@@PEAPEAUTABLE_LIST@@PEAIIPEAVPrelocking_strategy@@@Z()
            server.dll!?mysqld_show_create_get_fields@@YA_NPEAVTHD@@PEAUTABLE_LIST@@PEAV?$List@VItem@@@@PEAVString@@@Z()
            server.dll!?mysqld_show_create@@YA_NPEAVTHD@@PEAUTABLE_LIST@@@Z()
            server.dll!?mysql_execute_command@@YAHPEAVTHD@@@Z()
            server.dll!?mysql_parse@@YAXPEAVTHD@@PEADIPEAVParser_state@@_N3@Z()
            server.dll!?dispatch_command@@YA_NW4enum_server_command@@PEAVTHD@@PEADI_N3@Z()
            server.dll!?do_command@@YA_NPEAVTHD@@@Z()
            server.dll!?pool_of_threads_scheduler@@YAXPEAUscheduler_functions@@PEAKPEAV?$Atomic_counter@I@@@Z()
            server.dll!?tp_callback@@YAXPEAUTP_connection@@@Z()
            KERNEL32.DLL!TermsrvOpenRegEntry()
            ntdll.dll!RtlReleasePebLock()
            ntdll.dll!RtlReleaseSRWLockExclusive()
            KERNEL32.DLL!BaseThreadInitThunk()
            ntdll.dll!RtlUserThreadStart()

            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x274c2c215b0): show create table `iv_crl_ca_details`
            Connection ID (thread ID): 39444
            Status: NOT_KILLED

            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off

            The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
            information that should help you find out what is causing the crash.
            Writing a core file at C:\Program Files\McAfee\Network Security Manager\MariaDB\data\
            Minidump written to C:\Program Files\McAfee\Network Security Manager\MariaDB\data\mariadbd.dmp
            MariaDB has crashed due to the following error. Is it possible to know the exact reason for this crash? Attached mariadbd dump and database error file for reference.
            {noformat}
            2021-08-01 04:09:31 0x11d8 InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488
            InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
            InnoDB: We intentionally generate a memory trap.
            InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
            InnoDB: If you get repeated assertion failures or crashes, even
            InnoDB: immediately after the mysqld startup, there may be
            InnoDB: corruption in the InnoDB tablespace. Please refer to
            InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
            InnoDB: about forcing recovery.
            210801 4:09:31 [ERROR] mysqld got exception 0x80000003 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.5.9-MariaDB-log
            key_buffer_size=33554432
            read_buffer_size=8388608
            max_used_connections=64
            max_threads=65537
            thread_count=56
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 52371 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x274c52f9038
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            server.dll!my_parameter_handler()
            ucrtbase.dll!raise()
            ucrtbase.dll!abort()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!json_valid()
            server.dll!?ha_open@handler@@QEAAHPEAUTABLE@@PEBDHIPEAUst_mem_root@@PEAV?$List@VString@@@@@Z()
            server.dll!?open_table_from_share@@YA?AW4open_frm_error@@PEAVTHD@@PEAUTABLE_SHARE@@PEBUst_mysql_const_lex_string@@IIIPEAUTABLE@@_NPEAV?$List@VString@@@@@Z()
            server.dll!?open_table@@YA_NPEAVTHD@@PEAUTABLE_LIST@@PEAVOpen_table_context@@@Z()
            server.dll!?open_and_lock_tables@@YA_NPEAVTHD@@AEBUDDL_options_st@@PEAUTABLE_LIST@@_NIPEAVPrelocking_strategy@@@Z()
            server.dll!?open_tables@@YA_NPEAVTHD@@AEBUDDL_options_st@@PEAPEAUTABLE_LIST@@PEAIIPEAVPrelocking_strategy@@@Z()
            server.dll!?mysqld_show_create_get_fields@@YA_NPEAVTHD@@PEAUTABLE_LIST@@PEAV?$List@VItem@@@@PEAVString@@@Z()
            server.dll!?mysqld_show_create@@YA_NPEAVTHD@@PEAUTABLE_LIST@@@Z()
            server.dll!?mysql_execute_command@@YAHPEAVTHD@@@Z()
            server.dll!?mysql_parse@@YAXPEAVTHD@@PEADIPEAVParser_state@@_N3@Z()
            server.dll!?dispatch_command@@YA_NW4enum_server_command@@PEAVTHD@@PEADI_N3@Z()
            server.dll!?do_command@@YA_NPEAVTHD@@@Z()
            server.dll!?pool_of_threads_scheduler@@YAXPEAUscheduler_functions@@PEAKPEAV?$Atomic_counter@I@@@Z()
            server.dll!?tp_callback@@YAXPEAUTP_connection@@@Z()
            KERNEL32.DLL!TermsrvOpenRegEntry()
            ntdll.dll!RtlReleasePebLock()
            ntdll.dll!RtlReleaseSRWLockExclusive()
            KERNEL32.DLL!BaseThreadInitThunk()
            ntdll.dll!RtlUserThreadStart()

            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x274c2c215b0): show create table `iv_crl_ca_details`
            Connection ID (thread ID): 39444
            Status: NOT_KILLED

            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off

            The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
            information that should help you find out what is causing the crash.
            Writing a core file at C:\Program Files\McAfee\Network Security Manager\MariaDB\data\
            Minidump written to C:\Program Files\McAfee\Network Security Manager\MariaDB\data\mariadbd.dmp
            {noformat}
            Roel Roel Van de Paar added a comment - - edited

            I note (https://jira.mariadb.org/secure/attachment/58503/screenshot-1.png) that a Virus scanner (McAfee Network Security Manager) is operating on this system and that it somehow seems to be interacting or trying to interact with mariadb.exe/server.dll. Virus scanners may cause issues with mysqld. Please try excluding mysqld.exe/dll's (in fact the whole directory/subdirectories in which the server is installed as well as the data directory and it's subdirectories) from any virus/network scanner activity (in disk and memory), at the very least as a test. Other possible mysqld interaction causes are anti-virus/anti-malware, firewall, and UAC. Also ensure that the OS is fully patched/updated.

            Roel Roel Van de Paar added a comment - - edited I note ( https://jira.mariadb.org/secure/attachment/58503/screenshot-1.png ) that a Virus scanner (McAfee Network Security Manager) is operating on this system and that it somehow seems to be interacting or trying to interact with mariadb.exe/server.dll. Virus scanners may cause issues with mysqld. Please try excluding mysqld.exe/dll's (in fact the whole directory/subdirectories in which the server is installed as well as the data directory and it's subdirectories) from any virus/network scanner activity (in disk and memory), at the very least as a test. Other possible mysqld interaction causes are anti-virus/anti-malware, firewall, and UAC. Also ensure that the OS is fully patched/updated.
            Roel Roel Van de Paar added a comment - - edited

            2021-08-11  5:20:15 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 22 pending operations
            ....
            2021-08-11  5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync
            2021-08-11  5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync
            2021-08-11  5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync
            2021-08-11  5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync
            2021-08-11  5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync
            2021-08-11 05:21:49 0x187c  InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488
            InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
            InnoDB: We intentionally generate a memory trap.
            

            There are regular pending operations and pending fsync's. In this case the crash happens very close to them. I would not be surpised if the issues were connected, and anti-virus or some other 3rd party interaction remains a major possibility imho. schandol What sort of SAN or similar are you using please?

            Roel Roel Van de Paar added a comment - - edited 2021-08-11 5:20:15 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 22 pending operations .... 2021-08-11 5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync 2021-08-11 5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync 2021-08-11 5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync 2021-08-11 5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync 2021-08-11 5:20:56 39251 [Note] InnoDB: Cannot close file .\lf\iv_dev_prof_data.ibd because of 0 pending operations and pending fsync 2021-08-11 05:21:49 0x187c InnoDB: Assertion failure in file D:\winx64-packages\build\src\storage\innobase\fil\fil0fil.cc line 488 InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started InnoDB: We intentionally generate a memory trap. There are regular pending operations and pending fsync's. In this case the crash happens very close to them. I would not be surpised if the issues were connected, and anti-virus or some other 3rd party interaction remains a major possibility imho. schandol What sort of SAN or similar are you using please?

            Also interesting is that the FAILURE_BUCKET_ID is BREAKPOINT_80000003_server.dll!my_parameter_handler which may indicate underlying issues on the system or OS itself, or a memory leak in the server. Some limited info on these example URL's:
            https://techcult.com/fix-error-0x80000003/
            https://www.auslogics.com/en/articles/fix-breakpoint-error-0x80000003/ (I don't recommend using the driver updater!)
            https://www.thewindowsclub.com/breakpoint-reached-error-0x80000003

            Roel Roel Van de Paar added a comment - Also interesting is that the FAILURE_BUCKET_ID is BREAKPOINT_80000003_server.dll!my_parameter_handler which may indicate underlying issues on the system or OS itself, or a memory leak in the server. Some limited info on these example URL's: https://techcult.com/fix-error-0x80000003/ https://www.auslogics.com/en/articles/fix-breakpoint-error-0x80000003/ (I don't recommend using the driver updater!) https://www.thewindowsclub.com/breakpoint-reached-error-0x80000003

            Marko Mäkelä, Vladislav Vaintroub and Roel Van de Paar, Thanks for the updates.

            The kind of storage used in these systems is HDD. Generally there will be around six tables which are of in units to tens of GBs. In all the crashed systems the highest table data is around 10 GB and remaining 5 tables will vary between 2 GB to 7 GB. The remaining tables are of few MBs.

            Regarding innodb_open_files parameter noticed that we can give up to 4096 in one of the system. However more than has been given (for example 8092) it is taking as 256 (equal to table_open_cache value). But in documentation (https://mariadb.com/kb/en/innodb-system-variables/#innodb_open_files) it has been mentioned that we can have upto 4294967295. Is it dependent on other parameters? Also how we can set with higher value?

            As a note, still now we have not faced issues by having having 300 as its value in previous mariadb versions of 10.3.Now we are facing with 10.5 for the first time.

            On other hand on other system if this parameter has been set with still lower value say 10, it is getting crashed with in 2 hours after raising semaphore and wait issues. We have changed to 4096 to check its behavior.

            Also McAfee Network Security Manager is not an antivirus application. On all the affected systems the virus scan is not running.

            However for other points mentioned will check with concerned and will update accordingly.

            schandol srinivas chandolu added a comment - Marko Mäkelä, Vladislav Vaintroub and Roel Van de Paar, Thanks for the updates. The kind of storage used in these systems is HDD. Generally there will be around six tables which are of in units to tens of GBs. In all the crashed systems the highest table data is around 10 GB and remaining 5 tables will vary between 2 GB to 7 GB. The remaining tables are of few MBs. Regarding innodb_open_files parameter noticed that we can give up to 4096 in one of the system. However more than has been given (for example 8092) it is taking as 256 (equal to table_open_cache value). But in documentation ( https://mariadb.com/kb/en/innodb-system-variables/#innodb_open_files ) it has been mentioned that we can have upto 4294967295. Is it dependent on other parameters? Also how we can set with higher value? As a note, still now we have not faced issues by having having 300 as its value in previous mariadb versions of 10.3.Now we are facing with 10.5 for the first time. On other hand on other system if this parameter has been set with still lower value say 10, it is getting crashed with in 2 hours after raising semaphore and wait issues. We have changed to 4096 to check its behavior. Also McAfee Network Security Manager is not an antivirus application. On all the affected systems the virus scan is not running. However for other points mentioned will check with concerned and will update accordingly.

            schandol, in 10.5 the code flow is different from older versions mostly due to MDEV-23855, which reduced contention on fil_system.mutex especially for write workloads.

            marko Marko Mäkelä added a comment - schandol , in 10.5 the code flow is different from older versions mostly due to MDEV-23855 , which reduced contention on fil_system.mutex especially for write workloads.
            Roel Roel Van de Paar made changes -
            Environment  Microsoft Windows Server 2016 Standard
            VMware x-64 based PC.
            32 GB Memory
            Microsoft Windows Server 2016 Standard, VMware x-64 based PC, 32 GB Memory
            Roel Roel Van de Paar made changes -
            Summary MariaDB 10.5.9 crash InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
            Roel Roel Van de Paar made changes -
            Component/s Storage Engine - InnoDB [ 10129 ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.5 [ 23123 ]
            Roel Roel Van de Paar made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 124046 ] MariaDB v4 [ 143060 ]

            Customer got the same assertion failure in different context:

            Thread 1 (Thread 0x7e6d68762700 (LWP 263532)):
            #0  0x00007fefb327aaa1 in pthread_kill () from /lib64/libpthread.so.0
            No symbol table info available.
            #1  0x0000562c5f35827e in handle_fatal_signal (sig=6) at /usr/src/debug/MariaDB-/src_0/sql/signal_handler.cc:330
                    curr_time = 1655794807
                    tm = {tm_sec = 7, tm_min = 0, tm_hour = 14, tm_mday = 21, tm_mon = 5, tm_year = 122, tm_wday = 2, tm_yday = 171, tm_isdst = 0, tm_gmtoff = 25200, tm_zone = 0x562c62d416d0 "+07"}
                    thd = 0x7e6bb40009b8
                    print_invalid_query_pointer = true
            #2  <signal handler called>
            No symbol table info available.
            #3  0x00007fefb13453d7 in raise () from /lib64/libc.so.6
            No symbol table info available.
            #4  0x00007fefb1346ac8 in abort () from /lib64/libc.so.6
            No symbol table info available.
            #5  0x0000562c5f02d3af in ut_dbg_assertion_failed (expr=expr@entry=0x562c5fc0c130 "space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started", file=file@entry=0x562c5fc0c060 "/home/jenkins/workspace/MariaDBE-Custom-RPM/label/rhel-7/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/innobase/fil/fil0fil.cc", line=line@entry=488) at /usr/src/debug/MariaDB-/src_0/storage/innobase/ut/ut0dbg.cc:60
            No locals.
            #6  0x0000562c5f87aac8 in fil_node_t::prepare_to_close_or_detach (this=this@entry=0x7e6bd502fba8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:487
            No locals.
            #7  0x0000562c5f87ab27 in fil_node_t::close (this=0x7e6bd502fba8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:463
            No locals.
            #8  0x0000562c5f87aca3 in fil_space_t::try_to_close (print_info=false) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:109
                    node = 0x7e6bd502fba8
                    space = 0x7e6bd517f948
            #9  0x0000562c5f87fc03 in fil_node_open_file (node=node@entry=0x7e6bc4307098) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:438
                    count = 0
            #10 0x0000562c5f03245c in fil_space_t::prepare (this=0x7e6bc4304238, have_mutex=have_mutex@entry=false) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:666
                    node = 0x7e6bc4307098
                    is_open = <optimized out>
            #11 0x0000562c5f7f4535 in acquire (this=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/include/fil0fil.h:535
            No locals.
            #12 btr_cur_prefetch_siblings (block=block@entry=0x7eea2b818c00, index=index@entry=0x7e6b71ba93a0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/btr/btr0cur.cc:3343
                    page = <optimized out>
            #13 0x0000562c5f7fcc11 in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7e6d68760190, offsets=offsets@entry=0x7e6d687600f0, heap=heap@entry=0x7e6d687600e0, entry=entry@entry=0x7e6bade5d690, rec=rec@entry=0x7e6d687608b0, big_rec=big_rec@entry=0x7e6d687600d0, n_ext=n_ext@entry=0, thr=thr@entry=0x7e6b9efc8bc8, mtr=mtr@entry=0x7e6d68760b10) at /usr/src/debug/MariaDB-/src_0/storage/innobase/btr/btr0cur.cc:3470
                    big_rec_vec = 0x0
                    err = DB_FAIL
                    max_size = <optimized out>
                    n_recs = <optimized out>
                    page_cursor = <optimized out>
                    block = 0x7eea2b818c00
                    page = 0x7eea2c068000 <Address 0x7eea2c068000 out of bounds>
                    dummy = 0x7e6bb4a296f6 "\201\064\212Í€"
                    rec_size = 196
                    index = 0x7e6b71ba93a0
                    reorg = <optimized out>
                    inherit = true
            #14 0x0000562c5f7341c4 in row_ins_clust_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=2, index=index@entry=0x7e6b71ba93a0, n_uniq=n_uniq@entry=1, entry=entry@entry=0x7e6bade5d690, n_ext=n_ext@entry=0, thr=thr@entry=0x7e6b9efc8bc8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:2739
                    insert_rec = 0x7f6a87477100 <Address 0x7f6a87477100 out of bounds>
                    pcur = {btr_cur = {index = 0x7e6b71ba93a0, page_cur = {index = 0x0, rec = 0x7eea2c06b4eb <Address 0x7eea2c06b4eb out of bounds>, offsets = 0x0, block = 0x7eea2b818c00}, purge_node = 0x0, left_block = 0x0, thr = 0x7e6b9efc8bc8, flag = BTR_CUR_BINARY, tree_height = 3, up_match = 0, up_bytes = 0, low_match = 0, low_bytes = 0, n_fields = 0, n_bytes = 0, fold = 0, path_arr = 0x0, rtr_info = 0x0}, latch_mode = 2, old_stored = false, old_rec = 0x0, old_n_fields = 0, rel_pos = 0, block_when_stored = {m_block = 0x0, m_page_id = {m_id = 0}}, modify_clock = 0, pos_state = BTR_PCUR_IS_POSITIONED, search_mode = PAGE_CUR_LE, trx_if_known = 0x0, old_rec_buf = 0x0, buf_size = 0}
                    cursor = 0x7e6d68760190
                    err = <optimized out>
                    big_rec = 0x0
                    mtr = {m_last = 0x0, m_last_offset = 0, m_log_mode = 0, m_modifications = 0, m_made_dirty = 1, m_inside_ibuf = 0, m_trim_pages = 0, m_memo = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7e6d68760b48, prev = 0x7e6d68760b48}}, size_ = 1}, m_size = 64, m_first_block = {<ilist_node<void>> = {next = 0x7e6d68760b28, prev = 0x7e6d68760b28}, m_data = "\000\000\000\000\000\000\000\000 \000\000\000m~\000\000\000\000\000\000\000\000\000\000\001\000\000\000\323\177\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\214\201+\352~\000\000\002", '\000' <repeats 15 times>, "\022\000\000\000k~\000\000\000j\337}\vmF\251", '\000' <repeats 16 times>, "p\021vhm~\000\000\230\376¶k~", '\000' <repeats 35 times>, "h\221k\246~\000\000\030\000\f_,V\000\000\000\000\000\000\000\000\000\000(\fvhm~\000\000(\fvhm~\000\000\001", '\000' <repeats 15 times>..., m_used = 64}}, m_log = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7e6d68760d78, prev = 0x7e6d68760d78}}, size_ = 1}, m_size = 0, m_first_block = {<ilist_node<void>> = {next = 0x7e6d68760d58, prev = 0x7e6d68760d58}, m_data = "`\220E`,V\000\000pf\276\341\062V\000\000r\000\000\000\000\000\000\000\330J\324b,V\000\000\021\000\000\000\000\000\000\000\060\016vhm~\000\000\241\251\230_,V\000\000pf\276\341\062V\000\000\000\000\000\000\000\000\000\000n\000\000\000\022\000\000\000(o\250\267k~\000\000\310\065!\350j~\000\000\250\245\256\261k~\000\000\030\066[\324k~\000\000\212\001\213\001\214\001\215\001\060\016vhm~\000\000 o\250\267k~\000\000pf\276\341\062V\000\000\277O\230_,V\000\000 o\250\267k~\000\000pf\276\341\062V\000\000\300\016vhm~\000\000\021\355$_,V\000\000H\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000"..., m_used = 0}}, m_user_space = 0x7e6bc4304238, m_commit_lsn = 0, m_freed_space = 0x0, m_freed_pages = 0x0}
                    auto_inc = <optimized out>
                    offsets_heap = 0x0
                    offsets_ = {300, 66, 32814, 4, 10, 16, 23, 27, 34, 36, 46, 48, 51, 53, 56, 60, 64, 67, 99, 107, 123, 127, 131, 134, 138, 140, 142, 174, 175, 176, 177, 178, 179, 187, 199, 207, 219, 222, 225, 228, 230, 234, 237, 241, 245, 251, 257, 265, 266, 274, 275, 276, 277, 278, 280, 281, 283, 285, 287, 290, 295, 296, 297, 298, 299, 305, 306, 834, 841, 424, 427, 437, 447, 450, 453, 463, 473, 476, 479, 489, 300, 3, 32774, 4, 10, 42, 531, 541, 551, 554, 557, 567, 577, 586, 595, 604, 613, 622, 631, 640, 649, 658, 667, 676, 685, 694, 703, 712, 721, 730, 739, 748, 300, 8, 32777, 4, 10, 16, 23, 24, 678, 679, 686, 856, 865, 874, 883, 892, 901, 910, 919, 928, 937, 946, 955, 964, 973, 982, 991, 1000, 1009, 1018, 1027, 1036, 1045, 1054, 1063, 1072, 1081, 1090, 1099, 1108, 1117, 1126, 1135, 1144, 1153, 1162, 1171, 1180, 1189, 1198, 1207, 1216, 1225, 1234, 1243, 1252, 1261, 1273, 1289, 1301, 1317, 1321, 1393, 1399, 1519, 1522, 1523, 1645, 1648, 1673, 1676, 1684, 1693, 1697, 1705, 1708, 1738, 1768, 1828, 1832, 1839, 0, 0, 0, 19672, 40316, 32363, 0...}
                    offsets = 0x7e6d68760650
            #15 0x0000562c5f736149 in row_ins_clust_index_entry (index=index@entry=0x7e6b71ba93a0, entry=entry@entry=0x7e6bade5d690, thr=thr@entry=0x7e6b9efc8bc8, n_ext=n_ext@entry=0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3210
                    err = <optimized out>
                    n_uniq = 1
                    flags = 0
                    skip_locking = <optimized out>
                    orig_n_fields = 25
            #16 0x0000562c5f73685c in row_ins_index_entry (thr=0x7e6b9efc8bc8, entry=0x7e6bade5d690, index=0x7e6b71ba93a0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3333
            No locals.
            #17 row_ins_index_entry_step (thr=0x7e6b9efc8bc8, node=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3502
            No locals.
            #18 row_ins (thr=0x7e6b9efc8bc8, node=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3661
                    index = <optimized out>
                    type = <optimized out>
            #19 row_ins_step (thr=thr@entry=0x7e6b9efc8bc8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3807
                    node = <optimized out>
                    err = <optimized out>
                    sel_node = <optimized out>
            #20 0x0000562c5f74936b in row_insert_for_mysql (mysql_rec=<optimized out>, prebuilt=0x7e6b9efc7b70, ins_mode=ROW_INS_NORMAL) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0mysql.cc:1421
                    err = 1604526928
                    was_lock_wait = <optimized out>
                    node = 0x7e6b9efc8478
                    savept = {least_undo_no = 19}
                    trx = 0x7fefac7c6ff0
                    table = 0x7e6b701515e0
                    blob_heap = <optimized out>
            #21 0x0000562c5f67b604 in ha_innobase::write_row (this=0x7e6b9d9b6080, record=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM         0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/storage/innobase/handler/ha_innodb.cc:7637
            No locals.
            #22 0x0000562c5f36547f in handler::ha_write_row (this=0x7e6b9d9b6080, buf=buf@entry=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM         0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/sql/handler.cc:7210
            No locals.
            #23 0x0000562c5f5ca318 in ha_partition::write_row (this=0x7e6b9eaaf900, buf=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM         0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/sql/ha_partition.cc:4492
                    part_id = 431
                    func_value = 3541341631
                    sms = {thd = 0x7e6bb40009b8, old_mode = 524288}
                    saved_auto_inc_field_not_null = true
                    error = 0
                    have_auto_increment = <optimized out>
                    thd = 0x7e6bb40009b8
            #24 0x0000562c5f36547f in handler::ha_write_row (this=0x7e6b9eaaf900, buf=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM         0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/sql/handler.cc:7210
            No locals.
            #25 0x0000562c5f471705 in Rows_log_event::write_row (this=this@entry=0x7e6bef5c3848, rgi=rgi@entry=0x7e6a2f7fddc0, overwrite=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/log_event_server.cc:7287
                    key = {m_ptr = 0x0}
                    table = 0x7e6b9e17d428
                    error = <optimized out>
                    keynum = <optimized out>
                    invoke_triggers = false
            #26 0x0000562c5f471b6d in Write_rows_log_event::do_exec_row (this=0x7e6bef5c3848, rgi=0x7e6a2f7fddc0) at /usr/src/debug/MariaDB-/src_0/sql/log_event_server.cc:7507
                    tmp = 0x562c5fa27835 "Executing"
                    message = <optimized out>
                    error = <optimized out>
            #27 0x0000562c5f466fb4 in Rows_log_event::do_apply_event (this=0x7e6bef5c3848, rgi=0x7e6a2f7fddc0) at /usr/src/debug/MariaDB-/src_0/sql/log_event_server.cc:5679
                    old_thd = 0x7e6bb40009b8
                    transactional_table = true
                    after_image = <optimized out>
                    saved_sql_mode = 1411383296
                    rli = 0x5632e1d8cb68
                    lex = <optimized out>
                    table = 0x7e6b9e17d428
                    error = <optimized out>
                    new_trg_event_map = <optimized out>
            #28 0x0000562c5f09b36f in apply_event_and_update_pos_apply (ev=0x7e6bef5c3848, thd=0x7e6bb40009b8, rgi=0x7e6a2f7fddc0, reason=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/log_event.h:1498
            No locals.
            #29 0x0000562c5f294f6e in rpt_handle_event (qev=qev@entry=0x7e6bedb60138, rpt=0x7e6bec009c60) at /usr/src/debug/MariaDB-/src_0/sql/rpl_parallel.cc:62
                    err = <optimized out>
                    rli = 0x5632e1d8cb68
                    rgi = 0x7e6a2f7fddc0
                    thd = 0x7e6bb40009b8
                    ev = 0x7e6bef5c3848
            #30 0x0000562c5f29750b in handle_rpl_parallel_thread (arg=arg@entry=0x7e6bec009c60) at /usr/src/debug/MariaDB-/src_0/sql/rpl_parallel.cc:1363
                    event_type = WRITE_ROWS_EVENT_V1
                    rgi = 0x7e6a2f7fddc0
                    entry = 0x7e6bec01d318
                    end_of_group = <optimized out>
                    group_ending = 0
                    wait_count = <optimized out>
                    qev = <optimized out>
                    next_qev = 0x7e6bef53cdd8
                    in_event_group = true
                    group_rgi = 0x7e6a2f7fddc0
                    event_gtid_sub_id = 1752521350
                    sql_info = {cached_charset = "!\000!\000!", rpl_filter = 0x562c62d548b0}
                    err = <optimized out>
                    rpt = <optimized out>
                    group_standalone = false
                    thd = 0x7e6bb40009b8
                    old_stage = {m_key = 0, m_name = 0x562c5fa27a27 "After apply log event", m_flags = 1615069400}
                    events = <optimized out>
                    skip_event_group = false
                    gco = <optimized out>
            #31 0x0000562c5f5cdc1d in pfs_spawn_thread (arg=0x7e6bec00b998) at /usr/src/debug/MariaDB-/src_0/storage/perfschema/pfs.cc:2201
                    typed_arg = 0x7e6bec00b998
                    user_arg = 0x7e6bec009c60
                    pfs = <optimized out>
                    user_start_routine = 0x562c5f296dd0 <handle_rpl_parallel_thread(void*)>
                    klass = <optimized out>
            #32 0x00007fefb3275ea5 in start_thread () from /lib64/libpthread.so.0
            No symbol table info available.
            #33 0x00007fefb140d9fd in clone () from /lib64/libc.so.6
            No symbol table info available.
            

            valerii Valerii Kravchuk added a comment - Customer got the same assertion failure in different context: Thread 1 (Thread 0x7e6d68762700 (LWP 263532)): #0 0x00007fefb327aaa1 in pthread_kill () from /lib64/libpthread.so.0 No symbol table info available. #1 0x0000562c5f35827e in handle_fatal_signal (sig=6) at /usr/src/debug/MariaDB-/src_0/sql/signal_handler.cc:330 curr_time = 1655794807 tm = {tm_sec = 7, tm_min = 0, tm_hour = 14, tm_mday = 21, tm_mon = 5, tm_year = 122, tm_wday = 2, tm_yday = 171, tm_isdst = 0, tm_gmtoff = 25200, tm_zone = 0x562c62d416d0 "+07"} thd = 0x7e6bb40009b8 print_invalid_query_pointer = true #2 <signal handler called> No symbol table info available. #3 0x00007fefb13453d7 in raise () from /lib64/libc.so.6 No symbol table info available. #4 0x00007fefb1346ac8 in abort () from /lib64/libc.so.6 No symbol table info available. #5 0x0000562c5f02d3af in ut_dbg_assertion_failed (expr=expr@entry=0x562c5fc0c130 "space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started", file=file@entry=0x562c5fc0c060 "/home/jenkins/workspace/MariaDBE-Custom-RPM/label/rhel-7/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/innobase/fil/fil0fil.cc", line=line@entry=488) at /usr/src/debug/MariaDB-/src_0/storage/innobase/ut/ut0dbg.cc:60 No locals. #6 0x0000562c5f87aac8 in fil_node_t::prepare_to_close_or_detach (this=this@entry=0x7e6bd502fba8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:487 No locals. #7 0x0000562c5f87ab27 in fil_node_t::close (this=0x7e6bd502fba8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:463 No locals. #8 0x0000562c5f87aca3 in fil_space_t::try_to_close (print_info=false) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:109 node = 0x7e6bd502fba8 space = 0x7e6bd517f948 #9 0x0000562c5f87fc03 in fil_node_open_file (node=node@entry=0x7e6bc4307098) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:438 count = 0 #10 0x0000562c5f03245c in fil_space_t::prepare (this=0x7e6bc4304238, have_mutex=have_mutex@entry=false) at /usr/src/debug/MariaDB-/src_0/storage/innobase/fil/fil0fil.cc:666 node = 0x7e6bc4307098 is_open = <optimized out> #11 0x0000562c5f7f4535 in acquire (this=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/include/fil0fil.h:535 No locals. #12 btr_cur_prefetch_siblings (block=block@entry=0x7eea2b818c00, index=index@entry=0x7e6b71ba93a0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/btr/btr0cur.cc:3343 page = <optimized out> #13 0x0000562c5f7fcc11 in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7e6d68760190, offsets=offsets@entry=0x7e6d687600f0, heap=heap@entry=0x7e6d687600e0, entry=entry@entry=0x7e6bade5d690, rec=rec@entry=0x7e6d687608b0, big_rec=big_rec@entry=0x7e6d687600d0, n_ext=n_ext@entry=0, thr=thr@entry=0x7e6b9efc8bc8, mtr=mtr@entry=0x7e6d68760b10) at /usr/src/debug/MariaDB-/src_0/storage/innobase/btr/btr0cur.cc:3470 big_rec_vec = 0x0 err = DB_FAIL max_size = <optimized out> n_recs = <optimized out> page_cursor = <optimized out> block = 0x7eea2b818c00 page = 0x7eea2c068000 <Address 0x7eea2c068000 out of bounds> dummy = 0x7e6bb4a296f6 "\201\064\212̀" rec_size = 196 index = 0x7e6b71ba93a0 reorg = <optimized out> inherit = true #14 0x0000562c5f7341c4 in row_ins_clust_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=2, index=index@entry=0x7e6b71ba93a0, n_uniq=n_uniq@entry=1, entry=entry@entry=0x7e6bade5d690, n_ext=n_ext@entry=0, thr=thr@entry=0x7e6b9efc8bc8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:2739 insert_rec = 0x7f6a87477100 <Address 0x7f6a87477100 out of bounds> pcur = {btr_cur = {index = 0x7e6b71ba93a0, page_cur = {index = 0x0, rec = 0x7eea2c06b4eb <Address 0x7eea2c06b4eb out of bounds>, offsets = 0x0, block = 0x7eea2b818c00}, purge_node = 0x0, left_block = 0x0, thr = 0x7e6b9efc8bc8, flag = BTR_CUR_BINARY, tree_height = 3, up_match = 0, up_bytes = 0, low_match = 0, low_bytes = 0, n_fields = 0, n_bytes = 0, fold = 0, path_arr = 0x0, rtr_info = 0x0}, latch_mode = 2, old_stored = false, old_rec = 0x0, old_n_fields = 0, rel_pos = 0, block_when_stored = {m_block = 0x0, m_page_id = {m_id = 0}}, modify_clock = 0, pos_state = BTR_PCUR_IS_POSITIONED, search_mode = PAGE_CUR_LE, trx_if_known = 0x0, old_rec_buf = 0x0, buf_size = 0} cursor = 0x7e6d68760190 err = <optimized out> big_rec = 0x0 mtr = {m_last = 0x0, m_last_offset = 0, m_log_mode = 0, m_modifications = 0, m_made_dirty = 1, m_inside_ibuf = 0, m_trim_pages = 0, m_memo = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7e6d68760b48, prev = 0x7e6d68760b48}}, size_ = 1}, m_size = 64, m_first_block = {<ilist_node<void>> = {next = 0x7e6d68760b28, prev = 0x7e6d68760b28}, m_data = "\000\000\000\000\000\000\000\000 \000\000\000m~\000\000\000\000\000\000\000\000\000\000\001\000\000\000\323\177\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\214\201+\352~\000\000\002", '\000' <repeats 15 times>, "\022\000\000\000k~\000\000\000j\337}\vmF\251", '\000' <repeats 16 times>, "p\021vhm~\000\000\230\376¶k~", '\000' <repeats 35 times>, "h\221k\246~\000\000\030\000\f_,V\000\000\000\000\000\000\000\000\000\000(\fvhm~\000\000(\fvhm~\000\000\001", '\000' <repeats 15 times>..., m_used = 64}}, m_log = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7e6d68760d78, prev = 0x7e6d68760d78}}, size_ = 1}, m_size = 0, m_first_block = {<ilist_node<void>> = {next = 0x7e6d68760d58, prev = 0x7e6d68760d58}, m_data = "`\220E`,V\000\000pf\276\341\062V\000\000r\000\000\000\000\000\000\000\330J\324b,V\000\000\021\000\000\000\000\000\000\000\060\016vhm~\000\000\241\251\230_,V\000\000pf\276\341\062V\000\000\000\000\000\000\000\000\000\000n\000\000\000\022\000\000\000(o\250\267k~\000\000\310\065!\350j~\000\000\250\245\256\261k~\000\000\030\066[\324k~\000\000\212\001\213\001\214\001\215\001\060\016vhm~\000\000 o\250\267k~\000\000pf\276\341\062V\000\000\277O\230_,V\000\000 o\250\267k~\000\000pf\276\341\062V\000\000\300\016vhm~\000\000\021\355$_,V\000\000H\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000"..., m_used = 0}}, m_user_space = 0x7e6bc4304238, m_commit_lsn = 0, m_freed_space = 0x0, m_freed_pages = 0x0} auto_inc = <optimized out> offsets_heap = 0x0 offsets_ = {300, 66, 32814, 4, 10, 16, 23, 27, 34, 36, 46, 48, 51, 53, 56, 60, 64, 67, 99, 107, 123, 127, 131, 134, 138, 140, 142, 174, 175, 176, 177, 178, 179, 187, 199, 207, 219, 222, 225, 228, 230, 234, 237, 241, 245, 251, 257, 265, 266, 274, 275, 276, 277, 278, 280, 281, 283, 285, 287, 290, 295, 296, 297, 298, 299, 305, 306, 834, 841, 424, 427, 437, 447, 450, 453, 463, 473, 476, 479, 489, 300, 3, 32774, 4, 10, 42, 531, 541, 551, 554, 557, 567, 577, 586, 595, 604, 613, 622, 631, 640, 649, 658, 667, 676, 685, 694, 703, 712, 721, 730, 739, 748, 300, 8, 32777, 4, 10, 16, 23, 24, 678, 679, 686, 856, 865, 874, 883, 892, 901, 910, 919, 928, 937, 946, 955, 964, 973, 982, 991, 1000, 1009, 1018, 1027, 1036, 1045, 1054, 1063, 1072, 1081, 1090, 1099, 1108, 1117, 1126, 1135, 1144, 1153, 1162, 1171, 1180, 1189, 1198, 1207, 1216, 1225, 1234, 1243, 1252, 1261, 1273, 1289, 1301, 1317, 1321, 1393, 1399, 1519, 1522, 1523, 1645, 1648, 1673, 1676, 1684, 1693, 1697, 1705, 1708, 1738, 1768, 1828, 1832, 1839, 0, 0, 0, 19672, 40316, 32363, 0...} offsets = 0x7e6d68760650 #15 0x0000562c5f736149 in row_ins_clust_index_entry (index=index@entry=0x7e6b71ba93a0, entry=entry@entry=0x7e6bade5d690, thr=thr@entry=0x7e6b9efc8bc8, n_ext=n_ext@entry=0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3210 err = <optimized out> n_uniq = 1 flags = 0 skip_locking = <optimized out> orig_n_fields = 25 #16 0x0000562c5f73685c in row_ins_index_entry (thr=0x7e6b9efc8bc8, entry=0x7e6bade5d690, index=0x7e6b71ba93a0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3333 No locals. #17 row_ins_index_entry_step (thr=0x7e6b9efc8bc8, node=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3502 No locals. #18 row_ins (thr=0x7e6b9efc8bc8, node=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3661 index = <optimized out> type = <optimized out> #19 row_ins_step (thr=thr@entry=0x7e6b9efc8bc8) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0ins.cc:3807 node = <optimized out> err = <optimized out> sel_node = <optimized out> #20 0x0000562c5f74936b in row_insert_for_mysql (mysql_rec=<optimized out>, prebuilt=0x7e6b9efc7b70, ins_mode=ROW_INS_NORMAL) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0mysql.cc:1421 err = 1604526928 was_lock_wait = <optimized out> node = 0x7e6b9efc8478 savept = {least_undo_no = 19} trx = 0x7fefac7c6ff0 table = 0x7e6b701515e0 blob_heap = <optimized out> #21 0x0000562c5f67b604 in ha_innobase::write_row (this=0x7e6b9d9b6080, record=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM 0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/storage/innobase/handler/ha_innodb.cc:7637 No locals. #22 0x0000562c5f36547f in handler::ha_write_row (this=0x7e6b9d9b6080, buf=buf@entry=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM 0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/sql/handler.cc:7210 No locals. #23 0x0000562c5f5ca318 in ha_partition::write_row (this=0x7e6b9eaaf900, buf=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM 0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/sql/ha_partition.cc:4492 part_id = 431 func_value = 3541341631 sms = {thd = 0x7e6bb40009b8, old_mode = 524288} saved_auto_inc_field_not_null = true error = 0 have_auto_increment = <optimized out> thd = 0x7e6bb40009b8 #24 0x0000562c5f36547f in handler::ha_write_row (this=0x7e6b9eaaf900, buf=0x7e6b9d23c388 "BONEW1PI339J8GLW", ' ' <repeats 80 times>, "\201\064\212\315TRM 0351040", ' ' <repeats 14 times>, "\r") at /usr/src/debug/MariaDB-/src_0/sql/handler.cc:7210 No locals. #25 0x0000562c5f471705 in Rows_log_event::write_row (this=this@entry=0x7e6bef5c3848, rgi=rgi@entry=0x7e6a2f7fddc0, overwrite=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/log_event_server.cc:7287 key = {m_ptr = 0x0} table = 0x7e6b9e17d428 error = <optimized out> keynum = <optimized out> invoke_triggers = false #26 0x0000562c5f471b6d in Write_rows_log_event::do_exec_row (this=0x7e6bef5c3848, rgi=0x7e6a2f7fddc0) at /usr/src/debug/MariaDB-/src_0/sql/log_event_server.cc:7507 tmp = 0x562c5fa27835 "Executing" message = <optimized out> error = <optimized out> #27 0x0000562c5f466fb4 in Rows_log_event::do_apply_event (this=0x7e6bef5c3848, rgi=0x7e6a2f7fddc0) at /usr/src/debug/MariaDB-/src_0/sql/log_event_server.cc:5679 old_thd = 0x7e6bb40009b8 transactional_table = true after_image = <optimized out> saved_sql_mode = 1411383296 rli = 0x5632e1d8cb68 lex = <optimized out> table = 0x7e6b9e17d428 error = <optimized out> new_trg_event_map = <optimized out> #28 0x0000562c5f09b36f in apply_event_and_update_pos_apply (ev=0x7e6bef5c3848, thd=0x7e6bb40009b8, rgi=0x7e6a2f7fddc0, reason=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/log_event.h:1498 No locals. #29 0x0000562c5f294f6e in rpt_handle_event (qev=qev@entry=0x7e6bedb60138, rpt=0x7e6bec009c60) at /usr/src/debug/MariaDB-/src_0/sql/rpl_parallel.cc:62 err = <optimized out> rli = 0x5632e1d8cb68 rgi = 0x7e6a2f7fddc0 thd = 0x7e6bb40009b8 ev = 0x7e6bef5c3848 #30 0x0000562c5f29750b in handle_rpl_parallel_thread (arg=arg@entry=0x7e6bec009c60) at /usr/src/debug/MariaDB-/src_0/sql/rpl_parallel.cc:1363 event_type = WRITE_ROWS_EVENT_V1 rgi = 0x7e6a2f7fddc0 entry = 0x7e6bec01d318 end_of_group = <optimized out> group_ending = 0 wait_count = <optimized out> qev = <optimized out> next_qev = 0x7e6bef53cdd8 in_event_group = true group_rgi = 0x7e6a2f7fddc0 event_gtid_sub_id = 1752521350 sql_info = {cached_charset = "!\000!\000!", rpl_filter = 0x562c62d548b0} err = <optimized out> rpt = <optimized out> group_standalone = false thd = 0x7e6bb40009b8 old_stage = {m_key = 0, m_name = 0x562c5fa27a27 "After apply log event", m_flags = 1615069400} events = <optimized out> skip_event_group = false gco = <optimized out> #31 0x0000562c5f5cdc1d in pfs_spawn_thread (arg=0x7e6bec00b998) at /usr/src/debug/MariaDB-/src_0/storage/perfschema/pfs.cc:2201 typed_arg = 0x7e6bec00b998 user_arg = 0x7e6bec009c60 pfs = <optimized out> user_start_routine = 0x562c5f296dd0 <handle_rpl_parallel_thread(void*)> klass = <optimized out> #32 0x00007fefb3275ea5 in start_thread () from /lib64/libpthread.so.0 No symbol table info available. #33 0x00007fefb140d9fd in clone () from /lib64/libc.so.6 No symbol table info available.

            valerii, can you also get the following output from that crash (which is related to enforcing the innodb_open_files limit)?

            frame 6
            print *this
            print *space
            

            Furthermore, in the output of thread apply all backtrace, do you see any pointers to either object in any other thread?

            marko Marko Mäkelä added a comment - valerii , can you also get the following output from that crash (which is related to enforcing the innodb_open_files limit)? frame 6 print *this print *space Furthermore, in the output of thread apply all backtrace , do you see any pointers to either object in any other thread?

            Neither the being-closed tablespace object nor the name of the partitioned table occur in any other thread.

            The value of space->n_pending in the core dump is CLOSING+1, which would seem to satisfy the assertion expression space->is_ready_to_close().

            I think that we should try to reproduce this with astress test run with a small value of innodb_open_files and large numbers of partitions, on a non-debug executable. Once we have something reproducible, it should be easier to figure out what is going on.

            marko Marko Mäkelä added a comment - Neither the being-closed tablespace object nor the name of the partitioned table occur in any other thread. The value of space->n_pending in the core dump is CLOSING+1 , which would seem to satisfy the assertion expression space->is_ready_to_close() . I think that we should try to reproduce this with astress test run with a small value of innodb_open_files and large numbers of partitions, on a non-debug executable. Once we have something reproducible, it should be easier to figure out what is going on.
            mleich Matthias Leich added a comment - - edited

            I am able to replay the assert on
                origin/10.5 674842bee0a564a2d94e99c4e1319a716aa10aa9 2022-06-16T00:12:11+03:00
            compiled without debug.
            All attempts to replay with invoking 'rr' or using a build with debug had no luck.
             
            RQG
            ===
            # git clone https://github.com/mleich1/rqg --branch experimental RQG
            #
            # GIT_SHOW: HEAD -> experimental, origin/experimental a32979c6d789ddf0982e6a56af687565be00590c 2022-06-14T17:21:56+02:00
            # rqg.pl  : Version 4.0.6 (2022-05)
            #
            # $RQG_HOME/rqg.pl \
            # --duration=300 \
            # --queries=10000000 \
            # --no_mask \
            # --seed=random \
            # --rpl_mode=none \
            # --max_gd_duration=1200 \
            # --engine=InnoDB \
            # --mysqld=--file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
            # --mysqld=--innodb-lock-wait-timeout=50 \
            # --mysqld=--sync-binlog=1 \
            # --mysqld=--net_read_timeout=30 \
            # --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
            # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
            # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
            # --mysqld=--slave_net_timeout=60 \
            # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
            # --mysqld=--innodb_open_files=15 \
            # --mysqld=--plugin-load-add=file_key_management.so \
            # --mysqld=--wait_timeout=28800 \
            # --mysqld=--interactive_timeout=28800 \
            # --mysqld=--lock-wait-timeout=86400 \
            # --mysqld=--log_bin_trust_function_creators=1 \
            # --mysqld=--loose-idle_write_transaction_timeout=0 \
            # --mysqld=--connect_timeout=60 \
            # --mysqld=--loose-max-statement-time=30 \
            # --mysqld=--loose-idle_transaction_timeout=0 \
            # --mysqld=--innodb-buffer-pool-size=8M \
            # --mysqld=--loose-innodb_read_only_compressed=OFF \
            # --mysqld=--log-bin \
            # --mysqld=--log-output=none \
            # --mysqld=--loose-table_lock_wait_timeout=50 \
            # --mysqld=--net_write_timeout=60 \
            # --mysqld=--innodb_page_size=8K \
            # --mysqld=--loose-plugin-load-add=provider_lz4.so \
            # --reporters=Backtrace,Deadlock1,ErrorLog \
            # --validators=None \
            # --grammar=MDEV-26293.yy \
            # --threads=7 \
            # <local settings>
             
            In average several hundred RQG runs are required for one replay.
             
            The simplified RQG grammar MDEV-26293 shows
            alter:
                ALTER TABLE table_name_part alter_operation ;
             
            alter_operation:
                ADD PARTITION (PARTITION p3 VALUES LESS THAN MAXVALUE) |
                ORDER BY `col_int_nokey` ;
             
            create_part:
                CREATE TABLE IF NOT EXISTS table_name_part (
            `col_int_nokey` INTEGER,
            `col_int_key` INTEGER NOT NULL,
            KEY (`col_int_key`)
            ) ENGINE = INNODB ;
             
            exec_sql:
                alter |
                alter |
                alter |
                alter |
                alter |
                alter |
                create_part |
                create_part |
                create_part ;
             
            query:
                exec_sql ;
             
            query_init:
                ;
             
            table_name_part:
                d |
                f |
                g |
                h |
                j |
                k |
                n |
                q |
                t |
                y ;
             
            thread_connect:
                ;
            

            mleich Matthias Leich added a comment - - edited I am able to replay the assert on origin/10.5 674842bee0a564a2d94e99c4e1319a716aa10aa9 2022-06-16T00:12:11+03:00 compiled without debug. All attempts to replay with invoking 'rr' or using a build with debug had no luck.   RQG === # git clone https://github.com/mleich1/rqg --branch experimental RQG # # GIT_SHOW: HEAD -> experimental, origin/experimental a32979c6d789ddf0982e6a56af687565be00590c 2022-06-14T17:21:56+02:00 # rqg.pl : Version 4.0.6 (2022-05) # # $RQG_HOME/rqg.pl \ # --duration=300 \ # --queries=10000000 \ # --no_mask \ # --seed=random \ # --rpl_mode=none \ # --max_gd_duration=1200 \ # --engine=InnoDB \ # --mysqld=--file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \ # --mysqld=--innodb-lock-wait-timeout=50 \ # --mysqld=--sync-binlog=1 \ # --mysqld=--net_read_timeout=30 \ # --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \ # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \ # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \ # --mysqld=--slave_net_timeout=60 \ # --mysqld=--loose-idle_readonly_transaction_timeout=0 \ # --mysqld=--innodb_open_files=15 \ # --mysqld=--plugin-load-add=file_key_management.so \ # --mysqld=--wait_timeout=28800 \ # --mysqld=--interactive_timeout=28800 \ # --mysqld=--lock-wait-timeout=86400 \ # --mysqld=--log_bin_trust_function_creators=1 \ # --mysqld=--loose-idle_write_transaction_timeout=0 \ # --mysqld=--connect_timeout=60 \ # --mysqld=--loose-max-statement-time=30 \ # --mysqld=--loose-idle_transaction_timeout=0 \ # --mysqld=--innodb-buffer-pool-size=8M \ # --mysqld=--loose-innodb_read_only_compressed=OFF \ # --mysqld=--log-bin \ # --mysqld=--log-output=none \ # --mysqld=--loose-table_lock_wait_timeout=50 \ # --mysqld=--net_write_timeout=60 \ # --mysqld=--innodb_page_size=8K \ # --mysqld=--loose-plugin-load-add=provider_lz4.so \ # --reporters=Backtrace,Deadlock1,ErrorLog \ # --validators=None \ # --grammar=MDEV-26293.yy \ # --threads=7 \ # <local settings>   In average several hundred RQG runs are required for one replay.   The simplified RQG grammar MDEV-26293 shows alter: ALTER TABLE table_name_part alter_operation ;   alter_operation: ADD PARTITION (PARTITION p3 VALUES LESS THAN MAXVALUE) | ORDER BY `col_int_nokey` ;   create_part: CREATE TABLE IF NOT EXISTS table_name_part ( `col_int_nokey` INTEGER, `col_int_key` INTEGER NOT NULL, KEY (`col_int_key`) ) ENGINE = INNODB ;   exec_sql: alter | alter | alter | alter | alter | alter | create_part | create_part | create_part ;   query: exec_sql ;   query_init: ;   table_name_part: d | f | g | h | j | k | n | q | t | y ;   thread_connect: ;
            mleich Matthias Leich made changes -
            Attachment MDEV-26293.yy [ 64339 ]
            mleich Matthias Leich made changes -
            Attachment MDEV-26293.cfg [ 64340 ]
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            I was unable to reproduce this locally. A core dump that mleich provided illustrated the problem. I think that the scenario was as follows:

            1. fil_space_t::try_to_close() had called fil_space_t::set_closing(), which returned 0, meaning that no references exist and no NEEDS_FSYNC flag was set.
            2. Another thread executed fil_space_t::acquire_low(), which incremented the n_pending from CLOSING to CLOSING+1.
            3. The assertion failed due to the unexpected reference.

            My suggested fix is to avoid incrementing the reference count if the CLOSING flag is set and the caller is not holding fil_system.mutex. Incrementing the reference is OK if the NEEDS_FSYNC flag is set. If the reference was not incremented due to the CLOSING flag, fil_space_t::acquire() must acquire fil_system.mutex and try incrementing the reference again. Because fil_space_t::try_to_close() is holding fil_system.mutex, that fix should prevent this race condition.

            marko Marko Mäkelä added a comment - I was unable to reproduce this locally. A core dump that mleich provided illustrated the problem. I think that the scenario was as follows: fil_space_t::try_to_close() had called fil_space_t::set_closing() , which returned 0, meaning that no references exist and no NEEDS_FSYNC flag was set. Another thread executed fil_space_t::acquire_low() , which incremented the n_pending from CLOSING to CLOSING+1 . The assertion failed due to the unexpected reference. My suggested fix is to avoid incrementing the reference count if the CLOSING flag is set and the caller is not holding fil_system.mutex . Incrementing the reference is OK if the NEEDS_FSYNC flag is set. If the reference was not incremented due to the CLOSING flag, fil_space_t::acquire() must acquire fil_system.mutex and try incrementing the reference again. Because fil_space_t::try_to_close() is holding fil_system.mutex , that fix should prevent this race condition.
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Labels crash race regression-10.5
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2022-06-30 16:38:23.0 2022-06-30 16:38:23.549
            marko Marko Mäkelä made changes -
            Fix Version/s 10.5.17 [ 27509 ]
            Fix Version/s 10.6.9 [ 27507 ]
            Fix Version/s 10.7.5 [ 27505 ]
            Fix Version/s 10.8.4 [ 27503 ]
            Fix Version/s 10.9.2 [ 27115 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            marko Marko Mäkelä made changes -
            alice Alice Sherepa made changes -
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 190628

            People

              marko Marko Mäkelä
              schandol srinivas chandolu
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.