MDEV-30572 (MariaDB Server)

main.large_pages 'innodb' fails on architecture hppa: InnoDB: Operating system error number 14 in a file operation

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not a Bug
    • Affects Version/s: 10.11
    • Fix Version/s: N/A
    • Component/s: Server
    • Labels: None

    Description

      The official Debian build of MariaDB 1:10.11.1-2 failed on the Debian hppa builder (build log: https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=hppa&ver=1%3A10.11.1-2&stamp=1675231483&raw=0) with:

      main.large_pages 'innodb'                w2 [ fail ]
              Test ended at 2023-02-01 00:35:17
      CURRENT_TEST: main.large_pages
      Failed to start mysqld.1
      mysqltest failed but provided no output
       - found 'core' (0/5)
      Trying 'dbx' to get a backtrace
      Trying 'lldb' to get a backtrace from coredump /<<PKGBUILDDIR>>/builddir/mysql-test/var/2/log/main.large_pages-innodb/mysqld.1/data/core
      Compressed file /<<PKGBUILDDIR>>/builddir/mysql-test/var/2/log/main.large_pages-innodb/mysqld.1/data/core
       - saving '/<<PKGBUILDDIR>>/builddir/mysql-test/var/2/log/main.large_pages-innodb/' to '/<<PKGBUILDDIR>>/builddir/mysql-test/var/log/main.large_pages-innodb/'
      Retrying test main.large_pages, attempt(2/3)...
      ***Warnings generated in error logs during shutdown after running tests: main.large_pages
      2023-02-01  0:35:16 0 [Warning] mariadbd: Couldn't allocate 8388608 bytes (Large/HugeTLB memory page size 2097152); errno 22; continuing to smaller size
      2023-02-01  0:35:16 0 [Warning] mariadbd: Couldn't allocate 6291456 bytes (Large/HugeTLB memory page size 2097152); errno 22; continuing to smaller size
      2023-02-01  0:35:16 0 [Warning] mariadbd: Couldn't allocate 6291456 bytes (Large/HugeTLB memory page size 2097152); errno 22; continuing to smaller size
      2023-02-01  0:35:16 0 [Warning] mariadbd: Couldn't allocate 4194304 bytes (Large/HugeTLB memory page size 2097152); errno 22; continuing to smaller size
      2023-02-01  0:35:16 0 [Warning] mariadbd: Couldn't allocate 4194304 bytes (Large/HugeTLB memory page size 2097152); errno 22; continuing to smaller size
      2023-02-01  0:35:16 0 [Warning] InnoDB: Retry attempts for reading partial data failed.
      2023-02-01  0:35:16 0 [ERROR] InnoDB: Operating system error number 14 in a file operation.
      2023-02-01  0:35:16 0 [ERROR] InnoDB: Error number 14 means 'Bad address'
      2023-02-01  0:35:16 0 [ERROR] InnoDB: File (unknown): 'read' returned OS error 214. Cannot continue operation
      Attempting backtrace. You can use the following information to find out
      

      The only other recorded case of OS error 14 was in MDEV-12039.

      This and other hppa issues are tracked downstream in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1006529
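
The numbers in the log above can be decoded with a few lines of Python. This is only a reading aid, not part of the MariaDB test suite: it looks up the two errno values (22 from the allocation warnings, 14 from the InnoDB error) and checks that each attempted allocation size is a whole multiple of the reported 2097152-byte huge page size. The names `HUGE_PAGE_SIZE`, `ATTEMPTED_SIZES` and `describe` are made up for this sketch, and the symbolic names printed are those of the platform running the script (Linux is assumed).

```python
import errno
import os

# Values copied from the warnings in the build log.
HUGE_PAGE_SIZE = 2097152
ATTEMPTED_SIZES = [8388608, 6291456, 4194304]


def describe(code):
    """Return 'number = NAME (message)' for an errno value on this platform."""
    name = errno.errorcode.get(code, "unknown")
    return f"{code} = {name} ({os.strerror(code)})"


for size in ATTEMPTED_SIZES:
    pages, remainder = divmod(size, HUGE_PAGE_SIZE)
    print(f"{size} bytes = {pages} huge pages, remainder {remainder}")

print(describe(22))  # errno from the "Couldn't allocate" warnings
print(describe(14))  # errno from the InnoDB read failure
```

Every attempted size divides evenly by the huge page size, so the errno 22 warnings do not come from a request that is not a multiple of the page size.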

          Activity

            The above was backported in https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/7ac10dee3b961cf69b330de23df5f8554450783e to the latest Debian build. However, the server now fails to start at all; see https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=hppa&ver=1%3A10.11.1-4&stamp=1676007600&raw=0

            ```
            MariaDB Version 10.11.1-MariaDB-4

            • SSL connections supported
              Using suites: main
              Collecting tests...
              Installing system database...
            • found 'core' (0/5)
              Core generated by '/<<PKGBUILDDIR>>/builddir/sql/mariadbd'
              Output from gdb follows. The first stack trace is from the failing thread.
              The following stack traces are from all threads (so the failing one is
              duplicated).
              --------------------------
              warning: Can't open file anon_inode:[io_uring] which was expanded to anon_inode:[io_uring] during file-backed mapping note processing
              warning: Can't open file anon_inode:[io_uring] which was expanded to anon_inode:[io_uring] during file-backed mapping note processing
              [New LWP 31431]
              [New LWP 31427]
              [New LWP 31429]
              [New LWP 31428]
              [New LWP 31430]
              [Thread debugging using libthread_db enabled]
              Using host libthread_db library "/lib/hppa-linux-gnu/libthread_db.so.1".
              Core was generated by `/<<PKGBUILDDIR>>/builddir/sql/mariadbd --no-defaults --dis'.
              Program terminated with signal SIGABRT, Aborted.
              #0 0x43469c84 in my_register_filename (fd=1137958840, FileName=0x6 <error: Cannot access memory at address 0x6>, type_of_file=3646115528, error_message_number=<optimized out>, MyFlags=<optimized out>) at ./mysys/my_open.c:140
              140 ./mysys/my_open.c: No such file or directory.
              [Current thread is 1 (Thread 0xd9d34380 (LWP 31431))]
              #0 0x43469c84 in my_register_filename (fd=1137958840, FileName=0x6 <error: Cannot access memory at address 0x6>, type_of_file=3646115528, error_message_number=<optimized out>, MyFlags=<optimized out>) at ./mysys/my_open.c:140
              Backtrace stopped: Cannot access memory at address 0x7ab3
              ```

            The same upload also had other patches, so what we are seeing might be due to something else as well.

            Thus I re-opened https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1006529 and I also see that https://jira.mariadb.org/browse/MDEV-30572 remains open.

            otto Otto Kekäläinen added the comment above.
            danblack Daniel Black added a comment -

            Putting every hppa issue in this one Jira issue isn't helpful. The stack trace above isn't complete. A call to my_register_filename with such a large file descriptor seems implausible, as do the FileName address and the type_of_file value (the enum only goes up to 7), though it would segfault on line 140 if this did occur. Without a full stack trace I couldn't guess why this occurred.

            The only things to be considered here that I see are:

            • Should my_large_malloc fall back to a conventional malloc when mmap fails with EINVAL? (I'm currently not convinced.)
            • Does InnoDB correctly handle my_large_malloc allocation failures?
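
The point about implausible arguments in frame #0 can be checked mechanically. The sketch below takes the three values that gdb printed and tests them against rough plausibility bounds. The limit of 7 for type_of_file comes from the comment above; `ASSUMED_FD_LIMIT` and the "first page is never mapped" threshold are assumptions made for illustration, and `implausible_args` is a hypothetical helper, not anything in MariaDB.

```python
# Values copied from frame #0 of the gdb output.
FRAME_ARGS = {"fd": 1137958840, "FileName": 0x6, "type_of_file": 3646115528}

MAX_FILE_TYPE = 7            # per the comment: the enum goes up to 7
ASSUMED_FD_LIMIT = 1 << 20   # assumption: a generous per-process descriptor limit
ASSUMED_FIRST_PAGE = 4096    # assumption: addresses below this are never mapped


def implausible_args(args):
    """Return the names of arguments that fall outside the plausible range."""
    suspect = []
    if not 0 <= args["fd"] < ASSUMED_FD_LIMIT:
        suspect.append("fd")
    if args["FileName"] < ASSUMED_FIRST_PAGE:
        suspect.append("FileName")
    if not 0 <= args["type_of_file"] <= MAX_FILE_TYPE:
        suspect.append("type_of_file")
    return suspect


for name, value in FRAME_ARGS.items():
    print(f"{name:>12} = {value:#010x}")
print("implausible:", implausible_args(FRAME_ARGS))
```

All three arguments fail the check. Printed in hexadecimal, fd and type_of_file look more like addresses than small integers, which may indicate that gdb decoded the frame incorrectly rather than that the function was really called with these values.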

            danblack, I think that my_large_malloc() needs to fall back to the aligned_malloc() wrapper that is defined in include/aligned.h. InnoDB certainly assumes that it gets aligned memory.

            marko Marko Mäkelä added the comment above.
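
The fallback being discussed can be illustrated outside the server. The following is a minimal sketch in Python, not MariaDB's my_large_malloc() and not the aligned_malloc() wrapper: it tries a huge-page anonymous mapping and, if the kernel rejects it, falls back to an ordinary anonymous mapping, which is still page-aligned. The function names are hypothetical, and the numeric fallback value for `MAP_HUGETLB` is an assumption that holds on x86-64 and most Linux ports but not on every architecture (parisc is one that differs).

```python
import ctypes
import errno
import mmap

# Assumption: 0x40000 is MAP_HUGETLB on this platform when the mmap module
# does not export the constant itself. The value is architecture-specific.
MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # 2097152, as reported in the warnings


def large_alloc_with_fallback(size):
    """Try a huge-page mapping first; fall back to a normal anonymous mapping.

    Returns (buffer, used_huge_pages).
    """
    base_flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    try:
        return mmap.mmap(-1, size, flags=base_flags | MAP_HUGETLB), True
    except OSError as exc:
        # The hppa builder reported EINVAL (22); ENOMEM is common when no
        # huge pages are reserved. Either way, continue with normal pages.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        print(f"huge-page mmap failed with {name}; falling back")
    return mmap.mmap(-1, size, flags=base_flags), False


def address_of(buf):
    """Return the start address of a writable buffer such as an mmap object."""
    view = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(view)
    del view  # release the buffer export so the mapping can be closed later
    return addr


buf, used_huge = large_alloc_with_fallback(4 * HUGE_PAGE_SIZE)
print("huge pages used:", used_huge)
print("page-aligned:", address_of(buf) % mmap.PAGESIZE == 0)
```

Whichever path is taken, the caller receives page-aligned memory, which is the property the comment above says InnoDB relies on. Whether page alignment alone is sufficient for InnoDB is not something this sketch establishes.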

            Helge Deller reported on the second issue:

            mariadb fails on the hppa architecture, because there is a kernel bug
            (on parisc and probably other architectures) in the io_uring syscall.
            This is worked on upstream, e.g. this mail thread:
            https://lore.kernel.org/io-uring/507c7873-8888-dbcb-c512-4659af486848@bell.net/T/#t
            We hope to get the kernel fixed in upcoming versions.

            otto Otto Kekäläinen added the comment above.
            otto Otto Kekäläinen added a comment -

            This is passing in the latest build: https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=hppa&ver=1%3A10.11.5-3&stamp=1697355657&raw=0

              main.large_pages 'innodb' [ pass ] 104

            People

              danblack Daniel Black
              otto Otto Kekäläinen