  1. MariaDB Server
  2. MDEV-14244

MariaDB 10.2.10 fails to run on Debian Stretch with ext3 and O_DIRECT

Details

    Description

      I first spotted this issue when upgrading a production server to 10.2.10: MariaDB would fail to start after the upgrade, with an error along the lines of ibtmp1 failing to be created. I purged mariadb-server-10.2 and tried reinstalling, but it still failed, so I downgraded that server to 10.1, where it is now running fine.

      Now I tried installing mariadb-server-10.2 on a different server where it has never been installed before, and this is the output from apt-get install mariadb-server-10.2:

      # apt-get install mariadb-server-10.2
      Reading package lists... Done
      Building dependency tree       
      Reading state information... Done
      The following additional packages will be installed:
        galera-3 gawk libcgi-fast-perl libcgi-pm-perl libdbd-mysql-perl libdbi-perl
        libencode-locale-perl libfcgi-perl libhtml-parser-perl libhtml-tagset-perl
        libhtml-template-perl libhttp-date-perl libhttp-message-perl libio-html-perl
        liblwp-mediatypes-perl libmariadb3 libmariadbclient18 libreadline5
        libsigsegv2 libterm-readkey-perl libtimedate-perl liburi-perl
        mariadb-client-10.2 mariadb-client-core-10.2 mariadb-common
        mariadb-server-core-10.2 mysql-common psmisc socat
      Suggested packages:
        gawk-doc libclone-perl libmldbm-perl libnet-daemon-perl
        libsql-statement-perl libdata-dump-perl libipc-sharedcache-perl libwww-perl
        mailx mariadb-test netcat-openbsd tinyca
      The following NEW packages will be installed:
        galera-3 gawk libcgi-fast-perl libcgi-pm-perl libdbd-mysql-perl libdbi-perl
        libencode-locale-perl libfcgi-perl libhtml-parser-perl libhtml-tagset-perl
        libhtml-template-perl libhttp-date-perl libhttp-message-perl libio-html-perl
        liblwp-mediatypes-perl libmariadb3 libmariadbclient18 libreadline5
        libsigsegv2 libterm-readkey-perl libtimedate-perl liburi-perl
        mariadb-client-10.2 mariadb-client-core-10.2 mariadb-common
        mariadb-server-10.2 mariadb-server-core-10.2 mysql-common psmisc socat
      0 upgraded, 30 newly installed, 0 to remove and 0 not upgraded.
      Need to get 22.6 MB of archives.
      After this operation, 186 MB of additional disk space will be used.
      Do you want to continue? [Y/n] 
      Get:1 http://ftp.debian.org/debian stretch/main amd64 libsigsegv2 amd64 2.10-5 [28.9 kB]
      Get:2 http://ftp.debian.org/debian stretch/main amd64 gawk amd64 1:4.1.4+dfsg-1 [571 kB]
      Get:3 http://ftp.debian.org/debian stretch/main amd64 libdbi-perl amd64 1.636-1+b1 [766 kB]
      Get:4 http://ftp.debian.org/debian stretch/main amd64 libreadline5 amd64 5.2+dfsg-3+b1 [119 kB]
      Get:5 http://ftp.debian.org/debian stretch/main amd64 psmisc amd64 22.21-2.1+b2 [123 kB]
      Get:6 http://ftp.debian.org/debian stretch/main amd64 socat amd64 1.7.3.1-2+deb9u1 [353 kB]
      Get:7 http://ftp.debian.org/debian stretch/main amd64 libhtml-tagset-perl all 3.20-3 [12.7 kB]
      Get:8 http://ftp.debian.org/debian stretch/main amd64 liburi-perl all 1.71-1 [88.6 kB]
      Get:9 http://ftp.debian.org/debian stretch/main amd64 libhtml-parser-perl amd64 3.72-3 [104 kB]
      Get:10 http://ftp.debian.org/debian stretch/main amd64 libcgi-pm-perl all 4.35-1 [222 kB]
      Get:11 http://ftp.debian.org/debian stretch/main amd64 libfcgi-perl amd64 0.78-2 [38.2 kB]
      Get:12 http://ftp.debian.org/debian stretch/main amd64 libcgi-fast-perl all 1:2.12-1 [11.2 kB]
      Get:13 http://ftp.debian.org/debian stretch/main amd64 libdbd-mysql-perl amd64 4.041-2 [114 kB]
      Get:14 http://ftp.debian.org/debian stretch/main amd64 libencode-locale-perl all 1.05-1 [13.7 kB]
      Get:15 http://ftp.debian.org/debian stretch/main amd64 libhtml-template-perl all 2.95-2 [67.1 kB]
      Get:16 http://ftp.debian.org/debian stretch/main amd64 libtimedate-perl all 2.3000-2 [42.2 kB]
      Get:17 http://ftp.debian.org/debian stretch/main amd64 libhttp-date-perl all 6.02-1 [10.7 kB]
      Get:18 http://ftp.debian.org/debian stretch/main amd64 libio-html-perl all 1.001-1 [17.6 kB]
      Get:19 http://ftp.debian.org/debian stretch/main amd64 liblwp-mediatypes-perl all 6.02-1 [22.1 kB]
      Get:20 http://ftp.debian.org/debian stretch/main amd64 libhttp-message-perl all 6.11-1 [75.9 kB]
      Get:21 http://ftp.debian.org/debian stretch/main amd64 libterm-readkey-perl amd64 2.37-1 [27.2 kB]
      Get:22 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main i386 mysql-common all 10.2.10+maria~stretch [8432 B]
      Get:23 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main i386 mariadb-common all 10.2.10+maria~stretch [3328 B]
      Get:24 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 galera-3 amd64 25.3.20-stretch [8267 kB]
      Get:25 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 mariadb-client-core-10.2 amd64 10.2.10+maria~stretch [741 kB]
      Get:26 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 mariadb-client-10.2 amd64 10.2.10+maria~stretch [1101 kB]
      Get:27 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 mariadb-server-core-10.2 amd64 10.2.10+maria~stretch [5474 kB]
      Get:28 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 mariadb-server-10.2 amd64 10.2.10+maria~stretch [4112 kB]
      Get:29 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 libmariadb3 amd64 10.2.10+maria~stretch [110 kB]
      Get:30 http://ftp.ddg.lth.se/mariadb/repo/10.2/debian stretch/main amd64 libmariadbclient18 amd64 10.2.10+maria~stretch [2940 B]
      Fetched 22.6 MB in 2s (9522 kB/s)             
      perl: warning: Setting locale failed.
      perl: warning: Please check that your locale settings:
      	LANGUAGE = (unset),
      	LC_ALL = (unset),
      	LC_CTYPE = "UTF-8",
      	LANG = "en_US.UTF-8"
          are supported and installed on your system.
      perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
      locale: Cannot set LC_CTYPE to default locale: No such file or directory
      locale: Cannot set LC_ALL to default locale: No such file or directory
      Preconfiguring packages ...
      /usr/bin/locale: Cannot set LC_CTYPE to default locale: No such file or directory
      /usr/bin/locale: Cannot set LC_ALL to default locale: No such file or directory
      Selecting previously unselected package libsigsegv2:amd64.
      (Reading database ... 76085 files and directories currently installed.)
      Preparing to unpack .../libsigsegv2_2.10-5_amd64.deb ...
      Unpacking libsigsegv2:amd64 (2.10-5) ...
      Setting up libsigsegv2:amd64 (2.10-5) ...
      Selecting previously unselected package gawk.
      (Reading database ... 76095 files and directories currently installed.)
      Preparing to unpack .../00-gawk_1%3a4.1.4+dfsg-1_amd64.deb ...
      Unpacking gawk (1:4.1.4+dfsg-1) ...
      Selecting previously unselected package mysql-common.
      Preparing to unpack .../01-mysql-common_10.2.10+maria~stretch_all.deb ...
      Unpacking mysql-common (10.2.10+maria~stretch) ...
      Selecting previously unselected package mariadb-common.
      Preparing to unpack .../02-mariadb-common_10.2.10+maria~stretch_all.deb ...
      Unpacking mariadb-common (10.2.10+maria~stretch) ...
      Selecting previously unselected package galera-3.
      Preparing to unpack .../03-galera-3_25.3.20-stretch_amd64.deb ...
      Unpacking galera-3 (25.3.20-stretch) ...
      Selecting previously unselected package libdbi-perl.
      Preparing to unpack .../04-libdbi-perl_1.636-1+b1_amd64.deb ...
      Unpacking libdbi-perl (1.636-1+b1) ...
      Selecting previously unselected package libreadline5:amd64.
      Preparing to unpack .../05-libreadline5_5.2+dfsg-3+b1_amd64.deb ...
      Unpacking libreadline5:amd64 (5.2+dfsg-3+b1) ...
      Selecting previously unselected package mariadb-client-core-10.2.
      Preparing to unpack .../06-mariadb-client-core-10.2_10.2.10+maria~stretch_amd64.deb ...
      Unpacking mariadb-client-core-10.2 (10.2.10+maria~stretch) ...
      Selecting previously unselected package mariadb-client-10.2.
      Preparing to unpack .../07-mariadb-client-10.2_10.2.10+maria~stretch_amd64.deb ...
      Unpacking mariadb-client-10.2 (10.2.10+maria~stretch) ...
      Selecting previously unselected package mariadb-server-core-10.2.
      Preparing to unpack .../08-mariadb-server-core-10.2_10.2.10+maria~stretch_amd64.deb ...
      Unpacking mariadb-server-core-10.2 (10.2.10+maria~stretch) ...
      Selecting previously unselected package psmisc.
      Preparing to unpack .../09-psmisc_22.21-2.1+b2_amd64.deb ...
      Unpacking psmisc (22.21-2.1+b2) ...
      Selecting previously unselected package socat.
      Preparing to unpack .../10-socat_1.7.3.1-2+deb9u1_amd64.deb ...
      Unpacking socat (1.7.3.1-2+deb9u1) ...
      Setting up mysql-common (10.2.10+maria~stretch) ...
      Setting up mariadb-common (10.2.10+maria~stretch) ...
      Selecting previously unselected package mariadb-server-10.2.
      (Reading database ... 76669 files and directories currently installed.)
      Preparing to unpack .../00-mariadb-server-10.2_10.2.10+maria~stretch_amd64.deb ...
      locale: Cannot set LC_CTYPE to default locale: No such file or directory
      locale: Cannot set LC_ALL to default locale: No such file or directory
      Unpacking mariadb-server-10.2 (10.2.10+maria~stretch) ...
      Selecting previously unselected package libhtml-tagset-perl.
      Preparing to unpack .../01-libhtml-tagset-perl_3.20-3_all.deb ...
      Unpacking libhtml-tagset-perl (3.20-3) ...
      Selecting previously unselected package liburi-perl.
      Preparing to unpack .../02-liburi-perl_1.71-1_all.deb ...
      Unpacking liburi-perl (1.71-1) ...
      Selecting previously unselected package libhtml-parser-perl.
      Preparing to unpack .../03-libhtml-parser-perl_3.72-3_amd64.deb ...
      Unpacking libhtml-parser-perl (3.72-3) ...
      Selecting previously unselected package libcgi-pm-perl.
      Preparing to unpack .../04-libcgi-pm-perl_4.35-1_all.deb ...
      Unpacking libcgi-pm-perl (4.35-1) ...
      Selecting previously unselected package libfcgi-perl.
      Preparing to unpack .../05-libfcgi-perl_0.78-2_amd64.deb ...
      Unpacking libfcgi-perl (0.78-2) ...
      Selecting previously unselected package libcgi-fast-perl.
      Preparing to unpack .../06-libcgi-fast-perl_1%3a2.12-1_all.deb ...
      Unpacking libcgi-fast-perl (1:2.12-1) ...
      Selecting previously unselected package libmariadb3.
      Preparing to unpack .../07-libmariadb3_10.2.10+maria~stretch_amd64.deb ...
      Unpacking libmariadb3 (10.2.10+maria~stretch) ...
      Selecting previously unselected package libmariadbclient18.
      Preparing to unpack .../08-libmariadbclient18_10.2.10+maria~stretch_amd64.deb ...
      Unpacking libmariadbclient18 (10.2.10+maria~stretch) ...
      Selecting previously unselected package libdbd-mysql-perl.
      Preparing to unpack .../09-libdbd-mysql-perl_4.041-2_amd64.deb ...
      Unpacking libdbd-mysql-perl (4.041-2) ...
      Selecting previously unselected package libencode-locale-perl.
      Preparing to unpack .../10-libencode-locale-perl_1.05-1_all.deb ...
      Unpacking libencode-locale-perl (1.05-1) ...
      Selecting previously unselected package libhtml-template-perl.
      Preparing to unpack .../11-libhtml-template-perl_2.95-2_all.deb ...
      Unpacking libhtml-template-perl (2.95-2) ...
      Selecting previously unselected package libtimedate-perl.
      Preparing to unpack .../12-libtimedate-perl_2.3000-2_all.deb ...
      Unpacking libtimedate-perl (2.3000-2) ...
      Selecting previously unselected package libhttp-date-perl.
      Preparing to unpack .../13-libhttp-date-perl_6.02-1_all.deb ...
      Unpacking libhttp-date-perl (6.02-1) ...
      Selecting previously unselected package libio-html-perl.
      Preparing to unpack .../14-libio-html-perl_1.001-1_all.deb ...
      Unpacking libio-html-perl (1.001-1) ...
      Selecting previously unselected package liblwp-mediatypes-perl.
      Preparing to unpack .../15-liblwp-mediatypes-perl_6.02-1_all.deb ...
      Unpacking liblwp-mediatypes-perl (6.02-1) ...
      Selecting previously unselected package libhttp-message-perl.
      Preparing to unpack .../16-libhttp-message-perl_6.11-1_all.deb ...
      Unpacking libhttp-message-perl (6.11-1) ...
      Selecting previously unselected package libterm-readkey-perl.
      Preparing to unpack .../17-libterm-readkey-perl_2.37-1_amd64.deb ...
      Unpacking libterm-readkey-perl (2.37-1) ...
      Setting up libhtml-tagset-perl (3.20-3) ...
      Setting up libmariadb3 (10.2.10+maria~stretch) ...
      Setting up psmisc (22.21-2.1+b2) ...
      Setting up libencode-locale-perl (1.05-1) ...
      Setting up libtimedate-perl (2.3000-2) ...
      Setting up socat (1.7.3.1-2+deb9u1) ...
      Setting up libio-html-perl (1.001-1) ...
      Setting up libmariadbclient18 (10.2.10+maria~stretch) ...
      Setting up gawk (1:4.1.4+dfsg-1) ...
      Setting up libterm-readkey-perl (2.37-1) ...
      Setting up liblwp-mediatypes-perl (6.02-1) ...
      Processing triggers for libc-bin (2.24-11+deb9u1) ...
      Setting up galera-3 (25.3.20-stretch) ...
      Setting up liburi-perl (1.71-1) ...
      Processing triggers for systemd (232-25+deb9u1) ...
      Setting up libhtml-parser-perl (3.72-3) ...
      Setting up libcgi-pm-perl (4.35-1) ...
      Processing triggers for man-db (2.7.6.1-2) ...
      Setting up libreadline5:amd64 (5.2+dfsg-3+b1) ...
      Setting up mariadb-server-core-10.2 (10.2.10+maria~stretch) ...
      Setting up libfcgi-perl (0.78-2) ...
      Setting up libdbi-perl (1.636-1+b1) ...
      Setting up mariadb-client-core-10.2 (10.2.10+maria~stretch) ...
      Setting up mariadb-client-10.2 (10.2.10+maria~stretch) ...
      Setting up libhttp-date-perl (6.02-1) ...
      Setting up libhtml-template-perl (2.95-2) ...
      Setting up libcgi-fast-perl (1:2.12-1) ...
      Setting up mariadb-server-10.2 (10.2.10+maria~stretch) ...
      locale: Cannot set LC_CTYPE to default locale: No such file or directory
      locale: Cannot set LC_ALL to default locale: No such file or directory
      2017-11-01 18:34:37 140273120094400 [Note] /usr/sbin/mysqld (mysqld 10.2.10-MariaDB-10.2.10+maria~stretch) starting as process 14433 ...
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Uses event mutexes
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Compressed tables use zlib 1.2.8
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Using Linux native AIO
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Number of pools: 1
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Using SSE2 crc32 instructions
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Initializing buffer pool, total size = 256M, instances = 1, chunk size = 128M
      2017-11-01 18:34:37 140273120094400 [Note] InnoDB: Completed initialization of buffer pool
      2017-11-01 18:34:37 140272383637248 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2017-11-01 18:34:37 140273120094400 [ERROR] InnoDB: The Auto-extending innodb_system data file './ibdata1' is of a different size 0 pages than specified in the .cnf file: initial 768 pages, max 0 (relevant if non-zero) pages!
      2017-11-01 18:34:37 140273120094400 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
      2017-11-01 18:34:38 140273120094400 [Note] InnoDB: Starting shutdown...
      2017-11-01 18:34:38 140273120094400 [ERROR] Plugin 'InnoDB' init function returned error.
      2017-11-01 18:34:38 140273120094400 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
      2017-11-01 18:34:38 140273120094400 [Note] Plugin 'FEEDBACK' is disabled.
      ERROR: 1146  Table 'mysql.user' doesn't exist
      2017-11-01 18:34:38 140273120094400 [ERROR] Aborting
       
      /usr/bin/locale: Cannot set LC_CTYPE to default locale: No such file or directory
      /usr/bin/locale: Cannot set LC_ALL to default locale: No such file or directory
      Created symlink /etc/systemd/system/mysql.service → /lib/systemd/system/mariadb.service.
      Created symlink /etc/systemd/system/mysqld.service → /lib/systemd/system/mariadb.service.
      Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /lib/systemd/system/mariadb.service.
      Job for mariadb.service failed because the control process exited with error code.
      See "systemctl status mariadb.service" and "journalctl -xe" for details.
      invoke-rc.d: initscript mysql, action "start" failed.
      ● mariadb.service - MariaDB database server
         Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
        Drop-In: /etc/systemd/system/mariadb.service.d
                 └─migrated-from-my.cnf-settings.conf
         Active: failed (Result: exit-code) since Wed 2017-11-01 18:34:43 CET; 10ms ago
        Process: 14950 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
        Process: 14791 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
        Process: 14787 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
        Process: 14784 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
       Main PID: 14950 (code=exited, status=1/FAILURE)
         Status: "MariaDB server is down"
       
      Nov 01 18:34:43 vps.ovh.net mysqld[14950]: 2017-11-01 18:34:43 140586479507648 [ERROR] Plugin 'InnoDB' init function returned error.
      Nov 01 18:34:43 vps.ovh.net mysqld[14950]: 2017-11-01 18:34:43 140586479507648 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
      Nov 01 18:34:43 vps.ovh.net mysqld[14950]: 2017-11-01 18:34:43 140586479507648 [Note] Plugin 'FEEDBACK' is disabled.
      Nov 01 18:34:43 vps.ovh.net mysqld[14950]: 2017-11-01 18:34:43 140586479507648 [ERROR] Could not open mysql.plugin table. Some plugin…ot loaded
      Nov 01 18:34:43 vps.ovh.net mysqld[14950]: 2017-11-01 18:34:43 140586479507648 [ERROR] Unknown/unsupported storage engine: InnoDB
      Nov 01 18:34:43 vps.ovh.net mysqld[14950]: 2017-11-01 18:34:43 140586479507648 [ERROR] Aborting
      Nov 01 18:34:43 vps.ovh.net systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
      Nov 01 18:34:43 vps.ovh.net systemd[1]: Failed to start MariaDB database server.
      Nov 01 18:34:43 vps.ovh.net systemd[1]: mariadb.service: Unit entered failed state.
      Nov 01 18:34:43 vps.ovh.net systemd[1]: mariadb.service: Failed with result 'exit-code'.
      Hint: Some lines were ellipsized, use -l to show in full.
      dpkg: error processing package mariadb-server-10.2 (--configure):
       subprocess installed post-installation script returned error exit status 1
      Setting up libhttp-message-perl (6.11-1) ...
      Setting up libdbd-mysql-perl (4.041-2) ...
      Processing triggers for libc-bin (2.24-11+deb9u1) ...
      Processing triggers for systemd (232-25+deb9u1) ...
      Errors were encountered while processing:
       mariadb-server-10.2
      E: Sub-process /usr/bin/dpkg returned an error code (1)
      

          Activity

            Samman Mark Samman added a comment -

            Commenting out innodb_flush_method in my.cnf seems to do the trick.

            marko Marko Mäkelä added a comment -

            This regression is related to some InnoDB I/O code refactoring.

            There was some duplicated code for extending InnoDB files, in os_file_set_size() and fil_space_extend_must_retry().

            When wlad was working on fixing MDEV-13941, he decided to clean up the code. Unfortunately, he was unaware of the changes that had been made to the GNU libc function posix_fallocate() over time.

            While the old code had a fallback to write NUL bytes to the file in case posix_fallocate() returned EINVAL, the new code lacked this fallback. Some versions of posix_fallocate() in the GNU libc would fall back to calling pwrite() if the fallocate() system call fails.
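            A minimal Python sketch of the fallback described above, not the actual InnoDB code: try posix_fallocate() first and, if it reports EINVAL (or EOPNOTSUPP), extend the file by writing NUL bytes in a pwrite() loop. The function name extend_file, its block_size default and the injectable allocate parameter are assumptions made for illustration. The sketch does not open the file with O_DIRECT, which would additionally require an aligned buffer.

```python
import errno
import os


def extend_file(fd, new_size, block_size=16384, allocate=os.posix_fallocate):
    """Grow the file behind fd to new_size bytes.

    Hypothetical helper: tries allocate() (posix_fallocate by default) and
    falls back to writing zero-filled blocks if that reports EINVAL or
    EOPNOTSUPP. Returns a string naming the path that was taken.
    """
    current = os.fstat(fd).st_size
    if new_size <= current:
        return "unchanged"

    try:
        allocate(fd, current, new_size - current)
        return "fallocate"
    except OSError as e:
        # Any other error is treated as a hard failure.
        if e.errno not in (errno.EINVAL, errno.EOPNOTSUPP):
            raise

    # Fallback: append NUL bytes block by block until the target size is reached.
    zeros = bytes(block_size)
    offset = current
    while offset < new_size:
        chunk = min(block_size, new_size - offset)
        offset += os.pwrite(fd, zeros[:chunk], offset)
    return "pwrite"
```

            Passing an allocate callable that raises OSError(errno.EINVAL, ...) exercises the pwrite() path without needing an ext3 file system.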

            Apparently, under some circumstances this emulation is disabled. (In the libc-2.24.so on my Debian GNU/Linux unstable system, I see that posix_fallocate() invokes the fallocate syscall, and on the -EOPNOTSUPP return value falls back to internal_fallocate(), which uses ftruncate(), pread() and pwrite().)
            The ext3 file system probably does not support the needed fallocate() functionality due to the file mode, and therefore it likely returned -EINVAL, which would cause posix_fallocate() to return EINVAL.

            With innodb_flush_method=O_DIRECT or innodb_flush_method=O_DIRECT_NO_FSYNC, InnoDB would invoke os_file_set_nocache(), which translates to fcntl(fd, F_SETFL, O_DIRECT). I did not find the exact point where -EINVAL would be returned by the Linux file system code (fs/ext4/extents.c seems to handle ext3 as well nowadays), but I think it is plausible that setting O_DIRECT may cause the fallocate() system call to fail on ext3.

            Also, some versions of the pwrite() fallback code in posix_fallocate() apparently had a bug that would cause the pre-existing part of the file to be overwritten with zeroes (causing data corruption). This one was caught and fixed as MDEV-14132 before the 10.2.10 release.

            marko Marko Mäkelä added a comment -

            I tried to find where the Linux kernel would return -EINVAL for fallocate(), but I could not find it. All I could find is that it should return -EOPNOTSUPP. Neither ext2 nor ext3 implemented fallocate(). It seems that ext4_fallocate() correctly returns -EOPNOTSUPP for ext3 file systems.
            Side note: the separate ext3 driver was removed in Linux 4.3.0, and the ext4 driver is now used for ext3 file systems.
            wlad suggested that the problem might be in the GNU libc implementation of posix_fallocate(), and he seems to be right, as demonstrated by this strace output collected from a Linux 4.9.0 kernel by elenst:

            fallocate(4, 0, 0, 12582912)            = -1 EOPNOTSUPP (Operation not supported)
            fcntl(4, F_GETFL)                       = 0xc002 (flags O_RDWR|O_DIRECT|O_LARGEFILE)
            fstat(4, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0
            fstatfs(4, {f_type=EXT2_SUPER_MAGIC, f_bsize=4096, f_blocks=5126157, f_bfree=4605044, f_bavail=4342985, f_files=1310720, f_ffree=1303159, f_fsid={val=[1470644214, 2964447716]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0
            pwrite64(4, "\0", 1, 4095)              = -1 EINVAL (Invalid argument)
            clock_gettime(CLOCK_REALTIME, {tv_sec=1509631447, tv_nsec=707693773}) = 0
            write(2, "2017-11-02 10:04:07 139962507741"..., 1212017-11-02 10:04:07 139962507741376 [ERROR] InnoDB: preallocating 12582912 bytes for file ./ibdata1 failed with error 22
            ) = 121
            

            The posix_fallocate() code is calling pwrite(fd, "", 1, aligned_offset-1) to extend the file by one page. This will not work in O_DIRECT mode, because both the buffer and the write size must be aligned to the file system block size.

            This explains why we have to handle the EINVAL return value from the GNU posix_fallocate(). The fallback code in InnoDB should be compatible with O_DIRECT, provided that no file system block size is bigger than innodb_page_size.
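            The alignment argument can be checked with a little arithmetic. The following Python sketch is only an illustration: the helper name direct_io_ok is hypothetical, the 4096-byte block size is the f_bsize value from the fstatfs() line in the strace output, and the 16384-byte page size is inferred from the logs (12582912 bytes for 768 pages). It covers only the file offset and length; O_DIRECT also constrains the address of the memory buffer, which is not modelled here.

```python
def direct_io_ok(offset, length, block_size):
    """Return True if offset and length are both multiples of block_size.

    Hypothetical helper: an arithmetic check only. The buffer address
    alignment that O_DIRECT also requires is not covered.
    """
    return length > 0 and offset % block_size == 0 and length % block_size == 0


FS_BLOCK = 4096            # f_bsize reported by fstatfs() in the strace output
PAGE = 12582912 // 768     # 16384: bytes per page, derived from the log messages

# The one-byte write issued by the glibc emulation: pwrite64(4, "\0", 1, 4095)
print(direct_io_ok(4095, 1, FS_BLOCK))     # False

# A page-sized zero-fill write at a page-aligned offset
print(direct_io_ok(0, PAGE, FS_BLOCK))     # True

# The proviso: a block size larger than the page size breaks the fallback
print(direct_io_ok(0, PAGE, 65536))        # False
```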

            marko Marko Mäkelä added a comment -

            There were several issues with the posix_fallocate() call, many of which I fixed as part of MDEV-11520.

            In my refactoring of the code in MariaDB 5.5, 10.0 and 10.1, I made the mistake that an EINVAL return value from posix_fallocate() would be treated as a hard error, and we would not fall back to writing NUL bytes in a pwrite() loop.

            Because the parameter innodb_use_fallocate was OFF by default until 10.2 deprecated the parameter and treated it as if it is always ON, this issue does not affect older MariaDB releases in the same way.

            In 5.5, 10.0 and 10.1, if you ask for innodb_use_fallocate, maybe you really should ensure that the file system supports it. If you fail to do that, and if O_DIRECT is also enabled, you would get the same error as the one reported on 10.2.10.

            I already pushed a fix for MariaDB 10.2.11, and I plan to fix this bug in MariaDB 10.1.29, but not in 5.5 or 10.0.

            marko Marko Mäkelä added a comment -

            Originally, as part of MDEV-4338 (MariaDB 5.5.37, 10.0.11, 10.1.0) the function posix_fallocate() was used incorrectly: only the return value -1 was treated as an error. So, an EINVAL return value would have been ignored, and the file would not have been extended. This incorrect code was duplicated in MDEV-5746.

            MDEV-11520 introduced almost correct error handling for posix_fallocate(): the only thing that was missing was the fallback for the EINVAL return value.
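            The return-value pitfall can be illustrated by calling the C function directly. This is a sketch under assumptions, not the MariaDB code: it assumes 64-bit Linux where ctypes.CDLL(None) exposes the C library and off_t is 64 bits wide, and the two check functions are hypothetical names. posix_fallocate() returns the error number itself rather than -1, so a test against -1 never notices a failure.

```python
import ctypes
import errno

# Assumption: 64-bit Linux, where the C library is reachable via CDLL(None)
# and off_t is a 64-bit integer.
libc = ctypes.CDLL(None)
libc.posix_fallocate.argtypes = [ctypes.c_int, ctypes.c_longlong, ctypes.c_longlong]
libc.posix_fallocate.restype = ctypes.c_int


def succeeded_buggy(rc):
    # The incorrect pattern: only -1 is treated as an error.
    return rc != -1


def succeeded_correct(rc):
    # posix_fallocate() returns 0 on success and an error number otherwise.
    return rc == 0


# Use an invalid file descriptor to provoke a failure without touching any file.
rc = libc.posix_fallocate(-1, 0, 4096)
print(errno.errorcode.get(rc, rc), succeeded_buggy(rc), succeeded_correct(rc))
```

            With the -1 check, the failed call is reported as a success, which matches the behaviour described above: the error was ignored and the file was not extended.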


            People

              marko Marko Mäkelä
              Samman Mark Samman