  1. MariaDB Server
  2. MDEV-28580

Server crash when creating an index after adding a foreign key

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Incomplete
    • 10.7.3
    • N/A
    • None
    • docker desktop on windows

    Description

      On a newly created database, executing the following SQL statements causes a server crash.

      create table player_profiles (
        id                            varchar(40) not null,
        name                          varchar(255) not null,
        tokens                        integer default 0 not null,
        constraint pk_player_profiles primary key (id)
      );
       
      create table player_report (
        id                            varchar(40) not null,
        sender_player_id              varchar(40) not null,
        target_player_id              varchar(40) not null,
        created_at                    datetime(6) not null,
        constraint pk_player_report primary key (id)
      );
       
      create index ix_player_report_sender_player_id on player_report (sender_player_id);
       
      -- the next line adds a foreign key, removing this line will prevent a server crash
      alter table player_report add constraint fk_player_report_sender_player_id foreign key (sender_player_id) references player_profiles (id) on delete restrict on update restrict;
       
      -- server crashes on this query
      create index ix_player_report_target_player_id on player_report (target_player_id);
      

      The crash occurs when adding a foreign key, followed by the creation of an index. This happens on the same table but on different columns.

      The docker image was freshly pulled from Docker Hub and run on a Windows 10 computer.

      Attachments

        1. docker-compose.yml
          0.4 kB
          Joris Guffens
        2. Logfile3.PML
          9.42 MB
          Joris Guffens
        3. mariadb.strace
          19 kB
          Joris Guffens
        4. mariadb-1.strace
          2.01 MB
          Joris Guffens
        5. mariadb-crash-1.log
          11 kB
          Joris Guffens

          Activity

            danblack Daniel Black added a comment - - edited

            I think WSL is providing the kernel interface. The glibc interface within the container is the stock ubuntu-20.04 glibc provided by the base container image of mariadb:10.7.

            In mariadb-1.strace, the pwrite-based fallback after fallocate fails seems to be working.

            Relevant lines:

            659: 68    openat(AT_FDCWD, "./testdb/#sql-alter-1-9a.ibd", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0660) = 50
            ..
            706: 68    pwrite64(50, "\0", 1, 114687)     = 1
            707: 68    fdatasync(50)                     = 0
            708: 68    fdatasync(50)                     = 0
            ..
            (no closing of fd 50)
            ...
            740: 68    stat("./testdb/#sql-alter-1-9a.ibd",  <unfinished ...>
            744: 68    <... stat resumed>{st_mode=S_IFREG|0660, st_size=114688, ...}) = 0
             
            830: 68    rename("./testdb/#sql-alter-1-9a.ibd", "./testdb/player_report.ibd") = 0
            ...
             
            968: 68    recvfrom(41, "\3-- server crashes on this que....
            ..
            1002: 68    fstat(50, 0x7fd0144c9130)         = -1 ENOENT (No such file or directory)
            1003: 68    write(2, "2022-05-19 10:51:51 154 [ERROR] InnoDB: preallocating 131072 bytes for file ./testdb/player_report."..., 123) = 123
            1004: 68    fstat(50, 0x7fd0144c92e0)         = -1 ENOENT (No such file or directory)
            1005: 68    write(2, "2022-05-19 10:51:51 0x7fd0144ce700", 34) = 34
            1006: 68    write(2, "  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.7.3/storage/innobase/f"..., 122) = 122
            

            In the strace there is no closing of file descriptor 50.

            So it looks like fstat on an open file descriptor is losing track of the file because the file was renamed earlier.
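
            The syscall sequence from the strace can be approximated outside the server. The following is a minimal sketch, not taken from the original report: the function name and both file names are hypothetical, and it only mirrors the create, pwrite, fdatasync, rename, fstat order seen above. On a filesystem that behaves as expected, fstat on the still-open descriptor should keep working after the rename, so the function should return True; an ENOENT here would match the failure in the trace.

```python
import os
import tempfile


def fstat_survives_rename(directory: str) -> bool:
    """Return True if fstat() on an open fd still works after rename().

    Mirrors the order of calls seen in the strace; the file names below
    are hypothetical stand-ins for the InnoDB temporary and final files.
    """
    old_path = os.path.join(directory, "#sql-alter-demo.ibd")     # hypothetical
    new_path = os.path.join(directory, "player_report_demo.ibd")  # hypothetical

    # fdatasync is what the trace shows; fall back to fsync where it is missing.
    sync = getattr(os, "fdatasync", os.fsync)

    fd = os.open(old_path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o660)
    try:
        # Extend the file by writing one zero byte at the last offset,
        # like the pwrite64(50, "\0", 1, 114687) call in the trace.
        os.pwrite(fd, b"\0", 114687)
        sync(fd)

        # Rename while the descriptor stays open (no close in between).
        os.rename(old_path, new_path)

        try:
            st = os.fstat(fd)
        except FileNotFoundError:
            # This is the ENOENT symptom from the strace.
            return False
        return st.st_size == 114688
    finally:
        os.close(fd)
        for path in (old_path, new_path):
            if os.path.exists(path):
                os.remove(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("fstat after rename ok:", fstat_survives_rename(tmp))
```

            To be meaningful for this issue, the directory passed in would need to sit on the same kind of storage as the MariaDB data directory (for example a Docker Desktop volume under WSL 2), which is an assumption about the reporter's setup.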

            jorisguffens, could you provide the WSL and kernel information requested in the bug report template?

            I haven't looked through the PML file yet. In my quick look so far I can't find an existing WSL bug for this. I'm happy to write a WSL bug using the provided information, and I'll search again first just to be sure.

            jorisguffens Joris Guffens added a comment -

            The information you requested:
            Microsoft Windows [Version 10.0.19044.1706]
            WSL 2
            Kernel version: 5.10.102.1
            Docker Desktop (Windows), version 4.8.2

            While gathering this information I noticed my WSL kernel was quite a few versions behind (5.4.72). I updated it and was still able to reproduce the issue with the latest kernel.

            danblack Daniel Black added a comment - - edited

            WSL bug 8443 lodged.

            danblack Daniel Black added a comment -

            FYI, I've implemented an attempted workaround (based on 10.3). It is available as quay.io/mariadb-foundation/mariadb-devel:10.3-mdev-29015-avoid-wsl8443. If you are able to test whether it still crashes in a Windows WSL environment, that would be appreciated.

            If it fails for some reason, can you run strace on the test case as in the instructions above?

            danblack Daniel Black added a comment -

            I retested on Windows 10 - 19044.1826 and was unable to reproduce this.

            I also tested on Windows 11 22000.795 and couldn't reproduce it there either.

            Can you please retest to see whether your Windows updates resolve this?


            People

              danblack Daniel Black
              jorisguffens Joris Guffens