[MDEV-27634] innodb_zip tests failing on s390x Created: 2022-01-26 Updated: 2023-08-22 Resolved: 2022-02-16 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Tests |
| Affects Version/s: | 10.5.13, 10.6.5 |
| Fix Version/s: | 10.9.0, 10.2.44, 10.3.35, 10.4.25, 10.5.16, 10.6.8, 10.7.4, 10.8.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Danilo Spinella | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
SUSE Linux Enterprise Server 15 SP 3, |
||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Description |
|
innodb_zip.* and innodb.innodb-16k tests are failing on s390x arch with the following error message:
I am attaching the full log of the tests. |
| Comments |
| Comment by Edward Stoever [ 2022-02-07 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Support Case CS0378173 connected to this ticket. Attached log "mariadb-s390x-tests.log" contains 36 cases of ER_TOO_BIG_ROWSIZE and 1 case of ER_LOCK_WAIT_TIMEOUT. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Edward Stoever [ 2022-02-10 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Added files from CS0378173. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I see that mariadb-s390x-build.txt Some tests of ROW_FORMAT=COMPRESSED depend on a particular zlib version. I remember that a change of the definition of the function compressBound() caused some trouble some 10 to 15 years ago. A quick check in the zlib changelog on my AMD64 system suggests that versions 1.2.8 and 1.2.11 could work. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Danilo Spinella [ 2022-02-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The version used is the latest (1.2.11) but with this patch applied. It provides hardware-accelerated on s390x, and I think it could be the root cause of this issue. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I see that the patch indeed does redefine the compressBound() function. It could very well explain those test failures. At least some tests are trying to exercise some internal limits, such as triggering a compressed page overflow. If you want those tests to pass, they should be adjusted for this limit somehow. The following is me thinking aloud. MariaDB does provide WITH_ZLIB=bundled. The MariaDB Foundation CI does include some S390x systems. Why not apply that patch there? Related to that: In Also, unrelated to this, but related to s390x, maybe you would want to implement | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Danilo Spinella [ 2022-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The following is me thinking aloud. MariaDB does provide WITH_ZLIB=bundled. The MariaDB Foundation CI does include some S390x systems. Why not apply that patch there? Yea, that would definitely be a good idea. The best solution would be to zlib to merge this patch, but as far as I remember, they are not accepting any Pull Request at the moment. I will try running the tests without this patch just to confirm that it's the root cause. Regarding | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Danilo Spinella [ 2022-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Just checked and without the patch the tests succeed. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
danyspin97, thank you. For what it is worth, before we did our ‘homebrew’ my_checksum() for the ISO 3309 CRC-32 polynomial, I was advocating the idea of simply getting an optimized implementation via zlib. After some research, it looked like many performance enhancement patches had been waiting to be merged for a long time, so rolling our own version was the most practical option. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I see that we already have a builder where tests fail in a similar way: https://buildbot.mariadb.org/#/builders/328/builds/1495/steps/6/logs/stdio So, this could be fixed by adjusting the tests in some way. I may need interactive access to that builder to be able to do that quickly, or I might try to patch the compressBound() function in a similar way on my local AMD64 based system. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Danilo Spinella [ 2022-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I see that we already have a builder where tests fail in a similar way: This is interesting. Yea, it would be great if the tests could be patched. Is it safe to skip them in the meanwhile? Or is there some issue when mariadb uses the patched zlib? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-16 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I think that the failures can be safely ignored meanwhile. My design of the InnoDB ROW_FORMAT=COMPRESSED is conservative. It is assumed that any compression operation may fail (due to the lack of space). The only potentially invalid (but in my opinion, reasonable) assumption is that any zlib version can decompress what any other zlib version compressed. There was a parameter innodb_log_compressed_pages that was deprecated and ignored in For some time, I wanted to get rid of this format (see | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-16 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I applied the following patch to simulate the patched s390x compressBound() in any cmake -DWITH_ZLIB=bundled build:
In the server error log I see the following:
In most failed tests, a CREATE TABLE statement fails with ER_TOO_BIG_ROWSIZE (1118). In innodb_zip.wl6347_comp_indx_stat, the observed data file sizes are larger. For the test innodb_zip.bug36169, a debug assertion (which is only enabled in debug builds) fails:
The full list of test failures is as follows:
I will try to tweak all these tests so that they will pass with the normal compressBound() as well the weaker compressBound() guarantee that is associated with the s390x DFLTCC instruction. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-16 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The following version of the function generates the same AMD64 code for me:
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-02-16 ] | ||||||||||||||||||||||||||||||||||||||||||
|
danyspin97, for some time, the results from my test push are available at https://buildbot.mariadb.org/#/grid?branch=st-10.5-MDEV-27634 s390x-rhel-8 archive.archive s390x-sles-15 archive.archive s390x-ubuntu-2004 main.column_compression AnalysisThe builders s390x-rhel-8-rpm-autobake, s390x-sles-15-rpm-autobake had not completed as of this writing. I would assume that they will fare in a similar way to the non-autobake builds. The test perfschema.show_aggregate is failing very often on various platforms, including AMD64. The other failures look like they may be directly related to the zlib version. As you can see, the Docker container that pretends to be Ubuntu 20.04 had fewer failing tests; perhaps they are not applying that s390x DFLTCC patch? The failure of the test main.column_compression might be explained by the bug None of the remaining tests should be related to InnoDB and this ticket. I plan to push the fix to the 10.2 branch and merge from there to later-version branches. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Danilo Spinella [ 2022-02-16 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Thank you @Marko Mäkelä for the help and the fix. Regarding the other tests, I also think that they are not related to s390x DFLTCC patch as they are currently succeeding for us. The main.func_math is currently failing in some configurations and could be related to | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-22 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The test case innodb.row_size_error_log_warnings_3 that was added along with the fix of |