[MDEV-21229] SIGABRT on most simple commands when "wsrep_on=1" AND eating up *all* available memory Created: 2019-12-05 Updated: 2019-12-09 Resolved: 2019-12-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Replication, Server |
| Affects Version/s: | 10.4.10 |
| Fix Version/s: | 10.4.11 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Michal Schorm | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | crash, replication | ||
| Environment: |
Fedora 31; latest updates & kernel applied |
||
| Attachments: |
|
| Description |
|
Hello,
TL;DR: Reproducible every time on Fedora 31 with the MariaDB 10.4.10 and Galera 26.4.3 Fedora packages. Neither disabling the firewall nor disabling SELinux helps.
Installed packages:
So basically the server, client, server-galera and galera.
Configuration:
We don't need any more machines in the cluster. The issue is reproducible on a single machine started with "galera_new_cluster". The issue is not reproducible when the MariaDB packages are built in debug mode without optimization (-O0).
The issue is reproducible on every run, no matter how many times the server was restarted before or whether it previously ran with a different configuration. However, I start the server with:
So every run starts from a clean setup. There is no data other than what the server creates during its first run.
After the server has started, I can attach to it with e.g. gdb. The last breakpoint I was able to find is "sql_parse.cc:5061". An unknown number of instructions later, the server receives SIGABRT. As part of the SIGABRT handling, the server tries to get a stack trace. The server has 2 GB of RAM; <100 MB is used when the DB is not running and ~500 MB when it is running, leaving ~1.4 GB free. The last safe breakpoint I managed to find before that is "stacktrace.c:273".
Let me know which additional information you would consider helpful and I'll try to get it to you. |
| Comments |
| Comment by Michal Schorm [ 2019-12-05 ] |
|
The issue doesn't seem to affect packages you released. |
| Comment by Michal Schorm [ 2019-12-05 ] |
|
The issue does not seem to be present when I build the packages from the latest git source (branch 10.4: aab6cefe8). |
| Comment by Daniel Bartholomew [ 2019-12-05 ] |
|
There was a build issue we discovered after the release where some of the Fedora 31 packages were not getting built. We discovered it before announcing the release, so we never announced support for Fedora 31 in the release notes for 10.4.10 or 10.3.20. However, the partial set of packages for Fedora 31 was mistakenly uploaded to the mirrors. I've now removed them from the primary mirror, and the rest of the mirrors will update when they next pull from the primary mirror. My understanding is that the build issue has now been resolved, at least judging by the most recent builds in buildbot. For example, this log from the most recent 10.4 build shows our basic install test succeeding on our fedora-31 builder: http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-fedora31-amd64/builds/158/steps/install/logs/stdio So for the next releases of 10.4 and 10.3 we will have a working set of Fedora 31 packages. |
| Comment by Michal Schorm [ 2019-12-06 ] |
|
I tested it further and it looks like it really is resolved. IMHO you can mark it as solved in 10.4.11. |
| Comment by Teemu Ollakka [ 2019-12-06 ] |
|
We found the actual issue: a `std::vector` usage in wsrep-lib that triggers an assertion in the standard library when _GLIBCXX_ASSERTIONS is defined. The fix has been merged to wsrep-lib master, and I opened a PR against MariaDB 10.4 to update wsrep-lib: https://github.com/MariaDB/server/pull/1423. |
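For readers unfamiliar with this class of failure, here is a minimal sketch of how a `std::vector` access can abort under _GLIBCXX_ASSERTIONS. The comment above does not say which access in wsrep-lib was at fault, so this is only an assumed, typical pattern: calling `operator[]` with an out-of-range index, for example `&buf[0]` on an empty vector. The helper names `buffer_start`, `byte_or` and `self_check` are hypothetical and are not taken from wsrep-lib.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; not code from wsrep-lib.

// Returns a pointer to the start of the buffer's storage.
// Writing `&buf[0]` here would call operator[] with index 0, which is
// out of range when buf is empty; libstdc++ aborts on that call when
// _GLIBCXX_ASSERTIONS is defined. data() has no such precondition.
inline const char* buffer_start(const std::vector<char>& buf)
{
    return buf.data();
}

// Returns the byte at `pos`, or `fallback` when `pos` is out of range,
// so operator[] is only ever called with a valid index.
inline char byte_or(const std::vector<char>& buf, std::size_t pos,
                    char fallback)
{
    return pos < buf.size() ? buf[pos] : fallback;
}

// Exercises both helpers on an empty and a non-empty buffer.
inline void self_check()
{
    const std::vector<char> empty;
    (void)buffer_start(empty);            // safe even though empty
    assert(byte_or(empty, 0, '?') == '?');

    const std::vector<char> data{'o', 'k'};
    assert(*buffer_start(data) == 'o');
    assert(byte_or(data, 1, '?') == 'k');
}
```

Compiling with `-D_GLIBCXX_ASSERTIONS` and swapping `buf.data()` for `&buf[0]` should make the empty-buffer call abort with SIGABRT, which matches the symptom reported in this ticket.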
| Comment by Michal Schorm [ 2019-12-09 ] |
|
Just FYI: I found that the commit I was testing did NOT solve the issue (branch 10.4: aab6cefe8). However, I can confirm that the commit mentioned by Teemu Ollakka DOES solve the issue (9a621200899). Thanks for fixing. |