[MDEV-4033] Unable to use slave's temporary directory /tmp - Can't create/write to file '/tmp/SQL_LOAD-' (Errcode: 17 "File exists") Created: 2013-01-14 Updated: 2013-04-14 Resolved: 2013-04-14 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.0.0 |
| Fix Version/s: | 10.0.2 |
| Type: | Bug | Priority: | Major |
| Reporter: | Gordan Bobic | Assignee: | Michael Widenius |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Linux (RHEL6) |
||
| Description |
|
The original, unmodified description of the problem can be found at the bottom of this field.
On SQL thread start, a slave checks permissions for slave_load_tmpdir by creating (and immediately deleting) a dummy file there: sql/slave.cc:
The resulting file is <slave_load_tmpdir>/SQL_LOAD-. If the file already exists, the slave fails to start. The draft test case below shows how the slave fails to start when the file exists; it fails on any version.
=================================
Original description:
Multi-source replication seems to randomly stop with
I only see it occasionally, on a huge replication volume (100s of GB/day). Is the master name perhaps missing off the end of that file name? |
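The check described above can be illustrated with a minimal sketch. It assumes the dummy file is created exclusively, which is what Errcode 17 ("File exists") suggests; `probe_tmpdir` is a hypothetical helper written for this note, not the code in sql/slave.cc.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not the MariaDB source): probe a directory by
// exclusively creating "<dir>/SQL_LOAD-" and deleting it right away.
// Returns 0 on success, otherwise the errno of the failed open().
int probe_tmpdir(const std::string &dir)
{
  assert(!dir.empty());
  const std::string path = dir + "/SQL_LOAD-";
  // O_EXCL makes open() fail with EEXIST (17 on Linux) if the file exists.
  int fd = open(path.c_str(), O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);
  if (fd < 0)
    return errno;
  close(fd);
  unlink(path.c_str());  // remove the dummy file immediately
  return 0;
}
```

Because the file name carries no per-connection suffix, two SQL threads that run such a probe at the same moment could collide: the second open() would fail with EEXIST in the short window before the first thread unlinks the file.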
| Comments |
| Comment by Gordan Bobic [ 2013-01-14 ] |
|
Looking at some of the other, older error reports on the internet, it seems the file name is missing a suffix of some sort, and it looks like the multiple replication slave threads are trying to write to the same suffixless file. We are using row-based (rather than statement-based or mixed) replication. |
| Comment by Elena Stepanova [ 2013-01-16 ] |
|
Hi Gordan, I am able to reproduce the failure, but the curious thing is that, as far as I can see, it should only be happening if your multiple slave threads stop and start during your workload (particularly, SQL slave threads). Can it be so? Is there any indication of SQL threads restart in the error log? I think if there is, and if it's not intentional, it also deserves some attention. |
| Comment by Gordan Bobic [ 2013-01-16 ] |
|
Not 100% sure that something didn't happen here that explicitly stopped the threads. I will keep an eye on it if it happens again. |
| Comment by Elena Stepanova [ 2013-01-16 ] |
|
Okay, then for now I assume that they indeed restarted, and update the description accordingly. |
| Comment by Kristian Nielsen [ 2013-04-14 ] |
|
I was able to easily repeat this problem in e.g. test multi_source.info_logs by
=== modified file 'sql/slave.cc'
DBUG_RETURN(0); |
| Comment by Michael Widenius [ 2013-04-14 ] |
|
I have fixed this by caching the result of check_temp_dir() |
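One way to picture the caching approach is the sketch below. It is only an illustration under stated assumptions: `TmpdirCheckCache` and the `probe` callback are hypothetical names, and the choice to cache only successful checks is made for this sketch, not taken from the actual patch.

```cpp
#include <functional>
#include <mutex>
#include <string>

// Hypothetical cache (not the MariaDB patch): remembers the last directory
// that passed the check so that later SQL thread starts skip the probe.
class TmpdirCheckCache
{
public:
  // `probe` returns 0 on success or an error code on failure.
  int check(const std::string &dir,
            const std::function<int(const std::string &)> &probe)
  {
    // The lock also serializes probes, so two threads never probe at once.
    std::lock_guard<std::mutex> guard(lock_);
    if (checked_ok_ && dir == last_dir_)
      return 0;  // already verified, no dummy file is created
    int rc = probe(dir);
    checked_ok_ = (rc == 0);
    if (checked_ok_)
      last_dir_ = dir;
    return rc;  // failures are not cached and will be retried
  }

private:
  std::mutex lock_;
  bool checked_ok_ = false;
  std::string last_dir_;
};
```

With a cache like this, only the first SQL thread to start would touch the dummy file; subsequent starts for the same directory would return the remembered result.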
| Comment by Michael Widenius [ 2013-04-14 ] |
|
Pushed into 10.0-base |