[MDEV-6155] Upgraded servers from 5.5 to 10 and replication lag keeps growing Created: 2014-04-23 Updated: 2014-05-26 Due: 2014-05-23 Resolved: 2014-05-26 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.0.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Antonio Fernandes | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | replication | ||
| Environment: |
CentOS x86_64 (master) and CentOS i386 (slave) |
||
| Description |
|
I've used MariaDB 5.5 for months with replication without a hiccup. As the upgrade plan says, I upgraded the slave to 10 and then the master to 10 as well, and from that moment I've been seeing constantly increasing slave lag... iotop shows
and jbd2 completely clogs IO... which didn't happen before... Checking the process list and refreshing it constantly, I keep seeing Table lock in the COMMIT phase...
I've changed from InnoDB to MyISAM (in case it was a storage engine bottleneck) but nothing changed... Any clues on what it might be (besides the change from 5.5 to 10)? Thank you, |
| Comments |
| Comment by Elena Stepanova [ 2014-04-23 ] | ||||||||||||||
|
Hi, When you said
did you mean that you converted all your existing tables from InnoDB to MyISAM (ran ALTER TABLE .. ENGINE=MyISAM), and still get the same IO problem? Also, did you try to stop mysqld and check that when it's not running, jbd2 doesn't clog the IO? I agree it would be a questionable coincidence that it started doing so exactly when you upgraded from 5.5 to 10.0, but coincidences do happen, so it would be good to rule it out. | ||||||||||||||
| Comment by Antonio Fernandes [ 2014-04-23 ] | ||||||||||||||
|
Hi Elena, I converted only some tables. With innotop I saw that some specific tables were constantly locked, and I changed only those (about 3 or 4). As for stopping mysqld, the minute I execute STOP SLAVE, the IO goes back to "normal" (and so does jbd2). I also tried remounting the filesystem (ext4) with noatime, and that didn't visibly change the replication performance. I can now add some more information based on continued digging:
I have a bunch of scripts that import data from flat files into MariaDB. They do it line by line with autocommit ON (the default). I've changed those scripts to use transactions over subsets of lines, and the loading is way faster... I suspect that line-by-line inserts (with a commit per line as well) perform worse than they did in 5.5. Since the slave has older/worse hardware, it lags... while the master keeps eating data as normal... Best regards, | ||||||||||||||
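The change described above (one commit per subset of lines instead of one commit per line) can be sketched roughly as follows. This is only an illustration, not the reporter's actual scripts, which are not shown in the ticket. It assumes Python and uses the standard-library `sqlite3` module as a stand-in so that it runs without a server; the table name `import_data`, the function `load_in_batches`, and the batch size are all hypothetical. With a MariaDB driver the same pattern would apply with autocommit disabled and that driver's own placeholder style.

```python
import sqlite3
from itertools import islice


def load_in_batches(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch instead of once per row.

    Returns the number of commits issued.
    """
    cur = conn.cursor()
    it = iter(rows)
    commits = 0
    while True:
        # Take the next subset of lines; stop when the input is exhausted.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        cur.executemany(
            "INSERT INTO import_data (id, payload) VALUES (?, ?)", batch
        )
        conn.commit()  # one commit (and one log flush) per batch
        commits += 1
    return commits


# Stand-in database and hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE import_data (id INTEGER PRIMARY KEY, payload TEXT)")
conn.commit()

rows = ((i, f"line {i}") for i in range(2500))
commits = load_in_batches(conn, rows, batch_size=1000)
print(commits)  # 3 commits instead of 2500
```

The batch size trades commit overhead against the amount of work lost if a batch fails, so it would need tuning for the real data and hardware.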
| Comment by Elena Stepanova [ 2014-04-23 ] | ||||||||||||||
Indeed, there were many changes in InnoDB which dealt with IO between 5.5 and 5.6/10.0, and since the transactional approach is in favor, the single-statement transactions might require more tuning. What's weird though is that you observed it on MyISAM tables – it shouldn't matter for them whether you do inserts as big transactions or single statements. Maybe it so happened that you converted different tables though, not those which were inserted into. Given that we are still talking about InnoDB, you could also consider adjusting innodb_flush_log_at_trx_commit ( https://mariadb.com/kb/en/xtradbinnodb-server-system-variables/#innodb_flush_log_at_trx_commit ). The default value 1 is the safest, but since you are tuning a slave, you might be able to afford using a different value. Although, if you have already switched to big transactions, it probably won't make much difference. There are other ways to tune InnoDB-related IO, see http://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb-diskio.html . | ||||||||||||||
| Comment by Antonio Fernandes [ 2014-04-23 ] | ||||||||||||||
|
Hi, Changing innodb_flush_log_at_trx_commit to 0 did in fact reduce IO (but it can introduce inconsistency between slave and master in the event of a crash):
Now it's CPU bound... but that's OK since it's catching up. As for InnoDB vs MyISAM, I'll continue my tests and post them here. Best regards, | ||||||||||||||
| Comment by Antonio Fernandes [ 2014-04-28 ] | ||||||||||||||
|
Hi, I'm starting to monitor the 2 servers with percona-cacti-templates (never had the need BTW is there any procedure to run a MyISAM RW benchmark between 5.5 and 10? Best regards, | ||||||||||||||
| Comment by Elena Stepanova [ 2014-04-28 ] | ||||||||||||||
|
I'm not sure what you mean by a procedure. | ||||||||||||||
| Comment by Antonio Fernandes [ 2014-05-15 ] | ||||||||||||||
|
Hello Elena, I couldn't find time to really test this thoroughly. Do you know if anyone has measured single-threaded table lock/unlock performance for MyISAM between 5.5 and 10? If not, you can close the ticket, because I did work around the problem (by not replicating 6.2 million MyISAM records that are updated every night). Regards, | ||||||||||||||
| Comment by Elena Stepanova [ 2014-05-26 ] | ||||||||||||||
|
I haven't heard of this specific benchmark being done. As discussed above, closing for now. If you have more information that needs to be looked at, please comment to re-open the report. |