We first found this bug in MySQL 5.6: when a table has too many partitions, the slave lags noticeably behind the master. The root cause is that handler::external_lock() is called far more often on the slave node than on the master node, because the two nodes follow different code paths.
For example, we create a partitioned table and delete one row to compare the code paths.
delete from employees where id = 1;
The master call stack is:
The slave call stack is:
The function lock_tables calls ha_partition::store_lock and ha_partition::external_lock, and these two functions in turn call store_lock and external_lock on each partition's handler. The code shows that the master node calls lock_tables after prune_partitions, so only the handlers of the partitions that survive pruning are called. The slave node, however, opens and locks the tables in one function, without pruning partitions first, so every partition's handler is called. With a large number of partitions, this becomes a performance penalty.
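To make the difference concrete, here is a minimal standalone C++ sketch that only counts lock calls under the two orderings. It is not MySQL or MariaDB code: PartitionHandler, PartitionedTable, prune, all_partitions and the two locks_when_* functions are hypothetical names made up for illustration, and the "id modulo partition count" pruning rule is an assumption, not the server's real pruning logic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one partition's storage-engine handler.
struct PartitionHandler {
    int external_lock_calls = 0;
    void external_lock() { ++external_lock_calls; }
};

// Hypothetical stand-in for a partitioned table (NOT the real ha_partition).
struct PartitionedTable {
    std::vector<PartitionHandler> parts;
    explicit PartitionedTable(std::size_t n) : parts(n) { assert(n > 0); }

    // Lock only the partitions whose indexes are listed in `used`.
    void lock(const std::vector<std::size_t>& used) {
        for (std::size_t idx : used) {
            assert(idx < parts.size());
            parts[idx].external_lock();
        }
    }

    // Sum of external_lock() calls over all partition handlers.
    int total_external_lock_calls() const {
        int total = 0;
        for (const PartitionHandler& p : parts) total += p.external_lock_calls;
        return total;
    }
};

// Toy pruning rule (assumption): the row with `id` lives in partition id % n.
std::vector<std::size_t> prune(const PartitionedTable& t, std::size_t id) {
    return { id % t.parts.size() };
}

// Index list covering every partition, i.e. no pruning applied.
std::vector<std::size_t> all_partitions(const PartitionedTable& t) {
    std::vector<std::size_t> all(t.parts.size());
    for (std::size_t i = 0; i < all.size(); ++i) all[i] = i;
    return all;
}

// Master-like order: prune first, then lock only what is left.
int locks_when_pruned_first(std::size_t n_parts, std::size_t id) {
    PartitionedTable t(n_parts);
    t.lock(prune(t, id));
    return t.total_external_lock_calls();
}

// Slave-like order: lock every partition at open time; pruning afterwards
// cannot undo the calls that were already made.
int locks_when_locked_first(std::size_t n_parts, std::size_t id) {
    PartitionedTable t(n_parts);
    t.lock(all_partitions(t));
    (void)prune(t, id);
    return t.total_external_lock_calls();
}
```

In this toy model, deleting one row from a table with 1000 partitions costs one lock call in the pruned-first order and 1000 in the locked-first order, which is the per-statement overhead we believe the slave is paying.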
In MariaDB, both the master and the slave code paths behave like the MySQL slave path: tables are opened and locked together, without pruning partitions first.
We have a demo fix that delays the calls to the handler's external_lock and store_lock until after the partitions have been pruned. The diff, based on MySQL 5.6, is in the attachment. We cannot be sure that it does not cause other issues.
As for the master code path, I think MariaDB could follow the approach used in MySQL 5.6.