OK, I measured some more, with and without taskset. One can see what may appear to be a very slight regression if taskset is not used, specifically in the threadpool case. But this is a phantom regression. As mentioned elsewhere (e.g. in the threadpool documentation, in the section on how to run benchmarks), the benchmark driver seems to take a bigger share of the overall CPU. Concretely, in 10.1 without pinning, you can get a situation where sysbench-0.5 is using 10 CPUs out of 32 while mysqld is using 22, as shown by "top". Idle time is 0%: all 32 CPUs are busy. However, mysqld can do more if affinitized (it can use up to 24 CPUs, which results in better throughput, but then sysbench needs to be restricted). In all of my affinitized tests, threadpool outperforms thread-per-connection (the latter affinitized or not). In all of my tests overall, threadpool continues to scale above 1024 concurrent selects.
Either I am doing something wrong on my end, or the benchmarks were not run properly: the same hardware can do better, and threadpool can outperform thread-per-connection in all aspects, including raw throughput, if the benchmark is run using taskset, as mentioned in the threadpool documentation.
taskset really makes a visible difference.
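To make the pinning idea concrete, here is a minimal sketch (not the exact commands I ran; those are in the linked comment) of how a 32-CPU box could be split between the server and the benchmark driver. The 24/8 split, the helper names cpu_range and pinned_commands, and the bare "mysqld" / "sysbench" command lines are illustrative assumptions; only `taskset -c <cpu-list> <command>` is the real CLI form.

```python
import shlex


def cpu_range(first, count):
    """Return a taskset-style CPU list, e.g. '0-23' for 24 CPUs starting at CPU 0."""
    if count < 1:
        raise ValueError("count must be at least 1")
    last = first + count - 1
    return str(first) if count == 1 else f"{first}-{last}"


def pinned_commands(total_cpus, server_cpus, server_cmd, driver_cmd):
    """Build two taskset-prefixed command lines on disjoint CPU sets.

    The server gets CPUs 0..server_cpus-1 and the benchmark driver gets the
    remaining ones, so the driver cannot take CPU time away from the server.
    (Hypothetical helper; the split is an assumption, not a recommendation.)
    """
    if not 0 < server_cpus < total_cpus:
        raise ValueError("server_cpus must leave at least one CPU for the driver")
    server_set = cpu_range(0, server_cpus)
    driver_set = cpu_range(server_cpus, total_cpus - server_cpus)
    server = ["taskset", "-c", server_set] + list(server_cmd)
    driver = ["taskset", "-c", driver_set] + list(driver_cmd)
    return server, driver


if __name__ == "__main__":
    # Placeholder command lines; real runs need the full mysqld/sysbench options.
    server, driver = pinned_commands(32, 24, ["mysqld"], ["sysbench"])
    print(shlex.join(server))
    print(shlex.join(driver))
```

With disjoint CPU sets like these, the driver can no longer crowd mysqld off CPUs, which is the effect described above when nothing is pinned.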
I shared my results in
https://docs.google.com/spreadsheets/d/12KPobxrP89BzrevPaCoGxGUPnI4kuLWRtTLjTfPJw78/edit#gid=0
axel, I'm reassigning this back. Could you please confirm my findings (in which case I think the MDEV can be closed), or tell me whether I am doing something wrong?
I shared details of how I ran the benchmarks, including sysbench and mysqld parameters (including the taskset params), in this comment
https://jira.mariadb.org/browse/MDEV-10064?focusedCommentId=84510&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-84510