Query cache lock contention is widely known in the community issue, e.g. https://dba.stackexchange.com/questions/136790/when-to-disable-mysql-query-cache/136814#136814 . MariaDB KB also blames on it https://mariadb.com/kb/en/library/mariadb-memory-allocation/#mutex-bottleneck .
I easily reproduced the issue with the simple program from the attachment on my 2-core (4HT) laptop. Most of the threads stuck on mysql_mutex_lock(&structure_guard_mutex) inside Query_cache::try_lock().
Current big lock (m_cache_lock_status with try_lock() & unlock() interfaces) should be replaced by a plain mutex and protect only query_cache members (query_cache_size, query_cache_limit, statistic, query_cache_size, queries_blocks, internal hash table array (hash->array currently) etc.). While the free memory management and hash table buckets must have their own locks.
The query and table hash tables should use per-bucket lock instead of current global m_cache_lock_status lock. While HASH is used in many other places a new hash table, just for the query cache, with per-bucket locking should be introduced. Can we take advantage of such data structure in other /sql places?
If two threads try to store the same query, then the only one of them should make progress under the corresponding hash bucket lock and the other one shall wait on the lock and do not try to allocate a new query block.
Currently Query_cache_query->lock_reading() and Query_cache_query->lock_writing() is acquired under m_cache_lock_status to protect against the cache entry removal. Query_cache_query reference counter should be introduced for this and now we can unlock() right after the the entry locking.
Query_cache::resize() is too brutal: it completely free the cache and reallocates it again. Is it required that all cached queries are invalidated? If not, then we can shrink or extend memory without the reallocation and full invalidation. The new hash table can implement current items rehashing into a newly allocated hash array.
If there is no requirement to invalidate all the cached queries on the cache resize, then Query_cache::lock_and_suspend() can be replaced by a normal lock (all current threads still have a chanse to find a query in the cache).
FLUSH QUERY CACHE runs pack() and join() in a loop under the big lock which may block working threads for a while. The single memory cache area should be replaced by set of page chains (the pages may be larger than OS page). Page clusters of 2^n pages can be mmap()ed if we get large results. Each query calls mmap() to store results, so we have no qeury/results fragmentation. However, queries may not fully utilize pages and FLUSH QUERY CACHE coalesces partially filled pages. Freed pages are munmap()ed to let OS provide us large page clusters if required. Now Query_cache::pack() needs to aquire lock for not more than 2 page chains at a given time.
Query_cache::invalidate() shall acquire per-bucket lock - only queries using tables from the hash bucket will be unavailable during the invalidation, there is no need to suspend the whole query cache. It seems Query_cache_table should introduce a rw-lock which should be acquired in read mode on query processing and in write mode on table invalidation. So a query insert/send operations typically should acquire current query hash bucket lock, search tables in the table hash with read lock on the hash buckets and finally acquire
Query_cache_table read locks.
make_base_query() seems can be moved out from the lock. We can free the allocated memory if under a lock we realize that we don't need it - it's better to do some extra work outside the lock and make the synchronized code as small as possible.
After the changes, all the locks will be held for a short period of time, m_cache_lock_status will have no sleeping functions underneath, so there is no need for sleep timeout and try_lock()/unlock() can be replaced with basic mutex. https://jira.mariadb.org/browse/MDEV-6766 won't be needed.