While benchmarking purge, I found that ut_allocator allocations are unnecessarily slow.
About half of the allocation time is spent looking up the perfschema memory keys that correspond to the source files (the so-called "auto" keys); the other half is spent in the actual malloc(). Performance schema was not even enabled in the test.
The lookup function ut_new_get_key_by_file() currently takes the value of __FILE__ passed down by the caller, strips the path, strips the extension, and looks up the result in an STL map.
First, the string operations (strcmp/strchr/strrchr) are expensive. Second, if performance schema is OFF, these are just cycles not well spent.
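To make the cost concrete, here is a rough sketch of what a lookup of this shape does on every allocation. It is not the actual InnoDB source: mem_key_t, auto_keys and get_key_by_file_sketch() are hypothetical names, and the key values are made up.

```cpp
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the perfschema memory key type.
using mem_key_t = unsigned int;

// Hypothetical registry: base file name (no path, no extension) -> key.
static std::map<std::string, mem_key_t> auto_keys = {
    {"btr0btr", 1}, {"trx0purge", 2}};

// String-based lookup: two scans of the path, a temporary std::string,
// and a tree walk with a string comparison at each node.
mem_key_t get_key_by_file_sketch(const char *file) {
  const char *slash = std::strrchr(file, '/');
  const char *base = slash ? slash + 1 : file;
  const char *dot = std::strrchr(base, '.');
  const std::size_t len =
      dot ? static_cast<std::size_t>(dot - base) : std::strlen(base);
  auto it = auto_keys.find(std::string(base, len));
  return it != auto_keys.end() ? it->second : 0;  // 0 = no key found
}
```

All of this work depends only on the value of __FILE__, which is known at compile time, yet it is repeated at run time for each allocation.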
A better solution would be:
a) do not do any lookup if perfschema is OFF
b) if perfschema is ON, look up integers, not C-style strings. constexpr trickery can be applied to generate some kind of string hash, e.g. djb2, of basename_noext(__FILE__) already at compile time, and that hash can then be used for the lookup instead of the string.
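A minimal sketch of both points, assuming C++14 constexpr. The names file_key_hash(), pfs_enabled, auto_keys_by_hash and get_key_by_hash() are hypothetical and only illustrate the idea; a real patch would also have to detect hash collisions when the keys are registered.

```cpp
#include <cstdint>
#include <unordered_map>

// Start of the file name: the character after the last '/' or '\\'.
constexpr const char *basename_start(const char *path) {
  const char *base = path;
  for (const char *p = path; *p != '\0'; ++p) {
    if (*p == '/' || *p == '\\') base = p + 1;
  }
  return base;
}

// djb2 (seed 5381, hash * 33 + c) over the base name without its extension.
constexpr std::uint32_t file_key_hash(const char *path) {
  const char *base = basename_start(path);
  const char *end = base;
  const char *dot = nullptr;
  for (; *end != '\0'; ++end) {
    if (*end == '.') dot = end;  // remember the last '.'
  }
  if (dot != nullptr) end = dot;
  std::uint32_t h = 5381;
  for (const char *p = base; p != end; ++p) {
    h = h * 33u + static_cast<unsigned char>(*p);
  }
  return h;
}

// Path and extension do not affect the hash; checked at compile time.
static_assert(file_key_hash("/a/b/trx0purge.cc") == file_key_hash("trx0purge.h"),
              "hash must depend only on the base name");

using mem_key_t = unsigned int;  // hypothetical key type

static bool pfs_enabled = false;  // hypothetical "perfschema is ON" flag
static std::unordered_map<std::uint32_t, mem_key_t> auto_keys_by_hash;

inline mem_key_t get_key_by_hash(std::uint32_t hash) {
  if (!pfs_enabled) return 0;  // (a) no lookup at all when perfschema is OFF
  auto it = auto_keys_by_hash.find(hash);  // (b) integer lookup
  return it != auto_keys_by_hash.end() ? it->second : 0;
}
```

At the call site, something like `constexpr auto h = file_key_hash(__FILE__);` forces the hash to be computed by the compiler, so the allocation path only pays for one branch and, with perfschema ON, one integer hash lookup.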