|
Possible approaches:
- convert passwords to a fixed character set, e.g. utf8. Problems:
  - existing password hashes become invalid
  - passwords must consist of valid characters; arbitrary binary passwords are no longer possible
- remember the character set per password. Problem:
  - extra round-trip to send this charset to the client during the handshake
- something more complex: remember a charset per password (but record ascii if the repertoire fits), and only do a second round-trip if the charset is not ascii. Problems:
  - over-engineering
  - care is needed not to disclose any information if the password is wrong
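
To illustrate why converting to a fixed character set invalidates existing hashes, here is a minimal Python sketch. It assumes the double SHA-1 scheme used by mysql_native_password; the helper name `native_password_hash` and the sample password are made up for illustration and are not taken from the server code.

```python
import hashlib


def native_password_hash(password_bytes: bytes) -> str:
    """Hypothetical helper: '*' + HEX(SHA1(SHA1(bytes))), as in mysql_native_password."""
    inner = hashlib.sha1(password_bytes).digest()
    return "*" + hashlib.sha1(inner).hexdigest().upper()


password = "pässword"  # one non-ASCII character

# The hash is computed over bytes, so the client character set matters.
hash_latin1 = native_password_hash(password.encode("latin-1"))
hash_utf8 = native_password_hash(password.encode("utf-8"))
print(hash_latin1 == hash_utf8)  # False: same characters, different bytes

# A pure-ASCII password has the same bytes in both encodings.
ascii_password = "password"
print(
    native_password_hash(ascii_password.encode("latin-1"))
    == native_password_hash(ascii_password.encode("utf-8"))
)  # True
```

Only passwords with non-ASCII characters are affected, which is why the third approach treats ascii as a special case.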
|
|
I do not think passwords should ever be arbitrary binary blobs.
They are typed on terminals and in Web pages, passed as command-line arguments (and thus, in the Windows world, have to be convertible to and from Unicode), and exist in various non-C APIs; one would be hard pressed to find a non-C language where a string is arbitrary bytes.
If someone needs to pass a non-string, such as a certificate or whatever, to a hashing function on the client, there are hex or base64 conversions, which make ASCII/UTF-8 out of anything.
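
As a rough sketch of that conversion in Python (the byte values below are an arbitrary example, not a real certificate):

```python
import base64

# Arbitrary binary secret that is not valid UTF-8.
secret = bytes([0x00, 0xFF, 0xFE, 0x80, 0x41])

# Either encoding yields a pure-ASCII string that can be typed,
# passed on a command line, or stored in any string type.
password_hex = secret.hex()
password_b64 = base64.b64encode(secret).decode("ascii")

print(password_hex)  # 00fffe8041
print(password_b64.isascii())  # True
```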
A possible approach could be that the server starts returning an error when CREATE USER or SET PASSWORD uses the PASSWORD() function with invalid UTF-8. And the CLI starts issuing a warning if it used non-UTF-8 input to compute the hash. The warning will encourage the user to change the password, maybe to ASCII (such a warning is not likely to happen in UTF-8-aware environments).
Then, after a grace period of 10 to 20 years, one could maybe assume passwords are all in UTF-8.
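
The check itself would be simple. Below is a Python sketch of the idea only; the function names `classify_password`, `server_check` and `client_check` are hypothetical and do not correspond to existing server or client code.

```python
import warnings


def classify_password(password_bytes: bytes) -> str:
    """Return 'ascii', 'utf8' or 'invalid' for the raw password bytes."""
    if password_bytes.isascii():
        return "ascii"
    try:
        password_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"
    return "utf8"


def server_check(password_bytes: bytes) -> None:
    """Hypothetical server side: reject PASSWORD() input that is not valid UTF-8."""
    if classify_password(password_bytes) == "invalid":
        raise ValueError("password is not valid UTF-8")


def client_check(password_bytes: bytes) -> None:
    """Hypothetical client side: warn, but still compute the hash."""
    if classify_password(password_bytes) == "invalid":
        warnings.warn("password is not valid UTF-8; consider changing it")


print(classify_password(b"secret"))  # ascii
print(classify_password("pässword".encode("utf-8")))  # utf8
print(classify_password("pässword".encode("latin-1")))  # invalid
```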
|