Details
Bug
Status: Closed
Blocker
Resolution: Not a Bug
3.0.6
None
Description
The code that combines a surrogate pair into a code point looks like this (in org.mariadb.jdbc.client.socket.impl.PacketWriter):
    int surrogatePairs =
        ((currChar << 10) + nextChar) + (0x010000 - (0xD800 << 10) - 0xDC00);
According to the Unicode standard, however, the computation should look like this (https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates):
    code = 0x10000;
    code += (units[0] & 0x03FF) << 10;
    code += (units[1] & 0x03FF);
Not too surprisingly, the two computations do not produce the same result.
Example: \udbc0\udd89
    public class MyClass {
        public static void main(String[] args) {
            char current = 0xdbc0;
            char next = 0xdd89;
            // Note: decimal 10000 here, whereas the formula above starts from hexadecimal 0x10000.
            int c = 10000;
            c += (current & 0x3ff) << 10;
            c += (next & 0x3ff);

            int surrogatePairs =
                ((current << 10) + next) + (0x010000 - (0xD800 << 10) - 0xDC00);

            System.out.println(c + " VS. " + surrogatePairs);
        }
    }
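A likely explanation for the mismatch, consistent with the "Not a Bug" resolution: the example initialises c with decimal 10000, while the quoted Unicode formula starts from hexadecimal 0x10000. For a valid pair, (high & 0x03FF) equals high - 0xD800 and (low & 0x03FF) equals low - 0xDC00, so the two formulas are algebraically the same once the offset is written in hex. The following is a minimal sketch to check this, not the driver's actual code: the class name SurrogateCheck and the method names driverFormula and maskFormula are made up for illustration, and Character.toCodePoint from the JDK is used as an independent reference.

```java
public class SurrogateCheck {
    // Expression as quoted from PacketWriter in the description (parameter names are hypothetical).
    static int driverFormula(char high, char low) {
        return ((high << 10) + low) + (0x010000 - (0xD800 << 10) - 0xDC00);
    }

    // Mask-based formula from the Unicode reference, with the offset in hexadecimal.
    static int maskFormula(char high, char low) {
        int code = 0x10000;
        code += (high & 0x03FF) << 10;
        code += (low & 0x03FF);
        return code;
    }

    public static void main(String[] args) {
        char high = 0xDBC0;
        char low = 0xDD89;

        int fromDriver = driverFormula(high, low);
        int fromMask = maskFormula(high, low);
        int fromJdk = Character.toCodePoint(high, low); // JDK reference value

        // All three values should agree for a valid surrogate pair.
        System.out.println(Integer.toHexString(fromDriver) + " VS. "
            + Integer.toHexString(fromMask) + " VS. "
            + Integer.toHexString(fromJdk));
    }
}
```

For \udbc0\udd89 all three values come out as 0x100189 (U+100189), so with the hexadecimal offset the two computations agree.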