-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deserialization trouble with BINARY(16) columns for UUID #139
Comments
Hi, @bobdaly. I've also encountered this bug when deserializing cc @osheroff, @Naros, @jpechane. MySQL column definition:
public static UUID convertBytesToUUID(byte[] bytes) {
Preconditions.checkNotNull(bytes);
Preconditions.checkArgument(bytes.length == 16, "UUID bytes len: " + bytes.length + ", expected 16");
ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
long high = byteBuffer.getLong();
long low = byteBuffer.getLong();
return new UUID(high, low);
} EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY,
EventDeserializer.CompatibilityMode.INVALID_DATE_AND_TIME_AS_MIN_VALUE
);
client.setEventDeserializer(eventDeserializer); related: #131 |
mysql> select lower(hex(uuid)) as uuid from xxx.xxx where lower(hex(uuid)) like '%fdf085fa10f93ce087a45c5f285809%';
+----------------------------------+
| uuid |
+----------------------------------+
| fdf085fa10f93ce087a45c5f28580900 |
+----------------------------------+
1 row in set (0.01 sec)
I.e. the trailing |
Hi @leiless can you tell me what version of MySQL/MariaDB you're using and version of the binlog client? |
@Naros, FYI, Ok, I've figured it out. It's the behavior of MySQL replication protocol. The MySQL server indeed sent 15 bytes to the MySQL replica, it's not a bug. # All 0x00s case
# {'values': {'UNKNOWN_COL0': 1, 'UNKNOWN_COL1': ''}, 'none_sources': {}}
# UNKNOWN_COL1 len: 0
INSERT INTO db.t (id, uuid) VALUES (NULL, UNHEX(REPLACE("00000000-0000-0000-0000-000000000000", "-","")));
# Suffix 0x00s case
# {'values': {'UNKNOWN_COL0': 2, 'UNKNOWN_COL1': '\x01'}, 'none_sources': {}}
# UNKNOWN_COL1 len: 1
INSERT INTO db.t (id, uuid) VALUES (NULL, UNHEX(REPLACE("01000000-0000-0000-0000-000000000000", "-","")));
# Mid 0x00s case
# {'values': {'UNKNOWN_COL0': 3, 'UNKNOWN_COL1': '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'}, 'none_sources': {}}
# UNKNOWN_COL1 len: 15
INSERT INTO db.t (id, uuid) VALUES (NULL, UNHEX(REPLACE("01000000-0000-0000-0000-000000000100", "-","")));
# Prefix 0x00s case
# {'values': {'UNKNOWN_COL0': 4, 'UNKNOWN_COL1': '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'}, 'none_sources': {}}
# UNKNOWN_COL1 len: 15
INSERT INTO db.t (id, uuid) VALUES (NULL, UNHEX(REPLACE("00000000-0000-0000-0000-000000000100", "-","")));
# Prefix + suffix 0x00s case (suffix 0x00s got truncated)
# {'values': {'UNKNOWN_COL0': 5, 'UNKNOWN_COL1': '\x00\x00\x00\x00\x00\x00\x01\x01'}, 'none_sources': {}}
# UNKNOWN_COL1 len: 8
INSERT INTO db.t (id, uuid) VALUES (NULL, UNHEX(REPLACE("00000000-0000-0101-0000-000000000000", "-",""))); from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import WriteRowsEvent
mysql_settings = {'host': '127.0.0.1', 'port': 23306, 'user': 'root', 'passwd': '123456'}
stream = BinLogStreamReader(connection_settings=mysql_settings,
server_id=100,
is_mariadb=True,
auto_position='',
blocking=True,
)
for binlogevent in stream:
binlogevent.dump()
if isinstance(binlogevent, WriteRowsEvent):
write_rows_ev: WriteRowsEvent = binlogevent
if f'{write_rows_ev.schema}.{write_rows_ev.table}' == 'db.t':
for row in write_rows_ev.rows:
print(row)
print('UNKNOWN_COL1 len: ' + str(len(row['values']['UNKNOWN_COL1'])))
print()
stream.close() As we can see, for |
Thanks for the follow-up @leiless, so in that case can we close this or is there still an issue? |
Proof: # {'values': {'UNKNOWN_COL0': 17, 'UNKNOWN_COL1': '\x01\x02\x03\x04\x05\x06\x07\x08\t\n'}, 'none_sources': {}}
# UNKNOWN_COL1 len: 10
INSERT INTO db.t (id, uuid) VALUES (NULL, UNHEX(REPLACE("01020304-0506-0708-090a-000000000000", "-",""))); MariaDB server send I think it applies to MySQL also. And for |
Possible solution to this issueprivate static byte[] fixTruncatedUuidBytes(byte[] uuidBytes) {
Preconditions.checkNotNull(uuidBytes);
Preconditions.checkArgument(uuidBytes.length <= 16);
int truncatedTrailingZerosLen = 16 - uuidBytes.length;
if (truncatedTrailingZerosLen == 0) {
return uuidBytes;
}
byte[] fixedUuidBytes = new byte[16];
// The remaining truncatedTrailingZerosLen bytes are already initialized to 0 by default
System.arraycopy(uuidBytes, 0, fixedUuidBytes, 0, uuidBytes.length);
return fixedUuidBytes;
} |
@leiless Hi, I am a bit confused now. The intent of the code you posted is to provide a hint for clients to how to propely consume the value? If yes, is this something you'd like to add to the README? Also, WRT README update would it be possible to send a PR to https://github.com/debezium/mysql-binlog-connector-java ? Thanks |
Yes, in this case, deserialize I'll send a PR to https://github.com/debezium/mysql-binlog-connector-java someday later, pretty bz right now. |
Library version: 0.27.5
This UUID string value ab9d25e0-64e6-469f-9873-8621910ed300 came from: java.util.UUID.randomUUID().toString() (JDK version 11)
It was persisted to a MySQL 8.0 database column of type BINARY(16) using the UUID_TO_BIN() function.
Here is the relevant code relying on the binlog library:
WriteRowsEventData eventData = event.getData();
List<Serializable[]> rowData = eventData.getRows();
(iterate over rowData)
Object obj = (value from UUID/BINARY column which will come back as a byte[])
For the vast majority of UUIDs, the byte[] is of the expected length - 16.
But some, like the value included above, have a length of 15.
The following conversion will fail with only 15 bytes:
ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
long high = byteBuffer.getLong();
long low = byteBuffer.getLong();
java.util.UUID uuid = new UUID(high, low);
The exception is:
java.nio.BufferUnderflowException
at java.base/java.nio.Buffer.nextGetIndex(Buffer.java:650)
at java.base/java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:452)
If the incorrectly sized byte[] has a 16th byte added as follows:
bin2[15] = Byte.valueOf("00");
The conversion works.
I should also state that interacting with the persisted UUID using BIN_TO_UUID works as expected (i.e there is no problem from MySQL's perspective).
BTW, the CHAR_AND_BINARY_AS_BYTE_ARRAY compatibility mode is being set as well:
EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
);
binaryLogClient.setEventDeserializer(eventDeserializer);
The text was updated successfully, but these errors were encountered: