Hello there,

We have a running application which:

- Listens to MySQLReplication\Event\DTO\RowsDTO events (inserts, updates, deletes)
- Loops over RowsDTO::getValues() to process them
- After foreach ($rows->getValues()), stores the binlog position of the event: $event->getEventInfo()->getBinLogCurrent()->getBinFileName() / $event->getEventInfo()->getBinLogCurrent()->getBinLogPosition()
- On startup, resumes from the previously stored filename / position (a minimal sketch of this loop follows below)
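For reference, a minimal sketch of that pattern; the DTO accessors are the ones named above, while saveRow() and storePosition() are hypothetical application helpers:

```php
<?php
// Sketch of the handler described above. saveRow() and storePosition()
// are hypothetical application helpers; the getEventInfo()/getBinLogCurrent()
// accessors are the ones from the report.
use MySQLReplication\Event\DTO\RowsDTO;

function handleRows(RowsDTO $event): void
{
    foreach ($event->getValues() as $row) {
        saveRow($row); // application-specific processing
    }

    // After the whole event is processed, persist its binlog position
    // so the app can resume from here on restart.
    $binLog = $event->getEventInfo()->getBinLogCurrent();
    storePosition($binLog->getBinFileName(), $binLog->getBinLogPosition());
}
```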
Most of the time, this works well. However, we found a typical case where it doesn't.
Consider you have an INSERT containing several thousand rows (e.g. INSERT INTO [table] SELECT * FROM [some_other_table]). Contrary to what we expected, this doesn't fire a single RowsDTO event with a huge getValues() array, but several RowsDTO events with just a few dozen items in getValues() each.
We didn't understand why until we discovered the binlog_row_event_max_size option on the replica server (its default of 8192 bytes caps how much row data fits into one event, which would explain the few dozen items per event), but that's not the point.
To reproduce:
1. Consider that before doing your big INSERT query, your position is 00001.
2. Consider that the position increments to 01000 by the end of the INSERT.
3. Imagine that you stop your app in the middle of processing one of the RowsDTO::getValues() events.
4. Consider that you stored 00500 as the last processed position when you stopped the app (i.e. you have only processed half of the rows of that INSERT).
Expected Result:
When restarting the app, it should resume at position 00500 and trigger the remaining RowsDTO events.
Actual Result:
The remaining RowsDTO events of that INSERT query are simply dismissed, and the app resumes at the next query in the binlog.
The only solution we have right now is to pray that the application is not shut down (and doesn't crash) while processing that kind of big statement. We could increase binlog_row_event_max_size on the replica server, but this won't guarantee that RowsDTO events won't be split, and on the other hand, having a huge array in RowsDTO::getValues() might lead to memory issues.
As a quick fix, we now temporarily store the filename / position of the QueryDTO events that are fired before and after the RowsDTO events, so that if not all RowsDTO events can be processed, they can be replayed from the earlier position (00001 in our example). This leads to duplicate processing, but we prefer that over missing events. A sketch of this workaround follows below.
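For illustration, a minimal sketch of this workaround; PositionStore and the row-processing helper are hypothetical application pieces, while QueryDTO/RowsDTO follow the library's DTO naming:

```php
<?php
// Sketch of the quick fix: persist the position of the QueryDTO that
// precedes the row events, so an interrupted INSERT is replayed in full
// on restart. PositionStore and processRow() are hypothetical.
use MySQLReplication\Event\DTO\QueryDTO;
use MySQLReplication\Event\DTO\RowsDTO;

final class CheckpointingHandler
{
    public function __construct(private PositionStore $store) {}

    public function onQuery(QueryDTO $event): void
    {
        // Positions saved here are always on a statement boundary,
        // so resuming from them can never skip half of an INSERT.
        $binLog = $event->getEventInfo()->getBinLogCurrent();
        $this->store->save($binLog->getBinFileName(), $binLog->getBinLogPosition());
    }

    public function onRows(RowsDTO $event): void
    {
        foreach ($event->getValues() as $row) {
            // Processing must be idempotent: after a crash, every RowsDTO
            // since the last QueryDTO checkpoint is replayed.
            $this->processRow($row);
        }
    }

    private function processRow(array $row): void { /* application logic */ }
}
```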
Thank you,
Ben
Let's imagine there is a big transaction in the database (e.g. a large INSERT affecting 10kk rows).

First you will receive a TABLE_MAP event with the source table structure. This event is required to map the binary data from the row event, which is stored without column names or types, to a structure with column names and real types.

Then you will receive multiple ROW_EVENTs with the inserted rows (one or more, depending on binlog_row_event_max_size, if I remember correctly).

There is no way to ask the master for a TABLE_MAP in the middle of an event.

So the only option is to save the position of the latest processed event and also save the latest TABLE_MAP position. Then, if the script fails, start from the TABLE_MAP event and skip all ROW_EVENTs until the latest processed one (a sketch of this recovery logic follows below).
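A minimal sketch of that recovery scheme, assuming a Checkpoint store that persists both the TABLE_MAP position and a count of row events already processed after it (Checkpoint and its methods are hypothetical; TableMapDTO/RowsDTO follow the library's DTO naming):

```php
<?php
// Sketch of the suggested recovery: checkpoint the latest TABLE_MAP
// position plus how many row events after it were handled, then on
// restart resume from the TABLE_MAP and skip that many row events.
use MySQLReplication\Event\DTO\RowsDTO;
use MySQLReplication\Event\DTO\TableMapDTO;

final class ResumableConsumer
{
    private int $rowEventsSinceTableMap = 0;
    private int $rowEventsToSkip;

    public function __construct(private Checkpoint $checkpoint)
    {
        // The binlog client is assumed to have been started from the
        // stored TABLE_MAP position; skip what was already processed.
        $this->rowEventsToSkip = $checkpoint->processedRowEvents();
    }

    public function onTableMap(TableMapDTO $event): void
    {
        $binLog = $event->getEventInfo()->getBinLogCurrent();
        $this->checkpoint->saveTableMap(
            $binLog->getBinFileName(),
            $binLog->getBinLogPosition()
        );
        $this->rowEventsSinceTableMap = 0;
    }

    public function onRows(RowsDTO $event): void
    {
        $this->rowEventsSinceTableMap++;
        if ($this->rowEventsToSkip > 0) {
            $this->rowEventsToSkip--; // already handled before the crash
            return;
        }
        foreach ($event->getValues() as $row) {
            // ... process row ...
        }
        $this->checkpoint->saveProcessedRowEvents($this->rowEventsSinceTableMap);
    }
}
```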