-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow explicit control of AUTO_ID_CACHE
in DM incremental sync of new tables
#12039
Comments
ptal @alastori @OliverS929 Can you explain more details about the problem without adjust the AUTO_ID_CACHE? For example, when other SQL workloads need to INSERT some rows, it will be conflict with DM's replication? |
Sure, we are using DM to incrementally migrate data from an upstream Aurora MySQL instance and the application client relies on monotonically increasing allocated IDs which we can achieve by specifying auto ID cache size of 1 for MySQl compatibility. So this establishes a requirement for us that all upstream auto increment tables must be created with cache size 1. We normally backfill existing data through a bulk process (ex. Lightning) and explicitly control table creation so we can create existing tables with the correct cache size. However, once we finish bulk backfill and enter into incremental replication with DM, we may replicate newly created tables. We won't have other applications inserting rows during this period, but we do want readers of replicated data to read MySQL-like increasing IDs. During incremental phase, DM creates new tables with the default cache size. This is prohibitive also because we cannot adjust the auto ID cache size after table creation. Instead, we want DM to create the new tables with cache size of 1 pfor MySQL compatibility (or more generically, with user configurable cache size). |
@michaelmdeng thanks for the context. I understand the main issue is related to the potential huge gaps in auto_increment fields when @lance6716, @OliverS929 : I agree this is indeed an important problem that affects data consistency and application behavior during migrations using DM incremental replication. Let me add some additional context from my perspective:
@michaelmdeng note that small gaps in auto_increment fields are still possible if TiDB crashes or failover and in those edge cases it is not guaranteed that the target will have the same IDs as the source. |
yep, we understand this possibility and are prepared for it
Given the requirements/need here, is the proposed approach in the PR appropriate? any other design considerations for how incremental DM interacts w/ other modes? |
Sorry for the late reply. I agree with @alastori that we could indeed add an option to set parameters like AUTO_ID_CACHE when creating tables in DDLs, and it’s important to ensure MySQL compatibility in this case. In addition to the points raised above and the existing PR from @michaelmdeng, I have a few suggestions on how we could approach this issue: Beyond the AUTO_ID_CACHE option in this issue, there are several other TableOption settings, such as Besides, I don’t think this adaptation should be limited to incremental mode. As @michaelmdeng mentioned in his reply:
DM's full mode relies on a similar process, using Lightning to execute CREATE TABLE statements during the process. Users may need customized table options in scenarios beyond just incremental mode. While I understand the current PR is focused on the syncer’s code path, it’s worth considering whether this issue might also apply to other modes. With those being said, I think this issue is quite valuable, and we should definitely prioritize finding a solution. |
Also cc @River2000i |
Is your feature request related to a problem?
We cannot update the
AUTO_ID_CACHE
setting of a table after its creation.When we replicate new tables via incremental DM, the
CREATE TABLE
statements are applied with default auto-id-cache size, which is problematic if we want to control it directly, or if we want to use MySQL compatibility mode w/AUTO_ID_CACHE=1
.Describe the feature you'd like
We'd like to have control to explicitly set it in create table DDLs that are incrementally replicated in DM. When DM syncs a new table w/ an AUTO_INCREMENT column, the syncer can modify the DDL to set an explicit auto-id-cache size based on the syncer configuration.
We propose a new
auto-id-cache-size
Syncer config. When set, DM will update the create statement w/ the corresponding auto-id-cache size. When unset (0 or default), the statement will be left as is.Describe alternatives you've considered
No response
Teachability, Documentation, Adoption, Migration Strategy
No response
The text was updated successfully, but these errors were encountered: