Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync-diff-inspector should not use REPLACE in its fix-sql if ignore-columns is used #842

Open
kennytm opened this issue Jan 31, 2025 · 1 comment

Comments

@kennytm
Copy link
Contributor

kennytm commented Jan 31, 2025

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. What did you do?

  1. Prepare data
drop schema if exists up;
drop schema if exists down;
create schema up;
create schema down;

drop user if exists up;
drop user if exists down;
create user up;
create user down;
grant select on up.* to up;
grant all on down.* to down;

create table up.test (id bigint primary key, d double, a int);
create table down.test like up.test;
insert into up.test values (1, 5, 4);
insert into down.test values (1, 2, 3);
  1. Set diff config
export-fix-sql = true

[data-sources.up]
host = "127.0.0.1"
port = 4000
user = "up"
password = ""
route-rules = ["rename"]

[data-sources.down]
host = "127.0.0.1"
port = 4000
user = "down"
password = ""

[routes]
[routes.rename]
schema-pattern = "up"
table-pattern = "test"
target-schema = "down"
target-table = "test"

[task]
output-dir = "./output"
source-instances = ["up"]
target-instance = "down"
target-check-tables = ["down.test"]
target-configs = ["ignore-columns"]

[table-configs]
[table-configs.ignore-columns]
target-tables = ["down.test"]
ignore-columns = ["d"]
  1. Run sync-diff. It should say "The data of `down`.`test` is not equal".
  2. Execute the fix-sql
mysql -u down -h 127.0.0.1 -P 4000 < ./output/fix-on-down/*.sql
  1. Check the fixed result
select * from down.test;

2. What did you expect to see?

The value of d is either unchanged (2) or becomes the upstream value (5).

3. What did you see instead?

The value of d got replaced as the default value (NULL):

mysql> select * from down.test;
+----+------+------+
| id | d    | a    |
+----+------+------+
|  1 | NULL |    4 |
+----+------+------+
1 row in set (0.00 sec)
@kennytm
Copy link
Contributor Author

kennytm commented Jan 31, 2025

Note that the fix-sql generated by sync-diff uses the REPLACE statement, so the current behavior is arguably "expected" (as REPLACE = DELETE + INSERT)

-- table: down.test
-- range in sequence: Full
/*
  DIFF COLUMNS ╏ `A`  
╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╋╍╍╍╍╍╍
  source data  ╏ 4    
╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╋╍╍╍╍╍╍
  target data  ╏ 3    
╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╋╍╍╍╍╍╍
*/
REPLACE INTO `down`.`test`(`id`,`a`) VALUES (1,4);

However, when we use ignore-columns in sync-diff-inspector, it is because diff cannot actually handle those data types (json, bit, blob, etc). So the behavior is not useful for users.

Instead of REPLACE, the generated fix-sql should either use INSERT INTO ON DUPLICATE to keep the original ignored column:

insert into test (id, a) values (1, 4) on duplicate key update id = values(id), a = values(a);

or just generate the full row to replace everything entirely (may be impossible when sync-diff just don't know how to serialize the column):

replace into test (id, d, a) values (1, 5, 4);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant