2025-02-13: Fix column names and rollback expression change in TrimMaskedStrings for Azure SQL masking pipeline #41

Open
sumeetdas-dlpx opened this issue Feb 13, 2025 · 0 comments

Expected Behavior

The Azure SQL masking pipeline should use the new column names (introduced by the V2025.01.15.0 migration script) in mappings wherever applicable. Also, the expression in TrimMaskedStrings should be rolled back to its previous form: iif(greater($DF_TRIM_LENGTHS[toInteger($# - 1)], 0), substring($$, 1, $DF_TRIM_LENGTHS[toInteger($# - 1)]), toString($$))

Actual Behavior

The metadata column was not renamed to source_metadata and algorithm_metadata in some ADF data flow activities. This may cause publishing newer template versions to fail, or cause errors while running the masking pipeline.
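A quick way to spot leftover references is a plain text scan of the data flow script. The sketch below is only a heuristic and assumes the script is available as text; the function name find_stale_metadata_refs and the sample fragment are hypothetical and not taken from the actual templates.

```python
import re

# Matches the old column name "metadata" as a whole word. Because "_" counts
# as a word character, "source_metadata" and "algorithm_metadata" do not match.
STALE_COLUMN = re.compile(r"\bmetadata\b")


def find_stale_metadata_refs(script: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for lines that still use the old name."""
    hits = []
    for line_no, line in enumerate(script.splitlines(), start=1):
        if STALE_COLUMN.search(line):
            hits.append((line_no, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical fragment, for illustration only.
    sample = "source_metadata,\n  metadata,\n  algorithm_metadata"
    print(find_stale_metadata_refs(sample))
```

This is a text-level check, so it also flags the word inside comments or string literals; each hit still needs a manual look.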

The current expression in TrimMaskedStrings, substring($$, 1, $DF_TRIM_LENGTHS[toInteger($# - 1)]), causes masked JSON, XML and UUID values to be replaced by an empty string. Their identified_column_max_length is -1, and the substring function returns an empty string for a negative length (though this behaviour is not documented on the ADF website). For JSON values in particular, this causes the masking pipeline to fail with the following error:

Job failed due to reason: at Sink 'Sink': JSON text is not properly formatted. Unexpected character '.' is found at position 0.
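The difference between the two expressions can be modelled outside ADF. The following Python sketch is not ADF code: adf_substring is a hypothetical stand-in that assumes a 1-based start and an empty result for a negative length, which is the behaviour observed in this issue rather than documented behaviour.

```python
def adf_substring(value: str, start: int, length: int) -> str:
    """Model of substring as observed here: 1-based start, negative length gives ''."""
    if length < 0:
        return ""
    return value[start - 1 : start - 1 + length]


def trim_unguarded(value: str, max_length: int) -> str:
    # Models: substring($$, 1, $DF_TRIM_LENGTHS[...])
    return adf_substring(value, 1, max_length)


def trim_guarded(value: str, max_length: int) -> str:
    # Models: iif(greater(len, 0), substring($$, 1, len), toString($$))
    if max_length > 0:
        return adf_substring(value, 1, max_length)
    return str(value)


if __name__ == "__main__":
    masked_json = '{"name": "xxxx"}'
    # identified_column_max_length is -1 for JSON, XML and UUID columns.
    print(repr(trim_unguarded(masked_json, -1)))
    print(repr(trim_guarded(masked_json, -1)))
    print(repr(trim_guarded("abcdef", 3)))
```

With a max length of -1 the unguarded version drops the value entirely, while the guarded version passes it through unchanged and still trims columns that have a positive max length.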

Steps To Reproduce the Problem
The column names issue was discovered while developing the Azure MI template. I couldn't publish the template without changing metadata to the new column names.

To reproduce the TrimMaskedStrings issue, mask JSON, XML or UUID columns that contain valid values and observe that the masked values turn into empty strings in the sink table.

Screenshots
Masking pipeline error when masking JSON values with the current trim expression:

Image

Version
0.0.17 and later.

Additional Context
These issues are fixed in Azure MI templates' PR. Check the following comments: one, two, three, four, five, six, seven.
