[Issue #3730] Fix parsing issueType bug #3731

Merged: widal001 merged 3 commits into main from widal001/hot-fix-issue-type on Jan 31, 2025

widal001 (Collaborator)

Summary

Fixes the typo in the IssueType pydantic model's field alias that was causing issueType to be parsed incorrectly.

Fixes #3730

Time to review: 5 mins

Changes proposed

What was added, updated, or removed in this PR.

  • Fixes a typo in the IssueType field alias (it was incorrectly aliased as "type" instead of "issueType")
  • Replaces the string literal field aliases with module-level constants to make it easier to update and monitor field aliases.
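
To illustrate the two changes above, here is a minimal sketch of how a wrong pydantic field alias can silently drop a value, and how a module-level constant keeps the alias in one place. The class names, the `name` field, the `title` field, and the constant `ISSUE_TYPE_ALIAS` are hypothetical and are not taken from the actual codebase; only the alias strings "type" and "issueType" come from this PR.

```python
from typing import Optional

from pydantic import BaseModel, Field

# Hypothetical module-level constant; the real constant names are not shown in this PR.
ISSUE_TYPE_ALIAS = "issueType"


class IssueTypeInfo(BaseModel):
    """Hypothetical nested model holding the issue type's name."""

    name: Optional[str] = None


class IssueContentBuggy(BaseModel):
    """Uses the wrong alias, so the "issueType" key in the payload is ignored."""

    title: str
    issue_type: Optional[IssueTypeInfo] = Field(default=None, alias="type")


class IssueContentFixed(BaseModel):
    """Uses the shared constant, so the "issueType" key is picked up."""

    title: str
    issue_type: Optional[IssueTypeInfo] = Field(default=None, alias=ISSUE_TYPE_ALIAS)


# Made-up payload shaped like a GitHub export record (an assumption).
raw = {"title": "Example issue", "issueType": {"name": "Bug"}}

buggy = IssueContentBuggy(**raw)
fixed = IssueContentFixed(**raw)

print(buggy.issue_type)       # None: the value was dropped without an error
print(fixed.issue_type.name)  # Bug
```

Because the field is optional with a default of None, the mismatched alias raises no validation error, which is consistent with the bug surfacing only as unexpected nulls in the exported data.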

Context for reviewers

Testing instructions, background context, more in-depth details of the implementation, and anything else you'd like to call out or ask reviewers. Explain how the changes were verified.

  1. Check out the PR locally
  2. Run make gh-data-export to export the data from GitHub
  3. Run the following command to count the issues that have a null value for issue type:
    jq '[.[] | select(.issue_type == null)] | length' data/delivery-data.json
  4. The output should be around 691
  5. Spot check values in data/delivery-data.json where "issue_type" is null. Each issue that you check should be missing a "Type" in GitHub (see screenshot below)
Screenshot 2025-01-30 at 5 02 46 PM
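
For reviewers without jq, the check in step 3 could be sketched in Python as follows. This assumes, as the jq filter implies, that data/delivery-data.json is a JSON array of objects with an "issue_type" key; the helper names and the sample records are hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def load_records(path: str) -> list[dict[str, Any]]:
    """Load the exported issues; assumes the file holds a JSON array of objects."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_null_issue_type(records: list[dict[str, Any]]) -> int:
    """Count records whose issue_type is null.

    Mirrors jq's `select(.issue_type == null)`, where a missing key also reads as null.
    """
    return sum(1 for record in records if record.get("issue_type") is None)


def null_issue_type_percentage(records: list[dict[str, Any]]) -> float:
    """Return the share of records with a null issue_type, as a percentage."""
    if not records:
        return 0.0
    return 100 * count_null_issue_type(records) / len(records)


# Made-up sample records for illustration, not real export data.
sample = [
    {"issue_title": "A", "issue_type": "Bug"},
    {"issue_title": "B", "issue_type": None},
    {"issue_title": "C"},
    {"issue_title": "D", "issue_type": "Task"},
]

print(count_null_issue_type(sample))       # 2
print(null_issue_type_percentage(sample))  # 50.0
```

Against the real export, the call would be `count_null_issue_type(load_records("data/delivery-data.json"))`, which should match the jq output.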

Additional information

Screenshots, GIF demos, code examples or output to help show the changes working as expected.

A major goal for Sprint 2.3 will be to set up some automated data quality checks in both Metabase and the ETL pipeline to prevent or at least catch similar regressions earlier.

coilysiren (Collaborator) left a comment

👍🏽!

DavidDudas-Intuitial (Collaborator) left a comment

Thanks for finding and fixing the root cause of recent data anomalies!

jcrichlake (Collaborator)

Should we put a unit test in place for these values, to make sure that they are parsed correctly and not dropped due to an incorrect label?
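
One possible shape for such a test, as a rough sketch only: the model below is a hypothetical stand-in defined inline (the real models and their import paths are not shown in this PR), and the payloads are made up. The idea is to feed a record keyed by "issueType" through the model and assert that the value survives parsing.

```python
import unittest
from typing import Optional

from pydantic import BaseModel, Field

# Hypothetical constant and models standing in for the real ones.
ISSUE_TYPE_ALIAS = "issueType"


class IssueTypeInfo(BaseModel):
    name: Optional[str] = None


class IssueContent(BaseModel):
    title: str
    issue_type: Optional[IssueTypeInfo] = Field(default=None, alias=ISSUE_TYPE_ALIAS)


class TestIssueTypeParsing(unittest.TestCase):
    def test_issue_type_is_parsed(self) -> None:
        # A populated "issueType" key must not be dropped during parsing.
        issue = IssueContent(**{"title": "Example", "issueType": {"name": "Bug"}})
        self.assertIsNotNone(issue.issue_type)
        self.assertEqual(issue.issue_type.name, "Bug")

    def test_missing_issue_type_is_none(self) -> None:
        # Issues without a type in GitHub should still parse, with a null type.
        issue = IssueContent(**{"title": "Example"})
        self.assertIsNone(issue.issue_type)

    def test_explicit_null_issue_type_is_none(self) -> None:
        issue = IssueContent(**{"title": "Example", "issueType": None})
        self.assertIsNone(issue.issue_type)


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIssueTypeParsing)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With the earlier alias of "type", the first test would fail because issue_type would come back as None, so a test along these lines would likely have caught the regression.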

widal001 merged commit 2994df0 into main on Jan 31, 2025
1 check passed
widal001 deleted the widal001/hot-fix-issue-type branch on January 31, 2025 15:58
DavidDudas-Intuitial pushed a commit that referenced this pull request Feb 7, 2025
Fixes #3730 
- Fixes typo in `IssueType` field alias (it was incorrectly aliased as
`"type"` instead of `"issueType"`) that caused a parsing issue.
- Replaces the string literal field aliases with module level constants
to make it easier to update and monitor field aliases.
Development

Successfully merging this pull request may close these issues.

Bug parsing issueType from GitHub export
4 participants