output_mode for BigQuery is unexpected format #292

Open
a92340a opened this issue Dec 26, 2024 · 2 comments
Comments

a92340a commented Dec 26, 2024

Describe the bug
I'm trying to parse PostgreSQL DDL into a BigQuery schema, but the output is in an unexpected format.

To Reproduce
I followed the instructions and example in README.md and wrote the following snippet. However, the columns in the output are incompatible with the BigQuery schema format. Please let me know if I have misunderstood anything.

    from simple_ddl_parser import DDLParser

    # file_path points to the PostgreSQL DDL file
    with open(file_path, "r") as file:
        ddl = file.read()
        result = DDLParser(ddl).run(output_mode="bigquery")
        print(result)

The output is:

[{'table_name': 'product_tag', 'primary_key': [], 'columns': [{'name': 'id', 'type': 'integer', 'size': None, 'references': None, 'unique': False, 'nullable': False, 'default': None, 'check': None}, {'name': 'tag_id', 'type': 'bigint', 'size': None, 'references': None, 'unique': False, 'nullable': True, 'default': None, 'check': None}, {'name': 'sn', 'type': 'character varying', 'size': None, 'references': None, 'unique': False, 'nullable': True, 'default': None, 'check': None}, {'name': 'created_at', 'type': 'timestamp', 'size': None, 'references': None, 'unique': False, 'nullable': True, 'default': None, 'check': None, 'with_time_zone': False}, {'name': 'updated_at', 'type': 'timestamp', 'size': None, 'references': None, 'unique': False, 'nullable': True, 'default': None, 'check': None, 'with_time_zone': False}], 'alter': {}, 'checks': [], 'index': [], 'partitioned_by': [], 'tablespace': None, 'dataset': 'public'}]

Expected behavior
I expected the schema output to follow this pattern:
[{"name": "id", "type": "INTEGER", "mode": "REQUIRED"},{"name": "tag_id", "type": "INTEGER", "mode": "NULLABLE"},{"name": "sn", "type": "STRING", "mode": "NULLABLE"},{"name": "created_at", "type": "DATETIME", "mode": "NULLABLE"},{"name": "updated_at", "type": "DATETIME", "mode": "NULLABLE"}]
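
For illustration, the gap between the two formats can be bridged with a small post-processing step. The following is only a minimal sketch: the `PG_TO_BQ_TYPE` mapping, the `to_bigquery_schema` helper, and the fallback to `STRING` for unmapped types are all hypothetical and not part of the library; the mapping simply mirrors the expected output above.

```python
import json

# Hypothetical PostgreSQL -> BigQuery type mapping (an assumption for
# illustration, not something the parser provides).
PG_TO_BQ_TYPE = {
    "integer": "INTEGER",
    "bigint": "INTEGER",
    "character varying": "STRING",
    "timestamp": "DATETIME",
}


def to_bigquery_schema(columns):
    """Convert parser column dicts into BigQuery JSON schema fields."""
    schema = []
    for col in columns:
        pg_type = col["type"].lower()
        if pg_type == "timestamp" and col.get("with_time_zone"):
            # Assumption: timestamps with a time zone map to TIMESTAMP.
            bq_type = "TIMESTAMP"
        else:
            # Assumption: unmapped types fall back to STRING.
            bq_type = PG_TO_BQ_TYPE.get(pg_type, "STRING")
        schema.append({
            "name": col["name"],
            "type": bq_type,
            "mode": "NULLABLE" if col["nullable"] else "REQUIRED",
        })
    return schema


# Two columns copied from the output above, trimmed to the keys used here.
columns = [
    {"name": "id", "type": "integer", "nullable": False},
    {"name": "created_at", "type": "timestamp", "nullable": True,
     "with_time_zone": False},
]
print(json.dumps(to_bigquery_schema(columns)))
```

In practice the input would be `result[0]["columns"]` from the snippet above, and the JSON string could be written to a schema file.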

Screenshots
The output of columns is incompatible with the BigQuery schema:
Screenshot 2024-12-26 at 10 50 23 AM

Desktop (please complete the following information):

  • OS: macOS
  • Python Version: 3.11
@xnuinside
Owner

@a92340a Hello, thanks for opening the issue. What do you mean by 'unexpected'? This output format is exactly what the library provides. It is exactly what is described in the README [image] and in all test cases.

@a92340a
Author

a92340a commented Jan 25, 2025

@xnuinside Thanks for your reply.
"Unexpected format" means the output did not align with the BigQuery JSON schema format, which I described under 'Expected behavior'. For the more general BigQuery schema case, please refer to the official documentation: https://cloud.google.com/bigquery/docs/schemas#creating_a_JSON_schema_file.
