feat: In each record to filter and transform, publish a local service field holding the original object the record is extracted from #214

rpopov · 2025-01-13T10:59:07Z

Implement airbytehq/airbyte#50395 discussed in airbytehq/airbyte#49971

Implemented a new DpathEnhancingExtractor that navigates through a path of field names to reach the records to extract (exactly as the DpathExtractor does) and:

in each record, it binds the $root service field to the original response object
in each object/mapping of the JSON structure (thus in each record in particular), it binds the $parent service field to an object/mapping of the parent object's field, including its $parent field.

Published the DpathEnhancingExtractor to the Builder UI.

These service fields are accessible in the filter and transform steps, after which they are removed from the records.
This allows the filter and transformation steps to navigate to fields and structures that are left out of the selected records.
Examples:

{{ record["$root"]["attribute name"] }} accesses the attributes of the root object of the response, while the selected records are somewhere nested in the response
{{ record["$parent"]["attribute name"] accesses the attributes of the object that owns the selected record(s)
{{ record["$parent"]["$parent"]["attribute name"] accesses the attributes of the object that owns the owner of the selected record(s)
{{ record["$parent"]....["$parent"]["attribute name"] accesses the attributes of the object that owns the owner of the selected record(s) at any level.

More service fields could be added in the future.
The service fields are available in the record's filtering and transform steps.

This design avoids:

building cyclic structures in the JSON response
transforming the service fields in the Flatten transformation.

Explicitly clean the service field(s) after the transform step makes the service fields:

local for the filter and transform steps
not visible to the next mapping and store steps (as they should be)
not visible in the tests beyond the test_record_selector (as they should be)
This allows the record transformation logic to define its "local variables" to reuse
some interim calculations.

The contract of body parsing seems irregular in representing the cases of bad JSON, no JSON and empty JSON.
It cannot be unified as that irregularity is already used.

Updated the development environment setup documentation

to organize and present the setup steps explicitly
to avoid misunderstandings and wasted efforts.

Updated CONTRIBUTING.md to:

collect and organize the knowledge on running the test locally;
state the actual testing steps;
clarify and make explicit the procedures and steps.

The unit, integration, and acceptance tests in this exact version fail due to pipeline configuration issues, not related to the change themselves.

Summary by CodeRabbit

New Features
- Introduced an enhanced record extraction mode that improves handling of nested data and includes contextual reference fields.
Bug Fixes / Improvements
- Enhanced error reporting during JSON parsing and introduced default behaviors to prevent unexpected interruptions.
- Streamlined pagination handling to manage empty responses more reliably.
Documentation
- Updated contributing guidelines, release notes, and installation instructions for improved clarity.
Tests
- Expanded test coverage to validate the new extraction, transformation, and state management behaviors.
Chores
- Upgraded formatting tool dependencies.

At record extraction step, in each record add the service field $root holding a reference to: * the root response object, when parsing JSON format * the original record, when parsing JSONL format that each record to process is extracted from. More service fields could be added in future. The service fields are available in the record's filtering and transform steps. Avoid: * reusing the maps/dictionaries produced, thus avoid building cyclic structures * transforming the service fields in the Flatten transformation. Explicitly cleanup the service field(s) after the transform step, thus making them: * local for the filter and transform steps * not visible to the next mapping and store steps (as they should be) * not visible in the tests beyond the test_record_selector (as they should be) This allows the record transformation logic to define its "local variables" to reuse some interim calculations. The contract of body parsing seems irregular in representing the cases of bad JSON, no JSON and empty JSON. Cannot be unified as that that irregularity is already used. Update the development environment setup documentation * to organize and present the setup steps explicitly * to avoid misunderstandings and wasted efforts. Update CONTRIBUTING.md to * collect and organize the knowledge on running the test locally. * state the actual testing steps. * clarify and make explicit the procedures and steps. The unit, integration, and acceptance tests in this exactly version succeed under Fedora 41, while one of them fails under Oracle Linux 8.7. not related to the contents of this PR. The integration tests of the CDK fail due to missing `secrets/config.json` file for the Shopify source. See airbytehq#197

coderabbitai · 2025-01-13T11:02:16Z

📝 Walkthrough

Walkthrough

This pull request introduces a variety of updates across the Airbyte CDK. The JSON decoding logic has been improved with enhanced error logging and simplified JSON body parsing. Several extractor modules have been extended: DpathExtractor now features new methods for updating the response body and individual records, and service key management functions were added. A new extractor, DpathEnhancingExtractor, has been introduced along with its corresponding schema, factory method, parser mappings, and tests. In addition, pagination strategies, documentation, and dependency versions have been refined and updated.

Changes

File(s)	Change Summary
`airbyte_cdk/.../decoders/json_decoder.py` `unit_tests/.../decoders/test_json_decoder.py`	Updated JSON decoding to improve error handling and simplify JSON body parsing; test cases modified to update parameter names and expected output.
`airbyte_cdk/.../extractors/dpath_extractor.py` `airbyte_cdk/.../extractors/record_extractor.py` `airbyte_cdk/.../extractors/record_selector.py` `unit_tests/.../extractors/test_dpath_extractor.py` `unit_tests/.../extractors/test_record_extractor.py` `unit_tests/.../extractors/test_record_selector.py`	Revised extraction logic by replacing certain empty checks, introducing new methods (`update_body`, `update_record`), and adding service key management functions (e.g., `SERVICE_KEY_PREFIX`, `add_service_key`, `exclude_service_keys`, `is_service_key`, `remove_service_keys`); RecordSelector updated to filter out service keys before normalization.
`airbyte_cdk/.../transformations/flatten_fields.py` `unit_tests/.../transformations/test_flatten_fields.py`	Integrated the `is_service_key` function into the flattening transformation to preserve service keys; added new tests to verify behavior with list and set values.
`airbyte_cdk/.../requesters/paginators/strategies/cursor_pagination_strategy.py` `airbyte_cdk/.../requesters/paginators/strategies/offset_increment.py` `airbyte_cdk/.../requesters/paginators/strategies/page_increment.py` `unit_tests/.../retrievers/test_simple_retriever.py`	Refined pagination logic: ensuring consistent assignment of `page_size` and providing a default value for next page tokens; updated tests now assert that retriever state reflects completed full refresh sync.
`airbyte_cdk/__init__.py` `airbyte_cdk/.../declarative_component_schema.yaml` `airbyte_cdk/.../extractors/__init__.py` `airbyte_cdk/.../extractors/dpath_enhancing_extractor.py` `airbyte_cdk/.../models/declarative_component_schema.py` `airbyte_cdk/.../parsers/manifest_component_transformer.py` `airbyte_cdk/.../parsers/model_to_component_factory.py` `unit_tests/.../extractors/test_dpath_enhancing_extractor.py`	Introduced a new extractor type, DpathEnhancingExtractor, with methods to update the response body (including recursive parent setting) and modify extracted records by adding service fields; corresponding schema, parser mappings, factory method, and tests were added, and the public API was updated to export the new component.
`docs/CONTRIBUTING.md` `docs/RELEASES.md` `package.json`	Updated documentation for clarity and organization (including setup instructions and formatting adjustments) and upgraded the Prettier dependency version in package.json.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Extractor as DpathEnhancingExtractor
  participant Selector as RecordSelector

  Client->>Extractor: Receive HTTP response
  Extractor->>Extractor: update_body() (set parent context recursively)
  Extractor->>Extractor: update_record() (add root service field)
  Extractor-->>Client: Yield enhanced record
  Client->>Selector: Submit record for filtering and normalization
  Selector->>Selector: remove_service_keys() from record
  Selector-->>Client: Return normalized record

Suggested labels

enhancement

Suggested reviewers

maxi297 – Would you be available to review these changes, wdyt?
bnchrch – Could you take a look at the modifications when you have a moment, wdyt?

✨ Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (17)

airbyte_cdk/sources/declarative/extractors/record_extractor.py (3)
10-14: Small typo in the documentation

There's a typo in "transormation" -> "transformation". Otherwise, the documentation clearly explains the convention and lifecycle of service fields.
-- The service fields are kept only during the record's filtering and transormation.
++ The service fields are kept only during the record's filtering and transformation.
21-27: Consider removing type ignore and adding return type annotation

The # type: ignore[no-untyped-def] seems unnecessary here. What do you think about adding a return type annotation of None since the function modifies the dict in place? wdyt?
-def remove_service_keys(mapping: dict[str, Any]):  # type: ignore[no-untyped-def]
+def remove_service_keys(mapping: dict[str, Any]) -> None:
34-35: Consider removing type ignore and improving error message

Similar to above, the type ignore seems unnecessary. Also, would it be helpful to include the actual mapping in the assertion message to aid debugging? wdyt?
-def assert_service_keys_exist(mapping: Mapping[str, Any]):  # type: ignore[no-untyped-def]
-    assert mapping != exclude_service_keys(mapping), "The mapping should contain service keys"
+def assert_service_keys_exist(mapping: Mapping[str, Any]) -> None:
+    assert mapping != exclude_service_keys(mapping), f"The mapping should contain service keys, got: {mapping}"
airbyte_cdk/sources/declarative/transformations/flatten_fields.py (1)
36-44: Consider adjusting indentation for better readability

The logic looks good! The service keys are correctly preserved. Would you consider adjusting the indentation of the else clause to match the if condition for better readability? wdyt?
                    if not is_service_key(current_key):
                        new_key = (
                            f"{parent_key}.{current_key}"
                            if (current_key in transformed_record or force_with_parent_name)
                            else current_key
                        )
                        stack.append((value, new_key))
-                    else:  # transfer the service fields without change
-                        transformed_record[current_key] = value
+                    else:
+                        # transfer the service fields without change
+                        transformed_record[current_key] = value
unit_tests/sources/declarative/transformations/test_flatten_fields.py (1)
50-67: Consider adding test cases for nested service keys

The test coverage for service keys looks good! Would you consider adding a test case for nested objects containing service keys to ensure they're handled correctly at any depth? For example:
(
    {
        SERVICE_KEY_PREFIX + "root": {
            "nested": {SERVICE_KEY_PREFIX + "child": "value"}
        },
        "regular": "field"
    },
    {
        SERVICE_KEY_PREFIX + "root": {
            "nested": {SERVICE_KEY_PREFIX + "child": "value"}
        },
        "regular": "field"
    }
)
unit_tests/sources/declarative/extractors/test_record_extractor.py (2)
62-67: Consider catching specific assertion error

The test looks good, but would it be safer to catch the specific AssertionError instead of using a bare except? This would prevent masking other unexpected exceptions. wdyt?
    try:
        assert_service_keys_exist(original)
        success = False
-    except:  # OK, expected
+    except AssertionError:  # OK, expected
        success = True
71-76: Consider adding edge cases for service key detection

The basic test cases look good! Would you consider adding some edge cases to ensure robust handling of special characters and edge cases? For example:
def test_service_key_edge_cases():
    assert is_service_key(SERVICE_KEY_PREFIX)  # just the prefix
    assert is_service_key(SERVICE_KEY_PREFIX + "$nested")  # double prefix
    assert not is_service_key("prefix" + SERVICE_KEY_PREFIX)  # prefix in middle
airbyte_cdk/sources/declarative/decoders/json_decoder.py (1)
38-41: Consider making stack trace logging configurable?

The debug logging with stack_info=True provides great debugging context but might be verbose in production. What do you think about making it configurable through a parameter, wdyt?
-            logger.debug("Response to parse: %s", response.text, exc_info=True, stack_info=True)
+            logger.debug(
+                "Response to parse: %s",
+                response.text,
+                exc_info=True,
+                stack_info=self.parameters.get("debug_stack_trace", False)
+            )
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2)
23-29: Consider using dict constructor for a cleaner copy?

The implementation safely handles both dict and non-dict cases. For the dict copy, what do you think about using the dict constructor for a slightly cleaner approach, wdyt?
-        copy = {k: v for k, v in record.items()}
+        copy = dict(record)
92-109: Consider consolidating the yield statements?

The implementation handles all cases well. What do you think about consolidating the yield statements for better maintainability, wdyt?
-                if isinstance(extracted, list):
-                    for record in extracted:
-                        yield update_record(record, root_response)
-                elif isinstance(extracted, dict):
-                    yield update_record(extracted, root_response)
-                elif extracted:
-                    yield extracted
-                else:
-                    yield from []
+                if not extracted:
+                    return
+                records = [extracted] if isinstance(extracted, dict) else extracted if isinstance(extracted, list) else [extracted]
+                yield from (update_record(record, root_response) for record in records)
airbyte_cdk/sources/declarative/extractors/record_selector.py (1)

112-113: Consider adding a docstring to explain the service fields removal step?

The addition of service fields removal before normalization is a significant change in the data processing pipeline. Would you consider adding a brief docstring to explain why this step is necessary here? wdyt?
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)
196-201: Consider extracting the service key validation into a helper function?

The service key validation logic is repeated across test cases. Would it make sense to extract this into a helper function to improve maintainability? wdyt?
+def validate_service_keys(records):
+    for record in records:
+        if record != {}:
+            assert_service_keys_exist(record)
+    return [exclude_service_keys(record) for record in records]
+
 def test_dpath_extractor(field_path: List, decoder: Decoder, body, expected_records: List):
     # ...
-    for record in actual_records:
-        if record != {}:
-            assert_service_keys_exist(record)
-
-    actual_records = [exclude_service_keys(record) for record in actual_records]
+    actual_records = validate_service_keys(actual_records)
unit_tests/sources/declarative/extractors/test_record_selector.py (1)

169-171: Consider consistent whitespace around service key operations?

The whitespace around service key operations differs between test methods. Would you like to make it consistent? For example, always having one blank line before and after? wdyt?

Also applies to: 244-245
unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (1)
325-325: Nice addition of deduplication logic! 👍

The deduplication using dictionary comprehension with orjson.dumps as keys is elegant. Would you consider extracting this into a helper function since it's used twice in the test? Something like _dedupe_records(records), wdyt?
+def _dedupe_records(records):
+    return list({orjson.dumps(record): record for record in records}.values())

 # Usage:
-cumulative_records_state_deduped = list(
-    {orjson.dumps(record): record for record in cumulative_records_state}.values()
-)
+cumulative_records_state_deduped = _dedupe_records(cumulative_records_state)
docs/CONTRIBUTING.md (2)
97-115: Comprehensive test execution instructions! 🎯

The detailed breakdown of test commands with explanations is very helpful. Consider adding a note about expected test execution times for the different commands to help developers plan their workflow, wdyt?

🧰 Tools

🪛 LanguageTool

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

167-167: Minor: Missing comma in the sentence
-On the **Settings / Domains** page find the subdomain
+On the **Settings / Domains** page, find the subdomain
🧰 Tools

🪛 LanguageTool

[uncategorized] ~167-~167: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)
docs/RELEASES.md (1)

60-61: Clear instructions for testing pre-release versions

The instructions for testing pre-release versions are well-structured. Consider adding a note about the expected time for the publishing pipeline to complete, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e7802a and 4cdc87c.

⛔ Files ignored due to path filters (2)

package-lock.json is excluded by !**/package-lock.json
poetry.lock is excluded by !**/*.lock

📒 Files selected for processing (17)

airbyte_cdk/sources/declarative/decoders/json_decoder.py (1 hunks)
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2 hunks)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
airbyte_cdk/sources/declarative/extractors/record_selector.py (3 hunks)
airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1 hunks)
airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2 hunks)
docs/CONTRIBUTING.md (5 hunks)
docs/RELEASES.md (2 hunks)
package.json (1 hunks)
unit_tests/sources/declarative/decoders/test_json_decoder.py (1 hunks)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (1 hunks)
unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2 hunks)
unit_tests/sources/declarative/test_manifest_declarative_source.py (2 hunks)
unit_tests/sources/declarative/transformations/test_flatten_fields.py (2 hunks)

✅ Files skipped from review due to trivial changes (2)

package.json
unit_tests/sources/declarative/test_manifest_declarative_source.py

🧰 Additional context used

🪛 LanguageTool

docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[uncategorized] ~167-~167: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

[style] ~269-~269: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

docs/RELEASES.md

[grammar] ~68-~68: Did you mean the noun “publishing”?
Context: ...ng Low-Code Python connectors Once the publish pipeline has completed, set the version...

(PREPOSITION_VERB)

🪛 Markdownlint (0.37.0)

docs/CONTRIBUTING.md

140-140: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

239-239: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

86-86: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (7)

GitHub Check: Check: 'source-pokeapi' (skip=false)
GitHub Check: Check: 'source-the-guardian-api' (skip=false)
GitHub Check: Check: 'source-shopify' (skip=false)
GitHub Check: Check: 'source-hardcoded-records' (skip=false)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

🔇 Additional comments (14)

airbyte_cdk/sources/declarative/decoders/json_decoder.py (1)

47-50: Nice simplification! 👍

The use of yield from makes the code more elegant and Pythonic. Great improvement!

airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1)

73-73: Good defensive programming! 👍

Adding a default empty dict to next() prevents StopIteration exceptions when the decoder yields no items. This aligns well with the JsonDecoder changes.

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2)

12-20: Clean implementation of service field! 👍

Good use of the service key prefix convention for the root field. The implementation aligns well with the PR objectives.

88-90: Clear empty case handling! 👍

The explicit empty dict check and the explanatory comment make the contract compliance clear.

unit_tests/sources/declarative/decoders/test_json_decoder.py (2)

16-27: Test cases align well with the new behavior! 👍

The parameter rename to expected_json and updated test cases clearly reflect the decoder's behavior.

34-48: Great test coverage! 👍

The additional test cases and descriptive IDs improve the test suite's comprehensiveness and readability.

airbyte_cdk/sources/declarative/extractors/record_selector.py (2)

11-14: LGTM! Clean import organization.

The imports are well-organized and the new exclude_service_keys import aligns with the PR's objective of managing service fields.

162-167: LGTM! Clean implementation of service keys removal.

The _remove_service_keys method is simple, focused, and effectively uses the imported exclude_service_keys function. The implementation aligns well with the PR's objective of making service fields local to filtering and transformation.

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (2)

18-21: LGTM! Well-organized imports.

The new imports for service key management functions are properly organized.

40-94: Great test coverage for edge cases!

The test cases thoroughly cover various scenarios including empty strings, objects, arrays, and nested structures. The parameterization is well-organized and the test data is clear.

unit_tests/sources/declarative/extractors/test_record_selector.py (2)

12-19: LGTM! Clean import organization.

The imports are well-organized and properly grouped.

50-79: Great test coverage for nested records!

The test cases thoroughly cover scenarios with nested records and properly validate the behavior with and without filters.

unit_tests/sources/declarative/retrievers/test_simple_retriever.py (1)

247-247: LGTM! Clear state management assertions.

The assertions properly verify that the state is updated to indicate sync completion after reading records.

Also applies to: 362-362

docs/CONTRIBUTING.md (1)

17-23: Great addition of Python version requirements!

The explicit Python version requirements and Fedora-specific installation instructions make it easier for contributors to get started.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (7)

airbyte_cdk/sources/declarative/extractors/record_extractor.py (4)
10-14: Documentation looks great! Would you consider adding an example? 🤔

The convention is well-documented. To make it even more helpful for developers, what do you think about adding a simple example of a service field, like $root? wdyt?

21-27: The implementation looks solid! Quick question about typing 🤔

The # type: ignore[no-untyped-def] comment seems unnecessary since the function has type hints. Could we remove it? wdyt?

30-31: How about using str.startswith() for better readability? 🤔
-    return key.find(SERVICE_KEY_PREFIX) == 0
+    return key.startswith(SERVICE_KEY_PREFIX)
Both work the same way, but startswith() might be more intuitive. What do you think?

34-35: The assertion looks good! A couple of suggestions to consider 🤔
The # type: ignore[no-untyped-def] seems unnecessary since we have type hints
Would you like to make the assertion message more descriptive? Maybe something like:
assert mapping != exclude_service_keys(mapping), f"The mapping should contain at least one service key with prefix '{SERVICE_KEY_PREFIX}'"
What do you think?
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3)
23-29: The implementation looks solid! A few enhancement ideas 🤔

Would you like to add a docstring explaining the function's purpose and return type?

We could narrow the return type using Union:
-def update_record(record: Any, root: Any) -> Any:
+def update_record(record: Any, root: Any) -> Union[dict[str, Any], Any]:
What are your thoughts on these suggestions?

88-91: The empty response handling looks good! One suggestion for clarity 🤔

What do you think about making the contract more explicit in the comment? Maybe:
-            # An empty/invalid JSON parsed, keep the contract
+            # Return empty dict for empty/invalid JSON to maintain the contract that we always yield a mapping
92-109: The extraction logic looks solid! A couple of suggestions to consider 🤔

Would you like to add type hints for root_response and extracted?

The elif extracted: condition on line 106 might be more explicit as elif not isinstance(extracted, (list, dict)) and extracted: to clearly show the intent. wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4cdc87c and 7b6ce9b.

⛔ Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

📒 Files selected for processing (4)

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2 hunks)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2 hunks)
unit_tests/sources/declarative/transformations/test_flatten_fields.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

unit_tests/sources/declarative/transformations/test_flatten_fields.py
airbyte_cdk/sources/declarative/transformations/flatten_fields.py

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Check: 'source-shopify' (skip=false)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

🔇 Additional comments (2)

airbyte_cdk/sources/declarative/extractors/record_extractor.py (1)

17-18: Clean and efficient implementation! 👍

Nice use of dictionary comprehension and type hints.

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (1)

12-20: Clean imports and clear constant definition! 👍

Good practice using the SERVICE_KEY_PREFIX to construct RECORD_ROOT_KEY.

rpopov · 2025-01-20T09:12:25Z

Still #197 is a problem. It blocks the build.

Created a DPath Enhancing Extractor Refactored the record enhancement logic - moved to the extracted class Split the tests of DPathExtractor and DPathEnhancingExtractor Fix the failing tests: FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_custom_components[test_create_custom_component_with_subcomponent_that_uses_parameters] FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_custom_components_do_not_contain_extra_fields FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_parse_custom_component_fields_if_subcomponent FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_page_increment FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_offset_increment FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[simple_unstructured_scenario] FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[no_file_extension_unstructured_scenario] They faile because of comparing string and int values of the page_size (public) attribute. Imposed an invariant: on construction, page_size can be set to a string or int keep only values of one type in page_size for uniform comparison (convert the values of the other type) _page_size holds the internal / working value ... unless manipulated directly. Merged: feat(low-code concurrent): Allow async job low-code streams that are incremental to be run by the concurrent framework (airbytehq#228) fix(low-code): Fix declarative low-code state migration in SubstreamPartitionRouter (airbytehq#267) feat: combine slash command jobs into single job steps (airbytehq#266) feat(low-code): add items and property mappings to dynamic schemas (airbytehq#256) feat: add help response for unrecognized slash commands (airbytehq#264) ci: post direct links to html connector test reports (airbytehq#252) (airbytehq#263) fix(low-code): Fix legacy state migration in SubstreamPartitionRouter (airbytehq#261) fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254) feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236) feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111) fix: don't mypy unit_tests (airbytehq#241) fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225) feat(concurrent cursor): attempt at clamping datetime (airbytehq#234) fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254) feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236) feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111) fix: don't mypy unit_tests (airbytehq#241) fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225) feat(concurrent cursor): attempt at clamping datetime (airbytehq#234) ci: use `ubuntu-24.04` explicitly (resolves CI warnings) (airbytehq#244) Fix(sdm): module ref issue in python components import (airbytehq#243) feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174) chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133) docs: comments on what the `Dockerfile` is for (airbytehq#240) chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237) Fix(sdm): module ref issue in python components import (airbytehq#243) feat(low-code): add DpathFlattenFields (airbytehq#227) feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174) chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133) docs: comments on what the `Dockerfile` is for (airbytehq#240) chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)

Trigger a new build. Hopefully, the integration test infrastructure is fixed.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (11)

airbyte_cdk/sources/declarative/extractors/dpath_enhancing_extractor.py (2)

72-108: Performance caution when reshaping large JSON objects.
Recursively constructing nested dictionaries and adding the parent service key could be computationally expensive for very large or deeply nested JSON objects. Perhaps we could optionally disable this recursive parent assignment for known large data sets, wdyt?

135-142: Collision avoidance with existing 'root' fields.
update_record sets a service key named "root". If the record already contains a key named root, this might overwrite it or create confusion. Maybe prefix it with a special character or handle collisions differently, wdyt?

airbyte_cdk/sources/declarative/extractors/__init__.py (1)

21-21: Confirm __all__ alignment with project naming conventions.
DpathEnhancingExtractor is appended to __all__. This is consistent, though sometimes teams prefer alphabetical order. Would you like to reorder these entries to maintain a standard format, wdyt?

airbyte_cdk/sources/declarative/requesters/paginators/strategies/page_increment.py (1)

34-42: Simplify page_size initialization further?
The updated logic is easier to follow, but the flow around lines 34–42 could be extracted into a helper method for clarity. That might remove the nested ifs and keep it tidy, wdyt?

unit_tests/sources/declarative/extractors/test_dpath_enhancing_extractor.py (1)

43-340: Great test coverage! A few suggestions to consider.

The test suite is comprehensive and well-structured. Really nice job on covering edge cases! 👍

A few minor suggestions to consider:

Would it be helpful to add docstrings to explain the purpose of complex test cases?

What about adding some negative test cases to verify error handling?

Consider grouping related test cases using pytest's @pytest.mark.group decorator for better organization

What are your thoughts on these suggestions?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
1824-1842: The implementation looks good, but should we add some validation?

The implementation follows the pattern of other extractors, but would it be helpful to add validation for the field path to ensure it's not empty? wdyt?
 def create_dpath_enhancing_extractor(
     self,
     model: DpathEnhancingExtractorModel,
     config: Config,
     decoder: Optional[Decoder] = None,
     **kwargs: Any,
 ) -> DpathEnhancingExtractor:
+    if not model.field_path:
+        raise ValueError("field_path cannot be empty")
     if decoder:
         decoder_to_use = decoder
     else:
         decoder_to_use = JsonDecoder(parameters={})
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1)
29-40: The recursive implementation looks good, but should we add a depth limit?

The exclude_service_keys function handles nested structures well, but would it be worth adding a max depth parameter to prevent potential stack overflow with deeply nested structures? wdyt?
-def exclude_service_keys(struct: Any) -> Any:
+def exclude_service_keys(struct: Any, max_depth: int = 100) -> Any:
     """
     :param struct: any object/JSON structure
+    :param max_depth: maximum recursion depth to prevent stack overflow
     :return: a copy of struct without any service fields at any level of nesting
     """
+    if max_depth <= 0:
+        return struct
     if isinstance(struct, dict):
-        return {k: exclude_service_keys(v) for k, v in struct.items() if not is_service_key(k)}
+        return {k: exclude_service_keys(v, max_depth - 1) for k, v in struct.items() if not is_service_key(k)}
     elif isinstance(struct, list):
-        return [exclude_service_keys(v) for v in struct]
+        return [exclude_service_keys(v, max_depth - 1) for v in struct]
     else:
         return struct
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2)

73-75: Consider adding a docstring to explain the empty body contract?

The empty body handling looks good, but adding a docstring might help future maintainers understand why we need to maintain this contract, wdyt?

98-113: The template methods look great! Consider adding @AbstractMethod?

The update_body and update_record template methods are well-documented. Since they're meant to be overridden by subclasses, would you consider marking them with @AbstractMethod to make the contract more explicit, wdyt?

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)

44-93: Comprehensive nested data test coverage!

The test cases thoroughly cover various nested data structures and extraction patterns. Would you consider adding a test case for very deep nesting to verify any potential recursion limits, wdyt?
airbyte_cdk/sources/declarative/requesters/paginators/strategies/offset_increment.py (1)
55-58: Consider adding type validation for page_size conversion?

The current implementation converts integers to strings without validation. We could add a type check to ensure page_size is either an int or str before conversion, wdyt?
-        self.page_size = str(self.page_size) if isinstance(self.page_size, int) else self.page_size
+        if not isinstance(self.page_size, (int, str, type(None))):
+            raise ValueError(f"page_size must be an integer or string, got {type(self.page_size)}")
+        self.page_size = str(self.page_size) if isinstance(self.page_size, int) else self.page_size

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b6ce9b and 73b8fb8.

📒 Files selected for processing (18)

airbyte_cdk/__init__.py (2 hunks)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
airbyte_cdk/sources/declarative/extractors/__init__.py (2 hunks)
airbyte_cdk/sources/declarative/extractors/dpath_enhancing_extractor.py (1 hunks)
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (1 hunks)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
airbyte_cdk/sources/declarative/extractors/record_selector.py (2 hunks)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2 hunks)
airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (1 hunks)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
airbyte_cdk/sources/declarative/requesters/paginators/strategies/offset_increment.py (1 hunks)
airbyte_cdk/sources/declarative/requesters/paginators/strategies/page_increment.py (1 hunks)
airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2 hunks)
docs/CONTRIBUTING.md (5 hunks)
unit_tests/sources/declarative/extractors/test_dpath_enhancing_extractor.py (1 hunks)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (4 hunks)
unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
unit_tests/sources/declarative/extractors/test_record_selector.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

airbyte_cdk/sources/declarative/transformations/flatten_fields.py
airbyte_cdk/sources/declarative/extractors/record_selector.py
docs/CONTRIBUTING.md
unit_tests/sources/declarative/extractors/test_record_extractor.py
unit_tests/sources/declarative/extractors/test_record_selector.py

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

🔇 Additional comments (17)

airbyte_cdk/sources/declarative/extractors/dpath_enhancing_extractor.py (2)

66-71: Looks clean and straightforward!
The __post_init__ method delegates seamlessly to the parent class, ensuring necessary initialization logic is still applied.

109-134: Double-check handling of None and empty values.
Currently, _set_parent filters out values if they are dictionaries or lists (lines 124-125). If v can be zero, empty string, or False, are we confident none of those cases should be filtered out? Might we risk losing relevant boolean or numeric fields, wdyt?

airbyte_cdk/sources/declarative/extractors/__init__.py (1)

5-7: Importing the new extractor looks good!
Having DpathEnhancingExtractor in a separate file helps keep the code organization clear.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1674-1674: LGTM! The new extractor type is correctly integrated.

The addition of DpathEnhancingExtractor to the extractor union type in RecordSelector is clean and follows the existing pattern.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

80-80: LGTM! Clean import addition.

The import is correctly placed with other extractor imports.

562-562: LGTM! Mapping is correctly added.

The new extractor is properly mapped to its creation method in the factory's initialization.

airbyte_cdk/sources/declarative/extractors/record_extractor.py (4)

10-15: LGTM! Clear convention documentation.

The convention for service fields is well-documented and the prefix constant is appropriately defined.

17-27: LGTM! Clean utility function implementation.

The add_service_key function is well-documented and follows immutable pattern by returning a new dict.

42-44: LGTM! Simple and focused implementation.

The is_service_key function is clean and follows single responsibility principle.

46-49: LGTM! Generator pattern is well-utilized.

The remove_service_keys function efficiently uses generator pattern to process records one at a time.

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (1)

77-78: LGTM! Smart way to preserve the root response.

Storing the original response before modification and passing it through the update chain enables powerful contextual transformations.

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (2)

39-41: Great test coverage for edge cases!

The empty input test cases effectively verify the contract compliance for empty string, object, and array inputs.

125-128: LGTM! Clear JSONL format test cases.

The JSONL-specific test cases effectively verify the handling of array records and line-based parsing.

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

1458-1488: Well-structured component definition!

The DpathEnhancingExtractor component is well-documented and maintains consistency with other extractors. The service fields ($root, $parent) are clearly explained.

2806-2806: LGTM! Clean integration with RecordSelector.

The addition of DpathEnhancingExtractor to the RecordSelector's extractor property enables seamless integration.

airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (1)

50-51: LGTM! Clean addition of DpathEnhancingExtractor mapping

The new entry follows the established pattern and correctly maps the decoder to JsonDecoder.

airbyte_cdk/__init__.py (1)

95-95: LGTM! Clean addition of DpathEnhancingExtractor to public API

The new extractor is properly exposed in both the imports and all list, following the module's conventions.

Also applies to: 237-237

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

docs/CONTRIBUTING.md (4)
17-17: The installation instructions are well-organized, but could we make the Python version requirement more prominent?

The Python version requirement is crucial for development. What do you think about moving it to the very top of the setup instructions and adding a note about version compatibility? Something like:
-1. Clone the CDK repo. If you will be testing connectors, you should clone the CDK into the same parent directory as `airbytehq/airbyte`, which contains the connector definitions.
-1. Make sure your Python version is 3.11
+## Prerequisites
+
+- Python 3.11 (required)
+- Poetry 2.0 or higher
+
+## Setup Steps
+
+1. Clone the CDK repo. If you will be testing connectors, you should clone the CDK into the same parent directory as `airbytehq/airbyte`, which contains the connector definitions.
wdyt? 🤔

Also applies to: 26-86

111-138: The testing section is comprehensive, but could we improve the organization?

The testing instructions are detailed but could be more structured. What do you think about grouping related commands under subsections with descriptive headers? For example:
### Run Unit Tests

+#### Basic Test Commands
- `poetry run pytest` to run all unit tests.
- `poetry run pytest -k <suite or test name>` to run specific unit tests.
+
+#### Performance Optimized Testing
- `poetry run pytest-fast` to run the subset of PyTest tests, which are not flagged as `slow`. (It should take <5 min for fast tests only.)
+
+#### Advanced Options
- `poetry run pytest -s unit_tests` if you want to pass pytest options.
This organization might make it easier for contributors to find the specific command they need. Thoughts? 🤔

🧰 Tools

🪛 LanguageTool

[style] ~116-~116: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~123-~123: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

346-352: Consider adding a note about container cleanup.

When using Docker for MockServer testing, it might be helpful to add a note about cleaning up the container after testing. What do you think about adding:
docker run -d --rm -v $(pwd)/secrets/mock_server_config:/config -p 8113:8113 --env MOCKSERVER_LOG_LEVEL=TRACE --env MOCKSERVER_SERVER_PORT=8113 --env MOCKSERVER_WATCH_INITIALIZATION_JSON=true --env MOCKSERVER_PERSISTED_EXPECTATIONS_PATH=/config/expectations.json --env MOCKSERVER_INITIALIZATION_JSON_PATH=/config/expectations.json mockserver/mockserver:5.15.0

+# To stop the MockServer container when done:
+docker ps | grep mockserver | awk '{print $1}' | xargs docker stop
wdyt? This would help prevent orphaned containers. 🐳

100-100: Add language specifiers to code blocks.

Some code blocks are missing language specifiers. This affects syntax highlighting and markdown linting.
-```
+```yaml
 connector_image: airbyte/source-shopify:dev    to    connector_image: airbyte/source-shopify:<version>
Also, remove spaces inside code spans:
-`     domain: nw0ipt-vr.myshopify.com`
+`domain: nw0ipt-vr.myshopify.com`
Also applies to: 184-184

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

100-100: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73b8fb8 and c82151b.

📒 Files selected for processing (1)

docs/CONTRIBUTING.md (5 hunks)

🧰 Additional context used

🪛 LanguageTool

docs/CONTRIBUTING.md

[style] ~116-~116: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~123-~123: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~287-~287: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

[locale-violation] ~305-~305: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 markdownlint-cli2 (0.17.2)

docs/CONTRIBUTING.md

100-100: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

155-155: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

184-184: Spaces inside code span elements
null

(MD038, no-space-in-code)

254-254: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

🔇 Additional comments (1)

docs/CONTRIBUTING.md (1)

8-8: LGTM! Nice improvement in capitalization.

The capitalization of "GitHub" has been corrected, which aligns with proper noun conventions.

docs/CONTRIBUTING.md

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (6)

docs/CONTRIBUTING.md (6)
16-81: Setup instructions could be more explicit about Python version requirements, wdyt?

The Python version requirement is mentioned but could be more explicit about whether this is a minimum or exact version requirement. Consider clarifying if versions > 3.11 are also supported.
-1. Make sure your Python version is 3.11
+1. Make sure your Python version is 3.11 or higher
93-101: Consider adding a note about version synchronization, what do you think?

The instructions about changing the connector image version could benefit from a note explaining why version synchronization is important and what problems it prevents.
 where the **version** comes from airbyte/airbyte-integrations/connectors/source-shopify/metadata.yaml, as
-the value of the **dockerImageTag** parameter.
+the value of the **dockerImageTag** parameter. This ensures version consistency between the connector image and its metadata, preventing potential compatibility issues.
🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

95-95: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

106-124: The testing section looks great, but what about adding examples? 🤔

The testing instructions are clear, but could be even more helpful with example commands showing typical usage patterns. For instance, showing how to run a specific test suite or how to use the coverage report.
 - `poetry run pytest` to run all unit tests.
+Example:
+```bash
+# Run a specific test suite
+poetry run pytest unit_tests/test_source.py
+
+# Run tests with coverage report
+poetry run pytest --cov=airbyte_cdk
+```
🧰 Tools

🪛 LanguageTool

[style] ~111-~111: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~118-~118: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

179-181: Fix code block formatting for better readability?

The domain and shop examples are using inconsistent formatting. Consider using proper code blocks for better readability.
-    domain: nw0ipt-vr.myshopify.com
-    shop: nw0ipt-vr
+```yaml
+domain: nw0ipt-vr.myshopify.com
+shop: nw0ipt-vr
+```
🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

179-179: Spaces inside code span elements
null

(MD038, no-space-in-code)

339-345: Consider adding a troubleshooting section for MockServer setup?

The MockServer setup instructions are good, but common issues and their solutions could be helpful. For example, what to do if the port is already in use or if Docker can't mount the volume.
 HTTP requests to `localhost:8113/data` should now return the body defined in the expectations file. To test this, the implementer either has to change the code that defines the base URL for the Python source or update the `url_base` from low-code. With the Connector Builder running in docker, you will have to use domain `host.docker.internal` instead of `localhost` as the requests are executed within docker.

+### Troubleshooting MockServer Setup
+
+- If port 8113 is already in use, you can change it to another port in both the docker command and your configuration
+- If you see volume mount errors, ensure the `secrets/mock_server_config` directory exists and has proper permissions
+- For Windows users using WSL2, you might need to use `%cd%` instead of `$(pwd)` in the volume mount
298-298: Minor style suggestion: consider using American English consistently?

For consistency with other documentation, consider using "take a look" instead of "have a look".
-Please have a look at the [Release Management](./RELEASES.md) guide
+Please take a look at the [Release Management](./RELEASES.md) guide
🧰 Tools

🪛 LanguageTool

[locale-violation] ~298-~298: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c82151b and 51dad94.

📒 Files selected for processing (1)

docs/CONTRIBUTING.md (4 hunks)

🧰 Additional context used

🪛 LanguageTool

docs/CONTRIBUTING.md

[style] ~111-~111: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~118-~118: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~280-~280: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

[locale-violation] ~298-~298: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 markdownlint-cli2 (0.17.2)

docs/CONTRIBUTING.md

95-95: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

150-150: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

179-179: Spaces inside code span elements
null

(MD038, no-space-in-code)

249-249: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms (4)

GitHub Check: Pytest (All, Python 3.12, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

661-675: Consider adding more descriptive documentation for the new extractor.

The new DpathEnhancingExtractor class seems to be a key component for implementing the $root service field feature mentioned in the PR objectives. However, its purpose and behavior aren't immediately clear from the documentation. Would you consider adding:

A description of how it enhances the extraction process
An example showing how it adds the $root service field
Documentation about any limitations or considerations when using this extractor

 class DpathEnhancingExtractor(BaseModel):
     type: Literal["DpathEnhancingExtractor"]
     field_path: List[str] = Field(
         ...,
-        description='List of potentially nested fields describing the full path of the field to extract. Use "*" to extract all values from an array. See more info in the [docs](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/record-selector).',
+        description='''List of potentially nested fields describing the full path of the field to extract. Use "*" to extract all values from an array.
+
+        This extractor enhances each extracted record by adding a `$root` service field containing a reference to:
+        - The root response object when parsing JSON format
+        - The original record when parsing JSONL format
+
+        This enables advanced filtering and transformation capabilities by providing access to the complete context.
+        Note: Service fields are local to filtering and transformation steps and are cleaned afterward.
+
+        See more info in the [docs](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/record-selector).
+        ''',
         examples=[
             ["data"],
             ["data", "records"],
             ["data", "{{ parameters.name }}"],
             ["data", "*", "record"],
         ],
         title="Field Path",
     )
     parameters: Optional[Dict[str, Any]] = Field(None, alias="$parameters")

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

1865-1882: Consider reducing code duplication with create_dpath_extractor?

The implementation is correct, but I notice it shares a lot of logic with create_dpath_extractor. Would you consider extracting the common decoder initialization and field path handling into a shared private method to reduce duplication? Something like this perhaps:

+    def _create_dpath_extractor_base(
+        self,
+        decoder: Optional[Decoder],
+        field_path: List[str],
+        config: Config,
+        parameters: Optional[Dict[str, Any]] = None,
+    ) -> Tuple[Decoder, List[Union[InterpolatedString, str]]]:
+        decoder_to_use = decoder if decoder else JsonDecoder(parameters={})
+        model_field_path: List[Union[InterpolatedString, str]] = [x for x in field_path]
+        return decoder_to_use, model_field_path
+
     def create_dpath_extractor(
         self,
         model: DpathExtractorModel,
         config: Config,
         decoder: Optional[Decoder] = None,
         **kwargs: Any,
     ) -> DpathExtractor:
-        if decoder:
-            decoder_to_use = decoder
-        else:
-            decoder_to_use = JsonDecoder(parameters={})
-        model_field_path: List[Union[InterpolatedString, str]] = [x for x in model.field_path]
+        decoder_to_use, model_field_path = self._create_dpath_extractor_base(
+            decoder, model.field_path, config, model.parameters
+        )
         return DpathExtractor(
             decoder=decoder_to_use,
             field_path=model_field_path,
             config=config,
             parameters=model.parameters or {},
         )

     def create_dpath_enhancing_extractor(
         self,
         model: DpathEnhancingExtractorModel,
         config: Config,
         decoder: Optional[Decoder] = None,
         **kwargs: Any,
     ) -> DpathEnhancingExtractor:
-        if decoder:
-            decoder_to_use = decoder
-        else:
-            decoder_to_use = JsonDecoder(parameters={})
-        model_field_path: List[Union[InterpolatedString, str]] = [x for x in model.field_path]
+        decoder_to_use, model_field_path = self._create_dpath_extractor_base(
+            decoder, model.field_path, config, model.parameters
+        )
         return DpathEnhancingExtractor(
             decoder=decoder_to_use,
             field_path=model_field_path,
             config=config,
             parameters=model.parameters or {},
         )

This would make the code more maintainable and reduce the risk of divergence in the future. What do you think?

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1463-1493: New Component Addition: DpathEnhancingExtractor
The new DpathEnhancingExtractor component is added to support the inclusion of local service fields ($root and $parent) for use during the filter and transform phases. This aligns nicely with the PR objective of retaining the original object in each record. Would you consider adding more detailed examples or clarifying comments on how these service fields could be utilized (and later cleaned up) to further assist connector developers? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51dad94 and d3904f7.

📒 Files selected for processing (3)

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2 hunks)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Pytest (Fast)

🔇 Additional comments (4)

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1685-1685: LGTM!

The update to include DpathEnhancingExtractor in the RecordSelector's extractor field union type is correct and necessary for the new feature.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

80-80: LGTM! Clean import addition.

The DpathEnhancingExtractor import is well-placed within the existing extractors import group.

218-220: LGTM! Clean model mapping integration.

The DpathEnhancingExtractorModel import and constructor mapping are properly integrated into the factory's model mapping system.

Also applies to: 562-562

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

2808-2811: RecordSelector Update: Inclusion of DpathEnhancingExtractor
By adding the reference to DpathEnhancingExtractor in the RecordSelector component’s extractor property, the PR now enables users to select this new extractor. Could you also verify that corresponding tests cover this new behavior—ensuring that the $root field is correctly populated and later removed during transformations? wdyt?

Trigger a new build

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (23)

docs/CONTRIBUTING.md (23)

16-22: Python Version Requirement Added:
Great job specifying that contributors must use Python 3.11. Would it be helpful to include a link to Python 3.11 release notes or a brief note on compatibility issues when using different versions? wdyt?

24-30: Pip Installation Instructions for Fedora:
The new instructions to install pip on Fedora are clear. Would you consider adding a quick note on verifying the installed pip version (for example, by running pip --version) to help less experienced users? wdyt?

32-47: Poetry Installation Details:
The alternative methods for installing Poetry (via pip and via curl) are nicely detailed. Would you consider including a brief comment on when one method might be preferable over the other or adding a link to further documentation? wdyt?

50-63: Using the Python 3.11 Environment:
The section showing how to use a Python 3.11 environment with Poetry is very clear. Would you consider adding a troubleshooting tip (for example, what to do if the correct environment isn’t activated) to further assist contributors? wdyt?

65-65: Cloning the CDK Repository:
The instruction to clone the CDK repo is concise. Do you think adding an example git clone command or a reference link to our Git documentation might further help newcomers? wdyt?

66-70: Installing Unit Test Prerequisites:
The instruction to run poetry install --all-extras is straightforward. Would you consider mentioning briefly why the --all-extras flag is required, so that contributors understand its purpose? wdyt?

73-81: RHEL Module Setup Instructions:
These instructions to execute commands for RHEL (or compatible OS) users are clear. Would you consider adding a short explanation of what the iptable_nat module does and why it’s needed? wdyt?

83-87: 'See Also' References:
The “See also” section with links (including Dagger/Podman integration and issue #197) is useful. Could you please verify that these links are current and consider removing any redundant references if necessary? wdyt?

91-97: Editing Shopify Connector Config:
The guidance on editing the acceptance test configuration for the Shopify connector is clear. Would you consider providing a brief sample diff or additional explanation to help users understand the required changes? wdyt?

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

95-95: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

109-113: Running Unit Tests:
The updated unit test commands are concise and useful. Would you consider varying the sentence structure slightly to avoid repeated phrasing? wdyt?

🧰 Tools

🪛 LanguageTool

[style] ~111-~111: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

114-119: Code Formatting Commands:
In this section, several sentences start similarly. Would you consider rewording a couple of these sentences for improved readability, or adding transitions between them? wdyt?

🧰 Tools

🪛 LanguageTool

[style] ~118-~118: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

126-133: More Tools and Options:
This section is quite helpful. Would it add clarity to include a short explanation of when to use each tool before listing the commands? wdyt?

134-139: Local Testing Against Connector Changes:
The instructions for testing locally with CDK changes are detailed. Would you consider including a brief troubleshooting tip for common issues encountered during this process? wdyt?

150-189: Integration Test Preparation Steps:
This lengthy block is very detailed and informative. Would you consider breaking it into sub-sections (for example, “Project Setup,” “Configuration Changes,” and “File Creation”) to enhance readability for contributors? wdyt?

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

150-150: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

179-179: Spaces inside code span elements
null

(MD038, no-space-in-code)

199-211: Running Connector Tests:
The step-by-step commands for executing connector tests are clear. Would adding a note on how to interpret test outcomes or what to do when a test fails be beneficial? wdyt?

213-223: Shopify Connector Example:
The example provided for the Shopify connector is practical. Would you consider clarifying the placeholder <connector name> (or providing an example) so that the instructions are completely unambiguous? wdyt?

233-243: Extended Acceptance Test Examples:
This block provides various command options (including debug and timeout). Would it be helpful to label these options explicitly (e.g., "With Debug Option", "With Timeout Option") for even easier reference? wdyt?

251-267: airbyte-ci CLI and Build Commands:
The instructions for installing the airbyte-ci CLI and building the connector image using the local CDK are well organized. Would you consider adding a link or further reference to detailed documentation on airbyte-ci for contributors who want to dive deeper? wdyt?

272-276: Working with Poe Tasks:
This section explains Poe Tasks clearly. Would you consider including a short troubleshooting tip or common pitfalls when using Poe tasks to further support contributors? wdyt?

280-285: Auto-Generating the Declarative Schema File:
The instructions are clear, though the phrasing “you may need to regenerate them” could be more expressive. Would you consider rewording it to, for example, “please regenerate the schemas to ensure they align with the latest changes”? wdyt?

🧰 Tools

🪛 LanguageTool

[style] ~280-~280: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

297-299: Release Management Section Wording:
In the Release Management section, the instruction “please have a look at the Release Management guide” might sound friendlier if changed to “please take a look at...”. Would you consider this minor tweak to align with American English usage? wdyt?

🧰 Tools

🪛 LanguageTool

[locale-violation] ~298-~298: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

311-316: GitHub Slash Commands Formatting:
The list of GitHub slash commands is very useful. Would you consider adjusting the formatting of the /autofix entry—perhaps by ensuring a consistent separation between the command and its description—as done with /poetry-lock? wdyt?

347-351: Connector Acceptance Tests Header:
The header for “Running Connector Acceptance Tests for a single connector in Docker” is clear. Would you consider adding a brief note on any prerequisites (if not mentioned elsewhere) for running these tests? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3904f7 and 0e1e5b3.

📒 Files selected for processing (1)

docs/CONTRIBUTING.md (4 hunks)

🧰 Additional context used

🪛 LanguageTool

docs/CONTRIBUTING.md

[style] ~111-~111: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~118-~118: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~280-~280: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

[locale-violation] ~298-~298: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 markdownlint-cli2 (0.17.2)

docs/CONTRIBUTING.md

95-95: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

150-150: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

179-179: Spaces inside code span elements
null

(MD038, no-space-in-code)

249-249: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Pytest (Fast)

🔇 Additional comments (15)

docs/CONTRIBUTING.md (15)

8-8: Clarify Contribution Avenues:
You updated the bullet point to include “GitHub discussions” (with proper capitalization) as a way to contribute. Would you please confirm that this revised wording aligns with our intended communication and style guidelines? wdyt?

88-88: Docker Installation Note:
The brief note on ensuring Docker is installed locally is succinct. Is this meant to serve only as a reminder, or would adding a link to more detailed Docker installation instructions improve clarity? wdyt?

120-125: Code Linting Instructions:
The linting commands are clearly listed and align well with our best practices. Great work here!

140-141: GitHub Pipelines Integration Note:
The note that the GitHub pipelines run integration tests against the Shopify source is concise and useful. Thank you for the clarity!

144-148: Assumptions Section:
The assumptions about checkout locations and project structure are clearly laid out. This adds valuable context—great job!

225-231: Acceptance Tests Locally:
The instructions for running acceptance tests locally are straightforward and well explained. Excellent work!

245-246: Reverting Test Changes Reminder:
The reminder to revert test changes is a nice touch to ensure cleanup after testing. Looks good to me!

247-250: Test CDK Changes Section Header:
The header and subsequent preparation steps for testing CDK changes against connectors are clear. Thank you for the detailed guidance!

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

249-249: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

269-271: Local CDK Build Note:
The explanation that the local CDK is injected at build time is concise and clear. Great job!

277-277: Duplicate Ruff Configuration Paragraph:
It appears that the paragraph about the Ruff configuration is duplicated. Would you consider removing the redundant copy to improve document clarity? wdyt?

294-296: docs-generate Task Mapping:
The explanation of how the docs-generate task is mapped and what it produces is clear and useful. Nice work here!

302-305: FAQ on Maintainers:
The FAQ entry explaining who the “maintainers” are is clear and informative. No changes needed here—great job!

323-327: MockServer Introduction:
The introduction about using MockServer in place of direct API access is clear and sets the context well. Thanks for including this helpful information!

339-343: Docker Command for MockServer:
The Docker command provided for running MockServer is detailed and should be very useful for our testing. Everything looks good here!

345-346: HTTP Requests Explanation:
The follow-up explanation that HTTP requests to localhost:8113/data will return the defined body is clear. No issues here—great work!

At record extraction step, in each record add the service field $root holding a reference to: * the root response object, when parsing JSON format * the original record, when parsing JSONL format that each record to process is extracted from. More service fields could be added in future. The service fields are available in the record's filtering and transform steps. Avoid: * reusing the maps/dictionaries produced, thus avoid building cyclic structures * transforming the service fields in the Flatten transformation. Explicitly cleanup the service field(s) after the transform step, thus making them: * local for the filter and transform steps * not visible to the next mapping and store steps (as they should be) * not visible in the tests beyond the test_record_selector (as they should be) This allows the record transformation logic to define its "local variables" to reuse some interim calculations. The contract of body parsing seems irregular in representing the cases of bad JSON, no JSON and empty JSON. Cannot be unified as that that irregularity is already used. Update the development environment setup documentation * to organize and present the setup steps explicitly * to avoid misunderstandings and wasted efforts. Update CONTRIBUTING.md to * collect and organize the knowledge on running the test locally. * state the actual testing steps. * clarify and make explicit the procedures and steps. The unit, integration, and acceptance tests in this exactly version succeed under Fedora 41, while one of them fails under Oracle Linux 8.7. not related to the contents of this PR. The integration tests of the CDK fail due to missing `secrets/config.json` file for the Shopify source. See airbytehq#197 Polish Integrate the DpathEnhancingExtractor in the UI of Airbyte. Created a DPath Enhancing Extractor Refactored the record enhancement logic - moved to the extracted class Split the tests of DPathExtractor and DPathEnhancingExtractor Fix the failing tests: FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_custom_components[test_create_custom_component_with_subcomponent_that_uses_parameters] FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_custom_components_do_not_contain_extra_fields FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_parse_custom_component_fields_if_subcomponent FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_page_increment FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_offset_increment FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[simple_unstructured_scenario] FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[no_file_extension_unstructured_scenario] They faile because of comparing string and int values of the page_size (public) attribute. Imposed an invariant: on construction, page_size can be set to a string or int keep only values of one type in page_size for uniform comparison (convert the values of the other type) _page_size holds the internal / working value ... unless manipulated directly. Merged: feat(low-code concurrent): Allow async job low-code streams that are incremental to be run by the concurrent framework (airbytehq#228) fix(low-code): Fix declarative low-code state migration in SubstreamPartitionRouter (airbytehq#267) feat: combine slash command jobs into single job steps (airbytehq#266) feat(low-code): add items and property mappings to dynamic schemas (airbytehq#256) feat: add help response for unrecognized slash commands (airbytehq#264) ci: post direct links to html connector test reports (airbytehq#252) (airbytehq#263) fix(low-code): Fix legacy state migration in SubstreamPartitionRouter (airbytehq#261) fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254) feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236) feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111) fix: don't mypy unit_tests (airbytehq#241) fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225) feat(concurrent cursor): attempt at clamping datetime (airbytehq#234) fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254) feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236) feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111) fix: don't mypy unit_tests (airbytehq#241) fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225) feat(concurrent cursor): attempt at clamping datetime (airbytehq#234) ci: use `ubuntu-24.04` explicitly (resolves CI warnings) (airbytehq#244) Fix(sdm): module ref issue in python components import (airbytehq#243) feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174) chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133) docs: comments on what the `Dockerfile` is for (airbytehq#240) chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237) Fix(sdm): module ref issue in python components import (airbytehq#243) feat(low-code): add DpathFlattenFields (airbytehq#227) feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174) chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133) docs: comments on what the `Dockerfile` is for (airbytehq#240) chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237) formatted Update record_extractor.py Trigger a new build. Hopefully, the integration test infrastructure is fixed. Update CONTRIBUTING.md Trigger a new build

docs/CONTRIBUTING.md unit_tests/sources/declarative/transformations/test_flatten_fields.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (14)

unit_tests/sources/declarative/decoders/test_composite_decoder.py (2)
33-35: LGTM! Would you consider adding a docstring? 🤔

The enhanced function signature with default values looks great! For better maintainability, maybe we could add a docstring explaining the parameters? wdyt?
 def generate_csv(
     encoding: str = "utf-8", delimiter: str = ",", should_compress: bool = False
 ) -> bytes:
+    """Generate CSV test data with configurable encoding, delimiter and compression.
+    
+    Args:
+        encoding (str, optional): Character encoding. Defaults to "utf-8".
+        delimiter (str, optional): CSV field separator. Defaults to ",".
+        should_compress (bool, optional): Whether to gzip compress output. Defaults to False.
+    
+    Returns:
+        bytes: Generated CSV data, optionally compressed
+    """
180-200: Great test coverage! Maybe we could make the test data DRY? 🤔

The parameterized test looks excellent! One tiny suggestion: since the expected data is identical to what's in generate_csv, would you consider extracting it to a shared constant to avoid duplication? wdyt?
+TEST_CSV_DATA = [
+    {"id": "1", "name": "John", "age": "28"},
+    {"id": "2", "name": "Alice", "age": "34"},
+    {"id": "3", "name": "Bob", "age": "25"},
+]
+
 def generate_csv(
     encoding: str = "utf-8", delimiter: str = ",", should_compress: bool = False
 ) -> bytes:
-    data = [
-        {"id": "1", "name": "John", "age": "28"},
-        {"id": "2", "name": "Alice", "age": "34"},
-        {"id": "3", "name": "Bob", "age": "25"},
-    ]
+    data = TEST_CSV_DATA
     ...

 def test_composite_raw_decoder_csv_parser_values(requests_mock, encoding: str, delimiter: str):
     ...
-    expected_data = [
-        {"id": "1", "name": "John", "age": "28"},
-        {"id": "2", "name": "Alice", "age": "34"},
-        {"id": "3", "name": "Bob", "age": "25"},
-    ]
+    expected_data = TEST_CSV_DATA
docs/CONTRIBUTING.md (12)

109-120: Local Development – Unit Test Commands
The “Run Unit Tests” section now provides several command options (including using pytest’s -k and -s flags), which is very useful for different testing situations. Might adding a one-line explanation for when to use each option assist less experienced contributors? wdyt?

🧰 Tools

🪛 LanguageTool

[style] ~118-~118: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

131-135: Additional Tools and Options Reference
The “More tools and options” section offers quick access to useful commands. Would linking to a more detailed description of each tool be beneficial, or is brevity preferred here? wdyt?

143-148: Assumptions and Pipeline Use
The note that GitHub pipelines run the integration tests along with clear assumptions about directory structure provides valuable context. Have you considered adding a simple diagram to visually represent this structure? wdyt?

172-180: Configuring Secrets for Integration Tests
The detailed instructions—complete with an example for the Shopify connector—for creating the secrets/config.json file are very practical. Would you consider revising the formatting of the in-line example (e.g., extra indentation) for clarity? wdyt?

198-209: Running Connector Test Commands
The “Steps:” section with detailed commands for running connector tests is comprehensive. Do you think it would help to annotate which command verifies which aspect of the connector (spec, check, discover, read)? wdyt?

211-220: Shopify Connector Test Example
The example provided for the Shopify source (including running spec, check, discover, and read commands) is clear and instructive. Would a note on expected outputs help ensure users know when a test passes? wdyt?

229-238: Extended Acceptance Test Options
Providing examples with debug and timeout options is highly useful for troubleshooting. Would it be beneficial to include a brief explanation of when to use the timeout option versus the debug option? wdyt?

242-257: Testing CDK Changes Against Connectors
The detailed instructions using the airbyte-ci CLI for building connector images and running tests are very well defined. Would a pointer to common errors during the build/test process be useful for contributors? wdyt?

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

256-256: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

273-279: Generating API Reference Documentation
The section on generating API docs using pdoc is detailed and helpful. Should we perhaps mention how to preview the generated docs locally if issues arise? wdyt?

280-282: Release Management Guidance
The directive to “have a look” at the Release Management guide is clear. Given regional language preferences, would you consider revising it to “Please take a look at…” for consistency with American English? wdyt?

283-299: FAQ Section Clarity and Consistency
The FAQ is well structured and covers key questions. A couple of suggestions:
• Consider varying the starting words of consecutive sentences to enhance readability.
• The bullet for “/autofix” could use consistent spacing (e.g., adding a space after /autofix).
Would these refinements help improve the flow? wdyt?

🧰 Tools

🪛 LanguageTool

[style] ~287-~287: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

302-334: Advanced Topics and Appendix
The advanced topics—covering using MockServer and running connector acceptance tests in Docker—are comprehensive and provide useful real-world guidance. Would including a brief troubleshooting checklist or common pitfalls in these sections further assist contributors? wdyt?

🧰 Tools

🪛 LanguageTool

[locale-violation] ~305-~305: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

[uncategorized] ~317-~317: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...What GitHub slash commands are available and who can run them? Only Airbyte CDK mai...

(COMMA_COMPOUND_SENTENCE_2)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e1e5b3 and 338d117.

📒 Files selected for processing (7)

.github/workflows/pypi_publish.yml (1 hunks)
airbyte_cdk/sources/declarative/auth/oauth.py (2 hunks)
airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (1 hunks)
docs/CONTRIBUTING.md (4 hunks)
unit_tests/sources/declarative/auth/test_oauth.py (3 hunks)
unit_tests/sources/declarative/decoders/test_composite_decoder.py (2 hunks)
unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2 hunks)

✅ Files skipped from review due to trivial changes (1)

.github/workflows/pypi_publish.yml

🧰 Additional context used

🪛 LanguageTool

docs/CONTRIBUTING.md

[style] ~118-~118: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~125-~125: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[uncategorized] ~184-~184: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

[style] ~287-~287: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)

[locale-violation] ~305-~305: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 markdownlint-cli2 (0.17.2)

docs/CONTRIBUTING.md

102-102: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

157-157: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

186-186: Spaces inside code span elements
null

(MD038, no-space-in-code)

256-256: Emphasis used instead of a heading
null

(MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

🔇 Additional comments (32)

unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2)

257-260: LGTM! The state completion assertion looks good.

The added assertion verifies that the state correctly indicates sync completion after reading all records. This aligns well with the PR objective of enhancing record filtering and transformation.

372-375: LGTM! The state completion assertion looks good.

The added assertion verifies that the state correctly indicates sync completion after reading all records. This aligns well with the PR objective of enhancing record filtering and transformation.

unit_tests/sources/declarative/decoders/test_composite_decoder.py (1)

37-39: Nice improvements to the test data and CSV generation! 👍

The changes look great:

Using strings for all fields is more CSV-friendly

Clean separation of compression logic

Configurable delimiter enhances test coverage

Also applies to: 42-43, 48-53

airbyte_cdk/sources/declarative/auth/oauth.py (2)

234-237: LGTM! Nice defensive programming.

The addition of the access token initialization check before returning the expiry date helps prevent potential issues with uninitialized tokens.

239-241: LGTM! Clean and focused helper method.

The method follows the single responsibility principle and has a clear, descriptive name.

airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (1)

264-266: LGTM! Good edge case handling.

The addition of the empty value check with token expiry validation provides a graceful fallback mechanism.

unit_tests/sources/declarative/auth/test_oauth.py (2)

317-344: LGTM! Great test coverage for the new functionality.

The test case thoroughly verifies the behavior when no access token is present but the expiry date is in the future.

304-304: LGTM! Good test enhancement.

Adding the access token value parameter improves the test coverage.

docs/CONTRIBUTING.md (24)

8-8: Refine Contribution Instructions
The updated bullet now explicitly lists “reporting bugs, posting GitHub discussions, opening issues…” which broadens the contribution options. Was this expansion intended to better guide potential contributors? wdyt?

16-17: Specify Python Version Requirement
Introducing “1. Make sure your Python version is 3.11” establishes a clear baseline. Have you considered whether contributors using other versions might need additional guidance or troubleshooting steps? wdyt?

18-23: Fedora Instructions for Python Installation
Including “Fedora 41:” with a dedicated code block to install Python 3.11 is very helpful for users on that platform. Did you verify these commands on recent Fedora releases? wdyt?

24-31: Pip Installation Guidance on Fedora
The step “2. Install pip” along with Fedora-specific commands provides clear direction. Would you consider adding a brief note on what to do if the pip package name or command changes in future Fedora releases? wdyt?

32-47: Multiple Methods for Installing Poetry
The section now offers two alternatives for installing Poetry (via pip and via curl) as well as a Fedora-specific block. This flexibility is great. Would you like to mention which method is preferred in our official recommendations, or should we let users choose freely? wdyt?

48-49: Poe Tasks Note Clarity
The note explaining that “Poe” tasks can be used for common actions is informative. Is the linked “Poe the Poet” documentation up to date with these instructions? wdyt?

50-63: Guidance on Using the Python 3.11 Environment
The clear instructions and code block for verifying and switching the Poetry environment to Python 3.11 will help prevent environment mismatches. Would you consider adding a brief troubleshooting tip if the command fails? wdyt?

65-70: Repository Cloning and Test Prerequisites
The instructions to clone the CDK repo and install the unit test prerequisites are neatly detailed. Do you think a note on branch or version management could further assist contributors? wdyt?

74-88: RHEL Module Loading Instructions
The steps for RHEL (or compatible OS) to load the IP table NAT module—with both a temporary and persistent solution—are very clear. Should we add a reminder that these commands typically require root privileges? wdyt?

92-93: Additional Reference Links
The “See also:” links to the Dagger-Podman integration and CDK Issue 197 offer useful extra context. Do you think any further annotations are needed, or is the current level of detail sufficient? wdyt?

95-98: Docker Installation Recommendation
Emphasizing that Docker should be installed (instead of Podman) is a strong clarification. Is this recommendation documented elsewhere as well, or should we consolidate related notes here? wdyt?

100-107: Acceptance Test Config Update
The instructions for editing the Shopify acceptance-test configuration—with the before/after code block—are precise. Would it help to remind users to verify that no other fields are unintentionally modified during this update? wdyt?

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

102-102: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

121-124: Code Formatting Instructions
The step-by-step formatting commands using ruff (both checking and auto-fixing) are clear and concise. Do these instructions match our latest formatting guidelines closely? wdyt?

126-129: Linting and Type-checking Guidance
Including commands for both Poe-based linting and mypy type-checking provides thorough coverage of code quality. Are there any additional common pitfalls we should warn about in this section? wdyt?

137-141: Integration Testing Scenarios
The description of various scenarios for running integration tests (e.g., running a connector that uses the new feature) is very practical. Would it be helpful to explicitly list prerequisites for these tests? wdyt?

150-156: Acceptance Test Preparation
The “Preparation steps” for running connector acceptance tests (including entering the correct directory and installing dependencies) are straightforward. Would a brief troubleshooting tip (e.g., common errors) be useful here? wdyt?

158-163: Updating Test Connector Configuration
The instructions to modify the <test connector>/pyproject.toml file and replace the airbyte_cdk line are clear. Do you think it’s necessary to mention potential version mismatches during this update? wdyt?

165-170: Reinstalling airbyte_cdk for Testing
The step-by-step commands for reinstalling airbyte_cdk in the test connector directory are concise. Are there any caching issues or additional cleanup steps we might need to warn about? wdyt?

183-192: Example JSON Configuration for Secrets
The JSON snippet is well-formatted and serves as a good reference for contributors. Do you think adding comments within the JSON (if permitted) would further aid understanding? wdyt?

🧰 Tools

🪛 LanguageTool

[uncategorized] ~184-~184: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 markdownlint-cli2 (0.17.2)

186-186: Spaces inside code span elements
null

(MD038, no-space-in-code)

194-196: Reference Links for Testing Resources
Including links to the Acceptance Tests Reference and Connector Acceptance Tests documentation adds valuable context. Would you like to include a brief summary of what each link covers? wdyt?

222-227: Local Acceptance Test Execution
The command block for running acceptance tests locally is straightforward. Do you feel that adding information on environment prerequisites (e.g., necessary Docker settings) might prevent future confusion? wdyt?

240-240: Test Changes Reversion Reminder
A simple reminder to revert test changes is a neat touch to keep the repository clean. Does that cover all scenarios, or should we expand on how to revert if using version control? wdyt?

258-263: Working with Poe Tasks
The explanation regarding the use of Poe tasks (and the option to pre-install Poe via pipx) is clear and friendly. Do you think a short example of a commonly used Poe command might further enhance this section? wdyt?

264-272: Auto-Generating the Declarative Schema File
The instructions for regenerating the schema file from the YAML spec are concise and easy to follow. Would linking to an example of a schema change be helpful for context? wdyt?

unit_tests/sources/declarative/auth/test_oauth.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

1882-1899: Consider reducing code duplication with create_dpath_extractor?

The implementation is almost identical to create_dpath_extractor. What do you think about extracting the common logic into a private helper method to reduce duplication? Something like this, wdyt?

+    def _create_dpath_extractor_base(
+        self,
+        decoder: Optional[Decoder],
+        model_field_path: List[Union[InterpolatedString, str]],
+        config: Config,
+        parameters: Optional[Dict[str, Any]],
+        extractor_class: Type[Union[DpathExtractor, DpathEnhancingExtractor]]
+    ) -> Union[DpathExtractor, DpathEnhancingExtractor]:
+        decoder_to_use = decoder if decoder else JsonDecoder(parameters={})
+        return extractor_class(
+            decoder=decoder_to_use,
+            field_path=model_field_path,
+            config=config,
+            parameters=parameters or {},
+        )
+
     def create_dpath_extractor(
         self,
         model: DpathExtractorModel,
         config: Config,
         decoder: Optional[Decoder] = None,
         **kwargs: Any,
     ) -> DpathExtractor:
-        if decoder:
-            decoder_to_use = decoder
-        else:
-            decoder_to_use = JsonDecoder(parameters={})
         model_field_path: List[Union[InterpolatedString, str]] = [x for x in model.field_path]
-        return DpathExtractor(
-            decoder=decoder_to_use,
-            field_path=model_field_path,
-            config=config,
-            parameters=model.parameters or {},
-        )
+        return self._create_dpath_extractor_base(
+            decoder=decoder,
+            model_field_path=model_field_path,
+            config=config,
+            parameters=model.parameters,
+            extractor_class=DpathExtractor
+        )

     def create_dpath_enhancing_extractor(
         self,
         model: DpathEnhancingExtractorModel,
         config: Config,
         decoder: Optional[Decoder] = None,
         **kwargs: Any,
     ) -> DpathEnhancingExtractor:
-        if decoder:
-            decoder_to_use = decoder
-        else:
-            decoder_to_use = JsonDecoder(parameters={})
         model_field_path: List[Union[InterpolatedString, str]] = [x for x in model.field_path]
-        return DpathEnhancingExtractor(
-            decoder=decoder_to_use,
-            field_path=model_field_path,
-            config=config,
-            parameters=model.parameters or {},
-        )
+        return self._create_dpath_extractor_base(
+            decoder=decoder,
+            model_field_path=model_field_path,
+            config=config,
+            parameters=model.parameters,
+            extractor_class=DpathEnhancingExtractor
+        )

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fabc665 and ceeea8f.

📒 Files selected for processing (1)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)

rpopov mentioned this pull request Jan 13, 2025

Connector/Source: Publish the *response* object in the record transformation like the config, stream_slice, stream_interval, etc. are airbytehq/airbyte#50395

Open

coderabbitai bot reviewed Jan 13, 2025

View reviewed changes

coderabbitai bot approved these changes Jan 13, 2025

View reviewed changes

rpopov mentioned this pull request Jan 13, 2025

feat: Publish the response to process in the context of the record's transformation and filtering steps #193

Closed

rpopov added 2 commits January 18, 2025 23:28

Polish

92bb031

Merge branch 'main' into 50395-2

7b6ce9b

coderabbitai bot reviewed Jan 18, 2025

View reviewed changes

rpopov mentioned this pull request Jan 20, 2025

Shopify acceptance tests fail and block the PRs #197

Closed

rpopov added 7 commits January 21, 2025 09:35

Merge branch 'main' into 50395-2

4d95444

formatted

ea26963

Merge branch 'main' into 50395-2

d50945b

Merge branch 'main' into 50395-2

095b7c1

Merge remote-tracking branch 'refs/remotes/origin/50395-2' into 50395-2

2b5e9a1

Update record_extractor.py

73b8fb8

Trigger a new build. Hopefully, the integration test infrastructure is fixed.

coderabbitai bot reviewed Jan 28, 2025

View reviewed changes

Merge branch 'main' into 50395-2

c82151b

coderabbitai bot requested changes Jan 28, 2025

View reviewed changes

docs/CONTRIBUTING.md Outdated Show resolved Hide resolved

rpopov added 2 commits January 29, 2025 09:14

Merge branch 'main' into 50395-2

fffea7a

Update CONTRIBUTING.md

51dad94

coderabbitai bot reviewed Jan 29, 2025

View reviewed changes

coderabbitai bot approved these changes Jan 29, 2025

View reviewed changes

Merge branch 'airbytehq:main' into 50395-2

d3904f7

coderabbitai bot reviewed Feb 5, 2025

View reviewed changes

Update CONTRIBUTING.md

0e1e5b3

Trigger a new build

coderabbitai bot reviewed Feb 6, 2025

View reviewed changes

rpopov added 5 commits February 7, 2025 11:17

Merge branch 'main' into 50395-2

1b823c5

Merge branch 'airbytehq:main' into 50395-2

987e383

Merge conflicts:

6f62de5

docs/CONTRIBUTING.md unit_tests/sources/declarative/transformations/test_flatten_fields.py

Merge remote-tracking branch 'remotes/origin/50395-2' into 50395-2

338d117

coderabbitai bot requested changes Feb 8, 2025

View reviewed changes

unit_tests/sources/declarative/auth/test_oauth.py Outdated Show resolved Hide resolved

Reoved a duplicate method probably caused by merge

fabc665

coderabbitai bot approved these changes Feb 8, 2025

View reviewed changes

Merge branch 'main' into 50395-2

ceeea8f

coderabbitai bot reviewed Feb 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: In each record to filter and transform, publish a local service field holding the original object the record is extracted from #214

feat: In each record to filter and transform, publish a local service field holding the original object the record is extracted from #214

rpopov commented Jan 13, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 13, 2025 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Suggested labels

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot left a comment

rpopov commented Jan 20, 2025

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

feat: In each record to filter and transform, publish a local service field holding the original object the record is extracted from #214

Are you sure you want to change the base?

feat: In each record to filter and transform, publish a local service field holding the original object the record is extracted from #214

Conversation

rpopov commented Jan 13, 2025 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Jan 13, 2025 • edited Loading

Walkthrough

Changes

Sequence Diagram(s)

Suggested labels

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

rpopov commented Jan 20, 2025

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

rpopov commented Jan 13, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 13, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)