Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/issue-136-process-robustness #140

Merged
merged 10 commits into from
Jan 20, 2025
Merged

Conversation

mpo-vliz
Copy link
Contributor

@mpo-vliz mpo-vliz commented Jan 17, 2025

provides fixes for #136 #137 #138

the hope is that these are removing the current block in providing triples inside the emobon workflow...

Summary by CodeRabbit

Release Notes

  • New Features

    • Added break_on_error option to control error handling during processing
    • Enhanced error handling and logging mechanisms across multiple components
  • Improvements

    • Improved type safety by using Path objects for file path management
    • Streamlined file and directory source handling
    • Updated test suite to use more modern testing practices with pytest
  • Bug Fixes

    • Fixed issues with file path handling and source initialization
    • Improved error detection and reporting in processing workflows
  • Refactoring

    • Simplified test structures by removing class-based testing
    • Updated import and path management strategies

Copy link
Contributor

coderabbitai bot commented Jan 17, 2025

📝 Walkthrough

Walkthrough

The pull request introduces comprehensive changes across the Subyt library, focusing on enhancing type safety, error handling, and testing infrastructure. The modifications span multiple files in the sema/subyt/ directory and the tests/subyt/ directory. Key improvements include adding a break_on_error parameter to control error handling, transitioning from os and unittest to pathlib and more functional testing approaches, and refining type annotations for improved code reliability.

Changes

File Change Summary
sema/subyt/api.py - Added _init_mtimes and _init_source methods to Source class
- Updated GeneratorSettings constructor with break_on_error parameter
- Modified Processor class error handling
sema/subyt/sinks.py - Enhanced assert_writable function to accept `str
sema/subyt/sources.py - Improved type annotations for source-related functions
- Updated path handling in various source classes
- Enhanced URL and file validation
sema/subyt/subyt.py - Added break_on_error parameter to Subyt class
- Updated GeneratorSettings initialization
tests/subyt/* - Transitioned from unittest to functional testing approach
- Replaced os module with pathlib
- Simplified test structures and assertions

Suggested labels

enhancement, components.subyt

Suggested reviewers

  • bulricht
  • mpo-vliz

Possibly related PRs

  • added service interface and __main__.py to sembench #127: The changes in the main PR enhance error handling and initialization processes in the Source and GeneratorSettings classes, which may relate to the overall service interface improvements and error handling in the Sembench class introduced in this PR.

Finishing Touches

  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@mpo-vliz mpo-vliz requested a review from cedricdcc January 17, 2025 16:33
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 22

🔭 Outside diff range comments (1)
sema/subyt/sources.py (1)

Line range hint 304-315: Remove duplicate XML parsing.

The XML is parsed twice unnecessarily:

-                xml_str = xmlfile.read()
-                xdict = xmlasdict.parse(xml_str)
                # unpack root wrappers
                try:
                    xdict = xmlasdict.parse(xml_str)
🧰 Tools
🪛 Ruff (0.8.2)

297-297: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


303-303: Missing return type annotation for special method __enter__

(ANN204)


304-304: Unnecessary open mode parameters

Remove open mode parameters

(UP015)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb1f23b and 5fc5e28.

📒 Files selected for processing (11)
  • sema/subyt/api.py (6 hunks)
  • sema/subyt/sinks.py (4 hunks)
  • sema/subyt/sources.py (11 hunks)
  • sema/subyt/subyt.py (4 hunks)
  • tests/subyt/test_conditional.py (1 hunks)
  • tests/subyt/test_generator.py (4 hunks)
  • tests/subyt/test_j2.py (2 hunks)
  • tests/subyt/test_processor.py (1 hunks)
  • tests/subyt/test_repeated_sink_paths.py (1 hunks)
  • tests/subyt/test_settings.py (1 hunks)
  • tests/subyt/test_sinks.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/subyt/test_processor.py

109-109: Missing return type annotation for public function test_processor

Add return type annotation: None

(ANN201)


113-113: Unnecessary dict() call (rewrite as a literal)

Rewrite as a literal

(C408)

tests/subyt/test_conditional.py

22-22: Missing return type annotation for public function test_conditional

Add return type annotation: None

(ANN201)


77-77: Unnecessary dict() call (rewrite as a literal)

Rewrite as a literal

(C408)


101-101: Trailing comma missing

Add trailing comma

(COM812)


105-105: Logging statement uses f-string

(G004)


115-115: Redundant exception object included in logging.exception call

(TRY401)

sema/subyt/subyt.py

106-106: Trailing comma missing

Add trailing comma

(COM812)

tests/subyt/test_settings.py

10-10: Missing return type annotation for public function test_statics

Add return type annotation: None

(ANN201)


18-18: Missing return type annotation for public function test_parse

Add return type annotation: None

(ANN201)


26-26: Missing return type annotation for public function test_cases_roundtrip

Add return type annotation: None

(ANN201)


31-31: Unnecessary list() call (rewrite as a literal)

Rewrite as a literal

(C408)


32-32: Unnecessary list() call (rewrite as a literal)

Rewrite as a literal

(C408)


35-35: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

tests/subyt/test_repeated_sink_paths.py

13-13: Missing return type annotation for public function test_allow_repeated_sink_paths

Add return type annotation: None

(ANN201)


16-16: Trailing comma missing

Add trailing comma

(COM812)


30-30: Missing return type annotation for public function test_disallow_repeated_sink_paths

Add return type annotation: None

(ANN201)


33-33: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/api.py

58-58: Missing return type annotation for private function _init_mtimes

Add return type annotation: None

(ANN202)


67-67: Missing return type annotation for private function _init_source

Add return type annotation: None

(ANN202)


112-112: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


113-113: Trailing comma missing

Add trailing comma

(COM812)


146-146: Use of assert detected

(S101)


165-165: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


337-337: Logging statement uses f-string

(G004)

tests/subyt/test_generator.py

12-12: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


16-16: Missing return type annotation for public function load_parts

Add return type annotation: None

(ANN201)


16-16: Missing type annotation for function argument parts

(ANN001)


20-20: Missing return type annotation for private function _assert_count

Add return type annotation: None

(ANN202)


69-69: Unnecessary assignment to indicator before return statement

Remove unnecessary assignment

(RET504)


72-72: Missing return type annotation for public function test_templates

Add return type annotation: None

(ANN201)


74-74: Logging statement uses f-string

(G004)


81-81: Unnecessary dict() call (rewrite as a literal)

Rewrite as a literal

(C408)


87-87: Trailing comma missing

Add trailing comma

(COM812)


107-107: Trailing comma missing

Add trailing comma

(COM812)


111-111: Logging statement uses f-string

(G004)

tests/subyt/test_sinks.py

19-19: Missing return type annotation for public function rand_alfanum_str

(ANN201)


20-20: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


24-24: Wrong type passed to first argument of pytest.mark.parametrize; expected tuple

Use a tuple for the first argument

(PT006)


32-32: Missing return type annotation for public function test_factory

Add return type annotation: None

(ANN201)


32-32: Missing type annotation for function argument identifier

(ANN001)


32-32: Missing type annotation for function argument expected_type

(ANN001)


50-50: Trailing comma missing

Add trailing comma

(COM812)


53-53: Missing return type annotation for public function test_sink_outputs

Add return type annotation: None

(ANN201)


53-53: Missing type annotation for function argument items

(ANN001)


66-66: Trailing comma missing

Add trailing comma

(COM812)


88-88: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


95-95: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


104-104: Logging statement uses f-string

(G004)

sema/subyt/sinks.py

12-12: Missing return type annotation for public function assert_writable

Add return type annotation: None

(ANN201)


12-12: Boolean-typed positional argument in function definition

(FBT001)


12-12: Boolean default positional argument in function definition

(FBT002)


15-15: Use of assert detected

(S101)


21-21: Use of assert detected

(S101)


78-78: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


78-78: Boolean-typed positional argument in function definition

(FBT001)


78-78: Boolean default positional argument in function definition

(FBT002)


85-85: Trailing comma missing

Add trailing comma

(COM812)


88-88: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


90-90: Use explicit conversion flag

Replace with conversion flag

(RUF010)


145-145: Logging statement uses f-string

(G004)


145-145: Trailing comma missing

Add trailing comma

(COM812)


152-153: Logging statement uses f-string

(G004)


153-153: Trailing comma missing

Add trailing comma

(COM812)


177-178: Logging statement uses f-string

(G004)


178-178: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/sources.py

19-19: Missing return type annotation for public function assert_readable

Add return type annotation: None

(ANN201)


21-21: Use of assert detected

(S101)


22-22: Use of assert detected

(S101)


48-48: Use of assert detected

(S101)


93-93: Unused blanket noqa directive

Remove unused noqa directive

(RUF100)


94-94: Use of assert detected

(S101)


94-94: Assertion always fails, replace with pytest.fail()

(PT015)


94-94: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


103-103: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)


108-108: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)


112-112: Use of assert detected

(S101)


118-118: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)


127-127: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


130-130: Missing return type annotation for private function _init_sourcefiles

Add return type annotation: None

(ANN202)


132-132: Use of assert detected

(S101)


152-152: Trailing comma missing

Add trailing comma

(COM812)


191-191: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


195-195: Trailing comma missing

Add trailing comma

(COM812)


198-198: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


203-203: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


208-208: Trailing comma missing

Add trailing comma

(COM812)


211-211: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


223-223: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


254-254: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


297-297: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: make-lint-test
🔇 Additional comments (7)
tests/subyt/test_j2.py (1)

2-2: LGTM! Good modernization of path handling.

The transition from os to pathlib.Path is a good improvement as it provides:

  • More robust path handling
  • Better cross-platform compatibility
  • Type safety through the Path object

Also applies to: 11-13

tests/subyt/test_sinks.py (1)

101-119: Consider adding test cases for edge cases in pattern leniency.

The test verifies basic pattern leniency by setting one key to empty string. Consider adding test cases for:

  • Multiple empty keys
  • Special characters in keys
  • Very long keys
🧰 Tools
🪛 Ruff (0.8.2)

104-104: Logging statement uses f-string

(G004)

sema/subyt/subyt.py (1)

73-77: LGTM! Well-documented parameter addition.

The break_on_error parameter is well documented with clear explanation of its behavior.

sema/subyt/sinks.py (1)

169-179: LGTM! Enhanced template variable validation.

The added validation for missing template variables improves error detection and user feedback.

🧰 Tools
🪛 Ruff (0.8.2)

177-178: Logging statement uses f-string

(G004)


178-178: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/api.py (3)

58-65: LGTM! Robust mtime initialization.

The implementation correctly handles file validation and mtime initialization with proper type safety.

🧰 Tools
🪛 Ruff (0.8.2)

58-58: Missing return type annotation for private function _init_mtimes

Add return type annotation: None

(ANN202)


112-120: LGTM! Enhanced error handling control.

The addition of break_on_error parameter improves process robustness by allowing fine-grained control over error handling.

🧰 Tools
🪛 Ruff (0.8.2)

112-112: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


113-113: Trailing comma missing

Add trailing comma

(COM812)


321-339: LGTM! Robust error handling implementation.

The added try-except block with conditional error propagation based on break_on_error setting enhances process robustness.

🧰 Tools
🪛 Ruff (0.8.2)

337-337: Logging statement uses f-string

(G004)

Copy link

github-actions bot commented Jan 17, 2025

Tests coverage table for 006a0f4 commit.

pycoverage

Name Stmts Miss Cover Missing
sema/bench/core.py 144 40 72.22% 28 139-140 170 173-175 177 188-192 196-198 202-203 207-208 212 214-218 223-224 226-230 233-236 241 246
sema/bench/dispatcher.py 7 0 100.0%
sema/bench/handler.py 37 8 78.38% 38 44 50 57 85 90 95 101
sema/bench/task.py 8 0 100.0%
sema/commons/clean/clean.py 105 0 100.0%
sema/commons/cli/clitools.py 23 0 100.0%
sema/commons/fileformats/mimetypes.py 34 0 100.0%
sema/commons/fileformats/rdffiles.py 23 0 100.0%
sema/commons/j2/j2_functions.py 185 6 96.76% 171-172 179 204 239 329
sema/commons/j2/rdf_syntax_builder.py 6 0 100.0%
sema/commons/j2/syntax_builder.py 35 4 88.57% 31 39 41 61
sema/commons/log/loader.py 21 2 90.48% 19 37
sema/commons/service/model.py 91 4 95.6% 31 36 93 174
sema/commons/store/build.py 9 0 100.0%
sema/commons/store/store.py 183 32 82.51% 350 363 372 385-386 389-390 393-395 412 415 417 424 428-429 431 436-438 440 443 448 453 456 459-460 463 467 540 554 558
sema/commons/web/conneg.py 135 15 88.89% 77 162-163 166 170 174 198-199 204-205 249-252 269
sema/commons/web/conneg_cli.py 32 4 87.5% 102 115-116 120
sema/commons/web/download_to_file.py 40 6 85.0% 17-21 29
sema/commons/web/httpsession.py 11 0 100.0%
sema/commons/web/parse_headers.py 19 0 100.0%
sema/discovery/discovery.py 181 41 77.35% 82 85 124-126 154-155 159 168-169 195-197 253 263 270 282-283 289-291 293-294 297 301-305 308-313 318 331-332 340
sema/discovery/linkheaders.py 22 0 100.0%
sema/discovery/lod_html_parser.py 22 0 100.0%
sema/harvest/config_build.py 154 18 88.31% 52 169 220 250-251 336-337 369 374-377 427-428 435 450
sema/harvest/execution_report.py 75 0 100.0%
sema/harvest/executor.py 41 3 92.68% 75-76 82
sema/harvest/helper.py 36 1 97.22% 36
sema/harvest/path_assertion.py 126 22 82.54% 52-53 84-85 108-109 116-118 121-126 129-131 194
sema/harvest/service.py 60 4 93.33% 24 86 101 104
sema/harvest/store.py 56 5 91.07% 84 87 120 144
sema/harvest/url_to_graph.py 6 0 100.0%
sema/query/query.py 161 5 96.89% 293
sema/subyt/api.py 151 4 97.35% 130 181 378 382
sema/subyt/j2/generator.py 14 0 100.0%
sema/subyt/sinks.py 100 4 96.0% 36 44 84 89
sema/subyt/sources.py 220 24 89.09% 26 70-77 93-94 130 214 244-245 273 284 287-288 314-316 331
sema/subyt/subyt.py 42 1 97.62% 111
sema/syncfs/service.py 87 12 86.21% 34 211 233
TOTAL 2702 265 90.19%

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

♻️ Duplicate comments (1)
sema/subyt/sinks.py (1)

85-85: ⚠️ Potential issue

Fix typo in method name stats()

The method stats() does not exist on Path objects; it should be stat(). This typo will cause an AttributeError at runtime.

Apply this diff:

-                    str(self._file_path): self._file_path.stats().st_mtime
+                    str(self._file_path): self._file_path.stat().st_mtime
🧰 Tools
🪛 Ruff (0.8.2)

85-85: Trailing comma missing

Add trailing comma

(COM812)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5fc5e28 and 006a0f4.

📒 Files selected for processing (2)
  • sema/subyt/sinks.py (4 hunks)
  • sema/subyt/sources.py (11 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
sema/subyt/sinks.py

12-12: Missing return type annotation for public function assert_writable

Add return type annotation: None

(ANN201)


12-12: Boolean-typed positional argument in function definition

(FBT001)


12-12: Boolean default positional argument in function definition

(FBT002)


15-15: Use of assert detected

(S101)


21-21: Use of assert detected

(S101)


78-78: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


78-78: Boolean-typed positional argument in function definition

(FBT001)


78-78: Boolean default positional argument in function definition

(FBT002)


85-85: Trailing comma missing

Add trailing comma

(COM812)


88-88: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


90-90: Use explicit conversion flag

Replace with conversion flag

(RUF010)


145-145: Logging statement uses f-string

(G004)


145-145: Trailing comma missing

Add trailing comma

(COM812)


152-153: Logging statement uses f-string

(G004)


153-153: Trailing comma missing

Add trailing comma

(COM812)


177-178: Logging statement uses f-string

(G004)


178-178: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/sources.py

19-19: Missing return type annotation for public function assert_readable

Add return type annotation: None

(ANN201)


21-21: Use of assert detected

(S101)


22-22: Use of assert detected

(S101)


48-48: Use of assert detected

(S101)


93-93: Unused blanket noqa directive

Remove unused noqa directive

(RUF100)


95-95: Trailing comma missing

Add trailing comma

(COM812)


105-105: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)


110-110: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)


114-114: Use of assert detected

(S101)


120-120: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)


129-129: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


132-132: Missing return type annotation for private function _init_sourcefiles

Add return type annotation: None

(ANN202)


134-134: Use of assert detected

(S101)


154-154: Trailing comma missing

Add trailing comma

(COM812)


193-193: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


197-197: Trailing comma missing

Add trailing comma

(COM812)


200-200: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


205-205: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


210-210: Trailing comma missing

Add trailing comma

(COM812)


213-213: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)


225-225: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


256-256: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


299-299: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: make-lint-test

if self._file_path.exists():
self.mtimes = {
str(self._file_path): self._file_path.stats().st_mtime
}

def __repr__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add return type annotation to __repr__ method

Adding a return type annotation to the __repr__ method enhances code clarity and type checking.

Apply this diff:

-        def __repr__(self):
+        def __repr__(self) -> str:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def __repr__(self):
def __repr__(self) -> str:
🧰 Tools
🪛 Ruff (0.8.2)

88-88: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

Comment on lines +15 to +17
assert not out_path.exists(), (
f"File to write '{path_name}' already exists",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert statements with proper exception handling

Using assert statements for error checking is not recommended in production code because they can be disabled with optimization flags. It's better to raise explicit exceptions to handle error conditions properly.

Apply this diff to fix the issue:

-        assert not out_path.exists(), (
-            f"File to write '{path_name}' already exists",
-        )
+        if out_path.exists():
+            raise FileExistsError(f"File to write '{path_name}' already exists")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert not out_path.exists(), (
f"File to write '{path_name}' already exists",
)
if out_path.exists():
raise FileExistsError(f"File to write '{path_name}' already exists")
🧰 Tools
🪛 Ruff (0.8.2)

15-15: Use of assert detected

(S101)

Comment on lines +21 to +23
assert os.access(parent_path, os.W_OK), (
f"Can not write to folder '{parent_path}' for creating new files",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert statements with proper exception handling

Using assert statements for checking write permissions is not reliable. Instead, raise an exception to handle the error appropriately.

Apply this diff:

-        assert os.access(parent_path, os.W_OK), (
-            f"Can not write to folder '{parent_path}' for creating new files",
-        )
+        if not os.access(parent_path, os.W_OK):
+            raise PermissionError(f"Cannot write to folder '{parent_path}' for creating new files")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert os.access(parent_path, os.W_OK), (
f"Can not write to folder '{parent_path}' for creating new files",
)
if not os.access(parent_path, os.W_OK):
raise PermissionError(f"Cannot write to folder '{parent_path}' for creating new files")
🧰 Tools
🪛 Ruff (0.8.2)

21-21: Use of assert detected

(S101)

out_path: Path = Path(file_path)
if out_path.exists() and out_path.is_dir():
log.warning(
f"Skipping creation of {file_path} as it is a directory. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Avoid using f-strings in logging statements

Using f-strings in logging methods can lead to unnecessary string interpolation even if the log level is not enabled. It's better to use lazy logging by passing the format string and arguments separately.

Apply this diff:

-                    f"Skipping creation of {file_path} as it is a directory. "
+                    "Skipping creation of %s as it is a directory. File will be ignored.",
+                    file_path

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.8.2)

145-145: Logging statement uses f-string

(G004)


145-145: Trailing comma missing

Add trailing comma

(COM812)

@@ -283,11 +281,11 @@ def __exit__(self):
pass

def __repr__(self):
return f"JsonFileSource('{os.path.abspath(self._json)}')"
return f"JsonFileSource('{self._json!s}')"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add return type annotation to __repr__ method

Including a return type annotation for __repr__ methods improves type checking and code readability.

Apply this diff:

-            def __repr__(self):
+            def __repr__(self) -> str:

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +134 to +136
assert len(self._sourcefiles) > 0, (
f"{self} should have content files.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert statement with proper exception handling

Using assert statements for essential runtime checks is not reliable in production code. Raise an exception instead.

Apply this diff:

-        assert len(self._sourcefiles) > 0, (
-            f"{self} should have content files.",
-        )
+        if not self._sourcefiles:
+            raise ValueError(f"{self} should have content files.")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert len(self._sourcefiles) > 0, (
f"{self} should have content files.",
)
if not self._sourcefiles:
raise ValueError(f"{self} should have content files.")
🧰 Tools
🪛 Ruff (0.8.2)

134-134: Use of assert detected

(S101)

Comment on lines +21 to +22
assert in_path.is_file(), f"File to read '{path_name}' does not exist"
assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert statements with proper exception handling

Using assert statements for runtime checks is discouraged because they can be disabled with optimization flags. Instead, raise appropriate exceptions to handle error conditions.

Apply this diff:

-    assert in_path.is_file(), f"File to read '{path_name}' does not exist"
-    assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"
+    if not in_path.is_file():
+        raise FileNotFoundError(f"File to read '{path_name}' does not exist")
+    if not os.access(in_path, os.R_OK):
+        raise PermissionError(f"Cannot read '{path_name}'")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert in_path.is_file(), f"File to read '{path_name}' does not exist"
assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"
if not in_path.is_file():
raise FileNotFoundError(f"File to read '{path_name}' does not exist")
if not os.access(in_path, os.R_OK):
raise PermissionError(f"Cannot read '{path_name}'")
🧰 Tools
🪛 Ruff (0.8.2)

21-21: Use of assert detected

(S101)


22-22: Use of assert detected

(S101)

Comment on lines +48 to +50
assert mime in self._register, (
f"no Source class available for mime '{mime}'",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert statement with proper exception handling

Using assert statements for control flow is not recommended. Instead, raise an exception if the condition is not met.

Apply this diff:

-        assert mime in self._register, (
-            f"no Source class available for mime '{mime}'",
-        )
+        if mime not in self._register:
+            raise KeyError(f"No Source class available for MIME type '{mime}'")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert mime in self._register, (
f"no Source class available for mime '{mime}'",
)
if mime not in self._register:
raise KeyError(f"No Source class available for MIME type '{mime}'")
🧰 Tools
🪛 Ruff (0.8.2)

48-48: Use of assert detected

(S101)

Comment on lines +114 to +115
assert mime is not None, (
f"no valid mime derived from identifier '{identifier}'",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert statement with proper exception handling

Using assert statements for important checks can be risky. Replace with an exception to ensure the check is not bypassed.

Apply this diff:

-        assert mime is not None, (
-            f"no valid mime derived from identifier '{identifier}'",
-        )
+        if mime is None:
+            raise ValueError(f"No valid MIME type derived from identifier '{identifier}'")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert mime is not None, (
f"no valid mime derived from identifier '{identifier}'",
if mime is None:
raise ValueError(f"No valid MIME type derived from identifier '{identifier}'")
🧰 Tools
🪛 Ruff (0.8.2)

114-114: Use of assert detected

(S101)

@mpo-vliz mpo-vliz merged commit 48ef7cb into main Jan 20, 2025
2 checks passed
@mpo-vliz mpo-vliz deleted the fix/issue-136-process-robustness branch January 20, 2025 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants