Fix/issue-136-process-robustness #140

mpo-vliz · 2025-01-17T16:32:15Z

provides fixes for #136 #137 #138

the hope is that these are removing the current block in providing triples inside the emobon workflow...

Summary by CodeRabbit

Release Notes

New Features
- Added break_on_error option to control error handling during processing
- Enhanced error handling and logging mechanisms across multiple components
Improvements
- Improved type safety by using Path objects for file path management
- Streamlined file and directory source handling
- Updated test suite to use more modern testing practices with pytest
Bug Fixes
- Fixed issues with file path handling and source initialization
- Improved error detection and reporting in processing workflows
Refactoring
- Simplified test structures by removing class-based testing
- Updated import and path management strategies

also refactoring from unittest to pytest

as well as some extra warning about missing variables

…d_sink_paths

and more consistently using pathlib (over os.path)

coderabbitai · 2025-01-17T16:32:28Z

📝 Walkthrough

Walkthrough

The pull request introduces comprehensive changes across the Subyt library, focusing on enhancing type safety, error handling, and testing infrastructure. The modifications span multiple files in the sema/subyt/ directory and the tests/subyt/ directory. Key improvements include adding a break_on_error parameter to control error handling, transitioning from os and unittest to pathlib and more functional testing approaches, and refining type annotations for improved code reliability.

Changes

File	Change Summary
`sema/subyt/api.py`	- Added `_init_mtimes` and `_init_source` methods to `Source` class - Updated `GeneratorSettings` constructor with `break_on_error` parameter - Modified `Processor` class error handling
`sema/subyt/sinks.py`	- Enhanced `assert_writable` function to accept `str
`sema/subyt/sources.py`	- Improved type annotations for source-related functions - Updated path handling in various source classes - Enhanced URL and file validation
`sema/subyt/subyt.py`	- Added `break_on_error` parameter to `Subyt` class - Updated `GeneratorSettings` initialization
`tests/subyt/*`	- Transitioned from `unittest` to functional testing approach - Replaced `os` module with `pathlib` - Simplified test structures and assertions

Suggested labels

enhancement, components.subyt

Suggested reviewers

bulricht
mpo-vliz

Possibly related PRs

added service interface and __main__.py to sembench #127: The changes in the main PR enhance error handling and initialization processes in the Source and GeneratorSettings classes, which may relate to the overall service interface improvements and error handling in the Sembench class introduced in this PR.

Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 22

🔭 Outside diff range comments (1)

sema/subyt/sources.py (1)
Line range hint 304-315: Remove duplicate XML parsing.

The XML is parsed twice unnecessarily:
-                xml_str = xmlfile.read()
-                xdict = xmlasdict.parse(xml_str)
                # unpack root wrappers
                try:
                    xdict = xmlasdict.parse(xml_str)
🧰 Tools

🪛 Ruff (0.8.2)

297-297: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

303-303: Missing return type annotation for special method __enter__

(ANN204)

304-304: Unnecessary open mode parameters

Remove open mode parameters

(UP015)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb1f23b and 5fc5e28.

📒 Files selected for processing (11)

sema/subyt/api.py (6 hunks)
sema/subyt/sinks.py (4 hunks)
sema/subyt/sources.py (11 hunks)
sema/subyt/subyt.py (4 hunks)
tests/subyt/test_conditional.py (1 hunks)
tests/subyt/test_generator.py (4 hunks)
tests/subyt/test_j2.py (2 hunks)
tests/subyt/test_processor.py (1 hunks)
tests/subyt/test_repeated_sink_paths.py (1 hunks)
tests/subyt/test_settings.py (1 hunks)
tests/subyt/test_sinks.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.8.2)

tests/subyt/test_processor.py

109-109: Missing return type annotation for public function test_processor

Add return type annotation: None

(ANN201)

113-113: Unnecessary dict() call (rewrite as a literal)

Rewrite as a literal

(C408)

tests/subyt/test_conditional.py

22-22: Missing return type annotation for public function test_conditional

Add return type annotation: None

(ANN201)

77-77: Unnecessary dict() call (rewrite as a literal)

Rewrite as a literal

(C408)

101-101: Trailing comma missing

Add trailing comma

(COM812)

105-105: Logging statement uses f-string

(G004)

115-115: Redundant exception object included in logging.exception call

(TRY401)

sema/subyt/subyt.py

106-106: Trailing comma missing

Add trailing comma

(COM812)

tests/subyt/test_settings.py

10-10: Missing return type annotation for public function test_statics

Add return type annotation: None

(ANN201)

18-18: Missing return type annotation for public function test_parse

Add return type annotation: None

(ANN201)

26-26: Missing return type annotation for public function test_cases_roundtrip

Add return type annotation: None

(ANN201)

31-31: Unnecessary list() call (rewrite as a literal)

Rewrite as a literal

(C408)

32-32: Unnecessary list() call (rewrite as a literal)

Rewrite as a literal

(C408)

35-35: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

tests/subyt/test_repeated_sink_paths.py

13-13: Missing return type annotation for public function test_allow_repeated_sink_paths

Add return type annotation: None

(ANN201)

16-16: Trailing comma missing

Add trailing comma

(COM812)

30-30: Missing return type annotation for public function test_disallow_repeated_sink_paths

Add return type annotation: None

(ANN201)

33-33: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/api.py

58-58: Missing return type annotation for private function _init_mtimes

Add return type annotation: None

(ANN202)

67-67: Missing return type annotation for private function _init_source

Add return type annotation: None

(ANN202)

112-112: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

113-113: Trailing comma missing

Add trailing comma

(COM812)

146-146: Use of assert detected

(S101)

165-165: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

337-337: Logging statement uses f-string

(G004)

tests/subyt/test_generator.py

12-12: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

16-16: Missing return type annotation for public function load_parts

Add return type annotation: None

(ANN201)

16-16: Missing type annotation for function argument parts

(ANN001)

20-20: Missing return type annotation for private function _assert_count

Add return type annotation: None

(ANN202)

69-69: Unnecessary assignment to indicator before return statement

Remove unnecessary assignment

(RET504)

72-72: Missing return type annotation for public function test_templates

Add return type annotation: None

(ANN201)

74-74: Logging statement uses f-string

(G004)

81-81: Unnecessary dict() call (rewrite as a literal)

Rewrite as a literal

(C408)

87-87: Trailing comma missing

Add trailing comma

(COM812)

107-107: Trailing comma missing

Add trailing comma

(COM812)

111-111: Logging statement uses f-string

(G004)

tests/subyt/test_sinks.py

19-19: Missing return type annotation for public function rand_alfanum_str

(ANN201)

20-20: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

24-24: Wrong type passed to first argument of pytest.mark.parametrize; expected tuple

Use a tuple for the first argument

(PT006)

32-32: Missing return type annotation for public function test_factory

Add return type annotation: None

(ANN201)

32-32: Missing type annotation for function argument identifier

(ANN001)

32-32: Missing type annotation for function argument expected_type

(ANN001)

50-50: Trailing comma missing

Add trailing comma

(COM812)

53-53: Missing return type annotation for public function test_sink_outputs

Add return type annotation: None

(ANN201)

53-53: Missing type annotation for function argument items

(ANN001)

66-66: Trailing comma missing

Add trailing comma

(COM812)

88-88: Unnecessary open mode parameters

Remove open mode parameters

(UP015)

95-95: Unnecessary open mode parameters

Remove open mode parameters

(UP015)

104-104: Logging statement uses f-string

(G004)

sema/subyt/sinks.py

12-12: Missing return type annotation for public function assert_writable

Add return type annotation: None

(ANN201)

12-12: Boolean-typed positional argument in function definition

(FBT001)

12-12: Boolean default positional argument in function definition

(FBT002)

15-15: Use of assert detected

(S101)

21-21: Use of assert detected

(S101)

78-78: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

78-78: Boolean-typed positional argument in function definition

(FBT001)

78-78: Boolean default positional argument in function definition

(FBT002)

85-85: Trailing comma missing

Add trailing comma

(COM812)

88-88: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

90-90: Use explicit conversion flag

Replace with conversion flag

(RUF010)

145-145: Logging statement uses f-string

(G004)

145-145: Trailing comma missing

Add trailing comma

(COM812)

152-153: Logging statement uses f-string

(G004)

153-153: Trailing comma missing

Add trailing comma

(COM812)

177-178: Logging statement uses f-string

(G004)

178-178: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/sources.py

19-19: Missing return type annotation for public function assert_readable

Add return type annotation: None

(ANN201)

21-21: Use of assert detected

(S101)

22-22: Use of assert detected

(S101)

48-48: Use of assert detected

(S101)

93-93: Unused blanket noqa directive

Remove unused noqa directive

(RUF100)

94-94: Use of assert detected

(S101)

94-94: Assertion always fails, replace with pytest.fail()

(PT015)

94-94: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

103-103: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)

108-108: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)

112-112: Use of assert detected

(S101)

118-118: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)

127-127: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

130-130: Missing return type annotation for private function _init_sourcefiles

Add return type annotation: None

(ANN202)

132-132: Use of assert detected

(S101)

152-152: Trailing comma missing

Add trailing comma

(COM812)

191-191: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

195-195: Trailing comma missing

Add trailing comma

(COM812)

198-198: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

203-203: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

208-208: Trailing comma missing

Add trailing comma

(COM812)

211-211: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

223-223: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

254-254: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

297-297: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: make-lint-test

🔇 Additional comments (7)

tests/subyt/test_j2.py (1)

2-2: LGTM! Good modernization of path handling.

The transition from os to pathlib.Path is a good improvement as it provides:

More robust path handling

Better cross-platform compatibility

Type safety through the Path object

Also applies to: 11-13

tests/subyt/test_sinks.py (1)

101-119: Consider adding test cases for edge cases in pattern leniency.

The test verifies basic pattern leniency by setting one key to empty string. Consider adding test cases for:

Multiple empty keys

Special characters in keys

Very long keys

🧰 Tools

🪛 Ruff (0.8.2)

104-104: Logging statement uses f-string

(G004)

sema/subyt/subyt.py (1)

73-77: LGTM! Well-documented parameter addition.

The break_on_error parameter is well documented with clear explanation of its behavior.

sema/subyt/sinks.py (1)

169-179: LGTM! Enhanced template variable validation.

The added validation for missing template variables improves error detection and user feedback.

🧰 Tools

🪛 Ruff (0.8.2)

177-178: Logging statement uses f-string

(G004)

178-178: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/api.py (3)

58-65: LGTM! Robust mtime initialization.

The implementation correctly handles file validation and mtime initialization with proper type safety.

🧰 Tools

🪛 Ruff (0.8.2)

58-58: Missing return type annotation for private function _init_mtimes

Add return type annotation: None

(ANN202)

112-120: LGTM! Enhanced error handling control.

The addition of break_on_error parameter improves process robustness by allowing fine-grained control over error handling.

🧰 Tools

🪛 Ruff (0.8.2)

112-112: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

113-113: Trailing comma missing

Add trailing comma

(COM812)

321-339: LGTM! Robust error handling implementation.

The added try-except block with conditional error propagation based on break_on_error setting enhances process robustness.

🧰 Tools

🪛 Ruff (0.8.2)

337-337: Logging statement uses f-string

(G004)

tests/subyt/test_repeated_sink_paths.py

tests/subyt/test_settings.py

sema/subyt/sinks.py

sema/subyt/sources.py

github-actions · 2025-01-17T16:45:55Z

Tests coverage table for 006a0f4 commit.

Name	Stmts	Miss	Cover	Missing
sema/bench/core.py	144	40	72.22%	28 139-140 170 173-175 177 188-192 196-198 202-203 207-208 212 214-218 223-224 226-230 233-236 241 246
sema/bench/dispatcher.py	7	0	100.0%
sema/bench/handler.py	37	8	78.38%	38 44 50 57 85 90 95 101
sema/bench/task.py	8	0	100.0%
sema/commons/clean/clean.py	105	0	100.0%
sema/commons/cli/clitools.py	23	0	100.0%
sema/commons/fileformats/mimetypes.py	34	0	100.0%
sema/commons/fileformats/rdffiles.py	23	0	100.0%
sema/commons/j2/j2_functions.py	185	6	96.76%	171-172 179 204 239 329
sema/commons/j2/rdf_syntax_builder.py	6	0	100.0%
sema/commons/j2/syntax_builder.py	35	4	88.57%	31 39 41 61
sema/commons/log/loader.py	21	2	90.48%	19 37
sema/commons/service/model.py	91	4	95.6%	31 36 93 174
sema/commons/store/build.py	9	0	100.0%
sema/commons/store/store.py	183	32	82.51%	350 363 372 385-386 389-390 393-395 412 415 417 424 428-429 431 436-438 440 443 448 453 456 459-460 463 467 540 554 558
sema/commons/web/conneg.py	135	15	88.89%	77 162-163 166 170 174 198-199 204-205 249-252 269
sema/commons/web/conneg_cli.py	32	4	87.5%	102 115-116 120
sema/commons/web/download_to_file.py	40	6	85.0%	17-21 29
sema/commons/web/httpsession.py	11	0	100.0%
sema/commons/web/parse_headers.py	19	0	100.0%
sema/discovery/discovery.py	181	41	77.35%	82 85 124-126 154-155 159 168-169 195-197 253 263 270 282-283 289-291 293-294 297 301-305 308-313 318 331-332 340
sema/discovery/linkheaders.py	22	0	100.0%
sema/discovery/lod_html_parser.py	22	0	100.0%
sema/harvest/config_build.py	154	18	88.31%	52 169 220 250-251 336-337 369 374-377 427-428 435 450
sema/harvest/execution_report.py	75	0	100.0%
sema/harvest/executor.py	41	3	92.68%	75-76 82
sema/harvest/helper.py	36	1	97.22%	36
sema/harvest/path_assertion.py	126	22	82.54%	52-53 84-85 108-109 116-118 121-126 129-131 194
sema/harvest/service.py	60	4	93.33%	24 86 101 104
sema/harvest/store.py	56	5	91.07%	84 87 120 144
sema/harvest/url_to_graph.py	6	0	100.0%
sema/query/query.py	161	5	96.89%	293
sema/subyt/api.py	151	4	97.35%	130 181 378 382
sema/subyt/j2/generator.py	14	0	100.0%
sema/subyt/sinks.py	100	4	96.0%	36 44 84 89
sema/subyt/sources.py	220	24	89.09%	26 70-77 93-94 130 214 244-245 273 284 287-288 314-316 331
sema/subyt/subyt.py	42	1	97.62%	111
sema/syncfs/service.py	87	12	86.21%	34 211 233
TOTAL	2702	265	90.19%

coderabbitai

Actionable comments posted: 9

♻️ Duplicate comments (1)

sema/subyt/sinks.py (1)
85-85: ⚠️ Potential issue

Fix typo in method name stats()

The method stats() does not exist on Path objects; it should be stat(). This typo will cause an AttributeError at runtime.

Apply this diff:
-                    str(self._file_path): self._file_path.stats().st_mtime
+                    str(self._file_path): self._file_path.stat().st_mtime
🧰 Tools

🪛 Ruff (0.8.2)

85-85: Trailing comma missing

Add trailing comma

(COM812)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5fc5e28 and 006a0f4.

📒 Files selected for processing (2)

sema/subyt/sinks.py (4 hunks)
sema/subyt/sources.py (11 hunks)

🧰 Additional context used

🪛 Ruff (0.8.2)

sema/subyt/sinks.py

12-12: Missing return type annotation for public function assert_writable

Add return type annotation: None

(ANN201)

12-12: Boolean-typed positional argument in function definition

(FBT001)

12-12: Boolean default positional argument in function definition

(FBT002)

15-15: Use of assert detected

(S101)

21-21: Use of assert detected

(S101)

78-78: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

78-78: Boolean-typed positional argument in function definition

(FBT001)

78-78: Boolean default positional argument in function definition

(FBT002)

85-85: Trailing comma missing

Add trailing comma

(COM812)

88-88: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

90-90: Use explicit conversion flag

Replace with conversion flag

(RUF010)

145-145: Logging statement uses f-string

(G004)

145-145: Trailing comma missing

Add trailing comma

(COM812)

152-153: Logging statement uses f-string

(G004)

153-153: Trailing comma missing

Add trailing comma

(COM812)

177-178: Logging statement uses f-string

(G004)

178-178: Trailing comma missing

Add trailing comma

(COM812)

sema/subyt/sources.py

19-19: Missing return type annotation for public function assert_readable

Add return type annotation: None

(ANN201)

21-21: Use of assert detected

(S101)

22-22: Use of assert detected

(S101)

48-48: Use of assert detected

(S101)

93-93: Unused blanket noqa directive

Remove unused noqa directive

(RUF100)

95-95: Trailing comma missing

Add trailing comma

(COM812)

105-105: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)

110-110: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)

114-114: Use of assert detected

(S101)

120-120: Unnecessary assignment to source before return statement

Remove unnecessary assignment

(RET504)

129-129: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

132-132: Missing return type annotation for private function _init_sourcefiles

Add return type annotation: None

(ANN202)

134-134: Use of assert detected

(S101)

154-154: Trailing comma missing

Add trailing comma

(COM812)

193-193: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

197-197: Trailing comma missing

Add trailing comma

(COM812)

200-200: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

205-205: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

210-210: Trailing comma missing

Add trailing comma

(COM812)

213-213: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

225-225: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

256-256: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

299-299: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: make-lint-test

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sinks.py

+        if self._file_path.exists():
+            self.mtimes = {
+                str(self._file_path): self._file_path.stats().st_mtime
+            }

    def __repr__(self):


🧹 Nitpick (assertive)

Add return type annotation to __repr__ method

Adding a return type annotation to the __repr__ method enhances code clarity and type checking.

Apply this diff:

- def __repr__(self): + def __repr__(self) -> str:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def __repr__(self):

def __repr__(self) -> str:

🧰 Tools

🪛 Ruff (0.8.2)

88-88: Missing return type annotation for special method __repr__

Add return type annotation: str

(ANN204)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sinks.py

+        assert not out_path.exists(), (
+            f"File to write '{path_name}' already exists",
+        )


⚠️ Potential issue

Replace assert statements with proper exception handling

Using assert statements for error checking is not recommended in production code because they can be disabled with optimization flags. It's better to raise explicit exceptions to handle error conditions properly.

Apply this diff to fix the issue:

- assert not out_path.exists(), ( - f"File to write '{path_name}' already exists", - ) + if out_path.exists(): + raise FileExistsError(f"File to write '{path_name}' already exists")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert not out_path.exists(), (

f"File to write '{path_name}' already exists",

)

if out_path.exists():

raise FileExistsError(f"File to write '{path_name}' already exists")

🧰 Tools

🪛 Ruff (0.8.2)

15-15: Use of assert detected

(S101)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sinks.py

+    assert os.access(parent_path, os.W_OK), (
+        f"Can not write to folder '{parent_path}' for creating new files",
+    )


⚠️ Potential issue

Replace assert statements with proper exception handling

Using assert statements for checking write permissions is not reliable. Instead, raise an exception to handle the error appropriately.

Apply this diff:

- assert os.access(parent_path, os.W_OK), ( - f"Can not write to folder '{parent_path}' for creating new files", - ) + if not os.access(parent_path, os.W_OK): + raise PermissionError(f"Cannot write to folder '{parent_path}' for creating new files")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert os.access(parent_path, os.W_OK), (

f"Can not write to folder '{parent_path}' for creating new files",

)

if not os.access(parent_path, os.W_OK):

raise PermissionError(f"Cannot write to folder '{parent_path}' for creating new files")

🧰 Tools

🪛 Ruff (0.8.2)

21-21: Use of assert detected

(S101)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sinks.py

+        out_path: Path = Path(file_path)
+        if out_path.exists() and out_path.is_dir():
+            log.warning(
+                f"Skipping creation of {file_path} as it is a directory. "


🧹 Nitpick (assertive)

Avoid using f-strings in logging statements

Using f-strings in logging methods can lead to unnecessary string interpolation even if the log level is not enabled. It's better to use lazy logging by passing the format string and arguments separately.

Apply this diff:

- f"Skipping creation of {file_path} as it is a directory. " + "Skipping creation of %s as it is a directory. File will be ignored.", + file_path

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools

🪛 Ruff (0.8.2)

145-145: Logging statement uses f-string

(G004)

145-145: Trailing comma missing

Add trailing comma

(COM812)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sources.py

@@ -283,11 +281,11 @@ def __exit__(self):
            pass

        def __repr__(self):
-            return f"JsonFileSource('{os.path.abspath(self._json)}')"
+            return f"JsonFileSource('{self._json!s}')"


🧹 Nitpick (assertive)

Add return type annotation to __repr__ method

Including a return type annotation for __repr__ methods improves type checking and code readability.

Apply this diff:

- def __repr__(self): + def __repr__(self) -> str:

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sources.py

+        assert len(self._sourcefiles) > 0, (
+            f"{self} should have content files.",
+        )


⚠️ Potential issue

Replace assert statement with proper exception handling

Using assert statements for essential runtime checks is not reliable in production code. Raise an exception instead.

Apply this diff:

- assert len(self._sourcefiles) > 0, ( - f"{self} should have content files.", - ) + if not self._sourcefiles: + raise ValueError(f"{self} should have content files.")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert len(self._sourcefiles) > 0, (

f"{self} should have content files.",

)

if not self._sourcefiles:

raise ValueError(f"{self} should have content files.")

🧰 Tools

🪛 Ruff (0.8.2)

134-134: Use of assert detected

(S101)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sources.py

+    assert in_path.is_file(), f"File to read '{path_name}' does not exist"
+    assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"


⚠️ Potential issue

Replace assert statements with proper exception handling

Using assert statements for runtime checks is discouraged because they can be disabled with optimization flags. Instead, raise appropriate exceptions to handle error conditions.

Apply this diff:

- assert in_path.is_file(), f"File to read '{path_name}' does not exist" - assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'" + if not in_path.is_file(): + raise FileNotFoundError(f"File to read '{path_name}' does not exist") + if not os.access(in_path, os.R_OK): + raise PermissionError(f"Cannot read '{path_name}'")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert in_path.is_file(), f"File to read '{path_name}' does not exist"

assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"

if not in_path.is_file():

raise FileNotFoundError(f"File to read '{path_name}' does not exist")

if not os.access(in_path, os.R_OK):

raise PermissionError(f"Cannot read '{path_name}'")

🧰 Tools

🪛 Ruff (0.8.2)

21-21: Use of assert detected

(S101)

22-22: Use of assert detected

(S101)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sources.py

+        assert mime in self._register, (
+            f"no Source class available for mime '{mime}'",
+        )


⚠️ Potential issue

Replace assert statement with proper exception handling

Using assert statements for control flow is not recommended. Instead, raise an exception if the condition is not met.

Apply this diff:

- assert mime in self._register, ( - f"no Source class available for mime '{mime}'", - ) + if mime not in self._register: + raise KeyError(f"No Source class available for MIME type '{mime}'")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert mime in self._register, (

f"no Source class available for mime '{mime}'",

)

if mime not in self._register:

raise KeyError(f"No Source class available for MIME type '{mime}'")

🧰 Tools

🪛 Ruff (0.8.2)

48-48: Use of assert detected

(S101)

coderabbitai · 2025-01-17T16:52:48Z

sema/subyt/sources.py

+        assert mime is not None, (
+            f"no valid mime derived from identifier '{identifier}'",


⚠️ Potential issue

Replace assert statement with proper exception handling

Using assert statements for important checks can be risky. Replace with an exception to ensure the check is not bypassed.

Apply this diff:

- assert mime is not None, ( - f"no valid mime derived from identifier '{identifier}'", - ) + if mime is None: + raise ValueError(f"No valid MIME type derived from identifier '{identifier}'")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert mime is not None, (

f"no valid mime derived from identifier '{identifier}'",

if mime is None:

raise ValueError(f"No valid MIME type derived from identifier '{identifier}'")

🧰 Tools

🪛 Ruff (0.8.2)

114-114: Use of assert detected

(S101)

marc-portier added 9 commits January 16, 2025 12:06

first stab at #136

5d850bd

new tests for #136

1984333

also refactoring from unittest to pytest

lintfixing

fbbaaec

as well as some extra warning about missing variables

trimming line length

22f408c

adding safe process assurance suggested in #137

c456e21

migrate from os.path to pathlib in subyt as suggested in #138

90690ff

handle connection between new break_on_error (#136) and allow_repeate…

8ef9d98

…d_sink_paths

migrated tests to pytest

cf7480f

and more consistently using pathlib (over os.path)

lintfixed

5fc5e28

mpo-vliz requested a review from cedricdcc January 17, 2025 16:33

coderabbitai bot requested changes Jan 17, 2025

View reviewed changes

useful coderabbit suggestions

006a0f4

coderabbitai bot requested changes Jan 17, 2025

View reviewed changes

cedricdcc approved these changes Jan 20, 2025

View reviewed changes

mpo-vliz merged commit 48ef7cb into main Jan 20, 2025
2 checks passed

mpo-vliz deleted the fix/issue-136-process-robustness branch January 20, 2025 11:13

coderabbitai bot mentioned this pull request Feb 12, 2025

add source-filtering on the subyt input set _ #151

Merged

This was referenced Feb 19, 2025

not call open/close for every part, but only for every sink / process #158

Merged

support elaborate source config via dict or formatted string #159

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/issue-136-process-robustness #140

Fix/issue-136-process-robustness #140

mpo-vliz commented Jan 17, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 17, 2025 •

edited

Loading

Walkthrough

Changes

Suggested labels

Suggested reviewers

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

coderabbitai bot left a comment

github-actions bot commented Jan 17, 2025 •

edited

Loading

coderabbitai bot left a comment

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

coderabbitai bot Jan 17, 2025

		assert in_path.is_file(), f"File to read '{path_name}' does not exist"
		assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"

-    assert in_path.is_file(), f"File to read '{path_name}' does not exist"
-    assert os.access(in_path, os.R_OK), f"Can not read '{path_name}'"
+    if not in_path.is_file():
+        raise FileNotFoundError(f"File to read '{path_name}' does not exist")
+    if not os.access(in_path, os.R_OK):
+        raise PermissionError(f"Cannot read '{path_name}'")

		assert mime is not None, (
		f"no valid mime derived from identifier '{identifier}'",

Fix/issue-136-process-robustness #140

Fix/issue-136-process-robustness #140

Conversation

mpo-vliz commented Jan 17, 2025 • edited by coderabbitai bot Loading

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Jan 17, 2025 • edited Loading

Walkthrough

Changes

Suggested labels

Suggested reviewers

Possibly related PRs

Finishing Touches

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

github-actions bot commented Jan 17, 2025 • edited Loading

Tests coverage table for 006a0f4 commit.

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

mpo-vliz commented Jan 17, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 17, 2025 •

edited

Loading

github-actions bot commented Jan 17, 2025 •

edited

Loading