popen options for subprocessing #32
Conversation
Walkthrough: The pull request introduces several modifications across various files.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (6)
sh/lint.sh (1)

Line range hint 14-18: Optimize the autoflake execution.

The current implementation runs autoflake individually on each file, which is less efficient for large codebases.
Consider this more efficient approach:

-for FILE in ${PY_FILES[@]}; do
-    autoflake -i "$FILE"
-done
+echo "$PY_FILES" | tr ' ' '\n' | xargs -I {} autoflake -i {}

phlop/reflection.py (1)
19-19: Consider adding docstring to document error handling behavior.

The function's error handling is controlled by both a parameter and an environment variable. Consider adding a docstring to explain:
- The default fail-fast behavior
- The environment variable override
- Examples of when to use each option

Here's a suggested docstring:

def classes_in_file(file_path, subclasses_only=None, fail_on_import_error=True):
    """Extract classes from a Python file with configurable error handling.

    Args:
        file_path: Path to the Python file
        subclasses_only: Filter to specific parent classes
        fail_on_import_error: If True (default), raises import errors.
            If False, logs and continues. Can be overridden by the
            PHLOP_FORCE_RAISE_ON_IMPORT_ERROR environment variable.

    Returns:
        List of class objects found in the file
    """

phlop/proc.py (2)
Line range hint 82-86: Replace placeholder implementation with proper binary existence check.

The current implementation raises an error, and the commented approach using which is not cross-platform. Consider using Python's built-in shutil.which(). Here's a better implementation:

 def binary_exists_on_path(bin):
-    """
-    https://linux.die.net/man/1/which
-    """
-    raise ValueError("do better")
-    return run(f"which {bin}").exitcode == 0
+    """Check if a binary exists in PATH.
+
+    Args:
+        bin (str): Name of the binary to check
+
+    Returns:
+        bool: True if the binary exists and is executable
+    """
+    import shutil
+    return shutil.which(bin) is not None
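As a quick standalone illustration of the suggestion above (a sketch, not the project's code; the probed binary names are assumptions), shutil.which returns the resolved path of an executable found on PATH, or None when it is absent:

```python
import shutil

def binary_exists_on_path(name):
    # shutil.which searches PATH and is cross-platform,
    # unlike shelling out to `which`.
    return shutil.which(name) is not None

print(shutil.which("sh"))  # e.g. /bin/sh on POSIX systems
assert binary_exists_on_path("sh")
assert not binary_exists_on_path("definitely-not-a-real-binary-xyz")
```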
Line range hint 35-63: Consider improving type safety and error handling.

The run_mp function has complex error handling and could benefit from:
- Type hints for better code maintainability
- Simplified executor shutdown logic
- Consistent subprocess usage pattern

Consider these improvements:

-def run_mp(cmds, N_CORES=None, **kwargs):
+def run_mp(cmds: list[str], N_CORES: int | None = None, **kwargs) -> list[RunTimer]:
     """
     spawns N_CORES threads (default=len(cmds)) running commands and waiting for results
-    https://docs.python.org/3/library/concurrent.futures.html
+
+    Args:
+        cmds: List of commands to execute
+        N_CORES: Number of parallel threads (defaults to len(cmds))
+        **kwargs: Additional arguments passed to run()
+
+    Returns:
+        List of RunTimer results
+
+    Raises:
+        ProcessNonZeroExitCode: If any command fails and check=True
     """
     import concurrent.futures
+    from typing import List
     if N_CORES is None:
         N_CORES = len(cmds)
     with concurrent.futures.ThreadPoolExecutor(max_workers=N_CORES) as executor:
         jobs = [executor.submit(run, cmd, **kwargs) for cmd in cmds]
-        results = []
+        results: List[RunTimer] = []
         for future in concurrent.futures.as_completed(jobs):
             try:
                 results += [future.result()]
                 if future.exception() is not None:
                     raise future.exception()
             except Exception as exc:
                 if kwargs.get("check", False):
-                    dic = {"wait": False}
-                    if sys.version_info > (3, 8):
-                        dic["cancel_futures"] = True
-                    executor.shutdown(**dic)
+                    # Cancel all pending futures and shutdown
+                    for job in jobs:
+                        job.cancel()
+                    executor.shutdown(wait=False, cancel_futures=True)
                     raise exc
                 else:
                     print(f"run_mp generated an exception: {exc}")

phlop/testing/test_cases.py (2)
145-147: Consider handling edge cases in command processing.

The current implementation works for basic commands but might need to handle these edge cases:
- Commands with spaces in paths
- Commands with escaped characters
- Symlinks that might need their paths preserved

Consider enhancing the implementation:

 def binless(test_case):
     if test_case.cmd.startswith("/usr/"):
-        bits = test_case.cmd.split(" ")
-        test_case.cmd = " ".join([bits[0].split("/")[-1]] + bits[1:])
+        import shlex
+        bits = shlex.split(test_case.cmd)
+        # Preserve symlinks by using os.path.realpath if needed
+        binary = bits[0].split("/")[-1]
+        test_case.cmd = " ".join([shlex.quote(binary)] + [shlex.quote(b) for b in bits[1:]])
     return test_case
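To make the quoting concern concrete, here is a small sketch (the command and path are made-up examples) contrasting naive whitespace splitting with shlex, which respects shell quoting rules:

```python
import shlex

# str.split breaks on every space, mangling quoted paths;
# shlex.split respects shell quoting rules.
cmd = '/usr/bin/tool --input "/tmp/My Files/data.txt" -v'

naive = cmd.split(" ")
proper = shlex.split(cmd)

print(naive)   # ['/usr/bin/tool', '--input', '"/tmp/My', 'Files/data.txt"', '-v']
print(proper)  # ['/usr/bin/tool', '--input', '/tmp/My Files/data.txt', '-v']

# shlex.quote re-escapes an argument safely for reassembly into a command string:
print(shlex.quote("/tmp/My Files/data.txt"))
```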
148-148: Remove commented debug print statement.

The commented debug print statement doesn't provide value and should be removed.

-    # print("test_case.cmd ", test_case.cmd)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)
- phlop/proc.py (1 hunks)
- phlop/procs/runtimer.py (1 hunks)
- phlop/reflection.py (1 hunks)
- phlop/testing/parallel_processor.py (4 hunks)
- phlop/testing/test_cases.py (1 hunks)
- phlop/timing/scope_timer.py (1 hunks)
- pyproject.toml (1 hunks)
- setup.py (1 hunks)
- sh/lint.sh (1 hunks)
✅ Files skipped from review due to trivial changes (3)
- phlop/timing/scope_timer.py
- pyproject.toml
- setup.py
🧰 Additional context used
🪛 Ruff
phlop/procs/runtimer.py
49-49: Use a context manager for opening files (SIM115)
50-50: Use a context manager for opening files (SIM115)
63-63: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
🔇 Additional comments (4)
sh/lint.sh (1)
10-10: Verify the impact of disabling black formatter.

Commenting out the black formatter could lead to inconsistent code formatting across the project. While other tools like isort and ruff are still active, they serve different purposes and may not fully cover formatting consistency.

Consider:
- Adding a comment explaining why black was disabled
- Documenting the formatting standards now that black is disabled
- Ensuring CI checks are updated accordingly
✅ Verification successful
Let me try a different approach to verify the impact of disabling black formatter by checking the Python files directly.
Let me check one more thing to see if there are other formatting tools in use that might be handling the formatting responsibilities.
Disabling black formatter is acceptable due to redundant formatting coverage
Based on the verification results, commenting out black is acceptable because:
- Ruff is configured and active in ruff.toml with formatting-related rules (E7, E9)
- The project maintains code style through multiple tools:
  - isort for import sorting
  - ruff for code style and formatting
  - pylint for error checking
  - autoflake for removing unused imports

The removal of black eliminates potential formatting conflicts while maintaining code quality through other tools. No further action needed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if black formatting is enforced elsewhere

# Test 1: Check if black is listed in project dependencies
rg -A 5 "black" pyproject.toml setup.py

# Test 2: Check for other potential formatting configurations
fd -e toml -e cfg -e ini --exec rg -l "black"

# Test 3: Check if black is run in CI
fd -e yml -e yaml --exec rg -l "black"

Length of output: 124
Script:
#!/bin/bash
# Check Python files for formatting-related evidence

# 1. Look for any remaining black configuration or disabled comments
rg -i "# *black" --type py

# 2. Check if there are any Python files that might be affected
fd -e py --type f | wc -l

# 3. Look for other formatting tools configuration in pyproject.toml
if [ -f "pyproject.toml" ]; then
    echo "=== pyproject.toml contents ==="
    cat pyproject.toml
fi

# 4. Check for any pre-commit hooks that might handle formatting
if [ -f ".pre-commit-config.yaml" ]; then
    echo "=== .pre-commit-config.yaml contents ==="
    cat .pre-commit-config.yaml
fi

Length of output: 729
Script:
#!/bin/bash
# Check for other formatting tools and their configurations

# 1. Check ruff configuration which might handle formatting
rg -l "ruff" .
if [ -f "ruff.toml" ]; then
    echo "=== ruff.toml contents ==="
    cat ruff.toml
fi

# 2. Look for formatting-related settings in setup.cfg
if [ -f "setup.cfg" ]; then
    echo "=== setup.cfg contents ==="
    cat setup.cfg
fi

# 3. Check the actual lint.sh file to see all active linting tools
echo "=== lint.sh contents ==="
cat sh/lint.sh

Length of output: 823
phlop/reflection.py (1)
19-19: Verify the impact of changing the default error handling behavior.

Changing the fail_on_import_error default from False to True makes the function fail fast by default. While this is generally good practice, it could potentially break existing code that relies on silent failures.

Let's verify the usage to assess the impact:
✅ Verification successful
Let me analyze the results and generate one more verification to get the complete picture:
The initial results show that the existing usage already explicitly sets fail_on_import_error=True, but let's get more context about where and how it's being used to make a definitive conclusion.
Default parameter change is safe to proceed
The change to make fail_on_import_error=True by default is safe because:
- The only existing usage found in the codebase already explicitly sets fail_on_import_error=True
- This indicates that the new default actually aligns with the current usage pattern
- No instances of relying on the old default behavior (False) were found
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for direct calls to classes_in_file to identify potential breaking changes
rg -l "classes_in_file\(" | grep -v "reflection.py" | xargs -I {} rg -A 2 "classes_in_file\(" {}

Length of output: 263
Script:
#!/bin/bash
# Let's find the file containing this usage and get more context
rg -l "classes_in_file\(" | grep -v "reflection.py" | xargs -I {} rg -B 5 -A 2 "classes_in_file\(" {}

Length of output: 503
phlop/testing/parallel_processor.py (2)
17-18: LGTM! Simple exception class definition is appropriate here.
38-38: Verify error handling with capture_output=False.

Setting capture_output=False means the output goes directly to stdout/stderr. This might affect the error handling at line 40 where self.run.stderr is checked. Ensure this change doesn't break error detection and reporting.

✅ Verification successful
Let me gather more context about how self.run is initialized and used to better understand the impact of capture_output=False.
Let me gather one more piece of information to understand the complete context of the CallableTest class.
Let me try one last search with a different pattern to understand the complete implementation.
Based on the comprehensive context gathered, I can now provide the final response:
Setting capture_output=False is safe and intentional
The code is properly handling the stderr access even with capture_output=False because:
- The custom run() function from phlop/proc.py is being used, not Python's subprocess.run directly
- The RunTimer class in phlop/procs/runtimer.py handles both capture_output=True and False cases:
  - When capture_output=False, it redirects output to log files if log_file_path is provided
  - The stderr and exitcode properties are still accessible regardless of the capture_output setting
- The error handling at lines 46-47 will work correctly as the RunTimer class maintains these properties in both modes

🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if stderr is still accessible when capture_output is False
ast-grep --pattern 'if self.run.stderr and self.run.exitcode != 0: $$$'

Length of output: 245
Script:
#!/bin/bash
# Check how self.run is initialized and used
ast-grep --pattern 'class CallableTest: $$$ def __call__($$$) { $$$ } $$$'

Length of output: 88
Script:
#!/bin/bash
# Check if subprocess.run is imported and how it's used
rg "subprocess\.run|from subprocess import run" -A 3

Length of output: 352
Script:
#!/bin/bash
# Look for the complete run() call
rg "run\(" -A 3 phlop/testing/parallel_processor.py

Length of output: 176
Script:
#!/bin/bash
# Get the complete CallableTest class implementation
rg -U "class CallableTest.*?def" phlop/testing/parallel_processor.py -A 50

Length of output: 76
Script:
#!/bin/bash
# Get the complete file content to understand the implementation
cat phlop/testing/parallel_processor.py

Length of output: 3920
Script:
#!/bin/bash
# Check if there are any other references to stderr or capture_output
rg "stderr|capture_output" -B 2 -A 2

Length of output: 8525
@@ -26,63 +24,99 @@ def __init__(
        working_dir=None,
        log_file_path=None,
        logging=2,
        popen=True,
Changing default popen=True may introduce breaking changes

Setting popen=True as the default changes the behavior of RunTimer to use subprocess.Popen instead of subprocess.run. This may affect existing code that relies on the previous default behavior of using subprocess.run.

Consider setting popen=False as the default to maintain backward compatibility, or document this change clearly if the behavior change is intentional.
phlop/procs/runtimer.py (Outdated)

            self._popen(*kwargs, **ekwargs)
        else:
            self._run(
                check=check, capture_output=capture_output, *kwargs, **ekwargs
Avoid star-arg unpacking after keyword arguments

Using *kwargs after keyword arguments is discouraged and can cause unexpected behavior. Adjusting the argument order or unpacking method can prevent this issue. Replacing *kwargs with **kwargs as suggested resolves this concern.
🧰 Tools
🪛 Ruff
63-63: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
phlop/procs/runtimer.py (Outdated)

            self._popen(*kwargs, **ekwargs)
        else:
            self._run(
                check=check, capture_output=capture_output, *kwargs, **ekwargs
Incorrect unpacking of kwargs; use **kwargs instead of *kwargs

Unpacking kwargs using *kwargs passes the dictionary keys as positional arguments, which is not intended and can lead to errors. Since kwargs is a dictionary of keyword arguments, it should be unpacked using **kwargs.

Replace *kwargs with **kwargs to correctly pass the additional keyword arguments:
- self._popen(*kwargs, **ekwargs)
+ self._popen(**kwargs, **ekwargs)
Similarly, adjust the call to _run:
- self._run(
- check=check, capture_output=capture_output, *kwargs, **ekwargs
- )
+ self._run(
+ check=check, capture_output=capture_output, **kwargs, **ekwargs
+ )
This change also resolves the static analysis warning about star-arg unpacking after a keyword argument.
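To see concretely why the star count matters, here is a tiny self-contained sketch (the helper report() is hypothetical, not part of phlop): single-star unpacking iterates the dict's keys and passes them positionally, while double-star unpacking passes them as keyword arguments.

```python
def report(check=False, capture_output=False, **extra):
    # Collect how the arguments actually arrived.
    return {"check": check, "capture_output": capture_output, **extra}

kwargs = {"timeout": 5}

# *kwargs iterates the dict's KEYS, so the string "timeout" is passed
# positionally -- here it lands in `check`, silently wrong:
wrong = report(*kwargs)
assert wrong["check"] == "timeout"

# **kwargs passes key/value pairs as proper keyword arguments:
right = report(**kwargs)
assert right == {"check": False, "capture_output": False, "timeout": 5}
```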
🧰 Tools
🪛 Ruff
63-63: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
            stdout=open(f"{log_file_path}.stdout", "w"),
            stderr=open(f"{log_file_path}.stderr", "w"),
Use context managers to safely handle file resources
When opening files with open(), it's best practice to use context managers (with statements) to ensure files are properly closed after use, even if an exception occurs. This prevents potential resource leaks.
Modify the code to manage file handles and ensure they are closed after the subprocess finishes:
- ekwargs.update(
- dict(
- stdout=open(f"{log_file_path}.stdout", "w"),
- stderr=open(f"{log_file_path}.stderr", "w"),
- ),
- )
+ stdout_file = open(f"{log_file_path}.stdout", "w")
+ stderr_file = open(f"{log_file_path}.stderr", "w")
+ ekwargs.update(
+ dict(
+ stdout=stdout_file,
+ stderr=stderr_file,
+ ),
+ )
Then, after the subprocess completes, ensure the files are closed in the _run and _popen methods:
# In _run method, after subprocess.run():
+ if not capture_output and log_file_path:
+ stdout_file.close()
+ stderr_file.close()
# In _popen method, after subprocess.Popen():
+ if not capture_output and log_file_path:
+ stdout_file.close()
+ stderr_file.close()
Committable suggestion was skipped due to low confidence.
🧰 Tools
🪛 Ruff
49-49: Use a context manager for opening files (SIM115)
50-50: Use a context manager for opening files (SIM115)
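The context-manager pattern can be sketched in isolation like this (the log path and child command are illustrative assumptions, not the project's code):

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical log path for this sketch, not taken from the project.
log_file_path = os.path.join(tempfile.mkdtemp(), "job")

# A with-statement guarantees both handles are flushed and closed even
# if the subprocess raises, avoiding the resource-leak warning (SIM115).
with open(f"{log_file_path}.stdout", "w") as out, open(f"{log_file_path}.stderr", "w") as err:
    subprocess.run([sys.executable, "-c", "print('hello')"], stdout=out, stderr=err, check=True)

with open(f"{log_file_path}.stdout") as f:
    print(f.read().strip())  # hello
```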
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- phlop/proc.py (1 hunks)
- phlop/testing/parallel_processor.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- phlop/proc.py
🔇 Additional comments (3)
phlop/testing/parallel_processor.py (3)
15-16: Consider making the timeout configurable

The hardcoded timeout of 3600 seconds (1 hour) might not be suitable for all environments. Consider making this configurable through environment variables. The previous review suggestion about making the timeout configurable through PHLOP_QUEUE_TIMEOUT is still valid and would improve the flexibility of the code.
100-100: Fix the Process creation argument syntax

The previous review comment about the incorrect Process creation syntax is still valid. The extra parentheses around pqueue are redundant and misleading:

-Process(target=runner, args=(test, (pqueue)))
+Process(target=runner, args=(test, pqueue))
39-39: Verify stderr handling with capture_output=False

Setting capture_output=False means the command's output goes directly to stdout/stderr. This might affect the error handling logic that checks self.run.stderr. Please verify that error detection still works as expected.

Also applies to: 48-49
✅ Verification successful
Let me analyze the code further to understand the impact of capture_output=False.
Based on the code analysis, I can now provide the final response:
Setting capture_output=False is correct and won't affect error handling
The error handling in CallableTest.__call__ will work correctly with capture_output=False because:
- The run() function uses RunTimer, which handles both capture and non-capture modes appropriately
- When capture_output=False, output is redirected to log files ({log_file_path}.stderr) which are still accessible for error checking
- The error check if self.run.stderr and self.run.exitcode != 0 remains valid as RunTimer maintains the stderr property in both modes
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if there are any other places in the codebase
# that rely on captured output from this class.

# Search for references to CallableTest and stderr/stdout usage
rg -A 5 "CallableTest.*stderr|stdout"

Length of output: 6653
Script:
#!/bin/bash
# Check the implementation of CallableTest class and its error handling
ast-grep --pattern 'class CallableTest { $$$ __call__($$$) { $$$ } $$$ }'

Length of output: 87
Script:
#!/bin/bash
# Let's check the Python implementation since ast-grep didn't work for Python files
rg -A 20 "class CallableTest"

Length of output: 1482
Script:
#!/bin/bash
# Let's check the implementation of the run function being used
rg -A 15 "def run\(" phlop/

Length of output: 775
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (2)
phlop/os.py (2)
Line range hint 41-45: Maintain consistent error handling across file operations.

For consistency with the improvements made to read_file, the error handling in write_to_file should also use proper exception chaining.

Apply this diff to improve error handling:

     with open(file, mode) as f:
         f.write(contents)
 except IOError as e:
-    raise RuntimeError(f"Failed to write to file {file}: {e}")
+    raise RuntimeError(f"Failed to write to file {file}: {e}") from e

🧰 Tools
🪛 Ruff
48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
57-57: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
51-51: Add docstring to document function behavior.

The new read_file function would benefit from a docstring explaining its purpose, parameters, return value, and the types of exceptions it may raise.

Add documentation like this:

 def read_file(file):
+    """Read contents of a file.
+
+    Args:
+        file: Path to the file to read.
+
+    Returns:
+        str: Contents of the file.
+
+    Raises:
+        RuntimeError: If the file cannot be read.
+    """
     try:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- phlop/os.py (1 hunks)
- phlop/testing/parallel_processor.py (5 hunks)
🧰 Additional context used
🪛 Ruff
phlop/os.py
57-57: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
🔇 Additional comments (3)
phlop/testing/parallel_processor.py (3)
108-108: Fix Process args tuple syntax

The extra parentheses issue identified in the previous review still exists.
122-128: Improve queue error handling

The issues with broad exception handling and lack of backoff strategy identified in the previous review still exist.
40-40: Verify output handling with logging modes

Setting capture_output=False means output will be streamed directly rather than captured. Please verify this change works correctly with all LoggingMode values, especially ON_NON_ZERO_EXIT_CODE.

✅ Verification successful
Let me gather more information about how the output handling is implemented.
Based on the code analysis, I can now generate the final response:
Setting capture_output=False is correctly handled with all logging modes

The implementation in the RunTimer class (phlop/procs/runtimer.py) properly handles the case when capture_output=False:
- For LoggingMode.OFF (0): output streams directly to stdout/stderr
- For LoggingMode.ON_NON_ZERO_EXIT_CODE (1) and LoggingMode.ALWAYS (2): if a log file path is provided, output is redirected to .stdout and .stderr files via file descriptors (see the lines with stdout=open(f"{log_file_path}.stdout", "w"))
Script:
#!/bin/bash
# Search for LoggingMode usage to understand the impact
rg -A 5 "LoggingMode"

Length of output: 1084
Script:
#!/bin/bash
# Search for the CallableTest class implementation and its usage
ast-grep --pattern 'class CallableTest { $$$ __call__($$$) { $$$ } $$$ }'

# Also search for any output handling related code
rg -A 5 "capture_output"

Length of output: 7597
    def print_log_files(self):
        print(self.run.stdout)
        print(self.run.stderr)
        if self.test_case.log_file_path:
            print(read_file(f"{self.test_case.log_file_path}.stdout"))
            print(read_file(f"{self.test_case.log_file_path}.stdout"))
Fix duplicate stdout log print
There's a bug where stdout is printed twice instead of printing stderr.
def print_log_files(self):
print(self.run.stdout)
print(self.run.stderr)
if self.test_case.log_file_path:
print(read_file(f"{self.test_case.log_file_path}.stdout"))
- print(read_file(f"{self.test_case.log_file_path}.stdout"))
+ print(read_file(f"{self.test_case.log_file_path}.stderr"))
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- phlop/os.py (1 hunks)
- phlop/testing/parallel_processor.py (5 hunks)
🧰 Additional context used
🪛 Ruff
phlop/os.py
56-56: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
64-64: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
🔇 Additional comments (6)
phlop/testing/parallel_processor.py (6)
15-16: Consider making the timeout configurable

A 1-hour timeout might be excessive for most test scenarios. Consider making this configurable through environment variables. The suggestion from the past review still applies. Consider implementing:

-timeout = 60 * 60
+timeout = int(os.getenv('PHLOP_QUEUE_TIMEOUT', str(60 * 60)))  # 1 hour default
51-56: Fix duplicate stdout log print

There's a bug where stdout is printed twice instead of printing stderr. The suggestion from the past review still applies:

-        print(read_file(f"{self.test_case.log_file_path}.stdout"))
+        print(read_file(f"{self.test_case.log_file_path}.stderr"))
76-91: LGTM! Good addition for debugging.

The print_pending function provides useful visibility into running jobs by showing both stdout and stderr of pending tests.
137-144: Improve queue error handling

The current implementation has several issues:
- Catching generic Exception is too broad
- The error handling doesn't distinguish between queue timeouts and other errors

The suggestion from the past review still applies:

-    try:
-        proc = queue.get(timeout=timeout)
-    except Exception:
-        logger.info("Queue timeout - polling")
-        continue
+    try:
+        proc = queue.get(timeout=timeout)
+    except queue.Empty:
+        logger.info(f"No jobs completed within {timeout} seconds - continuing to poll")
+        time.sleep(1)  # Add backoff to reduce CPU usage
+        continue
+    except Exception as e:
+        logger.error(f"Unexpected error while polling queue: {e}")
+        raise
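The distinction between an empty-queue timeout and a real error can be sketched with the standard-library queue (the names here are hypothetical; multiprocessing queues raise the same Empty exception, defined in the queue module):

```python
from queue import Queue, Empty

def drain(q, timeout=0.1):
    """Collect items until get() times out on an empty queue."""
    items = []
    while True:
        try:
            items.append(q.get(timeout=timeout))
        except Empty:
            # An empty-queue timeout is a normal stop condition, not an error,
            # so it should be handled separately from unexpected exceptions.
            return items

q = Queue()
for i in range(3):
    q.put(i)
print(drain(q))  # [0, 1, 2]
```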
152-153: Fix exit code failure check

Using exitcode > 0 might miss negative exit codes, which also indicate failure. The suggestion from the past review still applies:

-        if proc.run.exitcode > 0:
+        if proc.run.exitcode != 0:
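Why negative codes matter can be shown with a throwaway child process (illustrative only): in Python's subprocess API, a process killed by a signal reports a negative return code, which an `> 0` check would classify as success.

```python
import subprocess
import sys

# A child that kills itself with SIGKILL, mimicking a signal-terminated
# (e.g. OOM-killed) test process.
proc = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)
print(proc.returncode)  # negative on POSIX: -signal number, e.g. -9
assert proc.returncode != 0       # the corrected check catches this failure
assert not (proc.returncode > 0)  # the original `> 0` check would miss it
```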
39-39: Verify the impact of disabling output capture

Setting capture_output=False might affect how test output is processed. Ensure this change doesn't break any output parsing or logging functionality.
def read_file(file):
    try:
        with open(file, "r") as f:
            return f.read()
    except IOError as e:
        raise RuntimeError(f"Failed to read file {file}: {e}")
Enhance error handling and file operations.
The function needs improvements in error handling and file operations:
- Add exception chaining for better error traceability
- Add file existence check
- Specify encoding for text file operations
Apply this diff to address the issues:
def read_file(file):
try:
+ if not os.path.exists(file):
+ raise FileNotFoundError(f"File not found: {file}")
- with open(file, "r") as f:
+ with open(file, "r", encoding="utf-8") as f:
return f.read()
except IOError as e:
- raise RuntimeError(f"Failed to read file {file}: {e}")
+ raise RuntimeError(f"Failed to read file {file}: {e}") from e
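The effect of chaining with "from e" can be seen in a standalone sketch (mirroring the suggested diff, not the project's actual module): the original error survives as __cause__, so the underlying traceback is preserved.

```python
def read_file(path):
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except IOError as e:
        # `from e` chains the original error as __cause__,
        # preserving the underlying traceback for debugging.
        raise RuntimeError(f"Failed to read file {path}: {e}") from e

try:
    read_file("/no/such/file")
except RuntimeError as err:
    cause = err.__cause__
    print(type(cause).__name__)  # FileNotFoundError
```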
🧰 Tools
🪛 Ruff
56-56: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
def read_last_lines_of(file, n=10):
    try:
        with open(file, "r") as f:
            return f.readlines()[:10]
    except IOError as e:
        raise RuntimeError(f"Failed to read file {file}: {e}")
Fix critical logic error and improve implementation.
The function has several issues:
- It's reading the first 10 lines instead of the last N lines (using [:10] instead of [-n:])
- Ignoring the n parameter and hardcoding 10
- Same issues as read_file: missing exception chaining, file existence check, and encoding specification
- Inefficient for large files as it reads all lines into memory
Apply this diff for a more efficient implementation using deque:
+from collections import deque
+
def read_last_lines_of(file, n=10):
try:
- with open(file, "r") as f:
- return f.readlines()[:10]
+ if not os.path.exists(file):
+ raise FileNotFoundError(f"File not found: {file}")
+ with open(file, "r", encoding="utf-8") as f:
+ return list(deque(f, n))
except IOError as e:
- raise RuntimeError(f"Failed to read file {file}: {e}")
+ raise RuntimeError(f"Failed to read file {file}: {e}") from e
The deque solution is memory-efficient, as it only keeps the last N lines in memory while reading the file.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Before:
def read_last_lines_of(file, n=10):
    try:
        with open(file, "r") as f:
            return f.readlines()[:10]
    except IOError as e:
        raise RuntimeError(f"Failed to read file {file}: {e}")
After:
from collections import deque

def read_last_lines_of(file, n=10):
    try:
        if not os.path.exists(file):
            raise FileNotFoundError(f"File not found: {file}")
        with open(file, "r", encoding="utf-8") as f:
            return list(deque(f, n))
    except IOError as e:
        raise RuntimeError(f"Failed to read file {file}: {e}") from e
🧰 Tools
🪛 Ruff
64-64: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
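The memory argument for the deque approach can be checked with a short sketch (tail_of is a hypothetical name used for illustration):

```python
from collections import deque
import os
import tempfile

def tail_of(path, n=10):
    # deque with maxlen=n discards older lines while consuming the file
    # iterator, so at most n lines are ever held in memory
    with open(path, "r", encoding="utf-8") as f:
        return list(deque(f, maxlen=n))

# write 1000 lines, then read back only the last 3
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("".join(f"line {i}\n" for i in range(1000)))

print(tail_of(tmp.name, n=3))  # ['line 997\n', 'line 998\n', 'line 999\n']
os.unlink(tmp.name)
```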
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (1)
phlop/testing/parallel_processor.py (1)
15-16: Consider adding timeout validation.
While the timeout is already suggested to be configurable via environment variables in past reviews, it would be beneficial to add validation to ensure the timeout value is reasonable (e.g., greater than 0 and less than some maximum value).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- phlop/procs/runtimer.py (1 hunks)
- phlop/testing/parallel_processor.py (5 hunks)
🧰 Additional context used
🪛 Ruff
phlop/procs/runtimer.py
49-49: Use a context manager for opening files (SIM115)
50-50: Use a context manager for opening files (SIM115)
🔇 Additional comments (5)
phlop/testing/parallel_processor.py (2)
77-92: LGTM! Well-structured debug output implementation.
The function provides clear visibility into pending jobs with:
- Clear start/end markers
- Separate stdout/stderr sections
- Proper null checks before printing
40-40: Verify the impact of setting capture_output=False.
Setting capture_output=False means the output will be written directly to the terminal/log file. This might affect error handling in the waiter function, where we check proc.run.stderr. Please ensure this change doesn't break error reporting.
✅ Verification successful
Let me gather more context about how the subprocess is being used.
Let me check the complete implementation of the CallableTest class to understand how it's used.
Setting capture_output=False is the correct implementation
The change to set capture_output=False in CallableTest.__call__ is correct because:
- The RunTimer class properly handles both capture_output modes:
  - When capture_output=False, it redirects output to log files if log_file_path is provided
  - Error handling still works, as the RunTimer sets stdout/stderr from the log files
- The error checking in CallableTest using self.run.stderr remains valid, since RunTimer populates these fields from either captured output or log files
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any references to run.stdout or run.stderr to verify they're not used when capture_output is False
rg -A 3 'run\.(stdout|stderr)'
Length of output: 1174
Script:
#!/bin/bash
# Search for the subprocess.run call and its context
ast-grep --pattern 'self.run = subprocess.run($$$)'
Length of output: 138
Script:
#!/bin/bash
# Search for the class definition and initialization
ast-grep --pattern 'class CallableTest($$$)'
Length of output: 46
Script:
#!/bin/bash
# Look for any error handling patterns
rg -A 5 'CalledProcessError'
Length of output: 732
Script:
#!/bin/bash
# Search for CallableTest implementation
rg -A 20 "class CallableTest"
Length of output: 1482
Script:
#!/bin/bash
# Search for where capture_output is used in the context
rg -B 2 -A 2 "capture_output"
Length of output: 6548
phlop/procs/runtimer.py (3)
27-35: Addition of 'popen' parameter enhances flexibility
The introduction of the popen parameter in the __init__ method allows users to choose between subprocess.run and subprocess.Popen, increasing the flexibility of the RunTimer class. The initialization of related attributes improves clarity.
75-96: Efficient command execution with '_run' method
The _run method effectively executes commands using subprocess.run, capturing output, errors, execution time, and exit codes. Proper exception handling ensures robustness, and logging is handled according to the specified level.
97-119: Robust implementation of '_popen' method
The _popen method provides an alternative execution path using subprocess.Popen. It correctly handles output and error streams, captures execution metrics, and ensures resources are cleaned up appropriately.
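The behavioural difference between the two execution paths can be sketched like this (a rough illustration, not phlop's actual RunTimer code):

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('hello')"]

# subprocess.run: blocks until the child exits, then returns everything at once
res = subprocess.run(cmd, capture_output=True, text=True, check=True)

# subprocess.Popen: returns immediately; the caller drives waiting and IO,
# which allows streaming output or custom timeout handling while the child runs
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
stdout, stderr = proc.communicate()  # waits for exit and drains both pipes

print(res.stdout.strip(), stdout.strip())  # hello hello
```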
proc = None
try:
    proc = queue.get(timeout=timeout)
except Exception:
    logger.info("Queue timeout - polling")
    print_pending(cc, batches, logging)
    continue
🛠️ Refactor suggestion
Consider potential race conditions in error handling.
The current implementation might miss failures if:
- A process fails right after the timeout exception
- Multiple processes fail simultaneously
Consider adding a check for failed processes after the timeout:
try:
proc = queue.get(timeout=timeout)
except Exception:
logger.info("Queue timeout - polling")
print_pending(cc, batches, logging)
+ # Check if any process has failed
+ for batch_index, procs in enumerate(cc.procs):
+ for proc in procs:
+ if not proc.is_alive() and proc.exitcode != 0:
+ raise TestCaseFailure("Process failed during timeout")
continue
Committable suggestion was skipped due to low confidence.
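The sweep-after-timeout idea can be illustrated with plain subprocesses (a simplified stand-in for the queue/worker setup above):

```python
import subprocess
import sys
import time

# two short-lived children: one succeeds, one exits with a non-zero code
cmds = [
    [sys.executable, "-c", "print('ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
]
procs = [subprocess.Popen(c, stdout=subprocess.PIPE, text=True) for c in cmds]

# stand-in for queue.get(timeout=...): wait until everything has exited
while any(p.poll() is None for p in procs):
    time.sleep(0.05)

# after the "timeout", sweep for children that died with a failure code
failed = [p.returncode for p in procs if p.returncode != 0]
print(failed)  # [3]
```

The key point is the second pass over the process list: a child that dies without ever reporting a result would otherwise be missed while the loop blocks on the queue.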
            if logging == 2 and capture_output:
                self.stdout = decode_bytes(self.run.stdout)
                self.stderr = decode_bytes(self.run.stderr)
        except subprocess.CalledProcessError as e:
            # only triggers on failure if check=True
            self.run_time = time.time() - start
            self.exitcode = e.returncode
            if logging >= 1 and capture_output:
                self.stdout = decode_bytes(e.stdout)
                self.stderr = decode_bytes(e.stderr)
                logging = 2  # force logging as exception occurred
        if logging == 2 and capture_output and log_file_path:
            write_to_file(f"{log_file_path}.stdout", self.stdout)
            write_to_file(f"{log_file_path}.stderr", self.stderr)
🛠️ Refactor suggestion
Refactor common output handling into a helper method
The output handling logic in both _run and _popen is similar. To reduce code duplication and enhance maintainability, consider refactoring this common functionality into a separate helper method.
For example, create a method _handle_output:
def _handle_output(self, stdout, stderr, logging, capture_output, log_file_path):
if logging == 2 and capture_output:
self.stdout = decode_bytes(stdout)
self.stderr = decode_bytes(stderr)
if logging == 2 and capture_output and log_file_path:
write_to_file(f"{log_file_path}.stdout", self.stdout)
write_to_file(f"{log_file_path}.stderr", self.stderr)
Then modify _run and _popen to use this helper.
In _run:
self._handle_output(self.run.stdout, self.run.stderr, logging, capture_output, log_file_path)
In _popen:
self._handle_output(self.stdout, self.stderr, logging, capture_output, log_file_path)
Also applies to: 112-119
def go():
    if popen:
        self._popen(**ekwargs, **kwargs)
    else:
        self._run(
            check=check, capture_output=capture_output, **ekwargs, **kwargs
        )
Avoid potential keyword argument conflicts by merging dictionaries
Passing **ekwargs and **kwargs separately to methods can lead to conflicts if there are overlapping keys, potentially causing unexpected behavior. Consider merging the dictionaries before passing them to _popen and _run to ensure all keyword arguments are managed correctly.
Apply this change to merge the dictionaries:
def go():
if popen:
+ combined_kwargs = {**ekwargs, **kwargs}
+ self._popen(**combined_kwargs)
- self._popen(**ekwargs, **kwargs)
else:
+ combined_kwargs = {**ekwargs, **kwargs, 'check': check, 'capture_output': capture_output}
+ self._run(**combined_kwargs)
- self._run(
- check=check, capture_output=capture_output, **ekwargs, **kwargs
- )
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Before:
def go():
    if popen:
        self._popen(**ekwargs, **kwargs)
    else:
        self._run(
            check=check, capture_output=capture_output, **ekwargs, **kwargs
        )
After:
def go():
    if popen:
        combined_kwargs = {**ekwargs, **kwargs}
        self._popen(**combined_kwargs)
    else:
        combined_kwargs = {**ekwargs, **kwargs, 'check': check, 'capture_output': capture_output}
        self._run(**combined_kwargs)
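Why the merge matters: duplicate keys across two ** expansions in a call raise TypeError, while a dict literal merges silently with the rightmost value winning. A tiny illustration (the keys here are made up):

```python
ekwargs = {"cwd": "/tmp", "env": None}
kwargs = {"cwd": "/srv", "shell": False}

def f(**kw):
    return kw

# duplicate "cwd" across two ** expansions is rejected at call time
try:
    f(**ekwargs, **kwargs)
    duplicate_rejected = False
except TypeError:
    duplicate_rejected = True

# a dict literal merges silently; the rightmost source wins
combined = {**ekwargs, **kwargs, "check": True}
print(duplicate_rejected, combined["cwd"])  # True /srv
```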
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
src/phlop/timing/threaded_scope_timer.cpp (1)
41-41: Implementation looks good, but note the breaking change.
The use of std::make_shared with the new parameter order is correct. The duration calculation now() - start is appropriate for measuring the elapsed time. Note that changing the parameter order is a breaking change; ensure all call sites in the codebase have been updated to match the new order to prevent runtime errors.
inc/phlop/timing/threaded_scope_timer.hpp (1)
221-224: Optimize parameter passing in the BinaryTimerFileNode constructor
Consider passing _fn_id by value instead of by const reference. Since std::uint16_t is a small integral type, passing it by value is generally more efficient and consistent with the other parameters _start and _time.
Apply this diff to adjust the parameter:
- BinaryTimerFileNode(std::uint16_t const& _fn_id, std::uint64_t const _start,
+ BinaryTimerFileNode(std::uint16_t _fn_id, std::uint64_t const _start,
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- .github/workflows/build_nix.yml (1 hunks)
- inc/phlop/timing/scope_timer.hpp (2 hunks)
- inc/phlop/timing/threaded_scope_timer.hpp (7 hunks)
- mkn.yaml (2 hunks)
- src/phlop/timing/threaded_scope_timer.cpp (1 hunks)
🧰 Additional context used
📓 Learnings (1)
inc/phlop/timing/scope_timer.hpp (1)
Learnt from: PhilipDeegan
PR: PhilipDeegan/phlop#29
File: inc/phlop/timing/scope_timer.hpp:93-94
Timestamp: 2024-09-27T21:01:44.018Z
Learning: In this codebase, when passing small built-in types by value, parameters are declared as `const` to prevent modification within the function.
🔇 Additional comments (13)
mkn.yaml (2)
16-20: LGTM! Well-structured profile configuration.
The new threaded_scope_timer profile follows the established pattern:
- Inherits from the base profile
- Maintains a consistent structure with scope_timer
- Properly specifies source and test files
- Uses shared mode like other profiles
Profile configuration verified - all referenced files exist
The verification confirms that both the source file src/phlop/timing/threaded_scope_timer.cpp and the test file tests/timing/test_threaded_scope_timer.cpp exist in their specified locations, validating the correctness of the profile configuration.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence of source and test files referenced in the profile
# Test: Check if the referenced files exist
echo "Verifying source file..."
fd "threaded_scope_timer.cpp" src/phlop/timing/
echo "Verifying test file..."
fd "test_threaded_scope_timer.cpp" tests/timing/
Length of output: 295
1-1: Verify C++20 compatibility across the codebase.
The shebang line specifies -std=c++20. Ensure that all source files, especially the new threaded_scope_timer.cpp, are compatible with C++20 features.
✅ Verification successful
C++20 compatibility verified - no issues found
The codebase already uses modern C++ features that are well supported in C++20, including:
- Standard library features: <chrono>, <thread>, <mutex>
- Language features: constexpr, structured bindings, auto type deduction
The code doesn't use any C++20-specific features that would cause compatibility issues. The upgrade to C++20 in the build configuration is safe and appropriate for this codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for potential C++20 compatibility issues in source files
# Test: Look for C++20 specific features or potential compatibility issues
rg -g '*.{cpp,hpp,h}' -i '(concept|requires|coroutine|ranges|modules|constinit|consteval|co_await|co_yield|co_return)'
Length of output: 120
Script:
#!/bin/bash
# Let's check the actual source files content to understand the C++ features being used
# First, find all C++ source files
echo "=== C++ Source Files ==="
fd -e cpp -e hpp -e h
# Then, check for modern C++ features that might need C++20
echo -e "\n=== Modern C++ Usage ==="
rg -g '*.{cpp,hpp,h}' '\<(auto|decltype|nullptr|constexpr|noexcept|override|final|delete|default|lambda|thread|mutex|atomic)\>' -l
# Look for template usage which might benefit from C++20
echo -e "\n=== Template Usage ==="
rg -g '*.{cpp,hpp,h}' 'template\s*<' -l
# Check includes for modern C++ headers
echo -e "\n=== Modern C++ Headers ==="
rg -g '*.{cpp,hpp,h}' '#include\s*<(chrono|thread|mutex|atomic|memory|type_traits)>' -l
Length of output: 1198
Script:
#!/bin/bash
# Let's examine the actual content of the timing-related files to check C++ feature usage
echo "=== scope_timer.hpp ==="
cat inc/phlop/timing/scope_timer.hpp
echo -e "\n=== threaded_scope_timer.hpp ==="
cat inc/phlop/timing/threaded_scope_timer.hpp
echo -e "\n=== Modern C++ Feature Usage ==="
# Fixed the regex from previous attempt
rg -g '*.{cpp,hpp,h}' '(auto|decltype|nullptr|constexpr|noexcept|override|final|delete|default|lambda|thread|mutex|atomic)' --no-line-number
Length of output: 26038
.github/workflows/build_nix.yml (1)
30-30: Verify build configuration changes.
The build command changes look good but require verification of a few aspects:
- The new threaded_scope_timer profile
- C++20 compiler compatibility on Ubuntu latest
- Impact of warning level 9
Let's verify the configuration.
Consider adding documentation about the warning level configuration and its implications.
✅ Verification successful
Build configuration changes verified with minor documentation needs
The verification results show:
- ✓ threaded_scope_timer profile exists in mkn.yaml with proper source and test files
- ✓ C++20 upgrade appears safe, as no advanced C++20 features are currently used
- ⚠ Warning level 9 (-W 9) is not documented
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the new profile and C++ requirements
# Test 1: Check if threaded_scope_timer profile exists in mkn.yaml
echo "Checking for threaded_scope_timer profile..."
rg "threaded_scope_timer" "mkn.yaml"
# Test 2: Check for C++20 specific features in the codebase
echo "Checking for C++20 features..."
rg "concept|requires|constinit|consteval|co_await|co_yield|co_return|ranges::|views::"
# Test 3: Check if warning level 9 is documented
echo "Checking for warning level documentation..."
rg "warning.*level.*9|W.*9" "README.md" "docs/"
Length of output: 768
src/phlop/timing/threaded_scope_timer.cpp (1)
41-41: Verify RunTimerReportSnapshot constructor declaration.
The parameter order has changed to pass start as the third argument and duration as the fourth. Let's verify this matches the header declaration.
✅ Verification successful
Constructor declaration matches the implementation
The constructor declaration in both header files (inc/phlop/timing/threaded_scope_timer.hpp and inc/phlop/timing/scope_timer.hpp) matches the implementation:
- First parameter: RunTimerReport* s (self)
- Second parameter: RunTimerReport* p (parent)
- Third parameter: std::uint64_t const st (start time)
- Fourth parameter: std::uint64_t const t (duration)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find and verify the RunTimerReportSnapshot constructor declaration
# Test: Search for RunTimerReportSnapshot constructor declaration
rg -A 3 "RunTimerReportSnapshot\(" "inc/phlop/timing/"
Length of output: 711
inc/phlop/timing/scope_timer.hpp (2)
93-94: LGTM! Constructor signature improvement.
The change to pass uint64_t parameters by value aligns with the codebase's convention for small built-in types. This can improve performance by avoiding reference indirection while maintaining safety through const qualification.
170-171: LGTM! Improved constructor and member variable safety.
Two good improvements here:
- The constructor now takes small built-in types by value with const qualification, consistent with the codebase convention.
- Member variables are now const-qualified, ensuring immutability after construction, which enhances thread safety and prevents accidental modifications.
Also applies to: 178-180
inc/phlop/timing/threaded_scope_timer.hpp (7)
144-148: Enhancement of the RunTimerReportSnapshot constructor
The addition of the start parameter and member variable start in the RunTimerReportSnapshot constructor improves the granularity of timing data by capturing the start time of each snapshot.
157-157: Introduction of the start member variable
The new start member variable in RunTimerReportSnapshot holds the start time, complementing the existing time member and enabling more precise timing analysis.
196-196: Switch to std::chrono::steady_clock for reliable timing
Changing from std::chrono::system_clock to std::chrono::steady_clock in the now() function ensures that timing measurements are not affected by system clock adjustments, providing more consistent and reliable timing data.
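Python draws the same distinction: time.monotonic() behaves like std::chrono::steady_clock (it can never go backwards), while time.time() follows the adjustable wall clock like system_clock. A quick sketch:

```python
import time

# time.time() tracks the wall clock and can jump (NTP sync, manual changes);
# time.monotonic() is guaranteed never to go backwards, so it is the right
# choice for measuring durations
t0 = time.monotonic()
time.sleep(0.01)
elapsed = time.monotonic() - t0
print(f"elapsed: {elapsed:.4f}s")  # always a non-negative duration
```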
229-231: Effective use of const qualifiers for member variables
Making the fn_id, start, and time member variables const ensures their immutability after initialization, which is a good practice for maintaining data integrity.
261-261: Consistency in accessing key_ids with child nodes
Similar to the previous comment, ensure that the key c->childs[i]->self->k exists in key_ids before using it in node.kinder.emplace_back(...) to avoid potential runtime errors.
You can extend the previous verification script to include child nodes:
#!/bin/bash
# Description: Verify that all child keys are present in `key_ids`.
# Expectation: All child keys should be accounted for in `key_ids`.
# Search for all child key accesses.
rg 'key_ids\[std::string\{c->childs\[\]\->self->k\}\]'
249-250: Ensure existence of keys in key_ids before access
When accessing key_ids[std::string{trace->self->k}], it is assumed that the key exists in the key_ids map. Ensure that all keys are populated in key_ids prior to this access to prevent potential exceptions or undefined behavior.
Run the following script to confirm that all keys used are present in key_ids:
✅ Verification successful
Let me gather more context about how these keys are populated by examining the code structure and flow.
Let me check one final aspect - how the recurse_traces_for_keys function is called in relation to the JSON dumping.
Key access in key_ids is safe - keys are populated before access
The code is safe. The recurse_traces_for_keys function is called on each trace before the JSON dumping process begins, and it ensures all keys are populated in key_ids. The function checks for key existence with key_ids.count(s) and populates missing keys, recursively handling all child traces. This guarantees that when key_ids[std::string{trace->self->k}] is accessed during JSON generation, the key will exist.
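The populate-before-access pattern verified here can be mirrored in a few lines of Python (a simplified, non-recursive analogue of recurse_traces_for_keys; the function and key names are made up):

```python
def register_keys(keys, key_ids):
    # assign each distinct key a stable integer id before any lookup happens,
    # mirroring the key_ids.count(s) existence check in the C++ code
    for key in keys:
        if key not in key_ids:
            key_ids[key] = len(key_ids)  # next free id

key_ids = {}
register_keys(["fn_a", "fn_b", "fn_a", "fn_c"], key_ids)
print(key_ids)  # {'fn_a': 0, 'fn_b': 1, 'fn_c': 2}
# later accesses are now guaranteed to find their key
assert all(k in key_ids for k in ("fn_a", "fn_b", "fn_c"))
```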
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that all keys used in `emplace_back` are present in `key_ids`.
# Expectation: All keys should be found in `key_ids` without any missing entries.
# Search for all instances where keys are added to `key_ids`.
rg 'key_ids\.emplace\('
# Search for all instances where `key_ids` is accessed.
rg 'key_ids\['
# Compare the keys added versus keys accessed.
Length of output: 836
Script:
#!/bin/bash
# Get the context around where keys are added to key_ids
rg -B 5 -A 5 'key_ids\.emplace'
# Get the full context of the function where key_ids is accessed
ast-grep --pattern 'void dump_json($$$) {
$$$
}'
Length of output: 1600
Script:
#!/bin/bash
# Get the full context of the dump_json function and surrounding code
rg -B 20 -A 20 'void dump_json'
# Also check where recurse_traces_for_keys is called
rg 'recurse_traces_for_keys'
Length of output: 587
331-331: Verify compatibility of the updated output format in the _write method
The inclusion of node.start in the output changes the file format produced by the _write() method. Ensure that any code or tools that read this output file are updated to handle the new format, preventing potential parsing errors or data misinterpretation.
Run the following script to locate code that reads from the timing file and check if it accommodates the new format:
✅ Verification successful
Based on the search results, I can now provide a final response. The Python file parser in phlop/timing/scope_timer.py already handles the new format with start times. Here's the evidence:
- The file parser reads the timing data line by line and processes both formats:
def file_parser(times_filepath):
    with open(times_filepath, "r") as file:
        bits = line.split(" ")  # This split handles both old and new formats
- The test file tests/timing/test_scope_timer.py verifies the parsing functionality and doesn't make any assumptions about the specific format of the time data.
- The format change in the _write method is backward compatible since:
  - The function ID and time values remain in the same order
  - The colon separator in node.start << ":" << node.time ensures a clear distinction between start and duration values
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find code that reads the timing file to ensure it handles the updated format with `start` times.
# Search for instances of file reading operations on the timing file.
rg 'std::ifstream.*phlop_scope_times\.bin' -A 5
# Search for parsing logic that may be affected by the format change.
rg 'std::getline\(' -A 5
Length of output: 76
Script:
#!/bin/bash
# Let's try a broader search to find file operations and timing-related code
# Search for any mentions of the timing file
rg -i "phlop_scope_times" -A 5
# Search for file reading operations more broadly
rg "std::ifstream|fstream.*open|getline.*>>" -A 5
# Look for timing-related classes and methods
ast-grep --pattern 'class $_ {
$$$
read($$$) {
$$$
}
$$$
}'
# Search for any timing-related file operations
rg -i "timing.*\.(txt|bin|log|dat)" -A 5
Length of output: 1280
Script:
#!/bin/bash
# Let's examine the test file and related timing code
# Check the Python test file mentioned
cat tests/timing/test_scope_timer.py
# Look for any code that processes timing data
rg -g "*.{py,hpp,cpp}" "read.*time|parse.*time" -A 5
# Search for any file operations in timing-related files
rg -g "*timer*.{hpp,cpp,py}" "(open|read|parse|>>|<<)" -A 5
# Look specifically for timing data structures
ast-grep --pattern 'struct $_ {
$$$
time
$$$
}'
Length of output: 28481
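A whitespace-split parser of the kind described above can accept both the old "fn_id duration" and the new "fn_id start:duration" records; this sketch uses assumed field meanings and is not phlop's actual parser:

```python
def parse_record(line):
    bits = line.split(" ")
    fn_id, value = bits[0], bits[1]
    if ":" in value:
        # new format: the second field carries "start:duration"
        start, duration = value.split(":")
        return int(fn_id), int(start), int(duration)
    # old format: the second field is the duration alone
    return int(fn_id), None, int(value)

print(parse_record("3 1000:250"))  # (3, 1000, 250)
print(parse_record("3 250"))       # (3, None, 250)
```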
Summary by CodeRabbit
Release Notes
New Features
- RunTimer class for flexible command execution with improved error handling and logging capabilities.
- ProcessNonZeroExitCode class updated to include a body for better clarity.
- read_file function to read file contents with error handling.
- read_last_lines_of function to read the last few lines from a specified file.
- binless function for improved command string processing in test cases.
- threaded_scope_timer added to the build configuration for enhanced timing functionality.
Bug Fixes
- waiter function updated to prevent indefinite blocking during queue operations.
Documentation
Style
- scope_timer.py reformatted for better readability.