
popen options for subprocessing #32

Merged · 6 commits · Nov 4, 2024
Conversation

@PhilipDeegan (Owner) commented Oct 23, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced RunTimer class for flexible command execution with improved error handling and logging capabilities.
    • Updated ProcessNonZeroExitCode class to include a body for better clarity.
    • Introduced a new read_file function to read file contents with error handling.
    • Added read_last_lines_of function to read the last few lines from a specified file.
    • Modified binless function for improved command string processing in test cases.
    • Added a new profile threaded_scope_timer to the build configuration for enhanced timing functionality.
  • Bug Fixes

    • Improved error handling in the waiter function to prevent indefinite blocking during queue operations.
  • Documentation

    • Updated version of the project to 0.0.28 in configuration files.
  • Style

    • Code organization improvements in scope_timer.py for better readability.

coderabbitai bot commented Oct 23, 2024

Walkthrough

The pull request introduces several modifications across various files in the phlop project. Key changes include the enhancement of the RunTimer class to support different command execution methods, adjustments to error handling in the waiter function, and updates to the ProcessNonZeroExitCode class. Additionally, the classes_in_file function now raises errors on import failures by default, and the binless function has been refined for better command string processing. The project version has been incremented in both pyproject.toml and setup.py. New functions for file reading have also been added.

Changes

  • phlop/proc.py: Added body to ProcessNonZeroExitCode class; no changes to function signatures.
  • phlop/procs/runtimer.py: Updated RunTimer class to add popen parameter; replaced run with go; added _run and _popen methods for command execution.
  • phlop/reflection.py: Changed default value of fail_on_import_error in classes_in_file from False to True.
  • phlop/testing/parallel_processor.py: Introduced timeout variable for queue operations; modified CallableTest.__call__ method to set capture_output to False and added print_log_files method.
  • phlop/testing/test_cases.py: Altered binless function for command string processing; no changes to determine_cores_for_test_case.
  • phlop/timing/scope_timer.py: Cosmetic changes: reordered imports and added whitespace; no functional changes.
  • pyproject.toml: Updated project version from 0.0.27 to 0.0.28.
  • setup.py: Updated package version from 0.0.27 to 0.0.28.
  • phlop/os.py: Added new functions read_file(file) and read_last_lines_of(file, n=10) for file reading with error handling.
  • inc/phlop/timing/scope_timer.hpp: Updated constructor signatures for RunTimerReportSnapshot and BinaryTimerFileNode structs; made member variables const.
  • inc/phlop/timing/threaded_scope_timer.hpp: Modified constructors and added new member variables for timing accuracy in RunTimerReportSnapshot and BinaryTimerFileNode.
  • mkn.yaml: Added new profile threaded_scope_timer for build configuration.
  • src/phlop/timing/threaded_scope_timer.cpp: Updated constructor and destructor logic in scope_timer class to include new timing parameters.
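
At the heart of the PR is the new popen flag on RunTimer, which selects between subprocess.Popen and subprocess.run. A minimal sketch of that dispatch (the flag name follows the summary above; the body is simplified and illustrative, not the project's actual code):

import subprocess
import time

def go(cmd, popen=True, check=False, capture_output=True):
    # Dispatch between the two subprocess APIs, timing the call either way.
    start = time.time()
    if popen:
        proc = subprocess.Popen(
            cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        stdout, stderr = proc.communicate()
        exitcode = proc.returncode
    else:
        result = subprocess.run(
            cmd, shell=True, check=check, capture_output=capture_output
        )
        stdout, stderr, exitcode = result.stdout, result.stderr, result.returncode
    return exitcode, time.time() - start, stdout, stderr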

Possibly related PRs

  • start time with scope time #29: The changes in this PR involve modifications to the RunTimerReportSnapshot and BinaryTimerFileNode structures to include a start time parameter, which is directly related to the changes in the ProcessNonZeroExitCode class that also deals with timing and execution control.
  • variance across scope timing files (naive) #30: This PR enhances the functionality of the scope_timer.py file, which is relevant as it deals with timing data, similar to the changes made in the main PR regarding the run_mp function's control flow for executing commands.
  • use steady clock for scope timings #31: This PR updates the timing mechanism in the scope_timer struct to use steady_clock, which aligns with the timing and execution improvements made in the main PR's modifications to the ProcessNonZeroExitCode class.

Poem

In the land of code where rabbits play,
New features hop in, brightening the day.
With timers and processes, we leap and bound,
Error handling's better, solutions abound.
Version numbers rise, like carrots in bloom,
In this coding garden, there's always more room! 🐇✨



coderabbitai bot left a comment

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (6)
sh/lint.sh (1)

Line range hint 14-18: Optimize the autoflake execution.

The current implementation runs autoflake individually on each file, which is less efficient for large codebases.

Consider this more efficient approach:

-for FILE in ${PY_FILES[@]}; do
-  autoflake -i "$FILE"
-done
+echo "$PY_FILES" | tr ' ' '\n' | xargs -I {} autoflake -i {}
phlop/reflection.py (1)

19-19: Consider adding docstring to document error handling behavior.

The function's error handling is controlled by both a parameter and an environment variable. Consider adding a docstring to explain:

  • The default fail-fast behavior
  • The environment variable override
  • Examples of when to use each option

Here's a suggested docstring:

def classes_in_file(file_path, subclasses_only=None, fail_on_import_error=True):
    """Extract classes from a Python file with configurable error handling.
    
    Args:
        file_path: Path to the Python file
        subclasses_only: Filter to specific parent classes
        fail_on_import_error: If True (default), raises import errors. If False, logs and continues.
            Can be overridden by PHLOP_FORCE_RAISE_ON_IMPORT_ERROR environment variable.
    
    Returns:
        List of class objects found in the file
    """
phlop/proc.py (2)

Line range hint 82-86: Replace placeholder implementation with proper binary existence check.

The current implementation raises an error, and the commented-out approach using which is not cross-platform. Consider using Python's built-in shutil.which().

Here's a better implementation:

 def binary_exists_on_path(bin):
-    """
-    https://linux.die.net/man/1/which
-    """
-    raise ValueError("do better")
-    return run(f"which {bin}").exitcode == 0
+    """Check if a binary exists in PATH.
+    
+    Args:
+        bin (str): Name of the binary to check
+    
+    Returns:
+        bool: True if the binary exists and is executable
+    """
+    import shutil
+    return shutil.which(bin) is not None

Line range hint 35-63: Consider improving type safety and error handling.

The run_mp function has complex error handling and could benefit from:

  1. Type hints for better code maintainability
  2. Simplified executor shutdown logic
  3. Consistent subprocess usage pattern

Consider these improvements:

-def run_mp(cmds, N_CORES=None, **kwargs):
+def run_mp(cmds: list[str], N_CORES: int | None = None, **kwargs) -> list[RunTimer]:
     """
     spawns N_CORES threads (default=len(cmds)) running commands and waiting for results
-    https://docs.python.org/3/library/concurrent.futures.html
+    
+    Args:
+        cmds: List of commands to execute
+        N_CORES: Number of parallel threads (defaults to len(cmds))
+        **kwargs: Additional arguments passed to run()
+    
+    Returns:
+        List of RunTimer results
+    
+    Raises:
+        ProcessNonZeroExitCode: If any command fails and check=True
     """
     import concurrent.futures
 
     if N_CORES is None:
         N_CORES = len(cmds)
 
     with concurrent.futures.ThreadPoolExecutor(max_workers=N_CORES) as executor:
         jobs = [executor.submit(run, cmd, **kwargs) for cmd in cmds]
-        results = []
+        results: list[RunTimer] = []
         for future in concurrent.futures.as_completed(jobs):
             try:
                 results += [future.result()]
                 if future.exception() is not None:
                     raise future.exception()
             except Exception as exc:
                 if kwargs.get("check", False):
-                    dic = {"wait": False}
-                    if sys.version_info > (3, 8):
-                        dic["cancel_futures"] = True
-                    executor.shutdown(**dic)
+                    # Cancel all pending futures and shutdown
+                    for job in jobs:
+                        job.cancel()
+                    executor.shutdown(wait=False, cancel_futures=True)
                     raise exc
                 else:
                     print(f"run_mp generated an exception: {exc}")
phlop/testing/test_cases.py (2)

145-147: Consider handling edge cases in command processing.

The current implementation works for basic commands but might need to handle these edge cases:

  • Commands with spaces in paths
  • Commands with escaped characters
  • Symlinks that might need their paths preserved

Consider enhancing the implementation:

def binless(test_case):
    if test_case.cmd.startswith("/usr/"):
-        bits = test_case.cmd.split(" ")
-        test_case.cmd = " ".join([bits[0].split("/")[-1]] + bits[1:])
+        import shlex
+        bits = shlex.split(test_case.cmd)
+        # Preserve symlinks by using os.path.realpath if needed
+        binary = bits[0].split("/")[-1]
+        test_case.cmd = " ".join([shlex.quote(binary)] + [shlex.quote(b) for b in bits[1:]])
    return test_case
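
For instance, shlex.split preserves quoted arguments that a plain str.split(" ") would break apart:

import shlex

cmd = '/usr/bin/python3 -m pytest "tests/my dir/test_x.py"'
print(cmd.split(" "))    # the quoted path is split into two tokens
print(shlex.split(cmd))  # ['/usr/bin/python3', '-m', 'pytest', 'tests/my dir/test_x.py']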

148-148: Remove commented debug print statement.

The commented debug print statement doesn't provide value and should be removed.

-    # print("test_case.cmd ", test_case.cmd)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between da4d2e2 and c1cd392.

📒 Files selected for processing (9)
  • phlop/proc.py (1 hunks)
  • phlop/procs/runtimer.py (1 hunks)
  • phlop/reflection.py (1 hunks)
  • phlop/testing/parallel_processor.py (4 hunks)
  • phlop/testing/test_cases.py (1 hunks)
  • phlop/timing/scope_timer.py (1 hunks)
  • pyproject.toml (1 hunks)
  • setup.py (1 hunks)
  • sh/lint.sh (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • phlop/timing/scope_timer.py
  • pyproject.toml
  • setup.py
🧰 Additional context used
🪛 Ruff
phlop/procs/runtimer.py

49-49: Use a context manager for opening files

(SIM115)


50-50: Use a context manager for opening files

(SIM115)


63-63: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

🔇 Additional comments (4)
sh/lint.sh (1)

10-10: Verify the impact of disabling black formatter.

Commenting out the black formatter could lead to inconsistent code formatting across the project. While other tools like isort and ruff are still active, they serve different purposes and may not fully cover formatting consistency.

Consider:

  1. Adding a comment explaining why black was disabled
  2. Documenting the formatting standards now that black is disabled
  3. Ensuring CI checks are updated accordingly
✅ Verification successful

Let me try a different approach to verify the impact of disabling black formatter by checking the Python files directly.


Let me check one more thing to see if there are other formatting tools in use that might be handling the formatting responsibilities.


Disabling black formatter is acceptable due to redundant formatting coverage

Based on the verification results, commenting out black is acceptable because:

  1. Ruff is configured and active in ruff.toml with formatting-related rules (E7, E9)
  2. The project maintains code style through multiple tools:
    • isort for import sorting
    • ruff for code style and formatting
    • pylint for error checking
    • autoflake for removing unused imports

The removal of black eliminates potential formatting conflicts while maintaining code quality through other tools. No further action needed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if black formatting is enforced elsewhere
# Test 1: Check if black is listed in project dependencies
rg -A 5 "black" pyproject.toml setup.py
# Test 2: Check for other potential formatting configurations
fd -e toml -e cfg -e ini --exec rg -l "black"
# Test 3: Check if black is run in CI
fd -e yml -e yaml --exec rg -l "black"

Length of output: 124


Script:

#!/bin/bash
# Check Python files for formatting-related evidence
# 1. Look for any remaining black configuration or disabled comments
rg -i "# *black" --type py

# 2. Check if there are any Python files that might be affected
fd -e py --type f | wc -l

# 3. Look for other formatting tools configuration in pyproject.toml
if [ -f "pyproject.toml" ]; then
    echo "=== pyproject.toml contents ==="
    cat pyproject.toml
fi

# 4. Check for any pre-commit hooks that might handle formatting
if [ -f ".pre-commit-config.yaml" ]; then
    echo "=== .pre-commit-config.yaml contents ==="
    cat .pre-commit-config.yaml
fi

Length of output: 729


Script:

#!/bin/bash
# Check for other formatting tools and their configurations
# 1. Check ruff configuration which might handle formatting
rg -l "ruff" .
if [ -f "ruff.toml" ]; then
    echo "=== ruff.toml contents ==="
    cat ruff.toml
fi

# 2. Look for formatting-related settings in setup.cfg
if [ -f "setup.cfg" ]; then
    echo "=== setup.cfg contents ==="
    cat setup.cfg
fi

# 3. Check the actual lint.sh file to see all active linting tools
echo "=== lint.sh contents ==="
cat sh/lint.sh

Length of output: 823

phlop/reflection.py (1)

19-19: Verify the impact of changing the default error handling behavior.

Changing fail_on_import_error default from False to True makes the function fail fast by default. While this is generally good practice, it could potentially break existing code that relies on silent failures.

Let's verify the usage to assess the impact:

✅ Verification successful

Let me analyze the results and generate one more verification to get the complete picture:

The initial results show that the existing usage already explicitly sets fail_on_import_error=True, but let's get more context about where and how it's being used to make a definitive conclusion.


Default parameter change is safe to proceed

The change to make fail_on_import_error=True by default is safe because:

  1. The only existing usage found in the codebase already explicitly sets fail_on_import_error=True
  2. This indicates that the new default actually aligns with the current usage pattern
  3. No instances of relying on the old default behavior (False) were found
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for direct calls to classes_in_file to identify potential breaking changes
rg -l "classes_in_file\(" | grep -v "reflection.py" | xargs -I {} rg -A 2 "classes_in_file\(" {}

Length of output: 263


Script:

#!/bin/bash
# Let's find the file containing this usage and get more context
rg -l "classes_in_file\(" | grep -v "reflection.py" | xargs -I {} rg -B 5 -A 2 "classes_in_file\(" {}

Length of output: 503

phlop/testing/parallel_processor.py (2)

17-18: LGTM!

Simple exception class definition is appropriate here.


38-38: Verify error handling with capture_output=False.

Setting capture_output=False means the output goes directly to stdout/stderr. This might affect the error handling at line 40 where self.run.stderr is checked. Ensure this change doesn't break error detection and reporting.

✅ Verification successful

Let me gather more context about how self.run is initialized and used to better understand the impact of capture_output=False.


Let me gather one more piece of information to understand the complete context of the CallableTest class.


Let me try one last search with a different pattern to understand the complete implementation.


Based on the comprehensive context gathered, I can now provide the final response:

Setting capture_output=False is safe and intentional

The code is properly handling the stderr access even with capture_output=False because:

  1. The custom run() function from phlop/proc.py is being used, not Python's subprocess.run directly
  2. The RunTimer class in phlop/procs/runtimer.py handles both capture_output=True and False cases:
    • When capture_output=False, it redirects output to log files if log_file_path is provided
    • The stderr and exitcode properties are still accessible regardless of the capture_output setting
  3. The error handling at line 46-47 will work correctly as the RunTimer class maintains these properties in both modes
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if stderr is still accessible when capture_output is False
ast-grep --pattern 'if self.run.stderr and self.run.exitcode != 0:
    $$$'

Length of output: 245


Script:

#!/bin/bash
# Check how self.run is initialized and used
ast-grep --pattern 'class CallableTest:
  $$$
  def __call__($$$) {
    $$$
  }
  $$$'

Length of output: 88


Script:

#!/bin/bash
# Check if subprocess.run is imported and how it's used
rg "subprocess\.run|from subprocess import run" -A 3

Length of output: 352


Script:

#!/bin/bash
# Look for the complete run() call
rg "run\(" -A 3 phlop/testing/parallel_processor.py

Length of output: 176


Script:

#!/bin/bash
# Get the complete CallableTest class implementation
rg -U "class CallableTest.*?def" phlop/testing/parallel_processor.py -A 50

Length of output: 76


Script:

#!/bin/bash
# Get the complete file content to understand the implementation
cat phlop/testing/parallel_processor.py

Length of output: 3920


Script:

#!/bin/bash
# Check if there are any other references to stderr or capture_output
rg "stderr|capture_output" -B 2 -A 2

Length of output: 8525

Resolved review threads (outdated): phlop/proc.py; phlop/testing/parallel_processor.py (×3)
@@ -26,63 +24,99 @@ def __init__(
         working_dir=None,
         log_file_path=None,
         logging=2,
+        popen=True,

⚠️ Potential issue

Changing default popen=True may introduce breaking changes

Setting popen=True as the default changes the behavior of RunTimer to use subprocess.Popen instead of subprocess.run. This may affect existing code that relies on the previous default behavior of using subprocess.run.

Consider setting popen=False as the default to maintain backward compatibility, or document this change clearly if the behavior change is intentional.

                self._popen(*kwargs, **ekwargs)
            else:
                self._run(
                    check=check, capture_output=capture_output, *kwargs, **ekwargs

⚠️ Potential issue

Avoid star-arg unpacking after keyword arguments

Using *kwargs after keyword arguments is discouraged and can cause unexpected behavior. Adjusting the argument order or unpacking method can prevent this issue.

By replacing *kwargs with **kwargs as suggested, you resolve this concern.

🧰 Tools
🪛 Ruff

63-63: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

Comment on lines 60 to 63
                self._popen(*kwargs, **ekwargs)
            else:
                self._run(
                    check=check, capture_output=capture_output, *kwargs, **ekwargs

⚠️ Potential issue

Incorrect unpacking of kwargs; use **kwargs instead of *kwargs

Unpacking kwargs using *kwargs passes the dictionary keys as positional arguments, which is not intended and can lead to errors. Since kwargs is a dictionary of keyword arguments, it should be unpacked using **kwargs.

Replace *kwargs with **kwargs to correctly pass the additional keyword arguments:

-                self._popen(*kwargs, **ekwargs)
+                self._popen(**kwargs, **ekwargs)

Similarly, adjust the call to _run:

-                self._run(
-                    check=check, capture_output=capture_output, *kwargs, **ekwargs
-                )
+                self._run(
+                    check=check, capture_output=capture_output, **kwargs, **ekwargs
+                )

This change also resolves the static analysis warning about star-arg unpacking after a keyword argument.
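
The distinction is easy to demonstrate: single-star unpacking of a dict passes its keys as positional arguments, while double-star passes the key/value pairs as keywords:

def f(*args, **kwargs):
    return args, kwargs

d = {"check": True, "shell": False}
print(f(*d))   # (('check', 'shell'), {}) - the keys arrive as positionals
print(f(**d))  # ((), {'check': True, 'shell': False}) - the intended keywords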

🧰 Tools
🪛 Ruff

63-63: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

Comment on lines 49 to 50
                    stdout=open(f"{log_file_path}.stdout", "w"),
                    stderr=open(f"{log_file_path}.stderr", "w"),

⚠️ Potential issue

Use context managers to safely handle file resources

When opening files with open(), it's best practice to use context managers (with statements) to ensure files are properly closed after use, even if an exception occurs. This prevents potential resource leaks.

Modify the code to manage file handles and ensure they are closed after the subprocess finishes:

-            ekwargs.update(
-                dict(
-                    stdout=open(f"{log_file_path}.stdout", "w"),
-                    stderr=open(f"{log_file_path}.stderr", "w"),
-                ),
-            )
+            stdout_file = open(f"{log_file_path}.stdout", "w")
+            stderr_file = open(f"{log_file_path}.stderr", "w")
+            ekwargs.update(
+                dict(
+                    stdout=stdout_file,
+                    stderr=stderr_file,
+                ),
+            )

Then, after the subprocess completes, ensure the files are closed in _run and _popen methods:

# In _run method, after subprocess.run():
+            if not capture_output and log_file_path:
+                stdout_file.close()
+                stderr_file.close()

# In _popen method, after subprocess.Popen():
+            if not capture_output and log_file_path:
+                stdout_file.close()
+                stderr_file.close()

Committable suggestion was skipped due to low confidence.
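
A context-managed alternative could use contextlib.ExitStack so both log files are closed even if Popen raises (a minimal sketch assuming a shell command string and the PR's log_file_path naming; not the project's code):

import contextlib
import subprocess

def run_logged(cmd, log_file_path):
    # ExitStack closes both files on exit, including on exceptions.
    with contextlib.ExitStack() as stack:
        stdout = stack.enter_context(open(f"{log_file_path}.stdout", "w"))
        stderr = stack.enter_context(open(f"{log_file_path}.stderr", "w"))
        proc = subprocess.Popen(cmd, shell=True, stdout=stdout, stderr=stderr)
        return proc.wait()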

🧰 Tools
🪛 Ruff

49-49: Use a context manager for opening files

(SIM115)


50-50: Use a context manager for opening files

(SIM115)

coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between c1cd392 and 58de0e4.

📒 Files selected for processing (2)
  • phlop/proc.py (1 hunks)
  • phlop/testing/parallel_processor.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • phlop/proc.py
🔇 Additional comments (3)
phlop/testing/parallel_processor.py (3)

15-16: Consider making the timeout configurable

The hardcoded timeout of 3600 seconds (1 hour) might not be suitable for all environments. Consider making this configurable through environment variables.

The previous review suggestion about making the timeout configurable through PHLOP_QUEUE_TIMEOUT is still valid and would improve the flexibility of the code.


100-100: Fix the Process creation argument syntax

The previous review comment about the incorrect Process creation syntax is still valid. The extra parentheses around pqueue create a single-element tuple, which is incorrect:

-Process(target=runner, args=(test, (pqueue)))
+Process(target=runner, args=(test, pqueue))

39-39: Verify stderr handling with capture_output=False

Setting capture_output=False means the command's output goes directly to stdout/stderr. This might affect the error handling logic that checks self.run.stderr. Please verify that error detection still works as expected.

Also applies to: 48-49

✅ Verification successful

Let me analyze the code further to understand the impact of capture_output=False.


Based on the code analysis, I can now provide the final response:

Setting capture_output=False is correct and won't affect error handling

The error handling in CallableTest.__call__ will work correctly with capture_output=False because:

  1. The run() function uses RunTimer which handles both capture and non-capture modes appropriately
  2. When capture_output=False, output is redirected to log files ({log_file_path}.stderr) which are still accessible for error checking
  3. The error check if self.run.stderr and self.run.exitcode != 0 remains valid as RunTimer maintains the stderr property in both modes
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if there are any other places in the codebase that rely on captured output
# from this class.

# Search for references to CallableTest and stderr/stdout usage
rg -A 5 "CallableTest.*stderr|stdout"

Length of output: 6653


Script:

#!/bin/bash
# Check the implementation of CallableTest class and its error handling
ast-grep --pattern 'class CallableTest {
  $$$
  __call__($$$) {
    $$$
  }
  $$$
}'

Length of output: 87


Script:

#!/bin/bash
# Let's check the Python implementation since ast-grep didn't work for Python files
rg -A 20 "class CallableTest" 

Length of output: 1482


Script:

#!/bin/bash
# Let's check the implementation of the run function being used
rg -A 15 "def run\(" phlop/

Length of output: 775

Resolved review thread (outdated): phlop/testing/parallel_processor.py
coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (2)
phlop/os.py (2)

Line range hint 41-45: Maintain consistent error handling across file operations.

For consistency with the improvements made to read_file, the error handling in write_to_file should also use proper exception chaining.

Apply this diff to improve error handling:

         with open(file, mode) as f:
             f.write(contents)
     except IOError as e:
-        raise RuntimeError(f"Failed to write to file {file}: {e}")
+        raise RuntimeError(f"Failed to write to file {file}: {e}") from e
🧰 Tools
🪛 Ruff

48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


57-57: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


51-51: Add docstring to document function behavior.

The new read_file function would benefit from a docstring explaining its purpose, parameters, return value, and the types of exceptions it may raise.

Add documentation like this:

 def read_file(file):
+    """Read contents of a file.
+
+    Args:
+        file: Path to the file to read.
+
+    Returns:
+        str: Contents of the file.
+
+    Raises:
+        RuntimeError: If the file cannot be read.
+    """
     try:
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 58de0e4 and 2d2cd63.

📒 Files selected for processing (2)
  • phlop/os.py (1 hunks)
  • phlop/testing/parallel_processor.py (5 hunks)
🧰 Additional context used
🪛 Ruff
phlop/os.py

57-57: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (3)
phlop/testing/parallel_processor.py (3)

108-108: Fix Process args tuple syntax

The extra parentheses issue identified in the previous review still exists.


122-128: Improve queue error handling

The issues with broad exception handling and lack of backoff strategy identified in the previous review still exist.


40-40: Verify output handling with logging modes

Setting capture_output=False means output will be streamed directly rather than captured. Please verify this change works correctly with all LoggingMode values, especially ON_NON_ZERO_EXIT_CODE.

✅ Verification successful

Let me gather more information about how the output handling is implemented.


Based on the code analysis, I can now generate the final response:

Setting capture_output=False is correctly handled with all logging modes

The implementation in RunTimer class (phlop/procs/runtimer.py) properly handles the case when capture_output=False:

  • For LoggingMode.OFF (0): Output streams directly to stdout/stderr
  • For LoggingMode.ON_NON_ZERO_EXIT_CODE (1) and LoggingMode.ALWAYS (2): If a log file path is provided, output is redirected to .stdout and .stderr files via file descriptors (see lines with stdout=open(f"{log_file_path}.stdout", "w"))
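
For reference, the logging modes described above could be expressed as an enum along these lines (values taken from the description; the IntEnum form is an assumption, not the project's definition):

from enum import IntEnum

class LoggingMode(IntEnum):
    # Values as described in the review above.
    OFF = 0                    # stream output directly to stdout/stderr
    ON_NON_ZERO_EXIT_CODE = 1  # log only when a command fails
    ALWAYS = 2                 # always redirect to .stdout/.stderr log files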
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for LoggingMode usage to understand the impact
rg -A 5 "LoggingMode" 

Length of output: 1084


Script:

#!/bin/bash
# Search for the CallableTest class implementation and its usage
ast-grep --pattern 'class CallableTest {
  $$$
  __call__($$$) {
    $$$
  }
  $$$
}'

# Also search for any output handling related code
rg -A 5 "capture_output" 

Length of output: 7597

Resolved review thread (outdated): phlop/os.py
Comment on lines +52 to +57
    def print_log_files(self):
        print(self.run.stdout)
        print(self.run.stderr)
        if self.test_case.log_file_path:
            print(read_file(f"{self.test_case.log_file_path}.stdout"))
            print(read_file(f"{self.test_case.log_file_path}.stdout"))

⚠️ Potential issue

Fix duplicate stdout log print

There's a bug where stdout is printed twice instead of printing stderr.

     def print_log_files(self):
         print(self.run.stdout)
         print(self.run.stderr)
         if self.test_case.log_file_path:
             print(read_file(f"{self.test_case.log_file_path}.stdout"))
-            print(read_file(f"{self.test_case.log_file_path}.stdout"))
+            print(read_file(f"{self.test_case.log_file_path}.stderr"))

Resolved review threads (outdated): phlop/testing/parallel_processor.py (×2)
coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2d2cd63 and be63657.

📒 Files selected for processing (2)
  • phlop/os.py (1 hunks)
  • phlop/testing/parallel_processor.py (5 hunks)
🧰 Additional context used
🪛 Ruff
phlop/os.py

56-56: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


64-64: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (6)
phlop/testing/parallel_processor.py (6)

15-16: Consider making the timeout configurable

A 1-hour timeout might be excessive for most test scenarios. Consider making this configurable through environment variables.

The suggestion from the past review still applies. Consider implementing:

-timeout = 60 * 60
+timeout = int(os.getenv('PHLOP_QUEUE_TIMEOUT', str(60 * 60)))  # 1 hour default

51-56: Fix duplicate stdout log print

There's a bug where stdout is printed twice instead of printing stderr.

The suggestion from the past review still applies:

-            print(read_file(f"{self.test_case.log_file_path}.stdout"))
+            print(read_file(f"{self.test_case.log_file_path}.stderr"))

76-91: LGTM! Good addition for debugging.

The print_pending function provides useful visibility into running jobs by showing both stdout and stderr of pending tests.


137-144: Improve queue error handling

The current implementation has several issues:

  1. Catching generic Exception is too broad
  2. The error handling doesn't distinguish between queue timeouts and other errors

The suggestion from the past review still applies:

-            try:
-                proc = queue.get(timeout=timeout)
-            except Exception:
-                logger.info("Queue timeout - polling")
-                continue
+            try:
+                proc = queue.get(timeout=timeout)
+            except Empty:
+                logger.info(f"No jobs completed within {timeout} seconds - continuing to poll")
+                time.sleep(1)  # Add backoff to reduce CPU usage
+                continue
+            except Exception as e:
+                logger.error(f"Unexpected error while polling queue: {e}")
+                raise

Note that multiprocessing.Queue.get raises queue.Empty from the stdlib queue module on timeout; because the local variable queue shadows that module here, the exception class must be imported directly (from queue import Empty), along with time for the backoff, if not already imported.

152-153: Fix exit code failure check

Using exitcode > 0 might miss negative exit codes which also indicate failure.

The suggestion from the past review still applies:

-                if proc.run.exitcode > 0:
+                if proc.run.exitcode != 0:
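
Negative exit codes do occur in practice: on POSIX, a child terminated by a signal reports the negative signal number (a quick demonstration, assuming a POSIX system with sleep available):

import signal
import subprocess

p = subprocess.Popen(["sleep", "60"])
p.send_signal(signal.SIGTERM)
p.wait()
print(p.returncode)  # -15: the negative of SIGTERM's signal number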

39-39: Verify the impact of disabling output capture

Setting capture_output=False might affect how test output is processed. Ensure this change doesn't break any output parsing or logging functionality.

Comment on lines +51 to +57
def read_file(file):
    try:
        with open(file, "r") as f:
            return f.read()
    except IOError as e:
        raise RuntimeError(f"Failed to read file {file}: {e}")


⚠️ Potential issue

Enhance error handling and file operations.

The function needs improvements in error handling and file operations:

  1. Add exception chaining for better error traceability
  2. Add file existence check
  3. Specify encoding for text file operations

Apply this diff to address the issues:

 def read_file(file):
     try:
+        if not os.path.exists(file):
+            raise FileNotFoundError(f"File not found: {file}")
-        with open(file, "r") as f:
+        with open(file, "r", encoding="utf-8") as f:
             return f.read()
     except IOError as e:
-        raise RuntimeError(f"Failed to read file {file}: {e}")
+        raise RuntimeError(f"Failed to read file {file}: {e}") from e
🧰 Tools
🪛 Ruff

56-56: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Comment on lines +59 to +64
def read_last_lines_of(file, n=10):
    try:
        with open(file, "r") as f:
            return f.readlines()[:10]
    except IOError as e:
        raise RuntimeError(f"Failed to read file {file}: {e}")

⚠️ Potential issue

Fix critical logic error and improve implementation.

The function has several issues:

  1. It's reading the first 10 lines instead of the last N lines (using [:10] instead of [-n:])
  2. Ignoring the n parameter and hardcoding 10
  3. Same issues as read_file: missing exception chaining, file existence check, and encoding specification
  4. Inefficient for large files as it reads all lines into memory

Apply this diff for a more efficient implementation using deque:

+from collections import deque
+
 def read_last_lines_of(file, n=10):
     try:
-        with open(file, "r") as f:
-            return f.readlines()[:10]
+        if not os.path.exists(file):
+            raise FileNotFoundError(f"File not found: {file}")
+        with open(file, "r", encoding="utf-8") as f:
+            return list(deque(f, n))
     except IOError as e:
-        raise RuntimeError(f"Failed to read file {file}: {e}")
+        raise RuntimeError(f"Failed to read file {file}: {e}") from e

The deque solution is memory-efficient as it only keeps the last N lines in memory while reading the file.
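
To see the deque behaviour in isolation (assuming a hypothetical example.log exists):

from collections import deque

with open("example.log", "r", encoding="utf-8") as f:
    last_three = list(deque(f, maxlen=3))  # only the final 3 lines are kept in memory
print(last_three)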

🧰 Tools
🪛 Ruff

64-64: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (1)
phlop/testing/parallel_processor.py (1)

15-16: Consider adding timeout validation.

While the timeout is already suggested to be configurable via environment variables in past reviews, it would be beneficial to add validation to ensure the timeout value is reasonable (e.g., greater than 0 and less than some maximum value).
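
A minimal sketch of such validation (the PHLOP_QUEUE_TIMEOUT name follows the earlier suggestion and is not existing code):

import os

def queue_timeout(default=60 * 60, maximum=24 * 60 * 60):
    # Hypothetical helper: clamp an env-provided timeout to a sane range.
    raw = os.getenv("PHLOP_QUEUE_TIMEOUT", str(default))
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"PHLOP_QUEUE_TIMEOUT must be an integer, got {raw!r}")
    if not 0 < value <= maximum:
        raise ValueError(f"PHLOP_QUEUE_TIMEOUT must be in (0, {maximum}], got {value}")
    return value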

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between be63657 and 77ef50d.

📒 Files selected for processing (2)
  • phlop/procs/runtimer.py (1 hunks)
  • phlop/testing/parallel_processor.py (5 hunks)
🧰 Additional context used
🪛 Ruff
phlop/procs/runtimer.py

49-49: Use a context manager for opening files

(SIM115)


50-50: Use a context manager for opening files

(SIM115)

🔇 Additional comments (5)
phlop/testing/parallel_processor.py (2)

77-92: LGTM! Well-structured debug output implementation.

The function provides clear visibility into pending jobs with:

  • Clear start/end markers
  • Separate stdout/stderr sections
  • Proper null checks before printing

40-40: Verify the impact of setting capture_output=False.

Setting capture_output=False means the output will be written directly to the terminal/log file. This might affect error handling in the waiter function where we check proc.run.stderr. Please ensure this change doesn't break error reporting.

✅ Verification successful

Let me gather more context about how the subprocess is being used.


Let me check the complete implementation of the CallableTest class to understand how it's used.


Setting capture_output=False is the correct implementation

The change to set capture_output=False in CallableTest.__call__ is correct because:

  1. The RunTimer class properly handles both capture_output modes:
    • When capture_output=False, it redirects output to log files if log_file_path is provided
    • Error handling still works as the RunTimer sets stdout/stderr from the log files
  2. The error checking in CallableTest using self.run.stderr remains valid since RunTimer populates these fields from either captured output or log files
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any references to run.stdout or run.stderr to verify they're not used when capture_output is False
rg -A 3 'run\.(stdout|stderr)'

Length of output: 1174


Script:

#!/bin/bash
# Search for the subprocess.run call and its context
ast-grep --pattern 'self.run = subprocess.run($$$)'

Length of output: 138


Script:

#!/bin/bash
# Search for the class definition and initialization
ast-grep --pattern 'class CallableTest($$$)'

Length of output: 46


Script:

#!/bin/bash
# Look for any error handling patterns
rg -A 5 'CalledProcessError'

Length of output: 732


Script:

#!/bin/bash
# Search for CallableTest implementation
rg -A 20 "class CallableTest"

Length of output: 1482


Script:

#!/bin/bash
# Search for where capture_output is used in the context
rg -B 2 -A 2 "capture_output"

Length of output: 6548

phlop/procs/runtimer.py (3)

27-35: Addition of 'popen' parameter enhances flexibility

The introduction of the popen parameter in the __init__ method allows users to choose between subprocess.run and subprocess.Popen, increasing the flexibility of the RunTimer class. The initialization of related attributes improves clarity.


75-96: Efficient command execution with '_run' method

The _run method effectively executes commands using subprocess.run, capturing output, errors, execution time, and exit codes. Proper exception handling ensures robustness, and logging is handled according to the specified level.


97-119: Robust implementation of '_popen' method

The _popen method provides an alternative execution path using subprocess.Popen. It correctly handles output and error streams, captures execution metrics, and ensures resources are cleaned up appropriately.

Comment on lines +138 to +146
            proc = None
            try:
                proc = queue.get(timeout=timeout)

            except Exception:
                logger.info("Queue timeout - polling")
                print_pending(cc, batches, logging)
                continue


🛠️ Refactor suggestion

Consider potential race conditions in error handling.

The current implementation might miss failures if:

  1. A process fails right after the timeout exception
  2. Multiple processes fail simultaneously

Consider adding a check for failed processes after the timeout:

 try:
     proc = queue.get(timeout=timeout)
 except Exception:
     logger.info("Queue timeout - polling")
     print_pending(cc, batches, logging)
+    # Check if any process has failed
+    for batch_index, procs in enumerate(cc.procs):
+        for proc in procs:
+            if not proc.is_alive() and proc.exitcode != 0:
+                raise TestCaseFailure("Process failed during timeout")
     continue

Committable suggestion was skipped due to low confidence.

Comment on lines +82 to +96
            if logging == 2 and capture_output:
                self.stdout = decode_bytes(self.run.stdout)
                self.stderr = decode_bytes(self.run.stderr)
        except subprocess.CalledProcessError as e:
            # only triggers on failure if check=True
            self.run_time = time.time() - start
            self.exitcode = e.returncode
            if logging >= 1 and capture_output:
                self.stdout = decode_bytes(e.stdout)
                self.stderr = decode_bytes(e.stderr)
                logging = 2  # force logging as exception occurred
        if logging == 2 and capture_output and log_file_path:
            write_to_file(f"{log_file_path}.stdout", self.stdout)
            write_to_file(f"{log_file_path}.stderr", self.stderr)


🛠️ Refactor suggestion

Refactor common output handling into a helper method

The output handling logic in both _run and _popen methods is similar. To reduce code duplication and enhance maintainability, consider refactoring this common functionality into a separate helper method.

For example, create a method _handle_output:

def _handle_output(self, stdout, stderr, logging, capture_output, log_file_path):
    if logging == 2 and capture_output:
        self.stdout = decode_bytes(stdout)
        self.stderr = decode_bytes(stderr)
    if logging == 2 and capture_output and log_file_path:
        write_to_file(f"{log_file_path}.stdout", self.stdout)
        write_to_file(f"{log_file_path}.stderr", self.stderr)

Then modify _run and _popen to use this helper:

In _run:

self._handle_output(self.run.stdout, self.run.stderr, logging, capture_output, log_file_path)

In _popen:

self._handle_output(self.stdout, self.stderr, logging, capture_output, log_file_path)

Also applies to: 112-119

Comment on lines +58 to +63
        def go():
            if popen:
                self._popen(**ekwargs, **kwargs)
            else:
                self._run(
                    check=check, capture_output=capture_output, **ekwargs, **kwargs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Avoid potential keyword argument conflicts by merging dictionaries

Passing **ekwargs and **kwargs separately to methods can lead to conflicts if there are overlapping keys, potentially causing unexpected behavior. Consider merging the dictionaries before passing them to _popen and _run to ensure all keyword arguments are managed correctly.

Apply this change to merge the dictionaries:

 def go():
     if popen:
+        combined_kwargs = {**ekwargs, **kwargs}
+        self._popen(**combined_kwargs)
-        self._popen(**ekwargs, **kwargs)
     else:
+        combined_kwargs = {**ekwargs, **kwargs, 'check': check, 'capture_output': capture_output}
+        self._run(**combined_kwargs)
-        self._run(
-            check=check, capture_output=capture_output, **ekwargs, **kwargs
-        )

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
src/phlop/timing/threaded_scope_timer.cpp (1)

41-41: Implementation looks good but note the breaking change.

The use of std::make_shared with the new parameter order is correct. The duration calculation now() - start is appropriate for measuring the elapsed time.

Note that changing the parameter order is a breaking change. Ensure all call sites in the codebase have been updated to match the new order to prevent runtime errors.

inc/phlop/timing/threaded_scope_timer.hpp (1)

221-224: Optimize parameter passing in BinaryTimerFileNode constructor

Consider passing _fn_id by value instead of by const reference. Since std::uint16_t is a small integral type, passing it by value is generally more efficient and consistent with the other parameters _start and _time.

Apply this diff to adjust the parameter:

- BinaryTimerFileNode(std::uint16_t const& _fn_id, std::uint64_t const _start,
+ BinaryTimerFileNode(std::uint16_t _fn_id, std::uint64_t const _start,
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 77ef50d and 3906309.

📒 Files selected for processing (5)
  • .github/workflows/build_nix.yml (1 hunks)
  • inc/phlop/timing/scope_timer.hpp (2 hunks)
  • inc/phlop/timing/threaded_scope_timer.hpp (7 hunks)
  • mkn.yaml (2 hunks)
  • src/phlop/timing/threaded_scope_timer.cpp (1 hunks)
🧰 Additional context used
📓 Learnings (1)
inc/phlop/timing/scope_timer.hpp (1)
Learnt from: PhilipDeegan
PR: PhilipDeegan/phlop#29
File: inc/phlop/timing/scope_timer.hpp:93-94
Timestamp: 2024-09-27T21:01:44.018Z
Learning: In this codebase, when passing small built-in types by value, parameters are declared as `const` to prevent modification within the function.
🔇 Additional comments (13)
mkn.yaml (2)

16-20: LGTM! Well-structured profile configuration.

The new threaded_scope_timer profile follows the established pattern:

  • Inherits from base profile
  • Maintains consistent structure with scope_timer
  • Properly specifies source and test files
  • Uses shared mode like other profiles
✅ Verification successful

Profile configuration verified - all referenced files exist

The verification confirms that both the source file src/phlop/timing/threaded_scope_timer.cpp and test file tests/timing/test_threaded_scope_timer.cpp exist in their specified locations, validating the correctness of the profile configuration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence of source and test files referenced in the profile

# Test: Check if the referenced files exist
echo "Verifying source file..."
fd "threaded_scope_timer.cpp" src/phlop/timing/

echo "Verifying test file..."
fd "test_threaded_scope_timer.cpp" tests/timing/

Length of output: 295


1-1: Verify C++20 compatibility across the codebase.

The shebang line specifies -std=c++20. Ensure that all source files, especially the new threaded_scope_timer.cpp, are compatible with C++20 features.

✅ Verification successful

C++20 compatibility verified - no issues found

The codebase is already using modern C++ features that are well-supported in C++20, including:

  • Standard library features: <chrono>, <thread>, <mutex>
  • Language features: constexpr, structured bindings, auto type deduction
  • The code doesn't use any C++20-specific features that would cause compatibility issues

The upgrade to C++20 in the build configuration is safe and appropriate for this codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential C++20 compatibility issues in source files

# Test: Look for C++20 specific features or potential compatibility issues
rg -g '*.{cpp,hpp,h}' -i '(concept|requires|coroutine|ranges|modules|constinit|consteval|co_await|co_yield|co_return)'

Length of output: 120


Script:

#!/bin/bash
# Let's check the actual source files content to understand the C++ features being used

# First, find all C++ source files
echo "=== C++ Source Files ==="
fd -e cpp -e hpp -e h

# Then, check for modern C++ features that might need C++20
echo -e "\n=== Modern C++ Usage ==="
rg -g '*.{cpp,hpp,h}' '\<(auto|decltype|nullptr|constexpr|noexcept|override|final|delete|default|lambda|thread|mutex|atomic)\>' -l

# Look for template usage which might benefit from C++20
echo -e "\n=== Template Usage ==="
rg -g '*.{cpp,hpp,h}' 'template\s*<' -l

# Check includes for modern C++ headers
echo -e "\n=== Modern C++ Headers ==="
rg -g '*.{cpp,hpp,h}' '#include\s*<(chrono|thread|mutex|atomic|memory|type_traits)>' -l

Length of output: 1198


Script:

#!/bin/bash
# Let's examine the actual content of the timing-related files to check C++ feature usage

echo "=== scope_timer.hpp ==="
cat inc/phlop/timing/scope_timer.hpp

echo -e "\n=== threaded_scope_timer.hpp ==="
cat inc/phlop/timing/threaded_scope_timer.hpp

echo -e "\n=== Modern C++ Feature Usage ==="
# Fixed the regex from previous attempt
rg -g '*.{cpp,hpp,h}' '(auto|decltype|nullptr|constexpr|noexcept|override|final|delete|default|lambda|thread|mutex|atomic)' --no-line-number

Length of output: 26038

.github/workflows/build_nix.yml (1)

30-30: Verify build configuration changes.

The build command changes look good but require verification of a few aspects:

  1. The new threaded_scope_timer profile
  2. C++20 compiler compatibility on Ubuntu latest
  3. Impact of warning level 9

Let's verify the configuration:

Consider adding documentation about the warning level configuration and its implications.

✅ Verification successful

Build configuration changes verified with minor documentation needs

The verification results show:

  1. ✓ threaded_scope_timer profile exists in mkn.yaml with proper source and test files
  2. ✓ C++20 upgrade appears safe as no advanced C++20 features are currently used
  3. ⚠ Warning level 9 (-W 9) is not documented

Consider adding documentation about the warning level configuration in the README.md for better maintainability.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the new profile and C++ requirements

# Test 1: Check if threaded_scope_timer profile exists in mkn.yaml
echo "Checking for threaded_scope_timer profile..."
rg "threaded_scope_timer" "mkn.yaml"

# Test 2: Check for C++20 specific features in the codebase
echo "Checking for C++20 features..."
rg "concept|requires|constinit|consteval|co_await|co_yield|co_return|ranges::|views::"

# Test 3: Check if warning level 9 is documented
echo "Checking for warning level documentation..."
rg "warning.*level.*9|W.*9" "README.md" "docs/"

Length of output: 768

src/phlop/timing/threaded_scope_timer.cpp (1)

41-41: Verify RunTimerReportSnapshot constructor declaration.

The parameter order has changed to pass start as the third argument and duration as the fourth. Let's verify this matches the header declaration.

✅ Verification successful

Constructor declaration matches the implementation

The constructor declaration in both header files (inc/phlop/timing/threaded_scope_timer.hpp and inc/phlop/timing/scope_timer.hpp) matches the implementation:

  • First parameter: RunTimerReport* s (self)
  • Second parameter: RunTimerReport* p (parent)
  • Third parameter: std::uint64_t const st (start time)
  • Fourth parameter: std::uint64_t const t (duration)
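
Put together, the verified declaration looks like this (a sketch reconstructed from the bullets above; only the constructor is shown):

#include <cstdint>

struct RunTimerReport; // forward declaration

struct RunTimerReportSnapshot
{
    RunTimerReportSnapshot(RunTimerReport* s, RunTimerReport* p,
                           std::uint64_t const st, std::uint64_t const t);
    // ...
};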
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find and verify the RunTimerReportSnapshot constructor declaration

# Test: Search for RunTimerReportSnapshot constructor declaration
rg -A 3 "RunTimerReportSnapshot\(" "inc/phlop/timing/"

Length of output: 711

inc/phlop/timing/scope_timer.hpp (2)

93-94: LGTM! Constructor signature improvement.

The change to pass uint64_t parameters by value aligns with the codebase's convention for small built-in types. This can improve performance by avoiding reference indirection while maintaining safety through const qualification.
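
A minimal sketch of the convention, with hypothetical names (the real signatures live in scope_timer.hpp):

#include <cstdint>

// Small built-in types are passed by value; const prevents accidental
// modification of the parameter inside the function body.
std::uint64_t elapsed(std::uint64_t const start, std::uint64_t const end)
{
    return end - start;
}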


170-171: LGTM! Improved constructor and member variable safety.

Two good improvements here:

  1. Constructor now takes small built-in types by value with const qualification, consistent with the codebase convention.
  2. Member variables are now const-qualified, ensuring immutability after construction which enhances thread safety and prevents accidental modifications.

Also applies to: 178-180
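
A minimal sketch of the resulting pattern, using a hypothetical Snapshot struct rather than the real one:

#include <cstdint>

struct Snapshot
{
    Snapshot(std::uint64_t const s, std::uint64_t const t)
        : start{s}
        , time{t}
    {
    }

    std::uint64_t const start; // immutable after construction
    std::uint64_t const time;  // safe for concurrent reads without locking
};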

inc/phlop/timing/threaded_scope_timer.hpp (7)

144-148: Enhancement of RunTimerReportSnapshot constructor

The addition of the start parameter and member variable start in the RunTimerReportSnapshot constructor improves the granularity of timing data by capturing the start time of each snapshot.


157-157: Introduction of start member variable

The new start member variable in RunTimerReportSnapshot holds the start time, complementing the existing time member and enabling more precise timing analysis.


196-196: Switch to std::chrono::steady_clock for reliable timing

Changing from std::chrono::system_clock to std::chrono::steady_clock in the now() function ensures that timing measurements are not affected by system clock adjustments, providing more consistent and reliable timing data.
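
A sketch of the pattern (the nanosecond unit is an assumption; the actual now() lives in threaded_scope_timer.hpp):

#include <chrono>
#include <cstdint>

// steady_clock is monotonic: it never jumps when the wall clock is adjusted
// (NTP sync, DST), so differences between two calls stay meaningful.
std::uint64_t now()
{
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               std::chrono::steady_clock::now().time_since_epoch())
        .count();
}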


229-231: Effective use of const qualifiers for member variables

Making fn_id, start, and time member variables const ensures their immutability after initialization, which is a good practice for maintaining data integrity.


261-261: Consistency in accessing key_ids with child nodes

Similar to the previous comment, ensure that the key c->childs[i]->self->k exists in key_ids before using it in node.kinder.emplace_back(...) to avoid potential runtime errors.

You can extend the previous verification script to include child nodes:

#!/bin/bash
# Description: Verify that all child keys are present in `key_ids`.

# Expectation: All child keys should be accounted for in `key_ids`.

# Search for all child key accesses.
rg 'key_ids\[std::string\{c->childs\[\w+\]->self->k\}\]'

249-250: Ensure existence of keys in key_ids before access

When accessing key_ids[std::string{trace->self->k}], it is assumed that the key exists in the key_ids map. Ensure that all keys are populated in key_ids prior to this access to prevent potential exceptions or undefined behavior.

Run the following script to confirm that all keys used are present in key_ids:

✅ Verification successful

Let me gather more context about how these keys are populated by examining the code structure and flow.


Let me check one final aspect - how the recurse_traces_for_keys function is called in relation to the JSON dumping.


Key access in key_ids is safe - keys are populated before access

The code is safe. The recurse_traces_for_keys function is called on each trace before the JSON dumping process begins, and it ensures all keys are populated in key_ids. The function checks for key existence with key_ids.count(s) and populates missing keys, recursively handling all child traces. This guarantees that when key_ids[std::string{trace->self->k}] is accessed during JSON generation, the key will exist.
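
For reference, a sketch of the populate-before-access pattern described above; the names mirror the review discussion but this is not the repository's exact code:

#include <cstddef>
#include <string>
#include <unordered_map>

using KeyIds = std::unordered_map<std::string, std::size_t>;

// Populate pass: give every key an id the first time it is seen ...
void register_key(KeyIds& key_ids, std::string const& k)
{
    if (key_ids.count(k) == 0)
        key_ids.emplace(k, key_ids.size());
}

// ... so the dump pass can use at(), which throws on a missing key instead
// of silently inserting a default value the way operator[] would.
std::size_t id_of(KeyIds const& key_ids, std::string const& k)
{
    return key_ids.at(k);
}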

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all keys used in `emplace_back` are present in `key_ids`.

# Expectation: All keys should be found in `key_ids` without any missing entries.

# Search for all instances where keys are added to `key_ids`.
rg 'key_ids\.emplace\('

# Search for all instances where `key_ids` is accessed.
rg 'key_ids\['

# Compare the keys added versus keys accessed.

Length of output: 836


Script:

#!/bin/bash
# Get the context around where keys are added to key_ids
rg -B 5 -A 5 'key_ids\.emplace'

# Get the full context of the function where key_ids is accessed
ast-grep --pattern 'void dump_json($$$) {
  $$$
}'

Length of output: 1600


Script:

#!/bin/bash
# Get the full context of the dump_json function and surrounding code
rg -B 20 -A 20 'void dump_json'

# Also check where recurse_traces_for_keys is called
rg 'recurse_traces_for_keys'

Length of output: 587


331-331: Verify compatibility of updated output format in _write method

The inclusion of node.start in the output changes the file format produced by the _write() method. Ensure that any code or tools that read this output file are updated to handle the new format, preventing potential parsing errors or data misinterpretation.
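
For context, a sketch of what the write side presumably emits per node, inferred from the node.start << ":" << node.time fragment discussed in the verification below; the exact field layout is an assumption:

#include <cstdint>
#include <ostream>

// Hypothetical node shape; the real struct is BinaryTimerFileNode.
struct Node
{
    std::uint16_t fn_id;
    std::uint64_t start;
    std::uint64_t time;
};

// One record per line: "<fn_id> <start>:<time>". The colon keeps start and
// duration in a single whitespace-delimited token, so parsers that split on
// spaces still see the same number of fields as before.
void write(std::ostream& out, Node const& node)
{
    out << node.fn_id << " " << node.start << ":" << node.time << "\n";
}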

Run the following script to locate code that reads from the timing file and check if it accommodates the new format:

✅ Verification successful

Based on the search results, I can now provide a final response. The Python file parser in phlop/timing/scope_timer.py already handles the new format with start times. Here's the evidence:

  1. The file parser reads the timing data line by line and processes both formats:
def file_parser(times_filepath):
    with open(times_filepath, "r") as file:
        for line in file:
            bits = line.split(" ")  # This split handles both old and new formats
  2. The test file tests/timing/test_scope_timer.py verifies the parsing functionality and doesn't make any assumptions about the specific format of the time data.

  3. The format change in the _write method is backward compatible since:

    • The function ID and time values remain in the same order
    • The colon separator in node.start << ":" << node.time ensures clear distinction between start and duration values
    • The file parser splits on whitespace, making it flexible to handle both formats

No changes needed - output format change is compatible with existing parsers
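
To make the compatibility argument concrete, a hypothetical reader for the same assumed record layout, falling back to the old single-value format when no colon is present:

#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical reader for one "<fn_id> <start>:<time>" record.
bool parse_line(std::string const& line, std::uint16_t& fn_id,
                std::uint64_t& start, std::uint64_t& time)
{
    std::istringstream iss{line};
    std::string token;
    if (!(iss >> fn_id >> token))
        return false; // malformed record
    auto const colon = token.find(':');
    if (colon == std::string::npos)
    {
        start = 0; // old format carried no start timestamp
        time  = std::stoull(token);
    }
    else
    {
        start = std::stoull(token.substr(0, colon));
        time  = std::stoull(token.substr(colon + 1));
    }
    return true;
}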

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find code that reads the timing file to ensure it handles the updated format with `start` times.

# Search for instances of file reading operations on the timing file.
rg 'std::ifstream.*phlop_scope_times\.bin' -A 5

# Search for parsing logic that may be affected by the format change.
rg 'std::getline\(' -A 5

Length of output: 76


Script:

#!/bin/bash
# Let's try a broader search to find file operations and timing-related code

# Search for any mentions of the timing file
rg -i "phlop_scope_times" -A 5

# Search for file reading operations more broadly
rg "std::ifstream|fstream.*open|getline.*>>" -A 5

# Look for timing-related classes and methods
ast-grep --pattern 'class $_ {
  $$$
  read($$$) {
    $$$
  }
  $$$
}'

# Search for any timing-related file operations
rg -i "timing.*\.(txt|bin|log|dat)" -A 5

Length of output: 1280


Script:

#!/bin/bash
# Let's examine the test file and related timing code

# Check the Python test file mentioned
cat tests/timing/test_scope_timer.py

# Look for any code that processes timing data
rg -g "*.{py,hpp,cpp}" "read.*time|parse.*time" -A 5

# Search for any file operations in timing-related files
rg -g "*timer*.{hpp,cpp,py}" "(open|read|parse|>>|<<)" -A 5

# Look specifically for timing data structures
ast-grep --pattern 'struct $_ {
  $$$
  time
  $$$
}'

Length of output: 28481

@PhilipDeegan PhilipDeegan merged commit 8810bad into master Nov 4, 2024
2 checks passed
@PhilipDeegan PhilipDeegan deleted the next branch November 4, 2024 09:28