Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow os.PathLike paths for Batch.write_output() #14770

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jmarshall
Copy link
Contributor

@jmarshall jmarshall commented Dec 17, 2024

Extends PR #14544 which did this for read_input()/read_input_group().
Refactor type coercion previously added to _new_input_resource_file()
into a new hailtop.utils.path_str() utility routine, and also invoke
it from Batch.write_output().

Security Assessment

Delete all except the correct answer:

  • This change has low security impact

Impact Description

Client side change only. Standard function call on a standard library class. Analogous to existing calls for similar functionality.

(Reviewers: please confirm the security impact before approving)

Extends PR hail-is#14544 which did this for read_input()/read_input_group().
Refactor type coercion previously added to _new_input_resource_file()
into a new hailtop.utils.path_str() utility routine, and also invoke
it from Batch.write_output().
@jmarshall
Copy link
Contributor Author

Change Description

For your consideration… As previously with PR #14544 for read_input() and read_input_group(), we often have a pathlib.Path or cloudpathlib.CloudPath that we've built up by parts, which is then the path to be used as the output location:

b.write_output(j.output_file, str(output_file_path))

Periodically we accidentally omit the str(…), which leads to a semi-obscure error message and an extra editing round-trip:

  File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/parse.py", line 133, in _coerce_args
    return _decode_args(args) + (_encode_result,)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/parse.py", line 117, in _decode_args
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/parse.py", line 117, in <genexpr>
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
                 ^^^^^^^^
AttributeError: 'GSPath' object has no attribute 'decode'

There is a point of view that write_output() could also accept os.PathLike objects directly, and have Hail convert them to str itself as necessary — hence this PR.

Security Assessment

Delete all except the correct answer:

  • This change has no security impact

Impact Description

For none/low impact: a quick one/two sentence justification of the rating.

  • Example: "Docs only", "Low-level refactoring of non-security code", etc.
    For medium/high impact: provide a description of the impact and the mitigations in place.
  • Example: "New UI text field added in analogy to existing elements, with input strings escaped and validated against code injection"

(Reviewers: please confirm the security impact before approving)

Copy link
Member

@ehigham ehigham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice little change, thanks!

Copy link
Collaborator

@patrick-schultz patrick-schultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jmarshall!

@cjllanwarne
Copy link
Collaborator

@jmarshall Hope you don't mind that I've moved your security impact text into the issue description and re-rated from "no" to "low". And added a justification for that low rating. Hope that looks good? I'll also have a look and see why the tests are unhappy

@cjllanwarne
Copy link
Collaborator

cjllanwarne commented Jan 6, 2025

@jmarshall the tests are failing at a static analysis step:

ruff check hail
hail/python/hailtop/utils/__init__.py:49:5: F401 [*] `.utils.path_str` imported but unused
Found 1 error.
[*] 1 fixable with the `--fix` option.
make: *** [Makefile:42: check-hail-fast] Error 1

@jmarshall
Copy link
Contributor Author

jmarshall commented Jan 6, 2025

Static analysis issue fixed; it detected a minor (real) mistake.

Because you use GitHub's “Use the PR's first comment as the eventual commit message instead of the PR branch's actual commit message(s)” setting, I tend to be quite careful with that first message — hence why I put my less formal Change Descriptions etc in a separate second comment on the PR. However the Security Impact thing can deserve to be in the permanent record too, so that's fine to have moved it. And usually the pure boilerplate bits get removed during merging. 😄

@cjllanwarne
Copy link
Collaborator

With apologies for the slow turnaround... it looks like you might have one more minor static analysis finding -

hail/python/hailtop/utils/utils.py:1043:4: R1705: Unnecessary "elif" after "return", remove the leading "el" from "elif" (no-else-return)

@jmarshall
Copy link
Contributor Author

In my opinion, that is a misguided diagnostic that discourages good style. See for example hail/python/hail/utils/misc.py's get_obj_metadata() which uses similar code for a similar purpose, or other extant examples in the hail code base.

But if you really want me to make the code worse to appease it, I will.

@cjllanwarne
Copy link
Collaborator

@jmarshall I'm inclined to agree with you, especially considering the extant examples. I want the linter to not error out, but if you were to make that happen by disabling this rule in pylint.rc then I'd be perfectly happy with that as a solution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants