Python: Remove control flow nodes for module entry definitions from the dataflow graph. #15030
Mostly removes nodes from the graph. One result is lost:
```
check("submodule.submodule_attr", submodule.submodule_attr, "submodule_attr", globals()) #$ MISSING:prints=submodule_attr
```
When I initially read the PR description, that sounded like a non-trivial tradeoff. What is your conclusion on whether this is a good tradeoff or not? (I don't see you taking a stance on this explicitly.)

From reading the code over more closely myself, it seems like a huge edge case scenario, where we import a single attribute from a relative module and then access a different attribute afterwards.

    from .submodule import irrelevant_attr
    use(submodule.submodule_attr)

We could try to gauge the impact of this using MRVA or DCA with some meta-query, if we actually wanted to learn more. But I expect we can reach a conclusion without it 🤞

I also realized that we should have updated the comment for TNode in the original PR (but didn't): codeql/python/ql/lib/semmle/python/dataflow/new/internal/DataFlowPublic.qll, lines 15 to 25 in 263c0aa.

Since we're fixing up minor things in this PR anyway, do you care to fix that comment as well? 🙏
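For readers unfamiliar with the Python behaviour behind this edge case: importing anything from a relative submodule also binds the submodule itself as an attribute of the parent package, so `submodule` becomes a usable name inside the package's `__init__.py` even though it was never imported by name. The sketch below is only an illustration of that language behaviour, not part of the CodeQL test suite; the package name `demo_pkg` and the helper functions are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a throwaway package (hypothetical name: demo_pkg) to disk."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    (pkg / "submodule.py").write_text(
        'irrelevant_attr = "irrelevant"\n'
        'submodule_attr = "submodule_attr"\n'
    )
    (pkg / "__init__.py").write_text(
        "from .submodule import irrelevant_attr\n"
        "# `submodule` was never imported by name, but the import above\n"
        "# bound it in this package's namespace as a side effect.\n"
        "found = submodule.submodule_attr\n"
    )


def run_demo() -> str:
    """Import the throwaway package and return the value it looked up."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("demo_pkg")
            return pkg.found
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)
            sys.modules.pop("demo_pkg.submodule", None)


result = run_demo()
print(result)
```

The lookup of `submodule.submodule_attr` succeeds at runtime only because of that implicit binding, which is the flow the removed nodes happened to model.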
Sorry, I had recently written my opinion on Slack and forgot to repeat it here. I believe that the SSA removal should be strictly cleanup, essentially "no semantic changes", just rerouting the graph past unneeded nodes. So I am for no performance degradation and no gained precision (except what we got from sorting out previous disconnects). That we now know of a way to gain precision is a nice bonus, but we might be able to get that cheaper. I think we should investigate the impact and trade-off later.

Yes, now that I am no longer in hot-fix mode, I think that is a great idea :-)
> I believe that the SSA removal should be strictly cleanup, essentially "no semantic changes" just rerouting the graph past unneeded nodes. So I am for no performance degradation and no gained precision (except what we got from sorting out previous disconnects).

Thanks for that clear sentiment 👍 With that in mind, here is an approval (even though I would still like to see the doc improvement).
This should have been part of #14777.
These nodes are extra compared to the SSA nodes that existed before.
We do lose one result that we gained by having these nodes:
which is the one we might expect to lose. We do get to keep the other one (about lambdas in flow summaries),