-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Adopt shared type tracking library #14848
Conversation
@@ -8,7 +8,7 @@ | |||
* `getACall` predicate on `SummarizedCallable`. | |||
*/ | |||
module RecursionGuard { | |||
private import semmle.python.dataflow.new.internal.TypeTrackerSpecific as TT | |||
private import semmle.python.dataflow.new.internal.TypeTrackingImpl::TypeTrackingInput as TT |
Check warning
Code scanning / CodeQL
Names only differing by case Warning test
@@ -8,7 +8,7 @@ | |||
* `getACall` predicate on `SummarizedCallable`. | |||
*/ | |||
module RecursionGuard { | |||
private import semmle.python.dataflow.new.internal.TypeTrackerSpecific as TT | |||
private import semmle.python.dataflow.new.internal.TypeTrackingImpl::TypeTrackingInput as TT |
Check warning
Code scanning / CodeQL
Names only differing by case Warning test
5e804c2
to
83c7959
Compare
a9899ec
to
fd3ea14
Compare
fd3ea14
to
6fc9e61
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see why we need a non-standard local-step relation, as we are filtering out steps to module variable nodes. Perhaps we should be doing that in a different place. We are also using a richer local-step relation than the one for type tracking to compute local source nodes. I wonder if that is actually a mistake...
One question about deprecation, and then a few more to satisfy my own curiosity, but this looks generally fine.
cached | ||
predicate flowsTo(Node localSource, Node dst) { | ||
predicate standardFlowsTo(Node localSource, Node dst) { | ||
not nonStandardFlowsTo(_, _) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic (no standardFlowsTo
if nonStandardFlowsTo
is non-empty) would be more clearly visible in the flowsTo
predicate. That might be less performant, though, if standardFlowsTo
is then computed for no reason. Is that why it is here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, that's why it's there.
or | ||
this instanceof ParameterNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I am surprised we did not have this before!
I am curious, did you observe test-failures from this?
Given the first case
this instanceof ExprNode and
not simpleLocalFlowStepForTypetracking(_, this)
this will only add synthetic parameter nodes (I also verified this with a quick query), but there can still be lots of those...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I am surprised we did not have this before!
Me too.
I am curious, did you observe test-failures from this?
The shared type tracking library (sensibly) assumes parameters to be local source nodes:
predicate callStep(Node nodeFrom, LocalSourceNode nodeTo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah!
@@ -151,7 +153,7 @@ class LocalSourceNode extends Node { | |||
* See `TypeBackTracker` for more details about how to use this. | |||
*/ | |||
pragma[inline] | |||
LocalSourceNode backtrack(TypeBackTracker t2, TypeBackTracker t) { t2 = t.step(result, this) } | |||
LocalSourceNode backtrack(TypeBackTracker t2, TypeBackTracker t) { t = t2.step(result, this) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change makes me wonder if the QLDoc for track
and backtrack
ought to mention t
and t2
. But I guess now it is at least aligned with other language implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the QL doc changed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, and even if it had mentioned t
and t2
it would not have to change. It just makes it clear that there are two possible conventions here and we have two implementations that made different choices. So we should probably make explicit what role t
and t2
are supposed to play. It is tangential to this PR, though.
or | ||
capturedJumpStep(nodeFrom, nodeTo) | ||
} | ||
predicate simpleLocalFlowStep = TypeTrackingImpl::TypeTrackingInput::simpleLocalSmallStep/2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these not also be deprecated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll deprecate those as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR adopts the shared type tracking library in Python, just like we recently did for Ruby. Unlike the Ruby PR, I have made no attempt at enabling consistency checks (and fixing inconsistencies), and I had to add a hook to the shared library for using a custom
flowsTo
predicate, which ideally shouldn't be needed. I'll let it be up to the @github/codeql-python team to investigate both aspects follow-up.