Dataflow: Simplify revFlowThrough #18355

smowton · 2024-12-20T19:21:09Z

Observations:

* `revFlowThrough` can be much larger than the other reverse-flow predicates, presumably when there are many different `innerReturnAp`s.
* It is only ever used in conjunction with `flowThroughIntoCall`, which can therefore be pushed in, and several of its parameters can thereby be dropped in exchange for exposing `arg`.
* `revFlowThroughArg` can then be trivially inlined.

Result: on repository `go-gitea/gitea` with PR #17701 producing a wider selection of access paths than are seen on `main`, `revFlowThrough` drops in size from ~120m tuples to ~4m, and the runtime of the reverse-flow computation for dataflow stage 4 goes from dominating the forward-flow cost to relatively insignificant. Overall runtime falls from 3 minutes to 2 with substantial RAM available, and presumably falls much more under GHA-style memory pressure.
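To make the effect of pushing the filter in concrete, here is a minimal Python sketch of the relational idea. It is a toy model, not the QL in `DataFlowImpl.qll`: the relation names only echo the predicates discussed above, the columns are reduced (no `state`, `returnCtx` or `returnAp`), and all data and helper names (`build_toy_relations`, `wide_join`, `filter_afterwards`, `pushed_in_join`) are invented for illustration. The sizes it prints say nothing about the real 120m and 4m figures.

```python
# Toy model only: relation names echo the QL predicates, but the data and the
# reduced column sets are invented for illustration.

def build_toy_relations(n_aps=20, n_calls=5):
    # revFlowParamToReturn-like: (p, pos, inner_return_ap, ap)
    param_to_return = {
        ("p0", "ret0", f"inner{i}", f"ap{j}")
        for i in range(n_aps) for j in range(n_aps)
    }
    # revFlowIsReturned-like: (call, pos, inner_return_ap)
    is_returned = {
        (f"call{c}", "ret0", f"inner{i}")
        for c in range(n_calls) for i in range(n_aps)
    }
    # flowThroughIntoCall-like: (call, arg, p, ap, inner_return_ap)
    into_call = {
        ("call0", "arg0", "p0", f"ap{i}", f"inner{i}") for i in range(n_aps)
    }
    return param_to_return, is_returned, into_call


def wide_join(param_to_return, is_returned):
    """Join on (pos, inner_return_ap) and materialise every column."""
    calls_by_key = {}
    for call, pos, inner in is_returned:
        calls_by_key.setdefault((pos, inner), set()).add(call)
    return {
        (call, p, pos, inner, ap)
        for p, pos, inner, ap in param_to_return
        for call in calls_by_key.get((pos, inner), ())
    }


def filter_afterwards(wide, into_call):
    """Apply the flowThroughIntoCall-like filter to the wide relation."""
    args_by_key = {}
    for call, arg, p, ap, inner in into_call:
        args_by_key.setdefault((call, p, ap, inner), set()).add(arg)
    return {
        (arg, ap)
        for call, p, pos, inner, ap in wide
        for arg in args_by_key.get((call, p, ap, inner), ())
    }


def pushed_in_join(param_to_return, is_returned, into_call):
    """Start from the filter, so only (arg, ap) pairs are materialised."""
    pos_by_key = {}
    for p, pos, inner, ap in param_to_return:
        pos_by_key.setdefault((p, inner, ap), set()).add(pos)
    return {
        (arg, ap)
        for call, arg, p, ap, inner in into_call
        for pos in pos_by_key.get((p, inner, ap), ())
        if (call, pos, inner) in is_returned
    }


param_to_return, is_returned, into_call = build_toy_relations()
wide = wide_join(param_to_return, is_returned)
pushed = pushed_in_join(param_to_return, is_returned, into_call)
print(len(wide), len(pushed), filter_afterwards(wide, into_call) == pushed)
# prints: 2000 20 True
```

In this toy data the wide relation pairs every `ap` with every `inner_return_ap` at every call, while the filter admits only one combination per access path, so both routes produce the same answer but the pushed-in route never builds the large intermediate.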

Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)

shared/dataflow/codeql/dataflow/internal/DataFlowImpl.qll: Language not supported

hvitved

Nice! I have started DCA for all languages, let's wait for the result of that before merging.

aschackmull · 2025-01-02T14:37:33Z

shared/dataflow/codeql/dataflow/internal/DataFlowImpl.qll

+            flowThroughIntoCall(call, arg, p, ap, innerReturnAp) and
+            revFlowParamToReturn(p, state, pos, innerReturnAp, ap) and
+            revFlowIsReturned(call, returnCtx, returnAp, pos, innerReturnAp)


This is disrupting a non-linear join, so it is very likely beneficial to push `flowThroughIntoCall` further into one of the recursive conjuncts in some way; as-is we likely have some inefficiency in at least one of the two delta+prev combinations. It would also be nice to understand in more depth which columns contribute to the blowup, and how.
We can safely push the projection `flowThroughIntoCall(_, _, p, ap, innerReturnAp)` into `revFlowParamToReturn`, which may be beneficial, as that is a pure filter on a conjunct that precedes the non-linear join. We cannot push in the other columns, though, as that would amount to joining with the call graph a bit too soon (`revFlowIsReturned` is meant to constrain exactly that part as much as possible).
OTOH, it may very well be good to push `flowThroughIntoCall` in its entirety into `revFlowIsReturned`, as that already contains the call graph edge. If a projection of `flowThroughIntoCall` to a pure filter in that case turns out to yield a beneficial tuple reduction, then the join of `revFlowOut` and `returnFlowsThrough` (which occurs in a few places) ought to be revised, as `flowThroughIntoCall` already contains a projected version of `returnFlowsThrough`.
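For readers unfamiliar with the "delta+prev" terminology, the following is a generic sketch of semi-naive evaluation for a rule with two recursive conjuncts (a non-linear join). It uses textbook transitive closure, `path(x, z) :- path(x, y), path(y, z)`, as a stand-in; it is an assumption-laden illustration of the general Datalog technique, not a description of how the CodeQL evaluator actually schedules this particular join. The helper names `join` and `transitive_closure_seminaive` are hypothetical.

```python
def join(left, right):
    """Compose two edge sets: (x, y) in left and (y, z) in right give (x, z)."""
    targets_by_source = {}
    for y, z in right:
        targets_by_source.setdefault(y, set()).add(z)
    return {(x, z) for x, y in left for z in targets_by_source.get(y, ())}


def transitive_closure_seminaive(edges):
    known = set(edges)   # everything derived so far (includes the delta)
    delta = set(edges)   # tuples that were new in the previous round
    while delta:
        prev = known - delta  # tuples already known before the last round
        # A non-linear rule has two recursive conjuncts, so each round needs
        # two combinations: delta on the left against prev on the right, and
        # everything known on the left against delta on the right.
        candidates = join(delta, prev) | join(known, delta)
        delta = candidates - known
        known |= delta
    return known


print(sorted(transitive_closure_seminaive({(1, 2), (2, 3), (3, 4)})))
```

Each of the two combinations iterates one recursive conjunct's delta against the other conjunct's accumulated tuples. An extra conjunct sitting between them can help one combination and hurt the other, which is the concern the comment above raises.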

Copilot bot review requested due to automatic review settings December 20, 2024 19:21

Copilot AI reviewed Dec 20, 2024

smowton requested review from aschackmull and hvitved December 20, 2024 19:21

github-actions bot added the DataFlow Library label Dec 20, 2024

smowton mentioned this pull request Dec 20, 2024

Go: template/text.Template execution methods: support reading arbitrary content #17701

Open

hvitved approved these changes Dec 21, 2024

hvitved added the no-change-note-required This PR does not need a change note label Dec 21, 2024

aschackmull reviewed Jan 2, 2025
