Provide consistent concept for DFG/EOG for Overlay nodes #1994

oxisto · 2025-01-28T09:38:17Z

We need a consistent concept for

DFG and EOG edges for the operations and concepts
Is it allowed to put DFG edges between concepts/operations?
Put the DFG edges on the "traditional world" based on the concepts
Guidance where to place overlay nodes -> Is it allowed to have multiple "underlay" nodes (currently NOT)
Does that effect the queries?
- For example, it can be "dangerous", to make DFG shortcuts, because then we do not "see" the traditional DFG path anymore and cannot check if something happens on this path
Does it make sense to introduce a new ConceptGranularity to have a "concept-world" DFG edge

The text was updated successfully, but these errors were encountered:

maximiliankaul · 2025-02-04T16:44:14Z

Summary of today's discussion:

DFG and EOG edges

We will mark new DFG/EOG edges.
- This allows us to ignore these new edges in existing passes, thus not breaking existing behavior.
We will connect Operation nodes to the CPG nodes to allow for nicer queries (all DFG paths from FileRead must... instead of all nodes, where an OverlayNode of type FileRead exists, must ...

graph TD
  FileRead["Operation[FileRead]"]
  CallExpr["CallExpression[read]"]

  FileRead -->|DFG| CallExpr

New edges will only be introduced wherever they provide a benefit.
- This allows us to connect nodes where the original CPG would not find a connection (e.g. HTTP endpoint and client call).
- This does not introduce shorter paths as no new edge is introduced if a DFG/... path is already present in the "CPG world"
We'll have to be careful as this can result in new loops.

Deciding if all paths contain a required `Concept` node

This is not easy, as there will always be the path using the concept nodes and the one ignoring it:

graph TD
    prev["before foo()"]
    foo["foo()"]
    after["after foo()"]
    concept["concept"]
    prev -->|DFG| foo -->|DFG| after
    concept -->|DFG| foo
    foo -->|DFG| concept

Currently, the best solution is to rephrase such a query to check for a node with an overlayNode of Type X on the path. However, @konradweiss has another suggestion to add to this discussion.

Where to place nodes?

This is purely a design decision. Suggestion:

One node per "concept instance" (instead of one global node per concept)
The concept node is connected to the CPG node creating the object (e.g. API call / malloc / ...).
Operation nodes are connected to the respective CallExpressions.

Concept Reference

We have concepts with fields representing other concepts. E.g. a DiskEncryption concept requires a Secret as a key.

We'll introduce a ConceptReference or something similar.
Having a reference keeps us from dealing with code where we cannot decide which concrete instance of a concept is used (e.g. a Secret set in a branch being used outside the branch). However, this can still be resolved manually to the best of the CPG's ability).

Multiple "underlay" nodes

Think of the following code snippet:

key = SomeLib.getKey()
foo(key)
bar(key)

Here, we will not create edges from the call arguments to the Secret concept. We will provide an (extension?) function allowing the end user to easily check if a given Reference (i.e. key) node can be a key by checking prevDFG. This is purely a convenience function and should result in easier to read queries.
There is also the option to connect all occurrences of key with the concept, however we do not see a benefit of this approach. This is purely a design decision and should yield equivalent results.

Do we have to require a `Operation` class

The question on the necessity of having an Operation for creating a Concept was raised and has to be evaluated.

oxisto assigned maximiliankaul and konradweiss Jan 28, 2025

oxisto added the DFG label Jan 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide consistent concept for DFG/EOG for Overlay nodes #1994

Provide consistent concept for DFG/EOG for Overlay nodes #1994

oxisto commented Jan 28, 2025 •

edited

Loading

maximiliankaul commented Feb 4, 2025

Provide consistent concept for DFG/EOG for Overlay nodes #1994

Provide consistent concept for DFG/EOG for Overlay nodes #1994

Comments

oxisto commented Jan 28, 2025 • edited Loading

maximiliankaul commented Feb 4, 2025

DFG and EOG edges

Deciding if all paths contain a required Concept node

Where to place nodes?

Concept Reference

Multiple "underlay" nodes

Do we have to require a Operation class

oxisto commented Jan 28, 2025 •

edited

Loading

Deciding if all paths contain a required `Concept` node

Do we have to require a `Operation` class