[red-knot] Separate 'infer_expression_type' query #15942

sharkdp · 2025-02-04T15:46:56Z

Summary

As a follow-up to the discussion here, this changeset adds a new salsa query

fn infer_expression_type<'db>(db: &'db dyn Db, expression: Expression<'db>) -> Type<'db>

which is similar to infer_expression_types (plural), but returns the type of the expression directly instead of returning a TypeInference object. We can use this in a few places. Notably, it can't be used here:

ruff/crates/red_knot_python_semantic/src/types/narrow.rs

Lines 500 to 504 in 9d83e76

    
    let scope = self.scope();
    let inference = infer_expression_types(self.db, cls);
    let ty = inference
        .expression_type(cls.node_ref(self.db).scoped_expression_id(self.db, scope))
        .to_instance(self.db);

because that uses self.scope(), not cls.scope().
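To make the difference between the two entry points concrete, here is a minimal sketch with toy types. `NodeKey`, `ScopeId`, `Expression`, `TypeInference` and the "inference" rule are all hypothetical stand-ins, not red-knot's definitions, and nothing here goes through salsa. It only shows that the singular variant always performs the lookup under the expression's own scope, whereas the plural variant leaves the choice of scope to the caller.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; none of these mirror red-knot's real definitions.
type NodeKey = &'static str;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ScopeId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Type {
    Int,
    Str,
}

/// An expression together with the single scope it belongs to.
struct Expression {
    node: NodeKey,
    scope: ScopeId,
}

/// Toy counterpart of a `TypeInference` result: types keyed by (scope, node).
struct TypeInference {
    types: HashMap<(ScopeId, NodeKey), Type>,
}

impl TypeInference {
    /// Lookup needs the scope the node was recorded in; any other scope misses.
    fn expression_type(&self, scope: ScopeId, node: NodeKey) -> Option<Type> {
        self.types.get(&(scope, node)).copied()
    }
}

/// Plural variant: returns the whole inference result.
fn infer_expression_types(expr: &Expression) -> TypeInference {
    // Pretend inference: string literals are `Str`, everything else is `Int`.
    let ty = if expr.node.starts_with('"') { Type::Str } else { Type::Int };
    let mut types = HashMap::new();
    types.insert((expr.scope, expr.node), ty);
    TypeInference { types }
}

/// Singular variant: always looks the expression up under its own scope.
fn infer_expression_type(expr: &Expression) -> Type {
    infer_expression_types(expr)
        .expression_type(expr.scope, expr.node)
        .expect("an expression is recorded under its own scope")
}
```

In this toy model, a caller of the plural variant that passes a scope other than `expr.scope` gets `None`, which is the kind of mismatch the `self.scope()` vs. `cls.scope()` remark is about.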

Test Plan

—

sharkdp · 2025-02-04T15:52:19Z

crates/red_knot_python_semantic/src/types/unpacker.rs

@@ -42,8 +42,7 @@ impl<'db> Unpacker<'db> {
            "Unpacking target must be a list or tuple expression"
        );

-        let mut value_ty = infer_expression_types(self.db(), value.expression())
-            .expression_type(value.scoped_expression_id(self.db(), self.scope));


I just noticed that this is not semantically equivalent (uses self.scope, not value.expression().scope()). Will check if that's okay.

It seems to me that this should be the same? But someone familiar with this code should probably double-check.

I added a

assert_eq!(value.expression().scope(self.db()), self.scope);

assertion locally and all tests passed.

@dhruvmanila probably has the most context on this module?

I don't think it's necessary to use infer_expression_type here because value always belongs to the same file (Unpacker never does any cross-module type inference).

In general, an expression node can only be part of one scope, and that scope is the only possible correct scope to pass to scoped_expression_id for that expression. So I believe that in any case where we are doing this pattern, either you could use infer_expression_type instead with no change in behavior, or the existing code was buggy. I think the only reason existing code wouldn't use the scope of the expression itself is that it happened to already have that same scope stored in a more convenient place. I think this is also true for the narrowing code you linked in the PR summary.

(Note this comment is just about semantic correctness, and independent of the whole question of whether we should introduce another layer of Salsa query in all of these places.)

I think the current implementation (as it is on main) could be incorrect once it is used to unpack in comprehensions. The reason is that the value expression of the first generator comes from the outer scope, not the comprehension scope to which the unpacking would belong once implemented (#15369).

@dhruvmanila I could be remembering wrong, but I would think we should give that "value expression of the first generator" an expression ID in the scope where it should be evaluated, maintaining the invariant that every expression belongs to exactly one scope, which I think would mean the current implementation of this helper is correct?

Sounds like this case deserves some investigation.

> I could be remembering wrong, but I would think we should give that "value expression of the first generator" an expression ID in the scope where it should be evaluated, maintaining the invariant that every expression belongs to exactly one scope, which I think would mean the current implementation of this helper is correct?

Oh, I think you're correct. This means the previous implementation would probably have been incorrect for comprehensions: the unpacking would happen in the comprehension scope while the expression is in the outer scope, but the previous implementation would try to look up the expression in the comprehension scope, where it doesn't exist.

We should already be giving the expression of the first generator an expression ID in the scope where it should be evaluated:

ruff/crates/red_knot_python_semantic/src/semantic_index/builder.rs

Lines 627 to 631 in 17245b2

    // The `iter` of the first generator is evaluated in the outer scope, while all subsequent
    // nodes are evaluated in the inner scope.
    self.add_standalone_expression(&generator.iter);
    self.visit_expr(&generator.iter);
    self.push_scope(scope);
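As a rough illustration of that ordering, the sketch below uses a toy `Builder` (hypothetical, not the real semantic index builder) that records the scope in which each generator's `iter` expression is registered. The only point is the order of operations: the first `iter` is registered before the comprehension scope is pushed, and every later one after.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ScopeId(usize);

/// Toy builder: tracks a scope stack and where each expression was recorded.
struct Builder {
    scope_stack: Vec<ScopeId>,
    next_scope: usize,
    recorded: Vec<(&'static str, ScopeId)>,
}

impl Builder {
    fn new() -> Self {
        Builder {
            scope_stack: vec![ScopeId(0)], // scope 0 plays the role of the outer scope
            next_scope: 1,
            recorded: Vec::new(),
        }
    }

    fn current_scope(&self) -> ScopeId {
        *self.scope_stack.last().expect("scope stack is never empty")
    }

    fn push_scope(&mut self) {
        let scope = ScopeId(self.next_scope);
        self.next_scope += 1;
        self.scope_stack.push(scope);
    }

    fn pop_scope(&mut self) {
        self.scope_stack.pop();
    }

    fn add_expression(&mut self, expr: &'static str) {
        let scope = self.current_scope();
        self.recorded.push((expr, scope));
    }

    /// `iters` holds the `iter` expression of each generator, in source order.
    fn visit_comprehension(&mut self, iters: &[&'static str]) {
        let mut iters = iters.iter();
        // The first `iter` is registered while the outer scope is still current.
        if let Some(&first) = iters.next() {
            self.add_expression(first);
        }
        // Everything else belongs to the comprehension scope.
        self.push_scope();
        for &iter in iters {
            self.add_expression(iter);
        }
        self.pop_scope();
    }
}

/// Returns the scope an expression was recorded in, if any.
fn scope_of(builder: &Builder, expr: &str) -> Option<ScopeId> {
    builder
        .recorded
        .iter()
        .find(|(e, _)| *e == expr)
        .map(|(_, scope)| *scope)
}
```

For something like `[x for x in xs for y in ys]`, calling `visit_comprehension(&["xs", "ys"])` records `xs` in the outer scope and `ys` in the comprehension scope, so each expression still belongs to exactly one scope.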

AlexWaygood

Lovely!

MichaReiser

Whether we want to use infer_expression_type or not is slightly more subtle. The way you changed it in this PR isn't incorrect, but it means that we pay for an extra salsa query even in cases where the isolation isn't necessary.

We should only use infer_expression_type if it isn't guaranteed that the expression belongs to the same file as the enclosing query. That guarantee does hold in, for example, TypeInferenceBuilder and Unpacker, where all AST nodes come from the same file, so the query isn't needed there. We do want the isolation in Type::own_member because there's no guarantee from which file (if any) the method is called.

MichaReiser · 2025-02-04T17:36:32Z

crates/red_knot_python_semantic/src/types/infer.rs

+// Similar to `infer_expression_types` (with the same restrictions). Directly returns the
+// type of the overall expression. This is a salsa query because it accesses `node_ref`,
+// which is sensitive to changes in the AST. Making it a query allows downstream queries
+// to short-circuit if the result type has not changed.


We may want to add a note that it isn't necessary to use this query in TypeInferenceBuilder or anywhere else where it's known that the expression belongs to the current file.
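The short-circuiting mentioned in the doc comment can be sketched by hand. This is not salsa's API; `Downstream` and the toy `infer_expression_type` below are assumptions made only to show the idea that a consumer keyed on the resulting type does no new work when the source (standing in for the AST) changes but the type does not.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
enum Type {
    Int,
    Str,
}

/// Toy "inference" that depends on the source text, standing in for the AST.
fn infer_expression_type(source: &str) -> Type {
    if source.trim().starts_with('"') {
        Type::Str
    } else {
        Type::Int
    }
}

/// Hypothetical downstream consumer that caches on the type it last saw.
struct Downstream {
    last_input: Option<Type>,
    cached: String,
    recomputations: u32,
}

impl Downstream {
    fn new() -> Self {
        Downstream {
            last_input: None,
            cached: String::new(),
            recomputations: 0,
        }
    }

    /// Recomputes only when the inferred type differs from the previous one.
    fn get(&mut self, source: &str) -> &str {
        let ty = infer_expression_type(source);
        if self.last_input.as_ref() != Some(&ty) {
            self.cached = format!("value has type {:?}", ty);
            self.last_input = Some(ty);
            self.recomputations += 1;
        }
        &self.cached
    }
}
```

Here a whitespace-only edit changes the input text but not the resulting `Type`, so the downstream value is reused. The cost is the extra layer of bookkeeping, which is the trade-off discussed in the review.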

MichaReiser · 2025-02-04T17:37:45Z

crates/red_knot_python_semantic/src/types/unpacker.rs

@@ -42,8 +42,7 @@ impl<'db> Unpacker<'db> {
            "Unpacking target must be a list or tuple expression"
        );

-        let mut value_ty = infer_expression_types(self.db(), value.expression())
-            .expression_type(value.scoped_expression_id(self.db(), self.scope));


I don't think it's necessary to use infer_expression_type here because value always belongs to the same file (Unpacker never does any cross-module type inference).

MichaReiser · 2025-02-04T17:39:05Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

-                let scope = test_expr.scope(db);
-                let ty = inference
-                    .expression_type(test_expr.node_ref(db).scoped_expression_id(db, scope));
+                let ty = infer_expression_type(db, test_expr);


I'm not 100% sure whether we should use infer_expression_type here. We should only use it if analyze_single can be reached from a module other than the one where test_expr is defined.

Hmm, the new API is much more convenient and ergonomic. If we expect it to be slower in many cases, it feels like we're adding a footgun (making something that we should only do in very specific cases easy to do in all cases).

We can have two different methods for it but it's important that we use the right method in the right context. I do think infer_expression_type as a query will be useful in other situations too, but it's not a "fits all" solution.

It seems to me that this is sufficiently useful on ergonomic grounds, that it might even be worth having two variants of it, a Salsa-cached one and a non-Salsa-cached one. (The Salsa-cached one could just call the non-Salsa-cached one.)
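A minimal sketch of the two-variant idea, assuming a plain `HashMap` in place of salsa's memoization: `Db`, `infer_expression_type_uncached` and the toy inference rule are hypothetical names used only for illustration. The cached variant is a thin wrapper that delegates to the uncached one on a miss.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Type {
    Int,
    Str,
}

/// Uncached variant: cheap to call when the caller already lives in the same file.
fn infer_expression_type_uncached(source: &str) -> Type {
    // Pretend inference: string literals are `Str`, everything else is `Int`.
    if source.starts_with('"') {
        Type::Str
    } else {
        Type::Int
    }
}

/// Hypothetical database with a hand-rolled cache (a stand-in for a salsa query).
struct Db {
    cache: HashMap<String, Type>,
    misses: u32,
}

impl Db {
    fn new() -> Self {
        Db {
            cache: HashMap::new(),
            misses: 0,
        }
    }

    /// Cached variant: delegates to the uncached one only on a cache miss.
    fn infer_expression_type(&mut self, source: &str) -> Type {
        if let Some(ty) = self.cache.get(source) {
            return *ty;
        }
        self.misses += 1;
        let ty = infer_expression_type_uncached(source);
        self.cache.insert(source.to_owned(), ty);
        ty
    }
}
```

Both variants return the same answer; the difference is only whether the caller pays for the cache lookup and storage, so picking between them is a question of context rather than correctness.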

I think it would also be very valuable if we could implement (via some combination of types, visibility, and code organization) a clearer distinction between "code that can be called across files" (mostly I think this code should live in types.rs and on Type today? Because Type is how type information travels between files) and "code that exclusively operates on only one file", and ensure that the former code can never see an AST directly. This might be a non-trivial refactor, but I think it would be worth it. Maybe worth creating an issue for, at least?

I created #15949 to follow up on the latter.

MichaReiser · 2025-02-20T11:40:28Z

I tried to incorporate this change into #16268

[red-knot] Separate 'infer_expression_type' query

18bb0a3

sharkdp added the red-knot Multi-file analysis & type inference label Feb 4, 2025

sharkdp requested review from carljm, MichaReiser and AlexWaygood as code owners February 4, 2025 15:46

AlexWaygood approved these changes Feb 4, 2025

MichaReiser reviewed Feb 4, 2025

sharkdp closed this Feb 21, 2025
