Skip to content

Commit

Permalink
[red-knot] Use arena-allocated association lists for narrowing constr…
Browse files Browse the repository at this point in the history
…aints (#16306)

This PR adds an implementation of [association
lists](https://en.wikipedia.org/wiki/Association_list), and uses them to
replace the previous `BitSet`/`SmallVec` representation for narrowing
constraints.

An association list is a linked list of key/value pairs. We additionally
guarantee that the elements of an association list are sorted (by their
keys), and that they do not contain any entries with duplicate keys.

Association lists have fallen out of favor in recent decades, since you
often need operations that are inefficient on them. In particular,
looking up a random element by index is O(n), just like a linked list;
and looking up an element by key is also O(n), since you must do a
linear scan of the list to find the matching element. Luckily we don't
need either of those operations for narrowing constraints!

The typical implementation also suffers from poor cache locality and
high memory allocation overhead, since individual list cells are
typically allocated separately from the heap. We solve that last problem
by storing the cells of an association list in an `IndexVec` arena.

---------

Co-authored-by: Carl Meyer <[email protected]>
  • Loading branch information
dcreager and carljm authored Feb 25, 2025
1 parent 5c007db commit fa76f6c
Show file tree
Hide file tree
Showing 13 changed files with 1,073 additions and 322 deletions.
1 change: 0 additions & 1 deletion crates/red_knot_python_semantic/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,6 @@ pub(crate) mod symbol;
pub mod types;
mod unpack;
mod util;
mod visibility_constraints;

type FxOrderSet<V> = ordermap::set::OrderSet<V, BuildHasherDefault<FxHasher>>;

Expand Down
2 changes: 2 additions & 0 deletions crates/red_knot_python_semantic/src/semantic_index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,10 @@ mod builder;
pub(crate) mod constraint;
pub mod definition;
pub mod expression;
mod narrowing_constraints;
pub mod symbol;
mod use_def;
mod visibility_constraints;

pub(crate) use self::use_def::{
BindingWithConstraints, BindingWithConstraintsIterator, DeclarationWithConstraint,
Expand Down
20 changes: 10 additions & 10 deletions crates/red_knot_python_semantic/src/semantic_index/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,14 @@ use crate::module_name::ModuleName;
use crate::semantic_index::ast_ids::node_key::ExpressionNodeKey;
use crate::semantic_index::ast_ids::AstIdsBuilder;
use crate::semantic_index::attribute_assignment::{AttributeAssignment, AttributeAssignments};
use crate::semantic_index::constraint::{PatternConstraintKind, ScopedConstraintId};
use crate::semantic_index::constraint::{
Constraint, ConstraintNode, PatternConstraint, PatternConstraintKind, ScopedConstraintId,
};
use crate::semantic_index::definition::{
AssignmentDefinitionNodeRef, ComprehensionDefinitionNodeRef, Definition, DefinitionNodeKey,
DefinitionNodeRef, ForStmtDefinitionNodeRef, ImportFromDefinitionNodeRef,
AssignmentDefinitionNodeRef, ComprehensionDefinitionNodeRef, Definition, DefinitionCategory,
DefinitionNodeKey, DefinitionNodeRef, ExceptHandlerDefinitionNodeRef, ForStmtDefinitionNodeRef,
ImportDefinitionNodeRef, ImportFromDefinitionNodeRef, MatchPatternDefinitionNodeRef,
WithItemDefinitionNodeRef,
};
use crate::semantic_index::expression::{Expression, ExpressionKind};
use crate::semantic_index::symbol::{
Expand All @@ -28,17 +32,13 @@ use crate::semantic_index::symbol::{
use crate::semantic_index::use_def::{
EagerBindingsKey, FlowSnapshot, ScopedEagerBindingsId, UseDefMapBuilder,
};
use crate::semantic_index::visibility_constraints::{
ScopedVisibilityConstraintId, VisibilityConstraintsBuilder,
};
use crate::semantic_index::SemanticIndex;
use crate::unpack::{Unpack, UnpackValue};
use crate::visibility_constraints::{ScopedVisibilityConstraintId, VisibilityConstraintsBuilder};
use crate::Db;

use super::constraint::{Constraint, ConstraintNode, PatternConstraint};
use super::definition::{
DefinitionCategory, ExceptHandlerDefinitionNodeRef, ImportDefinitionNodeRef,
MatchPatternDefinitionNodeRef, WithItemDefinitionNodeRef,
};

mod except_handlers;

/// Are we in a state where a `break` statement is allowed?
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
//! # Narrowing constraints
//!
//! When building a semantic index for a file, we associate each binding with _narrowing
//! constraints_. The narrowing constraint is used to constrain the type of the binding's symbol.
//! Note that a binding can be associated with a different narrowing constraint at different points
//! in a file. See the [`use_def`][crate::semantic_index::use_def] module for more details.
//!
//! This module defines how narrowing constraints are stored internally.
//!
//! A _narrowing constraint_ consists of a list of _clauses_, each of which corresponds with an
//! expression in the source file (represented by a [`Constraint`]). We need to support the
//! following operations on narrowing constraints:
//!
//! - Adding a new clause to an existing constraint
//! - Merging two constraints together, which produces the _intersection_ of their clauses
//! - Iterating through the clauses in a constraint
//!
//! In particular, note that we do not need random access to the clauses in a constraint. That
//! means that we can use a simple [_sorted association list_][ruff_index::list] as our data
//! structure. That lets us use a single 32-bit integer to store each narrowing constraint, no
//! matter how many clauses it contains. It also makes merging two narrowing constraints fast,
//! since alists support fast intersection.
//!
//! Because we visit the contents of each scope in source-file order, and assign scoped IDs in
//! source-file order, that means that we will tend to visit narrowing constraints in order by
//! their IDs. This is exactly how to get the best performance from our alist implementation.
//!
//! [`Constraint`]: crate::semantic_index::constraint::Constraint
use ruff_index::list::{ListBuilder, ListSetReverseIterator, ListStorage};
use ruff_index::newtype_index;

use crate::semantic_index::constraint::ScopedConstraintId;

/// A narrowing constraint associated with a live binding.
///
/// A constraint is a list of clauses, each of which is a [`Constraint`] that constrains the type
/// of the binding's symbol.
///
/// An instance of this type represents a _non-empty_ narrowing constraint. You will often wrap
/// this in `Option` and use `None` to represent an empty narrowing constraint.
///
/// [`Constraint`]: crate::semantic_index::constraint::Constraint
#[newtype_index]
pub(crate) struct ScopedNarrowingConstraintId;

/// One of the clauses in a narrowing constraint, which is a [`Constraint`] that constrains the
/// type of the binding's symbol.
///
/// Note that those [`Constraint`]s are stored in [their own per-scope
/// arena][crate::semantic_index::constraint::Constraints], so internally we use a
/// [`ScopedConstraintId`] to refer to the underlying constraint.
///
/// [`Constraint`]: crate::semantic_index::constraint::Constraint
#[derive(Clone, Copy, Debug, Eq, Ord, PartialEq, PartialOrd)]
pub(crate) struct ScopedNarrowingConstraintClause(ScopedConstraintId);

impl ScopedNarrowingConstraintClause {
/// Returns (the ID of) the `Constraint` for this clause
pub(crate) fn constraint(self) -> ScopedConstraintId {
self.0
}
}

impl From<ScopedConstraintId> for ScopedNarrowingConstraintClause {
fn from(constraint: ScopedConstraintId) -> ScopedNarrowingConstraintClause {
ScopedNarrowingConstraintClause(constraint)
}
}

/// A collection of narrowing constraints for a given scope.
#[derive(Debug, Eq, PartialEq)]
pub(crate) struct NarrowingConstraints {
lists: ListStorage<ScopedNarrowingConstraintId, ScopedNarrowingConstraintClause>,
}

// Building constraints
// --------------------

/// A builder for creating narrowing constraints.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) struct NarrowingConstraintsBuilder {
lists: ListBuilder<ScopedNarrowingConstraintId, ScopedNarrowingConstraintClause>,
}

impl NarrowingConstraintsBuilder {
pub(crate) fn build(self) -> NarrowingConstraints {
NarrowingConstraints {
lists: self.lists.build(),
}
}

/// Adds a clause to an existing narrowing constraint.
pub(crate) fn add(
&mut self,
constraint: Option<ScopedNarrowingConstraintId>,
clause: ScopedNarrowingConstraintClause,
) -> Option<ScopedNarrowingConstraintId> {
self.lists.insert(constraint, clause)
}

/// Returns the intersection of two narrowing constraints. The result contains the clauses that
/// appear in both inputs.
pub(crate) fn intersect(
&mut self,
a: Option<ScopedNarrowingConstraintId>,
b: Option<ScopedNarrowingConstraintId>,
) -> Option<ScopedNarrowingConstraintId> {
self.lists.intersect(a, b)
}
}

// Iteration
// ---------

pub(crate) type NarrowingConstraintsIterator<'a> = std::iter::Copied<
ListSetReverseIterator<'a, ScopedNarrowingConstraintId, ScopedNarrowingConstraintClause>,
>;

impl NarrowingConstraints {
/// Iterates over the clauses in a narrowing constraint.
pub(crate) fn iter_clauses(
&self,
set: Option<ScopedNarrowingConstraintId>,
) -> NarrowingConstraintsIterator<'_> {
self.lists.iter_set_reverse(set).copied()
}
}

// Test support
// ------------

#[cfg(test)]
mod tests {
use super::*;

impl ScopedNarrowingConstraintClause {
pub(crate) fn as_u32(self) -> u32 {
self.0.as_u32()
}
}

impl NarrowingConstraintsBuilder {
pub(crate) fn iter_constraints(
&self,
set: Option<ScopedNarrowingConstraintId>,
) -> NarrowingConstraintsIterator<'_> {
self.lists.iter_set_reverse(set).copied()
}
}
}
44 changes: 32 additions & 12 deletions crates/red_knot_python_semantic/src/semantic_index/use_def.rs
Original file line number Diff line number Diff line change
Expand Up @@ -260,20 +260,22 @@ use ruff_index::{newtype_index, IndexVec};
use rustc_hash::FxHashMap;

use self::symbol_state::{
ConstraintIndexIterator, LiveBindingsIterator, LiveDeclaration, LiveDeclarationsIterator,
ScopedDefinitionId, SymbolBindings, SymbolDeclarations, SymbolState,
LiveBindingsIterator, LiveDeclaration, LiveDeclarationsIterator, ScopedDefinitionId,
SymbolBindings, SymbolDeclarations, SymbolState,
};
use crate::semantic_index::ast_ids::ScopedUseId;
use crate::semantic_index::constraint::{
Constraint, Constraints, ConstraintsBuilder, ScopedConstraintId,
};
use crate::semantic_index::definition::Definition;
use crate::semantic_index::narrowing_constraints::{
NarrowingConstraints, NarrowingConstraintsBuilder, NarrowingConstraintsIterator,
};
use crate::semantic_index::symbol::{FileScopeId, ScopedSymbolId};
use crate::visibility_constraints::{
use crate::semantic_index::visibility_constraints::{
ScopedVisibilityConstraintId, VisibilityConstraints, VisibilityConstraintsBuilder,
};

mod bitset;
mod symbol_state;

/// Applicable definitions and constraints for every use of a name.
Expand All @@ -286,6 +288,9 @@ pub(crate) struct UseDefMap<'db> {
/// Array of [`Constraint`] in this scope.
constraints: Constraints<'db>,

/// Array of narrowing constraints in this scope.
narrowing_constraints: NarrowingConstraints,

/// Array of visibility constraints in this scope.
visibility_constraints: VisibilityConstraints,

Expand Down Expand Up @@ -370,6 +375,7 @@ impl<'db> UseDefMap<'db> {
BindingWithConstraintsIterator {
all_definitions: &self.all_definitions,
constraints: &self.constraints,
narrowing_constraints: &self.narrowing_constraints,
visibility_constraints: &self.visibility_constraints,
inner: bindings.iter(),
}
Expand Down Expand Up @@ -416,6 +422,7 @@ type EagerBindings = IndexVec<ScopedEagerBindingsId, SymbolBindings>;
pub(crate) struct BindingWithConstraintsIterator<'map, 'db> {
all_definitions: &'map IndexVec<ScopedDefinitionId, Option<Definition<'db>>>,
pub(crate) constraints: &'map Constraints<'db>,
pub(crate) narrowing_constraints: &'map NarrowingConstraints,
pub(crate) visibility_constraints: &'map VisibilityConstraints,
inner: LiveBindingsIterator<'map>,
}
Expand All @@ -425,14 +432,16 @@ impl<'map, 'db> Iterator for BindingWithConstraintsIterator<'map, 'db> {

fn next(&mut self) -> Option<Self::Item> {
let constraints = self.constraints;
let narrowing_constraints = self.narrowing_constraints;

self.inner
.next()
.map(|live_binding| BindingWithConstraints {
binding: self.all_definitions[live_binding.binding],
narrowing_constraints: ConstraintsIterator {
narrowing_constraint: ConstraintsIterator {
constraints,
constraint_ids: live_binding.narrowing_constraints.iter(),
constraint_ids: narrowing_constraints
.iter_clauses(live_binding.narrowing_constraint),
},
visibility_constraint: live_binding.visibility_constraint,
})
Expand All @@ -443,13 +452,13 @@ impl std::iter::FusedIterator for BindingWithConstraintsIterator<'_, '_> {}

pub(crate) struct BindingWithConstraints<'map, 'db> {
pub(crate) binding: Option<Definition<'db>>,
pub(crate) narrowing_constraints: ConstraintsIterator<'map, 'db>,
pub(crate) narrowing_constraint: ConstraintsIterator<'map, 'db>,
pub(crate) visibility_constraint: ScopedVisibilityConstraintId,
}

pub(crate) struct ConstraintsIterator<'map, 'db> {
constraints: &'map Constraints<'db>,
constraint_ids: ConstraintIndexIterator<'map>,
constraint_ids: NarrowingConstraintsIterator<'map>,
}

impl<'db> Iterator for ConstraintsIterator<'_, 'db> {
Expand All @@ -458,7 +467,7 @@ impl<'db> Iterator for ConstraintsIterator<'_, 'db> {
fn next(&mut self) -> Option<Self::Item> {
self.constraint_ids
.next()
.map(|constraint_id| self.constraints[ScopedConstraintId::from_u32(constraint_id)])
.map(|narrowing_constraint| self.constraints[narrowing_constraint.constraint()])
}
}

Expand Down Expand Up @@ -509,7 +518,10 @@ pub(super) struct UseDefMapBuilder<'db> {
all_definitions: IndexVec<ScopedDefinitionId, Option<Definition<'db>>>,

/// Builder of constraints.
constraints: ConstraintsBuilder<'db>,
pub(super) constraints: ConstraintsBuilder<'db>,

/// Builder of narrowing constraints.
pub(super) narrowing_constraints: NarrowingConstraintsBuilder,

/// Builder of visibility constraints.
pub(super) visibility_constraints: VisibilityConstraintsBuilder,
Expand Down Expand Up @@ -542,6 +554,7 @@ impl Default for UseDefMapBuilder<'_> {
Self {
all_definitions: IndexVec::from_iter([None]),
constraints: ConstraintsBuilder::default(),
narrowing_constraints: NarrowingConstraintsBuilder::default(),
visibility_constraints: VisibilityConstraintsBuilder::default(),
scope_start_visibility: ScopedVisibilityConstraintId::ALWAYS_TRUE,
bindings_by_use: IndexVec::new(),
Expand Down Expand Up @@ -578,8 +591,9 @@ impl<'db> UseDefMapBuilder<'db> {
}

pub(super) fn record_constraint_id(&mut self, constraint: ScopedConstraintId) {
let narrowing_constraint = constraint.into();
for state in &mut self.symbol_states {
state.record_constraint(constraint);
state.record_constraint(&mut self.narrowing_constraints, narrowing_constraint);
}
}

Expand Down Expand Up @@ -737,10 +751,15 @@ impl<'db> UseDefMapBuilder<'db> {
let mut snapshot_definitions_iter = snapshot.symbol_states.into_iter();
for current in &mut self.symbol_states {
if let Some(snapshot) = snapshot_definitions_iter.next() {
current.merge(snapshot, &mut self.visibility_constraints);
current.merge(
snapshot,
&mut self.narrowing_constraints,
&mut self.visibility_constraints,
);
} else {
current.merge(
SymbolState::undefined(snapshot.scope_start_visibility),
&mut self.narrowing_constraints,
&mut self.visibility_constraints,
);
// Symbol not present in snapshot, so it's unbound/undeclared from that path.
Expand All @@ -763,6 +782,7 @@ impl<'db> UseDefMapBuilder<'db> {
UseDefMap {
all_definitions: self.all_definitions,
constraints: self.constraints.build(),
narrowing_constraints: self.narrowing_constraints.build(),
visibility_constraints: self.visibility_constraints.build(),
bindings_by_use: self.bindings_by_use,
public_symbols: self.symbol_states,
Expand Down
Loading

0 comments on commit fa76f6c

Please sign in to comment.