Auto generate ast expression nodes #16285

Open
wants to merge 8 commits into
base: main
Changes from 1 commit
36 changes: 29 additions & 7 deletions crates/ruff_python_ast/ast.toml
@@ -81,13 +81,6 @@ anynode_is_label = "expression"
 rustdoc = "/// See also [expr](https://docs.python.org/3/library/ast.html#ast.expr)"

 [Expr.nodes]
-ExprBoolOp = {}
-ExprNamed = {}
-ExprBinOp = {}
-ExprUnaryOp = {}
-ExprLambda = {}
-ExprIf = {}
-ExprDict = {}
 ExprSet = {}
 ExprListComp = {}
 ExprSetComp = {}
@@ -113,6 +106,35 @@ ExprList = {}
 ExprTuple = {}
 ExprSlice = {}
 ExprIpyEscapeCommand = {}
+ExprBoolOp = { fields = [
+    { name = "op", type = "BoolOp" },
+    { name = "values", type = "Expr", seq = true }
+]}
Member

> Is the AST definition easy to understand and maintain?

Given this concern, this syntax does start to look a bit unwieldy.

> Is it easy to parse in our code generator?

But on the other hand, continuing to use TOML still makes it easier to iterate on the code generator itself.

So given all of this, I'd still lean towards using this TOML syntax. We can always revisit the syntax later if we feel there's something that e.g. ungrammar would give us that the TOML doesn't.

+ExprNamed = { fields = [
+    { name = "target", type = "Expr" },
+    { name = "value", type = "Expr" }
+]}
+ExprBinOp = { fields = [
+    { name = "left", type = "Expr" },
+    { name = "op", type = "Operator" },
+    { name = "right", type = "Expr" }
+]}
+ExprUnaryOp = { fields = [
+    { name = "op", type = "UnaryOp" },
+    { name = "operand", type = "Expr" }
+]}
Member

As a style nit, the following are equivalent TOML spellings (i.e., they shouldn't require changes to the generator) that I think might be a bit easier to read:

[Expr.nodes.ExprUnaryOp]
fields = [
  { name = "op", type = "UnaryOp" },
  { name = "operand", type = "Expr" }
]

or

[[Expr.nodes.ExprUnaryOp.fields]]
name = "op"
type = "UnaryOp"

[[Expr.nodes.ExprUnaryOp.fields]]
name = "operand"
type = "Expr"

I have a slight preference for the last one, since it eliminates the nested maps and lists from the TOML. What do you think?

Contributor Author

Sounds good.
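
The equivalence claimed in the review comment above can be checked with a short sketch. It parses the inline form from the diff and the two suggested spellings with Python's `tomllib` (standard library since Python 3.11; the fallback to the third-party `tomli` package is an assumption for older interpreters) and compares the resulting dictionaries. The constant names are illustrative and do not come from the PR.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    # Assumption: on older interpreters the tomli backport is installed.
    import tomli as tomllib

# Spelling used in the diff: an inline table holding a multi-line array.
INLINE = """
[Expr.nodes]
ExprUnaryOp = { fields = [
    { name = "op", type = "UnaryOp" },
    { name = "operand", type = "Expr" }
]}
"""

# First suggestion: a dedicated sub-table per node.
SUBTABLE = """
[Expr.nodes.ExprUnaryOp]
fields = [
  { name = "op", type = "UnaryOp" },
  { name = "operand", type = "Expr" }
]
"""

# Second suggestion: one array-of-tables entry per field.
ARRAY_OF_TABLES = """
[[Expr.nodes.ExprUnaryOp.fields]]
name = "op"
type = "UnaryOp"

[[Expr.nodes.ExprUnaryOp.fields]]
name = "operand"
type = "Expr"
"""

parsed = [tomllib.loads(doc) for doc in (INLINE, SUBTABLE, ARRAY_OF_TABLES)]

# All three spellings produce the same nested dictionary.
assert parsed[0] == parsed[1] == parsed[2]

fields = parsed[0]["Expr"]["nodes"]["ExprUnaryOp"]["fields"]
print([field["name"] for field in fields])
```

Because the parsed structure is identical, a generator that reads `node.get("fields")` should see the same list of field dictionaries whichever spelling the TOML file uses.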

+ExprLambda = { fields = [
+    { name = "parameters", type = "Parameters", optional = true },
+    { name = "body", type = "Expr" }
+]}
+ExprIf = { fields = [
+    { name = "test", type = "Expr" },
+    { name = "body", type = "Expr" },
+    { name = "orelse", type = "Expr" }
+]}
+ExprDict = { fields = [
+    { name = "items", type = "DictItem", seq = true },
+]}

 [ExceptHandler]
 rustdoc = "/// See also [excepthandler](https://docs.python.org/3/library/ast.html#ast.excepthandler)"
63 changes: 63 additions & 0 deletions crates/ruff_python_ast/generate.py
@@ -90,11 +90,30 @@ class Node:
     name: str
     variant: str
     ty: str
+    fields: list[Field] | None

     def __init__(self, group: Group, node_name: str, node: dict[str, Any]) -> None:
         self.name = node_name
         self.variant = node.get("variant", node_name.removeprefix(group.name))
         self.ty = f"crate::{node_name}"
+        self.fields = None
+        fields = node.get("fields")
+        if fields is not None:
+            self.fields = [Field(f) for f in fields]
+
+
+@dataclass
+class Field:
+    name: str
+    ty: str
+    seq: bool
+    optional: bool
+
+    def __init__(self, field: dict[str, Any]) -> None:
+        self.name = field["name"]
+        self.ty = field["type"]
+        self.seq = field.get("seq", False)
+        self.optional = field.get("optional", False)


 # ------------------------------------------------------------------------------
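
The hunk above maps each TOML field entry onto a `Field`: `name` and `type` are required keys, while `seq` and `optional` default to `False` when absent. A minimal standalone sketch of that mapping follows. It uses dataclass defaults and a hypothetical `from_toml` classmethod instead of the handwritten `__init__` in the diff, so it illustrates the behaviour rather than reproducing the PR's code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Field:
    name: str
    ty: str
    seq: bool = False
    optional: bool = False

    @classmethod
    def from_toml(cls, field: dict[str, Any]) -> "Field":
        # `from_toml` is a hypothetical helper, not a name from the PR.
        # Missing "seq"/"optional" keys fall back to False, as in the diff.
        return cls(
            name=field["name"],
            ty=field["type"],
            seq=field.get("seq", False),
            optional=field.get("optional", False),
        )


# Entries copied from the ast.toml changes in this PR.
values = Field.from_toml({"name": "values", "type": "Expr", "seq": True})
parameters = Field.from_toml(
    {"name": "parameters", "type": "Parameters", "optional": True}
)

print(values)
print(parameters)
```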
@@ -547,6 +566,49 @@ def write_nodekind(out: list[str], ast: Ast) -> None:
     """)


+# ------------------------------------------------------------------------------
+
+
+def write_node(out: list[str], ast: Ast) -> None:
+    write_node_list = [
+        "ExprBoolOp",
+        "ExprNamed",
+        "ExprBinOp",
+        "ExprUnaryOp",
+        "ExprLambda",
+        "ExprIf",
+        "ExprDict",
+    ]
+    group_names = [group.name for group in ast.groups]
+    node_names = [node.name for node in ast.all_nodes]
+    for group in ast.groups:
+        for node in group.nodes:
+            if node.name not in write_node_list:
+                continue
+            if node.fields is not None:
+                out.append("#[derive(Clone, Debug, PartialEq)]")
+                name = node.name
+                out.append(f"pub struct {name} {{")
+                out.append("pub range: ruff_text_size::TextRange,")
+                for field in node.fields:
+                    field_str = f"pub {field.name}: "
+                    inner = f"crate::{field.ty}"
+                    if field.ty in node_names or (
+                        field.ty in group_names and (field.seq is False)
+                    ):
+                        inner = f"Box<{inner}>"
+
+                    if field.seq:
+                        field_str += f"Vec<{inner}>,"
+                    elif field.optional:
+                        field_str += f"Option<{inner}>,"
+                    else:
+                        field_str += f"{inner},"
+                    out.append(field_str)
+                out.append("}")
+                out.append("")
+
+
 # ------------------------------------------------------------------------------
 # Format and write output

@@ -558,6 +620,7 @@ def generate(ast: Ast) -> list[str]:
     write_ref_enum(out, ast)
     write_anynoderef(out, ast)
     write_nodekind(out, ast)
+    write_node(out, ast)
     return out
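
The type-wrapping rules inside `write_node` can be read in isolation as a small pure function. The sketch below mirrors the conditions in the diff: box a type that names a node, or names a group when the field is not a sequence, then wrap in `Vec` for sequences or `Option` for optional fields. The function name `rust_field_type` and the contents of `GROUP_NAMES` and `NODE_NAMES` are assumptions made for illustration; the real sets come from the parsed `ast.toml`.

```python
def rust_field_type(ty, seq, optional, group_names, node_names):
    """Return the Rust type string for one field (hypothetical helper)."""
    inner = f"crate::{ty}"
    # Box node types, and group (enum) types unless they sit inside a Vec.
    if ty in node_names or (ty in group_names and not seq):
        inner = f"Box<{inner}>"
    # As in the diff, `seq` takes precedence over `optional`.
    if seq:
        return f"Vec<{inner}>"
    if optional:
        return f"Option<{inner}>"
    return inner


# Assumed inputs: "Expr" is a group and "Parameters" is a node.
GROUP_NAMES = {"Expr"}
NODE_NAMES = {"Parameters"}

examples = {
    "op": rust_field_type("BoolOp", False, False, GROUP_NAMES, NODE_NAMES),
    "values": rust_field_type("Expr", True, False, GROUP_NAMES, NODE_NAMES),
    "target": rust_field_type("Expr", False, False, GROUP_NAMES, NODE_NAMES),
    "parameters": rust_field_type("Parameters", False, True, GROUP_NAMES, NODE_NAMES),
}

for field_name, rust_type in examples.items():
    print(f"pub {field_name}: {rust_type},")
```

Under these assumed sets, a plain `Expr` field is boxed, a sequence of `Expr` becomes a `Vec` without boxing, and an optional `Parameters` field becomes an `Option` around a `Box`.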

