Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGeneration] further unification and fixes across raw and non-raw sides #2841

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

AppAppWorks
Copy link
Contributor

@AppAppWorks AppAppWorks commented Sep 6, 2024

This PR aims to further unify and fix code generation across raw and non-raw sides.

The ambiguous SyntaxBuildableType has been replaced by protocols which have more specific roles. Raw representations of SyntaxNodeKind, Node and Child have been introduced to unify usages of the newly introduced protocols at call sites when generating raw nodes.

As a result of the unification, the typing of node choices on the raw side has also been fixed. (which was always rendered as RawSyntax)

@AppAppWorks AppAppWorks force-pushed the further-unify-and-fix-code-generation branch from e9e9f1e to 89adc24 Compare September 6, 2024 09:30
@ahoppen
Copy link
Member

ahoppen commented Sep 9, 2024

Thanks for the PR @AppAppWorks. From a brief look at it and your PR description, it sounds like the PR is doing multiple unrelated things and with 1000 LOC change, it is pretty big. Would it be possible to split the changes into isolated commits or, even better, into isolated PRs? That would greatly simplify the review.

@AppAppWorks AppAppWorks force-pushed the further-unify-and-fix-code-generation branch from 89adc24 to 2d8a3cd Compare September 10, 2024 00:22
…-raw

- introduced `TypeConvertible`, `ParameterConvertible` and `SyntaxNodeConvertible`
- introduced raw representations of `SyntaxNodeKind`, `Node` and `Child` for raw
- removed `SyntaxBuildableType`
- fixed the typing of raw node choices
@AppAppWorks AppAppWorks force-pushed the further-unify-and-fix-code-generation branch from 2d8a3cd to f90e090 Compare September 10, 2024 00:39
Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few architectual questions around the protocol hierarchy. Let me re-state it here so we’re on a common ground:

  • TypeConvertible
    • NodeChoiceConvertible
      • Child
      • Child.Raw
      • Node
      • Node.Raw
    • SyntaxNodeConvertible
      • Child
      • Child.Raw
      • Node
      • Node.Raw
      • Child.InitializableBuilder.Wrapper
    • SyntaxNodeKindProtocol
      • SyntaxNodeKind
      • SyntaxNodeKind.Raw
  1. Why do we need Child.InitializableBuilder.Wrapper at all? Can’t we use Node here? As far as I can tell we won’t generate any syntax nodes from it (or do we), but the comment on SyntaxNodeConvertible says that types conforming to it should provide the definition of a syntax node.
  2. If Child.InitializableBuilder.Wrapper no longer exists, I think there’s no meaningful distinction between NodeChoiceConvertible and SyntaxNodeConvertible (they have the same set of conforming types), so I think those protocols could be merged.
  3. I find it odd that Node and Child conform to TypeConvertible. I think it mixes up the concept of the type (which is SyntaxNodeKind / SyntaxNodeKind.Raw) and the definition of the node itself. I think I’d prefer to not have that conformance and continue doing eg. node.kind.syntaxType instead of kind.syntaxType. If we remove that conformance TypeConvertible and SyntaxNodeKindProtocol could be merged as well.

So, essentially, I’m proposing the following protocol hierarchy.

  • NodeChoiceConvertible merged with SyntaxNodeConvertible
    • Child
    • Child.Raw
    • Node
    • Node.Raw
  • TypeConvertible merged with SyntaxNodeKindProtocol
    • SyntaxNodeKind
    • SyntaxNodeKind.Raw

What do you think? Did I miss something?

@@ -336,8 +336,82 @@ public enum SyntaxNodeKind: String, CaseIterable, IdentifierConvertible, TypeCon
}
}

public var base: Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit to declare the base on SyntaxNodeKind instead of on the Node declaration? I found it easier to reason about the base kinds on the Node declarations than in this massive switch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it a bit difficult to understand why Node.base isn't another Node but a SyntaxNodeKind. For me, since SyntaxNodeKind represents the type, it becomes natural to describe the type hierarchy within SyntaxNodeKind, and SyntaxNodeKind.base pairs up quite nice with SyntaxNodeKind.isBase.

Maybe I could break down the multiple case statements to improve readability?

@@ -327,7 +327,7 @@ public enum SyntaxNodeKind: String, CaseIterable, IdentifierConvertible, TypeCon
}
}

public var isBase: Bool {
public var isBaseType: Bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you renamed this? I deliberately avoided using the term Type here because it’s already overloaded: The kind could be .type to represent a type node and then there’s the TypeSyntax we use to represent types during code generation. That’s why this enum is called SyntaxNodeKind not SyntaxNodeType.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because this is a requirement of TypeConvertible. If we get rid of TypeConvertible, this change won't be needed.

/// The non-raw representation of this kind.
var nonRaw: NonRaw { get }

/// The name of this kind in proper case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does proper case mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper case is another name of Pascal case? Would you prefer Pascal case here?

public var parameterClause: RawSyntax? {
layoutView.children[5]
public var parameterClause: ParameterClause? {
layoutView.children[5].flatMap(ParameterClause.init)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn’t map be easier than flatMap? I think both of them behave the same on optionals and map has the easier semantics (to me at least).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean you'd prefer this form?

layoutView.children[5].map { ParameterClause($0)! }

public var item: RawSyntax {
layoutView.children[1]!
public var item: Item {
layoutView.children[1].flatMap(Item.init)!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing flatMap(Item.init)! wouldn’t it be easier if this was

Suggested change
layoutView.children[1].flatMap(Item.init)!
Item(layoutView.children[1]!)

@@ -185,16 +195,20 @@ func rawSyntaxNodesFile(nodesStartingWith: [Character]) -> SourceFileSyntax {
}

for (index, child) in node.children.enumerated() {
let childNode = "layoutView.children[\(index)]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be an ExprSyntax which means that you don’t need to use the raw: interpolations style below.

Comment on lines +18 to +19
@dynamicMemberLookup
struct Choice {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Choice just wraps any NodeChoiceConvertible and provides dynamic lookup into it, what’s the point of having it at all? Shouldn’t we be able to use any NodeChoiceConvertible instead?

@AppAppWorks AppAppWorks marked this pull request as draft September 16, 2024 23:22
@AppAppWorks
Copy link
Contributor Author

AppAppWorks commented Sep 17, 2024

  1. Why do we need Child.InitializableBuilder.Wrapper at all? Can’t we use Node here? As far as I can tell we won’t generate any syntax nodes from it (or do we), but the comment on SyntaxNodeConvertible says that types conforming to it should provide the definition of a syntax node.

Yeah, Child.InitializableBuilder.Wrapper can be eliminated by turning Child.InitializableBuilder into this,

public enum InitializableBuilder {
  case otherKind(resultBuilderType: TypeSyntax, actualType: TypeSyntax, parameterLabel: TokenSyntax)
  case sameKind
}
  1. If Child.InitializableBuilder.Wrapper no longer exists, I think there’s no meaningful distinction between NodeChoiceConvertible and SyntaxNodeConvertible (they have the same set of conforming types), so I think those protocols could be merged.
  2. I find it odd that Node and Child conform to TypeConvertible. I think it mixes up the concept of the type (which is SyntaxNodeKind / SyntaxNodeKind.Raw) and the definition of the node itself. I think I’d prefer to not have that conformance and continue doing eg. node.kind.syntaxType instead of kind.syntaxType. If we remove that conformance TypeConvertible and SyntaxNodeKindProtocol could be merged as well.

The reason I made Node and Child conform to TypeConvertible is that we don't actually generate TypeSyntaxs from Child.syntaxNodeKind,

class Child {
  var syntaxNodeKind: SyntaxNodeKind {
    switch kind {
    case .node(kind: let kind):
      return kind
    case .nodeChoices:
      return .syntax
    case .collection(kind: let kind, _, _, _):
      return kind
    case .token:
      return .token
    }
  }

  var syntaxType: TypeSyntax {
    switch self.kind {
    case .node(let kind), .collection(let kind, _, _, _):
      return kind.syntaxType
    case .nodeChoices: // special treatment for node choices
      return self.syntaxChoicesType
    case .token:
      return "TokenSyntax"
    }
  }
}

It'd feel fairly inconsistent to call node.kind.xxxType while calling child.xxxType on the other hand, and it might be tempting to call child.syntaxNodeKind.xxxType which would very likely result in an incorrect TypeSyntax generated.

After some thought, I'm considering a new hierarchy,

protocol IdentifierConvertible {}
protocol NodeTypeConvertible {}
protocol NodeMetadataProvider {
  var documentation: SwiftSyntax.Trivia
  var experimentalFeature: ExperimentalFeature?
  var apiAttributes: AttributeListSyntax
}
protocol NodeChoiceConvertible: IdentifierConvertible, NodeMetadataProvider, NodeTypeConvertible {}
protocol SyntaxNodeKindProtocol: IdentifierConvertible, NodeTypeConvertible {}
protocol ParameterConvertible {}

enum SyntaxNodeKind: SyntaxNodeKindProtocol {
  struct Raw: SyntaxNodeKindProtocol {}
}

class Node: NodeMetadataProvider {
  public var kind: SyntaxNodeKind
}

class Child: NodeChoiceConvertible, ParameterConvertible { 
  struct Raw: NodeChoiceConvertible, ParameterConvertible {
    private var syntaxNodeKind: SyntaxNodeKind.Raw
  }

  private var syntaxNodeKind: SyntaxNodeKind
}

struct ChildNodeChoices {
  struct NodeWrapper: NodeChoiceConvertible {
    var node: Node
  }

  var choices: [any NodeChoiceConvertible]
}

Node won't need a raw counterpart as Node will no longer conform with NodeTypeConvertible. Child and Child.Raw will keep conforming with NodeTypeConvertible through NodeChoiceConvertible but hide syntaxNodeKind. ChildNodeChoices.NodeWrapper will be introduced to decorate Node into any NodeChoiceConvertible.

@ahoppen
Copy link
Member

ahoppen commented Sep 17, 2024

Yeah, Child.InitializableBuilder.Wrapper can be eliminated by turning Child.InitializableBuilder into this,

That would be a good change.

The reason I made Node and Child conform to TypeConvertible is that we don't actually generate TypeSyntaxs from Child.syntaxNodeKind,

It'd feel fairly inconsistent to call node.kind.xxxType while calling child.xxxType on the other hand, and it might be tempting to call child.syntaxNodeKind.xxxType which would very likely result in an incorrect TypeSyntax generated.

I missed that. That makes total sense and I agree with you. What do you think of the following changes to the hierarchy? I think it would represent everything and would only be a single level deep which generally helps to reason about things.

  • Merge NodeChoiceConvertible and SyntaxNodeConvertible because Child.InitializableBuilder.Wrapper no longer exists.
  • Merge TypeConvertible to SyntaxNodeConvertible and implement the TypeConvertible requirements in Node, Node.Raw, Child and Child.Raw instead of using a protocol extension. That way it’s clear that all these types need to provide an implementation for syntaxType etc. instead of having a default implementation based on the protocol extension on SyntaxNodeConvertible that doesn’t apply half of the time (Child and Child.Raw need to override it). I think this semi-used default implementation actually threw me off.
  • In the implementation of SyntaxNodeKind.syntaxType, call kind.node.syntaxType.

So, the hierarchy would be:

  • TypeConvertible + NodeChoiceConvertible + SyntaxNodeConvertible
    • Child
    • Child.Raw
    • Node
    • Node.Raw
  • SyntaxNodeKindProtocol
    • SyntaxNodeKind
    • SyntaxNodeKind.Raw

What do you think? Do you think this would simplify things and work?

Two other unrelated thoughts I had while writing this:

  • Instead of having all the nested Raw types, I would just make them top level types, ie. RawChild instead of Child.Raw. This matches the naming scheme in SwiftSyntax where we have Syntax / RawSyntax, Identifier / RawIdentifier and probably another one that I can’t think of right now.
  • It might make sense to have SyntaxNodeKindProtocol.node return a non-optional. Looks like we’re force-unwrapping it in most cases anyway and we could say that you just must not call node on SyntaxKind.token, SyntaxKind.syntax and SyntaxKind.syntaxCollection. At least I think those are the only three kinds that don’t have an associated Node.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants