Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SourceTexts for Everyone! #11404

Merged
merged 19 commits into from
Jan 21, 2025
Merged

Conversation

DustinCampbell
Copy link
Member

@DustinCampbell DustinCampbell commented Jan 18, 2025

Fixes #8076

This is a change I've been planning for quite a while, and I'm hoping it will have a positive impact on performance. Essentially, this does away with that CodeWriter.GeneratedCode, RazorCSharpDocument.GeneratedCode, and RazorHtmlDocument.GeneratedCode returns a giant string. Instead, each provides a way to get at a canonical SourceText. The SourceText returned by CodeWriter.GetText() is produced efficiently by passing a custom TextReader into SourceText.From(...), and the new RazorCSharpDocument.Text and RazorHtmlDocument.Text properties just return the SourceText that CodeWriter produced.

In generally, I'm hoping this will reduce LOH allocations for large Razor files. However, while moving code that had previously accessed GeneratedCode to now use the SourceText, I found a few places to make additional performance wins:

  1. Any code that was comparing generated code can now do so efficiently using SourceText.ContentEquals(...)
  2. A number of places in tooling code were grabbing the GeneratedCode property and passing it to CSharpSyntaxTree.ParseText(...). These have all been updated and many of them were updated to just call tooling's RazorCodeDocument.GetOrParseCSharpSyntaxTree(...), which caches the C# SytntaxTree.

I focused on making the commit history high quality, so I recommend reviewing commit-by-commit if that makes sense for you.

CI Build: https://dev.azure.com/dnceng/internal/_build/results?buildId=2623654&view=results
Test insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/604233

This change introduces a new RazorCSharpDocument.Text that returns a SourceText. The existing GeneratedCode property has been temporarily modified to return Text.ToString() to allow existing callers to work. This will be removed in later commits.
This change simply merges RazorHtmlDocument and DefaultRazorHtmlDocument into a single sealed RazorHtmlDocument class.
This change introduces a RazorHtmlDocument.Text property that is similar to RazorCSharpDocument.Text. Like the RazorCSharpDocument, the GeneratedCode property delegates to RazorHtmlDocument.Text and will be removed once callers are updated in further commits.
This change adds a Text property to IRazorGeneratedDocument that returns SourceText. The GeneratedCode property will be removed in later commits.
Updates RazorSourceGenerator to use the RazorCSharpDocument.Text property rather than GeneratedCode.
This avoids some inefficiencies and removes some hidden allocations:

- Passing a SourceText to CSharpSyntaxTree.ParseText(...) means that it won't create a new SourceText under-the hood
  to wrap the generated code.
- SourceText.ContentEquals(...) can be used to compare the generated code instead of string.Equals(...).
- IncrementalGeneratorInitializationContext.AddSource(...) no longer needs creates new SourceText under-the-hood
  to wrap the generated code.

To ensure the SourceText has value equality when it flows through the incremental source generator, it's wrapped in a
record struct that compares with SourceText.ContentEquals(...)
The RazorSourceGeneratorTests rely on an EventListener which can make them a challenge to debug.
During debug-time, it's for the EventListener to capture spurious events, resulting in
unexpected failures. This change adds a sanity check to ensure that the EventListener only
captures events from the expected source. In addition, the events are no longer captures in a
ConcurrentQueue, since the "queue-like" behaviors were never utilitized.
This change updates the GetCSharpSourceText() and GetHtmlSourceText() helpers to return GetCSharpDocument().Text and GetHtmlDocument().Text directly. They no longer cache objects on RazorCodeDocument.Items.
Instead of creating a new SourceText, GeneratedDocumentTextLoader.LoadTextAndVersionAsync(...) can just use RazorCSharpDocument.Text.
IRazorCodeDocument.GeneratedCode in the legacy editor shim has been changed to use RazorCSharpDocument.Text.ToString(). It caches the result to avoid constructing the string again on subsequent calls.
This change adjusts lots of code in tooling to avoid accessing RazorCSharpDocument.GeneratedCode just to call CSharpSyntaxTree.ParseText(...). RazorCodeDocument has an extension method helper for this purpose that caches the result.
This change removes the GeneratedCode property from IRazorGeneratedDocument, RazorCShaprDocument, and RazorHtmlDocument.
This keeps set up and clean up code close by.
Update compiler tests to call CodeWriter.GenerateCode().ToString() instead.
These were incorrectly defined to return ReadOnlySpan<T> and ReadOnlyMemory<T> but _should_ return Span<T> and Memory<T> to match the equivalent BCL methods.
This is a much more significant change that adds a custom TextReader to read the contents of CodeWriter's data structures. When passed to SourceText.From, it is used to either produce a StringText (by calling TextReader.ReadToEnd()) or, if the length is too large, a LargeText that wraps multiple arrays of chars rather than a giant string. The TextReader provides overrides of the methods that will ultimately be called by SourceText.From.
@DustinCampbell DustinCampbell requested review from a team as code owners January 18, 2025 19:13
Copy link
Contributor

@davidwengier davidwengier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love to see it.

Does this fix #8076?

@DustinCampbell
Copy link
Member Author

Does this fix #8076?

Yes -- thanks!

@DustinCampbell
Copy link
Member Author

Note: This will need reviews from @dotnet/razor-compiler. cc @chsienki, @333fred, @jjonescz, @jaredpar

Copy link
Contributor

@chsienki chsienki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler changes LGTM

@DustinCampbell DustinCampbell merged commit 0e13d49 into dotnet:main Jan 21, 2025
12 checks passed
@DustinCampbell DustinCampbell deleted the sourcetext-ftw branch January 21, 2025 21:22
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Jan 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use SourceText instead of string for generated code
4 participants