Implement a change in IL API to use RuntimeHelpers.Await<T>(Task<T>) and similar helpers. #2951

VSadov · 2025-01-22T16:12:25Z

This is the actual implementation of what was proposed in dotnet/runtime#110420 and prototyped in #2941.

Basically, this changes the await marker to be just a call to a special Await helper.

When the user writes the following inside a runtime-async method:

   int x = await ReturnsTaskOfInt();

the C# compiler emits the equivalent of:

   int x = Await( ReturnsTaskOfInt() );

The T Await<T>(Task<T> arg) method is a special intrinsic that asynchronously awaits its argument, here a Task<int>.
NOTE: There is no sync-over-async here. Await can suspend and later resume the current stack of calls if needed; once the Task<int> is complete, Await unwraps it and returns the int.

Also, the JIT is familiar with the pattern and can further optimize it into a call-with-continuation invocation of the runtime-async entry point for ReturnsTaskOfInt().
As a result, if ReturnsTaskOfInt is another runtime-async method, we skip the intermediate promise types (Task/ValueTask) entirely, which is the main reason for the performance edge of runtime async over classic async.

VSadov · 2025-01-22T16:24:37Z

src/tests/async/returns.cs

-            AssertEqual("B", strings.B);
-            AssertEqual("C", strings.C);
-            AssertEqual("D", strings.D);
+            // TODO: need to fix this


@jakobbotsch the change stresses calling via thunks and possibly introduced some scenarios that tests did not cover before. Remarkably, nearly everything works fine!! However, here I saw an assert and turned off one scenario.
Not sure if this is something wrong with IL or something on the JIT side.
(the other disabled case is with thunks for async methods in structs).

I've hit that before when encoding method/type spec tokens incorrectly. Can you verify that the tokens being encoded when we construct the IL for the variants look fine?

jakobbotsch · 2025-01-22T16:40:12Z

The JIT optimization to optimize Await(RuntimeAsyncMethod), which is probably the harder part of the proposal, is not included here.

It would be nice to start on this work to see how it would look before we make the switch. Note that most of the work will be VM work -- teaching getCallInfo implementations to deal with the fact that it now may need to describe a call to the async variant of a call described by a token.

VSadov · 2025-01-23T03:29:16Z

> The JIT optimization to optimize Await(RuntimeAsyncMethod), which is probably the harder part of the proposal, is not included here.
>
> It would be nice to start on this work to see how it would look before we make the switch. Note that most of the work will be VM work -- teaching getCallInfo implementations to deal with the fact that it now may need to describe a call to the async variant of a call described by a token.

The optimization would need to detect the following pattern

arg0; .. ; argN; CallToThunkToAsync; CallToAwaitIntrinsic

and turn it into

arg0; .. ; argN; CallToAsync

For that there should be a way to:

1. detect that a call info is for a thunk to an async method
2. get a call info for the actual async method (with other inputs being the same)

Is this correct?
Would the following API be sufficient?

for #1, a flag in CORINFO_CALL_INFO::methodFlags indicating that the call info happens to be for a thunk.

CORINFO_FLG_THUNK_TO_ASYNC // the method is a non-async thunk to an async method

for #2 a flag that can be passed in CORINFO_CALLINFO_FLAGS to CEEInfo::getCallInfo, to ask for an actual async method call info.

CORINFO_CALLINFO_UNWRAP_THUNK // assume that the input pResolvedToken is for a thunk (assert if it is not), get the info for the actual async method.

jakobbotsch · 2025-01-23T10:55:03Z

> The optimization would need to detect the following pattern
>
> arg0; .. ; argN; CallToThunkToAsync; CallToAwaitIntrinsic
>
> and turn it into
>
> arg0; .. ; argN; CallToAsync

There are a few ways to do this, but maybe the most straightforward will be to do it as a direct IL pattern match at the point where we call getCallInfo:

runtimelab/src/coreclr/jit/importer.cpp

Lines 8952 to 8957 in b077e29

    
    eeGetCallInfo(&resolvedToken,
                  (prefixFlags & PREFIX_CONSTRAINED) ? &constrainedResolvedToken : nullptr,
                  // this is how impImportCall invokes getCallInfo
                  combine(combine(CORINFO_CALLINFO_ALLOWINSTPARAM, CORINFO_CALLINFO_SECURITYCHECKS),
                          (opcode == CEE_CALLVIRT) ? CORINFO_CALLINFO_CALLVIRT : CORINFO_CALLINFO_NONE),
                  &callInfo);

This would be changed to first look ahead for another call IL instruction and check whether it is a call to RuntimeHelpers.Await. One way to do that is by resolving the next call instruction's token and using isIntrinsic + getMethodNameFromMetadata to check.
You should not need to try to recognize anything about the arguments, I think.

There are some other details to work out, like properly setting up for opportunistic tailcalls when the Await call is in tail position, but that can come later.
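
A minimal C++ sketch of the look-ahead idea described above, using a toy instruction list rather than the importer's real IL reader. The `Op`, `Instr`, and `foldAwaitCalls` names are hypothetical and exist only for illustration; the real change would presumably resolve the next call's token and check it via isIntrinsic + getMethodNameFromMetadata instead of comparing an enum.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for IL instructions; not the JIT's real data structures.
enum class Op { LoadArg, CallTaskReturning, CallAwaitIntrinsic, CallAsyncVariant };

struct Instr {
    Op op;
    std::string target; // method name for calls, empty otherwise
};

// Rewrites "CallTaskReturning X; CallAwaitIntrinsic" into "CallAsyncVariant X"
// by peeking at the instruction that follows each call. Arguments are left alone.
std::vector<Instr> foldAwaitCalls(const std::vector<Instr>& il) {
    std::vector<Instr> out;
    for (std::size_t i = 0; i < il.size(); ++i) {
        const bool nextIsAwait =
            i + 1 < il.size() && il[i + 1].op == Op::CallAwaitIntrinsic;
        if (il[i].op == Op::CallTaskReturning && nextIsAwait) {
            out.push_back({Op::CallAsyncVariant, il[i].target});
            ++i; // consume the Await call as part of the folded pattern
        } else {
            out.push_back(il[i]);
        }
    }
    return out;
}
```

Note that the sketch never inspects the argument-loading instructions, which matches the point that the arguments do not need to be recognized or changed. An Await call that does not directly follow a task-returning call is left untouched.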

> For that there should be a way to:
>
> 1. detect that a call info is for a thunk to an async method
> 2. get a call info for the actual async method (with other inputs being the same)
>
> Is this correct? Would the following API be sufficient?
>
> for #1, a flag in CORINFO_CALL_INFO::methodFlags indicating that the call info happens to be for a thunk.
>
> CORINFO_FLG_THUNK_TO_ASYNC // the method is a non-async thunk to an async method
>
> for #2 a flag that can be passed in CORINFO_CALLINFO_FLAGS to CEEInfo::getCallInfo, to ask for an actual async method call info.
>
> CORINFO_CALLINFO_UNWRAP_THUNK // assume that the input pResolvedToken is for a thunk (assert if it is not), get the info for the actual async method.

I would skip #1 for now. We can switch any task returning call to its async2 thunk. It is probably more efficient to avoid doing so if we know that we are switching to a thunk, but it is not possible for us to know that statically if the target is dynamically resolved.

#2 is the same as what I was thinking. Without #1 you cannot do the assert, but it would not be possible to assert this anyway, except for statically resolvable cases. Given that, I would probably call the flag something like CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT, since we use the "async variant" term in other places.

VSadov · 2025-01-23T20:43:55Z

> You should not need to try to recognize anything about the arguments, I think.

Yes. I included the arguments in the example to show that they do not need to change.

I was thinking of looking back at the previous instruction once we see an Await intrinsic and, if the previous instruction was a call that we can optimize, replacing it with a call to the async method.
It may be that looking ahead will fit better into how the importer does things.

The rest makes sense. Thanks!

VSadov · 2025-01-24T01:10:26Z

Implemented the JIT optimization as discussed above.


VSadov · 2025-01-24T03:04:37Z

The impact of the optimization is quite noticeable (as expected):

E:\>set DOTNET_JitOptimizeAwait=0

E:\>E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\async\fibonacci-without-yields\fibonacci-without-yields.cmd
BEGIN EXECUTION
 "E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false" -p "System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true"  fibonacci-without-yields.dll
1172 ms result=3026313472
Expected: 100
Actual: 100
END EXECUTION - PASSED
PASSED

E:\>set DOTNET_JitOptimizeAwait=1

E:\>E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\async\fibonacci-without-yields\fibonacci-without-yields.cmd
BEGIN EXECUTION
 "E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false" -p "System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true"  fibonacci-without-yields.dll
178 ms result=3026313472
Expected: 100
Actual: 100
END EXECUTION - PASSED
PASSED

178 ms is definitely an improvement over 1172 ms (roughly 6.6x faster).

jakobbotsch · 2025-01-24T09:53:19Z

src/coreclr/vm/jitinterface.cpp

+    if (flags & CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT)
+    {
+        _ASSERTE(!pMD->IsAsync2Method());
+        pMD = pMD->GetAsyncOtherVariant();
+        pResolvedToken->hMethod = (CORINFO_METHOD_HANDLE)pMD;
+    }
+


Nice, that's much simpler than I was expecting.

pResolvedToken is in-only, so we should make a copy of it here and change that one instead. If necessary you can update it from the callInfo on the JIT side, but I'm somewhat worried we end up with a token whose fields are internally inconsistent.

Can you make sure we have tests for some of the hard cases? GVMs, interface calls, virtual class calls and constrained calls come to mind. I was expecting shared generics to require more work as well since other fields of the token are used below for those (see ComputeRuntimeLookupForSharedGenericToken). Can you double check why it works out? Is the method spec/type spec ok to reuse as-is from the token?

Another way to ensure that the resolved token stays consistent could be to pass the new flag not to eeGetCallInfo but to impResolveToken.

I've moved the MethodDesc shimming to the level of impResolveToken. That seems nicer as it allows eeGetCallInfo to stay unchanged.
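
To illustrate the concern about pResolvedToken being in-only, here is a small C++ sketch of the "copy, then redirect" approach. `MethodInfo`, `ResolvedToken`, and `withAsyncVariant` are hypothetical, heavily simplified stand-ins (not MethodDesc or CORINFO_RESOLVED_TOKEN), and this is not how the PR ended up doing it, since the shimming was moved to impResolveToken.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a method and its paired variant.
struct MethodInfo {
    const char* name;
    bool isAsyncVariant;
    const MethodInfo* otherVariant; // the paired thunk or async variant
};

// Hypothetical, simplified model of a resolved token.
struct ResolvedToken {
    std::uint32_t token;      // metadata token as written in IL
    const MethodInfo* method; // what the token resolved to
};

// Returns a copy that points at the async variant.
// The caller's token is taken by const reference and is never written to.
ResolvedToken withAsyncVariant(const ResolvedToken& original) {
    assert(original.method != nullptr && !original.method->isAsyncVariant);
    ResolvedToken copy = original;
    copy.method = original.method->otherVariant;
    return copy;
}
```

A real token carries more fields than this model (the thread mentions the method spec and type spec used for shared generics), so a copy only helps if those remaining fields are still valid for the redirected method.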

jakobbotsch · 2025-01-28T16:13:12Z

src/coreclr/jit/jitconfigvalues.h

@@ -586,6 +586,8 @@ OPT_CONFIG_INTEGER(JitDoIfConversion, "JitDoIfConversion", 1)
 OPT_CONFIG_INTEGER(JitDoOptimizeMaskConversions, "JitDoOptimizeMaskConversions", 1) // Perform optimization of mask
                                                                                    // conversions

+RELEASE_CONFIG_INTEGER(JitOptimizeAwait, "JitOptimizeAwait", 1) // Perform optimization of Await intrinsics


I wouldn't add a release knob for this.

jakobbotsch · 2025-01-28T16:14:48Z

src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.cs

+            }
+
+            awaiter.GetResult();
+            return;


Suggested change

return;

jakobbotsch · 2025-01-28T16:14:53Z

src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.cs

+            }
+
+            awaiter.GetResult();
+            return;


Suggested change

return;

VSadov added 9 commits January 15, 2025 18:27

T RuntimeHelpers.Await<T>(Task<T>)

42ec5c2

state machine version of Await and friends

76ca9da

bump roslyn ref

ed94235

more Await helpers

6e1cd59

remove no longer needed test

5c5566b

undo no longer needed metasig

02de185

comment

a1699c6

formatting

a67f3ea

comment

21dc02b

VSadov requested a review from jakobbotsch January 22, 2025 16:12


VSadov mentioned this pull request Jan 22, 2025

Prototyping T RuntimeHelpers.Await<T>(Task<T>) #2941

Closed

implements JIT optimization for Await intrinsics

4723c6a

VSadov added 2 commits January 23, 2025 18:08

make the JitOptimizeAwait switch RELEASE_CONFIG_INTEGER (for testing/benchmarking purposes)

066ad44

isIntrinsic

0a07d7b

jakobbotsch reviewed Jan 24, 2025


VSadov added 3 commits January 24, 2025 09:16

CORINFO_TOKENKIND_Await

03ddebc

revert CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT

ae17a3b

undo unnecessary diff

520da20

VSadov mentioned this pull request Jan 24, 2025

Propose new async API dotnet/runtime#110420

Merged

jakobbotsch reviewed Jan 28, 2025
