Implement a change in IL API to use RuntimeHelpers.Await<T>(Task<T>) and similar helpers. #2951

Open
wants to merge 15 commits into
base: feature/async2-experiment

VSadov
Member

@VSadov VSadov commented Jan 22, 2025

This is the actual implementation of what was proposed in dotnet/runtime#110420 and prototyped in #2941

Basically, this changes the await marker to be just a call to a special Await helper.

When the user writes the following inside a runtime async method

   int x = await ReturnsTaskOfInt();

C# compiler emits an equivalent of

   int x = Await( ReturnsTaskOfInt() );

The T Await<T>(Task<T> arg) method is a special intrinsic that asynchronously awaits the Task<T> (a Task<int> in this example).
NOTE: There is no sync-over-async here. Await can suspend and later resume the current stack of calls if the task has not completed yet; once the Task<int> is complete, it unwraps the result and returns the int.

Also, the JIT recognizes this pattern and can optimize it further into a call-with-continuation invocation of the runtime-async entry point for ReturnsTaskOfInt().
As a result, if ReturnsTaskOfInt is another runtime-async method, we skip the intermediate promise types (Task/ValueTask) entirely, which is the main reason for the performance edge of runtime async over classic async.

@VSadov VSadov requested a review from jakobbotsch January 22, 2025 16:12
AssertEqual("B", strings.B);
AssertEqual("C", strings.C);
AssertEqual("D", strings.D);
// TODO: need to fix this
Member Author

@VSadov VSadov Jan 22, 2025

@jakobbotsch the change stresses calling via thunks and has possibly introduced some scenarios that the tests did not cover before. Remarkably, nearly everything works fine!! However, I saw an assert here and turned off one scenario.
I am not sure whether something is wrong with the IL or on the JIT side.
(The other disabled case is thunks for async methods in structs.)

Member

I've hit that before when encoding method/type spec tokens incorrectly. Can you verify that the tokens being encoded when we construct the IL for the variants look fine?

@jakobbotsch
Member

The JIT optimization to optimize Await(RuntimeAsyncMethod), which is probably the harder part of the proposal, is not included here.

It would be nice to start on this work to see how it would look before we make the switch. Note that most of the work will be VM work -- teaching getCallInfo implementations to deal with the fact that it now may need to describe a call to the async variant of a call described by a token.

@VSadov
Member Author

VSadov commented Jan 23, 2025

> The JIT optimization to optimize Await(RuntimeAsyncMethod), which is probably the harder part of the proposal, is not included here.
>
> It would be nice to start on this work to see how it would look before we make the switch. Note that most of the work will be VM work -- teaching getCallInfo implementations to deal with the fact that it now may need to describe a call to the async variant of a call described by a token.

The optimization would need to detect the following pattern

arg0; .. ; argN; CallToThunkToAsync; CallToAwaitIntrinsic

and turn it into

arg0; .. ; argN; CallToAsync

For that there should be a way to:

  1. detect that a call info is for a thunk to an async method
  2. get a call info for the actual async method (with other inputs being the same)

Is this correct?
Would the following API be sufficient?


for #1, a flag in CORINFO_CALL_INFO::methodFlags indicating that the call info happens to be for a thunk.

CORINFO_FLG_THUNK_TO_ASYNC // the method is a non-async thunk to an async method

for #2 a flag that can be passed in CORINFO_CALLINFO_FLAGS to CEEInfo::getCallInfo, to ask for an actual async method call info.

CORINFO_CALLINFO_UNWRAP_THUNK // assume that the input pResolvedToken is for a thunk (assert if it is not), get the info for the actual async method.

@jakobbotsch
Member

jakobbotsch commented Jan 23, 2025

> The optimization would need to detect the following pattern
>
>     arg0; .. ; argN; CallToThunkToAsync; CallToAwaitIntrinsic
>
> and turn it into
>
>     arg0; .. ; argN; CallToAsync

There are a few ways to do this, but maybe the most straightforward will be to do it as a direct IL pattern match at the point where we call getCallInfo:

    eeGetCallInfo(&resolvedToken,
                  (prefixFlags & PREFIX_CONSTRAINED) ? &constrainedResolvedToken : nullptr,
                  // this is how impImportCall invokes getCallInfo
                  combine(combine(CORINFO_CALLINFO_ALLOWINSTPARAM, CORINFO_CALLINFO_SECURITYCHECKS),
                          (opcode == CEE_CALLVIRT) ? CORINFO_CALLINFO_CALLVIRT : CORINFO_CALLINFO_NONE),
                  &callInfo);

This would be changed to first look ahead for another call IL instruction and check whether it was a call to RuntimeHelpers.Await. One way to do that is by resolving the next call instruction's token and using isIntrinsic + getMethodNameFromMetadata to check.
You should not need to try to recognize anything about the arguments, I think.

There are some other details to work out, like properly setting up for opportunistic tailcalls when the Await call is in tail position, but that can come later.
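
To make the look-ahead idea concrete, here is a minimal sketch over a toy instruction stream. Everything in it is an assumption for illustration: `Instr`, `Op`, `isAwaitIntrinsic` and `foldAwaitCalls` are hypothetical names, not JIT or JIT-EE interface types, and the string comparison only stands in for resolving the next call's token and checking it with isIntrinsic + getMethodNameFromMetadata.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for IL instructions; none of these are real JIT types.
enum class Op { LoadArg, Call, Other };

struct Instr
{
    Op          op;
    std::string target;      // callee name for Op::Call, unused otherwise
    bool        returnsTask; // hypothetical: callee returns Task/ValueTask
};

// Hypothetical check standing in for token resolution plus an intrinsic-name test.
static bool isAwaitIntrinsic(const Instr& instr)
{
    return instr.op == Op::Call && instr.target == "RuntimeHelpers.Await";
}

// Rewrites "call X; call Await" into a single "call X [async variant]".
// Argument-producing instructions before the call are copied through unchanged.
std::vector<Instr> foldAwaitCalls(const std::vector<Instr>& il)
{
    std::vector<Instr> out;
    for (std::size_t i = 0; i < il.size(); ++i)
    {
        const Instr& cur     = il[i];
        const bool   hasNext = (i + 1) < il.size();

        // Look ahead: is this a task-returning call immediately followed by Await?
        if (cur.op == Op::Call && cur.returnsTask && hasNext && isAwaitIntrinsic(il[i + 1]))
        {
            out.push_back({Op::Call, cur.target + " [async variant]", false});
            ++i; // the Await call is consumed together with the call it wraps
            continue;
        }
        out.push_back(cur);
    }
    assert(out.size() <= il.size()); // folding never adds instructions
    return out;
}
```

In this sketch an Await call that is not directly preceded by a task-returning call is left in place, which corresponds to falling back to the plain Await helper.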

> For that there should be a way to:
>
>   1. detect that a call info is for a thunk to an async method
>   2. get a call info for the actual async method (with other inputs being the same)
>
> Is this correct? Would the following API be sufficient?
>
> for #1, a flag in CORINFO_CALL_INFO::methodFlags indicating that the call info happens to be for a thunk.
>
>     CORINFO_FLG_THUNK_TO_ASYNC // the method is a non-async thunk to an async method
>
> for #2 a flag that can be passed in CORINFO_CALLINFO_FLAGS to CEEInfo::getCallInfo, to ask for an actual async method call info.
>
>     CORINFO_CALLINFO_UNWRAP_THUNK // assume that the input pResolvedToken is for a thunk (assert if it is not), get the info for the actual async method.

I would skip #1 for now. We can switch any task returning call to its async2 thunk. It is probably more efficient to avoid doing so if we know that we are switching to a thunk, but it is not possible for us to know that statically if the target is dynamically resolved.

#2 is the same as what I was thinking. Without #1 you cannot do the assert, but it would not be possible to assert this anyway, except for statically resolvable cases. Given that, I would probably call the flag something like CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT, since we use the "async variant" term in other places.

@VSadov
Member Author

VSadov commented Jan 23, 2025

> You should not need to try to recognize anything about the arguments, I think.

Yes. I included the arguments in the example to show that they do not need to change.

I was thinking of looking back at the previous instruction once we see an Await intrinsic and, if the previous instruction was a call that we can optimize, replacing it with a call to the async method.
It may be that looking ahead fits better with how the importer does things.

The rest makes sense. Thanks!

@VSadov
Member Author

VSadov commented Jan 24, 2025

Implemented the JIT optimization as discussed above.

@VSadov
Member Author

VSadov commented Jan 24, 2025

The impact of the optimization is quite noticeable (as expected):

E:\>set DOTNET_JitOptimizeAwait=0

E:\>E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\async\fibonacci-without-yields\fibonacci-without-yields.cmd
BEGIN EXECUTION
 "E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false" -p "System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true"  fibonacci-without-yields.dll
1172 ms result=3026313472
Expected: 100
Actual: 100
END EXECUTION - PASSED
PASSED

E:\>set DOTNET_JitOptimizeAwait=1

E:\>E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\async\fibonacci-without-yields\fibonacci-without-yields.cmd
BEGIN EXECUTION
 "E:\A2\runtimelab\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false" -p "System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true"  fibonacci-without-yields.dll
178 ms result=3026313472
Expected: 100
Actual: 100
END EXECUTION - PASSED
PASSED

178 ms is definitely an improvement over 1172 ms (roughly 6.6x faster).

Comment on lines 4922 to 4928
    if (flags & CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT)
    {
        _ASSERTE(!pMD->IsAsync2Method());
        pMD = pMD->GetAsyncOtherVariant();
        pResolvedToken->hMethod = (CORINFO_METHOD_HANDLE)pMD;
    }

Member

@jakobbotsch jakobbotsch Jan 24, 2025

Nice, that's much simpler than I was expecting.

pResolvedToken is in-only, so we should make a copy of it here and change that one instead. If necessary you can update it from the callInfo on the JIT side, but I'm somewhat worried we end up with a token whose fields are internally inconsistent.

Can you make sure we have tests for some of the hard cases? GVMs, interface calls, virtual class calls and constrained calls come to mind. I was expecting shared generics to require more work as well since other fields of the token are used below for those (see ComputeRuntimeLookupForSharedGenericToken). Can you double check why it works out? Is the method spec/type spec ok to reuse as-is from the token?
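
As an illustration of the "copy the in-only token" suggestion, here is a minimal sketch. The types and the function are hypothetical simplifications, not the real CORINFO_RESOLVED_TOKEN, MethodDesc or getCallInfo; the flag name is borrowed from this thread and its value is made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-ins; not the real VM structures.
struct MethodDesc
{
    std::string       name;
    bool              isAsyncVariant;
    const MethodDesc* otherVariant; // pairs the Task-returning and runtime-async variants
};

struct ResolvedToken
{
    unsigned          token;
    const MethodDesc* method;
};

struct CallInfo
{
    const MethodDesc* method;
};

enum CallInfoFlags : unsigned
{
    CALLINFO_NONE                 = 0,
    CALLINFO_RUNTIMEASYNC_VARIANT = 1u << 0, // value is an assumption
};

// The token is taken by const reference, so the caller's token cannot be mutated.
// Any retargeting happens on a private copy and is reported only via the result.
CallInfo getCallInfoSketch(const ResolvedToken& resolved, unsigned flags)
{
    ResolvedToken local = resolved; // work on a copy of the in-only token
    if (flags & CALLINFO_RUNTIMEASYNC_VARIANT)
    {
        assert(!local.method->isAsyncVariant);
        assert(local.method->otherVariant != nullptr);
        local.method = local.method->otherVariant;
    }
    return CallInfo{local.method};
}
```

With this shape the caller decides whether to update its own token from the returned call info, so the token's fields are never left half-updated by the callee.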

Member Author

Another way to ensure resolved-token consistency could be to pass the new flag to impResolveToken rather than to eeGetCallInfo.

Member Author

I've moved the MethodDesc shimming to the level of impResolveToken. That seems nicer as it allows eeGetCallInfo to stay unchanged.

@@ -586,6 +586,8 @@ OPT_CONFIG_INTEGER(JitDoIfConversion, "JitDoIfConversion", 1)
OPT_CONFIG_INTEGER(JitDoOptimizeMaskConversions, "JitDoOptimizeMaskConversions", 1) // Perform optimization of mask
// conversions

RELEASE_CONFIG_INTEGER(JitOptimizeAwait, "JitOptimizeAwait", 1) // Perform optimization of Await intrinsics
Member

I wouldn't add a release knob for this.

}

awaiter.GetResult();
return;
Member

Suggested change
return;

}

awaiter.GetResult();
return;
Member

Suggested change
return;

2 participants