Known bug: Utilizing custom middleware on orchestrators may generate non-determinism exceptions #158

Open
davidmrdavid opened this issue Jun 26, 2023 · 10 comments
Labels
bug Something isn't working P2

Comments

@davidmrdavid
Member

Bug Description:

Applying custom middleware to an orchestrator's invocation pipeline may result in non-determinism exceptions. In essence, this is because custom middleware is being interpreted by the DurableTask framework as being part of the orchestrator code, which needs to abide by specific coding constraints to prevent non-determinism errors. When middleware logic does not abide by these constraints, the DurableTask framework will flag the orchestrator as non-deterministic and fail the invocation.

This issue was originally reported here: #153

Diagnosis

If the following conditions are met, then you may be affected by this bug:
(1) Your orchestrators are failing with exceptions prefixed with Non-Deterministic workflow detected:
(2) Your application injects custom middleware during function invocations. Example scenario: you're using the Azure App Configuration middleware.
(3) Your orchestrator definitions are deterministic
(4) Removing the middleware prevents the non-determinism exceptions

Workaround:

While we work to fix this bug, there are two main workarounds you can consider:

(1) Skip your custom middleware logic when it is used to invoke an orchestrator. You may detect that an orchestrator is being invoked by re-using this helper method.

(2) Do not use custom middleware in your function invocations. We realize this is not an ideal solution.
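
As a rough illustration of workaround (1), the sketch below shows a custom middleware that passes orchestrator invocations straight through. It is not the helper method referenced above. It assumes that orchestrator functions can be recognized by an input binding of type `orchestrationTrigger`, and the class name `MyCustomMiddleware` and the console logging are hypothetical placeholders for your own logic.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Middleware;

// Hypothetical custom middleware; the name and the logging are illustrative only.
public sealed class MyCustomMiddleware : IFunctionsWorkerMiddleware
{
    public Task Invoke(FunctionContext context, FunctionExecutionDelegate next)
    {
        // For orchestrators, hand off to the rest of the pipeline immediately and
        // return its task unchanged, so no custom code runs around the orchestrator.
        if (IsOrchestrationTrigger(context))
        {
            return next(context);
        }

        return InvokeWithCustomLogicAsync(context, next);
    }

    // Assumption: orchestrators are identified by an "orchestrationTrigger" input binding.
    private static bool IsOrchestrationTrigger(FunctionContext context) =>
        context.FunctionDefinition.InputBindings.Values.Any(binding =>
            string.Equals(binding.Type, "orchestrationTrigger", StringComparison.Ordinal));

    // Custom logic runs only for non-orchestrator functions (activities, HTTP triggers, ...).
    private static async Task InvokeWithCustomLogicAsync(
        FunctionContext context, FunctionExecutionDelegate next)
    {
        Console.WriteLine($"Before {context.FunctionDefinition.Name}");
        await next(context);
        Console.WriteLine($"After {context.FunctionDefinition.Name}");
    }
}
```

A middleware like this would be registered as usual, for example with `app.UseMiddleware<MyCustomMiddleware>()` inside `ConfigureFunctionsWorkerDefaults`. The orchestrator branch is deliberately not `async`, so the middleware adds no awaits of its own on the orchestrator's path.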

Long term fix

The specific long term fix is still being discussed. For now, we're tracking that work here: Azure/azure-functions-dotnet-worker#1666

@davidmrdavid davidmrdavid added known bug bug Something isn't working and removed Needs: Triage 🔍 known bug labels Jun 26, 2023
@davidmrdavid davidmrdavid changed the title Known bug: Utilizing middleware on orchestrators may generate non-determinism exceptions Known bug: Utilizing custom middleware on orchestrators may generate non-determinism exceptions Jun 26, 2023
@RobARichardson

RobARichardson commented Oct 31, 2023

For several weeks, my team & I have been troubleshooting sub-orchestrator function failures due to the following exception:

Error Details:

- FormattedMessage: The orchestrator function completed on a non-orchestrator thread!
- Exception Type: System.InvalidOperationException
- Message: An invalid asynchronous invocation was detected. This can be caused by awaiting non-durable tasks in an orchestrator function's implementation or by middleware that invokes asynchronous code.
- Problem ID: System.InvalidOperationException at Microsoft.Azure.Functions.Worker.Extensions.DurableTask.FunctionsOrchestrationContext.ThrowIfIllegalAccess
- Assembly: Microsoft.Azure.Functions.Worker.Extensions.DurableTask, Version=1.0.3.0, Culture=neutral, PublicKeyToken=014045d636e89289	
- CategoryName: Microsoft.Azure.Functions.Worker.Extensions.DurableTask.DurableTaskFunctionsMiddleware	

Our search for answers led us to this issue. Since we had developed custom function middleware, we tried removing it but it had no impact. Furthermore, we could not reproduce the issue locally - only in Azure. Yesterday, we turned our attention to what could be unique about our environment in Azure. My organization uses DataDog for Application Monitoring and the Azure Function App in question uses the DataDog AAS Extension. After removing the DataDog AAS Extension from the Function App, this exception has disappeared completely.

I'm wondering if the team working on durabletask-dotnet has any insight into what could be going on here and whether these two things could be related.

@danniefraim

This was super interesting! I found this issue while investigating the same exception you're getting, and after just enabling Datadog monitoring for our application. It feels like this might be something that could be handled by Datadog as well - have you reported it to them, @RobARichardson?

@ForteUnited

ForteUnited commented Jan 3, 2024

We've run into the same issue using Azure App Configuration and wiring up the App Configuration SDK for dynamic config changes using a sentinel value.

This code causes the issue with the Azure App Configuration NuGet package/SDK:

public static void Main()
{
    var host = new HostBuilder()
        .ConfigureAppConfiguration(builder =>
        {
            // Omitted the code added in the previous step.
            // ... ...
        })
        .ConfigureServices(services =>
        {
            // Make Azure App Configuration services available through dependency injection.
            services.AddAzureAppConfiguration();
        })
        .ConfigureFunctionsWorkerDefaults(app =>
        {
            // Use Azure App Configuration middleware for data refresh.
            app.UseAzureAppConfiguration();
        })
        .Build();

    host.Run();
}

Taken from this MS article -> https://learn.microsoft.com/en-us/azure/azure-app-configuration/enable-dynamic-configuration-azure-functions-csharp?tabs=isolated-process#reload-data-from-app-configuration
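
For readers in the same situation, one possible direction (an untested sketch, not a confirmed fix from the maintainers) is to replace `app.UseAzureAppConfiguration()` with a small middleware of your own that triggers the refresh only for non-orchestrator functions. The class name `NonOrchestratorConfigRefreshMiddleware` is hypothetical, and the sketch assumes that orchestrators expose an input binding of type `orchestrationTrigger` and that `services.AddAzureAppConfiguration()` has registered `IConfigurationRefresherProvider`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Middleware;
using Microsoft.Extensions.Configuration.AzureAppConfiguration;

// Hypothetical stand-in for the built-in App Configuration refresh middleware.
public sealed class NonOrchestratorConfigRefreshMiddleware : IFunctionsWorkerMiddleware
{
    private readonly IReadOnlyList<IConfigurationRefresher> _refreshers;

    public NonOrchestratorConfigRefreshMiddleware(IConfigurationRefresherProvider provider)
    {
        _refreshers = provider.Refreshers.ToList();
    }

    public Task Invoke(FunctionContext context, FunctionExecutionDelegate next)
    {
        // Assumption: orchestrators are identified by an "orchestrationTrigger" input binding.
        bool isOrchestrator = context.FunctionDefinition.InputBindings.Values.Any(binding =>
            string.Equals(binding.Type, "orchestrationTrigger", StringComparison.Ordinal));

        if (!isOrchestrator)
        {
            foreach (IConfigurationRefresher refresher in _refreshers)
            {
                // Fire-and-forget: the refresh runs in the background and is not awaited.
                _ = refresher.TryRefreshAsync();
            }
        }

        // Return the pipeline task unchanged so the middleware adds no awaits of its own.
        return next(context);
    }
}
```

With this approach, `app.UseMiddleware<NonOrchestratorConfigRefreshMiddleware>()` would take the place of `app.UseAzureAppConfiguration()`, while `services.AddAzureAppConfiguration()` stays as it is. Configuration would then refresh only when activities or other non-orchestrator functions run, and whether this avoids the exception in a given app would need to be verified.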

@lilyjma lilyjma added the P2 label Jan 4, 2024
@jkdmyrs

jkdmyrs commented Jan 25, 2024

I am also having issues with the Azure App Configuration middleware and durable tasks. It appears to impact sub-orchestrations, as mentioned above.

@smackodale

We are having the same issue with Azure App Config. This is our first attempt at Durable Functions and unfortunately it falls over at what some would consider an essential part of configuration management within an enterprise platform.
Is there any update on this, a fix or a timeframe for a fix, or at minimum a workaround?

The following 2 bugs are also related:

@davidmrdavid
Member Author

Hi @smackodale : does the workaround described at the beginning of this thread not work for you? If it doesn't work - could you please open a new issue to describe your particular issue?

@dmytroett

We're also experiencing the same issue due to DataDog extension. Were you able to find a workaround?

@danniefraim

We never solved it. We stopped using Durable Functions altogether and moved to message queue based processing instead. For this and many other reasons. Durable Functions don't feel quite production ready to me.

@jkdmyrs

jkdmyrs commented Aug 21, 2024

This was the route my team went down also. Use a Container App as the "driver" for the long-running process, and have the Container App manually fan out with queues/queue triggers inside an AzFunction app.

It turns out we didn't really need to "fan in" anyways, so this pattern works much better than trying to fan out/in with Durable.

@Banchio

Banchio commented Nov 16, 2024

Facing this issue as well with Azure App Configuration when using a dotnet 8 isolated durable function. The orchestration never ends (it throws an exception when processing the Terminate message). @davidmrdavid is it possible to use the workaround you mention at the beginning of the issue in this case? I believe not, because I'm not adding a custom middleware but the Azure App Config one, but I would like confirmation on this.
Personally, I believe this is very important to address; centralizing configuration with Azure App Config is now required in many enterprises. Thanks!
