.Net: Bug: Azure-hosted Open-Source Models (e.g., Mistral Nemo) fail with Kernel Functions in .NET #9933
Comments
@markwallace-microsoft can you take a look?
@GregorBiswanger I created a sample for this and tested with Mistral Nemo and didn't see any issues. Take a look here: #9954
@markwallace-microsoft It is important that you tested Mistral Nemo via Azure AI, not with Ollama or locally. Did you really test it via Azure?
After further investigation, I’ve identified that the issue specifically occurs when streaming is enabled for the Azure-hosted open-source models (e.g., Mistral Nemo) using Semantic Kernel in .NET. Here’s what I tested:
Code that does not work with Streaming:
#pragma warning disable SKEXP0070
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.AzureAIInference;
using OllamaApiFacade.Extensions;
using SemanticFlow.DemoWebApi;

var azureKeyVaultHelper = new AzureKeyVaultHelper("https://XXX.vault.azure.net");
var endpoint = await azureKeyVaultHelper.GetSecretAsync("AZURE-MINISTRAL-NEMO-ENDPOINT");
var apiKey = await azureKeyVaultHelper.GetSecretAsync("AZURE-MINISTRAL-NEMO-KEY");

// Kernel setup
var builder = Kernel.CreateBuilder()
    .AddAzureAIInferenceChatCompletion("Mistral-Nemo-oclsi", apiKey, new Uri(endpoint));

// Use Burp Suite proxy for analysis backend communication
// builder.Services.AddProxyForDebug();
var kernel = builder.Build();

// Import plugin functions
kernel.ImportPluginFromFunctions("HelperFunctions",
[
    kernel.CreateFunctionFromMethod(() => new List<string> { "Squirrel Steals Show", "Dog Wins Lottery" },
        "GetLatestNewsTitles", "Retrieves latest news titles."),
    kernel.CreateFunctionFromMethod(() => DateTime.UtcNow.ToString("R"),
        "GetCurrentUtcDateTime", "Retrieves the current date time in UTC."),
    kernel.CreateFunctionFromMethod((string cityName, string currentDateTime) =>
    {
        if (string.IsNullOrEmpty(cityName) || string.IsNullOrEmpty(currentDateTime))
        {
            throw new ArgumentException("cityName and currentDateTime are required.");
        }
        return cityName switch
        {
            "Boston" => "61 and rainy",
            "London" => "55 and cloudy",
            "Miami" => "80 and sunny",
            "Paris" => "60 and rainy",
            "Tokyo" => "50 and sunny",
            "Sydney" => "75 and sunny",
            "Tel Aviv" => "80 and sunny",
            _ => "31 and snowing",
        };
    }, "GetWeatherForCity", "Gets the current weather for the specified city, using the city name and current UTC date/time.")
]);

// Settings with Streaming Enabled
var settings = new AzureAIInferencePromptExecutionSettings { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() };
var chatHistory = new ChatHistory();
chatHistory.AddUserMessage("What is the weather in Tokyo based on the current date and time?");

// Streaming with IChatCompletionService
var chatCompletionService = kernel.Services.GetRequiredService<IChatCompletionService>();
try
{
    // Stream the response
    await foreach (var message in chatCompletionService.GetStreamingChatMessageContentsAsync(chatHistory, settings, kernel))
    {
        if (message.Role.HasValue)
        {
            Console.Write($"{message.Role.Value}: ");
        }
        if (!string.IsNullOrEmpty(message.Content))
        {
            Console.Write(message.Content);
        }
    }
    Console.WriteLine("\nStreaming completed.");
}
catch (Exception ex)
{
    // Log any exceptions
    Console.WriteLine($"Error: {ex.Message}");
}
Cheers,
I’ve discovered something new that might narrow down the issue further. It appears the problem isn’t just related to streaming but to Chat Completions in general. Using Burp Suite, I analyzed the request JSON being sent. I noticed that the JSON contains two separate "tools" entries: the first lists the three plugin functions, and the second, at the end of the body, is an empty array.
This leads me to believe there’s a bug in how Chat Completions handle these duplicate entries in the request JSON. Here’s the exact JSON being sent (identical even with streaming enabled):
POST /chat/completions?api-version=2024-05-01-preview HTTP/1.1
Host: mistral-nemo-oclsi.swedencentral.models.ai.azure.com
Accept: application/json
...
Content-Type: application/json
Content-Length: 936
Connection: keep-alive
{
  "messages": [
    {
      "content": [
        {
          "text": "What is the weather in Tokyo based on the current date and time?",
          "type": "text"
        }
      ],
      "role": "user"
    }
  ],
  "model": "Mistral-Nemo-oclsi",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "HelperFunctions-GetLatestNewsTitles",
        "description": "Retrieves latest news titles.",
        "parameters": {
          "type": "object",
          "required": [],
          "properties": {}
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "HelperFunctions-GetCurrentUtcDateTime",
        "description": "Retrieves the current date time in UTC.",
        "parameters": {
          "type": "object",
          "required": [],
          "properties": {}
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "HelperFunctions-GetWeatherForCity",
        "description": "Gets the current weather for the specified city, using the city name and current UTC date/time.",
        "parameters": {
          "type": "object",
          "required": ["cityName", "currentDateTime"],
          "properties": {
            "cityName": {
              "type": "string"
            },
            "currentDateTime": {
              "type": "string"
            }
          }
        }
      }
    }
  ],
  "tool_choice": "auto",
  "stop": [],
  "tools": []
}
Key Observation:
While this structure doesn’t seem to break local solutions like LM-Studio, Azure AI appears to mishandle it by honoring only the last "tools" entry, which is empty. I now believe the bug lies in how Chat Completions process these entries in general. Let me know if additional details or logs are needed!
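The "last entry wins" effect described above can be reproduced locally with System.Text.Json, whose JsonElement.GetProperty matches the last definition of a repeated property. The following is only a sketch: the request body is a shortened stand-in for the captured one, FindDuplicateTopLevelKeys is a hypothetical helper name, and whether the Azure AI backend resolves duplicates the same way is an assumption based on the observed behavior, not something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: returns the names of top-level properties that
// appear more than once in a JSON object.
static List<string> FindDuplicateTopLevelKeys(string json)
{
    using var doc = JsonDocument.Parse(json);
    return doc.RootElement
        .EnumerateObject()            // yields every property, duplicates included
        .GroupBy(p => p.Name)
        .Where(g => g.Count() > 1)
        .Select(g => g.Key)
        .ToList();                    // materialize before the document is disposed
}

// Shortened stand-in for the captured request body (not the full payload).
string requestBody =
    "{ \"model\": \"Mistral-Nemo-oclsi\"," +
    "  \"tools\": [ { \"type\": \"function\" } ]," +
    "  \"tool_choice\": \"auto\"," +
    "  \"stop\": []," +
    "  \"tools\": [] }";

var duplicates = FindDuplicateTopLevelKeys(requestBody);
Console.WriteLine($"Duplicate keys: {string.Join(", ", duplicates)}");

// GetProperty matches the last "tools" definition, so a reader with the
// same last-wins behavior sees an empty tool list.
using var parsed = JsonDocument.Parse(requestBody);
int toolCount = parsed.RootElement.GetProperty("tools").GetArrayLength();
Console.WriteLine($"tools seen by a last-wins reader: {toolCount}");
```

A check like this can be run against a body captured through a debugging proxy to confirm whether a given combination of connector and settings class emits the duplicate entry.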
Thanks for the detailed investigation @GregorBiswanger. I suspect this is an issue in the Azure SDK, will continue to investigate.
@markwallace-microsoft In combination with IChatCompletionService and PromptExecutionSettings instead of AzureAIInferencePromptExecutionSettings, it works without duplicating tool entries in the request. The 'Function Call' then works perfectly.
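As a fragment, the workaround reported above amounts to swapping the settings type in the repro code. This is a sketch of that one change only; it assumes the rest of the program (kernel, chat history, chat completion service) stays as posted earlier in the thread, and it has been reported to work only for the Mistral Nemo deployment discussed here.

```csharp
using Microsoft.SemanticKernel;

// Base settings class instead of AzureAIInferencePromptExecutionSettings,
// as reported in the comment above.
var settings = new PromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};
```

The settings object is then passed to GetStreamingChatMessageContentsAsync exactly as in the original code.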
### Motivation and Context
- Closes #9933
Hi,
I used the SemanticKernel package 1.33.0 with Microsoft.SemanticKernel.Connectors.AzureAIInference 1.33.0-beta to test. The Mistral-Nemo model works perfectly, but the Llama-series models do not.
With llama-3.3-70B, a chat completion call with tools returns an incorrect result and does not invoke any tools, regardless of what I used.
With llama-3.2-90B, a chat completion call with tools throws an exception right away (error message below).
Does the AzureAIInference connector currently have an issue with Llama-series models?
Describe the bug
Semantic Kernel Functions in .NET using Microsoft.SemanticKernel.Connectors.AzureAIInference do not work with Azure-hosted open-source models like "Mistral Nemo," even though these models support Function Calls. The same setup works fine with GPT models, and in Python, it works with Azure AI without any issues. This seems to be a .NET-specific issue with Semantic Kernel.
To Reproduce
Steps to reproduce the behavior:
Microsoft.SemanticKernel.Connectors.AzureAIInference.
PromptExecutionSettings with FunctionChoiceBehavior.Auto().
Expected behavior
The Semantic Kernel Function should execute successfully using Function Calls with the Azure-hosted open-source model, similar to how it works with GPT models or with Python implementations.
Screenshots
Not applicable
Platform
Additional context
Please provide guidance or a fix to enable Semantic Kernel Functions to work with Azure-hosted open-source models.