-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GroupJoin in EF Core 9 Returns Null for Joined Entities #35393
Comments
When I updated the version to 8.0.11, the issue was resolved. There is a bug in version 9.0.0 |
I'm able to reproduce this: [ConditionalFact]
public async Task ReproGJ()
{
using (var ctx = new MyContext())
{
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();
var p1 = new Product();
var p2 = new Product();
var c1 = new Category { CategoryName = "one" };
var c2 = new Category { CategoryName = "two" };
ctx.Products.AddRange(p1, p2);
ctx.Categories.AddRange(c1, c2);
await ctx.SaveChangesAsync();
}
using (var ctx = new MyContext())
{
var query = await ctx.Categories
.GroupJoin(ctx.Products,
c => true,
p => true,
(c, p) => new
{
Category = c,
Products = p
})
.Select(s => new
{
CategoryName = s.Category.CategoryName,
Products = s.Products
})
.ToListAsync();
}
}
public class MyContext : DbContext
{
public DbSet<Category> Categories { get; set; }
public DbSet<Product> Products { get; set; }
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
}
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder
.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=ReproGJ;Trusted_Connection=True;MultipleActiveResultSets=true")
.LogTo(Console.WriteLine, LogLevel.Information)
.EnableSensitiveDataLogging();
}
}
public class MyEntity
{
public int Id { get; set; }
}
public class Category
{
public int Id { get; set; }
public string CategoryName { get; set; }
}
public class Product
{
public int Id { get; set; }
} We produce this sql: SELECT [c].[CategoryName], [c].[Id], [p0].[Id]
FROM [Categories] AS [c]
OUTER APPLY (
SELECT [p].[Id]
FROM [Products] AS [p]
WHERE 0 = 1
) AS [p0]
ORDER BY [c].[Id] while EF 8 produces: SELECT [c].[CategoryName], [c].[Id], [t].[Id]
FROM [Categories] AS [c]
OUTER APPLY (
SELECT [p].[Id]
FROM [Products] AS [p]
) AS [t]
ORDER BY [c].[Id] |
When we process GroupJoin in GroupJoinConvertingExpressionVisitor we convert it to filtered subquery: DbSet<Product>()
.Where(p => !(object.Equals(
objA: (object)True,
objB: null)) && object.Equals(
objA: (object)True,
objB: (object)True)) then, during translation:
is naively translated to
This optimization is correct in 3-value logic:
but it breaks down when we know that a is actually null (i.e. null constant or null value parameter later on), which converts to ISNULL in
/cc @ranma42 |
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: ``` !(true == a) -> false == a !(false == a) -> true == a ``` is not safe to do when a is a constant or parameter null value, because it evaluates to true IS NULL (two value logic). Fix is to remove this optimization from SqlExpressionFactory and instead put it back into OptimizeNotExpression, once we've converted everything we could to IS NULL checks already. Fixes #35393
Exactly, if the Sorry, I introduced this regression in #34142 |
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: ``` !(true == a) -> false == a !(false == a) -> true == a ``` is not safe to do when a is a constant or parameter null value, because it evaluates to true IS NULL (two value logic). Fix is to remove this optimization from SqlExpressionFactory and instead put it back into OptimizeNotExpression, once we've converted everything we could to IS NULL checks already. Fixes #35393
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: ``` !(true == a) -> false == a !(false == a) -> true == a ``` is not safe to do when a is a constant or parameter null value, because it evaluates to true IS NULL (two value logic). Fix is to remove this optimization from SqlExpressionFactory and instead put it back into OptimizeNotExpression, once we've converted everything we could to IS NULL checks already. Fixes #35393
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: ``` !(true == a) -> false == a !(false == a) -> true == a ``` is not safe to do when a is nullable. Fix is to constrain this optimization in SqlExpressionFactory to only work on argument which we know is not null (constant, column, non-nullable parameter) and do the comprehensive one back in OptimizeNotExpression, once we've converted everything we could to IS NULL checks already. Fixes #35393
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: ``` !(true == a) -> false == a !(false == a) -> true == a ``` is not safe to do when a is nullable. Fix is to constrain this optimization in SqlExpressionFactory to only work on argument which we know is not null (constant, column, non-nullable parameter) and do the comprehensive one back in OptimizeNotExpression, once we've converted everything we could to IS NULL checks already. Fixes #35393
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: !(true = a) -> false = a !(false = a) -> true = a In principle, this should be valid in 3-value logic a (true = a) !(true = a) (false = a) 0 0 1 1 1 1 0 0 N N N N but not in c# semantics, both null == true and null == false are evaluated to false, so !(true == a) is not the same as false = a (first one is true, second one is false) Fix is to drop this optimization from SqlExpressionFactory and instead compensate in nullability processor (where we have full nullability information), so that we don't regress performance for cases in which this optimization happened to be valid. Fixes #35393
Problem was that in EF9 we moved some optimizations from sql nullability processor to SqlExpressionFactory (so that we optimize things early). One of the optimizations: !(true = a) -> false = a !(false = a) -> true = a In principle, this should be valid in 3-value logic a (true = a) !(true = a) (false = a) 0 0 1 1 1 1 0 0 N N N N but not in c# semantics, both null == true and null == false are evaluated to false, so !(true == a) is not the same as false = a (first one is true, second one is false) Fix is to drop this optimization from SqlExpressionFactory and instead compensate in nullability processor (where we have full nullability information), so that we don't regress performance for cases in which this optimization happened to be valid. Fixes #35393
…Null for Joined Entities Fixes #35393 Description In EF9 we refactored some optimizations we performed on generated SQL - we are now applying them earlier (as part of SqlExpressionFactory. However, one of those optimizations is only correct for non-nullable arguments, and at the time we apply them we don't have full nullability information. Fix is to only apply this optimization when we know the argument is non-nullable. Customer impact Data corruption. Queries comparing nullable boolean property to a constant (true/false) produced incorrect results (data corruption). Workaround is to rewrite the query to add missing null checks, but this is not intuitive at all. How found Customer report on 9.0.0 Regression Yes, from 8.0. Testing Added tests for the scenarios affected as well as test that brute-forces all combinations of comparisons between nullable arguments in variety of scenarios. Risk Low, we are just removing the optimization that was incorrect. After removal, the queries produced are the same as in EF8. Quirk added, just in case.
patch fix for 9.0.3 - leaving this open until we push the fix to main (likely via #34166 which builds on top of the hotfix change) |
Issue Description:
Hello EF Core team,
I've encountered what seems to be a bug in EF Core 9. When I perform a GroupJoin between two tables using a condition that always evaluates to true (for example,
c => true
andp => true
), the resulting joined collection is always null, even though there are valid entries in the related table. Here is minimal reproducible example:In this scenario, s.Products always comes back as empty (null), However, if I rewrite this using a from clause (or a manual join with
on 1 equals 1
), I get the expected results:Both of these queries should logically produce the same result, but the GroupJoin version returns null for the
Products
while the LINQ from version successfully includes the product collection. If this is not intended behavior, it seems like a bug.Thank you for looking into this! If you need any further information I'm happy to provide more details.
The text was updated successfully, but these errors were encountered: