I have two collections which have property Email
in both collections. I need to get a list of the items in the first list where Email
does not exist in the second list. With SQL I would just use "not in", but I do not know the equivalent in LINQ. How is that done?
So far I have a join, like...
var matches = from item1 in list1
join item2 in list2 on item1.Email equals item2.Email
select new { Email = list1.Email };
But I cannot join since I need the difference and the join would fail. I need some way of using Contains or Exists I believe. I just have not found an example to do that yet.
You want the Except operator.
var answer = list1.Except(list2);
Better explanation here: https://docs.microsoft.com/archive/blogs/charlie/linq-farm-more-on-set-operators
NOTE: This technique works best for primitive types only, since you have to implement an IEqualityComparer to use the Except
method with complex types.
I don't know if this will help you but..
NorthwindDataContext dc = new NorthwindDataContext();
dc.Log = Console.Out;
var query =
from c in dc.Customers
where !(from o in dc.Orders
select o.CustomerID)
.Contains(c.CustomerID)
select c;
foreach (var c in query) Console.WriteLine( c );
from The NOT IN clause in LINQ to SQL by Marco Russo
For people who start with a group of in-memory objects and are querying against a database, I've found this to be the best way to go:
var itemIds = inMemoryList.Select(x => x.Id).ToArray();
var otherObjects = context.ItemList.Where(x => !itemIds.Contains(x.Id));
This produces a nice WHERE ... IN (...)
clause in SQL.
items in the first list where the Email does not exist in the second list.
from item1 in List1
where !(list2.Any(item2 => item2.Email == item1.Email))
select item1;
You can use a combination of Where and Any for finding not in:
var NotInRecord =list1.Where(p => !list2.Any(p2 => p2.Email == p.Email));
.Any()
, especially when it's "NOT ANY" I get confused!
In the case where one is using the ADO.NET Entity Framework, EchoStorm's solution also works perfectly. But it took me a few minutes to wrap my head around it. Assuming you have a database context, dc, and want to find rows in table x not linked in table y, the complete answer answer looks like:
var linked =
from x in dc.X
from y in dc.Y
where x.MyProperty == y.MyProperty
select x;
var notLinked =
dc.X.Except(linked);
In response to Andy's comment, yes, one can have two from's in a LINQ query. Here's a complete working example, using lists. Each class, Foo and Bar, has an Id. Foo has a "foreign key" reference to Bar via Foo.BarId. The program selects all Foo's not linked to a corresponding Bar.
class Program
{
static void Main(string[] args)
{
// Creates some foos
List<Foo> fooList = new List<Foo>();
fooList.Add(new Foo { Id = 1, BarId = 11 });
fooList.Add(new Foo { Id = 2, BarId = 12 });
fooList.Add(new Foo { Id = 3, BarId = 13 });
fooList.Add(new Foo { Id = 4, BarId = 14 });
fooList.Add(new Foo { Id = 5, BarId = -1 });
fooList.Add(new Foo { Id = 6, BarId = -1 });
fooList.Add(new Foo { Id = 7, BarId = -1 });
// Create some bars
List<Bar> barList = new List<Bar>();
barList.Add(new Bar { Id = 11 });
barList.Add(new Bar { Id = 12 });
barList.Add(new Bar { Id = 13 });
barList.Add(new Bar { Id = 14 });
barList.Add(new Bar { Id = 15 });
barList.Add(new Bar { Id = 16 });
barList.Add(new Bar { Id = 17 });
var linked = from foo in fooList
from bar in barList
where foo.BarId == bar.Id
select foo;
var notLinked = fooList.Except(linked);
foreach (Foo item in notLinked)
{
Console.WriteLine(
String.Format(
"Foo.Id: {0} | Bar.Id: {1}",
item.Id, item.BarId));
}
Console.WriteLine("Any key to continue...");
Console.ReadKey();
}
}
class Foo
{
public int Id { get; set; }
public int BarId { get; set; }
}
class Bar
{
public int Id { get; set; }
}
You can take both the collections in two different lists, say list1 and list2.
Then just write
list1.RemoveAll(Item => list2.Contains(Item));
This will work.
One could also use All()
var notInList = list1.Where(p => list2.All(p2 => p2.Email != p.Email));
var secondEmails = (from item in list2
select new { Email = item.Email }
).ToList();
var matches = from item in list1
where !secondEmails.Contains(item.Email)
select new {Email = item.Email};
While Except
is part of the answer, it's not the whole answer. By default, Except
(like several of the LINQ operators) does a reference comparison on reference types. To compare by values in the objects, you'll have to
implement IEquatable
override Equals and GetHashCode in your type, or
pass in an instance of a type implementing IEqualityComparer
Example using List of int for simplicity.
List<int> list1 = new List<int>();
// fill data
List<int> list2 = new List<int>();
// fill data
var results = from i in list1
where !list2.Contains(i)
select i;
foreach (var result in results)
Console.WriteLine(result.ToString());
For anyone who also wants to use a SQL-alike IN
operator in C#, download this package :
Mshwf.NiceLinq
It has In
and NotIn
methods:
var result = list1.In(x => x.Email, list2.Select(z => z.Email));
Even you can use it this way
var result = list1.In(x => x.Email, "a@b.com", "b@c.com", "c@d.com");
Alternatively you can do like this:
var result = list1.Where(p => list2.All(x => x.Id != p.Id));
I did not test this with LINQ to Entities:
NorthwindDataContext dc = new NorthwindDataContext();
dc.Log = Console.Out;
var query =
from c in dc.Customers
where !dc.Orders.Any(o => o.CustomerID == c.CustomerID)
select c;
Alternatively:
NorthwindDataContext dc = new NorthwindDataContext();
dc.Log = Console.Out;
var query =
from c in dc.Customers
where dc.Orders.All(o => o.CustomerID != c.CustomerID)
select c;
foreach (var c in query)
Console.WriteLine( c );
Couldn't you do an outer join, only selecting the items from the first list if the group is empty? Something like:
Dim result = (From a In list1
Group Join b In list2
On a.Value Equals b.Value
Into grp = Group
Where Not grp.Any
Select a)
I'm unsure whether this would work in any sort of efficient way with the Entity framework.
DynamicWebsiteEntities db = new DynamicWebsiteEntities();
var data = (from dt_sub in db.Subjects_Details
//Sub Query - 1
let sub_s_g = (from sg in db.Subjects_In_Group
where sg.GroupId == groupId
select sg.SubjectId)
//Where Cause
where !sub_s_g.Contains(dt_sub.Id) && dt_sub.IsLanguage == false
//Order By Cause
orderby dt_sub.Subject_Name
select dt_sub)
.AsEnumerable();
SelectList multiSelect = new SelectList(data, "Id", "Subject_Name", selectedValue);
//======================================OR===========================================
var data = (from dt_sub in db.Subjects_Details
//Where Cause
where !(from sg in db.Subjects_In_Group
where sg.GroupId == groupId
select sg.SubjectId).Contains(dt_sub.Id) && dt_sub.IsLanguage == false
//Order By Cause
orderby dt_sub.Subject_Name
select dt_sub)
.AsEnumerable();
Success story sharing
list1.Select(item => new { Property1 = item.Property1, Property2 = item.Property2 }).Except(list2.Select( item => new { Property1 = item.Property1, Property2 = item.Property2 }));
this is particular useful when you're determining equality by evaluating only a set of values of the complex type.IEquatityComparor<T,T>
or override object comparison methods in aLinqToSql
scenario; for, the query will be represented as/compiled to/expressed as SQL; thus the values will be checked, not the object reference.except
I was able to speed up a LINQ query from 8-10 seconds to a half a second