I have a list testList that contains a bunch of strings. I would like to add a new string to testList only if it doesn't already exist in the list. Therefore, I need to do a case-insensitive search of the list, and it needs to be efficient. I can't use Contains because that doesn't take casing into account. I also don't want to use ToUpper/ToLower for performance reasons. I came across this method, which works:
if (testList.FindAll(x => x.IndexOf(keyword,
        StringComparison.OrdinalIgnoreCase) >= 0).Count > 0)
    Console.WriteLine("Found in list");
This works, but it also matches partial words. If the list contains "goat", I can't add "oat" because it claims that "oat" is already in the list. Is there a way to efficiently search a list in a case-insensitive manner, where words have to match exactly? Thanks.
I realise this is an old post, but just in case anyone else is looking, you can use Contains by providing the case-insensitive string equality comparer, like so:
using System.Linq;
// ...
if (testList.Contains(keyword, StringComparer.OrdinalIgnoreCase))
{
    Console.WriteLine("Keyword Exists");
}
StringComparer has been available since .NET 2.0, but this Contains overload is a LINQ extension method and requires .NET 3.5 or later.
Instead of String.IndexOf, use String.Equals to ensure you don't have partial matches. Also don't use FindAll as that goes through every element, use FindIndex (it stops on the first one it hits).
if (testList.FindIndex(x => x.Equals(keyword,
        StringComparison.OrdinalIgnoreCase)) != -1)
    Console.WriteLine("Found in list");
Alternatively, use a LINQ method such as Any (which also stops on the first match):
if (testList.Any(s => s.Equals(keyword, StringComparison.OrdinalIgnoreCase)))
    Console.WriteLine("found in list");
You can also use the List<>.Exists(Predicate<>) instance method. Also note that if the list contains null entries, calling x.Equals on them throws a NullReferenceException. In that case it is safer to write keyword.Equals(x, StringComparison.OrdinalIgnoreCase) than x.Equals(keyword, StringComparison.OrdinalIgnoreCase) (provided you can guarantee that keyword is never null).
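The null concern in the comment above can be sketched roughly as follows. This is a minimal example, not code from the thread: the class name NullSafeSearch and the method name ContainsIgnoreCase are made up for illustration. It uses the static string.Equals overload, which accepts null on either side, so neither a null list entry nor a null keyword throws.

```csharp
using System;
using System.Collections.Generic;

public static class NullSafeSearch
{
    // Hypothetical helper: returns true if any entry equals keyword,
    // ignoring case. The static string.Equals(a, b, comparison) call
    // handles null arguments, so null entries in the list are skipped
    // safely instead of causing a NullReferenceException.
    public static bool ContainsIgnoreCase(List<string> list, string keyword)
    {
        // Exists stops at the first element that satisfies the predicate.
        return list.Exists(x =>
            string.Equals(x, keyword, StringComparison.OrdinalIgnoreCase));
    }
}
```

One thing to be aware of with this sketch: if keyword itself is null and the list contains a null entry, the static Equals treats them as equal and the method returns true.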
Based on Adam Sills' answer above, here's a nice clean extension method for Contains... :)
///----------------------------------------------------------------------
/// <summary>
/// Determines whether the specified list contains the matching string value
/// </summary>
/// <param name="list">The list.</param>
/// <param name="value">The value to match.</param>
/// <param name="ignoreCase">if set to <c>true</c> the case is ignored.</param>
/// <returns>
/// <c>true</c> if the specified list contains the matching string; otherwise, <c>false</c>.
/// </returns>
///----------------------------------------------------------------------
public static bool Contains(this List<string> list, string value, bool ignoreCase = false)
{
    return ignoreCase ?
        list.Any(s => s.Equals(value, StringComparison.OrdinalIgnoreCase)) :
        list.Contains(value);
}
You can use the static StringComparer variants with the Contains overload from LINQ, for example like this:
using System.Linq;
var list = new List<string>();
list.Add("cat");
list.Add("dog");
list.Add("moth");
if (list.Contains("MOTH", StringComparer.OrdinalIgnoreCase))
{
    Console.WriteLine("found");
}
Based on Lance Larsen's answer, here's an extension method that uses the recommended string.Compare instead of string.Equals:
It is highly recommended that you use an overload of String.Compare that takes a StringComparison parameter. Not only do these overloads allow you to define the exact comparison behavior you intended, using them will also make your code more readable for other developers. [Josh Free @ BCL Team Blog]
public static bool Contains(this List<string> source, string toCheck, StringComparison comp)
{
    return
        source != null &&
        !string.IsNullOrEmpty(toCheck) &&
        source.Any(x => string.Compare(x, toCheck, comp) == 0);
}
You're checking whether the result of IndexOf is greater than or equal to 0, which is true when the match starts anywhere in the string. Try checking whether it's equal to 0 instead:
if (testList.FindAll(x => x.IndexOf(keyword,
        StringComparison.OrdinalIgnoreCase) == 0).Count > 0)
    Console.WriteLine("Found in list");
Now "goat" and "oat" won't match, but "goat" and "goa" will. To avoid this, you can also compare the lengths of the two strings.
To avoid all this complication, you can use a dictionary instead of a list. The key would be the lowercase string, and the value would be the real string. This way, performance isn't hurt because you don't have to call ToLower for each comparison, but you can still do a Contains-style lookup (ContainsKey on a dictionary).
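A related sketch, swapping in a different technique from the lowercase-key dictionary: a HashSet<string> constructed with StringComparer.OrdinalIgnoreCase does hashed lookups that ignore case, so no lowercase copy of each string has to be stored. The class name UniqueWordList and both method names are hypothetical, chosen only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueWordList
{
    // Hypothetical factory: builds a set whose equality checks ignore
    // case, so "goat" and "GOAT" count as the same entry.
    public static HashSet<string> Create(IEnumerable<string> words)
    {
        return new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }

    // Adds the word only if no entry equals it under the set's comparer.
    // HashSet<T>.Add returns false when an equal element already exists.
    public static bool TryAdd(HashSet<string> set, string word)
    {
        return set.Add(word);
    }
}
```

With a set created from "goat", calling TryAdd with "GOAT" returns false, while "oat" is added because only whole strings are compared. The trade-off is that a HashSet does not guarantee element order, so this only fits if the list's ordering doesn't matter. Dictionary<TKey, TValue> also has a constructor that takes an IEqualityComparer, if a key-to-value mapping is still wanted.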
Below is an example of searching the whole list for a keyword and removing the matching items:
public class Book
{
    public int BookId { get; set; }
    public DateTime CreatedDate { get; set; }
    public string Text { get; set; }
    public string Autor { get; set; }
    public string Source { get; set; }
}
If you want to remove every book that contains some keyword in the Text property, you can create a list of keywords and remove the matching books from the list:
List<Book> listToSearch = new List<Book>()
{
    new Book()
    {
        BookId = 1,
        CreatedDate = new DateTime(2014, 5, 27),
        Text = " test voprivreda...",
        Autor = "abc",
        Source = "SSSS"
    },
    new Book()
    {
        BookId = 2,
        CreatedDate = new DateTime(2014, 5, 27),
        Text = "here you go...",
        Autor = "bcd",
        Source = "SSSS"
    }
};
var blackList = new List<string>()
{
    "test", "b"
};
foreach (var itemtoremove in blackList)
{
    // Remove every book whose Text contains the keyword, ignoring case.
    listToSearch.RemoveAll(p => p.Text.IndexOf(itemtoremove, StringComparison.OrdinalIgnoreCase) >= 0);
}
return listToSearch.ToList();
I had a similar problem: I needed the index of the item, but the search had to be case-insensitive. I looked around the web for a few minutes and found nothing, so I wrote a small method to get it done. Here is what I did:
private static int getCaseInvariantIndex(List<string> ItemsList, string searchItem)
{
    List<string> lowercaselist = new List<string>();
    foreach (string item in ItemsList)
    {
        lowercaselist.Add(item.ToLower());
    }
    return lowercaselist.IndexOf(searchItem.ToLower());
}
Add this code to the same file, and call it like this:
int index = getCaseInvariantIndex(ListOfItems, itemToFind);
Hope this helps, good luck!
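For comparison, the same index lookup can probably be done without building a lowercase copy of the list, by combining FindIndex with an ordinal case-insensitive comparison as suggested earlier in the thread. This is only a sketch; the class name ListIndexHelper and the method name IndexOfIgnoreCase are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ListIndexHelper
{
    // Hypothetical helper: returns the index of the first entry that
    // equals searchItem ignoring case, or -1 if there is none.
    // No second list is allocated and no ToLower call is made; the
    // static string.Equals also tolerates null entries.
    public static int IndexOfIgnoreCase(List<string> items, string searchItem)
    {
        return items.FindIndex(x =>
            string.Equals(x, searchItem, StringComparison.OrdinalIgnoreCase));
    }
}
```

Note that OrdinalIgnoreCase is not culture-sensitive while ToLower uses the current culture, so the two approaches can differ for some non-ASCII text.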
The StringComparer class has been around since 2.0, but that overload of Contains was introduced in 3.5. msdn.microsoft.com/en-us/library/bb339118(v=vs.110).aspx