Case insensitive 'Contains(string)'

Andrew Truckle

You could use the String.IndexOf Method and pass StringComparison.OrdinalIgnoreCase as the type of search to use:

string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;

Even better is defining a new extension method for string:

public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source?.IndexOf(toCheck, comp) >= 0;
    }
}

Note that the null-propagation operator ?. is available since C# 6.0 (VS 2015); for older versions use

if (source == null) return false;
return source.IndexOf(toCheck, comp) >= 0;

USAGE:

string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);

Great string extension method! I've edited mine to check the source string is not null to prevent any object reference errors from occurring when performing .IndexOf().

This gives the same answer as paragraph.ToLower(culture).Contains(word.ToLower(culture)) with CultureInfo.InvariantCulture and it doesn't solve any localisation issues. Why overcomplicate things? stackoverflow.com/a/15464440/284795

@ColonelPanic the ToLower version includes 2 allocations which are unnecessary in a comparison / search operation. Why needlessly allocate in a scenario that doesn't require it?

@Seabiscuit that won't work because string is an IEnumerable<char> hence you can't use it to find substrings

A word of warning: The default for string.IndexOf(string) is to use the current culture, while the default for string.Contains(string) is to use the ordinal comparer. As we know, the former can be changed by picking a longer overload, while the latter cannot be changed. A consequence of this inconsistency is the following code sample:

Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
string self = "Waldstrasse";
string value = "straße";
Console.WriteLine(self.Contains(value)); /* False */
Console.WriteLine(self.IndexOf(value) >= 0); /* True */
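One way to sidestep this inconsistency is to state the comparison explicitly instead of relying on either default. A minimal sketch reusing the strings from the sample above; the class and method names (ExplicitComparisonDemo, ContainsOrdinal) are only illustrative:

```csharp
using System;

public static class ExplicitComparisonDemo
{
    // Ordinal substring test that does not depend on the current culture.
    public static bool ContainsOrdinal(string source, string value)
    {
        return source.IndexOf(value, StringComparison.Ordinal) >= 0;
    }

    public static void Main()
    {
        string self = "Waldstrasse";
        string value = "straße";

        Console.WriteLine(self.Contains(value));         // False (ordinal by default)
        Console.WriteLine(ContainsOrdinal(self, value)); // False (ordinal, stated explicitly)
    }
}
```

With the comparison spelled out, both calls agree regardless of the thread's culture.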

iliketocode

To test if the string paragraph contains the string word (thanks @QuarterMeister)

culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0

Where culture is the instance of CultureInfo describing the language that the text is written in.

This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29 letter-long alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.

Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeia word. (Turks, please correct me if I'm wrong, or suggest a better example)

To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.
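As a rough sketch of the above, the expression can be wrapped in a helper that takes the culture as a parameter. The names CultureAwareSearch and ContainsIgnoreCase are assumptions made for illustration, not part of the framework:

```csharp
using System;
using System.Globalization;

public static class CultureAwareSearch
{
    // Case-insensitive substring test under the casing rules of the given culture.
    public static bool ContainsIgnoreCase(string paragraph, string word, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }

    public static void Main()
    {
        // InvariantCulture is the fallback suggested above when the language is unknown.
        Console.WriteLine(ContainsIgnoreCase("The TIN roof", "tin", CultureInfo.InvariantCulture)); // True

        // Passing a Turkish culture (for example new CultureInfo("tr-TR")) may give
        // a different answer for the same strings, because I and i are different
        // letters there. The outcome depends on the culture data of the platform.
    }
}
```

Making the culture an explicit argument keeps the language assumption visible at every call site.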

Why not culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0? That uses the right culture and is case-insensitive, it doesn't allocate temporary lowercase strings, and it avoids the question of whether converting to lowercase and comparing is always the same as a case-insensitive comparison.

This solution also needlessly pollutes the heap by allocating memory for what should be a searching function

Comparing with ToLower() will give different results from a case-insensitive IndexOf when two different letters have the same lowercase letter. For example, calling ToLower() on either U+0398 "Greek Capital Letter Theta" or U+03F4 "Greek Capital Letter Theta Symbol" results in U+03B8, "Greek Small Letter Theta", but the capital letters are considered different. Both solutions consider lowercase letters with the same capital letter different, such as U+0073 "Latin Small Letter S" and U+017F "Latin Small Letter Long S", so the IndexOf solution seems more consistent.

@Quartermeister - and BTW, I believe .NET 2 and .NET 4 behave differently on this, as .NET 4 always uses NORM_LINGUISTIC_CASING while .NET 2 did not (this flag appeared with Windows Vista).

Why didn't you write "ddddfg".IndexOf("Df", StringComparison.OrdinalIgnoreCase) ?

Liam

You can use IndexOf() like this:

string title = "STRING";

if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
    // The string exists in the original
}

Since 0 (zero) can be an index, you check against -1.

MSDN

The zero-based index position of value if that string is found, or -1 if it is not. If value is String.Empty, the return value is 0.
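The return values quoted above can be checked with a short sketch. It uses StringComparison.OrdinalIgnoreCase so that the results do not depend on the current culture; the class name is illustrative:

```csharp
using System;

public static class IndexOfReturnValues
{
    public static void Main()
    {
        string title = "STRING";

        Console.WriteLine(title.IndexOf("string", StringComparison.OrdinalIgnoreCase)); // 0  (found at the start)
        Console.WriteLine(title.IndexOf("ring", StringComparison.OrdinalIgnoreCase));   // 2  (found later on)
        Console.WriteLine(title.IndexOf("xyz", StringComparison.OrdinalIgnoreCase));    // -1 (not found)
        Console.WriteLine(title.IndexOf("", StringComparison.OrdinalIgnoreCase));       // 0  (empty value)
    }
}
```

Because a match at the very start yields 0, a test such as "> 0" would miss it; comparing against -1 (or using ">= 0") covers every position.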

marsze

Alternative solution using Regex:

bool contains = Regex.IsMatch("StRiNG to search", Regex.Escape("string"), RegexOptions.IgnoreCase);

Good idea. RegexOptions values can also be combined with the bitwise OR operator, e.g. RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.CultureInvariant, in case that helps anyone.

Must say I prefer this method although using IsMatch for neatness.

What's worse, since the search string is interpreted as a regex, a number of punctuation chars will cause incorrect results (or trigger an exception due to an invalid expression). Try searching for "." in "This is a sample string that doesn't contain the search string". Or try searching for "(invalid", for that matter.

@cHao: In that case, Regex.Escape could help. Regex still seems unnecessary when IndexOf / extension Contains is simple (and arguably more clear).

Note that I was not implying that this Regex solution was the best way to go. I was simply adding to the list of answers to the original posted question "Is there a way to make the following return true?".
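To make the punctuation problem raised in the comments concrete, here is a small sketch that contrasts an unescaped pattern with one passed through Regex.Escape. The wrapper name RegexContains.ContainsIgnoreCase is made up for this example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexContains
{
    // Treats 'value' as literal text rather than as a pattern.
    public static bool ContainsIgnoreCase(string input, string value)
    {
        return Regex.IsMatch(input, Regex.Escape(value), RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string text = "This is a sample string";

        Console.WriteLine(Regex.IsMatch(text, ".", RegexOptions.IgnoreCase)); // True: "." matches any character
        Console.WriteLine(ContainsIgnoreCase(text, "."));                     // False: the text has no literal dot
        Console.WriteLine(ContainsIgnoreCase(text, "(invalid"));              // False, and no exception is thrown
        Console.WriteLine(ContainsIgnoreCase(text, "SAMPLE"));                // True
    }
}
```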

Mathieu Renda

.NET Core 2.0+ (including .NET 5.0+)

.NET Core has had a pair of methods to deal with this since version 2.0:

String.Contains(Char, StringComparison)

String.Contains(String, StringComparison)

Example:

"Test".Contains("test", System.StringComparison.CurrentCultureIgnoreCase);

It is now officially part of the .NET Standard 2.1, and therefore part of all the implementations of the Base Class Library that implement this version of the standard (or a higher one).
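A short sketch of both overloads, assuming a target framework that ships them (the class name is illustrative):

```csharp
using System;

public static class BuiltInContains
{
    public static void Main()
    {
        // String overload with an explicit comparison
        Console.WriteLine("Test".Contains("EST", StringComparison.OrdinalIgnoreCase)); // True

        // Char overload with an explicit comparison
        Console.WriteLine("Test".Contains('E', StringComparison.OrdinalIgnoreCase));   // True

        // The same search is case-sensitive with StringComparison.Ordinal
        Console.WriteLine("Test".Contains("EST", StringComparison.Ordinal));           // False
    }
}
```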

Now also available in .NET Standard 2.1

Available in .NET 5.0 as well.

.NET 5.0 is included in ".NET Core 2.0+"

Ed S.

You could always just up or downcase the strings first.

string title = "string";
title.ToUpper().Contains("STRING")  // returns true

Oops, just saw that last bit. A case insensitive compare would *probably* do the same anyway, and if performance is not an issue, I don't see a problem with creating uppercase copies and comparing those. I could have sworn that I once saw a case-insensitive compare...

Search for "Turkey test" :)

In some French locales, uppercase letters don't have the diacritics, so ToUpper() may not be any better than ToLower(). I'd say use the proper tools if they're available - case-insensitive compare.

Don't use ToUpper or ToLower, and do what Jon Skeet said

Just saw this again after two years and a new downvote... anyway, I agree that there are better ways to compare strings. However, not all programs will be localized (most won't) and many are internal or throwaway apps. Since I can hardly expect credit for advice best left for throwaway apps... I'm moving on :D

Is searching for "Turkey test" the same as searching for "TURKEY TEST"?

fubo

One issue with the answer is that it will throw an exception if a string is null. You can add that as a check so it won't:

public static bool Contains(this string source, string toCheck, StringComparison comp)
{
    if (string.IsNullOrEmpty(toCheck) || string.IsNullOrEmpty(source))
        return true;

    return source.IndexOf(toCheck, comp) >= 0;
}

If toCheck is the empty string it needs to return true per the Contains documentation: "true if the value parameter occurs within this string, or if value is the empty string (""); otherwise, false."

Based on amurra's comment above, doesn't the suggested code need to be corrected? And shouldn't this be added to the accepted answer, so that the best response is first?

Now this will return true if source is an empty string or null no matter what toCheck is. That cannot be correct. Also IndexOf already returns true if toCheck is an empty string and source is not null. What is needed here is a check for null. I suggest if (source == null || value == null) return false;

The source can't be null.

if (string.IsNullOrEmpty(source)) return string.IsNullOrEmpty(toCheck);
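Pulling the comments together, one possible variant returns false when either argument is null and lets IndexOf handle the empty string. This is only a sketch of that suggestion; the name ContainsSafe is hypothetical and chosen to avoid clashing with the built-in Contains:

```csharp
using System;

public static class NullSafeStringExtensions
{
    public static bool ContainsSafe(this string source, string toCheck, StringComparison comp)
    {
        // A null on either side cannot be a match.
        if (source == null || toCheck == null)
            return false;

        // IndexOf already returns 0 for an empty toCheck, so that case yields true.
        return source.IndexOf(toCheck, comp) >= 0;
    }
}

public static class NullSafeDemo
{
    public static void Main()
    {
        string missing = null;

        Console.WriteLine(missing.ContainsSafe("a", StringComparison.OrdinalIgnoreCase)); // False
        Console.WriteLine("".ContainsSafe("a", StringComparison.OrdinalIgnoreCase));      // False
        Console.WriteLine("abc".ContainsSafe("", StringComparison.OrdinalIgnoreCase));    // True
        Console.WriteLine("ABC".ContainsSafe("b", StringComparison.OrdinalIgnoreCase));   // True
    }
}
```

Whether a null toCheck should return false or throw, as the built-in method does, is a design choice; this sketch picks the lenient option proposed in the comments.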

abatishchev

A StringExtensions class is the way forward. I've combined a couple of the posts above to give a complete code example:

public static class StringExtensions
{
    /// <summary>
    /// Allows case insensitive checks
    /// </summary>
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source.IndexOf(toCheck, comp) >= 0;
    }
}

why are you allowing ANOTHER layer of abstraction over StringComparison ?

Because this simplifies both reading and writing the code. It's essentially mimicking what later versions of .Net added directly to the class. There's a lot to be said for simple convenience methods that make your life and the life of others easier, even if they do add a little bit of abstraction.

Neuron

This is clean and simple.

Regex.IsMatch(file, fileNamestr, RegexOptions.IgnoreCase)

This will match against a pattern, though. In your example, if fileNamestr has any special regex characters (e.g. *, +, ., etc.) then you will be in for quite a surprise. The only way to make this solution work like a proper Contains function is to escape fileNamestr by doing Regex.Escape(fileNamestr).

Besides, parsing and matching a regex is much more resource-intensive than a simple case-insensitive comparison.

Fabian Bigler

OrdinalIgnoreCase, CurrentCultureIgnoreCase or InvariantCultureIgnoreCase?

Since this is missing, here are some recommendations about when to use which one:

Dos

Use StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.

Use StringComparison.OrdinalIgnoreCase comparisons for increased speed.

Use StringComparison.CurrentCulture-based string operations when displaying the output to the user.

Switch current use of string operations based on the invariant culture to use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase when the comparison is linguistically irrelevant (symbolic, for example).

Use ToUpperInvariant rather than ToLowerInvariant when normalizing strings for comparison.

Don'ts

Use overloads for string operations that don't explicitly or implicitly specify the string comparison mechanism.

Use StringComparison.InvariantCulture -based string operations in most cases; one of the few exceptions would be persisting linguistically meaningful but culturally-agnostic data.

Based on these rules you should use:

string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.[YourDecision]) != -1)
{
    // The string exists in the original
}

where [YourDecision] depends on the recommendations above.

link of source: http://msdn.microsoft.com/en-us/library/ms973919.aspx
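As a sketch of how these rules might translate into code, the two helpers below pick OrdinalIgnoreCase for symbolic data and CurrentCultureIgnoreCase for text the user sees. The helper names and sample strings are assumptions made for illustration:

```csharp
using System;

public static class ComparisonChoice
{
    // Symbolic, non-linguistic data (header names, identifiers, file extensions):
    // OrdinalIgnoreCase is the culture-agnostic default.
    public static bool HasHeader(string rawHeaders, string headerName)
    {
        return rawHeaders.IndexOf(headerName, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Text displayed to or typed by the user: follow the current culture.
    public static bool MatchesUserText(string displayed, string typed)
    {
        return displayed.IndexOf(typed, StringComparison.CurrentCultureIgnoreCase) >= 0;
    }

    public static void Main()
    {
        Console.WriteLine(HasHeader("Content-Type: text/html", "content-type")); // True
        Console.WriteLine(HasHeader("Content-Type: text/html", "accept"));       // False

        // The result of the culture-sensitive variant can vary with the thread's culture.
        Console.WriteLine(MatchesUserText("Annual Report", "report"));
    }
}
```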

what if you know you're always gonna get an english string. which one to use?

@BKSpurgeon I'd use OrdinalIgnoreCase, if case does not matter

Why do we prefer ToUpperInvariant over ToLowerInvariant?

nevermind found out why docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/…

Lav Vishwakarma

These are the easiest solutions.

By IndexOf:

string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
    // contains
}

By changing case:

string title = "STRING";
bool contains = title.ToLower().Contains("string");

By Regex:

Regex.IsMatch(title, "string", RegexOptions.IgnoreCase);

Pradeep Asanka

Simple, and it works:

title.ToLower().Contains("String".ToLower())

and it's slower than most other options

johnnyRose

Just like this:

string s="AbcdEf";
if(s.ToLower().Contains("def"))
{
    Console.WriteLine("yes");
}

This is not culture-specific and may fail for some cases. culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) should be used.

Why avoid string.ToLower() when doing case-insensitive string comparisons? Tl;Dr It's costly because a new string is "manufactured".

serhio

I know this is not C#, but the framework (VB.NET) already has such a function:

Dim str As String = "UPPERlower"
Dim b As Boolean = InStr(str, "UpperLower")

C# variant:

string myString = "Hello World";
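// InStr returns a 1-based position (0 when not found); CompareMethod.Text makes it case-insensitive
bool contains = Microsoft.VisualBasic.Strings.InStr(myString, "world", Microsoft.VisualBasic.CompareMethod.Text) > 0;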

Casey

The InStr method from the VisualBasic assembly is the best if you have a concern about internationalization (or you could reimplement it). Looking at it in dotPeek shows that it accounts not only for caps and lowercase, but also for kana type and full- vs. half-width characters (mostly relevant for Asian languages, although there are full-width versions of the Roman alphabet too). I'm skipping over some details, but check out the private method InternalInStrText:

private static int InternalInStrText(int lStartPos, string sSrc, string sFind)
{
  int num = sSrc == null ? 0 : sSrc.Length;
  if (lStartPos > num || num == 0)
    return -1;
  if (sFind == null || sFind.Length == 0)
    return lStartPos;
  else
    return Utils.GetCultureInfo().CompareInfo.IndexOf(sSrc, sFind, lStartPos, CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth);
}
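If you would rather not reference the VisualBasic assembly, the same option flags can be passed to CompareInfo.IndexOf directly. The following is a minimal sketch of such a reimplementation, not the framework's own code; the class name and the handling of null or empty arguments are assumptions:

```csharp
using System;
using System.Globalization;

public static class WidthAndKanaInsensitiveSearch
{
    // The same flags that the decompiled InternalInStrText passes to IndexOf.
    private const CompareOptions Options =
        CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth;

    public static bool Contains(string source, string value, CultureInfo culture)
    {
        if (string.IsNullOrEmpty(source)) return false; // nothing to search in
        if (string.IsNullOrEmpty(value)) return true;   // an empty value is always found
        return culture.CompareInfo.IndexOf(source, value, Options) >= 0;
    }

    public static void Main()
    {
        Console.WriteLine(Contains("Hello World", "WORLD", CultureInfo.CurrentCulture));

        // Full-width versus half-width Latin letters. Whether this matches depends
        // on the culture data available on the platform.
        Console.WriteLine(Contains("ＡＢＣ", "abc", CultureInfo.CurrentCulture));
    }
}
```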

Andrew D. Bond

You can pass a StringComparison argument to the String.Contains method (available from .NET Core 2.1 onwards):

public bool Contains (string value, StringComparison comparisonType);

Example:

string title = "ASTRINGTOTEST";
title.Contains("string", StringComparison.InvariantCultureIgnoreCase);

yes it is available in .net Standard 2.1 and .Net Core 5.0 docs.microsoft.com/en-us/dotnet/api/… Got fixed as part of - github.com/dotnet/runtime/issues/22198

Peter Mortensen

Use this:

string.Compare("string", "STRING", new System.Globalization.CultureInfo("en-US"), System.Globalization.CompareOptions.IgnoreCase);

The questioner is looking for Contains not Compare.

@DuckMaestro, the accepted answer is implementing Contains with IndexOf. So this approach is equally helpful! The C# code example on this page is using string.Compare(). SharePoint team's choice that is!

TarmoPikaro

This is quite similar to other examples here, but I've decided to simplify the enum to a bool, primarily because the other alternatives are normally not needed. Here is my example:

public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, bool bCaseInsensitive )
    {
        return source.IndexOf(toCheck, bCaseInsensitive ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal) >= 0;
    }
}

And usage is something like:

if( "main String substring".Contains("SUBSTRING", true) )
....

Christian Findlay

Just to build on the answer here, you can create a string extension method to make this a little more user-friendly:

    public static bool ContainsIgnoreCase(this string paragraph, string word)
    {
        return CultureInfo.CurrentCulture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }

Assuming your paragraph and word will always be in en-US

To avoid issues with forcing the culture to en-US, use return CultureInfo.CurrentCulture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0; instead.

johnnyRose

Using a Regex is a straightforward way to do this:

Regex.IsMatch(title, "string", RegexOptions.IgnoreCase);

Your answer is exactly the same as guptat59's but, as was pointed out on his answer, this will match a regular expression, so if the string you're testing contains any special regex characters it will not yield the desired result.

This is a straight up copy of this answer and suffers from the same issues as noted in that answer

Agreed. Study regular expressions

gusmally supports Monica

If you want to check whether one string occurs inside another, there is a simple method for that.

string yourStringForCheck= "abc";
string stringInWhichWeCheck= "Test abc abc";

bool isContained = stringInWhichWeCheck.ToLower().IndexOf(yourStringForCheck.ToLower()) > -1;

The boolean isContained is true if the string is contained and false otherwise.

Udi Y

Similar to previous answers (using an extension method) but with two simple null checks (C# 6.0 and above):

public static bool ContainsIgnoreCase(this string source, string substring)
{
    return source?.IndexOf(substring ?? "", StringComparison.OrdinalIgnoreCase) >= 0;
}

If source is null, return false (via null-propagation operator ?.)

If substring is null, treat as an empty string and return true (via null-coalescing operator ??)

The StringComparison can of course be sent as a parameter if needed.

Ben

The several top-rated answers are all good and correct in their own ways; I write here to add more information, context, and perspective.

For clarity, let us consider that string A contains string B if there is any subsequence of codepoints in A which is equal to B. If we accept this, the problem is reduced to the question of whether two strings are equal.

The question of when strings are equal has been considered in detail for many decades. Much of the present state of knowledge is encapsulated in SQL collations. Unicode normal forms are close to a proper subset of this. But there is more beyond even SQL collations.

For example, in SQL collations, you can be

Strictly binary sensitive - so that different Unicode normalisation forms (e.g. precombined or combining accents) compare differently. For example, é can be represented as either U+00e9 (precombined) or U+0065 U+0301 (e with combining acute accent). Are these the same or different?

Unicode normalised - In this case the above examples would be equal to each other, but not to É or e.

Accent insensitive (e.g. for Spanish, German, Swedish etc. text). In this case U+0065 = U+0065 U+0301 = U+00e9 = é = e

Case and accent insensitive (e.g. for Spanish, German, Swedish etc. text). In this case U+00e9 = U+0065 U+0301 = U+00c9 = U+0045 U+0301 = U+0045 = U+0065 = E = e = É = é

Kanatype sensitive or insensitive, i.e. you can consider Japanese Hiragana and Katakana as equivalent or different. The two syllabaries contain the same number of characters, organised and pronounced in (mostly) the same way, but written differently and used for different purposes. For example katakana are used for loan words or foreign names, but hiragana are used for children's books, pronunciation guides (e.g. rubies), and where there is no kanji for a word (or perhaps where the writer does not know the kanji, or thinks the reader may not know it).

Full-width or half-width sensitive - Japanese encodings include two representations of some characters for historical reasons - they were displayed at different sizes.

Ligatures considered equivalent or not: See https://en.wikipedia.org/wiki/Ligature_(writing) Is æ the same as ae or not? They have different Unicode encodings, as do accented characters, but unlike accented characters they also look different. Which brings us to...

Arabic presentation form equivalence: Arabic writing has a culture of beautiful calligraphy, where particular sequences of adjacent letters have specific representations. Many of these have been encoded in the Unicode standard. I don't fully understand the rules, but they seem to me to be analogous to ligatures.

Other scripts and systems: I have no knowledge whatsoever of Kannada, Malayalam, Sinhala, Thai, Gujarati, Tibetan, or almost all of the tens or hundreds of scripts not mentioned. I assume they have similar issues for the programmer, and given the number of issues mentioned so far and for so few scripts, they probably also have additional issues the programmer ought to consider.
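The first two entries of this list (binary-sensitive versus Unicode-normalised) can be sketched in C# with string.Normalize. This is only an illustration of that one distinction; the class and helper names are made up:

```csharp
using System;
using System.Text;

public static class NormalizationDemo
{
    // Compares two strings ordinally after bringing both to normalization form C.
    public static bool EqualAfterNormalization(string a, string b)
    {
        return string.Equals(
            a.Normalize(NormalizationForm.FormC),
            b.Normalize(NormalizationForm.FormC),
            StringComparison.Ordinal);
    }

    public static void Main()
    {
        string precomposed = "\u00e9"; // é as a single code point
        string combining = "e\u0301";  // e followed by a combining acute accent

        Console.WriteLine(string.Equals(precomposed, combining, StringComparison.Ordinal)); // False
        Console.WriteLine(EqualAfterNormalization(precomposed, combining));                 // True
        Console.WriteLine(EqualAfterNormalization(precomposed, "e"));                       // False
    }
}
```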

That gets us out of the "encoding" weeds.

Now we must enter the "meaning" weeds.

Is Beijing equal to 北京? If not, is Bĕijīng equal to 北京? If not, why not? It is the Pinyin romanisation.

Is Peking equal to 北京? If not, why not? It is the Wade-Giles romanisation.

Is Beijing equal to Peking? If not, why not?

Why are you doing this anyway?

For example, if you want to know if it is possible that two strings (A and B) refer to the same geographical location, or same person, you might want to ask:

Could these strings be either Wade-Giles or Pinyin representations of a set of sequences of Chinese characters? If so, is there any overlap between the corresponding sets?

Could one of these strings be a Cyrillic transcription of a Chinese Character?

Could one of these strings be a Cyrillic transliteration of the Pinyin romanisation?

Could one of these strings be a Cyrillic transliteration of a Pinyin romanisation of a Sinification of an English name?

Clearly these are difficult questions, which don't have firm answers, and in any case, the answer may be different according to the purpose of the question.

To finish with a concrete example.

If you are delivering a letter or parcel, clearly Beijing, Peking, Bĕijīng and 北京 are all equal. For that purpose, they are all equally good. No doubt the Chinese post-offices recognise many other options, such as Pékin in French, Pequim in Portuguese, Bắc Kinh in Vietnamese, and Бээжин in Mongolian.

Words do not have fixed meanings.

Words are tools we use to navigate the world, to accomplish our tasks, and to communicate with other people.

While it looks like it would be helpful if words like equality, Beijing, or meaning had fixed meanings, the sad fact is they do not.

Yet we seem to muddle along somehow.

TL;DR: If you are dealing with questions relating to reality, in all its nebulosity (cloudiness, uncertainty, lack of clear boundaries), there are basically three possible answers to every question:

Probably

Probably not

Maybe

Tamilselvan K

if ("strcmpstring1".IndexOf(Convert.ToString("strcmpstring2"), StringComparison.CurrentCultureIgnoreCase) >= 0)
{
    return true;
}
else
{
    return false;
}

FelixSFD

You can use the string.IndexOf() method. Pass a comparison such as StringComparison.OrdinalIgnoreCase to make the search case insensitive.

Massimiliano Kraus

The trick here is to look for the string, ignoring case, but to keep it exactly the same (with the same case).

 var s="Factory Reset";
 var txt="reset";
 int first = s.IndexOf(txt, StringComparison.InvariantCultureIgnoreCase) + txt.Length;
 var subString = s.Substring(first - txt.Length, txt.Length);

Output is "Reset"
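The snippet above assumes the text is always found. A slightly more defensive sketch returns null when there is no match; note that it swaps in StringComparison.OrdinalIgnoreCase, so that the matched span is guaranteed to have the same length as the search text. The name FindOriginal is hypothetical:

```csharp
using System;

public static class OriginalCaseMatch
{
    // Returns the matched text exactly as it appears in 'source', or null when absent.
    public static string FindOriginal(string source, string value)
    {
        int index = source.IndexOf(value, StringComparison.OrdinalIgnoreCase);
        return index >= 0 ? source.Substring(index, value.Length) : null;
    }

    public static void Main()
    {
        Console.WriteLine(FindOriginal("Factory Reset", "reset"));          // Reset
        Console.WriteLine(FindOriginal("Factory Reset", "reboot") == null); // True
    }
}
```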

Final Heaven

public static class StringExtension
{
    #region Public Methods

    public static bool ExContains(this string fullText, string value)
    {
        return ExIndexOf(fullText, value) > -1;
    }

    public static bool ExEquals(this string text, string textToCompare)
    {
        return text.Equals(textToCompare, StringComparison.OrdinalIgnoreCase);
    }

    public static bool ExHasAllEquals(this string text, params string[] textArgs)
    {
        for (int index = 0; index < textArgs.Length; index++)
            if (ExEquals(text, textArgs[index]) == false) return false;
        return true;
    }

    public static bool ExHasEquals(this string text, params string[] textArgs)
    {
        for (int index = 0; index < textArgs.Length; index++)
            if (ExEquals(text, textArgs[index])) return true;
        return false;
    }

    public static bool ExHasNoEquals(this string text, params string[] textArgs)
    {
        return ExHasEquals(text, textArgs) == false;
    }

    public static bool ExHasNotAllEquals(this string text, params string[] textArgs)
    {
        for (int index = 0; index < textArgs.Length; index++)
            if (ExEquals(text, textArgs[index])) return false;
        return true;
    }

    /// <summary>
    /// Reports the zero-based index of the first occurrence of the specified string
    /// in the current System.String object using StringComparison.OrdinalIgnoreCase.
    /// </summary>
    /// <param name="fullText">
    /// The string to search inside.
    /// </param>
    /// <param name="value">
    /// The string to seek.
    /// </param>
    /// <returns>
    /// The index position of the value parameter if that string is found, or -1 if it
    /// is not. If value is System.String.Empty, the return value is 0.
    /// </returns>
    /// <exception cref="ArgumentNullException">
    /// value is null.
    /// </exception>
    public static int ExIndexOf(this string fullText, string value)
    {
        return fullText.IndexOf(value, StringComparison.OrdinalIgnoreCase);
    }

    public static bool ExNotEquals(this string text, string textToCompare)
    {
        return ExEquals(text, textToCompare) == false;
    }

    #endregion Public Methods
}

Valentin Peta

Based on the existing answers and on the documentation of Contains method I would recommend the creation of the following extension which also takes care of the corner cases:

public static class VStringExtensions 
{
    public static bool Contains(this string source, string toCheck, StringComparison comp) 
    {
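        if (toCheck == null) 
        {
            throw new ArgumentNullException(nameof(toCheck));
        }

        // An empty toCheck is contained in every source, including an empty one,
        // so this check has to come before the empty-source check.
        if (toCheck.Equals(string.Empty)) 
        {
            return true;
        }

        if (source.Equals(string.Empty)) 
        {
            return false;
        }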

        return source.IndexOf(toCheck, comp) >= 0;
    }
}

O Thạnh Ldt

Simple way for newbie:

title.ToLower().Contains("string");//of course "string" is lowercase.

Downvote for just being incorrect. What if title = StRiNg? StRiNg != string and StRiNg != STRING

I was wrong. I edited the answer as follows; it is really simple:
title.ToLower().Contains("string") // of course "string" is lowercase
