Is there a way to make the following return true?
string title = "ASTRINGTOTEST";
title.Contains("string");
There doesn't seem to be an overload that allows me to set the case sensitivity. Currently I UPPERCASE them both, but that's just silly (by which I am referring to the i18n issues that come with up- and down casing).
UPDATE
This question is ancient and since then I have realized I asked for a simple answer for a really vast and difficult topic if you care to investigate it fully.
For most cases, in mono-lingual, English code bases this answer will suffice. I'm suspecting because most people coming here fall in this category this is the most popular answer.
This answer however brings up the inherent problem that we can't compare text case insensitive until we know both texts are the same culture and we know what that culture is. This is maybe a less popular answer, but I think it is more correct and that's why I marked it as such.
You could use the String.IndexOf method and pass StringComparison.OrdinalIgnoreCase as the type of search to use:
string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;
Even better is defining a new extension method for string:
public static class StringExtensions
{
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
return source?.IndexOf(toCheck, comp) >= 0;
}
}
Note that the null-propagation operator ?. is available since C# 6.0 (VS 2015); for older versions use
if (source == null) return false;
return source.IndexOf(toCheck, comp) >= 0;
USAGE:
string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
To test if the string paragraph contains the string word (thanks @QuarterMeister)
culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0
Where culture is the instance of CultureInfo describing the language that the text is written in.
This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29 letter-long alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.
Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeic word. (Turks, please correct me if I'm wrong, or suggest a better example)
To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.
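To make the Turkish example concrete, here is a minimal sketch. The names CultureContainsDemo, ContainsIgnoreCase and Run are made up for illustration. The English check should print True; the Turkish result depends on the culture data available to the runtime, and with full culture data one would expect it to print False, since Turkish pairs I with ı and İ with i.

```csharp
using System;
using System.Globalization;

public static class CultureContainsDemo
{
    // Culture-aware, case-insensitive containment test (hypothetical helper).
    public static bool ContainsIgnoreCase(string paragraph, string word, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }

    // Prints the result of the same search under two different cultures.
    public static void Run()
    {
        var english = CultureInfo.GetCultureInfo("en-US");
        var turkish = CultureInfo.GetCultureInfo("tr-TR");

        Console.WriteLine(ContainsIgnoreCase("TIN", "tin", english));
        // Assumption: the runtime has Turkish culture data loaded;
        // otherwise this may behave like the invariant culture.
        Console.WriteLine(ContainsIgnoreCase("TIN", "tin", turkish));
    }
}
```

The culture is passed in explicitly so the caller has to decide which language's rules apply, rather than inheriting whatever the current thread happens to use.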
culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0
? That uses the right culture and is case-insensitive, it doesn't allocate temporary lowercase strings, and it avoids the question of whether converting to lowercase and comparing is always the same as a case-insensitive comparison.
You can use IndexOf() like this:
string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
// The string exists in the original
}
Since 0 (zero) can be an index, you check against -1.
The zero-based index position of value if that string is found, or -1 if it is not. If value is String.Empty, the return value is 0.
Alternative solution using Regex:
bool contains = Regex.IsMatch("StRiNG to search", Regex.Escape("string"), RegexOptions.IgnoreCase);
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.CultureInvariant; for anyone, if it helps.
"."
in "This is a sample string that doesn't contain the search string"
. Or try searching for "(invalid"
, for that matter.
Regex.Escape
could help. Regex still seems unnecessary when IndexOf
/ extension Contains
is simple (and arguably more clear).
.NET Core 2.0+ (including .NET 5.0+)
.NET Core has had a pair of methods to deal with this since version 2.0 :
String.Contains(Char, StringComparison)
String.Contains(String, StringComparison)
Example:
"Test".Contains("test", System.StringComparison.CurrentCultureIgnoreCase);
It is now officially part of the .NET Standard 2.1, and therefore part of all the implementations of the Base Class Library that implement this version of the standard (or a higher one).
You could always just up or downcase the strings first.
string title = "string";
title.ToUpper().Contains("STRING"); // returns true
Oops, just saw that last bit. A case-insensitive compare would probably do the same anyway, and if performance is not an issue, I don't see a problem with creating uppercase copies and comparing those. I could have sworn that I saw a case-insensitive compare once...
One issue with the answer is that it will throw an exception if a string is null. You can add that as a check so it won't:
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
if (string.IsNullOrEmpty(toCheck) || string.IsNullOrEmpty(source))
return true;
return source.IndexOf(toCheck, comp) >= 0;
}
if (string.IsNullOrEmpty(source)) return string.IsNullOrEmpty(toCheck);
StringExtension class is the way forward, I've combined a couple of the posts above to give a complete code example:
public static class StringExtensions
{
/// <summary>
/// Allows case insensitive checks
/// </summary>
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
return source.IndexOf(toCheck, comp) >= 0;
}
}
StringComparison
?
This is clean and simple.
Regex.IsMatch(file, fileNamestr, RegexOptions.IgnoreCase)
If fileNamestr has any special regex characters (e.g. *, +, ., etc.) then you will be in for quite a surprise. The only way to make this solution work like a proper Contains function is to escape fileNamestr by doing Regex.Escape(fileNamestr).
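A small sketch of why the escaping matters. The class RegexContains and its two methods are hypothetical names for illustration; only Regex.IsMatch, Regex.Escape and RegexOptions come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexContains
{
    // Treats value as literal text by escaping regex metacharacters first.
    public static bool ContainsIgnoreCase(string source, string value)
    {
        return Regex.IsMatch(source, Regex.Escape(value),
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // Passes value straight through as a pattern, for comparison only.
    public static bool ContainsUnescaped(string source, string value)
    {
        return Regex.IsMatch(source, value, RegexOptions.IgnoreCase);
    }
}
```

With these, searching "no dots here" for "." is false through ContainsIgnoreCase but true through ContainsUnescaped, because an unescaped dot matches any character. A search term such as "(invalid" makes the unescaped version throw an ArgumentException, since it is not a valid pattern.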
OrdinalIgnoreCase, CurrentCultureIgnoreCase or InvariantCultureIgnoreCase?
Since this is missing, here are some recommendations about when to use which one:
Dos
Use StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
Use StringComparison.OrdinalIgnoreCase comparisons for increased speed.
Use StringComparison.CurrentCulture-based string operations when displaying the output to the user.
Switch current use of string operations based on the invariant culture to use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase when the comparison is linguistically irrelevant (symbolic, for example).
Use ToUpperInvariant rather than ToLowerInvariant when normalizing strings for comparison.
Don'ts
Use overloads for string operations that don't explicitly or implicitly specify the string comparison mechanism.
Use StringComparison.InvariantCulture -based string operations in most cases; one of the few exceptions would be persisting linguistically meaningful but culturally-agnostic data.
Based on these rules you should use:
string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.[YourDecision]) != -1)
{
// The string exists in the original
}
where [YourDecision] depends on the recommendations above.
link of source: http://msdn.microsoft.com/en-us/library/ms973919.aspx
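One way to apply these recommendations is to name the intent at the call site. This is only a sketch: ExplicitComparisons, ContainsToken and ContainsUserText are invented names, and the split between "symbolic" and "user-facing" text is my reading of the guidance above.

```csharp
using System;

public static class ExplicitComparisons
{
    // Symbolic data (keys, header names, file extensions): no linguistic rules wanted.
    public static bool ContainsToken(string source, string token)
    {
        return source.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Text typed or read by the user: follow the current culture's rules.
    public static bool ContainsUserText(string source, string text)
    {
        return source.IndexOf(text, StringComparison.CurrentCultureIgnoreCase) >= 0;
    }
}
```

For example, ExplicitComparisons.ContainsToken("Content-Type: TEXT/HTML", "text/html") is true regardless of the thread's culture, whereas ContainsUserText may give different answers under different cultures by design.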
These are the easiest solutions.
By index of:
string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
// contains
}
By changing case:
string title = "STRING";
bool contains = title.ToLower().Contains("string");
By Regex:
Regex.IsMatch(title, "string", RegexOptions.IgnoreCase);
As simple and works
title.ToLower().Contains("String".ToLower())
Just like this:
string s="AbcdEf";
if(s.ToLower().Contains("def"))
{
Console.WriteLine("yes");
}
I know that this is not the C#, but in the framework (VB.NET) there is already such a function
Dim str As String = "UPPERlower"
Dim b As Boolean = InStr(str, "UpperLower", CompareMethod.Text) > 0
C# variant:
string myString = "Hello World";
bool contains = Microsoft.VisualBasic.Strings.InStr(myString, "world", Microsoft.VisualBasic.CompareMethod.Text) > 0;
The InStr method from the VisualBasic assembly is the best if you have a concern about internationalization (or you could reimplement it). Looking at it in dotPeek shows that not only does it account for caps and lowercase, but also for kana type and full- vs. half-width characters (mostly relevant for Asian languages, although there are full-width versions of the Roman alphabet too). I'm skipping over some details, but check out the private method InternalInStrText:
private static int InternalInStrText(int lStartPos, string sSrc, string sFind)
{
int num = sSrc == null ? 0 : sSrc.Length;
if (lStartPos > num || num == 0)
return -1;
if (sFind == null || sFind.Length == 0)
return lStartPos;
else
return Utils.GetCultureInfo().CompareInfo.IndexOf(sSrc, sFind, lStartPos, CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth);
}
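If you would rather not reference the VisualBasic assembly, the same compare options can be used directly. The following is a rough sketch, not a faithful port: WidthInsensitive and ContainsLoose are hypothetical names, the culture is passed in rather than taken from Utils.GetCultureInfo(), and the start-position handling is dropped.

```csharp
using System;
using System.Globalization;

public static class WidthInsensitive
{
    // Case-, kana-type- and width-insensitive containment test.
    public static bool ContainsLoose(string source, string value, CultureInfo culture)
    {
        // Same edge cases as the decompiled method: empty source never matches,
        // an empty search string always matches a non-empty source.
        if (string.IsNullOrEmpty(source)) return false;
        if (string.IsNullOrEmpty(value)) return true;

        const CompareOptions options =
            CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth;
        return culture.CompareInfo.IndexOf(source, value, options) >= 0;
    }
}
```

With full culture data available, a full-width source such as "ＡＢＣ" would be expected to contain "abc" under these options, though I have not verified that on every runtime configuration.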
You can use the String.Contains overload that takes a StringComparison parameter (available from .NET Core 2.1 and above).
public bool Contains (string value, StringComparison comparisonType);
Example:
string title = "ASTRINGTOTEST";
title.Contains("string", StringComparison.InvariantCultureIgnoreCase);
Use this:
string.Compare("string", "STRING", new System.Globalization.CultureInfo("en-US"), System.Globalization.CompareOptions.IgnoreCase);
Contains
not Compare
.
Contains
with IndexOf
. So this approach is equally helpful! The C# code example on this page is using string.Compare(). SharePoint team's choice that is!
This is quite similar to other examples here, but I've decided to simplify the enum to a bool, primarily because the other alternatives are normally not needed. Here is my example:
public static class StringExtensions
{
public static bool Contains(this string source, string toCheck, bool bCaseInsensitive )
{
return source.IndexOf(toCheck, bCaseInsensitive ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal) >= 0;
}
}
And usage is something like:
if( "main String substring".Contains("SUBSTRING", true) )
....
Just to build on the answer here, you can create a string extension method to make this a little more user-friendly:
public static bool ContainsIgnoreCase(this string paragraph, string word)
{
return CultureInfo.CurrentCulture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
}
return CultureInfo.CurrentCulture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
instead.
Using a RegEx is a straightforward way to do this:
Regex.IsMatch(title, "string", RegexOptions.IgnoreCase);
if you want to check if your passed string is in string then there is a simple method for that.
string yourStringForCheck= "abc";
string stringInWhichWeCheck= "Test abc abc";
bool isContained = stringInWhichWeCheck.ToLower().IndexOf(yourStringForCheck.ToLower()) > -1;
This boolean value will return if the string is contained or not
Similar to previous answers (using an extension method) but with two simple null checks (C# 6.0 and above):
public static bool ContainsIgnoreCase(this string source, string substring)
{
return source?.IndexOf(substring ?? "", StringComparison.OrdinalIgnoreCase) >= 0;
}
If source is null, return false (via null-propagation operator ?.)
If substring is null, treat as an empty string and return true (via null-coalescing operator ??)
The StringComparison can of course be sent as a parameter if needed.
The top-rated several answers are all good and correct in their own ways, I write here to add more information, context, and perspective.
For clarity, let us consider that string A contains string B if there is any contiguous sequence of codepoints in A which is equal to B. If we accept this, the problem is reduced to the question of whether two strings are equal.
The question of when strings are equal has been considered in detail for many decades. Much of the present state of knowledge is encapsulated in SQL collations. Unicode normal forms are close to a proper subset of this. But there is more beyond even SQL collations.
For example, in SQL collations, you can be
Strictly binary sensitive - so that different Unicode normalisation forms (e.g. precombined or combining accents) compare differently. For example, é can be represented as either U+00e9 (precombined) or U+0065 U+0301 (e with combining acute accent). Are these the same or different?
Unicode normalised - In this case the above examples would be equal to each other, but not to É or e.
Accent insensitive (for e.g. Spanish, German, Swedish etc. text). In this case U+0065 = U+0065 U+0301 = U+00e9 = é = e
Case and accent insensitive (for e.g. Spanish, German, Swedish etc. text). In this case U+00e9 = U+0065 U+0301 = U+00c9 = U+0045 U+0301 = U+0045 = U+0065 = E = e = É = é
Kanatype sensitive or insensitive, i.e. you can consider Japanese Hiragana and Katakana as equivalent or different. The two syllabaries contain the same number of characters, organised and pronounced in (mostly) the same way, but written differently and used for different purposes. For example katakana are used for loan words or foreign names, but hiragana are used for children's books, pronunciation guides (e.g. rubies), and where there is no kanji for a word (or perhaps where the writer does not know the kanji, or thinks the reader may not know it).
Full-width or half-width sensitive - Japanese encodings include two representations of some characters for historical reasons - they were displayed at different sizes.
Ligatures considered equivalent or not: See https://en.wikipedia.org/wiki/Ligature_(writing) Is æ the same as ae or not? They have different Unicode encodings, as do accented characters, but unlike accented characters they also look different. Which brings us to...
Arabic presentation form equivalence Arabic writing has a culture of beautiful calligraphy, where particular sequences of adjacent letters have specific representations. Many of these have been encoded in the Unicode standard. I don't fully understand the rules, but they seem to me to be analogous to ligatures.
Other scripts and systems: I have no knowledge whatsoever of Kannada, Malayalam, Sinhala, Thai, Gujarati, Tibetan, or almost all of the tens or hundreds of scripts not mentioned. I assume they have similar issues for the programmer, and given the number of issues mentioned so far and for so few scripts, they probably also have additional issues the programmer ought to consider.
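The first few levels in the list above map onto existing .NET APIs. Here is a hedged sketch, assuming the runtime has full globalization support; EquivalenceLevels and its method names are invented for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class EquivalenceLevels
{
    // Strictly binary: precombined and combining forms differ.
    public static bool BinaryEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // Unicode normalised: both sides are brought to form C before a binary compare.
    public static bool NormalizedEquals(string a, string b)
    {
        return string.Equals(a.Normalize(NormalizationForm.FormC),
                             b.Normalize(NormalizationForm.FormC),
                             StringComparison.Ordinal);
    }

    // Case and accent insensitive under the given culture's rules.
    public static bool CaseAndAccentInsensitiveEquals(string a, string b, CultureInfo culture)
    {
        const CompareOptions options = CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace;
        return culture.CompareInfo.Compare(a, b, options) == 0;
    }
}
```

For instance, BinaryEquals("\u00e9", "e\u0301") is false while NormalizedEquals("\u00e9", "e\u0301") is true. The kana, width and ligature levels have no such simple switch beyond CompareOptions.IgnoreKanaType and CompareOptions.IgnoreWidth.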
That gets us out of the "encoding" weeds.
Now we must enter the "meaning" weeds.
Is Beijing equal to 北京? If not, is Běijīng equal to 北京? If not, why not? It is the Pinyin romanisation.
Is Peking equal to 北京? If not, why not? It is the Wade-Giles romanisation.
Is Beijing equal to Peking? If not, why not?
Why are you doing this anyway?
For example, if you want to know if it is possible that two strings (A and B) refer to the same geographical location, or same person, you might want to ask:
Could these strings be either Wade-Giles or Pinyin representations of a set of sequences of Chinese characters? If so, is there any overlap between the corresponding sets?
Could one of these strings be a Cyrillic transcription of a Chinese Character?
Could one of these strings be a Cyrillic transliteration of the Pinyin romanisation?
Could one of these strings be a Cyrillic transliteration of a Pinyin romanisation of a Sinification of an English name?
Clearly these are difficult questions, which don't have firm answers, and in any case, the answer may be different according to the purpose of the question.
To finish with a concrete example.
If you are delivering a letter or parcel, clearly Beijing, Peking, Běijīng and 北京 are all equal. For that purpose, they are all equally good. No doubt the Chinese post-offices recognise many other options, such as Pékin in French, Pequim in Portuguese, Bắc Kinh in Vietnamese, and Бээжин in Mongolian.
Words do not have fixed meanings.
Words are tools we use to navigate the world, to accomplish our tasks, and to communicate with other people.
While it looks like it would be helpful if words like equality
, Beijing
, or meaning
had fixed meanings, the sad fact is they do not.
Yet we seem to muddle along somehow.
TL;DR: If you are dealing with questions relating to reality, in all its nebulosity (cloudiness, uncertainty, lack of clear boundaries), there are basically three possible answers to every question:
Probably
Probably not
Maybe
return "strcmpstring1".IndexOf("strcmpstring2", StringComparison.CurrentCultureIgnoreCase) >= 0;
You can use the string.IndexOf() function. Passing StringComparison.CurrentCultureIgnoreCase makes the search case insensitive.
The trick here is to look for the string, ignoring case, but to keep it exactly the same (with the same case).
var s="Factory Reset";
var txt="reset";
int first = s.IndexOf(txt, StringComparison.InvariantCultureIgnoreCase);
// IndexOf returns -1 when txt is not found, so guard before calling Substring
var subString = first >= 0 ? s.Substring(first, txt.Length) : null;
Output is "Reset"
public static class StringExtension
{
#region Public Methods
public static bool ExContains(this string fullText, string value)
{
return ExIndexOf(fullText, value) > -1;
}
public static bool ExEquals(this string text, string textToCompare)
{
return text.Equals(textToCompare, StringComparison.OrdinalIgnoreCase);
}
public static bool ExHasAllEquals(this string text, params string[] textArgs)
{
for (int index = 0; index < textArgs.Length; index++)
if (ExEquals(text, textArgs[index]) == false) return false;
return true;
}
public static bool ExHasEquals(this string text, params string[] textArgs)
{
for (int index = 0; index < textArgs.Length; index++)
if (ExEquals(text, textArgs[index])) return true;
return false;
}
public static bool ExHasNoEquals(this string text, params string[] textArgs)
{
return ExHasEquals(text, textArgs) == false;
}
public static bool ExHasNotAllEquals(this string text, params string[] textArgs)
{
for (int index = 0; index < textArgs.Length; index++)
if (ExEquals(text, textArgs[index])) return false;
return true;
}
/// <summary>
/// Reports the zero-based index of the first occurrence of the specified string
/// in the current System.String object using StringComparison.OrdinalIgnoreCase.
/// </summary>
/// <param name="fullText">
/// The string to search inside.
/// </param>
/// <param name="value">
/// The string to seek.
/// </param>
/// <returns>
/// The index position of the value parameter if that string is found, or -1 if it
/// is not. If value is System.String.Empty, the return value is 0.
/// </returns>
/// <exception cref="ArgumentNullException">
/// fullText or value is null.
/// </exception>
public static int ExIndexOf(this string fullText, string value)
{
return fullText.IndexOf(value, StringComparison.OrdinalIgnoreCase);
}
public static bool ExNotEquals(this string text, string textToCompare)
{
return ExEquals(text, textToCompare) == false;
}
#endregion Public Methods
}
Based on the existing answers and on the documentation of Contains method I would recommend the creation of the following extension which also takes care of the corner cases:
public static class VStringExtensions
{
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
if (toCheck == null)
{
throw new ArgumentNullException(nameof(toCheck));
}
if (toCheck.Equals(string.Empty))
{
// Matches String.Contains: every string, including the empty one, contains ""
return true;
}
if (source.Equals(string.Empty))
{
return false;
}
return source.IndexOf(toCheck, comp) >= 0;
}
}
Simple way for newbie:
title.ToLower().Contains("string");//of course "string" is lowercase.
paragraph.ToLower(culture).Contains(word.ToLower(culture)) with CultureInfo.InvariantCulture and it doesn't solve any localisation issues. Why over complicate things? stackoverflow.com/a/15464440/284795
ToLower version includes 2 allocations which are unnecessary in a comparison / search operation. Why needlessly allocate in a scenario that doesn't require it?
string is an IEnumerable<char> hence you can't use it to find substrings
string.IndexOf(string) is to use the current culture, while the default for string.Contains(string) is to use the ordinal comparer. As we know, the former can be changed by picking a longer overload, while the latter cannot be changed. A consequence of this inconsistency is the following code sample:
Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
string self = "Waldstrasse";
string value = "straße";
Console.WriteLine(self.Contains(value)); /* False */
Console.WriteLine(self.IndexOf(value) >= 0); /* True */