I just read some recommendations on using
std::string s = get_string();
std::string t = another_string();
if( !s.compare(t) )
{
instead of
if( s == t )
{
I'm almost always using the last one because I'm used to it and it feels natural, more readable. I didn't even know that there was a separate comparison function. To be more precise, I thought == would call compare().
What are the differences? In which contexts should one be favored over the other?
I'm considering only the cases where I need to know if a string is the same value as another string.
if(x.compare(y) == 0) <- equals sign, it's equal. IMO using ! only serves to make code unreadable.
compare returns a negative value if s is lower than t and a positive value if s is greater than t, while == returns true/false. Nonzero integers convert to true and 0 converts to false.
This is what the standard has to say about operator==
21.4.8.2 operator== template
Seems like there isn't much of a difference!
std::string::compare() returns an int:
equal to zero if s and t are equal,
less than zero if s is less than t,
greater than zero if s is greater than t.
If you want your first code snippet to be equivalent to the second one, it should actually read:
if (!s.compare(t)) {
// 's' and 't' are equal.
}
The equality operator only tests for equality (hence its name) and returns a bool.
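To make the int-versus-bool point concrete, here is a minimal sketch. The helper names equal_via_compare and equal_via_operator are made up for illustration; only std::string::compare() and operator== come from the standard library.

```cpp
#include <string>

// Hypothetical helper: equality expressed through compare().
// compare() returns an int, so the result has to be tested against zero.
bool equal_via_compare(const std::string& s, const std::string& t) {
    return s.compare(t) == 0;
}

// Hypothetical helper: equality expressed through operator==.
// The operator already returns a bool, so no extra test is needed.
bool equal_via_operator(const std::string& s, const std::string& t) {
    return s == t;
}
```

Both helpers should agree for every pair of strings; the only thing that changes is how much the reader has to decode at the call site.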
To elaborate on the use cases, compare() can be useful if you're interested in how the two strings relate to one another (less or greater) when they happen to be different. PlasmaHH rightfully mentions trees, and it could also be, say, a string insertion algorithm that aims to keep the container sorted, a dichotomic search algorithm for the aforementioned container, and so on.
EDIT: As Steve Jessop points out in the comments, compare() is most useful for quick sort and binary search algorithms. Natural sorts and dichotomic searches can be implemented with only std::less.
(… std::less, which is also a total order in this case) rather than a three-way comparator. compare() is for operations modeled on std::qsort and std::bsearch, as opposed to those modeled on std::sort and std::lower_bound.
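The distinction between a three-way comparator and a less-than ordering can be sketched with two lookups over a sorted vector. The function names find_three_way and find_with_less are hypothetical, and both assume the vector is already sorted with the default std::string ordering.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper in the style of std::bsearch: a hand-written binary
// search that uses the three-way result of compare() at each step.
// Returns the index of key, or -1 if it is absent.
long find_three_way(const std::vector<std::string>& sorted,
                    const std::string& key) {
    std::size_t lo = 0, hi = sorted.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int c = key.compare(sorted[mid]);  // negative, zero, or positive
        if (c == 0) return static_cast<long>(mid);
        if (c < 0) hi = mid; else lo = mid + 1;
    }
    return -1;
}

// Hypothetical helper in the style of std::lower_bound: only a less-than
// ordering is used for the search, followed by one equality test.
long find_with_less(const std::vector<std::string>& sorted,
                    const std::string& key) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key);
    if (it != sorted.end() && *it == key)
        return static_cast<long>(it - sorted.begin());
    return -1;
}
```

Both lookups should return the same index; the first leans on compare(), the second never needs it.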
Internally, string::operator==() is using string::compare(). Please refer to: CPlusPlus - string::operator==()
I wrote a small application to compare the performance, and apparently, if you compile and run your code in a debug build, string::compare() is slightly faster than string::operator==(). However, if you compile and run your code in a release build, both are pretty much the same.
FYI, I ran 1,000,000 iterations to reach this conclusion.
To see why string::compare() is faster in a debug build, I looked at the generated assembly:
DEBUG BUILD
string::operator==()
if (str1 == str2)
00D42A34 lea eax,[str2]
00D42A37 push eax
00D42A38 lea ecx,[str1]
00D42A3B push ecx
00D42A3C call std::operator==<char,std::char_traits<char>,std::allocator<char> > (0D23EECh)
00D42A41 add esp,8
00D42A44 movzx edx,al
00D42A47 test edx,edx
00D42A49 je Algorithm::PerformanceTest::stringComparison_usingEqualOperator1+0C4h (0D42A54h)
string::compare()
if (str1.compare(str2) == 0)
00D424D4 lea eax,[str2]
00D424D7 push eax
00D424D8 lea ecx,[str1]
00D424DB call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::compare (0D23582h)
00D424E0 test eax,eax
00D424E2 jne Algorithm::PerformanceTest::stringComparison_usingCompare1+0BDh (0D424EDh)
You can see that string::operator==() has to perform extra operations (add esp,8 and movzx edx,al).
RELEASE BUILD
string::operator==()
if (str1 == str2)
008533F0 cmp dword ptr [ebp-14h],10h
008533F4 lea eax,[str2]
008533F7 push dword ptr [ebp-18h]
008533FA cmovae eax,dword ptr [str2]
008533FE push eax
008533FF push dword ptr [ebp-30h]
00853402 push ecx
00853403 lea ecx,[str1]
00853406 call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::compare (0853B80h)
string::compare()
if (str1.compare(str2) == 0)
00853830 cmp dword ptr [ebp-14h],10h
00853834 lea eax,[str2]
00853837 push dword ptr [ebp-18h]
0085383A cmovae eax,dword ptr [str2]
0085383E push eax
0085383F push dword ptr [ebp-30h]
00853842 push ecx
00853843 lea ecx,[str1]
00853846 call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::compare (0853B80h)
Both assembly listings are very similar because the compiler optimizes the code.
Finally, in my opinion, the performance gain is negligible, so I would leave it to the developer to decide which one to prefer, as both achieve the same outcome (especially in a release build).
compare has overloads for comparing substrings. If you're comparing whole strings you should just use the == operator (and whether it calls compare or not is pretty much irrelevant).
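As a rough illustration of the substring overloads, the sketch below uses compare(pos, len, str) to test for a prefix or a suffix without building a temporary substring. The helper names has_prefix and has_suffix are invented for this example.

```cpp
#include <string>

// Hypothetical helper: true if s begins with prefix.
// compare(pos, len, str) compares the substring of s starting at pos with
// length len (clamped to the end of s) against str.
bool has_prefix(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: true if s ends with suffix.
// The size check keeps pos within s, since compare() throws
// std::out_of_range when pos is greater than s.size().
bool has_suffix(const std::string& s, const std::string& suffix) {
    if (suffix.size() > s.size()) return false;
    return s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

For whole-string equality none of this is needed, which is the point of the answer above: the overloads earn their keep only when a part of the string is being compared.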
compare() is equivalent to strcmp(). == is simple equality checking. compare() therefore returns an int, == is a boolean.
compare() will return false (well, 0) if the strings are equal.
So don't take exchanging one for the other lightly.
Use whichever makes the code more readable.
If you just want to check string equality, use the == operator. Determining whether two strings are equal is simpler than finding an ordering (which is what compare() gives), so it might be better performance-wise in your case to use the equality operator.
Longer answer: The API provides a method to check for string equality and a method to check string ordering. You want string equality, so use the equality operator (so that your expectations and those of the library implementors align). If performance is important, you might like to test both methods and find the fastest.
One thing that is not covered here is that it depends on whether we compare a string to a C string, a C string to a string, or a string to a string.
A major difference is that when two strings are compared, size equality is checked before doing the comparison, and that makes the == operator faster than compare().
Here is the code as I see it in g++ on Debian 7:
// operator ==
/**
* @brief Test equivalence of two strings.
* @param __lhs First string.
* @param __rhs Second string.
* @return True if @a __lhs.compare(@a __rhs) == 0. False otherwise.
*/
template<typename _CharT, typename _Traits, typename _Alloc>
inline bool
operator==(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
const basic_string<_CharT, _Traits, _Alloc>& __rhs)
{ return __lhs.compare(__rhs) == 0; }
template<typename _CharT>
inline
typename __gnu_cxx::__enable_if<__is_char<_CharT>::__value, bool>::__type
operator==(const basic_string<_CharT>& __lhs,
const basic_string<_CharT>& __rhs)
{ return (__lhs.size() == __rhs.size()
&& !std::char_traits<_CharT>::compare(__lhs.data(), __rhs.data(),
__lhs.size())); }
/**
* @brief Test equivalence of C string and string.
* @param __lhs C string.
* @param __rhs String.
* @return True if @a __rhs.compare(@a __lhs) == 0. False otherwise.
*/
template<typename _CharT, typename _Traits, typename _Alloc>
inline bool
operator==(const _CharT* __lhs,
const basic_string<_CharT, _Traits, _Alloc>& __rhs)
{ return __rhs.compare(__lhs) == 0; }
/**
* @brief Test equivalence of string and C string.
* @param __lhs String.
* @param __rhs C string.
* @return True if @a __lhs.compare(@a __rhs) == 0. False otherwise.
*/
template<typename _CharT, typename _Traits, typename _Alloc>
inline bool
operator==(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
const _CharT* __rhs)
{ return __lhs.compare(__rhs) == 0; }
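The size-first idea in the second overload above can be sketched in isolation. The function name equal_with_size_check is hypothetical; it only mirrors the shape of that overload and is not the library's own code. Whether it is measurably faster than compare() depends on the implementation and the data.

```cpp
#include <string>

// Hypothetical helper mirroring the size-first overload:
// strings of different length can never be equal, so the character
// comparison is skipped entirely in that case.
bool equal_with_size_check(const std::string& a, const std::string& b) {
    return a.size() == b.size()
        && std::char_traits<char>::compare(a.data(), b.data(), a.size()) == 0;
}
```

The short-circuit pays off mostly when the strings share a long common prefix but differ in length, because compare() would have to walk the shared prefix before it can report an ordering.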
Consider two strings s and t, each holding some value. Comparing them with (s == t) returns a boolean (true or false). Comparing them with s.compare(t) returns an int: (i) 0 if s and t are equal; (ii) < 0 if the first mismatching character in s is less than the corresponding character in t, or if all compared characters match and s is shorter than t; (iii) > 0 if the first mismatching character in s is greater than the corresponding character in t, or if all compared characters match and s is longer than t.
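The three cases can be checked with a small sketch. Because compare() only promises the sign of its result, not a specific magnitude, the hypothetical helper sign_of_compare below collapses the result to -1, 0, or +1 before it is inspected.

```cpp
#include <string>

// Hypothetical helper: reduces compare()'s result to -1, 0, or +1.
// Only the sign of compare() is specified, so code should not rely on
// the exact value returned.
int sign_of_compare(const std::string& s, const std::string& t) {
    int c = s.compare(t);
    return (c > 0) - (c < 0);
}
```

For instance, sign_of_compare("abc", "abd") is -1 because 'c' is less than 'd', while sign_of_compare("b", "abcd") is +1 because the first character already decides the order and length never comes into play.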
!s.compare(t) and s == t will return the same value, but the compare function provides more information than s == t, and s == t is more readable when you don't care how the strings differ but only if they differ.