String comparison with C# in .NET Framework 4

I recently ran a small analysis of the string comparison methods available in .NET Framework 4. I measured performance with dotTrace and inspected the framework code with Reflector.

String behavior tends to change between framework versions, so the results of this analysis might not hold for earlier or later versions of the .NET Framework, but this is what I have found.

== operator

a == b

The == operator performs a case-sensitive culture-insensitive string comparison by calling the String.Equals static method.

Notes

  • The additional function call to String.Equals might add some overhead, but in my measurements I did not notice a significant difference in performance between the == operator and the String.Equals method.
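
As a small illustration of the behavior described above, here is a minimal sketch. The class and method names (EqualityOperatorDemo, SameText) are made up for this example. One assumption worth stating: == only resolves to the string operator when both operands are statically typed as string; with object-typed operands the compiler falls back to a reference comparison.

```csharp
using System;

public static class EqualityOperatorDemo
{
    // Hypothetical helper: compares with the string == operator (ordinal, case-sensitive).
    public static bool SameText(string a, string b)
    {
        return a == b;
    }

    public static void Main()
    {
        string a = "dog";
        // Built at run time so that it is a different object than the literal.
        string b = new string(new[] { 'd', 'o', 'g' });

        Console.WriteLine(SameText(a, b));          // True: same characters
        Console.WriteLine(SameText(a, "Dog"));      // False: case-sensitive
        Console.WriteLine((object)a == (object)b);  // False: reference comparison on object
        Console.WriteLine(SameText(null, null));    // True: no exception for null operands
    }
}
```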

String.Equals methods

String.Equals(a,b) static method

String.Equals(a, b)

The static Equals method performs the string comparison as follows:

  • Checks if the two references point to the same object (or are both null) and, if so, returns true
  • Checks if either of the strings is null and, if so, returns false
  • Checks if the strings have different lengths and, if so, returns false
  • Calls EqualsHelper, optimized native code.

Notes

  • The String.Equals static method does NOT throw a NullReferenceException if one of the strings is null; it returns false instead (or true if both strings are null).
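
The sequence of checks can be sketched in plain C#. This is an illustrative re-implementation, not the framework source: EqualsSketch is a hypothetical name, and a simple character loop stands in for EqualsHelper. It returns true for identical references, false when exactly one string is null or when the lengths differ, and only then compares the characters.

```csharp
using System;

public static class StaticEqualsSketch
{
    // Illustrative re-implementation of the checks; not the framework source.
    public static bool EqualsSketch(string a, string b)
    {
        if (ReferenceEquals(a, b)) return true;     // same object, or both null
        if (a == null || b == null) return false;   // exactly one is null
        if (a.Length != b.Length) return false;     // different lengths can never be equal

        // Stand-in for EqualsHelper: plain character-by-character comparison.
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(EqualsSketch("dog", "dog") == String.Equals("dog", "dog")); // True
        Console.WriteLine(String.Equals(null, "dog"));   // False, no exception
        Console.WriteLine(String.Equals(null, null));    // True
    }
}
```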

String.Equals(value) instance method

a.Equals(b)

The Equals instance method performs null checks and a reference comparison before the actual string comparison:

  • Checks if the instance string is null and, if so, throws NullReferenceException
  • Checks if the parameter value is null and, if so, returns false
  • Performs a reference comparison using the ReferenceEquals method and, if both references are the same, returns true (regarding the ReferenceEquals method, see later chapter)
  • Checks if the strings have different lengths and, if so, returns false
  • Calls the EqualsHelper, optimized native code.

Notes

  • The String.Equals instance method throws a NullReferenceException if the instance string is null.
  • The String.Equals instance method does NOT throw a NullReferenceException if the string parameter is null; it returns false instead.
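
The difference in null handling between the instance and the static method can be seen in a short sketch. ThrowsOnNullInstance is a hypothetical helper introduced only for this example.

```csharp
using System;

public static class InstanceEqualsDemo
{
    // Hypothetical helper: reports whether calling a.Equals(b) throws NullReferenceException.
    public static bool ThrowsOnNullInstance(string a, string b)
    {
        try
        {
            a.Equals(b);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        string missing = null;
        Console.WriteLine("dog".Equals(missing));                 // False: a null parameter is fine
        Console.WriteLine(ThrowsOnNullInstance(missing, "dog"));  // True: a null instance throws
        Console.WriteLine(String.Equals(missing, "dog"));         // False: the static method is null-safe
    }
}
```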

String.Equals overloads with StringComparison parameter

a.Equals(b, StringComparison.Ordinal)
String.Equals(a, b, StringComparison.Ordinal)

StringComparison enumeration

The StringComparison enumeration specifies the internal behavior of the search and comparison methods.

For best performance, use the StringComparison.Ordinal enumeration value. It compares the numeric (UTF-16) values of the characters one by one and stops as soon as a difference is found.

Performance order from bad to good:

  • CurrentCulture
  • CurrentCultureIgnoreCase
  • InvariantCulture
  • InvariantCultureIgnoreCase
  • OrdinalIgnoreCase
  • Ordinal

How each group works internally:

  • Slow: Culture-dependent string comparison (CurrentCulture, CurrentCultureIgnoreCase, InvariantCulture, InvariantCultureIgnoreCase) uses the CompareInfo.Compare method of the respective culture (CultureInfo.CurrentCulture or CultureInfo.InvariantCulture) to perform the actual comparison.
  • Faster: OrdinalIgnoreCase checks for the same length and calls CompareOrdinalIgnoreCaseHelper if the strings are ASCII; otherwise calls TextInfo.CompareOrdinalIgnoreCaseHelper to do the actual comparison.
  • Fastest: Ordinal checks if the strings have the same length and calls EqualsHelper, optimized native code.

Notes

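  • The default for the Equals methods (and the == operator) is StringComparison.Ordinal. Other methods, such as String.Compare, default to a culture-sensitive comparison.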
  • Microsoft recommends calling an overload of the Equals method that includes a StringComparison parameter whenever possible.
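
A minimal sketch of the explicit overloads follows. EqualsIgnoreCase is a hypothetical helper name. The last line prints the result of a culture-sensitive comparison of two Unicode representations of "é" without stating an expected value, because the outcome depends on the globalization data of the platform.

```csharp
using System;

public static class StringComparisonDemo
{
    // Hypothetical helper: case-insensitive equality without culture rules.
    public static bool EqualsIgnoreCase(string a, string b)
    {
        return String.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string lower = "dog";
        string upper = "DOG";
        Console.WriteLine(lower.Equals(upper, StringComparison.Ordinal));  // False
        Console.WriteLine(EqualsIgnoreCase(lower, upper));                 // True

        string composed = "\u00E9";     // 'é' as a single code unit
        string decomposed = "e\u0301";  // 'e' followed by a combining acute accent
        Console.WriteLine(composed.Equals(decomposed, StringComparison.Ordinal)); // False: different code units

        // Culture-sensitive comparison; the result depends on the platform's globalization data.
        Console.WriteLine(composed.Equals(decomposed, StringComparison.InvariantCulture));
    }
}
```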

String.Compare

String.Compare(a, b) == 0

The String.Compare method uses the CultureInfo.CurrentCulture.CompareInfo.Compare method to perform the actual comparison.

Notes

  • Much slower than Equals; it should only be used for sorting, not for equality testing.
  • Microsoft recommends calling an overload of the Compare method that includes a StringComparison parameter whenever possible.
  • Microsoft recommends not using String.Compare and String.CompareTo to check for equality, but to use Equals instead.
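
The split between sorting and equality testing might look like this in practice. This is a sketch only; AreEqual is a hypothetical helper, and StringComparer.Ordinal is used here so that the sort order does not depend on the current culture.

```csharp
using System;

public static class CompareDemo
{
    // Equality test: prefer Equals with an explicit comparison type.
    public static bool AreEqual(string a, string b)
    {
        return String.Equals(a, b, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string[] animals = { "dog", "cat", "bird" };

        // Sorting is where an ordering comparison belongs.
        Array.Sort(animals, StringComparer.Ordinal);
        Console.WriteLine(String.Join(", ", animals));  // bird, cat, dog

        // Compare returns a sign (negative, zero, positive), not a boolean.
        Console.WriteLine(String.Compare("cat", "dog", StringComparison.Ordinal) < 0);  // True
        Console.WriteLine(AreEqual("cat", "cat"));      // True
    }
}
```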

Reference comparison with ReferenceEquals

ReferenceEquals(a, b)

Reference comparison compares only the memory addresses of the two string objects, not their contents. It is therefore only a reliable equality test when both strings come from the intern pool, a table containing a single reference to each unique literal string.

For a reference comparison to reflect string equality, both strings must be interned. Non-interned strings can be interned by doing the following:

String.IsInterned(someString) ?? String.Intern(someString)

String.IsInterned

The String.IsInterned method returns the reference to the value if the string is found in the intern pool; otherwise null.

String.Intern

The String.Intern method returns the reference to the value if the string is found in the intern pool; otherwise a new reference to the string is added to the intern pool and then returned.

Notes

  • Interned strings take less space in memory because each string exists only once and is referenced everywhere.
  • Interning a new string costs extra time because the string is first searched for in the intern pool and then added if not found.
  • Once a string is interned, it is likely to stay in memory until the runtime terminates.
  • String.Equals performs a reference comparison by calling ReferenceEquals before doing the actual string comparison, so calling String.Equals directly is recommended.
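
The effect of interning on reference comparison can be sketched as follows. Build is a hypothetical helper that creates a fresh string object at run time, so the example does not rely on how string literals are pooled.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: builds a new string object with the same characters.
    public static string Build(string text)
    {
        return new string(text.ToCharArray());
    }

    public static void Main()
    {
        string first = Build("dog");
        string second = Build("dog");

        Console.WriteLine(ReferenceEquals(first, second));  // False: two distinct objects
        Console.WriteLine(String.Equals(first, second));    // True: same characters

        string internedFirst = String.Intern(first);
        string internedSecond = String.Intern(second);
        Console.WriteLine(ReferenceEquals(internedFirst, internedSecond));  // True: one pooled instance

        // A freshly generated value is very unlikely to be in the pool already,
        // so IsInterned is expected to return null here.
        string unique = Guid.NewGuid().ToString();
        Console.WriteLine(String.IsInterned(unique) == null);
    }
}
```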

Related

Chain string comparison

If-else

if (a == "dog") {...} else if (a == "cat") {...}

Switch

switch (a){case "dog":... case "cat":... default:...}

The switch case statement for string comparison behaves internally as follows:

  • In Debug mode:
    • If-else blocks are generated in MSIL. This is a sequential search, so the bottom block takes longest to reach because all preceding conditions are evaluated first.
  • In Release mode:
    • With 6 or fewer case blocks, if-else blocks are generated; this is a sequential search.
    • With 7 or more case blocks, an optimized form is generated: a static dictionary is created to map the strings to numbers, which are then used in an internal switch-case in MSIL.
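
To make the dictionary-based optimization more tangible, here is a hand-written approximation of what such generated code might look like. It is a sketch under assumptions, not the actual compiler output: the names SwitchSketch, CaseIndex and Sound, as well as the case values, are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SwitchSketch
{
    // Built once, similar in spirit to the compiler-generated static dictionary.
    private static readonly Dictionary<string, int> CaseIndex = new Dictionary<string, int>
    {
        { "dog", 0 }, { "cat", 1 }, { "bird", 2 }, { "fish", 3 },
        { "horse", 4 }, { "cow", 5 }, { "sheep", 6 }
    };

    public static string Sound(string animal)
    {
        int index;
        if (animal == null || !CaseIndex.TryGetValue(animal, out index))
        {
            return "unknown";   // corresponds to the default branch
        }

        switch (index)          // integer switch instead of sequential string comparisons
        {
            case 0: return "woof";
            case 1: return "meow";
            case 2: return "tweet";
            case 3: return "blub";
            case 4: return "neigh";
            case 5: return "moo";
            case 6: return "baa";
            default: return "unknown";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Sound("cat"));     // meow
        Console.WriteLine(Sound("dragon"));  // unknown
    }
}
```

The point of the design is that one hash lookup replaces up to seven string comparisons, so the cost no longer grows with the position of the matching case.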

Notes

  • When there are many case blocks, switch is preferred over if-else.

Related

Other string comparison solutions

There are also other ways to compare strings or sub-strings, just to name a few:

  • Using LINQ
  • Using the String.IndexOf() and String.IndexOfAny() methods
  • Using the String.Contains() method
  • Using the Regex.IsMatch() method
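
A brief sketch of these alternatives side by side. ContainsIgnoreCase is a hypothetical helper and the sample text is made up for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SubstringSearchDemo
{
    // Hypothetical helper: case-insensitive "contains" via IndexOf with an explicit comparison type.
    public static bool ContainsIgnoreCase(string text, string value)
    {
        return text.IndexOf(value, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static void Main()
    {
        string text = "The quick brown Fox";
        string[] animals = { "dog", "fox" };

        Console.WriteLine(text.Contains("Fox"));                           // True: ordinal, case-sensitive
        Console.WriteLine(ContainsIgnoreCase(text, "fox"));                // True
        Console.WriteLine(text.IndexOfAny(new[] { 'x', 'z' }));            // 18: position of 'x'
        Console.WriteLine(animals.Any(a => ContainsIgnoreCase(text, a)));  // True (LINQ)
        Console.WriteLine(Regex.IsMatch(text, @"\bfox\b", RegexOptions.IgnoreCase));  // True
    }
}
```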

Other

String.ToLower and String.ToUpper methods

When converting a string to lower or upper-case using these methods, the original string remains unchanged and a new temporary string will be created.
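
The following sketch shows that the original string is left untouched, and how a case-insensitive comparison can avoid the extra allocation altogether. It uses ToLowerInvariant rather than ToLower so that the output does not depend on the current culture; EqualsIgnoreCase is a hypothetical helper.

```csharp
using System;

public static class CaseConversionDemo
{
    // Hypothetical helper: compares without allocating lower-cased copies of the inputs.
    public static bool EqualsIgnoreCase(string a, string b)
    {
        return String.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string original = "Dog";
        string lowered = original.ToLowerInvariant();  // allocates a new string

        Console.WriteLine(original);                            // Dog: the original is unchanged
        Console.WriteLine(lowered);                             // dog
        Console.WriteLine(ReferenceEquals(original, lowered));  // False: a separate object
        Console.WriteLine(EqualsIgnoreCase(original, "DOG"));   // True
    }
}
```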

Although it’s off-topic, I recommend reading the following blog posts regarding the ToUpper() method and the tr-TR culture:

Conclusion

  • Based on the results of my performance measurement tests, I have not noticed any significant differences between the == operator and the String.Equals static and instance methods.
  • The performance test results do, however, show remarkably higher timings for the String.Compare method. This method should not be used to test two strings for equality.
  • Switch-case blocks are recommended when performing chain string comparison with 5 or more comparisons, otherwise if-else can be also used.
