Comparing Large Bodies of Text with Hash Codes

Welcome to this week’s installment of .NET Tips & Techniques! Each week, award-winning Architect and Lead Programmer Tom Archer demonstrates how to perform a practical .NET programming task.

While most people think of hash codes in relation to security, hash codes are also a very fast means of comparing large text values. Using the standard Windows CryptoAPI can be very cumbersome, but the classes defined in the .NET System::Security::Cryptography namespace make hash codes (and other cryptographic functions) easier and more accessible than ever. In this article, I illustrate just how easy it is to compare two text values in a .NET application using hash codes.

Creating a hash code for a body of text is as simple as deciding which hashing algorithm you wish to use (e.g., MD5 or SHA1), instantiating the appropriate .NET service provider object, and then calling that object’s ComputeHash method. (All hash algorithm classes ultimately derive from the HashAlgorithm class and inherit its ComputeHash method; each derived class supplies its own algorithm by overriding the protected HashCore and HashFinal methods.) Other than that, there’s just the typical conversion between Byte (or Char) arrays and String objects, and you’re done.
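The three steps above (text to bytes, bytes to hash, compare hashes) can be sketched in portable C++ without the .NET classes. This is only an illustration under stated assumptions: the helper names to_bytes, fnv1a64, and same_digest are hypothetical, and the 64-bit FNV-1a function is swapped in for ComputeHash purely to keep the example self-contained. It is not MD5 and is not a cryptographic hash.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Step 1: convert the text into a byte array
// (the role Encoding::ASCII->GetBytes plays in the .NET version).
std::vector<std::uint8_t> to_bytes(const std::string& text) {
    return std::vector<std::uint8_t>(text.begin(), text.end());
}

// Step 2: reduce the bytes to a fixed-size hash value.
// ASSUMPTION: FNV-1a (64-bit) stands in for ComputeHash here.
// It is NOT MD5 and must not be used where security matters.
std::uint64_t fnv1a64(const std::vector<std::uint8_t>& bytes) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (std::uint8_t b : bytes) {
        hash ^= b;                                  // mix in the next byte
        hash *= 1099511628211ULL;                   // multiply by the FNV prime
    }
    return hash;
}

// Step 3: compare the small hash values instead of the full texts.
bool same_digest(const std::string& a, const std::string& b) {
    return fnv1a64(to_bytes(a)) == fnv1a64(to_bytes(b));
}
```

Calling same_digest with two identical strings returns true. Two different strings will almost always return false, although with any hash function a collision between different inputs remains possible, so a match is strong evidence of equality rather than proof.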

Figure 1 contains a screen capture of the demo application included with this article.



Figure 1: Simple C++ Managed Extensions example illustrating the comparison of two text (string) values using hash codes

The application uses the MD5 hash code algorithm to compare two input strings. The two fields below the two input fields are the actual hash codes. Below you’ll find the code used to generate those hash codes and compare the results.

The code first uses the Encoding::ASCII->GetBytes method to convert the String values returned from the input controls into Byte arrays. An MD5CryptoServiceProvider object is then instantiated, and its ComputeHash method is called for each Byte array, returning a second Byte array that contains the hash code for that text value. The hash values are converted to String values with BitConverter::ToString, displayed on the demo dialog, and compared for equality; the result of the comparison is shown in a message box. That’s it: just a few lines of code to compare two text values of virtually any length! (Note that ASCII encoding maps every non-ASCII character to '?', so use an encoding such as Encoding::UTF8 if the text may contain such characters.)

using namespace System::Security::Cryptography;
using namespace System::Text;

...

private: System::Void btnCompare_Click(System::Object *  sender, System::EventArgs *  e)
{
  try
  {
    // Convert the text values into Byte arrays
    Byte ba1[] = Encoding::ASCII->GetBytes(txt1->Text);
    Byte ba2[] = Encoding::ASCII->GetBytes(txt2->Text);
    
    MD5CryptoServiceProvider* md5csp = new MD5CryptoServiceProvider();
    
    // Get the hash values for each text value using ComputeHash
    Byte baHashCode1[] = md5csp->ComputeHash(ba1);
    Byte baHashCode2[] = md5csp->ComputeHash(ba2);
    
    // Convert the two hash code arrays into strings for
    // display and comparison
    txtHash1->Text = BitConverter::ToString(baHashCode1);
    txtHash2->Text = BitConverter::ToString(baHashCode2);
    
    // Display the results of the comparisons of the two hash codes
    MessageBox::Show( 
      String::Format(S"The two values are {0}", 
                     (0 == String::Compare(txtHash1->Text, 
                                           txtHash2->Text) 
                       ? S"the same" : S"different")));
  }
  catch(Exception* ex)
  {
    // Named ex so it does not shadow the handler's EventArgs parameter e
    MessageBox::Show(ex->Message);
  }
  }
}
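The listing compares the hash codes after formatting them with BitConverter::ToString, which renders a byte array as uppercase hexadecimal pairs separated by hyphens. As a rough sketch of what that formatting involves, here is a portable C++ approximation. The function names to_hex_string and digests_equal are hypothetical and not part of .NET. The sketch also shows that the two byte arrays could be compared directly, without the string conversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Format bytes as hyphen-separated uppercase hex pairs, e.g. "7F-2C-4A".
// ASSUMPTION: this mimics the output style of BitConverter::ToString;
// it is an illustration, not the .NET implementation.
std::string to_hex_string(const std::vector<std::uint8_t>& bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) {
            out += '-';                    // separator between byte pairs
        }
        out += digits[bytes[i] >> 4];      // high nibble
        out += digits[bytes[i] & 0x0F];    // low nibble
    }
    return out;
}

// Compare two hash values byte for byte, skipping the string conversion.
bool digests_equal(const std::vector<std::uint8_t>& a,
                   const std::vector<std::uint8_t>& b) {
    return a == b;  // equal only if sizes and all elements match
}
```

In the demo, the string form is needed anyway because the hash codes are displayed in the dialog. When nothing has to be shown to the user, comparing the bytes directly avoids building two temporary strings.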

Download the Code

To download the accompanying source code for this tip, click here.

About the Author

The founder of the Archer Consulting Group (ACG), Tom Archer has been the project lead on three award-winning applications and is a best-selling author of 10 programming books as well as countless magazine and online articles.