Installation IDs Based on Truncated Hashing

4 Dec 2007, CPOL
Create Semi-Anonymous Installation Fingerprints Using Truncated Hashing and Crypto++

Introduction

This article is a complement to Product Activation Based on RSA Signatures. It presents the reader with a framework for uniquely identifying software installations while maintaining a certain degree of user anonymity. A unique installation identifier is desirable in a Product Activation System, since it can help deter piracy, neutralize a key generator, and develop end user demographics.

To represent an installation, WMI will be used to determine installed hardware. To achieve the anonymity, the samples will employ Truncated Hashing using Crypto++.

This article was tested on Windows Vista, Windows Server 2003, Windows XP SP2, and Windows 2000 SP4. Both standard user accounts and Administrator accounts performed as expected.

This article will visit the following topics:

  • Intellectual Property
  • Background Information
    • Setup API
    • Windows Registry
    • RSA Public Key of the Host
    • WMI
  • Truncated Hashes
  • Machine Information
  • Compiling and Integrating Crypto++
  • SHA-512
  • Troubleshooting WMI
  • Operating System Dependent Behavior
    • Windows 2000
    • Windows XP
    • Windows Server 2003
    • Windows Vista
  • Sample Code
  • Truncation Bits
  • Collisions
  • Device Location in WMI
  • Installation Fingerprints
  • Operating System Upgrade Effects
  • Summary

Intellectual Property

Microsoft owns a patent closely related to this article. While fingerprinting an installation does not appear to be patented, a tolerant fingerprint is. In the United States, the patent is 6,243,468, "Software Anti-piracy System that Adapts to Hardware Upgrades". In Europe, the patent is EP1452940. Microsoft should be contacted for licensing at:

Microsoft Corporation
Legal Department
One Microsoft Way
Redmond, WA 98052

Microsoft also offers email addresses at http://www.microsoft.com/legal/. However, the author did not receive a response from the listed addresses.

Background

While researching methods to inventory a computer system's hardware, two thoughts immediately came to mind: the Windows Registry and Setup API. Each has its own deficiencies which are described below.

In the context of host fingerprinting (without hardware identification), there is an additional alternative - the host's Public RSA Key. However, this also has a shortcoming as described below.

Setup API

For an example of using the Setup API, see Enumerate Installed Devices Using Setup API by A. Raiza. Adapting the methods presented by A. Raiza suffers from the fact that key information is not readily available. For example, details of the Processor are not available directly.

In addition, information such as the size of memory installed at a particular motherboard slot is not available.

Image 1

Windows Registry

Image 2

The Windows Registry suffers from the same limitation as the Setup API - limited information is readily available.

Host's RSA Public Key

The computer's RSA key is used in, for example, setting up the Secure Channel between itself and a Domain Controller. However, it is unclear how often the machine's RSA key pair is refreshed (if at all), and whether standard users will be able to access the key material given security restrictions. The author is aware of S-Channel automatic resets, but is not versed in the particulars of how they are accomplished.

The presumed RSA shortcomings were a disappointment - one can easily create a hash of the RSA Public Key (without the requirement of ASN.1 DER decoding). RSA Keys satisfy the requirements of uniqueness and anonymity.

WMI

WMI allows the programmer to not only extract basic "named" information from the Operating System, but also easily determine additional object properties such as Serial Numbers (when provided by the Operating System). In addition, if a software user installs a new operating system after a hard drive format and then reinstalls a previously activated product, the software will not require an additional reactivation.

For an additional treatment of WMI, see Getting Information from WMI in Visual C++ by Aamir Butt and Making WMI Queries In C++ by Martin Friedrich. Finally, Microsoft's examples of WMI can be found at WMI C++ Application Examples.

Truncated Hashes

Simply stated, Truncated Hashing discards bits from the output of the hash function, resulting in a shorter digest. In theory, if one requires a 128 bit digest, one could choose:

  • MD5, retaining all 128 bits
  • SHA-256, discarding 128 bits
  • SHA-512, discarding 384 bits

This presumes an idealized hash - that is, a hash in which each bit has exactly a 50 percent chance of taking the value 0 or 1. This implies that the truncated 256 bit hash has the same cryptographic security as the 128 bit hash. In reality, no hash is idealized.

The requirements of this article - anonymity - allow use of the abridged digest. Unlike A Deterministic Method of Determining a Document's Modified State, which used a Hash over a CRC to avoid collisions, here a certain number of collisions is desired to provide anonymity. It is desired that two distinct devices occasionally produce the same Truncated Hash, just not too often. Put another way, given a Truncated Hash, one cannot say with certainty, "This is an Abit Motherboard", or "This is a GeForce Video Card".

For an IETF proposed implementation employing truncated hashes, see Host Identity Protocol (HIP) Domain Name System (DNS) Extensions.
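The bit-discarding step described in this section can be sketched in a few lines of portable C++. This is an illustration only, not the article's sample code: truncate_digest is a hypothetical helper, and any digest passed to it in the usage note is a made-up placeholder rather than real SHA-512 output. Whole bytes are kept from the front of the digest, and a trailing partial byte keeps its low-order bits, which mirrors the 0x3F mask used later in Sample 3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: keep `bits` bits of `digest` and discard the rest.
// Whole bytes are copied from the front; if `bits` is not a multiple of 8,
// the next byte contributes only its low-order bits.
std::vector<std::uint8_t> truncate_digest(const std::vector<std::uint8_t>& digest,
                                          std::size_t bits)
{
    // Precondition (assumption of this sketch): 1 <= bits <= digest size in bits.
    assert(bits > 0 && bits <= digest.size() * 8);

    const std::size_t full_bytes = bits / 8;  // bytes kept untouched
    const std::size_t extra_bits = bits % 8;  // bits kept from the following byte

    std::vector<std::uint8_t> out(
        digest.begin(),
        digest.begin() + static_cast<std::ptrdiff_t>(full_bytes));

    if (extra_bits != 0)
    {
        // For extra_bits == 6 the mask is 0x3F (0011 1111).
        const std::uint8_t mask =
            static_cast<std::uint8_t>((1u << extra_bits) - 1u);
        out.push_back(static_cast<std::uint8_t>(digest[full_bytes] & mask));
    }
    return out;
}
```

For example, calling truncate_digest with a digest whose first byte is 0xE7 and bits set to 6 yields the single byte 0x27, because 0xE7 & 0x3F is 0x27.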

Machine Information

The machine information gathered to create an installation fingerprint consists of properties of the BIOS, Processor, Memory, Disk Drive, and Network Adapter.

The reader should investigate expanding the metrics to develop the fingerprint. For example, CD-ROM, DVD, and Sound Card information should be used to expand entropy. The various WMI classes documentation can be found at Win32 Classes.

Downloads

There are four downloads associated with this article. They are presented at the end of the article. The topics of the downloads are:

  • Sample 1 - Basic WMI Program
  • Sample 2 - Retrieving BIOS, Processor, Memory, Disk Drive, and Network Adapter Information
  • Sample 3 - Truncated Hashing
  • Sample 3 Release Build - Release Build provided for convenience

Compiling and Integrating Crypto++

Image 3

Crypto++ can be downloaded from Wei Dai's Crypto++ page. For compilation and integration issues, see Compiling and Integrating Crypto++ into the Microsoft Visual C++ Environment. This article is based upon basic assumptions presented in the previously mentioned article.

For those who are interested in other C++ Cryptographic libraries, please see Peter Gutmann's Cryptlib or Victor Shoup's NTL.

SHA-512 Hash Function

This article will use SHA-512 as the hash function. It is a member of the SHA-2 family of hashes. The SHA-2 hashes are mandated for Federal use by FIPS and recognized by the ISO as an international standard.

For those who would like to use a flat C File and a non-NIST recommendation, the ISO recognizes RIPEMD and WHIRLPOOL (in addition to SHA). Both RIPEMD and WHIRLPOOL are implemented in Crypto++.

Troubleshooting WMI

Microsoft's TechNet has a very good article entitled WMI Isn't Working. Should the reader encounter script failures or WMI Service problems, he or she should consult this article. The TechNet article includes a WMI diagnostic utility.

Operating System Dependent Behavior

Where to introduce the special cases of this article presented a bit of a problem. Though the differences are few, the supporting screen shots quickly made the article sections grow. As such, the errata will be detailed now. In general, each Operating System displayed more information using WMI as the family matured.

In general, no Operating System returned any information of value from WMI's Win32_MotherboardDevice class. The fields that would typically carry useful information (Caption, Description, and Name) simply contained "Motherboard".

Image 4

Windows 2000

Windows 2000 suffered quite a few shortcomings. When a standard user attempted to instantiate the WbemAdministrativeLocator class, COM error 0x80041014 was issued. When attempting to connect to the WMI Service using IWbemLocator, standard users received COM error 0x80041008 (WBEM_E_INVALID_PARAMETER).

Image 5

An Administrator also received 0x80041008 (WBEM_E_INVALID_PARAMETER) when connecting to the WMI Service using the IWbemLocator interface. This leads the author to believe IWbemLocator is broken or being used incorrectly, and that the WMI Service requires a Permissions adjustment to allow standard users to connect using WbemAdministrativeLocator.

Finally, Windows 2000 was very verbose with respect to Network Adapters, returning a total of seven - including RAS Async and multiple WAN Miniport Adapters. The verbosity will be corrected in Sample 3 by refining the WQL statement (by adding a WHERE clause).

Windows XP

Windows XP missed Disk Drive Serial Numbers, and was slightly verbose with Network Adapters.

Windows Server 2003

Server 2003 correctly returned the Network Adapter information, but did not return items such as Hard Disk Serial Number.

Windows Vista

Vista properly returned all information except Network Adapters. The Network Adapter results were off slightly, adding a Teredo Tunneling Pseudo-Adapter.

Operating System Comparison

For comparison, the output of Sample 2 on each Operating System is reproduced below.

Image 6: Windows 2000

Image 7: Windows XP

Image 8: Windows Server 2003

Image 9: Windows Vista

Sample 1

The first sample demonstrates connecting to the WMI Service. It includes calls to CoInitializeEx() and CoInitializeSecurity(), and instantiates IWbemLocator and IWbemServices. MSDN and the previously mentioned Code Project articles document these steps well, so no detail is provided here.

Sample 2

Sample two is an exercise in repetition. For each class to be queried, IEnumWbemClassObject's Next() method is invoked until exhausted. At each invocation, the relevant data (for example, Name or MAC Address) is retrieved and printed to the Console. The simplified code is shown below.

CComPtr< IEnumWbemClassObject > pEnumerator;
HRESULT hr = pService->ExecQuery( CComBSTR( L"WQL" ),
    CComBSTR( L"Select * from Win32_<Class Of Interest>" ),
    WBEM_FLAG_FORWARD_ONLY | WBEM_FLAG_RETURN_IMMEDIATELY,
    NULL, &pEnumerator );

// If ExecQuery failed, pEnumerator is still NULL and the loop is skipped.
for( UINT i = 0; pEnumerator != NULL; i++ )
{
    CComPtr< IWbemClassObject > pObject;
    ULONG uReturn = 0;

    // Fetch one object at a time; uReturn is 0 once the set is exhausted.
    hr = pEnumerator->Next( WBEM_INFINITE, 1, &pObject, &uReturn );
    if( FAILED( hr ) || 0 == uReturn ) { break; }

    wcout << L"Device " << i << endl;

    VARIANT vtProperty;
    VariantInit( &vtProperty );

    hr = pObject->Get( L"Name", 0, &vtProperty, 0, 0 );
    if( SUCCEEDED( hr ) )
    {
        wcout << L" Name: ";
        // A BSTR is a wide string; sending it to cout would print a pointer.
        if( VT_BSTR == vtProperty.vt && NULL != vtProperty.bstrVal )
        {
            wcout << vtProperty.bstrVal << endl;
        }
        else
        {
            wcout << L"Not Available" << endl;
        }
    }

    // Release the BSTR owned by the VARIANT before the next iteration.
    VariantClear( &vtProperty );
}

A typical output from Sample 2 is shown below (taken from Windows Server 2003). The error retrieving the username is due to an issue with the Windows Server 2003 host.

Notice that the properties queried differ between devices. This is because the Operating System is not consistent in its presentation of object data. For example, Name is of interest with respect to BIOS, but not Memory. To determine what is of interest, the reader must write demonstration code. There does not appear to be a single source for this information.

Image 10

Sample 3

The third sample introduces the WMI objects to Truncated Hashing. The changes between Sample 2 and Sample 3 are as follows:

  • Addition of cbHash[ SHA512::DIGEST_SIZE ]
  • Addition of Hash logic
  • Removal of fields which were not of interest
  • Expansion of the WMI Query to include a basic WHERE clause

This sample produces a Truncated Hash of:

  • BIOS Name
  • Processor Speed
  • Memory Bank Size
  • Disk Drive Model
  • Network Adapter MAC Address

Below is the result of running example three on Windows Vista. Note that Memory is now given in bytes as a string (sample two performed a conversion using _wtoi()).

Image 11
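Because the memory size arrives as a string holding a byte count, a 32-bit conversion such as _wtoi() cannot represent modules of 2 GB or more. The sketch below shows one possible wider conversion using the standard std::wcstoull function. It is not taken from the article's samples: bytes_text_to_megabytes is a hypothetical helper, and it assumes the string is a plain decimal byte count.

```cpp
#include <cstdint>
#include <cwchar>
#include <cwctype>
#include <string>

// Hypothetical helper: convert a decimal byte count held in a wide string
// (as returned for the memory size) into whole megabytes.
// Returns false when the text is empty or is not made up of digits only.
bool bytes_text_to_megabytes(const std::wstring& text, std::uint64_t& megabytes)
{
    // Reject empty input, leading whitespace, and signs up front.
    if (text.empty() || !std::iswdigit(static_cast<std::wint_t>(text[0])))
        return false;

    wchar_t* end = nullptr;
    const unsigned long long bytes = std::wcstoull(text.c_str(), &end, 10);

    // Reject trailing characters such as "256MB".
    if (*end != L'\0')
        return false;

    megabytes = bytes / (1024ull * 1024ull);
    return true;
}
```

With this helper, the text "268435456" converts to 256 MB and "2147483648" converts to 2048 MB, a value that does not fit in a signed 32-bit byte count.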

The Truncated Hash is created as follows. Notice that even though the entire 512 bit SHA hash was created, only a small portion was retained by using a bit mask of 0x3F (0011 1111), which discarded all but the low order 6 bits of Byte 0.

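szValue = vtProperty.bstrVal;
std::wcout << (wchar_t*)szValue << std::endl;

// Hash the raw wide-character bytes of the property value.
hash.Update( (PBYTE)(BSTR) szValue,
    szValue.Length() * sizeof( WCHAR ) );
hash.Final( cbHash );   // writes the full digest into cbHash

// cout is a narrow stream, so a narrow literal is used rather than _T().
cout << " Truncated Hash: 0x";

// Keep only the low order 6 bits of Byte 0 and print them as two hex digits.
cout << std::hex << std::uppercase
     << std::setfill( '0' ) << std::setw( 2 )
     << ( cbHash[ 0 ] & 0x3F ) << std::endl;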

The WQL used for filtering adapters follows. Note that AdapterType and "Ethernet 802.3" are fully documented in the WMI Win32_NetworkAdapter class documentation.

hr = pService->ExecQuery( CComBSTR( L"WQL" ),
    CComBSTR( L"Select Name, MACAddress from Win32_NetworkAdapter " \
              L"WHERE AdapterType=\"Ethernet 802.3\"" ),
    WBEM_FLAG_FORWARD_ONLY | WBEM_FLAG_RETURN_IMMEDIATELY,
    NULL, &pEnumNetworkAdapter );

Below is the result of running the sample program on Windows 2000 with the narrower WQL query.

Image 12

Finally, Windows Server 2003 is shown below for completeness.

Image 13

Choice of Truncation Bits

The author chose 6 bits based on background research for this article. Six bits yields 2^6 = 64 distinct values that a message can assume. The reader is encouraged to experiment with different sizes. Since the author does not have a collection of strings to test and develop metrics, he cannot offer empirical data as to the validity of 6 bits over other possible choices.

If the reader chooses to create a Truncated Hash of Processor manufacturer strings, 6 bits will probably not be an appropriate choice. It appears WMI returns the result of the CPUID instruction. In the case of Intel, this string is "GenuineIntel". The author is aware of seven x86 compatible manufacturers: AMD, Cyrix, IBM, IDT, Intel, NEC, and Transmeta. In this case, two bits would probably be a better choice due to the anonymity restriction.

A final example of using fewer bits of the hash would be use of the Operating System as a metric. There are four versions this article investigates: Windows 2000, Windows XP, Windows Server 2003, and Windows Vista. In this case, 1 bit would be an appropriate choice for anonymity since 2^1 = 2.
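The arithmetic behind these choices can be written out as follows. The symbols n (retained bits) and k (number of distinct candidate strings) are notation introduced here for illustration, and the averages assume an idealized hash that spreads inputs evenly, which real device strings may not do.

```latex
\[
  \text{distinct truncated values} = 2^{n}, \qquad 2^{6} = 64
\]
\[
  \text{average inputs per value} = \frac{k}{2^{n}}, \qquad
  \frac{7}{2^{2}} = 1.75 \ \text{(seven CPU vendors, 2 bits)}, \qquad
  \frac{4}{2^{1}} = 2 \ \text{(four Windows versions, 1 bit)}
\]
```

An average above 1 means that, on balance, more than one candidate maps to each truncated value, which is the property the anonymity requirement relies on.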

Collisions

A collision was observed using Truncated Hashes for Processors of Windows Vista (799 MHz) and Windows XP (231 MHz). The overlay is shown below. The reader should also note from above that Windows 2000 (731 MHz) developed a value of 0x1F. The author can assure the reader that the full 512 bit hashes are different - without observing the full hash of either.

Image 14

Another collision appears to exist between Windows Server 2003 and Windows Vista with respect to Disk Drives. However, both have an IBM Deskstar (same Model Number) installed. So the collision is not a true collision per se.

Device Location

At times, it is not readily apparent where a device will lie. For example, a USB wireless network adapter will be returned when querying Network Adapters, but a SCSI Orb Tape Drive will be placed in Disk Drives.

Installation Fingerprint

In the example presented above, a simple concatenation will suffice to create the Identification fingerprint. In the example below (taken from Windows 2000), the string would be 27:1F:11:11:3E:19.

Image 15

Now, suppose a user were to replace the two 256 MB sticks of RAM with 512 MB sticks. The signature would change from 27:1F:11:11:3E:19 to 27:1F:22:22:3E:19. Note that the apparent doubling of 11 to 22 is a coincidence: it is an output of the algorithm, not a literal scalar doubling. For comparison, on Windows Server 2003, 128 MB truncates to 0x30.
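The concatenation step can be sketched as follows. This is a minimal illustration rather than the article's sample code: make_fingerprint is a hypothetical helper that joins already-truncated values into the colon-separated form used in this section.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: join truncated hash values into a colon-separated,
// upper-case hex string such as "27:1F:11:11:3E:19".
std::string make_fingerprint(const std::vector<std::uint8_t>& values)
{
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');

    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (i != 0)
            out << ':';
        // setw applies to the next insertion only, so it is repeated per value.
        // The cast prevents the byte from being printed as a character.
        out << std::setw(2) << static_cast<unsigned>(values[i]);
    }
    return out.str();
}
```

Passing the six values 0x27, 0x1F, 0x11, 0x11, 0x3E, and 0x19 produces the string 27:1F:11:11:3E:19.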

The reader is encouraged to develop a weighting system so that small hardware changes do not require the user to reactivate. For a further discussion of weighting, see Product Activation Based on RSA Signatures.
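One possible shape for such a weighting system is sketched below. Everything in it is an assumption made for illustration: the Component structure, the match_score and still_activated helpers, and the idea of a single numeric threshold are hypothetical, and the companion article may take a different approach. The Intellectual Property section above should be kept in mind before adopting any tolerant scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One fingerprint component: its truncated hash and an illustrative weight.
struct Component
{
    std::uint8_t value;   // truncated hash recorded at activation time
    unsigned     weight;  // hypothetical importance assigned to this device
};

// Sum the weights of the components whose truncated hashes still match.
unsigned match_score(const std::vector<Component>& stored,
                     const std::vector<std::uint8_t>& current)
{
    // Assumption of this sketch: both lists describe the same devices
    // in the same order.
    assert(stored.size() == current.size());

    unsigned score = 0;
    for (std::size_t i = 0; i < stored.size(); ++i)
    {
        if (stored[i].value == current[i])
            score += stored[i].weight;
    }
    return score;
}

// Accept the installation when enough weighted components are unchanged.
bool still_activated(const std::vector<Component>& stored,
                     const std::vector<std::uint8_t>& current,
                     unsigned threshold)
{
    return match_score(stored, current) >= threshold;
}
```

With memory components given a low weight, a RAM upgrade like the one described above would lower the score only slightly, so the installation could remain above the threshold.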

Operating System Upgrade

Below is the Windows 2000 installation after it was upgraded to Windows XP. Note that the only changes performed by the author were an Operating System and hard disk upgrade. The BIOS Name and Network Adapter information have changed, even though the physical devices are the same as in the previous Operating System installation.

Image 16

Summary

This article presented the reader with the foundations for uniquely identifying installations. In addition to keeping the system simple and using the technology in moderation (because personal user information is exchanged), the reader should keep the following key points in mind when implementing a system:

  • A fingerprint can be generated based on the installed hardware
  • Not all fields returned in a WMI query are useful
  • Different versions of Windows populate a WMI query differently
  • Hardware is not always in an obvious WMI device category
  • Truncated Hashes add anonymity by creating collisions
  • Due to truncation, almost any hash choice is appropriate

Downloads

Acknowledgements

  • Wei Dai for Crypto++ and his invaluable help on the Crypto++ mailing list
  • Dr. A. Brooke Stephens who laid my Cryptographic foundations
  • James Snyder (owner of RFID Solutions) for sharing his thoughts
  • Dr. Vasiliy Smirnov (www.DiscoveryBiz.net) for sharing his thoughts

Revisions

  • 12.05.2007 Reclassified Article
  • 12.05.2007 Expanded Windows Vista Information
  • 09.12.2007 Expanded 'Choice of Truncation Bits' Section
  • 06.12.2007 Added Operating System Upgrade Effects
  • 06.12.2007 Added Sample 3 Release Build
  • 06.11.2007 Added Windows XP Fingerprint after Upgrade from Windows 2000
  • 06.10.2007 Expanded Information on Truncation Bits
  • 06.09.2007 Initial Release

Checksums

  • Sample1.zip
    MD5: 9BD62E560E9027224A6EE66C5E149B45
    SHA-1: 17E3ED48125B3DF311BA9719FCEF360276761F0B
  • Sample2.zip
    MD5: 82AD788AC3F413461CB2D8F9218EA05D
    SHA-1: A69C2312FB67BE53F07465C6A6AB7721A04FC7FE
  • Sample3.zip
    MD5: 87FB2486B979DFC58C7B13C229D8ABF3
    SHA-1: 47038167029AD4DABBB3D13C33886052C7BCE811
  • Sample3RelExe.zip
    MD5: 9F7172F914692A6E96206AA701E8E1C2
    SHA-1: 4C7A186959189BBE227159B35FA112FDB40F945E

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)