I have just one idea. It can work only if the sound is human speech, that is, some words. Also, the set of words should fit in a limited, known vocabulary. I would say this is the only hope you have.
If this is acceptable, don't store any sounds; store words. When you receive a sound, try to recognize it as speech using an available speech recognition engine and, in the case of success, compare the words. Use the assembly "System.Speech", namespace System.Speech.Recognition; it comes with the .NET Framework redistributable package. Please see: http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx.
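To illustrate the idea, here is a minimal sketch, not a finished solution. It assumes both sounds are available as WAV files in a format the installed recognizer accepts, and that a speech recognizer is installed on the machine. The class name, the helper RecognizeWord and the four-word vocabulary are made up for the example; replace the vocabulary with your own.

```csharp
using System;
using System.Speech.Recognition; // add a reference to the System.Speech assembly

static class SpokenWordMatcher
{
    // Hypothetical vocabulary, for illustration only.
    static readonly string[] Vocabulary = { "open", "close", "start", "stop" };

    // Returns the recognized text, or null if nothing from the vocabulary was recognized.
    static string RecognizeWord(string waveFilePath)
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // Restrict recognition to the known vocabulary; the grammar culture
            // must match the culture of the recognizer in use.
            var builder = new GrammarBuilder(new Choices(Vocabulary))
            {
                Culture = engine.RecognizerInfo.Culture
            };
            engine.LoadGrammar(new Grammar(builder));
            engine.SetInputToWaveFile(waveFilePath);

            // Synchronous recognition; the result is null when recognition fails.
            RecognitionResult result = engine.Recognize();
            return result == null ? null : result.Text;
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("Usage: SpokenWordMatcher first.wav second.wav");
            return;
        }

        string first = RecognizeWord(args[0]);
        string second = RecognizeWord(args[1]);

        // Compare the words, not the sounds.
        bool same = first != null
            && string.Equals(first, second, StringComparison.OrdinalIgnoreCase);
        Console.WriteLine(same ? "Same word: " + first : "No match");
    }
}
```

If you need to reject weak guesses, RecognitionResult also has a Confidence property you could check against a threshold of your choice.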
That's all.
If you think you can compare two sound records without a 100% match, you should present your criteria of "closeness" between them. But you don't need to do it. Just forget it. With your present level of understanding of the problem, if you work alone, I don't believe your whole lifetime would be enough to approach this task. Just don't waste your time.
—SA