Zoltán Zörgő asked some very good questions above.
I'm going to assume some answers here...
If you treat the two strings as two sets of characters and just want to know which characters appear in one string but not in the other, then:
// Requires: using System.Collections.Generic;
string StringSetDifference(string a, string b)
{
    // Every character that occurs in either string.
    var allChars = new HashSet<char>(a);
    allChars.UnionWith(b);
    // Characters that occur in both strings.
    var bothChars = new HashSet<char>(a);
    bothChars.IntersectWith(b);
    // Keep only the characters found in exactly one of the strings.
    allChars.ExceptWith(bothChars);
    return string.Concat(allChars as IEnumerable<char>);
}
If you want to keep track of the positions of the differences as well, that is a different problem. For example:
The difference between "ABCDEFDG" and "ABCDEF" would be "DG".
The algorithm above would not get this correct because the "D" occurs twice and the HashSet doesn't care about duplicates; it would return just "G".
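One possible way to respect positions and duplicates is to compare the strings against a longest common subsequence (LCS) and report whatever is left over. The following is only a minimal sketch under that assumption, not necessarily what the original poster needs; the class name StringDiffSketch and the method LcsDifference are made up for illustration. It returns the leftover characters of a followed by the leftover characters of b.

```csharp
using System;
using System.Text;

static class StringDiffSketch
{
    // Hypothetical helper: returns the characters of a, then of b, that are
    // not part of one longest common subsequence of the two strings.
    public static string LcsDifference(string a, string b)
    {
        // lcs[i, j] = length of the LCS of the suffixes a[i..] and b[j..].
        var lcs = new int[a.Length + 1, b.Length + 1];
        for (int i = a.Length - 1; i >= 0; i--)
        {
            for (int j = b.Length - 1; j >= 0; j--)
            {
                lcs[i, j] = a[i] == b[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);
            }
        }

        var onlyInA = new StringBuilder();
        var onlyInB = new StringBuilder();
        int x = 0, y = 0;
        while (x < a.Length && y < b.Length)
        {
            if (a[x] == b[y])
            {
                // Shared character: part of the common subsequence, skip it.
                x++;
                y++;
            }
            else if (lcs[x + 1, y] >= lcs[x, y + 1])
            {
                // Dropping a[x] keeps the LCS length, so a[x] is a difference.
                onlyInA.Append(a[x++]);
            }
            else
            {
                onlyInB.Append(b[y++]);
            }
        }

        // Whatever remains in either string after the walk is unmatched.
        onlyInA.Append(a, x, a.Length - x);
        onlyInB.Append(b, y, b.Length - y);
        return onlyInA.ToString() + onlyInB.ToString();
    }
}
```

With the example above, StringDiffSketch.LcsDifference("ABCDEFDG", "ABCDEF") gives "DG". The table costs O(n*m) time and memory, so for long inputs one of the more specialised algorithms from the literature is likely a better fit.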
This is an area of significant research in computer science.
See the book:
Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology by Dan Gusfield (1997), as well as the references provided by Original Griff.