16,000,304 members
See more:
Today's challenge is back to strings.

Given a random (or not so random) string, generate unique anagrams.

Bonus points: Hook into a spell checking service of your choice (local machine, backed into the OS, remote webservice) to only return anagrams that are actual words.

The trick: it needs to be a fast solution. Think of ways to reduce the number of spell checks needed if you choose to return only actual words.

Last Week

Richard Deeming gets the mug for last week's solution. Contact Sean and you can be a guinea pig for our new mug design.
Posted
Updated 18-Jun-18 1:27am
Richard Deeming 3-Mar-17 10:32am
I get a mug, or I am a mug? :D

And shouldn't that be "baked into", rather than "backed into"?
Maciej Los 4-Mar-17 13:52pm
As to me - baked.
PIEBALDconsult 3-Mar-17 12:47pm
Hmmm... permutations of chars... coupled to a Spell Check Tree... and I already have an Access "database" containing a Scrabble dictionary...
If I'm allowed to use a database with a function based index it could definitely be done in fractions of a second
PIEBALDconsult 4-Mar-17 0:21am
Someone ought to post some examples.

## Solution 4

Inspired by Solution 2 but considerably faster
C#
```class Program
{
private static IEnumerable<string> GetWords()
{
//using a local version of:
//https://raw.githubusercontent.com/dwyl/english-words/master/words.txt

{
string s = String.Empty;
while ((s = sr.ReadLine()) != null)
{
if (s.Length > 0)
{
yield return s;
}
}
}
}

static Char[] ToCharArray(string Source)
{
char[] CharArray = Source.ToLower().ToCharArray();
Array.Sort(CharArray);
return CharArray;
}

static void Main(string[] args)
{
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();

Dictionary<Char[], List<string>> AllWords = new Dictionary<Char[], List<string>>(new CharArrayEqualityComparer());
foreach (string word in GetWords())
{
List<string> WordList;
Char[] key = ToCharArray(word);
if (!AllWords.TryGetValue(key, out WordList))
{
WordList = new List<string>();
}
}
//ILookup<UInt16[], string> AllWords = GetWords().ToLookup(s => ToNumericArray(s), new CharArrayEqualityComparer());

stopWatch.Stop();

Console.WriteLine(string.Format("Lookup for {0} words loaded in {1}ms)", AllWords.Count, stopWatch.Elapsed.TotalMilliseconds));

string CompareWord;
do
{
Console.Write(string.Format("{0}Search Word: ", Environment.NewLine));

stopWatch.Reset();
stopWatch.Start();

if (CompareWord != null)
{
IEnumerable<string> Anagrams = AllWords[ToCharArray(CompareWord)];
stopWatch.Stop();
Console.WriteLine(string.Format("Time elapsed (ms): {0}", stopWatch.Elapsed.TotalMilliseconds));

foreach (string word in Anagrams)
{
Console.WriteLine(word);
}
}
} while (CompareWord != null);
}
}

internal class CharArrayEqualityComparer : EqualityComparer<Char[]>
{
private static readonly EqualityComparer<Char> Comparer = EqualityComparer<Char>.Default;
public override bool Equals(char[] x, char[] y)
{
if (x == null) return false;
if (y == null) return false;

if (x.Length != y.Length)
{
return false;
}
for (int i = 0; i < x.Length; i++)
{
if(!Comparer.Equals(x[i], y[i]))
{
return false;
}
}
return true;
}

public override int GetHashCode(char[] obj)
{
unchecked
{
if (obj == null) return 0;
int hash = 17;
foreach (char Item in obj)
{
hash = (((hash << 5) - hash) ^ Comparer.GetHashCode(Item));
}
return hash;
}
}
}
```

And the results:
```Lookup for 315379 words loaded in 515,3337ms)

Search Word: teaser
Time elapsed (ms): 0,0082
aretes
asteer
easter
eaters
reseat
saeter
seater
staree
teaser

Time elapsed (ms): 0,0082
alters
artels
estral
laster
lastre
rastle
ratels
relast
resalt
salter
slater
staler
stelar
talers```

I suspect most of the measured time is writing out the words to the console.
Changed the code to measure the same way as Graeme is doing

I managed to trim it a bit further by skipping the strings and using int arrays with a homemade EqualityComparer

<edit>Since using a lookup simplifies the problem to, who's having the fastest hash algorithm.
I thought I'd have a look at how to take it a step further by creating combinations of words as anagrams. They could be created fairly easy by using a method similar to:
C#
```public static IEnumerable<IEnumerable<T>> CreateCombinations<T>(this IEnumerable<T> elements, int Ordinal)
{
return Ordinal == 0 ? new[] { new T[0] } :
elements.SelectMany((e, i) =>
elements.Skip(i + 1).CreateCombinations(Ordinal - 1).Select(c => (new[] { e }).Concat(c)));
}
```
The problem is just the time aspect of creating the lookup table. If it takes 1 second to create a lookup for N words it will take N-1 seconds to create a lookup with a combination of just two words, with three words it will take (N-1)*(N-2) seconds and so on. So I decided I'll give it a rest until I come up with a faster way of creating the lookup table</edit>

<edit2>Just to prove there's no difference in lookup performance I created a dropin replacement for the ILookup using a dictionary. As a result the load performance got quite better since it doesn't use linq any more.
I've also trimmed the equalityComparer so it doesn't contain any multiplications anymore. </edit2>

<edit2>Realized that I can use an EqualityComparer<char> inside the EqualityComparer I'm using for the dictionary.
Now I don't need to copy the array</edit2>

v12
Graeme_Grant 5-Mar-17 7:32am
No point including what does not matter... I could trim the timing further but it is already extremely quick. ;)
Graeme_Grant 6-Mar-17 9:23am
I found some time to look at your code and found an error in your timing calculation. The following code only measures the time to the first element and not all elements:
```stopWatch.Start();
IEnumerable<string> Anagrams = AllWords[ToNumericArray(CompareWord)];
stopWatch.Stop();```
You iterate through the remainder after you stop the stopwatch...

To fairly calculate the timing, you need to do the following:
```stopWatch.Start();
var Anagrams = AllWords[ToNumericArray(CompareWord)].ToList();
stopWatch.Stop();```
Now all elements are returned within the timing, not just the first.

There is also a timing calculation error with loading the words from disk:
```List<string> Words = GetWords().ToList(); // words loaded here!

Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();

ILookup<UInt16[], string> AllWords = Words.ToLookup(s => ToNumericArray(s), new UInt16ArrayEqualityComparer());
stopWatch.Stop();
```
Should be:
```Stopwatch stopWatch = new Stopwatch();

stopWatch.Start();
List<string> Words = GetWords().ToList(); // words loaded here!
ILookup<UInt16[], string> AllWords = Words.ToLookup(s => ToNumericArray(s), new UInt16ArrayEqualityComparer());
stopWatch.Stop();```
You may want to recalculate those timings...
Just because something implements IEnumerable doesn't mean it's a lazy loading iterator method.
ILookup is implemented by the Lookup class (https://referencesource.microsoft.com/#System.Core/System/Linq/Parallel/Utils/Lookup.cs,41)
And when you do a lookup against the Lookup :-), it returns an Igrouping. Which is implemented by a Grouping.(https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,7bb231e0604c79e3)
Which is implementing, wait for it, an IList. Internally it's implemented using an array, but that's the case in a List as well.
Anyway, the code is actually just passing a reference after a lookup against a dictionary, just like you do in your code.

Which brings us to the other one which is a plain oops!
I was experimenting with some stuff and forgot to change it back to the previous state. (Which you can find if you compare the versions of the solution)

Graeme_Grant 6-Mar-17 20:15pm
I understand where you are coming from but I did do some research and watched the code work before posting. However, when you do make the change, you can see that there is a difference. Your timings are still not correctly calculated.
Of course there would be a difference if you copy over the elements from an IList to a new List. The fact that you add time if you do unnecessary work doesn't mean anything.
So, explain properly how the code is measuring any more wrong than yours.
We're both passing a reference, from a Dictionary!

## Solution 3

Given a database of words, and a function to sort the characters in a string...

`SELECT [word] FROM [dictionary] WHERE sort([word])=sort([inputstring])`

:laugh:

Edit:

Using SQL Server, I have a list of 216634 words, with their letters pre-sorted. I have not yet added an index. The following completes in less than a second.

SQL
```DECLARE @i VARCHAR(16) = 'teaser'
DECLARE @s VARCHAR(16) = dbo.SortLetters ( @i )
SELECT [Word] FROM [WordList] WHERE [Letters] = @s AND [Word] != @i ORDER BY [Word]```

```ARETES
EASTER
EATERS
REATES
RESEAT
SAETER
SEATER
STEARE```

Edit:

After adding words.txt and words3.txt from the site that others have mentioned
(and ignoring the "words" that are not strictly alphabetic) for a total of 463260 words...

```ARETES
ASTEER
EASTER
EASTRE
EATERS
ERASTE
REATES
RESEAT
RESETA
SAETER
SEATER
STAREE
STEARE
TERESA```

Edit:

I remembered that OpenVMS has a word list, so I extracted that from one of my systems and added it to my master list (now a total of five lists).

It contains 42979 entries, 2973 of which are not on any of the other lists (mostly names, such as Boromir), including these:

```CARDIOVASCULATORY
DJANGOLOGY
DOPPLEGANGER```

And these:

```ARBRITRARY
ATTACTIVE
COWRTIERS
DOCTERS
EXPECIALLY```

(I didn't review them all.)
Sadly, it contains no new anagrams of "teaser".

Edit:

After removing duplicate words from my master list, I have 466224 words.
I see that someone posted anagrams of 'alerts', so here's what my database has:

```Word   HowMany Which
ALTERS 5       31
ARTELS 4       15
ESTRAL 4       15
LASTER 4       15
LASTRE 2       12
RASTLE 2       12
RATELS 4       15
RELAST 2       12
RESALT 2       12
SALTER 5       31
SLATER 5       31
STALER 4       15
STELAR 4       15
TALERS 4       15
TARSEL 1       2```

The HowMany column indicates how many of the input lists contain the word.
The Which column is a bit-mapped value indicating which input lists contain it.

v5
Graeme_Grant 4-Mar-17 11:10am
Are you poking fun at my stats?
PIEBALDconsult 5-Mar-17 0:30am
Do I need to?
PIEBALDconsult 5-Mar-17 19:16pm
Your statistics are so ugly, they have a non-standard deviation.

Your statistics are so dumb that the sigma is negative.

Graeme_Grant 6-Mar-17 1:21am
hehe...
Graeme_Grant 6-Mar-17 9:26am
Wouldn't there be disk latency to slow you down versus in memory?

## Solution 1

### Python

Takes a word from the command-line and returns all anagrams that are words.
The way it works:
1. It uses Python's `itertools` to generate all permutations. This is likely the fastest way to do so.
2. The fastest way to check if words are real, should be to check their existence in a set loaded into memory. Then you don't need to read the file system and you don't need to send a HTTP request, which can take some time. I included a full word list in the source, from /usr/share/dict/words on my Ubuntu virtual machine. However, that file is ~900kB, which is pretty big. For this reason I did not include the 'raw' words in the source, but I zlib-compressed them and base64-encoded this compression, which is 'only' 333kB (still big, but more acceptable - and speed was a concern, memory and source length were not :) ). That string is put into the source code, and on the startup of the program, it decompresses it and created a Python `set`, that potential anagrams are checked against.

This is the code:
Full anagrams code for CodeProject coding challenge · GitHub[^]
Code without the huge Base64 string:
Python
```import sys, base64, zlib, itertools

def main():
input_word = sys.argv[1]
perms = set([''.join(x) for x in itertools.permutations(input_word)])
for w in perms:
if is_word(w):
print(w)

word_compressed_base64 = "some huge thing that I'm not going to copy here; see Gist for full code"

word_set = None

def prepare_words_set():
global word_set
decoded = base64.standard_b64decode(word_compressed_base64)
decompressed = zlib.decompress(decoded).decode('utf-8')
word_set = set(decompressed.split(';'))

def is_word(w):
return w in word_set

prepare_words_set()
main()```

v2
Graeme_Grant 3-Mar-17 11:24am
Solution of this size will slow down page loading as other solutions are posted. So you may want to crop the encoded data for viewing (to a handful of lines) and add a download link or use compile perl online[^] to host the full solution.
Thomas Daniels 3-Mar-17 11:48am
yeah, you're right. I put the full code in a GitHub Gist.

## Solution 2

### C#

Uses a local wordlist (from GitHub - dwyl/english-words: A text file containing 479k English words for fast search auto-completion / autosuggestion.[^]

Words are loaded into a dictionary, indexed by length to help w/ searching.

C#
```using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace anagramGenerate
{
internal class Program
{
private static void Main(string[] args)
{
Dictionary<string, int> words = new Dictionary<string, int>();

//using a local version of:
//https://raw.githubusercontent.com/dwyl/english-words/master/words.txt

{
string s = String.Empty;
while ((s = sr.ReadLine()) != null)
{
if (s.Length > 0)
{
}
}
}

Console.WriteLine("Word you'd like to see anagrams for (" + words.Count + " words loaded): ");

string newWord;

do
{

if (newWord != null)
{
newWord = newWord.ToLower();

int wordLen = newWord.Length;
string w1sort = String.Concat(newWord.OrderBy(w => w));
Console.WriteLine("Anagrams for " + newWord + ":");
foreach (var pair in words)
{
if (pair.Value == wordLen && pair.Key != newWord)
{
if (w1sort == String.Concat(pair.Key.OrderBy(w => w)))
{
Console.WriteLine(pair.Key);
}
}
}
}
} while (newWord != null);
}
}
}```

Screenshot showing some timing. (I'm curious if others think this is good or bad):

### Update 2

Since it seems like everyone is updating their solution (keep in mind this is my first time entering) Below is a new, faster solution:

This uses an int hash code array to do lookups. What do you think?

C#
```using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;

namespace anagramGenerate
{
internal class Program2
{
private static int ssorthash(string str)
{
char[] foo = str.ToArray();
Array.Sort(foo);
return new string(foo).GetHashCode();
}

private static void Main(string[] args)
{
//using a local version of:
//https://raw.githubusercontent.com/dwyl/english-words/master/words.txt
var s1 = Stopwatch.StartNew();

int[] hashes = new int[wordlist.Count()];

for (int i = 0; i <= wordlist.GetUpperBound(0); i++)
{
hashes[i] = ssorthash(wordlist[i]);
}

s1.Stop();
Console.WriteLine("Words loaded in: " + s1.Elapsed.TotalMilliseconds + " ms");
Console.WriteLine("Word you'd like to see anagrams for (" + wordlist.Count() + " words loaded): ");

string newWord;

do
{

if (newWord != null)
{
s1.Reset();
s1.Start();
newWord = newWord.ToLower();
StringBuilder output = new StringBuilder();
int w1hash = ssorthash(newWord);
Console.WriteLine("Anagrams for " + newWord + ":");

for (int x = 0; x <= hashes.GetUpperBound(0); x++)
{
if (hashes[x] == w1hash && wordlist[x] != newWord)
{
output.AppendLine(wordlist[x]);
}
}

s1.Stop();
Console.Write(output.ToString());
Console.WriteLine("Anagrams found in: " + s1.Elapsed.TotalMilliseconds + " ms");
}
} while (newWord != null);
}
}
}```

Some timing:
```Words loaded in: 198.8951 ms
Word you'd like to see anagrams for (354985 words loaded):
lions
Anagrams for lions:
insol
isoln
linos
loins
noils
Anagrams found in: 9.4648 ms
bears
Anagrams for bears:
bares
barse
baser
besra
braes
saber
sabre
serab
Anagrams found in: 7.1672 ms```

v3

## Solution 5

Like some others, I have used a lookup table but with a twist. A `Dictionary` with a specialized grouping key.

The key is a `<Tuple<int, long, long>` made up of length of words, and a pair of base 26 `Long`s. As there are 26 characters to english words, I've split the lowercase alphabet into halfs - a to l (0 to 12) to the lower & m to z (13 to 26) to the upper.

The `Find` method run in milliseconds, almost instantly, with a list of over 350,000 words. Word length does not affect performance.

C#
```using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using ZeroFormatter;

namespace BuildLookupFile
{
static class Program
{
static void Main(string[] args)
{
Console.WriteLine("Anagram Lookup File Builder");
Console.WriteLine("===========================");

var helper = new AnagramsHelper();
var sw = new Stopwatch();

sw.Start();
var elapsed = sw.Elapsed;

Console.WriteLine("* Loaded & processed in {0:N6} seconds",
elapsed.TotalMilliseconds / 1000);
Console.WriteLine("\r\nStatistics");
Console.WriteLine("----------");

Console.WriteLine("* {0:N0} words", helper.count);

Console.WriteLine("\r\nSaving to file...");

sw.Restart();
helper.SaveTable("words.bin");
elapsed = sw.Elapsed;

Console.WriteLine("* Write to disk took in {0:N6} seconds",
elapsed.TotalMilliseconds / 1000);
}
}
public class AnagramsHelper
{
public int count { get; private set; }

private Dictionary<string, List<string>> LookupTable
= new Dictionary<string, List<string>>();

{
var wordKeys = new Dictionary<string, int>();
string word, key;
List<string> wordList = null;

{
while ((word = sr.ReadLine()) != null)
{
if (word.Length > 0 && !word.Contains("'") && !word.Contains("\""))
{
key = word.GetKey();

if (!LookupTable.ContainsKey(key))
{
wordList = new List<string>();
}
else
{
wordList = LookupTable[key];
}

count++;
}
}
}
}

internal void SaveTable(string filename)
{
using (var fs = File.OpenWrite(filename))
LookupTable.Serialize(fs);
}
}

public static class HelperExtensions
{
public static string GetKey(this string word)
{
var chars = word.ToCharArray();
Array.Sort(chars);
return new string(chars);
}

public static void Serialize(this Dictionary<string, List<string>> data, Stream stream)
{
var writer = new BinaryWriter(stream);
writer.Write(ZeroFormatterSerializer.Serialize(data));
writer.Flush();
}
}
}```

VB.NET
```Imports System.IO
Imports System.Runtime.CompilerServices
Imports ZeroFormatter
Module Module1

Sub Main()

Console.WriteLine("Anagram Lookup File Builder")
Console.WriteLine("===========================")

Dim helper = New AnagramsHelper()
Dim sw = New Stopwatch()

sw.Start()
Dim elapsed = sw.Elapsed

Console.WriteLine("* Loaded & processed in {0:N6} seconds", elapsed.TotalMilliseconds / 1000)
Console.WriteLine(vbCr & vbLf & "Statistics")
Console.WriteLine("----------")

Console.WriteLine("* {0:N0} words", helper.count)

Console.WriteLine(vbCr & vbLf & "Saving to file...")

sw.Restart()
helper.SaveTable("words.bin")
elapsed = sw.Elapsed

Console.WriteLine("* Write to disk took in {0:N6} seconds", elapsed.TotalMilliseconds / 1000)
End Sub

End Module

Public Class AnagramsHelper

Public Property count() As Integer
Private LookupTable As New Dictionary(Of String, List(Of String))()

Dim word As String
Dim key As String
Dim wordList As List(Of String)

Using sr As StreamReader = File.OpenText(filename)
Do
If word?.Length > 0 AndAlso
Not word.Contains("'") AndAlso
Not word.Contains("""") Then
key = word.GetKey()
If Not LookupTable.ContainsKey(key) Then
wordList = New List(Of String)()
Else
wordList = LookupTable(key)
End If
count += 1
End If
Loop While word IsNot Nothing
End Using

End Sub

Friend Sub SaveTable(filename As String)
Using fs = File.OpenWrite(filename)
LookupTable.Serialize(fs)
End Using
End Sub

End Class

Public Module HelperExtensions

<Extension>
Public Function GetKey(word As String) As String
Dim chars = word.ToCharArray()
Array.Sort(chars)
Return New String(chars)
End Function

<Extension>
Public Sub Serialize(data As Dictionary(Of String, List(Of String)), stream As Stream)
Dim writer = New BinaryWriter(stream)
writer.Write(ZeroFormatterSerializer.Serialize(data))
writer.Flush()
End Sub

End Module```

And here is th output:
```Ultra-Fast Anagram Generator
============================

Word list supplied by: https://github.com/dwyl/english-words

Statistics
----------
* 351,096 words
* Loaded & processed in 7.4770 seconds
* Largest group of Anagrams: 15
* Longest word in Lookup Table:
- dichlorodiphenyltrichloroethane

Search Word: teaser

FOUND 9 anagrams for "teaser"

aretes
asteer
easter
eaters
reseat
saeter
seater
staree
teaser

Time: 0.000072 seconds

alters
artels
estral
laster
lastre
rastle
ratels
relast
resalt
salter
slater
staler
stelar
talers

Time: 0.000072 seconds```

UPDATE: I've found an improvement in performance through simplification. The upside is that load time is improved by 300%, and response time improved by over 200%.

Here are the revised results:
```Ultra-Fast Anagram Generator
============================

Word list supplied by: https://github.com/dwyl/english-words

Statistics
----------
* 351,096 words
* Loaded & processed in 2.5293 seconds
* Largest group of Anagrams: 15
* Longest word in Lookup Table:
- dichlorodiphenyltrichloroethane

Search Word: teaser

FOUND 9 anagrams for "teaser"

aretes
asteer
easter
eaters
reseat
saeter
seater
staree
teaser

Time: 0.000039 seconds

alters
artels
estral
laster
lastre
rastle
ratels
relast
resalt
salter
slater
staler
stelar
talers

Time: 0.000034 seconds```

UPDATE #2.1: Now that the lookup is lightning fast, the next performance bottleneck to resolve is the loading of the lookup table. PIEBOLDconsult is using a SQL DB, so I decided to pre-build the Lookup Table.

The biggest bottleneck encountered in saving and loading the binary version of the lookup table was the .Net `BinaryFormatter` whick took 61+ seconds (best time) to deserialize the binary stream. So, with a little searching I found ZeroFormatter[^]. Load time is now reduced from 2.5293 seconds to a lightning fast at 0.534043 seconds - over 500% quicker and over 1400% quicker than the orignal!

Lastly, also squeezed beter performance out of the lookup. Now reduced from 0.034ms to 0.013ms or 261% improvement, and over 550% faster than the original!

Here is the code for the building of the binary file:

C#
```using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using ZeroFormatter;

namespace BuildLookupFile
{
static class Program
{
static void Main(string[] args)
{
Console.WriteLine("Anagram Lookup File Builder");
Console.WriteLine("===========================");

var helper = new AnagramsHelper();
var sw = new Stopwatch();

sw.Start();
var elapsed = sw.Elapsed;

Console.WriteLine("* Loaded & processed in {0:N6} seconds",
elapsed.TotalMilliseconds / 1000);
Console.WriteLine("\r\nStatistics");
Console.WriteLine("----------");

Console.WriteLine("* {0:N0} words", helper.count);

Console.WriteLine("\r\nSaving to file...");

sw.Restart();
helper.SaveTable("words.bin");
elapsed = sw.Elapsed;

Console.WriteLine("* Write to disk took in {0:N6} seconds",
elapsed.TotalMilliseconds / 1000);
}
}
public class AnagramsHelper
{
public int count { get; private set; }

private Dictionary<string, List<string>> LookupTable
= new Dictionary<string, List<string>>();

{
var wordKeys = new Dictionary<string, int>();
string word, key;
List<string> wordList = null;

{
while ((word = sr.ReadLine()) != null)
{
if (word.Length > 0 && !word.Contains("'") && !word.Contains("\""))
{
key = word.GetKey();

if (!LookupTable.ContainsKey(key))
{
wordList = new List<string>();
}
else
{
wordList = LookupTable[key];
}

count++;
}
}
}
}

internal void SaveTable(string filename)
{
using (var fs = File.OpenWrite(filename))
LookupTable.Serialize(fs);
}
}

public static class HelperExtensions
{
public static string GetKey(this string word)
{
var chars = word.ToCharArray();
Array.Sort(chars);
return new string(chars);
}

public static void Serialize(this Dictionary<string, List<string>> data, Stream stream)
{
var writer = new BinaryWriter(stream);
writer.Write(ZeroFormatterSerializer.Serialize(data));
writer.Flush();
}
}
}```

VB.NET
```Imports System.IO
Imports System.Runtime.CompilerServices
Imports ZeroFormatter
Module Module1

Sub Main()

Console.WriteLine("Anagram Lookup File Builder")
Console.WriteLine("===========================")

Dim helper = New AnagramsHelper()
Dim sw = New Stopwatch()

sw.Start()
Dim elapsed = sw.Elapsed

Console.WriteLine("* Loaded & processed in {0:N6} seconds", elapsed.TotalMilliseconds / 1000)
Console.WriteLine(vbCr & vbLf & "Statistics")
Console.WriteLine("----------")

Console.WriteLine("* {0:N0} words", helper.count)

Console.WriteLine(vbCr & vbLf & "Saving to file...")

sw.Restart()
helper.SaveTable("words.bin")
elapsed = sw.Elapsed

Console.WriteLine("* Write to disk took in {0:N6} seconds", elapsed.TotalMilliseconds / 1000)
End Sub

End Module

Public Class AnagramsHelper

Public Property count() As Integer
Private LookupTable As New Dictionary(Of String, List(Of String))()

Dim word As String
Dim key As String
Dim wordList As List(Of String)

Using sr As StreamReader = File.OpenText(filename)
Do
If word?.Length > 0 AndAlso
Not word.Contains("'") AndAlso
Not word.Contains("""") Then
key = word.GetKey()
If Not LookupTable.ContainsKey(key) Then
wordList = New List(Of String)()
Else
wordList = LookupTable(key)
End If
count += 1
End If
Loop While word IsNot Nothing
End Using

End Sub

Friend Sub SaveTable(filename As String)
Using fs = File.OpenWrite(filename)
LookupTable.Serialize(fs)
End Using
End Sub

End Class

Public Module HelperExtensions

<Extension>
Public Function GetKey(word As String) As String
Dim chars = word.ToCharArray()
Array.Sort(chars)
Return New String(chars)
End Function

<Extension>
Public Sub Serialize(data As Dictionary(Of String, List(Of String)), stream As Stream)
Dim writer = New BinaryWriter(stream)
writer.Write(ZeroFormatterSerializer.Serialize(data))
writer.Flush()
End Sub

End Module```

Here is the output:
```Anagram Lookup File Builder
===========================

* Loaded & processed in 1.243783 seconds

Statistics
----------
* 351,096 words

Saving to file...
* Serialization & writing to disk took 0.300447 seconds```
Here are the revised results:
```Ultra-Fast Anagram Generator
============================

Word list supplied by: https://github.com/dwyl/english-words

Statistics
----------

* Loaded & processed in 0.534043 seconds

* 351,096 words
* 311,497 anagram groups
* 283,469 unique anagrams
* Largest group of Anagrams: 15
* Longest word in Lookup Table:
- dichlorodiphenyltrichloroethane

Search Word: teaser

FOUND 9 anagrams for "teaser"

aretes
asteer
easter
eaters
reseat
saeter
seater
staree
teaser

Time: 0.000014 seconds

alters
artels
estral
laster
lastre
rastle
ratels
relast
resalt
salter
slater
staler
stelar
talers

Time: 0.000013 seconds```

UPDATE #3: After a conversation with jörgen Andersson on timings, I realied that there was another lookup performance gain overlooked. I was returning the whole collection from the dictionary after I did the key lookup. I did not need to do that, only referencing the lookup list of results as IEnumerable was required as the dictionary's KVP already contained the results. This gave over 900% improvement gain on the last version and over 3600% improvement over the original - now only takes 0.0011 ms!

UPDATE #4: Refactored code for performance. Anagram seach now only takes 0.0007 ms to return vaild words from the dictionary!

C#
```using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using ZeroFormatter;

namespace Anagram
{
static class Program
{
static Stopwatch sw = new Stopwatch();
static TimeSpan elapsed;

static Dictionary<string, List<string>> LookupTable
= new Dictionary<string, List<string>>();

{
{
LookupTable = ZeroFormatterSerializer
.Deserialize<Dictionary<string, List<string>>>
}
}

static string GetKey(string w)
{
var chars = w.ToCharArray();
Array.Sort(chars);
return new string(chars);//.GetHashCode();
}

static IEnumerable<string> Find(string lookupWord)
{
var chars = lookupWord.ToCharArray();
Array.Sort(chars);
var key = new string(chars);
if (LookupTable.ContainsKey(key))
foreach (var item in LookupTable[key])
yield return item;
else
yield break;
}

static void Main()
{
Console.WriteLine("Ultra-Fast Anagram Generator");
Console.WriteLine("============================");

Console.WriteLine("\r\nWord list supplied by: https://github.com/dwyl/english-words");

sw.Start();
elapsed = sw.Elapsed;

ReportStatisics(elapsed);
HandleUserQueries();
}

private static void ReportStatisics(TimeSpan elapsed)
{
Console.WriteLine("\r\nStatistics");
Console.WriteLine("----------");

var sec = elapsed.TotalMilliseconds / 1000;
var ms = elapsed.TotalMilliseconds;
var tw = LookupTable.Sum(x => x.Value.Count);
var tg = LookupTable.Count;
var ua = LookupTable.Where(x => x.Value.Count == 1).Count();
var lg = LookupTable.OrderByDescending(x => x.Value.Count)
.Select(x => x.Value).FirstOrDefault();
var lw = LookupTable.OrderByDescending(x => x.Value[0].Length)
.Select(x => x.Value).FirstOrDefault();

Console.WriteLine(\$"\r\n* Loaded in {sec:N8} seconds or {ms} ms\r\n");
Console.WriteLine(\$"* {tw:N0} words");
Console.WriteLine(\$"* {tg:N0} anagram groups");
Console.WriteLine(\$"* {ua:N0} unique anagrams");
Console.WriteLine(\$"* Largest group of Anagrams: {lg.Count}");
Console.WriteLine(\$"* Longest word{(lw.Count == 1 ? "" : "s")} in Lookup Table:");
foreach (var word in lw) Console.WriteLine(\$"  - {word}");
}

private static void HandleUserQueries()
{
while (true)
{
Console.Write("\r\nSearch Word: ");

if (lookupWord.Length == 0) break;

sw.Restart();
var results = Find(lookupWord);
elapsed = sw.Elapsed;

if (results.Any())
{
int count = 0;
Console.WriteLine("\r\nFOUND:\r\n");
foreach (var item in results)
Console.WriteLine(\$"  {++count,2}: {item}");
Console.WriteLine(\$"\r\nTotal: {count:N0} anagram{(count == 1 ? "" : "s")} for \"{lookupWord}\"");
}
else
{
Console.WriteLine(\$"\r\n{lookupWord} has no anagrams.");
}
var sec = elapsed.TotalMilliseconds / 1000;
var ms = elapsed.TotalMilliseconds;
Console.WriteLine(\$"Time: {sec:N8} seconds or {ms} ms");
}
}
}
}```

VB.NET
```Imports System.IO
Imports ZeroFormatter
Module Module1

Dim sw As New Stopwatch()
Dim elapsed As TimeSpan

Dim LookupTable As New Dictionary(Of String, List(Of String))()

LookupTable = ZeroFormatterSerializer _
.Deserialize(Of Dictionary(Of String, List(Of String))) _
End Using
End Sub

Private Function GetKey(w As String) As String
Dim chars = w.ToCharArray()
Array.Sort(chars)
Return New String(chars)
End Function

Private Iterator Function Find(lookupWord As String) As IEnumerable(Of String)
Dim chars = lookupWord.ToCharArray()
Array.Sort(chars)
Dim key = New String(chars)
If LookupTable.ContainsKey(key) Then
For Each item In LookupTable(key)
Yield item
Next
End If
End Function

Sub Main()

Console.WriteLine("Ultra-Fast Anagram Generator")
Console.WriteLine("============================")

Console.WriteLine("{0}Word list supplied by: https://github.com/dwyl/english-words", vbCrLf)

sw.Start()
elapsed = sw.Elapsed

ReportStatisics(elapsed)
HandleUserQueries()

End Sub

Private Sub ReportStatisics(elapsed As TimeSpan)

Console.WriteLine("{0}Statistics", vbCrLf)
Console.WriteLine("----------")

Dim sec = elapsed.TotalMilliseconds / 1000
Dim ms = elapsed.TotalMilliseconds
Dim tw = LookupTable.Sum(Function(x) x.Value.Count)
Dim tg = LookupTable.Count
Dim ua = LookupTable.Where(Function(x) x.Value.Count = 1).Count()
Dim lg = LookupTable.OrderByDescending(Function(x) x.Value.Count) _
.Select(Function(x) x.Value).FirstOrDefault()
Dim lw = LookupTable.OrderByDescending(Function(x) x.Value(0).Length) _
.[Select](Function(x) x.Value).FirstOrDefault()

Console.WriteLine("{0}* Loaded in {1:N8} seconds or {2} ms", vbCrLf, sec, ms)
Console.WriteLine("* {0:N0} words", tw)
Console.WriteLine("* {0:N0} anagram groups", tg)
Console.WriteLine("* {0:N0} unique anagrams", ua)
Console.WriteLine("* Largest group of Anagrams: {0}", lg.Count)
Console.WriteLine("* Longest word{0} in Lookup Table:", If(lw.Count = 1, "", "s"))

For Each word In lw
Console.WriteLine("  - {0}", word)
Next

End Sub

Private Sub HandleUserQueries()

While True
Console.Write("{0}Search Word: ", vbCrLf)

If lookupWord.Length = 0 Then
Exit While
End If

sw.Restart()
Dim results = Find(lookupWord)
elapsed = sw.Elapsed

If results.Any() Then
Dim count As Integer = 0
Console.WriteLine("{0}FOUND:{0}", vbCrLf)
For Each item In results
count += 1
Console.WriteLine("  {0,2}: {1}", count, item)
Next
Console.WriteLine("{0}Total: {1:N0} anagram{2} for ""{3}""",
vbCrLf, count, If(count = 1, "", "s"), lookupWord)
Else
Console.WriteLine("{0}{1} has no anagrams.", vbCrLf, lookupWord)
End If

Dim sec = elapsed.TotalMilliseconds / 1000
Dim ms = elapsed.TotalMilliseconds
Console.WriteLine("Time: {0:N8} seconds or {1} ms", sec, ms)
End While

End Sub

End Module```

Here are the revised results:
```Ultra-Fast Anagram Generator
============================

Word list supplied by: https://github.com/dwyl/english-words

Statistics
----------

* Loaded & processed in 0.53279680 seconds or 532.7968 ms

* 351,096 words
* 311,497 anagram groups
* 283,469 unique anagrams
* Largest group of Anagrams: 15
* Longest word in Lookup Table:
- dichlorodiphenyltrichloroethane

Search Word: teaser

FOUND:

1: aretes
2: asteer
3: easter
4: eaters
5: reseat
6: saeter
7: seater
8: staree
9: teaser

Total: 9 anagrams for "teaser"
Time: 0.00000070 seconds or 0.0007 ms

FOUND:

2: alters
3: artels
4: estral
5: laster
6: lastre
7: rastle
8: ratels
9: relast
10: resalt
11: salter
12: slater
13: staler
14: stelar
15: talers

Time: 0.00000090 seconds or 0.0009 ms```

v16
Is really arrest an anagram for teaser?
Arrest has two R, while teaser has two E.
Graeme_Grant 3-Mar-17 18:29pm
huh ... need to look at that... minor hiccup...
Graeme_Grant 3-Mar-17 19:38pm
Updated.
You can probably speed up the solution quite a bit by assuming 32 letters in the alphabet and use a bitshift.
Graeme_Grant 5-Mar-17 7:39am
Thanks for the suggestion. I did think about that, but I require the count of the same letter. I am thinking about another way that most likely would and will try when I have time...

## Solution 6

First of all, I use Heap's algorithm[^] to generate all possible permutations of the random letters entered, and I put those permutations into a HashSet.
Then, rather than checking each permutation to see if it is a word, I figured the dictionary needs to be loaded anyway, so why not check if the word being loaded is suitable? As such, I do not load the dictionary into memory at all - I simply echo the words in the dictionary that are in the list of permutations. I used the same words.txt from GitHub - dwyl/english-words: A text file containing 479k English words for fast search auto-completion / autosuggestion.[^] used in Solution 2 for my dictionary.
C#
```class Program {
static HashSet<string> _permutations = new HashSet<string>();
static void GeneratePermutations(int n, char[] letters) {
if (n == 1)
else {
for (int i = 0; i < n - 1; ++i) {
GeneratePermutations(n - 1, letters);
int swapIndex = (n & 1) == 0 ? i : 0;
char c = letters[swapIndex];
letters[swapIndex] = letters[n - 1];
letters[n - 1] = c;
}
GeneratePermutations(n - 1, letters);
}
}

static void ListPermutations() {
Console.WriteLine("\nAll Permutations:");
foreach (string permutation in _permutations)
Console.WriteLine(permutation);
}

static void FindWords() {
Console.WriteLine("\nWords:");
using (StreamReader fp = File.OpenText(Path.Combine(Path.GetDirectoryName(Assembly.GetEntryAssembly().Location), "words.txt"))) {
string word;
while ((word = fp.ReadLine()) != null) {
if (_permutations.Contains(word)) {
Console.WriteLine(word);
}
}
}
}

static void Main(string[] args) {
string letters = null;
if ((args?.Length ?? 0) > 0)
letters = args[0];
char action;
do {
while (string.IsNullOrEmpty(letters)) {
Console.Write("Enter random letters: ");
if (letters.Any(c => !char.IsLetter(c)))
letters = null;
}
_permutations.Clear();
GeneratePermutations(letters.Length, letters.ToCharArray());
do {
Console.WriteLine("1. List all permutations");
Console.WriteLine("2. Find Words");
Console.WriteLine("3. Restart");
Console.WriteLine("0. Exit");
if (action == '1')
ListPermutations();
else if (action == '2')
FindWords();
} while (action != '3' && action != '0');
letters = null;
} while (action != '0');
}
}```

PIEBALDconsult 4-Mar-17 0:53am
With a hashset you lose track of how many of each letter you have, don't you?
Midi_Mick 4-Mar-17 1:16am
No = the hashset contains all possible permutations, but only once each - it removes duplicates caused by the doubling up of letters. For example, if 5 different characters are entered, there are 120 permutations. However, if 2 of those letters are the same, of the 120 actual permutations, only 60 are distinct - each sequence of letters would be generated twice (once for the letter is the 1st position, and once for the letter in the 2nd position). The hashset removes those duplicates.
PIEBALDconsult 4-Mar-17 1:30am
Ah, right, my brain is tired.

## Solution 7

### C++ / Python

Not going for the bonus points here. Instead I did something else: I made a polyglot. This program is both valid C++ and Python. It takes a word from the command-line and gives all unique permutations.

Python syntax-highlighting:
Python
```#define A 0 // \
from itertools import permutations
#define B 0 // \
from sys import argv

#define C 0 // \
for w in set(permutations(argv[1])):
#define D 0 // \
joined = ''.join(w)
#define E 0 // \
print(joined)

#include <string>
#include <iostream>
#include <algorithm>

#define pass int main(int argc, char* argv[])  { std::string input(argv[1]); std::string input_copy(argv[1]); std::cout << input << std::endl; while (std::next_permutation(input.begin(), input.end())) { std::cout << input << std::endl; } while (std::prev_permutation(input_copy.begin(), input_copy.end())) { std::cout << input_copy << std::endl; } }

pass```

C++ syntax-highlighting:
C++
```#define A 0 // \
from itertools import permutations
#define B 0 // \
from sys import argv

#define C 0 // \
for w in set(permutations(argv[1])):
#define D 0 // \
joined = ''.join(w)
#define E 0 // \
print(joined)

#include <string>
#include <iostream>
#include <algorithm>

#define pass int main(int argc, char* argv[])  { std::string input(argv[1]); std::string input_copy(argv[1]); std::cout << input << std::endl; while (std::next_permutation(input.begin(), input.end())) { std::cout << input << std::endl; } while (std::prev_permutation(input_copy.begin(), input_copy.end())) { std::cout << input_copy << std::endl; } }

pass```

The way it works: `#` indicates a comment in Python. This means that we can use `#include` and `#define` just fine in our C++. My original intention was to use #define to make the exact Python code work in C++, but unfortunately the syntax was too different. So I wanted to find a way to comment out all Python code. If you do a `//` comment in C++ and end the line with `\`, the next line is a comment too. But because `//` comments don't work in Python, we have to do a useless `#define` so Python ignores the line with the comment.

Then, the C++ main-method should be ignored as well in Python. I defined `pass` as the whole method. `pass` is a statement in Python that does nothing (it gets used in empty statements for example), which is perfect for this polyglot. C++ pre-processes the 'pass' as the main method, Python ignores it. We have our polyglot!

Graeme_Grant 4-Mar-17 4:58am
There were a lot of complaints about people (specifically me, but others also) submitting more than one solution... Kills creativity...
Thomas Daniels 4-Mar-17 5:00am
Oh, really? I missed those. Was there an official rule concluded about multiple solutions?
Graeme_Grant 4-Mar-17 5:37am
Over the last few weeks challenges... No new rule that I am aware of...
Thomas Daniels 4-Mar-17 5:43am
Hmm, okay. Personally I don't really agree with the complaints so if there's an official discussion somewhere to suggest a consensus / rule change, I'd be happy to participate, but as long as there isn't, I'll keep my second solution.
Graeme_Grant 4-Mar-17 5:57am
I'm with you ...

## Solution 9

PHP
```string=life
string_list=(\$(egrep \$((eval echo \$(for x in \$(seq \${#string}); do printf "{"\$(echo \$string | sed -e 's/\(.\)/\1,/g' | sed -e 's/,\$//')"}"; done)) | perl -e 'chomp(\$line=<>); \$line="^\$line\\$"; \$line =~ s/ /\\$|\^/g;print "(\$line)\n";') /usr/share/dict/words))

sorted_string=\$(echo \$string | grep -o . | sort | tr -d "\n")
for x in \${string_list[@]}; do
if echo \${x} | grep -o . | sort | tr -d "\n" | grep \${sorted_string} > /dev/null 2>&1; then
echo \${x}
fi
done```

ouput:
```feil
fiel
file
lief
life```

v2

## Solution 10

Here is an alternative solution using SQLite as an in-memory DB and runs just as lightning fast as my other solution...
C#
```using System;
using System.Collections.Generic;
using System.Data.SQLite;
using System.IO;

static class SqLiteLookup
{
public static IEnumerable<string> Find(string lookupWord)
{
var chars = lookupWord.ToCharArray();
Array.Sort(chars);

using (var cmd =
new SQLiteCommand(\$"SELECT word FROM {tableName} WHERE '" +
new string(chars) + "' = wordkey", dbConn))
}

static SQLiteConnection dbConn;
static SQLiteTransaction dbTrans;
static string wordFile, dbFile = "words.db",
conn, tableName = "WordLookup",
createCmd = \$"CREATE TABLE {tableName} (wordkey TEXT, word TEXT)",
insertCmd = \$"INSERT INTO {tableName} (wordkey, word) values (@wordkey, @word)";

public static void InitDB(string file)
{
wordFile = file;
CheckAndOpenDb(dbFile);
CheckAndCreateTable();
}

private static void CheckAndOpenDb(string file)
{
conn = \$"Data Source={file};Version=3;DateTimeKind=Utc;" +
"FullUri=file::memory:?cache=shared;";

if (!File.Exists(file))
SQLiteConnection.CreateFile(file);

dbConn = new SQLiteConnection(conn);
dbConn.Open();
}

private static void CheckAndCreateTable()
{
if (!TableExists(tableName))
{
ExecuteCommand(createCmd);
}
}

{
string word, key;

BeginTrans();
{
while ((word = sr.ReadLine()) != null)
{
if (word.Length > 0 &&
!word.Contains("'") &&
!word.Contains("\""))
{
key = word.GetKey();
ExecuteCommand(insertCmd,
new Dictionary<string, object>
{
{ "@wordkey", key },
{ "@word", word }
});
}
}
}
CommitTrans();
}

static bool ExecuteCommand
(string query, Dictionary<string, object> fields = null)
{
bool success = false;
using (SQLiteCommand cmd = Command(query))
{
if (fields != null && fields.Count != 0)
foreach (var key in fields.Keys)

cmd.ExecuteNonQuery();
success = true;
}
return success;
}

static SQLiteCommand Command(string sql)
=> new SQLiteCommand(sql, dbConn);

static bool TableExists(string name)
=> Command(\$"SELECT name FROM sqlite_master WHERE type='table' AND name='{name}'")
.HasRows;

static void BeginTrans()
{
dbTrans = dbConn.BeginTransaction();
}

static void CommitTrans()
{
dbTrans.Commit();
}
}```
Using the same key generator:
C#
```using System;

static class KeyGen
{
public static string GetKey(this string word)
{
var chars = word.ToCharArray();
Array.Sort(chars);
return new string(chars);
}
}```
The code to run it:
C#
```using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace Anagrams
{
static class Program
{
static string filename = "words.txt";
static void Main()
{
var st = new Stopwatch();
var elapsed = default(TimeSpan);
IEnumerable<string> anagrams = null;

Console.WriteLine("Starting SQLite...");
st.Start();
SqLiteLookup.InitDB(filename);
elapsed = st.Elapsed;
st.Stop();
Console.WriteLine(\$"Time: {elapsed.TotalMilliseconds} ms\r\n");

while (true)
{
st.Reset();
Console.Write("\r\nSearch Word: ");

int count = 0;

if (lookupWord.Length == 0) break;

st.Restart();
anagrams = SqLiteLookup.Find(lookupWord);
elapsed = st.Elapsed;

foreach (var anagram in anagrams)
Console.WriteLine(\$"{++count,2}: {anagram}");

Console.WriteLine(\$"\r\nTime: {elapsed.TotalMilliseconds} ms");
Console.WriteLine(\$"{count:N0} anagrams for {lookupWord}");
}
}
}
}```
And word data from the same resource: GitHub - dwyl/english-words[^]

Outputs:
```Starting SQLite...
Time: 193.3068 ms

Search Word: teaser
1: aretes
2: asteer
3: easter
4: eaters
5: reseat
6: saeter
7: seater
8: staree
9: teaser

Time: 0.0003 ms
9 anagrams for teaser```