Thank you for clarifying the question. In this case, you really do need to implement some kind of search tree:
http://en.wikipedia.org/wiki/Search_tree,
http://www.drdobbs.com/database/ternary-search-trees/184410528.
For space efficiency and simplicity, I would recommend a ternary search tree or a binary search tree:
http://en.wikipedia.org/wiki/Ternary_search_tree,
http://en.wikipedia.org/wiki/Binary_search_tree.
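To make the ternary search tree idea more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a definitive implementation: the names `TSTNode`, `tst_insert` and `tst_contains` are hypothetical, the tree is not balanced, and empty strings are simply not stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TSTNode:
    char: str                        # the single character held by this node
    is_end: bool = False             # True if a stored word ends at this node
    lo: Optional["TSTNode"] = None   # subtree for characters smaller than `char`
    eq: Optional["TSTNode"] = None   # subtree for the next character of the word
    hi: Optional["TSTNode"] = None   # subtree for characters greater than `char`


def tst_insert(node, word, i=0):
    """Insert a non-empty `word`; return the (possibly new) subtree root."""
    if not word:
        raise ValueError("empty strings are not stored in this sketch")
    ch = word[i]
    if node is None:
        node = TSTNode(ch)
    if ch < node.char:
        node.lo = tst_insert(node.lo, word, i)
    elif ch > node.char:
        node.hi = tst_insert(node.hi, word, i)
    elif i + 1 < len(word):
        node.eq = tst_insert(node.eq, word, i + 1)
    else:
        node.is_end = True
    return node


def tst_contains(node, word):
    """Return True if `word` was inserted earlier."""
    if not word:
        return False
    i = 0
    while node is not None:
        ch = word[i]
        if ch < node.char:
            node = node.lo
        elif ch > node.char:
            node = node.hi
        elif i + 1 < len(word):
            node = node.eq           # character matched, move to the next one
            i += 1
        else:
            return node.is_end       # last character matched
    return False
```

Words that share a prefix share the nodes for that prefix, which is where the space saving comes from compared with storing every string separately.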
The text can be relatively big and all the data has to be held in memory, so space efficiency is an important criterion, even if the search speed is not the best possible. The time complexity of search is O(log n) as long as the tree stays reasonably balanced (see http://en.wikipedia.org/wiki/Big_O_notation); a degenerate, unbalanced tree can degrade to O(n). Nodes are arranged and searched according to a string comparison operation.
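As a rough illustration of arranging and searching nodes by string comparison, here is a small binary search tree sketch in Python. The names `BSTNode`, `bst_insert` and `bst_contains` are my own hypothetical choices, the tree does no rebalancing, and it relies on Python's built-in string ordering (comparison by code point); a real application may well need a different comparison, for example a case-insensitive one.

```python
class BSTNode:
    """One node of an unbalanced binary search tree keyed by a string."""

    def __init__(self, key):
        self.key = key
        self.left = None    # keys that compare less than `key`
        self.right = None   # keys that compare greater than `key`


def bst_insert(root, key):
    """Insert `key` if it is absent; return the root of the tree."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = BSTNode(key)
                break
            node = node.right
        else:
            break           # key is already stored, nothing to do
    return root


def bst_contains(root, key):
    """Walk down from the root, going left or right by string comparison."""
    node = root
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return True
    return False


# Example: index the words of a short text, then look some of them up.
root = None
for word in "the quick brown fox jumps over the lazy dog".split():
    root = bst_insert(root, word)
```

Each lookup visits one node per tree level, so the cost follows the height of the tree; inserting already sorted input into this sketch would produce the degenerate, list-like case.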
Because tree structures and tree-based search are fundamental in computer science, you will find many implementations:
http://bit.ly/XExXbv,
http://bit.ly/XExQNc.
—SA