The late Google WonderWheel feature, and other interactive tools like the VirtualThesaurus, are implementations of graphical user interfaces that express "semantic relations."
Many of these interfaces use variation in font size and color to simulate, in two dimensions, a relationship in three dimensions.
A simple way of conceptualizing "semantic relations" is to imagine that the meaning of a particular word is at the "center of a vast web" of possible associated words given a certain context. That implies that for every word there is a set of associations, and within the set of all associations, each other word has some probability of association, or "strength" of association.
You may find this CodeProject article useful: [^].
So, if I think of "butterfly," it's highly likely that the words "moth," "caterpillar," and "insect" will be very strongly associated, while words or phrases like "migration," "pupa," "imago," and "Vladimir Nabokov" may have varying degrees of strength of association.
If you want to implement a feature like this, you are going to need ... at minimum ... a database of keys (terms), which has, for every key, its degree of association with all other keys.
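Here is a minimal sketch of what such a keyed store could look like in Python, assuming strengths are stored as numbers between 0 and 1. The strength values, the `ASSOCIATIONS` name, and the two helper functions are all invented for illustration; they are not taken from any real tool.

```python
from typing import Dict, List, Tuple

# Hypothetical data: every strength below is made up for illustration.
ASSOCIATIONS: Dict[str, Dict[str, float]] = {
    "butterfly": {
        "moth": 0.90,
        "caterpillar": 0.85,
        "insect": 0.80,
        "pupa": 0.55,
        "migration": 0.40,
        "imago": 0.30,
        "Vladimir Nabokov": 0.10,
    },
}


def strongest_associations(term: str, limit: int = 3) -> List[Tuple[str, float]]:
    """Return up to `limit` (word, strength) pairs for `term`, strongest first."""
    related = ASSOCIATIONS.get(term, {})
    ranked = sorted(related.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]


def font_size(strength: float, smallest: int = 10, largest: int = 32) -> int:
    """Map a strength in [0, 1] linearly onto a font size in points."""
    return round(smallest + strength * (largest - smallest))


if __name__ == "__main__":
    for word, strength in strongest_associations("butterfly"):
        print(word, strength, font_size(strength))
```

The `font_size` helper is one possible way to turn a strength into the kind of visual variation these interfaces use; a real UI would likely tune the mapping.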
Imagine you had 256 words (keys, terms) in your "semantic relations" data structure: each key would have a reference to each other key so:
1. you have 256 key objects.
2. each key contains 255 references to the other keys with a "degree of association" factor
3. assuming the key objects are strings, the id code for each reference is one byte, and the association factor is stored in two bytes
Then you'd have:
256 keys, each with a variable-length string and a one-byte id code
255 entries per key, each consisting of a one-byte id code and a two-byte association factor (three bytes per entry)
The total size would be:
(Sum of lengths of all keywords 1~256) + 256 + (3 * (256 * 255))
(Sum of lengths of all keywords 1~256) + 256 + 195840
If we assume an average keyword length of seven, then you'd have
1792 + 256 + 195840 = 197888 bytes
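If you want to play with the numbers, the same estimate can be sketched as a small Python function. The function name and parameters are my own; it assumes the layout described above and counts each entry once as three bytes (one id byte plus two factor bytes), plus one id byte per key.

```python
def table_size_bytes(key_count: int, avg_key_length: int,
                     id_bytes: int = 1, factor_bytes: int = 2) -> int:
    """Estimate storage for a fully connected association table."""
    key_text = key_count * avg_key_length        # the keyword strings themselves
    key_ids = key_count * id_bytes               # one id code per key
    # every key refers to every other key: (id code + association factor) each
    entries = key_count * (key_count - 1) * (id_bytes + factor_bytes)
    return key_text + key_ids + entries


if __name__ == "__main__":
    print(table_size_bytes(256, 7))  # 256 keys, average length seven
```

Note that the entry count grows with the square of the number of keys, and that a one-byte id code only works up to 256 keys; for a bigger vocabulary you'd pass a larger `id_bytes`.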
So, I think you can see that creating a database for a semantic relation UI is no easy task: while under 300 KB may seem like a small size these days, somebody has to do the work of entering/creating the association values, and that means someone (or several people) with deep linguistic skills.
And if you think about adding "depth" to your search (that is, if I click on "butterfly" and then "imago," you want to consider the degree of semantic relatedness of both terms), then you get an exponential increase in the number of data items you need for every additional level of relationship you deal with.
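To get a feel for that growth, here is a rough Python sketch. It assumes (my assumption, not a fixed rule) that you store one score for every ordered chain of clicked terms plus each remaining candidate term; `entries_needed` is a hypothetical helper name.

```python
def entries_needed(key_count: int, depth: int) -> int:
    """Count the scores to store when a context is a chain of `depth` clicked terms.

    Each ordered chain of `depth` distinct terms needs one score for every
    candidate term, so the count is key_count * (key_count - 1) * ... with
    depth + 1 factors in total.
    """
    total = 1
    for i in range(depth + 1):
        total *= key_count - i
    return total


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, entries_needed(256, depth))
```

With 256 keys, depth 1 is the plain pairwise table (65280 entries), and each extra level multiplies the count by roughly another 250. In practice you'd probably compute a combined score from the pairwise values rather than store every chain.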
But there's nothing stopping you from creating your own facility for yourself, or end-users, to create their own semantic relations database by direct action ... is there?
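As a starting point, here is one possible sketch using Python's built-in `sqlite3` module, where the user enters pairs of terms and a strength between 0 and 1. The table name, column names, and function names are hypothetical, and treating associations as symmetric (butterfly/moth is the same as moth/butterfly) is a simplifying assumption.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the association database; ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS association (
               term_a   TEXT NOT NULL,
               term_b   TEXT NOT NULL,
               strength REAL NOT NULL,
               PRIMARY KEY (term_a, term_b)
           )"""
    )
    return conn


def set_association(conn: sqlite3.Connection, a: str, b: str, strength: float) -> None:
    """Add or update the strength between two different terms."""
    if a == b:
        raise ValueError("a term cannot be associated with itself")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    first, second = sorted((a, b))  # store each pair once, in a fixed order
    conn.execute(
        "INSERT OR REPLACE INTO association VALUES (?, ?, ?)",
        (first, second, strength),
    )
    conn.commit()


def related_terms(conn: sqlite3.Connection, term: str) -> List[Tuple[str, float]]:
    """Return (other_term, strength) pairs for `term`, strongest first."""
    rows = conn.execute(
        """SELECT CASE WHEN term_a = ? THEN term_b ELSE term_a END, strength
           FROM association
           WHERE term_a = ? OR term_b = ?
           ORDER BY strength DESC""",
        (term, term, term),
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_store()
    set_association(store, "butterfly", "moth", 0.9)
    set_association(store, "imago", "butterfly", 0.3)
    print(related_terms(store, "butterfly"))
```

Because users only enter the pairs they care about, the table stays sparse instead of needing an entry for every possible pair.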
In fact, if I were to teach again, I'd consider that as a very good assignment for students in a second-level course on programming.
good luck, bill