Add a Chatbot to a C# Application using SIML (Synthetic Intelligence Markup Language) - Part 2

DaveMathews

26 Jan 2015 | CPOL | 17 min read

Continuation of the article "Integration of a Chatbot in a C# application using SIML (Synthetic Intelligence Markup Language)"

Introduction

Previously, I introduced SIML and some of its basic functionalities. In this article, we will be digging a bit deeper.

Please go through the previous article (Part 1) before proceeding with this one.

Reduction via Recursion

The main idea behind recursion is to help the developer come up with the fewest Patterns that match as many user inputs as possible.

Consider these three sentences:

What is the meaning of the word beatboxing?
What does beatboxing mean?
Define beatboxing

In SIML, I could simply write all three patterns inside a single Model. However, two or more developers might be working on the same SIML knowledge base, and the only permission I have may be to add a new Model in a separate file while leaving the primary knowledge base untouched.

The trick here would be to break the user input into its simplest form and then retry the pattern search. Considering the above sentences, the simplest sentence would be Define beatboxing.

So an example SIML Code would be:

XML

<Model>
  <Pattern>DEFINE *</Pattern>
  <Response>Let me search the dictionary for the word <Match />...</Response>
</Model>

<Model>
  <Pattern>
    <Item>WHAT IS THE MEANING OF THE WORD *</Item>
    <Item>WHAT DOES * MEAN</Item>
  </Pattern>
  <Response>
    <Goto>Define <Match /></Goto>
  </Response>
</Model>

The first SIML Model addresses the simplest possible pattern for inquiries relating to word definitions. The second Model has no response text of its own; all it does is redirect the Patterns What is the meaning of the word * and What does * mean to Define *. The element used in the second Model is the <Goto> SIML element. This is the redirection element: it has no attributes, and its only function is to redirect the pattern search to a new value.

So now, if you type What is the meaning of the word beatboxing in the Console, the second Model will match and redirect the pattern search to Define beatboxing. The first Model will then be activated and the response will be Let me search the dictionary for the word beatboxing...
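The redirection described above can be modelled outside SIML. The following Python sketch is only a toy stand-in for the interpreter, not the SIML library: the `MODELS` list, the `GOTO ` prefix and the `{1}` placeholders are assumptions made for illustration, standing in for `<Goto>` and `<Match Index="n"/>`.

```python
import re

# Toy knowledge base: (pattern, response template).
# A template starting with "GOTO " plays the role of <Goto>.
MODELS = [
    ("DEFINE *", "Let me search the dictionary for the word {1}..."),
    ("WHAT IS THE MEANING OF THE WORD *", "GOTO Define {1}"),
    ("WHAT DOES * MEAN", "GOTO Define {1}"),
]

def to_regex(pattern):
    # Each * wildcard captures one or more characters.
    parts = [re.escape(p) for p in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$", re.IGNORECASE)

def respond(text):
    text = text.strip().rstrip("?.!")
    for pattern, template in MODELS:
        match = to_regex(pattern).match(text)
        if match is None:
            continue
        # {1}, {2}, ... mimic <Match Index="n"/>.
        filled = template
        for i, value in enumerate(match.groups(), start=1):
            filled = filled.replace("{%d}" % i, value)
        if filled.startswith("GOTO "):
            # Retry the pattern search with the simplified input.
            return respond(filled[len("GOTO "):])
        return filled
    return None

print(respond("What does beatboxing mean?"))
```

Both longer phrasings end up at the DEFINE * entry, so only that entry needs a real response.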

But what if you had two or more wildcards in your Pattern? How would you redirect the pattern search?

You would do that by specifying an Index within the <Match/> element. Say instead of What is the meaning of the word * you used the pattern * meaning of the word *. In such a case, our redirection should specify the index of the wildcard to be used.

Example SIML Code:

XML

<Model>
  <Pattern>* meaning of the word *</Pattern>
  <Response>
    <Goto>Define <Match Index="2"/></Goto>
  </Response>
</Model>

In the above example within the <Match/> element, I've added an attribute called Index and the value of this attribute has been set to "2". So now if the value of the <Match /> element is evaluated, it will return the value captured by the second wildcard in our pattern. Similarly, you could use 3, 4, 5 and so on to specify the index of the respective wildcards within your pattern.

Another derivative of the <Goto> element is the <GotoMatch /> element which is a shortcut tag for...

XML

<Goto><Match/></Goto>

But it doesn't end there. Just like the <Match /> element, the <GotoMatch /> element has an Index attribute, which is why I referred to it as a derivative of the <Goto> element. As a result, if you specify an Index like <GotoMatch Index="2"/>, the shortcut tag will be interpreted as...

XML

<Goto><Match Index="2"/></Goto>

A simple use-case for the <GotoMatch /> element:

XML

<Model>
  <Pattern>CAN YOU *</Pattern>
  <Response>Yes I can. <GotoMatch /></Response>
</Model>

Assuming that you are working with the SIML Project attached to this article, if you type Can you play music? in the Console, the output will be Yes I can. Playing music.

Always try to specify the simplest possible pattern when considering redirections.
Avoid meta-redirections. Do not redirect a pattern search to a pattern that in turn redirects the search to yet another pattern.
Stick with multiple patterns within the same Model if possible, and use redirections only if you are certain that the pattern you redirect the search to has a Response of its own.
If 2 or more redirections point to each other, the chat request is automatically terminated or timed out by the interpreter.
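To see why redirections that point to each other are a problem, here is a minimal Python sketch of a guarded lookup. The tables and the `MAX_HOPS` limit are hypothetical; the article only states that the interpreter terminates or times out such requests, not how it does so.

```python
# Hypothetical redirect table: each key redirects the search to its value.
REDIRECTS = {
    "HI": "HELLO",
    "HELLO": "HI",  # points back to HI: a redirection loop
    "WHAT DOES X MEAN": "DEFINE X",
}
RESPONSES = {"DEFINE X": "Let me search the dictionary..."}

MAX_HOPS = 5  # assumed limit, not a documented SIML setting

def resolve(pattern, hops=0):
    if pattern in RESPONSES:
        return RESPONSES[pattern]
    if hops >= MAX_HOPS or pattern not in REDIRECTS:
        return None  # give up instead of looping forever
    return resolve(REDIRECTS[pattern], hops + 1)

print(resolve("WHAT DOES X MEAN"))  # reaches a response after one hop
print(resolve("HI"))                # the loop is cut off and yields None
```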

Request vs Input

Every time a user interacts with the Bot, they create a ChatRequest. A Chat Request is the user message in its entirety as sent to the Bot. So if you type How are you? What are you upto?, the Chat Request holds the entire message in its raw format.

The SIML interpreter doesn't deal with ChatRequests directly; it only deals with Inputs. An Input is a single sentence that is processed by the SIML interpreter. So the above chat request How are you? What are you upto? will be broken down into two inputs, How are you? and What are you upto?. These two inputs will then be processed separately and the combined result will be sent back to the user.
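The split from one Chat Request into several Inputs can be sketched in a few lines of Python. This is an approximation for illustration only: the real interpreter's sentence splitting rules are not described in this article, and `split_request` and `answer` are hypothetical names.

```python
import re

def split_request(request):
    # Break after ., ? or ! when followed by whitespace; keep the punctuation.
    parts = re.split(r"(?<=[.?!])\s+", request.strip())
    return [p for p in parts if p]

def answer(request, respond):
    # Process each input separately and join the partial responses.
    return " ".join(respond(i) for i in split_request(request))

print(split_request("How are you? What are you upto?"))
```

Here `respond` is any function that maps a single sentence to a reply, which keeps the per-sentence processing separate from the splitting step.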

So if at some point the user tries to test the Bot to check if it recalls the previous Request, you can make use of the <Request> or the <Input> element.

Example SIML code to retrieve the entire user message:

XML

<Model>
  <Pattern>What did I say</Pattern>
  <Response>You said "<Request />"</Response>
</Model>

The above SIML code will return How are you? What are you upto? upon the user input What did I say?

Example SIML code to retrieve the last user sentence:

XML

<Model>
  <Pattern>What did I just say</Pattern>
  <Response>You said "<Input />"</Response>
</Model>

The above SIML code will return What are you upto? upon the user input What did I just say? This is because the <Input> element returns the last user sentence the Bot processed and not the entire message (Chat Request) that was sent to the Bot.

It's important to note that both the <Request> and <Input> elements offer an Index attribute. The Index attribute enables you to backward reference a previous chat request or a previous user input using an index number.

Say that in the above <Input> example, instead of What are you upto? you chose to return How are you? (the first sentence instead of the second):

XML

<Model>
  <Pattern>What was the first thing I said</Pattern>
  <Response>You wrote "<Input Index="2"/>"</Response>
</Model>

The Index attribute within the Input element in the above SIML code now references the first sentence in the user input. When backward referencing user inputs, keep in mind that they are fetched in descending order, that is, most recent first:

What are you upto? - Second or last sentence (Index 1 - Default index value if no Index attribute is specified)
How are you? - First sentence (Index 2)
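The index convention above (1 is the most recent entry) maps naturally onto negative list indexing. The Python sketch below is a toy model; returning an empty string for an out-of-range index is an assumption, since the article does not say what the interpreter does in that case.

```python
def backward_reference(history, index=1):
    # Index 1 is the most recent entry, Index 2 the one before it.
    if index < 1 or index > len(history):
        return ""  # assumed behaviour for an out-of-range index
    return history[-index]

inputs = ["How are you?", "What are you upto?"]
print(backward_reference(inputs))     # default index, last sentence
print(backward_reference(inputs, 2))  # first sentence
```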

Result vs Output

Just as a chat Request represents the entire user message, a Result in SIML represents the entire previous utterance of the Bot. Similarly, just as an <Input> element denotes the last sentence in the previous user input, an <Output> element denotes the last sentence in the previous utterance of the Bot.

Example SIML code to extract the last utterance of the Bot:

XML

<Model>
  <Pattern>WHAT DID YOU SAY</Pattern>
  <Response>I said "<Result />"</Response>
</Model>

Example SIML code to extract the last sentence from the previous utterance of the Bot:

XML

<Model>
  <Pattern>WHAT DID YOU JUST SAY</Pattern>
  <Response>I said "<Output />"</Response>
</Model>

Now if the user says How are you? What are you upto? and the Bot replies I am fine. Just chatting with users., we can retrieve either the entire previous utterance of the Bot or just its last sentence using the <Result> or <Output> element respectively.

Not surprisingly (just like <Request> and <Input>), the <Result> and <Output> elements both have an Index attribute that can be used to backward reference a previous utterance of the Bot.

Variables and Operators

In SIML (as previously discussed) developers can create, manipulate and work with User or Bot related properties/variables.

The following elements are used to control the flow of your SIML response.

<If>, <ElseIf> and <Else>
<Switch> (along with <Case> and <Default>)
<While>

All of the above elements can control the flow of a response using a number of SIML operators:

Value
GreaterThan
GreaterThanOr
LessThan
LessThanOr
Exists
Defined
Not
Contains

We will now discuss the above elements in detail.

<If>, <ElseIf> and <Else>

Let me put an example in front of you for the <If>, <ElseIf> and <Else> elements.

The following SIML code has already been discussed in the previous article:

XML

<Model>
  <Pattern>I AM @NUMBER YEARS OLD</Pattern>
  <Response>
    <Match /> years is good enough. <Think><User Set="age"><Match />
    </User></Think></Response>
</Model>

Now just after the above SIML code, we'll add a new Model:

XML

<Model>
  <Pattern>AM I OLD ENOUGH</Pattern>
  <Response>
    <If User="age" Value="18">Yes you are!</If>
    <ElseIf User="age" GreaterThan="18">You are more than 18 years old. 
                       So why not?</ElseIf>
    <Else>I am not sure enough.</Else>
  </Response>
</Model>

The above code is much like the C# if, else if and else statements. The output depends on the age of the user. The Value attribute (in the <If> element) is an equality comparison and checks if the age of the user is 18. Within the <ElseIf> element I've used the GreaterThan operator; the content of that element is returned if the age of the user is greater than 18. The value of the <Else> element is returned if and only if none of the conditions specified in the preceding If and ElseIf elements are met.

I am 18 years old -> Am I old enough? -> Yes you are!
I am 35 years old -> Am I old enough? -> You are more than 18 years old. So why not?
I am 5 years old -> Am I old enough? -> I am not sure enough.
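The three conversations above follow directly from evaluating the conditions in order. The Python sketch below mirrors that evaluation; it models only three of the listed operators, and treating a missing variable as a failed condition is an assumption rather than documented SIML behaviour.

```python
def check(actual, operator, expected):
    # Only three of the listed operators are modelled here.
    if actual is None:
        return False  # assumption: an unset variable fails every check
    if operator == "Value":
        return str(actual) == str(expected)
    if operator == "GreaterThan":
        return float(actual) > float(expected)
    if operator == "LessThan":
        return float(actual) < float(expected)
    raise ValueError("operator not modelled: " + operator)

def old_enough(user):
    age = user.get("age")
    if check(age, "Value", "18"):          # <If User="age" Value="18">
        return "Yes you are!"
    elif check(age, "GreaterThan", "18"):  # <ElseIf ... GreaterThan="18">
        return "You are more than 18 years old. So why not?"
    else:                                  # <Else>
        return "I am not sure enough."

print(old_enough({"age": "35"}))
```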

<Switch> (along with <Case> and <Default>)

The above SIML code can be re-written using a <Switch> element. This switch element behaves much like the Switch keyword in C# (and many known programming languages):

XML

<Model>
  <Pattern>AM I OLD ENOUGH</Pattern>
  <Response>
    <Switch User="age">
      <Case Value="18">Yes you are!</Case>
      <Case GreaterThan="18">You are more than 18 years old. So why not.</Case>
      <Default>I am not sure enough.</Default>
    </Switch>
  </Response>
</Model>

The usage-syntax of the Switch element is pretty simple. Within the Switch element, you specify the variable and its owner. In the code above, the owner of the variable is the User and variable under consideration is the age of the user.

Every <Case> element, along with an operator, is used to create a unique condition. If any of the conditions is satisfied, the inner value of that element is accepted as part of the response.

However, if none of the conditions in the preceding Case elements are satisfied, the value of the <Default> element is accepted. It isn't mandatory to have a <Default> element within a Switch element, but every Switch element should have at least 1 Case element.

<While>

I have personally never used the SIML While element. IMHO you should use JavaScript if you aren't sure when your loop is going to end.

Example SIML code:

XML

<Model>
  <Pattern>REPEAT * * TIMES</Pattern>
  <Response>
    <Think>
      <Var Set="num">
        <Match Index="2" />
      </Var>
    </Think>
    <While Var="num" GreaterThan="0">
      <Match />
      <Bot Get="space" />
      <Think>
        <Var Set="num">
          <Math Get="decrement">
            <Var Get="num" />
          </Math>
        </Var>
      </Think>
    </While>
  </Response>
</Model>

The above SIML code gives us the functionality of repeating a sentence or word as many times as the user specifies. So if the user says repeat bla 4 times, the response will be bla bla bla bla.

The new element Var in the above example is a temporary owner for variables whose scope is the Model in which they are used. Using the <Var> element, I've created a new variable num to store the number of times the user wants the Bot to repeat a word or sentence.

A <While> loop should include a variable and a comparison operator. The code loops for as long as the condition specified by the operator is True. In the above example, the While loop runs while the value of the variable num is greater than 0.
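For comparison, the same loop can be written in Python. This is only an analogy to show the control flow, with comments pointing at the SIML elements each line stands in for; unlike the SIML version, joining the pieces leaves no trailing space.

```python
def repeat(phrase, times):
    num = int(times)           # <Var Set="num"><Match Index="2" /></Var>
    pieces = []
    while num > 0:             # <While Var="num" GreaterThan="0">
        pieces.append(phrase)  # <Match />
        num -= 1               # <Math Get="decrement">
    return " ".join(pieces)

print(repeat("bla", "4"))
```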

Remembering and Learning

A developer should be aware of the distinctions between the SIML <Remember> and <Learn> elements before using them in a Model.

Remembering something involves storage of SIML Models within a separate knowledge base that is assigned to one particular user. This distinction can be used to save facts/knowledge that apply only to the user who taught them.

Remembering

Suppose I tried to teach my bot that a Tomato is a fruit while some other user tried to teach the Bot that a Tomato is a vegetable.

XML

<Model>
  <Pattern>A TOMATO IS A *</Pattern>
  <Response>
    <Remember>
      <Model>
        <Pattern>What is a tomato</Pattern>
        <Response>A tomato is a <Process><Match /></Process></Response>
      </Model>
    </Remember>
     Alright I'll remember that a tomato is a <Match />.
  </Response>
</Model>

What I've done in the above SIML code is that I have created a pattern A tomato is a * and within my <Response> element I've used a <Remember> element. Inside the <Remember> element I have added a Model with an atomic pattern (What is a tomato) and a response that will output a tomato is a followed by the exact value the user specifies.

The above Model gets saved only for the user who activates the Model. So if I taught my Bot that a tomato is a fruit, then whenever I ask my Bot what a tomato is, the answer would be fruit. If some other user taught the Bot that a tomato is a vegetable, the Bot will still remember that for Dave a tomato is a fruit, while for the other user a tomato is a vegetable.

Note the <Process> element in the above code. This element forces the interpreter to evaluate the inner or children elements before storing the Model into the User's graph node.

Using the <Process> element changes the inner value of the Response element to...

XML

<Response>A tomato is a fruit</Response>

Learning

Learning on the other hand involves storage of SIML Models within the main knowledge graph that is globally used by the Bot. It's much like the <Remember> element but the underlying Model is stored into the main knowledge graph of the Bot.

XML

<Model>
  <Pattern>THE SUN IS A *</Pattern>
  <Response>
    <Learn>
      <Model>
        <Pattern>What is the sun</Pattern>
        <Response>The sun is a <Process><Match /></Process></Response>
      </Model>
    </Learn>
     Alright I have now learnt that the sun is a <Match />.
  </Response>
</Model>

Models saved using the <Learn> element are globally accessible, i.e., their responses are common to all users (unlike the <Remember> element).

So if any user teaches the Bot that the sun is a star, then no matter who asks the question what is the sun?, the answer will always be the sun is a star. Of course, if some other user teaches the Bot that the sun is a big giant fireball, the response will then be changed for all users.
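The difference between the two elements comes down to where the new Model is stored. The Python sketch below models that with two dictionaries; the `ToyBot` class and the rule that a user's own Models are consulted before the shared ones are assumptions for illustration, not a description of the SIML library's internals.

```python
class ToyBot:
    def __init__(self):
        self.global_models = {}  # shared by everyone (<Learn>)
        self.user_models = {}    # one dict per user id (<Remember>)

    def remember(self, user, pattern, response):
        self.user_models.setdefault(user, {})[pattern] = response

    def learn(self, pattern, response):
        self.global_models[pattern] = response

    def ask(self, user, pattern):
        # Assumption: a user's own Models win over the shared ones.
        personal = self.user_models.get(user, {})
        return personal.get(pattern, self.global_models.get(pattern))

bot = ToyBot()
bot.remember("dave", "WHAT IS A TOMATO", "A tomato is a fruit")
bot.remember("other", "WHAT IS A TOMATO", "A tomato is a vegetable")
bot.learn("WHAT IS THE SUN", "The sun is a star")
print(bot.ask("dave", "WHAT IS A TOMATO"), "/", bot.ask("other", "WHAT IS THE SUN"))
```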

Elements inside Attributes

What if you had to use the value of an element inside the attribute of another?

SIML offers a feasible work around using the <Define> element. The Define element assigns a Key to a value of an element. The value of this Key can later be used inside attributes and responses.

Take, for instance, the following example:

XML

<Model>
  <Pattern>CHANGE TEXT TO *</Pattern>
  <Response>
    <Define Key="{0}"><Match /></Define>
    "<Text Get="{0}"><Match /></Text>"
  </Response>
</Model>

In the above SIML code, the wildcard * is stored as a key {0} and within the Get attribute of the <Text> element, I have used the key. Therefore during a chat session the key {0} will be replaced by the value of the <Match/> element.

The Define element as a whole doesn't return any text value, which eliminates the need to enclose it within a <Think> element.
Every Define element should have a Key attribute, the value of which should be a unique identifier.

Once a Key has been defined, it can be used anywhere (as values for attributes or in arbitrary text). As a result, I could replace the <Match /> element within the <Text> element with the Key {0} in the above code, and the output would still be the same.
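Conceptually, a defined Key behaves like a textual placeholder that is substituted before the rest of the response is evaluated. The Python sketch below shows that idea only; `expand_keys` is a hypothetical helper and the value "hello" is an arbitrary stand-in for whatever the wildcard captured.

```python
def expand_keys(template, keys):
    # Replace every defined key, e.g. "{0}", with its stored value.
    for key, value in keys.items():
        template = template.replace(key, value)
    return template

keys = {"{0}": "hello"}  # as if <Define Key="{0}"> had stored "hello"
print(expand_keys('Get="{0}" and text {0}', keys))
```

The same key is expanded both inside the attribute and in the plain text, which is why either form gives the same output.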

Stop Repeating

A major problem with most of the Bot architectures is that they repeat their output for the same pattern. SIML provides a simple work around for repeated user inputs. There are three possibilities when it comes to repetition.

The user doesn't repeat his input - The message is unique.
The user repeats the entire message - Every sentence in the user input has been processed previously.
The user repeats one or more sentences - There is at least one repeated sentence alongside at least one new sentence in the chat request.

A successful approach to any of the above scenarios is to implement a Repetition Management system using the <Repeat> and <Echo> elements.

Managing repetitions for every SIML Model individually would be extremely inefficient and redundant. For that reason, every SIML Concept can have a Repeat attribute set to True or False. This tells the SIML interpreter to treat the child Models accordingly.

For us to test this functionality, you'll have to first create a new Concept file in Chatbot Studio.

For this tutorial, name the new Concept file "Don't Repeat".

For my example SIML Code, I have created a new Concept with the name Don't Repeat and set the value of the Repeat attribute to False.

XML

<Concept Name="Don't Repeat" Type="Public" Repeat="False">
  <Model>
    <Pattern>WHERE IS THE MILLION DOLLAR HIGHWAY</Pattern>
    <Response>It's in Colorado.</Response>
  </Model>
</Concept>

After saving the new Concept file, click on Settings->Repetition. This (Repetition.sml) file by default has some repetition management code in it. Before we dig into repetition management, we should be familiar with the following user variables and their values:

Partial-Repeat - This user variable is set to True if there is at least one repeated input in the user message.
Repeat-Count - The value of this variable is the number of repeated inputs the Bot processed.

Repetition Management involves the following steps:

Create a <Repeat> element within the document root element (<Siml>)
Add a new <Response> element within the above Repeat element.
Address both Partial and Complete repetitions by checking the value of the Partial-Repeat variable.
Address every input that was repeated by the user by checking the value of the Repeat-Count variable.

XML

<Repeat>
  <Response>
    <If User="partial-repeat">
      <Switch User="repeat-count">
        <Case Value="1">And I have already mentioned that <Echo Index="1" /></Case>
        <Case Value="2">And I think you already know that <Echo Index="1" /> 
                        and <Echo Index="2" /></Case>
        <Default>And I do not like to repeat myself.</Default>
      </Switch>
    </If>
    <Else>
      <Switch User="repeat-count">
        <Case Value="1">I have already mentioned that <Echo Index="1" /></Case>
        <Case Value="2">I think you already know that <Echo Index="1" /> 
                        and <Echo Index="2" /></Case>
        <Default>I do not like to repeat myself.</Default>
      </Switch>
    </Else>
  </Response>
</Repeat>

In the above SIML code, following the aforementioned steps, the setup has a Response element within which I've used an <If> element to check whether the value of the user variable partial-repeat is true. If it's a partial repeat, I can be certain that my response will be preceded by some other response to a non-repeated input. For that reason, I have started those responses with the word And. Inside the If and Else elements are the Switch elements, the children of which are activated based on the number of sentences repeated by the user.

The <Echo> element in the above example is used to retrieve the response from the Model whose pattern matched a repeated user input. Every repeated output, for a repeated input, is stored within the Echo Stack with an Index value. Therefore the first repeated output is stored at Index 1, the second at Index 2 and so on.
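The bookkeeping behind partial-repeat, repeat-count and the Echo stack can be approximated in Python. This sketch reads "partial" as "some, but not all, inputs are repeats", which is an interpretation of the text above; the `seen` set and both function names are hypothetical.

```python
def repetition_state(inputs, seen):
    # seen: inputs the bot has already answered (assumed bookkeeping).
    repeated = [i for i in inputs if i in seen]
    return {
        "repeat-count": len(repeated),
        # True when some, but not all, inputs are repeats.
        "partial-repeat": 0 < len(repeated) < len(inputs),
    }

def repeat_response(state, echoes):
    # echoes[0] stands in for <Echo Index="1" />, echoes[1] for Index 2.
    prefix = "And " if state["partial-repeat"] else ""
    count = state["repeat-count"]
    if count == 1:
        body = "I have already mentioned that " + echoes[0]
    elif count == 2:
        body = "I think you already know that " + echoes[0] + " and " + echoes[1]
    else:
        body = "I do not like to repeat myself."
    return prefix + body

seen = {"WHERE IS THE MILLION DOLLAR HIGHWAY"}
state = repetition_state(["HOW ARE YOU", "WHERE IS THE MILLION DOLLAR HIGHWAY"], seen)
print(repeat_response(state, ["It's in Colorado."]))
```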

Best Practices

Well begun is half done. The following practices should help you implement a better Chatbot.

Start With the Basic Patterns

Always try to use atomic patterns first and then migrate towards looser patterns. So instead of "* Your name" first create a pattern "What is your name".

Pattern migration is a step by step procedure and the recommended migration is as follows:

Atomic patterns (What is your name)
Patterns with Sets (My eyes are [color])
Patterns with Regex (I was born in @Date)
Patterns with wildcards (My name is *)
Keywords ({Your name})

Don't Forget What You've Filtered

In the previous article, we learnt how to filter out words or normalize them to their simplest form. So if you've created a filter that converts the word Whats to What is, the following pattern will NEVER be matched.

XML

<Pattern>WHATS YOUR NAME</Pattern>

Always go through your normalization file to check what you've filtered.
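A short Python sketch shows why such a pattern becomes unreachable: the input is rewritten before any pattern is compared against it. The `FILTERS` table is a hypothetical stand-in for a normalization file with a single entry.

```python
FILTERS = {"WHATS": "WHAT IS"}  # hypothetical normalization entry

def normalize(text):
    words = text.upper().rstrip("?.!").split()
    return " ".join(FILTERS.get(w, w) for w in words)

normalized = normalize("Whats your name?")
print(normalized)                       # what the pattern search sees
print(normalized == "WHATS YOUR NAME")  # the unfiltered pattern cannot match
```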

Avoid Keywords Based Patterns

Patterns like {your name} are very ambiguous. They can match What is your name?, I don't know your name, I don't like your name and so on.

But their ambiguity doesn't render them useless. I recommend using them in scenarios where you can offer the user an alternative pattern.

Say, for instance, the user says I don't know your name. If it matches a keywords based pattern, offer an output like Did you mean "what is your name?" and, if the user replies by saying yes, redirect the pattern search (using the Goto element) to What is your name?
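The ambiguity of keyword patterns is easy to demonstrate with a simplified matcher in Python. This is not how the SIML interpreter implements keywords; `keyword_match` simply checks whether the keyword phrase occurs as whole words anywhere in the input.

```python
def keyword_match(keywords, text):
    # Keep letters, digits, spaces and apostrophes; drop other punctuation.
    cleaned = "".join(
        c for c in text.lower() if c.isalnum() or c.isspace() or c == "'"
    )
    padded = " %s " % " ".join(cleaned.split())
    return (" %s " % keywords.lower()) in padded

for sentence in ["What is your name?", "I don't know your name", "I don't like your name"]:
    print(sentence, "->", keyword_match("your name", sentence))
```

All three sentences match, which is why a keyword pattern is better suited to asking a clarifying question than to giving a final answer.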

Sets Instead of Regular Expressions

The sole purpose of Sets is to match words. They've been heavily optimized for such tasks. Using regular expressions instead of Sets would be extremely cumbersome.

So instead of:

XML

<Regex Name="Color" Pattern="\b(red|green|blue|...)\b"/>

stick with:

XML

<Set Name="Color">
  <Item>Red</Item>
  <Item>Green</Item>
  <Item>Blue</Item>
......
</Set>

Yes, I do understand that the latter looks more verbose, but over time you'll see the benefits.

Be Careful With Redirections

The <Goto> element redirects the pattern search to a new value. Deciding when to use Gotos and when to use multiple patterns inside a Model is a judgment call. More importantly, make sure that two <Goto> elements do NOT point to each other.

I encourage developers to use multiple patterns inside Models instead of the redundant Gotos:

XML

<Model>
  <Pattern>
    <Item>HELLO THERE</Item>
    <Item>HI</Item>
    <Item>HOLA</Item>
  </Pattern>
  <Response>Hi there!</Response>
</Model>

Believe in Reusability

In SIML, you can reuse Random responses, Phrases, JavaScripts, Sets, Regular Expressions, Maps, Models and even Emotions.

Say you created a new JavaScript for finding the factorial value of a number.

XML

<Model>
  <Pattern>WHAT IS THE FACTORIAL OF *</Pattern>
  <Response>
    <Js>function fact(x) {
          if(x==0) { return 1;}
         return x * fact(x-1);
        }
        fact(<Match />);
     </Js>
  </Response>
</Model>

You could make your JavaScript reusable by making the function globally accessible. To do so, you could store the JavaScript function in the Script file.

XML

<Script Type="JavaScript">
   function fact(x) {
      if(x==0) { return 1;}
     return x * fact(x-1);
   }
</Script>

And later reuse your function...

XML

<Model>
  <Pattern>WHAT IS THE FACTORIAL OF *</Pattern>
  <Response>
    <Js>fact(<Match />);</Js>
  </Response>
</Model>

The same philosophy can be used with Random responses, Models (using <Goto>) and Emotions.

Code readability

Readability is an important aspect of coding your Chatbot. Readability not only makes your code more appealing and easily comprehensible but also reduces the labor required to maintain your code.

The following points should give you a kick start.

Avoid redundancy by reusing your Code.
When nesting your Models do not go beyond 2 or 3 levels.
Divide large Concept files into two or more separate files for maintainability.
Add comments on top of complex Models (Press Alt+C in Chatbot Studio) to help other developers understand your code.
Use simple but concrete identifiers for Sets, Maps, Regular Expressions and JavaScript functions.

Points of Interest

In this article, I tried to deal with SIML elements in detail and provide more examples to demonstrate their usage. Part 1 and Part 2 should give any developer enough background on SIML. In future tutorials, I plan to work more with the library and explore its features.

History

Monday, 26th of January, 2015 - Initial release (Part 2)

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

DaveMathews

Software Developer (Senior)

United States
