Design Concepts
14 Oct 2016
- Web / Mobile apps – minimise state
- Bot Apps – lots of state
General Requirements
- State
- Resiliency
- Security
- Maintainability as the app grows
Conversation Flow
Requirements
- Perform tasks that require multiple exchanges, holding onto context from earlier exchanges
- Context switch - cancel a task, switch to a different topic, handle chitchat during a task flow
- Direct the user back to the point or end early
- Handle incomplete responses
- Use existing knowledge to shortcut the process where possible
- Help the user maintain locus of control
High-level approaches
- If-then-else (decision tree) approach - quickly becomes difficult to scale to complex cases, especially handling tangential concerns
- Workflow style approach, with possible sub-flows
- Finite state machine
- A grammar or scripting approach
- Black-box approaches, e.g. neural net
- Ensemble approach
Comparison
State Machine
- Easier to model jumping steps (in a workflow) by being in a given state. For example, no need to ask the user if they have their mobile phone (to recharge a SIM) if they are interacting from a mobile phone (as detected by the front-end).
Cooee
Cooee’s first implementation has adopted a state-machine approach.
- It can encapsulate the current state of the conversation (and therefore succinctly define next steps), while also allowing jumps (context-switching) through catching unhandled events.
- An event-driven approach handles non-linear flows more easily than a workflow or decision-tree approach
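The state-machine idea can be sketched in a few lines of Python. This is a minimal illustration only, not Cooee's actual implementation: the state names, event names, and the `handle` function are all hypothetical. State-specific transitions are tried first; an event the current state does not handle falls through to a global table, which is what permits a context switch from anywhere in the flow.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Holds the current state plus any data gathered so far."""
    state: str = "idle"
    data: dict = field(default_factory=dict)


# Transitions that only make sense in one state: (state, event) -> next state.
# All names here are hypothetical examples.
TRANSITIONS = {
    ("idle", "recharge"): "ask_number",
    ("ask_number", "number_given"): "ask_amount",
    ("ask_amount", "amount_given"): "confirm",
    ("confirm", "yes"): "idle",
}

# Fallback for events the current state does not handle.
# These allow a jump (context switch) from any point in the flow.
GLOBAL_TRANSITIONS = {
    "cancel": "idle",
    "help": "help",
}


def handle(conv: Conversation, event: str) -> str:
    """Advance the conversation and return the new state."""
    key = (conv.state, event)
    if key in TRANSITIONS:
        conv.state = TRANSITIONS[key]
    elif event in GLOBAL_TRANSITIONS:
        conv.state = GLOBAL_TRANSITIONS[event]
    # Otherwise stay in the same state so the bot can re-ask its question.
    return conv.state
```

Because the next step depends only on the current state and the incoming event, adding a new tangent (say, chitchat) means adding one entry to the global table rather than touching every branch of a decision tree.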
Case Study - Microsoft Bot Framework
The Microsoft Bot Framework (MBF) has the concept of a Dialog. A common type of dialog is the Waterfall Dialog. It implements a sequence of steps. Steps are completed in turn, and all steps must be completed to finish the task (dialog flow).
A step may consist of a sub-dialog. MBF supports commands to go back or to cancel a dialog.
To move to the next step, a valid user response must be given to the current question. A library of built-in Prompts is used to validate responses, e.g. a prompt for numeric input, date-time input, or to confirm a previous response.
MBF implements something they call “Guided Dialog”, meaning that the bot is generally driving (or guiding) the conversation with the user.
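The waterfall-with-prompts pattern described above can be sketched as follows. This is a simplified Python illustration, not the MBF API: `Waterfall`, `number_prompt`, and `confirm_prompt` are hypothetical names. Each step pairs a question with a validator, and the flow only advances when the validator accepts the reply.

```python
from typing import Callable, Optional

# A validator returns the parsed value, or None when the reply is not acceptable.
Validator = Callable[[str], Optional[object]]


def number_prompt(text: str) -> Optional[int]:
    """Accept whole numbers only."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def confirm_prompt(text: str) -> Optional[bool]:
    """Accept a yes/no style answer."""
    answers = {"yes": True, "y": True, "no": False, "n": False}
    return answers.get(text.strip().lower())


class Waterfall:
    """Runs steps in order; a step is only left once its reply validates."""

    def __init__(self, steps: list[tuple[str, str, Validator]]):
        self.steps = steps  # each step: (field name, question, validator)
        self.index = 0
        self.results: dict[str, object] = {}

    @property
    def done(self) -> bool:
        return self.index >= len(self.steps)

    def question(self) -> str:
        return self.steps[self.index][1]

    def reply(self, text: str) -> bool:
        """Feed a user reply to the current step. Returns True if accepted."""
        name, _, validate = self.steps[self.index]
        value = validate(text)
        if value is None:
            return False  # invalid: stay on this step and re-ask
        self.results[name] = value
        self.index += 1
        return True
```

A caller would loop on `question()` and `reply()` until `done` is true. Going back or cancelling, which MBF also supports, would amount to decrementing or resetting `index`.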
Data captured along the way can be stored in multiple scopes:
- dialog (local)
- private conversation/session (private for the current user)
- conversation (visible to all user sessions), or
- user (global across all conversations)
Case Study - Abot
Abot uses a finite state machine approach.
Case Study - ChatScript
ChatScript is similar to the if-then-else approach in that it is a collection of rules that match patterns against the input and execute output code when a pattern matches. ChatScript uses a domain-specific language (DSL) to make rules easier to write and read.
Here is a simple example:
?: (you * love * me) I love you
a: (you are just saying that) No, I mean it
b: (no) You think I love someone else?
The ?: is the rule type. s: rules only react to statements. ?: rules only react to questions. u: rules react to the union of both. ChatScript doesn’t have to match all words of the input. Wildcards and regular expressions can be used.
ChatScript rejoinders are written immediately after the rule they belong to, and are tested only against the user input that directly follows that rule firing. You can have multiple levels of rejoinders. (Similar to the Microsoft waterfall dialog.)
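To make the responder and rejoinder behaviour concrete, here is a rough Python emulation of the example above. It is a sketch under strong simplifying assumptions and not how ChatScript itself is implemented: `compile_pattern` and `Bot` are hypothetical names, `*` is treated as "any run of words", and an input counts as a question only if it ends with a question mark.

```python
import re


def compile_pattern(pattern: str) -> re.Pattern:
    """Turn a pattern such as 'you * love * me' into a regular expression.

    Simplification: '*' matches any run of words (including none); every
    other token must appear as a whole word, in the given order.
    """
    parts = [
        r".*" if token == "*" else r"\b" + re.escape(token) + r"\b"
        for token in pattern.split()
    ]
    return re.compile(r"\s*".join(parts), re.IGNORECASE)


def rule(pattern, output, rejoinders=(), kind="u"):
    return {"kind": kind, "pattern": compile_pattern(pattern),
            "output": output, "rejoinders": list(rejoinders)}


# The example from the text: a ?: responder with a: and b: rejoinders.
RULES = [
    rule("you * love * me", "I love you", kind="?", rejoinders=[
        rule("you are just saying that", "No, I mean it", rejoinders=[
            rule("no", "You think I love someone else?"),
        ]),
    ]),
]


class Bot:
    def __init__(self, rules):
        self.rules = rules
        self.pending = []  # rejoinders of the rule that fired last turn

    def respond(self, text: str):
        is_question = text.strip().endswith("?")
        # Rejoinders are tried first, and only on the very next input.
        pending, self.pending = self.pending, []
        for r in pending:
            if r["pattern"].search(text):
                self.pending = r["rejoinders"]
                return r["output"]
        for r in self.rules:
            # 'u' reacts to anything; '?' needs a question; 's' a statement.
            kind_ok = r["kind"] == "u" or (r["kind"] == "?") == is_question
            if kind_ok and r["pattern"].search(text):
                self.pending = r["rejoinders"]
                return r["output"]
        return None  # no rule matched
```

Feeding this bot "Do you really love me?", then "You are just saying that", then "No" walks down the two rejoinder levels in turn. Any other input at those points drops the pending rejoinders and falls back to the top-level rules.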
ChatScript rules are organized by topic (sort of like a sub-flow). In addition to the responders (s: u: ?:) and rejoinders (a: b: etc.), topics have a type of rule called a gambit (t:). It’s something the chatbot can say when it has control. Even if the system cannot find a direct response to your input, if your input suggests that we are talking about, say, baseball, then the chatbot can offer you a relevant gambit. Or the bot can initiate a topic and say a gambit.
ChatScript supports concepts. It comes with around 1400 predefined concepts and includes WordNet’s ontology; you can also define your own.
Case Study - SuperScript and Introducing SuperScript
SuperScript takes a similar path to ChatScript. An example script is:
+ *
- What is your favorite color?
+ *1
% what is your favorite color
- <cap> is my favorite color, too!
- The user says something, and this is matched by the * generic wildcard. The bot then replies with What is your favorite color?.
- When the next input from the user comes into the system, we first check the history and see if the previous reply has more dialogue. This is noted by the % what is your favorite color line.
- We then pull the next trigger, in this case *1, meaning one word. The bot then replies with [user input] is my favorite color, too!.
Replies can execute a custom function where they have access to the entire message object, the user object and other resources like the fact databases. You can also pass parameters to the functions from captured input parts:
+ * weather in <name>?
- ^getWeather(<cap1>)
We can pass extra metadata back to the client using a specific function, ^addMessageProp(key,value). The reply object can have extra properties added on from any reply.
+ can you (text/sms) me the name of the place
- {^hasNumber(false)} Can I have your phone number, please?
- {^hasNumber(true)} ^addMessageProp(medium,txt) The Canteen
In the above example, if the user asks Can you text me the name of the place?, the system checks whether it has the user’s phone number (using the ^hasNumber filter). If it does not, the reply will be Can I have your phone number, please? If it does, the reply is The Canteen, with the medium property set to txt.
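The filter-then-reply logic can be illustrated with a small Python analogue. This is not SuperScript code (SuperScript is a JavaScript library). The `User`, `Reply`, `has_number`, and `choose_reply` names are hypothetical and only mirror the shape of the script above: each candidate reply carries a filter, and the first candidate whose filter passes is returned along with any extra message properties.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    phone: Optional[str] = None


@dataclass
class Reply:
    text: str
    props: dict = field(default_factory=dict)  # extra metadata for the client


def has_number(user: User, expected: bool) -> bool:
    """Filter: passes when 'the user has a phone number' equals `expected`."""
    return (user.phone is not None) == expected


# Candidate replies for one trigger: (filter, reply text, extra properties).
CANDIDATES = [
    (lambda u: has_number(u, False), "Can I have your phone number, please?", {}),
    (lambda u: has_number(u, True), "The Canteen", {"medium": "txt"}),
]


def choose_reply(user: User) -> Optional[Reply]:
    """Return the first candidate whose filter passes for this user."""
    for passes, text, props in CANDIDATES:
        if passes(user):
            return Reply(text, dict(props))
    return None
```

The client can then inspect `props` (here `medium`) to decide how to deliver the reply, for example by SMS rather than in the chat window.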
Building for scalability
Typical web app design
Reactive Asynchronous Design
Ideal for messaging apps.
The Actor Model
Non-blocking and resilient.
Automatic failover.
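As a rough illustration of these properties, the sketch below models a single actor with Python's asyncio. It is an assumption-laden toy, not the framework used in the original design: `ConversationActor`, `ask`, and `demo` are hypothetical names, and "failover" is reduced to resetting the actor's state after an error so it keeps serving later messages. Senders never touch the actor's state directly; they only enqueue messages in its mailbox.

```python
import asyncio


class ConversationActor:
    """Owns its state privately and processes one mailbox message at a time."""

    def __init__(self):
        self.mailbox: asyncio.Queue = asyncio.Queue()
        self.count = 0      # private state, only touched inside run()
        self.restarts = 0

    async def tell(self, message: str, reply: asyncio.Future) -> None:
        """Enqueue a message together with a future for the answer."""
        await self.mailbox.put((message, reply))

    async def run(self) -> None:
        while True:
            message, reply = await self.mailbox.get()
            if message == "stop":
                reply.set_result("stopped")
                return
            try:
                reply.set_result(self.receive(message))
            except Exception as exc:
                # Crude supervision: report the failure, reset state, carry on.
                self.count = 0
                self.restarts += 1
                reply.set_exception(exc)

    def receive(self, message: str) -> str:
        if message == "boom":
            raise RuntimeError("simulated failure")
        self.count += 1
        return f"{self.count}: {message}"


async def ask(actor: ConversationActor, message: str) -> str:
    """Send a message and wait for the actor's answer."""
    reply = asyncio.get_running_loop().create_future()
    await actor.tell(message, reply)
    return await reply


async def demo():
    actor = ConversationActor()
    task = asyncio.create_task(actor.run())
    results = [await ask(actor, "hello")]
    try:
        await ask(actor, "boom")
    except RuntimeError:
        results.append("failed")
    results.append(await ask(actor, "hello again"))
    await ask(actor, "stop")
    await task
    return results, actor.restarts
```

Running `asyncio.run(demo())` shows the actor surviving the simulated failure and answering the next message with fresh state. A production actor system would add supervision hierarchies and distribution across nodes, which this sketch does not attempt.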
Collaborating Actors
Interfaces
Messaging Provider.
Conversation.
Integrating with API Services
Call API in response to matching an Intent.
Providing enhanced UI responses.
Implementing a Conversation Engine
Actors as Finite State Machines.
Watson Intents.
IBM Watson is a registered trademark of International Business Machines Corporation.
Watson Conversation Service.
Providing Watson Context.
Watson needs help. It doesn’t include the ability to parse time, measurements, etc. These entities must be identified and parsed before calling Watson. They can be provided to Watson as context in the request.
I’m using the excellent Duckling library for parsing time, and the Humanize library to present time in natural language.
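The pre-parsing step might look roughly like the following Python sketch. Everything here is illustrative: a small regular expression stands in for a real parser such as Duckling, the `duration` context key is a made-up name, and the `input` / `context` payload shape reflects my understanding of the Watson Conversation message request rather than anything stated in this post, so check it against the service documentation.

```python
import re

# Stand-in for a real time parser: recognises e.g. "30 minutes" or "2 hours".
DURATION = re.compile(r"\b(\d+)\s*(minute|hour|day)s?\b", re.IGNORECASE)
SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def extract_duration(text: str):
    """Find the first duration in the text and normalise it to seconds."""
    match = DURATION.search(text)
    if not match:
        return None
    amount, unit = int(match.group(1)), match.group(2).lower()
    return {"text": match.group(0), "seconds": amount * SECONDS[unit]}


def build_request(text: str, context: dict) -> dict:
    """Build a message payload with pre-parsed entities added to the context."""
    context = dict(context)  # copy so the caller's context is not mutated
    duration = extract_duration(text)
    if duration is not None:
        context["duration"] = duration  # hypothetical key name
    return {"input": {"text": text}, "context": context}
```

The dialog configured in Watson can then read the normalised value from the context instead of trying to interpret the raw phrase itself.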