It can be easy when you eat, sleep and breathe conversational artificial intelligence (or indeed any subject matter) to forget that not everyone else does too! But whilst we practitioners use technical terms and some jargon to discuss our craft, in any conversation we aim to be straightforward and understandable. With this in mind, our team has produced a glossary of some of the conversational AI terminology, jargon and acronyms we use every day that are second nature to conversational AI practitioners.
Here’s our CAI glossary, which we’ll continually update as (inevitably) new conversational AI terminology surfaces. If you’re a word–nerd and think we could improve our definitions, we welcome your feedback to make it better – or if there’s another term you’d like us to add please let us know.
Conversational AI Term | Definition |
Abandonment | Abandonment is when a user leaves a conversation with a chat or voice bot before completing their intended goal or transaction. |
Agentic AI | AI systems that can autonomously analyse data, set goals, and take actions on their own without constant human direction. These systems demonstrate a higher level of independence and decision-making capability. |
Application Programming Interface (API) | A set of rules and protocols that allow different software applications to communicate. Communicating with another application (e.g. getting delivery details from a customer database or sending an SMS) is the kind of action that helps conversational systems get things done, not just talk about it! |
Artificial Intelligence (AI) | The simulation of human intelligence in machines, enabling them to learn, reason, communicate and perform tasks. |
AI Models | AI models are the specific mathematical and computational frameworks that enable AI systems to process information and generate responses. These can include various types of neural networks and other machine learning architectures. Want a simpler definition? An AI model is a bit like a software program, but unlike a program which follows strict rules to handle tasks, an AI model learns patterns from data and can handle situations it wasn’t explicitly programmed for. |
Automatic Speech Recognition (ASR) | The process of converting spoken language into text. This is usually the first step in systems people can speak to, such as voicebots, smart speakers and voice assistants. |
Bidirectional Encoder Representations from Transformers (BERT) | A deep learning model designed for natural language processing (NLP) tasks that considers the context of words in both directions. |
Chatbot | A software application designed to hold a conversation with human users, usually over the Internet. Chatbots can be rule-based or powered by AI. Rule-based chatbots tend to give the user limited options to choose from, whereas AI-powered chatbots are more likely to try to emulate natural language, with varying degrees of success! |
Classifier | A Machine Learning (ML) algorithm or component that categorises user inputs into predefined categories or intents. Classifiers are commonly used in conversational systems to enable the software to understand what the user is trying to accomplish. |
Containment | Deflection and Containment can sometimes be confused as the same thing. Containment can be considered as the bot assisting the user directly. For example, answering the customer’s query. In more complex use cases, the bot can perform certain actions itself, often in lieu of a human who would otherwise have to be involved. For example, a process that allows the customer to report a fault and log that fault together with their account and contact details might mean a dialogue is considered to be contained. |
Context | Context can have multiple meanings. It may refer to the time, place and needs of the customer when they use the bot to communicate with the company. More often, it refers to the bot ‘knowing’ information about the customer without needing to re-ask for it. If a customer is logged in to the website or app, an intelligent bot should have access to that person’s account details, recent orders, open complaints etc. Good conversation design can harness that information to make for a more coherent customer experience, without the customer needing to repeat information that is already known by the company. |
Conversation | A series of exchanges between a user and an AI system that involves understanding context, maintaining coherence, and working toward resolving the user’s needs. |
Customer experience (CX) | Customer Experience encompasses the entire business relationship from the viewpoint of the customer. Bots are frequently used in customer service settings, and should therefore help alleviate frustration and cognitive load on behalf of the customer, rather than add to it. Constant consideration of the entire Customer Experience is essential in creating effective bot solutions, and the reason why conversation design remains so important. |
Deep Learning (DL) | A type of Machine Learning (ML) that uses neural networks with multiple layers to model complex patterns in data. |
Deflection | Deflection and Containment can sometimes be confused as the same thing. However, deflection can be considered as directing a customer to use other resources available to them, rather than have a human involved in helping them. For example, directing the customer to an online form or an FAQ. Essentially, deflection prevents a human having to get involved, often in trivial or less important queries or tasks. |
Dialogue Management (DM) | The component of a conversational AI system responsible for maintaining the context of the conversation and managing the flow of dialogue based on user inputs. Or in other words, the processes the chatbot goes through behind the scenes, to remember what the user has said, process new information from the user, and work out what to do or say next. |
Disambiguation | The process of clarifying user intent when multiple interpretations of their input are possible, usually by asking follow-up questions. |
Entity | Specific pieces of information in user input that can help clarify or add detail to the intent. For example, entities can include dates, locations, or product names. |
F1 Score (F1) | A measure of a model’s accuracy that balances precision and recall (two other common ML evaluation metrics). |
Fallback | A reply given by a chat or voice bot when it cannot understand what the user has just said. Usually the bot will ask the user to clarify what they mean or try to reword their query to try to match it to an intent. |
First Contact Resolution (FCR) | In Customer Service, FCR is an important indicator of how effective and efficient a process is at dealing with customer queries or issues. Ideally, a customer should not have to contact a company repeatedly about the same thing. When that happens, it’s a waste of time on both sides and wears down the customer’s patience and loyalty. FCR seeks to resolve the customer’s query or issue the first time they contact the company. |
Frequently Asked Questions (FAQ) | A structured knowledge base used to provide predefined answers to common customer enquiries. |
GenAI | Generative Artificial Intelligence. In conversational AI, GenAI bots are bots that do not have pre-written responses. Instead, they typically rely on an LLM to interpret the customer’s query and create a response on-the-fly. Whilst this provides a more natural and conversational experience, there are risks that the bot will provide inaccurate or poorly translated answers. Using a RAG approach can help to mitigate such ‘hallucinations’ but they cannot be entirely eliminated. |
Generative Pre-trained Transformer (GPT) | A specific type of LLM developed by OpenAI that generates coherent and contextually relevant text. |
Hallucination | Hallucination refers to instances where an LLM generates false, misleading or nonsensical information that may or may not seem plausible. Hallucinations occur because LLMs predict text based on patterns and not by understanding facts. |
Handover (handoff, escalation) | A handover is when the bot transfers, or escalates, the conversation to an agent. As agentic AI progresses, we may see an increase in handovers to other agentic bots. However, a handover is usually to a human who will receive the transcript of the conversation thus far and can help with the more complex questions and issues. As a handover incurs a cost to the business, the handover rate is a key metric when assessing the effectiveness of a bot. |
Human-Computer Interaction (HCI) | The study of designing user-friendly interactions between humans and machines. |
Intent | The goal or purpose behind a user’s input in a conversation. Identifying the intent helps the system understand what the user wants to achieve. Intent might be identified using an ML classifier or an LLM, but in any case it’s the important step of understanding what the user wants or needs in order for the chatbot to form a response. |
Interactive Voice Response (IVR) | A telephony system that interacts with callers using voice prompts and keypad inputs. |
Large language model (LLM) | A Large Language Model is a model which has been trained on an enormous amount of content. An LLM is intended to produce the ‘completion’ text to a ‘prompt’. The prompt may be a question or an entire conversation, and the LLM determines the next best word to use, based on what it has seen in its training. LLMs can appear to know everything and converse just like a human, but they can also be unpredictable and factually incorrect. An element of randomness is always present, although some controls can be dialled up or down to adjust it. LLMs are not information databases, and you cannot ‘search’ an LLM for answers per se, although interfaces to LLMs with deep research models are now available. LLMs are an incredible utility and can be applied in a variety of ways in conversational AI. |
Machine Learning (ML) | A subset of AI that uses algorithms to analyse data, learn from it, and make predictions or decisions without being explicitly programmed for specific tasks. |
Message | A single unit of communication between the user and the chat or voice bot, which can be in the form of text, speech, or other formats (such as background systems transferring context data). |
Named Entity Recognition (NER) | An NLP technique used to identify proper names, locations, dates, and other specific entities in text. |
Natural Language Generation (NLG) | NLG is a subset of Natural Language Processing (NLP) that focuses on transforming structured or unstructured data into human-readable text or spoken language. |
Natural Language Processing (NLP) | The branch of AI that focuses on enabling machines to understand, interpret, and generate human language. |
Natural Language Understanding (NLU) | A subfield of NLP that enables machines to comprehend meaning, intent, and context from text or speech. |
Regression Testing | Regression testing involves re-running previously developed tests to ensure that existing functionalities still work as expected after new code changes or additions. This practice helps verify that recent updates haven’t negatively impacted the existing software. High quality regression testing is important whether bots are LLM-powered, using more traditional NLP, or taking a hybrid approach. For more info, read our blog ‘Bulletproof chatbots: why regression testing is non-negotiable’. |
Retrieval Augmented Generation (RAG) | In a GenAI solution, retrieval augmented generation (RAG) is an approach that retrieves information relevant to the customer’s query, typically from an external data source. That information is then referenced in order to produce a relevant answer to the customer’s query. The additional information may be a large single document, or smaller chunks of various documents. The user’s query is passed to the LLM along with the additional information. This is also known as ‘grounding’ the LLM so that it has relevant reference material to call upon when creating a response, rather than relying on its original training data. It is a common approach to use vectorisation to retrieve information that is semantically similar to the user’s original query. |
Safety Net | Built-in mechanisms and responses that help handle unexpected inputs or situations where the AI system might not understand what the user asked. |
Sentiment Analysis | The process of determining the emotional tone behind a series of words, used to understand the user’s feelings and tailor responses accordingly. |
Semantic | Semantic relates to the meaning of language; in conversational AI, this refers to the system’s ability to understand the actual meaning and context of user inputs rather than just matching keywords. |
Small Language Model (SLM) | A Small Language Model is a Machine Learning model that can respond to and generate natural language. It is similar to an LLM but much smaller, and can therefore be cheaper and faster. SLMs need fewer resources to train and are typically used to perform more specific tasks. |
Speech-to-Text (STT) | A subset of ASR, but the terms are often used interchangeably. STT is the key function of transcribing spoken words into written form, often used in voice assistants. |
Text-to-Speech (TTS) | A technology that synthesises human-like speech from written text. TTS is commonly used in voice assistants to produce the spoken answers from the voicebot. |
Training | Training is the process of teaching AI models to understand and respond appropriately to user inputs using large datasets and machine learning (ML) techniques. |
Training data | The datasets used to train AI models. In conversational AI, this includes examples of user inputs and the corresponding outputs or responses. |
Turn Count | The number of back-and-forth exchanges between the user and the AI system in a single conversation, often used as a metric for conversation efficiency. One user utterance, paired with the associated bot response, is counted as one turn. |
User Experience (UX) | The overall experience a user has while interacting with a product or service. In conversational AI, this includes how intuitive and satisfying the conversation feels. In addition to the conversational aspects, UX also includes the chat interface or any other mechanism the user interacts with to get to or as part of a chat. |
Utterance | An utterance refers to a single unit of speech or text in a chat or voicebot conversation, usually a single message or a statement from either the user or the bot. It can be as short as one word or as long as several sentences. |
Virtual Assistant (VA) | An AI-powered software agent that performs tasks or provides information via text or voice interaction. Sometimes used synonymously with chatbot, conversational agent, voice assistant and similar terms. |
Word Error Rate (WER) | A metric used to evaluate the accuracy of speech recognition by comparing the system’s transcription with a correct reference transcript of the original speech. It is calculated as the number of word substitutions, deletions and insertions divided by the number of words in the reference. A low WER indicates that the user’s speech is being well recognised, which is the first step to ensuring the bot provides good responses. |
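For readers who like to see the arithmetic, here is a minimal Python sketch of how a Word Error Rate could be calculated. It assumes the standard formulation (substitutions, deletions and insertions divided by the number of words in the reference transcript) and uses a simple lowercase, split-on-whitespace comparison. The function name `word_error_rate` is our own illustration, not part of any particular speech recognition tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level edit distance, computed one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion: a reference word was missed
                current[j - 1] + 1,      # insertion: an extra word was transcribed
                previous[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            ))
        previous = current
    return previous[-1] / len(ref)


# "order" misheard as "border": one substitution in a six-word reference.
print(word_error_rate("i want to track my order", "i want to track my border"))
```

Real evaluation pipelines usually normalise punctuation and numbers before comparing, which this sketch deliberately leaves out.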
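The F1 Score entry above mentions balancing precision and recall. As a rough illustration, the sketch below computes all three from counts of correct and incorrect predictions for a single intent. It assumes the usual definition of F1 as the harmonic mean of precision and recall. The function name and the example counts are invented for illustration only.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall for one category."""
    predicted = true_positives + false_positives  # everything the model labelled as this intent
    actual = true_positives + false_negatives     # everything that really was this intent
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted  # of the predictions made, how many were right
    recall = true_positives / actual        # of the real cases, how many were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct predictions, 2 false alarms, 4 missed cases.
print(round(f1_score(8, 2, 4), 3))
```

Because it is a harmonic mean, F1 stays low if either precision or recall is low, which is why it is often preferred over plain accuracy when some intents are much rarer than others.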
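The RAG entry above describes retrieving relevant information and passing it to the LLM alongside the user’s query. The sketch below shows that flow in miniature. Note the assumptions: it uses simple word counts and cosine similarity as a stand-in for the semantic vectorisation a real solution would use, the prompt wording is invented, and no LLM is actually called. All function names and sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def vectorise(text: str) -> Counter:
    """Turn text into a bag-of-words vector (a toy stand-in for semantic embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vector = vectorise(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, vectorise(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(query: str, chunks: list[str], top_k: int = 1) -> str:
    """Combine the retrieved reference material with the query, ready to send to an LLM."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return (
        "Answer using only the reference material below.\n\n"
        f"Reference material:\n{context}\n\n"
        f"Customer query: {query}"
    )


chunks = [
    "Standard delivery takes three to five working days.",
    "Refunds are issued within ten days of a return.",
    "Our stores are open from nine to five.",
]
print(build_grounded_prompt("How long does delivery take?", chunks))
```

Word matching like this would miss a query phrased as "when will my parcel arrive?", which is exactly the gap that semantic vectorisation is meant to close.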