A snapshot of the ways virtual assistant performance is measured, conversational AI industry trends, and valuable insight into real-world practices.
In 2025, effective bot performance assessment is as important as ever, and is increasingly challenging. With the adoption of AI-powered conversational agents across various industries, understanding how to measure and optimise performance is crucial for businesses seeking to enhance customer experiences and operational efficiency.
With this in mind, we decided to investigate how it’s done, where the gaps are, and what best practices look like. To do this we conducted in-depth interviews, worked with Innovate UK, and ran a survey. We found that virtual assistant developers and designers sit across a range of departments, including marketing, customer service and more.
This has interesting implications for analytics, as assessment of bot performance can be influenced by the emphasis of the department. Some teams create complex strategies to get to the heart of what they want to know, while others work in the dark.
So what did we find?
Current Trends
- Chatbots are increasingly leveraging AI technologies. Natural Language Processing (NLP) has been part of chatbot implementation for many years, but we are now seeing companies incorporate Large Language Models (LLMs) as well to improve the ability of their bots to understand and respond to user queries. The most common use is fallbacks: when a hybrid chatbot encounters a query it can’t handle, generative AI is used to create a response.
- Performance data for bots is typically checked daily, or at least weekly, but teams often struggle to access useful, actionable information.
- The longer a bot has been established, the more confidence there is in the understanding of its performance. It’s worth noting that strategies to monitor known issues don’t always surface new ones.
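The hybrid fallback pattern described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the intent classifier, confidence threshold, and generative call are all placeholder assumptions standing in for whatever NLP stack and LLM a team actually uses.

```python
# Sketch of a hybrid fallback: scripted answers when intent confidence
# is high, a generative (LLM) response otherwise. All names, thresholds
# and responses here are illustrative placeholders.

CONFIDENCE_THRESHOLD = 0.7  # below this, the scripted bot defers

SCRIPTED_RESPONSES = {
    "track_order": "You can track your order at ...",
    "opening_hours": "We are open 9am-5pm, Monday to Friday.",
}

def classify_intent(query: str) -> tuple[str, float]:
    """Placeholder intent classifier: returns (intent, confidence)."""
    if "order" in query.lower():
        return "track_order", 0.92
    return "unknown", 0.10

def generate_fallback(query: str) -> str:
    """Placeholder for the generative (LLM) call used only on fallback."""
    return f"[generated] Sorry, let me help with: {query}"

def respond(query: str) -> str:
    intent, confidence = classify_intent(query)
    if confidence >= CONFIDENCE_THRESHOLD and intent in SCRIPTED_RESPONSES:
        return SCRIPTED_RESPONSES[intent]
    return generate_fallback(query)  # generative fallback path
```

In a real deployment the fallback branch is also where many teams log the unhandled query, since those logs feed directly into the performance metrics discussed below.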
Performance Metrics and Assessment Practices
The top three indicators used to evaluate bot performance:
- Handovers
- Customer Satisfaction
- Fallbacks
These are closely followed by user drop-off, containment and goal completion. We also saw metrics such as cost per conversion (one for the marketing teams!) and industry-specific measures such as clinical safety for healthcare bots.
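To make these indicators concrete, here is a small sketch of how they might be computed from conversation logs. The record format is an assumption for illustration; real logs vary by platform.

```python
# Illustrative computation of the survey's top indicators from
# per-conversation records. The record schema is a made-up example.

conversations = [
    {"handover": False, "fallbacks": 0, "csat": 5, "goal_completed": True},
    {"handover": True,  "fallbacks": 2, "csat": 3, "goal_completed": False},
    {"handover": False, "fallbacks": 1, "csat": 4, "goal_completed": True},
]

def bot_metrics(convos):
    n = len(convos)
    return {
        # share of conversations escalated to a human agent
        "handover_rate": sum(c["handover"] for c in convos) / n,
        # share with at least one fallback response
        "fallback_rate": sum(c["fallbacks"] > 0 for c in convos) / n,
        # containment: conversations fully handled by the bot
        "containment": sum(not c["handover"] for c in convos) / n,
        # mean customer satisfaction score (assuming a 1-5 scale)
        "avg_csat": sum(c["csat"] for c in convos) / n,
        # share of conversations where the user's goal was met
        "goal_completion": sum(c["goal_completed"] for c in convos) / n,
    }
```

Note that containment is simply the complement of the handover rate in this simplified schema; in practice teams often also exclude abandoned conversations from containment.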
When it comes to indicators not currently measured, there is a big desire to understand nuanced outcomes: translation issues, genuine customer feelings and preferences, and the quality of bot utterances.
Analysis is often done by a data analyst or data scientist in the CAI or customer care team. This could well be due to the complex business intelligence tools typically used. Manual transcript review is very much the norm, despite the time it takes. Indeed, manual effort is a recurring challenge in understanding conversational performance, along with difficulty in understanding user behaviour and a lack of specialist tools.
Best Practices in Bot Performance Assessment
Based on the survey results and industry trends, several best practices emerge:
- Comprehensive training and knowledge base: Maintain an accurate, relevant, and up-to-date knowledge base to ensure effective handling of queries
- Regular testing and optimisation: Continuously test and optimise bot responses, leveraging AI analytics to identify gaps and enhance capabilities
- Effective fallback mechanisms: Implement robust fallback options, including escalation to human agents when necessary
- User feedback integration: Collect and analyse user feedback to drive iterative improvements in bot functionality and user experience
- Security and compliance: Ensure bots comply with relevant standards such as GDPR and HIPAA, using encryption and authentication protocols to secure interactions
- Scalability and integration: Design bots to scale easily and integrate with existing systems to enhance functionality and support business growth
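The "regular testing and optimisation" practice above is often implemented as a lightweight regression suite over canonical queries. The sketch below assumes a `respond(query)` function like any bot front end exposes; the queries and expected fragments are invented examples.

```python
# Minimal regression harness for bot responses: each case pairs a
# canonical query with a fragment the reply should still contain.
# Cases and the respond() function are illustrative assumptions.

REGRESSION_CASES = [
    ("What are your opening hours?", "9am"),
    ("How do I reset my password?", "reset"),
]

def run_regression(respond, cases):
    """Return the cases whose response no longer contains the expected text."""
    failures = []
    for query, expected_fragment in cases:
        reply = respond(query)
        if expected_fragment.lower() not in reply.lower():
            failures.append((query, reply))
    return failures
```

Running a suite like this daily, before and after each knowledge-base change, catches regressions without waiting for them to surface as handovers or fallbacks in production.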
What’s Next for Bot Performance Assessment in 2025?
Our findings suggest that the rise of LLM-powered bots will add complexity to analysing conversational data. At the same time, advances in AI will power better analytical tools to meet that challenge.
As the field continues to evolve, successful organisations will be those that stay informed about emerging best practices, invest in robust assessment tools, and maintain a focus on continuous improvement to ensure their bots meet both operational goals and user expectations.
To find out more about what we discovered when it comes to virtual assistant performance and how best to measure it, download the full report:
Watch our on-demand webinar for a deep dive into the findings.