Case Study
AI Customer Support Chatbot for SaaS Platform
An LLM-powered support chatbot that deflected 61% of tier-1 support tickets, reduced first-response time from 4.2 hours to under 90 seconds, and cut support cost per resolution by 44%.
- Client
- Vaultify (B2B SaaS)
- Industry
- Technology / SaaS
- Duration
- 9 weeks
- Delivered
- March 2026
Key Results
Ticket deflection rate: 61%
First-response time: 4.2 hours → <90 seconds
Support cost per resolution: -44%
CSAT score maintained at 4.6/5
Support team capacity freed for complex issues
The Challenge
Vaultify is a B2B SaaS platform providing document storage and compliance management to professional services firms. With 2,800 active business accounts and a 60-person support team, they were experiencing a common scaling problem: support ticket volume was growing at 18% quarter-on-quarter while headcount grew at 6%.
Analysis of their Zendesk ticket data revealed a familiar pattern: 58% of tickets were tier-1 queries — questions already answered in their help centre — requiring an agent only to locate and send a link or paste a standard response. Each such ticket averaged 12 minutes of agent time, including context-switching overhead.
The compounding problems:
- Average first-response time of 4.2 hours during peak periods — creating friction for business customers who needed answers quickly
- Support agents spent the majority of their day on repetitive tier-1 work, leading to high turnover in the support function
- A growing help centre (340 articles) that customers struggled to navigate — most opened a ticket rather than searching
- A planned product expansion would add two new modules and significantly increase support volume
The brief: deploy a customer-facing AI chatbot that could handle tier-1 queries accurately, escalate to human agents when needed, and integrate with their existing Zendesk workflow — without degrading the 4.7/5 CSAT score they had worked hard to maintain.
Our Approach
Week 1–2: Knowledge base audit and conversation design
We audited all 340 help centre articles and 6 months of closed Zendesk tickets. We identified 94 intent categories that covered 81% of ticket volume and mapped each to the relevant documentation.
This analysis shaped the conversation design: rather than building a rigid decision tree, we designed an intent classification layer that routed queries to the appropriate RAG retrieval context — ensuring the chatbot answered with Vaultify's actual documentation rather than hallucinated responses.
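The routing layer described above can be sketched as a simple intent-to-namespace lookup with a confidence fallback. The intent names, namespaces, and threshold below are hypothetical illustrations, not Vaultify's actual taxonomy:

```python
# Illustrative sketch of the intent-routing layer. Intent names,
# namespaces, and the confidence threshold are invented examples.

INTENT_TO_NAMESPACE = {
    "password_reset": "account-module",
    "audit_export": "compliance-module",
    "folder_permissions": "storage-module",
}

def route_query(intent: str, confidence: float, threshold: float = 0.75) -> str:
    """Map a classified intent to the vector-store namespace whose
    documentation should ground the answer; fall back to a broad
    namespace when classification confidence is low."""
    if confidence < threshold or intent not in INTENT_TO_NAMESPACE:
        return "general"
    return INTENT_TO_NAMESPACE[intent]
```

Routing retrieval by intent keeps the context window filled with documentation from the relevant product module rather than near-miss matches from elsewhere in the help centre.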
We also defined explicit escalation triggers:
- Account-specific queries (billing, contract changes, data exports)
- Multi-step configuration issues beyond tier-1 complexity
- Any message containing frustration indicators (detected via sentiment classification)
- Explicit requests to speak to a human
Week 3–6: RAG pipeline and chatbot development
We built a Retrieval-Augmented Generation pipeline using:
- OpenAI
text-embedding-3-largeto embed all 340 help centre articles and update embeddings automatically when articles were published or revised - Pinecone as the vector store, with namespace segmentation by product module
- Claude claude-sonnet-4-6 as the generation model — chosen for its instruction-following reliability and lower hallucination rate on grounded retrieval tasks
- LangChain for orchestration, with a custom prompt template that enforced source citation and prohibited answers unsupported by retrieved context
For escalation, we built a Zendesk ticket creation flow that pre-populated the ticket with the conversation history, the user's account ID (pulled from their authenticated session), and the chatbot's assessment of intent — giving agents immediate context without requiring customers to repeat themselves.
Week 7–8: Testing and CSAT calibration
We ran 1,200 test conversations against historical tickets with known correct answers, measuring both accuracy and response appropriateness. Initial accuracy on tier-1 queries was 91.3%; after prompt refinement and retrieval tuning it reached 94.7%.
CSAT calibration was the most sensitive part of the project. We tuned escalation thresholds conservatively — preferring to escalate borderline cases rather than over-deflect — and tested with a panel of 40 real Vaultify customers before go-live.
Week 9: Phased rollout
We launched to 20% of inbound chat traffic for the first week, monitoring deflection rate, CSAT, and escalation accuracy before opening to full traffic.
The Results
Operational outcomes at 90 days:
- Ticket deflection: 61% of tier-1 inquiries resolved by the chatbot without agent involvement — above the 55% target
- First-response time: Reduced from 4.2 hours to under 90 seconds for all chatbot-handled queries
- Agent workload: Support agents shifted from 58% tier-1 work to 19% tier-1 work — spending more time on complex, relationship-building issues
- CSAT: Maintained at 4.6/5 — a marginal decrease of 0.1 point from the 4.7/5 baseline, within acceptable tolerance
Business outcomes:
- Support cost per resolution reduced by 44% (blended across chatbot-handled and agent-handled tickets)
- Vaultify onboarded the two new product modules without adding support headcount
- Agent attrition in the support team dropped by approximately 30% in the subsequent quarter — attributed partly to more interesting work and reduced repetitive volume
Unexpected benefit:
The chatbot's conversation logs became a product intelligence asset. Vaultify's product team began reviewing weekly digests of unanswered questions — queries the chatbot escalated because no documentation existed — as a direct input to their documentation roadmap and product FAQ improvements.
Technical Stack
- Embeddings: OpenAI
text-embedding-3-large - Vector store: Pinecone (3 namespaces by product area)
- Generation model: Anthropic Claude claude-sonnet-4-6 via API
- Orchestration: LangChain with custom prompt templates and guardrails
- Escalation integration: Zendesk API (ticket creation, conversation transcript attachment)
- Auth integration: Vaultify session token → account context enrichment
- Frontend widget: React component embedded in Vaultify's existing Next.js app
- Analytics: Custom dashboard built on Metabase showing deflection rate, CSAT, escalation triggers, and unanswered intent distribution
- Monitoring: LangSmith for prompt performance tracking and regression detection
