Case Study
AI Customer Support Chatbot for SaaS Platform
An LLM-powered support chatbot that deflected 61% of tier-1 support tickets, reduced first-response time from 4.2 hours to under 90 seconds, and cut support cost per resolution by 44%.
- Client
- Vaultify (B2B SaaS)
- Industry
- Technology / SaaS
- Duration
- 9 weeks
- Delivered
- March 2026
Key Results
Ticket deflection rate: 61%
First-response time: 4.2 hours → <90 seconds
Support cost per resolution: -44%
CSAT score maintained at 4.6/5
Support team capacity freed for complex issues
The Challenge
Vaultify is a B2B SaaS platform providing document storage and compliance management to professional services firms. With 2,800 active business accounts and a 60-person support team, they were experiencing a common scaling problem: support ticket volume was growing at 18% quarter-on-quarter while headcount grew at 6%.
Analysis of their Zendesk ticket data revealed a familiar pattern: 58% of tickets were tier-1 queries — questions already answered in their help centre — requiring an agent only to locate and send a link or paste a standard response. Each such ticket averaged 12 minutes of agent time, including context-switching overhead.
The compounding problems:
- Average first-response time of 4.2 hours during peak periods — creating friction for business customers who needed answers quickly
- Support agents spent the majority of their day on repetitive tier-1 work, leading to high turnover in the support function
- A growing help centre (340 articles) that customers struggled to navigate — most opened a ticket rather than searching
- A planned product expansion would add two new modules and significantly increase support volume
The brief: deploy a customer-facing AI chatbot that could handle tier-1 queries accurately, escalate to human agents when needed, and integrate with their existing Zendesk workflow — without degrading the 4.7/5 CSAT score they had worked hard to maintain.
Our Approach
Week 1–2: Knowledge base audit and conversation design
We audited all 340 help centre articles and 6 months of closed Zendesk tickets. We identified 94 intent categories that covered 81% of ticket volume and mapped each to the relevant documentation.
This analysis shaped the conversation design: rather than building a rigid decision tree, we designed an intent classification layer that routed queries to the appropriate RAG retrieval context — ensuring the chatbot answered with Vaultify's actual documentation rather than hallucinated responses.
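The routing layer described above can be sketched as a simple intent-to-namespace lookup with a confidence fallback. The intent names, namespaces, and threshold below are hypothetical illustrations, not Vaultify's actual taxonomy:

```python
# Illustrative sketch of the intent-routing layer. Intent names,
# namespaces, and the confidence threshold are invented examples.

INTENT_TO_NAMESPACE = {
    "password_reset": "account-module",
    "audit_export": "compliance-module",
    "folder_permissions": "storage-module",
}

def route_query(intent: str, confidence: float, threshold: float = 0.75) -> str:
    """Map a classified intent to the vector-store namespace whose
    documentation should ground the answer; fall back to a broad
    namespace when classification confidence is low."""
    if confidence < threshold or intent not in INTENT_TO_NAMESPACE:
        return "general"
    return INTENT_TO_NAMESPACE[intent]
```

Routing retrieval by intent keeps the context window filled with documentation from the relevant product module rather than near-miss matches from elsewhere in the help centre.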
We also defined explicit escalation triggers:
- Account-specific queries (billing, contract changes, data exports)
- Multi-step configuration issues beyond tier-1 complexity
- Any message containing frustration indicators (detected via sentiment classification)
- Explicit requests to speak to a human
Week 3–6: RAG pipeline and chatbot development
We built a Retrieval-Augmented Generation pipeline using:
- OpenAI
text-embedding-3-largeto embed all 340 help centre articles and update embeddings automatically when articles were published or revised - Pinecone as the vector store, with namespace segmentation by product module
- Claude claude-sonnet-4-6 as the generation model — chosen for its instruction-following reliability and lower hallucination rate on grounded retrieval tasks
- LangChain for orchestration, with a custom prompt template that enforced source citation and prohibited answers unsupported by retrieved context
For escalation, we built a Zendesk ticket creation flow that pre-populated the ticket with the conversation history, the user's account ID (pulled from their authenticated session), and the chatbot's assessment of intent — giving agents immediate context without requiring customers to repeat themselves.
Week 7–8: Testing and CSAT calibration
We ran 1,200 test conversations against historical tickets with known correct answers, measuring both accuracy and response appropriateness. Initial accuracy on tier-1 queries was 91.3%; after prompt refinement and retrieval tuning it reached 94.7%.
CSAT calibration was the most sensitive part of the project. We tuned escalation thresholds conservatively — preferring to escalate borderline cases rather than over-deflect — and tested with a panel of 40 real Vaultify customers before go-live.
Week 9: Phased rollout
We launched to 20% of inbound chat traffic for the first week, monitoring deflection rate, CSAT, and escalation accuracy before opening to full traffic.
The Results
Operational outcomes at 90 days:
- Ticket deflection: 61% of tier-1 inquiries resolved by the chatbot without agent involvement — above the 55% target
- First-response time: Reduced from 4.2 hours to under 90 seconds for all chatbot-handled queries
- Agent workload: Support agents shifted from 58% tier-1 work to 19% tier-1 work — spending more time on complex, relationship-building issues
- CSAT: Maintained at 4.6/5 — a marginal decrease of 0.1 point from the 4.7/5 baseline, within acceptable tolerance
Business outcomes:
- Support cost per resolution reduced by 44% (blended across chatbot-handled and agent-handled tickets)
- Vaultify onboarded the two new product modules without adding support headcount
- Agent attrition in the support team dropped by approximately 30% in the subsequent quarter — attributed partly to more interesting work and reduced repetitive volume
Unexpected benefit:
The chatbot's conversation logs became a product intelligence asset. Vaultify's product team began reviewing weekly digests of unanswered questions — queries the chatbot escalated because no documentation existed — as a direct input to their documentation roadmap and product FAQ improvements.
Technical Stack
- Embeddings: OpenAI
text-embedding-3-large - Vector store: Pinecone (3 namespaces by product area)
- Generation model: Anthropic Claude claude-sonnet-4-6 via API
- Orchestration: LangChain with custom prompt templates and guardrails
- Escalation integration: Zendesk API (ticket creation, conversation transcript attachment)
- Auth integration: Vaultify session token → account context enrichment
- Frontend widget: React component embedded in Vaultify's existing Next.js app
- Analytics: Custom dashboard built on Metabase showing deflection rate, CSAT, escalation triggers, and unanswered intent distribution
- Monitoring: LangSmith for prompt performance tracking and regression detection
