Case Study

AI Customer Support Chatbot for SaaS Platform

An LLM-powered support chatbot that deflected 61% of tier-1 support tickets, reduced first-response time from 4.2 hours to under 90 seconds, and cut support cost per resolution by 44%.

Client
Vaultify (B2B SaaS)
Industry
Technology / SaaS
Duration
9 weeks
Delivered
March 2026
Chatbot Development · AI Integration · AI Automation

Key Results

  • Ticket deflection rate: 61%

  • First-response time: 4.2 hours → <90 seconds

  • Support cost per resolution: -44%

  • CSAT score maintained at 4.6/5

  • Support team capacity freed for complex issues

The Challenge

Vaultify is a B2B SaaS platform providing document storage and compliance management to professional services firms. With 2,800 active business accounts and a 60-person support team, they were experiencing a common scaling problem: support ticket volume was growing at 18% quarter-on-quarter while headcount grew at 6%.

Analysis of their Zendesk ticket data revealed a familiar pattern: 58% of tickets were tier-1 queries — questions already answered in their help centre — that required a support agent to locate and send a link or copy-paste a standard response. Each ticket averaged 12 minutes of agent time, including context-switching overhead.

The compounding problems:

  • Average first-response time of 4.2 hours during peak periods — creating friction for business customers who needed answers quickly
  • Support agents spent the majority of their day on repetitive tier-1 work, leading to high turnover in the support function
  • A growing help centre (340 articles) that customers struggled to navigate — most opened a ticket rather than searching
  • A planned product expansion would add two new modules and significantly increase support volume

The brief: deploy a customer-facing AI chatbot that could handle tier-1 queries accurately, escalate to human agents when needed, and integrate with their existing Zendesk workflow — without degrading the 4.7/5 CSAT score they had worked hard to maintain.

Our Approach

Week 1–2: Knowledge base audit and conversation design

We audited all 340 help centre articles and 6 months of closed Zendesk tickets. We identified 94 intent categories that covered 81% of ticket volume and mapped each to the relevant documentation.

This analysis shaped the conversation design: rather than building a rigid decision tree, we designed an intent classification layer that routed queries to the appropriate RAG retrieval context — ensuring the chatbot answered with Vaultify's actual documentation rather than hallucinated responses.
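The routing idea can be sketched in a few lines. The intent names and namespace mapping below are illustrative stand-ins, not Vaultify's actual 94 categories:

```python
# Hypothetical intent -> retrieval-namespace mapping. The real system
# maps 94 intent categories to Pinecone namespaces; these names are
# invented for illustration.
INTENT_TO_NAMESPACE = {
    "document_upload": "storage",
    "retention_policy": "compliance",
    "user_permissions": "admin",
}

def route_query(intent: str, default: str = "general") -> str:
    """Pick the retrieval namespace for a classified intent.

    Unrecognised intents fall through to a general namespace rather
    than failing, so retrieval always has some context to draw on.
    """
    return INTENT_TO_NAMESPACE.get(intent, default)
```

The key design choice is that classification happens before retrieval, so each query searches only the documentation relevant to its product area instead of the whole help centre.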

We also defined explicit escalation triggers:

  • Account-specific queries (billing, contract changes, data exports)
  • Multi-step configuration issues beyond tier-1 complexity
  • Any message containing frustration indicators (detected via sentiment classification)
  • Explicit requests to speak to a human
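The four triggers above compose into a single escalation check. A minimal sketch, assuming an upstream intent classifier and sentiment label (the intent names and the naive phrase match are hypothetical simplifications):

```python
# Hypothetical intent labels; the production triggers are richer than
# this and sentiment comes from a dedicated classifier, not a string.
ACCOUNT_INTENTS = {"billing", "contract_change", "data_export"}
COMPLEX_INTENTS = {"multi_step_config"}
HUMAN_PHRASES = ("speak to a human", "talk to a person", "human agent")

def should_escalate(intent: str, sentiment: str, message: str) -> bool:
    """Return True if any escalation trigger fires for this turn."""
    if intent in ACCOUNT_INTENTS or intent in COMPLEX_INTENTS:
        return True                      # account-specific or too complex
    if sentiment == "negative":
        return True                      # frustration indicator
    lowered = message.lower()
    if any(phrase in lowered for phrase in HUMAN_PHRASES):
        return True                      # explicit request for a human
    return False
```

Any single trigger is sufficient — the checks are deliberately OR-ed rather than weighted, which keeps the escalation behaviour easy to audit.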

Week 3–6: RAG pipeline and chatbot development

We built a Retrieval-Augmented Generation pipeline using:

  • OpenAI text-embedding-3-large to embed all 340 help centre articles and update embeddings automatically when articles were published or revised
  • Pinecone as the vector store, with namespace segmentation by product module
  • Claude (claude-sonnet-4-6) as the generation model — chosen for its instruction-following reliability and lower hallucination rate on grounded retrieval tasks
  • LangChain for orchestration, with a custom prompt template that enforced source citation and prohibited answers unsupported by retrieved context
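The citation-and-refusal constraint lives mostly in the prompt. A minimal sketch of how such a template might assemble retrieved chunks into a grounded prompt — the wording and the `ESCALATE` sentinel are assumptions, not Vaultify's production template:

```python
# Illustrative prompt template. The real template and refusal sentinel
# are not shown in the case study; this is a plausible shape only.
PROMPT_TEMPLATE = """You are Vaultify's support assistant.
Answer ONLY using the numbered help-centre excerpts below.
Cite every claim with its excerpt number, e.g. [1].
If the excerpts do not contain the answer, reply exactly: ESCALATE.

Excerpts:
{context}

Question: {question}
"""

def build_prompt(chunks: list[str], question: str) -> str:
    """Number the retrieved chunks and slot them into the template."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

A fixed sentinel like `ESCALATE` gives the orchestration layer a deterministic string to match on, so "no supported answer" routes into the same escalation flow as the other triggers.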

For escalation, we built a Zendesk ticket creation flow that pre-populated the ticket with the conversation history, the user's account ID (pulled from their authenticated session), and the chatbot's assessment of intent — giving agents immediate context without requiring customers to repeat themselves.
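The pre-population step amounts to building the ticket payload before calling the Zendesk Tickets API. A sketch under assumptions — the subject format, tags, and use of `external_id` for the account ID are illustrative choices, not the documented integration:

```python
# Hypothetical payload builder for Zendesk's "create ticket" endpoint
# (POST /api/v2/tickets). Field usage here is an assumption.
def build_escalation_ticket(history: list[dict], account_id: str,
                            intent: str) -> dict:
    """Fold the chat transcript and bot context into one ticket payload."""
    transcript = "\n".join(f"{m['role']}: {m['text']}" for m in history)
    return {
        "ticket": {
            "subject": f"[Chatbot escalation] {intent}",
            "comment": {"body": transcript},   # full conversation history
            "external_id": account_id,         # from authenticated session
            "tags": ["chatbot_escalation", intent],
        }
    }
```

Because the transcript, account ID, and assessed intent all arrive in the first agent view, the customer never has to restate the problem.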

Week 7–8: Testing and CSAT calibration

We ran 1,200 test conversations against historical tickets with known correct answers, measuring both accuracy and response appropriateness. Initial accuracy on tier-1 queries was 91.3%; after prompt refinement and retrieval tuning it reached 94.7%.
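The shape of such an evaluation harness is simple: replay historical questions, grade the answers, report a percentage. The sketch below uses a substring match as a stand-in grader — the real evaluation of "accuracy and response appropriateness" would use human or LLM judgement, not string containment:

```python
def evaluate(cases: list[tuple[str, str]], answer_fn) -> float:
    """Replay (question, expected) pairs through answer_fn and return
    percent judged correct.

    The substring check is a placeholder grader for illustration;
    a production harness would grade semantically.
    """
    correct = sum(
        1 for question, expected in cases
        if expected.lower() in answer_fn(question).lower()
    )
    return round(100 * correct / len(cases), 1)
```

Running the same fixed case set after every prompt or retrieval change is what makes an improvement like 91.3% → 94.7% measurable rather than anecdotal.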

CSAT calibration was the most sensitive part of the project. We tuned escalation thresholds conservatively — preferring to escalate borderline cases rather than over-deflect — and tested with a panel of 40 real Vaultify customers before go-live.

Week 9: Phased rollout

We launched to 20% of inbound chat traffic for the first week, monitoring deflection rate, CSAT, and escalation accuracy before opening to full traffic.
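A phased rollout like this is typically implemented with deterministic bucketing, so a given user stays in the same arm across sessions. A minimal sketch (the hashing scheme is an assumption, not the mechanism the case study describes):

```python
import hashlib

def in_rollout(user_id: str, percent: int) -> bool:
    """Deterministically assign a user to the chatbot arm.

    Hashing the user ID into 100 buckets means the same user always
    lands in the same arm, and raising `percent` only adds users --
    nobody already in the rollout gets removed.
    """
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < percent
```

Moving from the 20% pilot to full traffic is then a one-line config change from `percent=20` to `percent=100`.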

The Results

Operational outcomes at 90 days:

  • Ticket deflection: 61% of tier-1 inquiries resolved by the chatbot without agent involvement — above the 55% target
  • First-response time: Reduced from 4.2 hours to under 90 seconds for all chatbot-handled queries
  • Agent workload: Support agents shifted from 58% tier-1 work to 19% tier-1 work — spending more time on complex, relationship-building issues
  • CSAT: Maintained at 4.6/5 — a marginal decrease of 0.1 point from the 4.7/5 baseline, within acceptable tolerance

Business outcomes:

  • Support cost per resolution reduced by 44% (blended across chatbot-handled and agent-handled tickets)
  • Vaultify onboarded the two new product modules without adding support headcount
  • Agent attrition in the support team dropped by approximately 30% in the subsequent quarter — attributed partly to more interesting work and reduced repetitive volume

Unexpected benefit:

The chatbot's conversation logs became a product intelligence asset. Vaultify's product team began reviewing weekly digests of unanswered questions — queries the chatbot escalated because no documentation existed — as a direct input to their documentation roadmap and product FAQ improvements.

Technical Stack

  • Embeddings: OpenAI text-embedding-3-large
  • Vector store: Pinecone (3 namespaces by product area)
  • Generation model: Anthropic Claude (claude-sonnet-4-6) via API
  • Orchestration: LangChain with custom prompt templates and guardrails
  • Escalation integration: Zendesk API (ticket creation, conversation transcript attachment)
  • Auth integration: Vaultify session token → account context enrichment
  • Frontend widget: React component embedded in Vaultify's existing Next.js app
  • Analytics: Custom dashboard built on Metabase showing deflection rate, CSAT, escalation triggers, and unanswered intent distribution
  • Monitoring: LangSmith for prompt performance tracking and regression detection