Conversational AI for Tier-One Customer Support

ABOUT THE PROJECT

Overview

A fast-growing e-commerce brand selling across apparel, accessories, and lifestyle categories was handling 18,400 customer support contacts per month through a team of 14 agents working across email, live chat, and a Facebook Messenger channel. The brand had grown 3.4x in revenue over 24 months but had only increased its support headcount by 40% in the same period — a gap that was showing up in the data. Average first response time across all channels had slipped from 1.8 hours to 4.2 hours over the previous six months. CSAT scores had fallen from 4.7 to 4.3 out of 5. A recurring theme in negative reviews cited slow response times as the primary complaint.

Verttx built and deployed a conversational AI handling all tier-one support contacts across the brand's three channels — trained on the brand's actual support history, integrated with their order management system and returns platform, and designed with escalation paths that hand off to human agents with full conversation context. The system went live 8 weeks after the initial discovery call. Within 60 days, 74% of contacts were resolved without human involvement.

The Situation

An analysis of the brand's previous 12 months of support tickets revealed that 78% of contacts fell into six query categories: order status enquiries, delivery delay complaints, return and exchange requests, product sizing questions, discount code issues, and account access problems. These were not complex queries requiring judgement or product expertise. They were information retrieval and process execution tasks — the kind of work that a well-designed system connected to the right data sources could handle without any human involvement.

The support team knew this. Agents were spending 61% of their available time on these six categories, leaving 39% for the genuinely complex contacts that actually required experienced agents: damaged goods claims, wholesale enquiries, multi-item sizing consultations, and complaint escalations. Morale was low. Turnover in the support team over the previous 12 months was 43%, well above the industry average of 30% for e-commerce support roles, and the hiring manager attributed it directly to agent frustration with a high-volume, low-complexity workload.

The business case was straightforward. Automating the six high-volume categories would free the agent team for complex work, reduce response times for the majority of customers to near-instant, and create the headroom to grow contact volume without linear headcount growth. The risk the brand's Head of Customer Experience wanted to avoid was the common failure mode of e-commerce chatbots: a system that handles simple cases adequately but frustrates customers who need escalation and damages CSAT in the process.

The Approach

Built from real support history

Verttx began by analysing 94,000 historical support tickets from the previous 18 months — categorising, tagging, and mapping the resolution paths for every ticket type. This was not a desk exercise. We identified 47 distinct query sub-types within the six primary categories, mapped the exact data sources each resolution required, and documented the edge cases within each category that genuinely required human judgement versus those that were being escalated unnecessarily by agents following overly conservative guidelines. The distinction between "genuinely complex" and "complex-seeming but resolvable" is where most chatbot implementations fail — they either automate too aggressively and produce wrong answers, or they escalate too conservatively and achieve poor containment rates.

RAG architecture for accurate answers

The conversational AI is powered by a large language model with a Retrieval-Augmented Generation (RAG) architecture connecting it to four live data sources: the order management system (real-time order status, delivery tracking, and fulfilment data), the returns and exchanges platform (return eligibility rules, exchange availability, and refund status), the product knowledge base (sizing guides, material specifications, care instructions, and availability), and the brand's policy documentation (shipping commitments, discount terms, and account policies). Every answer the AI gives is grounded in a live data source; it does not generate responses from training knowledge alone. This sharply reduces the hallucination risk that makes generic LLM chatbots unsuitable for customer-facing use cases where factual accuracy is non-negotiable.
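The grounding rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the deployed implementation: `DataSource`, `retrieve`, and `answer` are hypothetical names, the lookups use invented toy data, and the LLM call is replaced by a plain string join so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-in for one integration. Each lookup returns a fact
# string, or None when the source has nothing relevant to the query.
@dataclass
class DataSource:
    name: str
    lookup: Callable[[str], Optional[str]]

def retrieve(query: str, sources: list[DataSource]) -> list[tuple[str, str]]:
    """Collect (source name, fact) pairs from every source with a match."""
    facts = []
    for source in sources:
        fact = source.lookup(query)
        if fact is not None:
            facts.append((source.name, fact))
    return facts

def answer(query: str, sources: list[DataSource]) -> dict:
    """Reply only from retrieved facts; escalate when nothing is retrieved."""
    facts = retrieve(query, sources)
    if not facts:
        return {"action": "escalate", "reply": None, "grounding": []}
    # A real system would pass these facts to the LLM as context here.
    reply = " ".join(fact for _, fact in facts)
    return {"action": "reply", "reply": reply,
            "grounding": [name for name, _ in facts]}

# Invented toy data, for illustration only.
orders = {"A1001": "Order A1001 has shipped and is out for delivery."}
sources = [
    DataSource("order_management",
               lambda q: next((v for k, v in orders.items() if k in q), None)),
    DataSource("policy_docs",
               lambda q: "See the returns policy." if "return" in q.lower() else None),
]

print(answer("Where is order A1001?", sources)["grounding"])
print(answer("Tell me a joke", sources)["action"])
```

The design point is the empty-retrieval branch: when no source returns a fact, the sketch hands off rather than letting the model answer from training knowledge.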

Escalation designed as carefully as automation

The escalation architecture received the same design rigour as the automation flows. When the AI determines that a contact requires human involvement — based on query type, sentiment signals, or explicit customer request — it hands off to the agent queue with a structured context summary covering the customer's account history, the current order details, the conversation transcript, a classification of the query type, and a recommended resolution approach derived from similar resolved cases. Agents receive escalations pre-briefed rather than cold. Average agent handle time on escalated contacts fell from 8.4 minutes to 5.1 minutes as a direct result of the context handoff quality.
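As a rough illustration of the structured context summary, the five handoff elements listed above could be carried in a single record. The field names, the `build_handoff` helper, and the validation rule are assumptions made for this sketch; the source does not describe the actual payload format.

```python
from dataclasses import dataclass, asdict

@dataclass
class EscalationContext:
    """The five handoff elements named above (field names are assumed)."""
    account_history: list[str]
    order_details: dict
    transcript: list[str]
    query_type: str
    recommended_resolution: str

def build_handoff(ctx: EscalationContext) -> dict:
    """Serialise the context for the agent queue, rejecting empty handoffs."""
    # Assumed rule: an agent should never receive an escalation with no
    # transcript or no query classification.
    if not ctx.transcript or not ctx.query_type:
        raise ValueError("handoff needs a transcript and a query type")
    return asdict(ctx)

# Invented example values, for illustration only.
ctx = EscalationContext(
    account_history=["2 previous orders"],
    order_details={"order_id": "A1001"},
    transcript=["Customer: my parcel arrived damaged"],
    query_type="damaged_goods",
    recommended_resolution="offer replacement",
)
payload = build_handoff(ctx)
print(sorted(payload))
```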

The system also monitors for sentiment deterioration in real time. A customer whose language signals frustration or distress triggers an escalation review regardless of query type — the AI does not attempt to resolve a contact where the customer's emotional state suggests the interaction requires a human presence, even if the underlying query is automatable.
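The three escalation triggers (explicit customer request, sentiment signals, and query type) might be combined along these lines. This is a sketch under assumptions: the category labels, the -1 to 1 sentiment scale, and the `FRUSTRATION_THRESHOLD` value are hypothetical, and the sentiment scores are taken to come from an upstream model that is not shown.

```python
# The six automatable categories come from the ticket analysis; the
# label strings themselves are assumed for this sketch.
AUTOMATABLE = {
    "order_status", "delivery_delay", "return_exchange",
    "sizing", "discount_code", "account_access",
}
FRUSTRATION_THRESHOLD = -0.5  # hypothetical cut-off on a -1..1 scale

def should_escalate(query_type: str,
                    sentiment_scores: list[float],
                    asked_for_human: bool) -> tuple[bool, str]:
    """Return (escalate?, reason), checking the triggers in priority order."""
    if asked_for_human:
        return True, "customer_request"
    # Sentiment overrides query type: a frustrated customer is routed for
    # review even when the underlying query is automatable.
    if sentiment_scores and sentiment_scores[-1] <= FRUSTRATION_THRESHOLD:
        return True, "sentiment_review"
    if query_type not in AUTOMATABLE:
        return True, "query_type"
    return False, "automate"

print(should_escalate("order_status", [0.2, -0.8], False))
print(should_escalate("order_status", [0.2], False))
```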

Channel deployment and testing

The AI was deployed across email, live chat, and Facebook Messenger simultaneously, with channel-specific response formatting — conversational and brief for chat and Messenger, structured and thorough for email. A two-week parallel testing period ran the AI alongside human agents on a 30/70 traffic split before full deployment. CSAT scores for AI-handled contacts during the test period averaged 4.5 out of 5 against the human agent baseline of 4.3 — confirming that automation would improve rather than damage satisfaction before the full cutover was approved.
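One common way to run a parallel test like this is a deterministic split keyed on a contact identifier, so the same contact is always routed the same way. The sketch below assumes the AI received the 30% share (the source gives only the 30/70 ratio) and uses a hypothetical `assign_handler` helper; it does not describe how the split was actually implemented.

```python
import hashlib

AI_PERCENT = 30  # assumption: the AI took the 30% side of the 30/70 split

def assign_handler(contact_id: str, ai_percent: int = AI_PERCENT) -> str:
    """Route a contact to 'ai' or 'human' based on a stable hash bucket."""
    digest = hashlib.sha256(contact_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket from 0 to 99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "ai" if bucket < ai_percent else "human"

# Rough check of the split over synthetic contact ids.
sample = [assign_handler(f"contact-{i}") for i in range(10_000)]
print(sample.count("ai") / len(sample))
```

Hashing instead of random assignment keeps follow-up messages in the same conversation with the same handler during the test.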

The Result

Within 60 days of full deployment, 74% of all incoming contacts were resolved by the AI without human involvement. Average first response time for automated contacts fell from 4.2 hours to 90 seconds. For the 26% of contacts escalated to human agents, average first response time fell from 4.2 hours to 38 minutes, because the reduction in queue volume freed agent capacity for complex cases. CSAT scores, which had fallen to 4.3 before deployment, recovered to 4.6 out of 5 within 90 days, close to the pre-decline level of 4.7. The recovery was driven by near-instant responses, which addressed the slow response times that customers had been citing as their primary complaint.

Cost per support resolution fell from $8.40 to $3.00 for automated contacts — a 64% reduction — producing a blended cost across automated and human-handled contacts of $4.10, down from the pre-implementation baseline of $8.40. The 14-agent support team was restructured: 8 agents now handle complex escalations, complaints, and wholesale enquiries full time. The remaining 6 positions were not backfilled as natural attrition occurred, reducing the support headcount cost by 43% while handling a contact volume that had grown by 28% since implementation.
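The figures in this paragraph can be cross-checked with a few lines of arithmetic. The implied cost of a human-handled contact is not stated in the case study; it is derived here from the blended figure, assuming the 74% automated share applies to the cost mix.

```python
baseline_cost = 8.40      # cost per resolution before implementation
automated_cost = 3.00     # cost per automated resolution
blended_cost = 4.10       # blended cost after implementation
automated_share = 0.74    # share of contacts resolved by the AI

# Reduction for automated contacts: (8.40 - 3.00) / 8.40
automated_reduction = (baseline_cost - automated_cost) / baseline_cost

# Solve blended = share * automated + (1 - share) * human for human.
implied_human_cost = (
    (blended_cost - automated_share * automated_cost) / (1 - automated_share)
)

# 6 of 14 positions not backfilled.
headcount_reduction = 6 / 14

print(f"automated reduction: {automated_reduction:.1%}")   # about 64%
print(f"implied human cost:  ${implied_human_cost:.2f}")   # about $7.23
print(f"headcount reduction: {headcount_reduction:.1%}")   # about 43%
```

On these assumptions the blended $4.10 implies human-handled contacts now cost roughly $7.23 each, below the $8.40 baseline, which is consistent with the shorter handle time reported for escalated contacts.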

Agent turnover fell from 43% to 18% in the 12 months post-implementation. The Head of Customer Experience attributed the improvement directly to the change in work quality — agents were no longer spending 61% of their day on order status lookups and discount code resets. The brand's Trustpilot rating improved from 3.8 to 4.2 over the same period, with reviewers specifically citing response speed as a positive change.

The complete conversational AI system — the LLM integration, the RAG knowledge base, all four live data source integrations, the escalation architecture, and the analytics layer — was transferred to the brand's engineering team at handover with full documentation and no lock-in to any Verttx tooling or infrastructure.

"Every chatbot we had looked at before Verttx either couldn't connect to our systems properly or produced answers that were close enough to right to be dangerous. Verttx built something that actually knows what it's talking about because it's pulling from our live data. Our CSAT went up after automation. That was not what I expected."

Head of Customer Experience, E-commerce Brand

RESULTS

74% of all incoming support contacts resolved automatically within 60 days of deployment. Average first response time for automated contacts fell from 4.2 hours to 90 seconds. CSAT improved from 4.3 to 4.6 out of 5. Cost per resolution fell 64% from $8.40 to $3.00 for automated contacts. Agent turnover fell from 43% to 18% as the team shifted from high-volume routine work to complex escalations. The brand's Trustpilot rating improved from 3.8 to 4.2 over the same 12-month period.

74%

Of contacts resolved without a human agent

90 sec

Average response time, down from 4.2 hours

64%

Reduction in cost per support resolution

18%

Agent annual turnover, down from 43%