How much customer support can AI actually resolve?
Most published auto-resolution rates are inflated by how "resolved" is counted. Here's an honest framework — and the 66–71% we see in real deployments.
Short answer: in our deployments, an AI agent resolves 66–71% of customer conversations without a human. That range is almost useless, though, until you know how "resolved" is being counted. The same AI can look like it resolves 90% or 40% of tickets depending entirely on the definition, the business, and what you plug it into.
That gap is where most buyers get burned. A vendor quotes you a headline number, you launch, and your real rate lands nowhere near it. So before you trust any percentage — ours included — it's worth understanding what actually drives it.
What "resolution" even means (and why most rates are inflated)
There are two ways to count a resolved conversation, and they produce wildly different numbers:
- Assumed resolution — the customer didn't reply again within some window, so the system marks it resolved. Cheap to measure, and it counts every person who gave up, got distracted, or rage-quit as a success.
- Confirmed resolution — the customer's actual problem was handled: the order was found, the answer was correct, no human had to step in, and they didn't reopen or escalate.
Almost every impressive auto-resolution headline you've seen uses something close to the first definition. As a benchmark, Intercom reports its Fin agent averages ~66% resolution across thousands of customers. It is a credible, widely cited figure, but one based on the vendor's own "assumed resolution" style of counting, which tends to read high. Treat any single published rate, from anyone, as directional rather than a promise.
The number worth chasing is confirmed resolution, because it's the only one that correlates with a customer who actually got helped.
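To make the gap between the two definitions concrete, here is a minimal sketch that scores the same conversations both ways. The `Conversation` fields and the ten sample records are invented for illustration; they are not taken from any real deployment or vendor schema.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    went_silent: bool            # customer did not reply within the window
    issue_handled: bool          # the actual problem was solved correctly
    human_stepped_in: bool       # a person had to take over
    reopened_or_escalated: bool  # customer came back or escalated later


def assumed_resolved(c: Conversation) -> bool:
    # Silence alone counts as success, including customers who gave up.
    return c.went_silent


def confirmed_resolved(c: Conversation) -> bool:
    # The problem was handled, no human was needed, and it stayed closed.
    return (
        c.issue_handled
        and not c.human_stepped_in
        and not c.reopened_or_escalated
    )


def rate(conversations, is_resolved) -> float:
    if not conversations:
        return 0.0
    return sum(1 for c in conversations if is_resolved(c)) / len(conversations)


# Made-up sample of 10 conversations, for illustration only.
sample = (
    [Conversation(True, True, False, False)] * 6     # genuinely helped
    + [Conversation(True, False, False, False)] * 3  # gave up silently
    + [Conversation(False, False, True, False)] * 1  # handed to a human
)

assumed_rate = rate(sample, assumed_resolved)      # 9 of 10
confirmed_rate = rate(sample, confirmed_resolved)  # 6 of 10
print(f"assumed: {assumed_rate:.0%}, confirmed: {confirmed_rate:.0%}")
```

In this toy sample the three customers who went quiet without being helped are the whole difference between the two rates.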
What a realistic rate looks like
Here's what we see across live deployments, counted as conversations the AI handled end-to-end without handing off to a person:
| Business | Type | What AI handles | Resolved without a human |
|---|---|---|---|
| Turna | Travel & ticketing | Changes, refunds, ticket rules — across languages, around the clock | 71% |
| İnce Topuk | Fashion e-commerce | Sizes, stock, prices, order status on WhatsApp | 66% |
| DüğünBuketi | Wedding marketplace | Open-ended planning questions that turn into enquiries | 38% became qualified leads |
Two things to notice. First, both resolution figures sit in the 66–71% range, not 90%, and that is close to the honest ceiling for most businesses once you count properly. Second, DüğünBuketi's number isn't a resolution rate at all: for a high-touch service, "success" is a qualified lead, not a closed ticket. The right metric depends on the job.
Across all of them, the median first reply lands in about 8 seconds — which matters because speed is what makes customers willing to let an agent try in the first place.
Why the same AI gets 40% for one business and 71% for another
The model is rarely the bottleneck. Three things move the number far more than which LLM is under the hood:
1. Whether the answer exists somewhere the AI can read. An agent grounded in your actual catalog, help docs, and policies — via retrieval over your own knowledge base — answers correctly. An agent guessing from general training hallucinates and gets escalated. This single factor explains most of the spread between a 40% and a 70% deployment.
2. Whether the AI can do things, not just talk. Resolution requires action: looking up an order, checking live stock, processing a return. An agent that calls real tools resolves "where's my order?" in one turn. A bot that can only chat sends the customer to a human for anything transactional — which caps its rate at the share of questions that are purely informational.
3. Whether handoff is designed, not accidental. Counterintuitively, a clean handoff to a human raises your effective resolution rate, because it stops the AI from forcing bad answers on the 30% it shouldn't touch. The goal isn't 100% automation — it's automating the repetitive 70% well and routing the judgment calls fast.
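The three factors above can be read as one routing decision per conversation. The sketch below is a hypothetical illustration, not a description of any particular product: the `Route` names and the four boolean inputs are assumptions, and in practice each flag would come from a classifier or retrieval step that is out of scope here.

```python
from enum import Enum


class Route(Enum):
    ANSWER_FROM_KB = "answer_from_kb"  # grounded answer from your own docs
    CALL_TOOL = "call_tool"            # look something up or take an action
    HANDOFF = "handoff"                # deliberate transfer to a person


def route(
    needs_judgment: bool,
    needs_action: bool,
    tool_available: bool,
    answer_in_kb: bool,
) -> Route:
    # Factor 3: judgment calls go to a human by design, not by accident.
    if needs_judgment:
        return Route.HANDOFF
    # Factor 2: transactional requests resolve only if a real tool exists.
    if needs_action:
        return Route.CALL_TOOL if tool_available else Route.HANDOFF
    # Factor 1: informational questions resolve only if the answer is
    # written down somewhere the agent can read.
    return Route.ANSWER_FROM_KB if answer_in_kb else Route.HANDOFF


# "Where's my order?" with and without an order-lookup integration.
with_tool = route(False, True, True, False)
chat_only = route(False, True, False, False)
print(with_tool.value, chat_only.value)
```

With `tool_available` set to `False`, every transactional request ends in a handoff, which is the cap on chat-only bots described in point 2.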
The metric that matters more than resolution rate
Resolution rate alone is a vanity metric. An agent that "resolves" 80% of conversations by stonewalling people into giving up is worse than one that resolves 60% with happy customers.
So pair it with two others:
- CSAT on AI-resolved conversations — are the people the AI handled actually satisfied, measured separately from human-handled ones?
- Value per resolution — not just "how many tickets did we deflect" but "what did each resolved conversation produce" — a recovered cart, a recommendation, a lead. This is the shift from treating support as a cost to measuring what conversations are worth.
A 66% resolution rate with high CSAT and recovered revenue beats an 85% rate of quietly abandoned customers every time.
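As a rough sketch of how the two companion metrics might be computed, assume each resolved conversation records who handled it, an optional 1–5 survey score, and any revenue attributed to it. The field names and sample values are hypothetical, and mean score is only one common way to report CSAT (many teams report the share of 4–5 ratings instead).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class ResolvedConversation:
    handled_by: str       # "ai" or "human"
    csat: Optional[int]   # 1-5 survey score, None if the survey was skipped
    value: float          # revenue attributed to the conversation, 0 if none


def csat_for(conversations, handled_by: str) -> Optional[float]:
    # Score AI-handled and human-handled conversations separately.
    scores = [
        c.csat
        for c in conversations
        if c.handled_by == handled_by and c.csat is not None
    ]
    return mean(scores) if scores else None


def value_per_resolution(conversations, handled_by: str = "ai") -> float:
    subset = [c for c in conversations if c.handled_by == handled_by]
    if not subset:
        return 0.0
    return sum(c.value for c in subset) / len(subset)


# Made-up records, for illustration only.
resolved = [
    ResolvedConversation("ai", 5, 40.0),    # recovered cart
    ResolvedConversation("ai", 4, 0.0),
    ResolvedConversation("ai", None, 0.0),  # survey skipped
    ResolvedConversation("ai", 3, 20.0),    # accepted a recommendation
    ResolvedConversation("human", 2, 0.0),
]

ai_csat = csat_for(resolved, "ai")
ai_value = value_per_resolution(resolved, "ai")
print(ai_csat, ai_value)
```

Skipped surveys are left out of the CSAT average but still count toward value per resolution, so a low response rate does not inflate the revenue figure.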
How to estimate your own realistic rate
You don't need to launch to get a defensible estimate. Audit one month of conversations and sort them into three buckets:
- Repetitive + answer-exists + no action needed (e.g. "what are your shipping times?"). The AI resolves nearly all of these.
- Repetitive + needs a system lookup (e.g. "where's my order?"). The AI resolves these if it's connected to the system that has the answer.
- Judgment, exceptions, emotion (e.g. "this is the third time this broke"). These should go to a human — and counting them as automatable is how vendors inflate their numbers.
Your realistic resolution rate is roughly the share of conversations in buckets 1 and 2, discounted by how much of your knowledge is actually written down and how many of the relevant systems you can connect. For most e-commerce and support teams, that lands in the 60–75% range, which matches what we see in production. (For a worked example on a single channel, see running a WhatsApp agent on a Shopify store.)
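One possible way to turn the audit into a number is sketched below. It assumes bucket 1 is discounted by knowledge-base coverage and bucket 2 by the fraction of needed systems that are connected; that weighting is one interpretation of the rule of thumb above rather than an established formula, and every input value is made up.

```python
def estimate_resolution_rate(
    informational: float,      # bucket 1 share of audited conversations
    lookup: float,             # bucket 2 share
    judgment: float,           # bucket 3 share, routed to humans
    kb_coverage: float,        # fraction of answers actually written down
    systems_connected: float,  # fraction of needed systems the AI can reach
) -> float:
    inputs = (informational, lookup, judgment, kb_coverage, systems_connected)
    if any(not 0.0 <= x <= 1.0 for x in inputs):
        raise ValueError("all inputs must be between 0 and 1")
    if abs(informational + lookup + judgment - 1.0) > 1e-6:
        raise ValueError("bucket shares must sum to 1")
    # Bucket 3 contributes nothing: those conversations go to a person.
    return informational * kb_coverage + lookup * systems_connected


# Hypothetical audit: 45% informational, 35% lookups, 20% judgment calls,
# with 90% of answers documented and 80% of systems connected.
estimate = estimate_resolution_rate(0.45, 0.35, 0.20, 0.9, 0.8)
print(f"estimated resolution rate: {estimate:.1%}")
```

With these invented inputs the estimate comes out near 68.5%. Raising `kb_coverage` or `systems_connected` moves it up, while the bucket 3 share stays out of reach by design.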
If you want to see where your own conversations would land across those three buckets, that audit is the first thing we do in a Vivollo deployment — before quoting you any number at all.
Common questions
- Is a higher resolution rate always better?
No. A high rate driven by assumed resolution often means customers gave up. Optimize for confirmed resolution plus CSAT.
- Can AI resolve 90%+ of support?
Rarely, and not honestly. The judgment-call tail of real support resists automation; claims above ~80% usually rely on loose counting.
- What's the fastest way to raise the rate?
Improve what the AI can read (your knowledge base) and what it can do (tool/system access) before changing anything about the model.
