Vivollo

How much customer support can AI actually resolve?

Most published auto-resolution rates are inflated by how "resolved" is counted. Here's an honest framework — and the 66–71% we see in real deployments.

Vivollo Team

Short answer: in our deployments, an AI agent resolves 66–71% of customer conversations without a human — but that range is almost useless until you know how "resolved" is being counted. The same AI can look like it resolves 90% or 40% of tickets depending entirely on the definition, the business, and what you plugged it into.

That gap is where most buyers get burned. A vendor quotes you a headline number, you launch, and your real rate lands nowhere near it. So before you trust any percentage — ours included — it's worth understanding what actually drives it.

What "resolution" even means (and why most rates are inflated)

There are two ways to count a resolved conversation, and they produce wildly different numbers:

  • Assumed resolution — the customer didn't reply again within some window, so the system marks it resolved. Cheap to measure, and it counts every person who gave up, got distracted, or rage-quit as a success.
  • Confirmed resolution — the customer's actual problem was handled: the order was found, the answer was correct, no human had to step in, and they didn't reopen or escalate.

Almost every impressive auto-resolution headline you've seen uses something close to the first definition. As a benchmark, Intercom reports its Fin agent averages ~66% resolution across thousands of customers — a credible, widely-cited figure, but one based on the vendor's own "assumed resolution" style of counting, which tends to read high. Treat any single published rate, from anyone, as directional rather than a promise.

The number worth chasing is confirmed resolution, because it's the only one that correlates with a customer who actually got helped.
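To make the gap between the two definitions concrete, here is a minimal Python sketch that scores the same set of conversations both ways. The `Conversation` fields and the sample data are hypothetical, invented for illustration; they are not taken from any real deployment or vendor schema.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    # All field names are hypothetical; map them to whatever your helpdesk exports.
    handed_to_human: bool      # a person had to step in
    customer_went_quiet: bool  # no further reply within the measurement window
    issue_handled: bool        # the actual problem was verified as fixed
    reopened: bool             # the customer reopened or escalated later


def assumed_resolution_rate(convs):
    """Share of conversations where the customer simply stopped replying."""
    if not convs:
        return 0.0
    hits = sum(1 for c in convs if c.customer_went_quiet and not c.handed_to_human)
    return hits / len(convs)


def confirmed_resolution_rate(convs):
    """Share where the problem was handled with no human, reopen, or escalation."""
    if not convs:
        return 0.0
    hits = sum(
        1 for c in convs
        if c.issue_handled and not c.handed_to_human and not c.reopened
    )
    return hits / len(convs)


# Made-up sample of 10 conversations.
sample = (
    # 5 genuinely solved by the AI
    [Conversation(False, True, True, False)] * 5
    # 2 customers gave up without an answer
    + [Conversation(False, True, False, False)] * 2
    # 1 looked solved but was reopened later
    + [Conversation(False, True, True, True)] * 1
    # 2 were handed to a human
    + [Conversation(True, False, True, False)] * 2
)

print(assumed_resolution_rate(sample))    # 0.8
print(confirmed_resolution_rate(sample))  # 0.5
```

On this invented sample the same agent scores 80% under assumed resolution and 50% under confirmed resolution; the difference is the customers who gave up or came back.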

What a realistic rate looks like

Here's what we see across live deployments, counted as conversations the AI handled end-to-end without handing off to a person:

Business | Type | What AI handles | Resolved without a human
Turna | Travel & ticketing | Changes, refunds, ticket rules, across languages, around the clock | 71%
İnce Topuk | Fashion e-commerce | Sizes, stock, prices, order status on WhatsApp | 66%
DüğünBuketi | Wedding marketplace | Open-ended planning questions that turn into enquiries | 38% became qualified leads

Two things to notice. First, the strong performers cluster in the 66–71% range, not 90%, and that is close to the honest ceiling for most businesses once you count properly. Second, DüğünBuketi's number isn't a resolution rate at all: for a high-touch service, "success" is a qualified lead, not a closed ticket. The right metric depends on the job.

Across all of them, the median first reply lands in about 8 seconds — which matters because speed is what makes customers willing to let an agent try in the first place.

Why the same AI gets 40% for one business and 71% for another

The model is rarely the bottleneck. Three things move the number far more than which LLM is under the hood:

1. Whether the answer exists somewhere the AI can read. An agent grounded in your actual catalog, help docs, and policies — via retrieval over your own knowledge base — answers correctly. An agent guessing from general training hallucinates and gets escalated. This single factor explains most of the spread between a 40% and a 70% deployment.

2. Whether the AI can do things, not just talk. Resolution requires action: looking up an order, checking live stock, processing a return. An agent that calls real tools resolves "where's my order?" in one turn. A bot that can only chat sends the customer to a human for anything transactional — which caps its rate at the share of questions that are purely informational.

3. Whether handoff is designed, not accidental. Counterintuitively, a clean handoff to a human raises your effective resolution rate, because it stops the AI from forcing bad answers on the 30% it shouldn't touch. The goal isn't 100% automation — it's automating the repetitive 70% well and routing the judgment calls fast.

The metric that matters more than resolution rate

Resolution rate alone is a vanity metric. An agent that "resolves" 80% of conversations by stonewalling people into giving up is worse than one that resolves 60% with happy customers.

So pair it with two others:

  • CSAT on AI-resolved conversations — are the people the AI handled actually satisfied, measured separately from human-handled ones?
  • Value per resolution: not just "how many tickets did we deflect" but "what did each resolved conversation produce", such as a recovered cart, a recommendation, or a lead. This is the shift from treating support as a cost to measuring what conversations are worth.

A 66% resolution rate with high CSAT and recovered revenue beats an 85% rate of quietly abandoned customers every time.
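As a rough sketch of how the two companion metrics could be computed, the Python below averages CSAT separately for AI-resolved and human-resolved conversations and divides attributed value by the number of AI resolutions. The `ResolvedConversation` fields, the 1 to 5 CSAT scale, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class ResolvedConversation:
    # Hypothetical fields; adapt to your own reporting export.
    by_ai: bool            # True if the AI resolved it without a human
    csat: Optional[int]    # survey score on an assumed 1-5 scale, None if unanswered
    value: float           # revenue attributed to the conversation (cart, lead, ...)


def csat_for(convs, by_ai):
    """Average CSAT for one group, ignoring conversations with no survey response."""
    scores = [c.csat for c in convs if c.by_ai == by_ai and c.csat is not None]
    return mean(scores) if scores else None


def value_per_resolution(convs):
    """Attributed value divided by the number of AI-resolved conversations."""
    ai = [c for c in convs if c.by_ai]
    return sum(c.value for c in ai) / len(ai) if ai else 0.0


# Made-up data: four AI-resolved conversations, two human-resolved ones.
resolved = [
    ResolvedConversation(True, 5, 40.0),
    ResolvedConversation(True, 4, 0.0),
    ResolvedConversation(True, None, 20.0),
    ResolvedConversation(True, 3, 0.0),
    ResolvedConversation(False, 4, 0.0),
    ResolvedConversation(False, 2, 0.0),
]

print(csat_for(resolved, by_ai=True))   # 4
print(csat_for(resolved, by_ai=False))  # 3
print(value_per_resolution(resolved))   # 15.0
```

Keeping the two CSAT groups separate is the point: a blended average would hide whether the AI-handled customers are the unhappy ones.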

How to estimate your own realistic rate

You don't need to launch to get a defensible estimate. Audit one month of conversations and sort them into three buckets:

  1. Repetitive + answer-exists + no action needed (e.g. "what are your shipping times?"). The AI resolves nearly all of these.
  2. Repetitive + needs a system lookup (e.g. "where's my order?"). The AI resolves these if it's connected to the system that has the answer.
  3. Judgment, exceptions, emotion (e.g. "this is the third time this broke"). These should go to a human — and counting them as automatable is how vendors inflate their numbers.

Your realistic resolution rate is roughly buckets 1 and 2, weighted by how much of your knowledge is actually written down and how many systems you can connect. For most e-commerce and support teams, that lands in the 60–75% range — which matches what we see in production. (For a worked example on a single channel, see running a WhatsApp agent on a Shopify store.)
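The bucket arithmetic above can be sketched in a few lines of Python. The linear weighting is an assumption chosen for simplicity, and the bucket names, the two coverage parameters, and the audit counts are hypothetical; treat the output as a rough estimate, not a forecast.

```python
def estimate_resolution_rate(counts, knowledge_coverage, system_coverage):
    """Estimate the share of conversations an AI agent could resolve.

    counts: conversations per bucket from a one-month audit, keyed by
        "informational" (bucket 1), "lookup" (bucket 2), "judgment" (bucket 3).
    knowledge_coverage: assumed fraction (0-1) of bucket-1 answers that are
        actually written down where the AI can read them.
    system_coverage: assumed fraction (0-1) of bucket-2 lookups that hit a
        system the AI is connected to.
    Bucket 3 is deliberately counted as not automatable.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    resolvable = (
        counts["informational"] * knowledge_coverage
        + counts["lookup"] * system_coverage
    )
    return resolvable / total


# Made-up audit of 1,000 conversations.
audit = {"informational": 450, "lookup": 350, "judgment": 200}

connected = estimate_resolution_rate(audit, knowledge_coverage=0.9, system_coverage=0.8)
chat_only = estimate_resolution_rate(audit, knowledge_coverage=0.9, system_coverage=0.0)

print(round(connected, 3))  # 0.685
print(round(chat_only, 3))  # 0.405
```

Setting `system_coverage` to zero models a bot that can only chat: its estimate falls to the informational share alone, which is the cap described earlier for agents without tool access.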


If you want to see where your own conversations would land across those three buckets, that audit is the first thing we do in a Vivollo deployment — before quoting you any number at all.

Common questions

Is a higher resolution rate always better?

No. A high rate driven by assumed resolution often means customers gave up. Optimize for confirmed resolution plus CSAT.

Can AI resolve 90%+ of support?

Rarely, and not with honest counting. The judgment-call tail of real support resists automation; claims above ~80% usually rely on loose counting.

What's the fastest way to raise the rate?

Improve what the AI can read (your knowledge base) and what it can do (tool/system access) before changing anything about the model.
