Vivollo

Vision (photos & images)

When a customer sends a photo, your agent can actually see it — and reason about what's in it.

Sometimes a customer can't describe the problem, but they can show it: a cracked screen, a label they can't read, or "Is this the right part?" with a photo attached. With Vision, Vivollo's agent looks at those images and reasons about them instead of asking the customer to put everything into words.

What it does

When a customer sends a photo in the conversation, the image is passed to the AI so it can look at it and factor it into the reply. That opens up conversations that were awkward or impossible before:

  • "Is this the part I need?" — the agent looks at the photo and compares it to your catalog.
  • "My order arrived damaged" — the agent sees the damage and starts a return, no interrogation required.
  • "I'm getting this error" — a screenshot tells the agent more than three paragraphs of back-and-forth.
  • "Does this come in my size?" — a photo of an item helps the agent identify it.

The result is fewer rounds of "can you describe it?" and faster resolutions — because the customer just showed you.

How it works in a conversation

There's nothing special to do — Vision fits into the normal flow:

  1. The customer attaches a photo and sends it like any other message.
  2. The image appears in the thread, visible to your team too.
  3. The agent considers the image alongside the text when it reasons and replies.

Recent images in the conversation are taken into account, so a customer can send a couple of angles and the agent sees them together.
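
For readers who want a mental model of the flow above, here is a minimal sketch of how text and recent images might be gathered into one input for the agent. It is an illustration only, not Vivollo's implementation: the `Message` class, the `build_agent_input` function, and the `max_images` limit of 3 are all hypothetical, and how many recent images are actually considered is not stated here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    """One customer message (hypothetical shape, for illustration only)."""
    text: str = ""
    image_urls: List[str] = field(default_factory=list)


def build_agent_input(messages: List[Message], max_images: int = 3) -> Dict[str, List[str]]:
    """Combine the conversation text with the most recent images.

    max_images is an assumed limit, not a documented Vivollo setting.
    """
    texts = [m.text for m in messages if m.text]
    # Flatten images in the order they were sent, oldest first.
    images = [url for m in messages for url in m.image_urls]
    # Keep only the newest ones; guard against max_images <= 0.
    recent = images[-max_images:] if max_images > 0 else []
    return {"text": texts, "images": recent}
```

In this sketch, a customer who sends several angles of the same item ends up with the latest few photos grouped together next to the text of the conversation.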

Beyond images

Customers can also send other files, such as video, audio, and documents. Those are captured and shown in the inbox for your team to open. Today the agent reasons only about images; other media is stored and displayed so a human can review it.

Plan availability

Vision is included in some plans, not every tier. If image understanding matters to your business, check your plan under Plans & limits. When Vision isn't available, photos still arrive and your team can see them; the agent just won't reason over them automatically.

A note on graceful behavior

Vision is designed to never break a conversation. If an image can't be processed for any reason, the agent simply carries on with the text it has rather than stalling. The customer keeps getting helped either way — at worst, the agent asks a clarifying question the old-fashioned way.
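
The fallback described above can be pictured with a short sketch. This is an assumption-laden illustration rather than Vivollo's actual code: `analyze` stands in for whatever image-understanding step is used, and `describe_images_safely`, `reply_context`, and the `needs_clarification` flag are hypothetical names.

```python
from typing import Callable, Dict, List


def describe_images_safely(images: List[bytes], analyze: Callable[[bytes], str]) -> List[str]:
    """Run the (hypothetical) analyze step on each image, skipping failures."""
    notes = []
    for image in images:
        try:
            notes.append(analyze(image))
        except Exception:
            # An unreadable image must not stall the conversation.
            continue
    return notes


def reply_context(text: str, images: List[bytes], analyze: Callable[[bytes], str]) -> Dict[str, object]:
    """Build what the agent replies from: the text plus any image notes."""
    notes = describe_images_safely(images, analyze)
    return {
        "text": text,
        "image_notes": notes,
        # Images were sent but none could be read: ask the customer instead.
        "needs_clarification": bool(images) and not notes,
    }
```

The design choice this sketch reflects is that an image failure is treated as missing information, not as an error: the reply is still built from the text, and a clarifying question covers the gap.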