6 min read 14 May

Automation: from bottleneck to advantage

Learn how we built a serverless AWS architecture for an entertainment company that handles thousands of requests daily.

Amazon Bedrock
Amazon OpenSearch
Amazon API Gateway
AWS Lambda
Amazon S3

Jakub Forysiak AI Technical Lead

Tl;dr:

A high-volume entertainment company faced thousands of daily support messages, repetitive requests, pressure on response times and costs while using Intercom as the support front door.
We built a cloud-native, serverless architecture on AWS with Amazon Bedrock, Amazon Knowledge Bases, OpenSearch, AWS Lambda, and Amazon API Gateway — integrated seamlessly with Intercom.
Two AI layers solve distinct problems: an AI firewall that protects flows, detects urgency, and prioritizes correctly, and a conversational AI agent that resolves simple cases end-to-end.
The result is real operational impact: faster responses, lower cost per resolution, better agent focus, and a support function that scales safely instead of reactively.

As digital products grow, customer support often becomes the first system to break under scale. When tens of thousands of messages arrive each month, even well-designed workflows collapse into queues, workarounds, and frustrated users trying to reach a human at any cost. The real issue isn’t response speed — it’s misallocated human effort. Skilled agents handle repetitive requests while complex, high-risk cases wait longer than they should. This creates rising costs, declining experience, and an operation that feels reactive rather than scalable.

In this article, we explore a practical approach to automating customer support for high-volume environments using AI-driven intent recognition and backend action execution. You’ll see how automation can resolve common requests instantly while routing sensitive cases to humans with the right context. The patterns described here come from building production systems designed to handle large message volumes with strict accuracy requirements.

Why this problem matters

At scale, customer support is shaped by three compounding pressures that traditional workflows can’t absorb:

Volume growth — As products expand across regions, time zones, and channels, inbound support traffic increases not just in size but in variability. Messages arrive around the clock, spanning everything from simple questions to urgent incidents, often outside standard operating hours.
Repetition — A disproportionate share of support volume consists of routine tasks — password resets, balance inquiries, transaction status checks. Individually trivial, these interactions collectively consume a large portion of agent capacity, leaving less time for cases that genuinely require human judgment.
The expectation gap — Users expect instant, accurate answers. Businesses, meanwhile, need predictable costs, fast SLAs, and consistent quality. Human-only support struggles to satisfy all three simultaneously.

A less obvious but critical challenge emerges at this point: users learn how to bypass guided flows. Free-text inputs and generic options like “Other” introduce noise into queues, pulling agents into avoidable back-and-forth and delaying resolution for high-priority cases.

Solving this requires more than a chatbot. It demands an intelligent automation layer that can protect support flows, triage intent accurately, handle routine actions end-to-end, and escalate complex or sensitive cases to humans — fast, safely, and with full context.

Our use case & objectives

We framed the initiative around four progressive capabilities for our customer support AI:

Phase 1: Detect & filter bypass attempts — Stop conversations from skipping intended routes and nudge users to the correct path.
Phase 2: Flag urgent cases for escalation — Recognize high-risk signals (e.g., potential account compromise) and prioritize a human.
Phase 3: Conversational AI for simple cases — Resolve low-complexity intents autonomously with deterministic steps and grounding.
Phase 4: Expansion to adjacent domains — Scale the approach to more categories and workflows as confidence grows.

These goals map tightly to business outcomes: lower cost per resolution, faster responses, better agent focus, and higher consistency across policies.

High-level flow

The starting point is a button-based support interface built on Intercom. Users are guided toward predefined categories intended to structure demand and reduce ambiguity. This approach works well for many — but at scale, cracks emerge. Some users shortcut the system by selecting generic options or entering free-text to reach an agent faster, creating noise and misaligned routing. To address this, the architecture introduces AI in two distinct but complementary roles.

The first is an AI firewall that sits between the incoming user message and human agent routing. Every interaction is evaluated before it reaches the support queue:

If the message represents a bypass attempt or falls outside allowed paths, the system gently redirects the user back to the appropriate flow.
When high-severity signals appear, the message is immediately escalated and prioritised, with context preserved.
In normal cases, the request proceeds to agents unchanged, but with clearer intent classification.

The second AI role operates alongside humans as a conversational support agent. This component handles low-complexity, high-frequency cases independently, following deterministic, validated workflows. It resolves simple requests end-to-end or hands off to a human when conditions exceed defined boundaries.

Together, these layers protect agent focus, reduce queue noise, and allow automation to grow safely — one category at a time.

Architecture overview

This architecture illustrates how AI is layered onto an existing helpdesk system. While the full platform includes monitoring, secrets management, and security controls, those elements are deliberately omitted here to keep the diagram focused on core data and decision flows.

The user interacts with a helpdesk interface through a primarily button-based flow, with limited free-text input at the end. All user interactions are sent via a REST API to Amazon API Gateway, which triggers a single, central AWS Lambda function acting as an AI agent router. This Lambda is responsible for orchestrating the system’s behaviour and contains the application logic that determines how each request is handled.

The Lambda communicates directly with a large language model hosted on Amazon Bedrock, where the core reasoning and intent interpretation occur. Depending on the request, the same underlying model operates in two distinct modes. In AI firewall mode, it validates intent, detects bypass attempts or misuse, redirects users to appropriate paths, or escalates urgent cases. In AI support agent mode, it resolves low-complexity issues autonomously or prepares structured hand-offs to human agents.

Both modes rely on retrieval from Amazon OpenSearch, which indexes curated knowledge stored in Amazon S3. A dedicated “data-puller” Lambda, triggered daily via EventBridge, synchronizes selected internal sources to keep this knowledge current. When needed, the agent Lambda also integrates with backend services to perform lightweight checks or actions.

This is the kind of project where the architecture appears simple, but the underlying solution is complex and hard to fine-tune. The real effort sits in Lambda code and LLM behaviour design, while parts of conversation handling remain in helpdesk workflows — yet despite its simplicity on paper, it delivers real impact by reducing costs and freeing agent time.

Data & knowledge: what we used

Support Flow Documentation — Intercom support chat path definitions, decision criteria, exceptions.
Internal Procedures — Operational rules that determine whether to self-serve or escalate.
FAQ Documents — The backbone for common-case automation content.

We curated these sources for clarity and minimal ambiguity, because early in the PoC the chosen model favored concise, essential instructions; richer context could cause inconsistent outputs. A more capable model would let us reintroduce detail without losing stability.

Early results & adoption signals

In early validations and reviewer tests:

AI “firewall” showed promising performance (80%+) at stopping or redirecting improper paths.
Urgent case detection exceeded 90%, ensuring high-risk cases reach humans quickly.
Latency averaged ~3 seconds, well under the 15-second connector timeout and aligned with the sub-30-second first response target.
The single-turn decision pattern limited context gathering — sometimes the AI should ask a clarifying question before routing.

We also captured human feedback on specific misclassifications (e.g., “ok vs wrong-path”), which is essential for the next iteration.

Final thoughts

AI-driven support automation isn’t about replacing agents — it’s about protecting their time. Start by defending flows and prioritizing safety, then grow automation where the evidence is strongest. The combination of serverless AWS, Amazon Bedrock RAG, and tight documentation governance gives you a stack that’s scalable, explainable, and adaptable to future channels and models.

If you’re planning a similar initiative, anchor the work in your current pain points (bypass, latency, costs), commit to evaluation discipline, and treat content quality as seriously as code. The payoff is a support operation that moves from cost center to competitive advantage — faster for users, calmer for agents, and clearer for leadership.