18 min read 30 June

A practical guide to Strands Agents

Get started with the Strands Agents SDK and build model-driven agents on AWS with this practical, hands-on guide.

Amazon Bedrock
Amazon Bedrock AgentCore
AWS Lambda
Strands Agents
Python

Gracjan Strzelec AI Engineer

Most of the popular ways to build an AI agent today start with a graph. LangGraph, probably the best-known example, has you lay out the nodes and edges yourself: what happens at each step, which branch to follow, how state moves through the flow. You wire the whole thing together by hand. It works, and for plenty of problems it’s exactly the right approach. But you do end up maintaining a fair amount of orchestration around the agent rather than improving the agent itself, and the day a better model comes out, and you want to swap it in, the framework tends to have opinions about that too.

Strands Agents takes a different route. It’s an open-source SDK that AWS built for its own teams and later released publicly, and its starting assumption is that you don’t need to hand-script the control flow anymore. You give the model some tools and a prompt, and you let it work out the steps. You stop drawing the flowchart and start trusting the model to reason through the problem on its own.

What got me hooked was how little sits between you and a working agent. The “hello world” is three lines, and the gap between that and something I’d actually put in front of a client turned out to be surprisingly small.

In this guide I’ll cover the idea behind the model-driven approach, the three parts that make up any Strands agent, and six small demos that build from those three lines all the way up to an agent that answers AWS questions from live documentation and pricing. At the end we’ll look at where this runs on AWS.

Two ways to build an agent

Before any code, it’s worth naming the two mental models for building agents, because Strands clearly picks a side.

Workflow-driven vs model-driven approaches

The first is workflow-driven. This is how most of us started, and it’s still the right choice for plenty of problems. You define the flow explicitly: do this, then call that, branch here, loop there. It’s predictable and easy to audit, because you wrote every path yourself. The cost is rigidity. The system only ever does what you spelled out, and when the requirements change you’re back in the graph, rewiring nodes and refactoring half the repo to make room.
The second is model-driven. Here the model and its prompt sit in the middle, with tools hanging off it that extend what it can do, and there’s no predefined path. You trust the model to reason through the task and decide which tools to call, in what order. It’s far more adaptive. When you want new behavior, you usually just adjust the prompt or add a tool rather than redesign a flow.

Important:

Strands defaults to the model-driven style. That’s really the whole philosophy, and AWS sums it up in a line they repeat a lot: Modern models are sophisticated enough to be their own orchestrators.

A few years ago that wasn’t true. Models weren’t steady enough to hand them the wheel. In 2026 they increasingly are, and the practical effect is that you steer with context instead of controlling the flow. You still engineer the tools, the prompts, and the information the model gets to see, but you let it decide what to do with all of it. You keep validating outputs, of course. You just stop micromanaging the route it takes to get there.

None of this is all-or-nothing, by the way. Workflows, graphs, and swarms are all still there in Strands for when you need hard structure. The model-driven loop is the default, not a cage. And the SDK is open-source and model-agnostic, so you’re not signing up to a vendor to use any of it.

What’s inside a Strands agent

A Strands agent is really just three things working together:

The model is the brain. It does the reasoning, decides whether a tool is needed, and picks which one. By default it runs on Amazon Bedrock, but it can just as easily be Anthropic’s API, OpenAI, or a local model through Ollama.
The tools are what make it an agent rather than a chatbot. On its own a model can only talk; give it tools and it can actually do things. A tool can be any Python function you write, an MCP server, or one of the roughly thirty built-ins that ship in the strands-agents-tools package.
The prompt sets the agent’s role and goals. This is the LLM-native way of “programming,” and it’s where prompt engineering does its work: change the prompt and you change how the agent behaves.

Put the three together and you get the agentic loop. You invoke the agent with a prompt, it reasons, maybe calls a tool, reads the result, reasons again, calls another tool, and keeps going until it decides on its own that it has the answer. Then it stops and hands it back.

You’ll see that loop play out in the demos below, including a couple of moments where the model gets something wrong and fixes it without any help, which is where the whole approach starts to earn its place.

Building up from three lines

Let’s actually build something. You’ll need Python 3.10+ and AWS credentials with Amazon Bedrock access.

pip install strands-agents strands-agents-tools

Set up your AWS credentials the way you normally would (aws configure, environment variables, or an IAM role), and make sure you’ve enabled model access in the Amazon Bedrock console for whatever models you plan to use.

I’d genuinely suggest following along and running these yourself rather than just reading. Each demo is a complete, self-contained script, they build on each other one idea at a time, and watching the loop happen live tells you far more than my summary of it can.

Demo 1: Hello world

We start at the absolute minimum: import the agent, create it, ask it a question. The smallest useful agent really is three lines.

demo1.pypython

from strands import Agentagent = Agent()agent("What is an AI agent?")

Agent() with no arguments uses Amazon Bedrock as the default provider. Run it and you get a clean answer:

An AI agent is a computer program that can perceive its environment,make decisions, and take actions autonomously to achieve specific goals.

Note: When I recorded these demos, the default model was Claude Sonnet 4, so your defaults may be different at this point

Fine, but there’s no real agency yet. It’s just a model answering a question. Let’s give it some shape.

Demo 2: Shaping behavior with a system prompt

Same agent as before, with one extra argument: a system prompt telling it to always answer in the form of a haiku. The rule is deliberately silly, but it’s the fastest way to see that the prompt on its own changes how the agent behaves.

demo2.pypython

from strands import Agentagent = Agent(    system_prompt="You always respond in the form of a haiku, with no exceptions.")agent("What is an AI agent?")

Ask the same question and the answer comes back in 5-7-5:

AI agent worksAutonomously toward goalsThrough environment

The constraint is a toy, but the point behind it isn’t: the system prompt is how you set the agent’s role and voice without writing a line of control logic.

Demo 3: Giving the agent tools

Now we make it properly agentic. We hand it two built-in tools, current_time and calculator, and ask for something it can’t answer on its own: the difference in minutes between the current time in Warsaw and in Las Vegas. To get there it has to look up both local times and do the arithmetic itself.

demo3.pypython

from strands import Agentfrom strands_tools import calculator, current_timeagent = Agent(    system_prompt="You always respond in the form of a haiku, with no exceptions.",    tools=[current_time, calculator],)agent("What's the difference in minutes between the current time in Warsaw and Las Vegas?")

Run it and the loop gets to work. The haiku system prompt is still in place, so even the agent’s running commentary comes out in tercets:

I need to checkBoth cities' current timeThen find the differenceTool #1: current_timeTool #2: current_timeLet me try the rightLas Vegas timezone namePacific time zoneTool #3: current_timeNow I'll computeTime difference between zonesIn minutes for youTool #4: calculatorWarsaw leads VegasBy nine hours exactly -Five forty minutes

This is the part that actually got me interested. Look at that third current_time call. On its first pass the model reached for the wrong Las Vegas time zone, then caught itself (Let me try the right / Las Vegas timezone name / Pacific time zone), looked it up again, and only then did the math, landing on the right answer of 540 minutes.

I hadn’t written a single line of error handling, and the loop still found and fixed its own mistake. Seeing that happen live is what sold me on the model-driven idea more than any slide could.

Tip:

In production you’d usually hide or restructure this reasoning trace (Strands lets you turn off the console streaming with callback_handler=None), but while you’re developing it’s gold.

Demo 4: Swapping the model

“Model-agnostic” is an easy thing to claim, so here’s what it actually costs. The task is identical to demo 3 on purpose — the only thing we’re changing is the brain. We define the model explicitly and switch to OpenAI’s open-weight gpt-oss-120b, served via Amazon Bedrock:

demo4.pypython

from strands import Agentfrom strands.models import BedrockModelfrom strands_tools import calculator, current_timemodel = BedrockModel(    model_id="openai.gpt-oss-120b-1:0",    temperature=0.1,    top_p=0.1,)agent = Agent(    model=model,    system_prompt="You always respond in the form of a haiku, with no exceptions.",    tools=[current_time, calculator],)agent("What's the difference in minutes between the current time in Warsaw and Las Vegas?")

The only line that changed is the model argument.

Tip:

Defining the model yourself also lets you tune parameters like temperature and top_p; keeping them low, as here, stops a factual agent from getting “creative” when you’d rather it didn’t.

Same question, same correct answer, and a completely different journey to get there:

The user asks: "What's the difference in minutes between current time inWarsaw and Las Vegas?" We need to respond in haiku form, no exceptions.Also we need to compute the difference in minutes... Warsaw is CET/CESTdepending on DST. Las Vegas is PST/PDT... In June, both are likely in DST:Warsaw is CEST (UTC+2), Las Vegas is PDT (UTC-7). So difference is 9 hours= 540 minutes. But need to confirm. Let's call current_time for both.Tool #1: current_timeTool #2: current_time[...pages of internal reasoning, even counting out the haiku's syllables...]Nine hours separateWarsaw sunrise, Vegas duskFive hundred forty

gpt-oss-120b lands on the same 540 minutes, but notice how much more of its reasoning spills out, and that while it formats the final answer as a haiku, it doesn’t think in haiku the way Claude did. That’s the thing to remember: swapping models is a one-line change, but two models won’t reason the same way, so it’s worth re-testing your prompts whenever you switch.

Demo 5: Writing your own tool

Built-ins only take you so far — real agents need tools that hit your own systems. So we write one. The goal is an agent that reports the total monthly EC2 cost for a region and breaks it down per running instance. In Strands, a tool is just a decorated Python function:

demo5.pypython

from strands import Agent, tool@tooldef list_ec2_instances(region: str) -> str:    """List EC2 instances in an AWS region with their type, state, and hourly cost.    Use this whenever the user asks about EC2 instances, usage, or costs in a region.    Args:        region (str): AWS region to inspect, e.g. "eu-central-1".    Returns:        str: One line per instance with name, id, type, state, and hourly rate.    """    # In production this would call boto3:    #   ec2 = boto3.client("ec2", region_name=region)    #   ec2.describe_instances(...)    # Here we mock the response:    return (        "web-prod      | i-abc123 | t3.medium  | running | $0.0456/hour\n"        "data-pipeline | i-jkl012 | r5.2xlarge | running | $0.504/hour\n"        "batch-old     | i-ghi789 | c5.xlarge  | stopped | $0.192/hour"    )agent = Agent(    system_prompt="You are a careful AWS infrastructure assistant.",    tools=[list_ec2_instances],)agent("What's the total monthly EC2 cost in eu-central-1? "      "Show the breakdown per running instance.")

Two things make this work:

The @tool decorator, which is all it takes for the agent to treat the function as something it can call.
The docstring, easy to underrate. In normal code a docstring is a note to other developers. For an agent it’s functional. It goes straight into the model’s context and it’s what the agent reads to decide whether and how to call your tool. Notice we also describe the region argument, because the model has to fill that in itself — we never pass it.

Tip:

If your agent keeps ignoring a tool or calling it wrong, the docstring is the first place I’d look.

Here’s what we got back:

## Monthly EC2 Cost Breakdown for eu-central-1Running instances (billable):  1. web-prod (i-abc123) — t3.medium — 720 hrs × $0.0456/hr = $32.83/month  2. data-pipeline (i-jkl012) — r5.2xlarge — 720 hrs × $0.504/hr = $362.88/monthStopped instances (no compute cost):  3. batch-old (i-ghi789) — c5.xlarge — $0.00Total monthly EC2 cost: $395.71Note: Only running instances incur charges. The stopped instance isn'tgenerating compute costs, though any attached EBS volumes still would.

We never told it any of those steps. It read the instance list, used the calculator to turn the two running instances’ hourly rates into monthly costs, gave the stopped one a $0 charge, added everything up to $395.71, and, because we’d asked it to be a careful assistant, pointed out that the stopped instance can still cost money through attached EBS storage.

Demo 6: Plugging in MCP

This is the demo I’d point most people to. The Model Context Protocol (MCP) is the open standard for connecting agents to outside tools and data; the easiest way to think about it is USB-C for AI agents. Instead of writing a bespoke integration for every service, you plug into an MCP server, and Strands treats those tools just like any other.

What we want here is an agent that can answer a concrete pricing-and-specs question about an instance type and cite the AWS docs it pulled the answer from. We connect two official AWS MCP servers, one for documentation and one for pricing, and add the calculator:

demo6.pypython

from mcp import stdio_client, StdioServerParametersfrom strands import Agentfrom strands.tools.mcp import MCPClientfrom strands_tools import calculator# Official AWS Documentation MCP serveraws_docs = MCPClient(lambda: stdio_client(    StdioServerParameters(        command="uvx",        args=["awslabs.aws-documentation-mcp-server@latest"],    )))# Official AWS Pricing MCP serveraws_pricing = MCPClient(lambda: stdio_client(    StdioServerParameters(        command="uvx",        args=["awslabs.aws-pricing-mcp-server@latest"],    )))with aws_docs, aws_pricing:    tools = aws_docs.list_tools_sync() + aws_pricing.list_tools_sync()    agent = Agent(        system_prompt="You are an AWS expert. Always cite the documentation you use.",        tools=tools + [calculator],    )    agent(        "What's the on-demand monthly cost of a t3.medium instance in eu-central-1? "        "How many vCPUs and how much RAM does it have? Cite the docs."    )

Roughly six lines per MCP client, and the agent picks up a whole new set of abilities. The loop is the best part to watch. It connects to both servers, has a go at pricing, jumps over to the documentation to search and read the right pages, hits a dead end on one search and recovers on its own, then uses the calculator to work out the monthly figure:

Processing request of type ListToolsRequestI'll find the on-demand pricing for a t3.medium in eu-central-1, plus its specs.Tool #1: get_pricingLet me also get the EC2 docs to cite the specifications:Tool #2: search_documentationTool #3: search_documentationTool #4: search_documentationTool #5: read_sections  ERROR: No matching sections were found: "General purpose instances".         Available sections: "Available instance types", "Hardware         specifications", ... "Processors". Retry with one of these, or use         read_documentation for the full page.Let me read the full instance types page instead:Tool #6: read_documentationTool #7: search_documentationTool #8: read_documentationTool #9: read_documentationNow let me calculate the monthly cost:Tool #10: calculator## t3.medium pricing in eu-central-1 (Frankfurt)On-demand monthly cost: $34.56  - Hourly rate: $0.048/hour  - $0.048 × 24 hours × 30 days = $34.56## t3.medium specificationsAccording to the Amazon EC2 Instance Types Guide:  - vCPUs: 2  - Memory: 4 GiBSource: https://docs.aws.amazon.com/ec2/latest/instancetypes/gp.html

See the read_sections error at tool #5? It asked the docs for a section that didn’t exist, got told which sections actually do exist, and quietly adjusted and read the full page instead, with no help from me. And that documentation link at the end is the important bit; it’s exactly what you want from an agent you’re going to rely on for facts. All in, that’s a few dozen lines of code for something that can reason over live AWS docs and pricing. I’d genuinely reach for this to scope services or sanity-check a cost estimate.

Where it fits in AWS

Strands is open-source and runs anywhere Python runs, so you’re not tied to AWS at all. But since that’s where a lot of this work happens, here’s how the pieces line up.

The model layer is Amazon Bedrock, the default source, and the integration is as smooth as the demos suggest. It’s serverless, so there’s no infrastructure to stand up and you pay per token. Two details tend to matter most in practice. One is choice: there are 30-plus foundation models from 10-plus providers (Anthropic, Meta, Amazon, OpenAI, Mistral AI, Qwen, and others), which is what makes swapping models a one-liner. The other is data residency: for EU companies with compliance requirements, Amazon Bedrock offers models hosted in European regions, and your data isn’t used to train the underlying models.

Warning:

The one thing to watch is quotas, which can be tight on new accounts. If you’re expecting heavy load from day one, talk to AWS early, because the increases aren’t automatic and aren’t always granted on the spot.

Once you have the agent and a model, you need somewhere to run it. The three usual options:

Option	Best for	Watch out for
AWS Lambda	Cheap at any scale, serverless, bounded multi-step tasks	15-minute timeout; you assemble the surrounding pieces yourself (API Gateway to expose it, DynamoDB for memory)
Amazon Bedrock AgentCore	An all-in-one platform with built-in memory and observability; long, iterative sessions (up to ~8 hours)	Costs more for the convenience; newer, with fewer regions
Amazon EC2	Steady, high-volume workloads; full control (e.g. GPU instances for local models)	Not serverless, so you pay 24/7; rarely the right call for agents

AWS Lambda is what I reach for most, and it’s a great fit as long as the work is bounded; just respect that 15-minute ceiling for big open-ended agents. Amazon Bedrock AgentCore is purpose-built for production agents and takes most of the assembly off your plate. Amazon EC2 is the exception, for the rare case where serverless economics genuinely don’t suit the workload, or when you want full, low-level control over the infrastructure the agent runs on.

Before you build

“Trust the model” is about reasoning and tool selection, not about taking outputs on faith. The autonomy that makes this approach pleasant to work with is the same autonomy you have to keep an eye on, so validate the things that matter.
For your own tools, treat the docstring as part of your prompt engineering, not as an afterthought. It’s what the agent reads to decide when and how to call the function, and tightening it up fixes most “why won’t it use my tool” problems.
Swapping models is cheap, but their behavior isn’t interchangeable. A one-line change can shift how the agent reasons and formats, so re-test when you switch. And when you need a new capability, an official MCP server is usually the fastest way to get it, with maintained integrations and citations more or less for free.
Finally, this isn’t an experiment you’re gambling on. Amazon Bedrock AgentCore is built on Strands, and AWS uses it inside its own products, including Amazon Q Developer, AWS Glue, and VPC Reachability Analyzer. There’s a TypeScript SDK alongside the Python one. You’re building on something AWS runs in production itself, not a side project that disappears in a quarter.

Wrapping up

The move from workflow-driven to model-driven agents tracks the models themselves. As they get better at reasoning, the value shifts away from hand-coding control flow and toward giving the model the right context, tools, and prompts, and then stepping back. Strands is the cleanest take on that idea I’ve worked with: a few lines of Python gets you a real agent, switching models is trivial, MCP makes new capabilities a plug away, and nothing locks you in.

If you want to size it up for yourself, build the six demos above. You’ll have them running in an afternoon, and that’s the quickest way to feel whether the model-driven approach fits how you work.

This guide is based on a webinar I gave on AWS Strands Agents. The full recording is available here. If you’d like to talk through agentic AI for your own use cases, reach me at gracjan.strzelec@chaosgears.com or on LinkedIn.

Resources