
Orchestrating a modern data platform on AWS with Apache Airflow, dbt, and Cosmos

Learn how to combine Apache Airflow on MWAA with dbt and Cosmos to build a scalable, production-ready data platform on AWS.



In modern data platforms, data flows from multiple sources and has to be processed in a specific order, often with dependencies between steps. We need a process that knows what to run and when, what to do when something fails, and how to report on it. This is exactly the job of an orchestrator. When discussing the modern data stack, it’s hard not to think of Apache Airflow as the go-to orchestration tool, and that’s what I’ll cover today. On top of that, I’ll walk you through hosting Airflow on AWS using MWAA, how it pairs with dbt, and how Cosmos can extend that setup to get the most out of the Airflow + dbt combo.

Apache Airflow as the industry standard

At its core, Airflow lets you describe a data process as code. You write a Python file, define the steps, say which step depends on which, and Airflow turns that into a DAG - a Directed Acyclic Graph. The “directed” part means the process flows in one direction, and “acyclic” means there are no loops, so you can’t accidentally create a pipeline that runs forever. In practice, a DAG is just a map of your workflow: do A, then B and C in parallel, then D once both are done.
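To make the "do A, then B and C in parallel, then D" idea concrete, here is a minimal sketch that uses only Python's standard library rather than Airflow itself. The step names come from the example above; the `execution_waves` helper is a hypothetical name introduced for illustration, and this is not how Airflow's scheduler is implemented.

```python
from graphlib import TopologicalSorter

# Each key maps a step to the set of steps it depends on.
dependencies = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def execution_waves(deps):
    """Group steps into waves; every step in a wave has all its dependencies met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps that can run right now
        waves.append(ready)
        sorter.done(*ready)  # mark them finished so their dependents unlock
    return waves


waves = execution_waves(dependencies)
print(waves)
```

Steps that land in the same wave are the ones an orchestrator is free to run in parallel, and the cycle check is the "acyclic" guarantee in action.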

Airflow reads the DAG, figures out what can run now and what has to wait, retries failed steps according to your rules, and keeps a full history of every run. If something breaks at 3 AM, you don’t wake up to a mystery: you open the UI and see exactly which step failed, what the logs say, and how long each previous run took. For anyone who’s ever debugged a pipeline by grepping through scattered log files, this alone is a game-changer.
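The "retry according to your rules and keep a history" behaviour can be pictured with a small standard-library sketch. Everything here (`run_with_retries`, `flaky_task`, the shape of the history list) is hypothetical and only illustrates the idea; Airflow's own retry handling lives in its scheduler and metadata database.

```python
import time


def run_with_retries(task, retries=2, retry_delay=0.0):
    """Call task(); on failure, retry up to `retries` more times.

    Returns (result, history), where history records every attempt.
    """
    history = []
    for attempt in range(1, retries + 2):
        try:
            result = task()
            history.append((attempt, "success"))
            return result, history
        except Exception as exc:
            history.append((attempt, f"failed: {exc}"))
            if attempt <= retries:
                time.sleep(retry_delay)  # wait before the next attempt
    raise RuntimeError(f"task failed after {retries + 1} attempts: {history}")


calls = {"n": 0}


def flaky_task():
    """Fails on the first call only, like a transient connection error."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("transient error")
    return "ok"


result, history = run_with_retries(flaky_task, retries=2)
print(result, history)
```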

From a business perspective, that visibility is a big deal. You get auditability out of the box - every run is logged, every failure tracked, every change in the pipeline versioned in Git just like application code. That’s why Airflow has become the default choice for teams building serious data platforms. It’s mature, battle-tested, and has an ecosystem of thousands of ready-made integrations covering pretty much anything you’d want to talk to: databases, cloud services, APIs, you name it.

For engineers, the Python-first approach is a breath of fresh air compared to drag-and-drop tools. Your pipeline is code, which means you can version it, test it, review it in pull requests, and reuse pieces of it across projects. No more “it worked when I clicked the buttons in the right order”: if it’s in the repo, it’s reproducible.

Airflow UI

Airflow's UI listing our DAGs, their schedules, and the status of the latest and next runs.

It’s worth mentioning that Airflow recently jumped to version 3.x, and the developer experience took a noticeable step up. The UI got a proper refresh, with task-level visualization that actually shows your pipeline as clean blocks connected by arrows, which is a huge improvement over the older, more cluttered interface. The new UI makes exploring runs, checking logs, and understanding dependencies genuinely pleasant. Airflow 3.x also brings better handling of data assets and a handful of quality-of-life improvements that make day-to-day work smoother.

Tip:

A common question is: why not just use something simpler, like AWS Step Functions? The honest answer is that for simple workflows, you absolutely can. But the moment your pipeline grows beyond a handful of steps, needs to coordinate multiple teams, or has to integrate with tools outside the AWS ecosystem, Airflow starts paying for itself. It’s the tool you reach for when you’re building a data platform, not just a single workflow.

Managed Airflow on AWS

Running Airflow yourself sounds simple until you actually try it. You need a web server, a scheduler, workers that execute the tasks, a metadata database to track everything, and a way to keep all of that alive, patched, and talking to each other. For a small team that just wants to run some pipelines, that’s a lot of infrastructure to babysit before you even get to the interesting part.

This is exactly the gap that Amazon Managed Workflows for Apache Airflow (MWAA) fills. It’s a fully managed service from AWS that hands you a working Airflow environment without the setup headache. You pick an Airflow version, point it at an Amazon S3 bucket where your DAGs live, and AWS takes care of the rest: the web server, the scheduler, the workers, the database, the scaling, the patching. You focus on writing pipelines, not on keeping Airflow alive.

The integration with the rest of AWS is where things get particularly nice. Permissions are handled through IAM, which means your Airflow tasks can access Amazon S3, Amazon Redshift, AWS Glue, or anything else in your account without juggling credentials. Logs flow straight into Amazon CloudWatch, so everything your DAGs print out ends up in one central place where you can search, alert, and monitor it alongside the rest of your infrastructure logs. And the whole environment runs inside a VPC, so you can lock it down to your private network and control exactly what it can and cannot reach.

Deploying code is refreshingly simple. DAGs live in an Amazon S3 bucket, and MWAA picks them up automatically - no SSH, no deploy scripts, no server restarts. Upload a new Python file, and a few seconds later the DAG shows up in the UI. For teams used to the pain of deploying changes to self-hosted Airflow, this alone is worth the switch.

Important:

One architectural pattern worth calling out: MWAA is great at orchestrating, but it’s not the place to do heavy compute. Airflow itself was never designed to crunch through large datasets - it’s an orchestrator, not the compute engine. The usual approach is to let MWAA coordinate the work while the actual heavy lifting happens where it belongs: SQL transformations run on Amazon Redshift, Snowflake or Databricks, Spark jobs run on AWS Glue or EMR, and MWAA just tells each service when to start and waits for it to finish. In our demo setup, for example, dbt runs directly on the MWAA workers, but all the real data transformation happens inside Amazon Redshift. MWAA barely breaks a sweat.
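The "tell the service to start, then wait for it to finish" pattern boils down to a polling loop that does almost no work on the orchestrator side. The sketch below is a standard-library illustration only: `FakeWarehouseJob` and `wait_for` are hypothetical stand-ins, not an AWS or Airflow API.

```python
import time


class FakeWarehouseJob:
    """Hypothetical stand-in for a remote job, e.g. a long-running warehouse query."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def status(self):
        # Each status check brings the fake job one step closer to finishing.
        if self._remaining > 0:
            self._remaining -= 1
            return "RUNNING"
        return "FINISHED"


def wait_for(job, poll_interval=0.0, max_polls=10):
    """Poll the job until it finishes; return how many status checks it took."""
    for polls in range(1, max_polls + 1):
        if job.status() == "FINISHED":
            return polls
        time.sleep(poll_interval)  # the orchestrator idles while the service works
    raise TimeoutError(f"job still running after {max_polls} status checks")


checks = wait_for(FakeWarehouseJob(polls_until_done=3))
print(checks)
```

The orchestrator only asks for status and sleeps, which is why the workers can stay small while the warehouse does the expensive part.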

dbt - the SQL-first approach to data models

If Airflow is the orchestrator that decides when things run, dbt (data build tool) is what actually does the work of transforming your data once it’s in the warehouse. The idea behind dbt is refreshingly simple: instead of writing complex ETL (Extract-Transform-Load) scripts in Python or Java, you describe your transformations as SQL SELECT statements, and dbt takes care of the rest - the order things run in, which models depend on which, testing, and even generating documentation for your data.

The key shift here is in how you think about transformations. In a traditional ETL script, you spell out every step: connect to the source, pull the data, transform it row by row, write it to the target. With dbt, you just write a SELECT that describes what the final table should look like. If model B references model A in its query, dbt figures out on its own that A has to run first. You don’t write the dependency graph - you write the SQL, and the dependency graph emerges from it. For anyone who’s ever maintained a tangled web of ETL scripts, this is a quiet revolution.
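As a rough illustration of how a dependency graph can emerge from SQL alone, the sketch below extracts `ref()` calls with a regular expression and orders the models. The SQL snippets and column names are made up for the example, and dbt itself renders the Jinja properly rather than using a regex, so treat this as a simplified model of the idea.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model SQL, keyed by model name.
models = {
    "gold_revenue_by_venue": "select sum(pricepaid) from {{ ref('silver_sales') }}",
    "silver_sales": "select cast(pricepaid as decimal(8,2)) as pricepaid "
                    "from {{ ref('bronze_sales') }}",
    "bronze_sales": "select * from tickit.sales",
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def build_graph(model_sql):
    """Map each model to the set of models it references via ref()."""
    return {name: set(REF_PATTERN.findall(sql)) for name, sql in model_sql.items()}


graph = build_graph(models)
run_order = list(TopologicalSorter(graph).static_order())
print(run_order)
```

Nobody declared that bronze runs before silver; the order falls out of the references inside the SELECT statements.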

dbt also lives in a natural sweet spot with modern data warehouses. Tools like Amazon Redshift, Snowflake, and BigQuery are extremely good at running SQL at scale - so why move the data somewhere else just to transform it? dbt pushes the transformation logic down into the warehouse itself, which means the data never leaves the place it’s already stored. Your warehouse does the heavy lifting, and dbt just tells it what to do. Less data movement, less infrastructure, less to go wrong. And because the transformation logic lives in dbt rather than inside any particular warehouse, you also gain a surprising amount of flexibility: dbt’s warehouse adapters and cross-database macros absorb most of the dialect differences under the hood, which means the same project can largely run against multiple engines, and migrating from one warehouse to another later means adjusting the dialect-specific SQL rather than rewriting your entire transformation layer from scratch.

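One pattern that pairs especially well with dbt is the medallion architecture, and it’s the approach we used in our demo. The idea is to organize transformations into three layers: bronze, which mirrors the raw source tables one-for-one; silver, which handles type casting and cleanup; and gold, which joins silver tables together into business-facing views.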

This layered approach pays off in two big ways. For analysts and business users, the gold layer means they get clean, reliable, query-ready data without having to understand what happened upstream. For everyone else - engineers, auditors, anyone troubleshooting a number that looks off - you can always trace any value in gold back through silver and into the original bronze record. Nothing is hidden, nothing is lost, and the whole flow is documented in code that lives in your repo just like any other part of your system.

Airflow + dbt, “bare bones”

So far, Airflow and dbt sound like a perfect match. Airflow handles the “when” and “what runs after what” across your whole platform, and dbt handles the “how” of transforming your data inside the warehouse. The natural next question is: how do you actually wire them together?

The simplest approach is to wrap the entire dbt project in a single Airflow task that runs dbt run. One task, one command, done. And to be fair, this works - dbt will execute all your models in the right order, because that’s exactly what dbt is designed to do. But you very quickly hit the limits of treating your dbt project as a black box.

Here’s what that “black box” approach looks like in practice - this is the DAG we started with in our demo, before any of the improvements we’ll get to in the next chapter:

Airflow DAG with three dbt layer tasks

The "black box" DAG: one task per dbt layer, with everything dbt knows about individual models hidden inside.

Three blocks. Bronze runs, then silver, then gold. Clean, simple, and almost completely useless once something goes wrong. If a single model in the silver layer fails, the whole run_dbt_silver task is marked as failed, and you have no idea from the Airflow UI which of the seven silver models actually broke. You have to dig into the logs, scroll through dbt’s output, and figure it out manually. The granularity that dbt gives you internally is completely hidden from your orchestrator.

The retry story is even worse. Say one silver model fails because of a transient warehouse hiccup. In the black box setup, retrying means re-running the entire silver layer from scratch - including the six models that ran perfectly fine the first time. On a small demo with seven models, that’s annoying. On a real platform with hundreds of models, that’s wasted compute, wasted time, and a real cost on your warehouse bill.

There’s also a subtler problem: Airflow doesn’t actually understand your data pipeline. It sees three tasks in a chain, but it has no idea that gold_revenue_by_venue depends on silver_sales, which depends on bronze_sales. All of that dependency information lives inside dbt, completely invisible to Airflow. As a result, Airflow can’t make smart decisions about parallelism, can’t show you a meaningful pipeline graph, and can’t help you when something goes wrong halfway through.

The obvious workaround is to write a separate Airflow task for every dbt model by hand. That gives you the granularity you want - but now you’re maintaining the same dependency graph in two places: once in dbt, and once in Python. Every time someone adds a new model, someone else has to remember to update the DAG. Every refactor in dbt means a refactor in Airflow. It’s the kind of duplication that looks fine on day one and turns into a maintenance nightmare by month six.

So we’re stuck between two bad options: a black box that hides everything, or handwritten tasks that duplicate everything. What we really want is for Airflow to just know what’s inside our dbt project, and build the right DAG automatically.

Cosmos — the bridge to solve the problem

This is where Cosmos comes in. It’s an open-source library and its whole purpose is to make Airflow and dbt work together properly. The way it does that is almost embarrassingly simple: you point Cosmos at your dbt project folder, and it generates the Airflow DAG for you, model by model, with all the dependencies wired up correctly.

You don’t write tasks. You don’t maintain a parallel dependency graph in Python. You don’t have to remember to update the DAG every time you add a new dbt model. Cosmos reads the structure of your dbt project (it can parse the project files directly or load dbt’s manifest.json, which already knows how every model relates to every other model) and builds the right DAG automatically. Add a new model in dbt, and the next time the DAG is parsed, that model just shows up as a new task. The team that works in dbt keeps working in dbt, and Airflow follows along on its own.
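To show why manifest.json is enough to derive per-model tasks, here is a simplified sketch that reads a hand-written, heavily trimmed manifest and lists each model with its upstream models. The project name `dbt_tickit`, the three nodes, and the `model_dependencies` helper are assumptions for illustration, and this is not Cosmos's actual implementation.

```python
import json

# Trimmed, hand-written stand-in for dbt's target/manifest.json.
# The project name and the dependencies shown here are assumptions.
MANIFEST_JSON = """
{
  "nodes": {
    "model.dbt_tickit.bronze_users": {
      "resource_type": "model", "name": "bronze_users",
      "depends_on": {"nodes": ["source.dbt_tickit.tickit.users"]}
    },
    "model.dbt_tickit.silver_users": {
      "resource_type": "model", "name": "silver_users",
      "depends_on": {"nodes": ["model.dbt_tickit.bronze_users"]}
    },
    "model.dbt_tickit.gold_user_activity": {
      "resource_type": "model", "name": "gold_user_activity",
      "depends_on": {"nodes": ["model.dbt_tickit.silver_users"]}
    }
  }
}
"""


def model_dependencies(manifest):
    """Map each model name to the sorted names of the models it depends on."""
    nodes = manifest["nodes"]
    deps = {}
    for node in nodes.values():
        if node["resource_type"] != "model":
            continue
        deps[node["name"]] = sorted(
            nodes[uid]["name"]
            for uid in node["depends_on"]["nodes"]
            # Sources are not listed under "nodes", so they are skipped here.
            if uid in nodes and nodes[uid]["resource_type"] == "model"
        )
    return deps


deps = model_dependencies(json.loads(MANIFEST_JSON))
for model, upstream in deps.items():
    # One task per model, named with a _run suffix as in the rendered DAG.
    print(f"{model}_run <- {[name + '_run' for name in upstream]}")
```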

Here’s what our pipeline looks like after switching to Cosmos:

Airflow DAG generated by Cosmos with per-model tasks

The same pipeline rendered by Cosmos: seventeen per-model tasks with the real dependency arrows between them. When silver_users_run fails, only gold_user_activity_run is marked upstream_failed — everything else runs.

This is the same pipeline we saw in the previous chapter, but now every single dbt model is a first-class task in Airflow. Seven bronze tasks, seven silver tasks, three gold tasks - seventeen blocks instead of three, with the actual dependency arrows between them. Bronze tables feed their corresponding silver tables, silver tables feed the gold views that need them, and Cosmos figured all of that out by reading the dbt project. We didn’t write a single line of code to describe these dependencies in Airflow.

What’s particularly nice is what this image is showing in terms of failure handling. The silver_users_run task failed. In the old “black box” setup, that single failure would have killed the whole silver layer task and the entire gold layer along with it - meaning all seven silver models and all three gold models would have been marked as failed, even though only one model actually had a problem. With Cosmos, look at what actually happened: the six silver models that don’t depend on users finished cleanly, two of the three gold views (gold_revenue_by_venue_run and gold_sales_by_event_run) ran all the way through to success, and only gold_user_activity_run, the one view that genuinely depends on silver_users_run, got marked as upstream_failed and never ran. Cosmos understood the real shape of the dependency graph and did the minimum amount of damage. When we fix the broken model and retry, we only re-run what actually needs re-running, not the entire pipeline from scratch.
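The difference in blast radius can be worked through with a small standard-library sketch. The silver-to-gold dependencies below are assumed for illustration (the text only confirms that gold_user_activity depends on silver_users and that the other two gold views do not), and several silver model names are guesses based on the tickit tables.

```python
# Assumed dependency map: task -> set of upstream tasks (bronze layer omitted).
DEPS = {
    "silver_sales": set(),
    "silver_events": set(),
    "silver_venues": set(),
    "silver_users": set(),
    "silver_categories": set(),
    "silver_dates": set(),
    "silver_listings": set(),
    "gold_revenue_by_venue": {"silver_sales", "silver_events", "silver_venues"},
    "gold_sales_by_event": {"silver_sales", "silver_events"},
    "gold_user_activity": {"silver_users", "silver_sales"},
}


def upstream_failed(deps, failed):
    """Return every task that transitively depends on the failed task."""
    blocked = set()
    changed = True
    while changed:
        changed = False
        for task, upstream in deps.items():
            if task != failed and task not in blocked and upstream & (blocked | {failed}):
                blocked.add(task)
                changed = True
    return blocked


blocked = upstream_failed(DEPS, "silver_users")
per_model_rerun = {"silver_users"} | blocked  # what a per-model DAG must re-run
black_box_rerun = set(DEPS)                   # layer-level tasks re-run everything

print(sorted(blocked))
print(len(per_model_rerun), "vs", len(black_box_rerun))
```

Under these assumptions the per-model DAG re-runs two tasks after the fix, while the layer-level version re-runs all ten silver and gold models.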

The other thing worth pointing out is how little code this took. The DAG file we wrote with Cosmos is actually shorter than the old three-task version, even though it now produces a seventeen-task pipeline with full dependency awareness. That’s the mark of a good abstraction: more functionality, less code, and the source of truth lives in exactly one place - your dbt project.

dags/dbt_tickit_medallion_dag.py

```python
from __future__ import annotations

from datetime import timedelta
from pathlib import Path

import yaml
from airflow.utils import timezone
from cosmos import DbtDag, ExecutionConfig, ProfileConfig, ProjectConfig, RenderConfig
from cosmos.constants import ExecutionMode, LoadMode, TestBehavior

DBT_VENV_BIN = "/usr/local/airflow/python3-virtualenv/dbt-env/bin/dbt"
DBT_PROJECT_PATH = Path("/usr/local/airflow/dags/dbt_tickit")
# Read the profile name from dbt_project.yml so it is defined in one place.
DBT_PROFILE_NAME = yaml.safe_load((DBT_PROJECT_PATH / "dbt_project.yml").read_text())["profile"]

dbt_tickit_medallion_dag = DbtDag(
    dag_id="dbt_tickit_medallion_dag",
    project_config=ProjectConfig(
        dbt_project_path=DBT_PROJECT_PATH,
    ),
    profile_config=ProfileConfig(
        profile_name=DBT_PROFILE_NAME,
        target_name="dwh",
        profiles_yml_filepath=DBT_PROJECT_PATH / "profiles.yml",
    ),
    # Run dbt on the Airflow worker, using the dbt binary from a dedicated virtualenv.
    execution_config=ExecutionConfig(
        execution_mode=ExecutionMode.LOCAL,
        dbt_executable_path=DBT_VENV_BIN,
    ),
    render_config=RenderConfig(
        load_method=LoadMode.CUSTOM,
        test_behavior=TestBehavior.NONE,
    ),
    # Applied to every generated task: retry settings plus the env vars dbt reads.
    operator_args={
        "retries": 2,
        "retry_delay": timedelta(seconds=30),
        "env": {
            "DBT_TARGET": "dwh",
            "DBT_REDSHIFT_REGION": "{{ var.value.get('redshift_region', 'eu-central-1') }}",
            "DBT_REDSHIFT_HOST": "{{ var.value.redshift_host }}",
            "DBT_REDSHIFT_USER": "{{ var.value.get('redshift_db_user', 'admin') }}",
            "DBT_REDSHIFT_DBNAME": "{{ var.value.redshift_database_name }}",
            "DBT_REDSHIFT_SCHEMA": "tickit_bronze",
            "DBT_REDSHIFT_PORT": "5439",
            "DBT_THREADS": "1",
            "DBT_REDSHIFT_CLUSTER_ID": "{{ var.value.get('redshift_cluster_identifier', '') }}",
            "DBT_REDSHIFT_WORKGROUP_NAME": "{{ var.value.get('redshift_workgroup_name', '') }}",
            "DBT_TICKIT_S3_IAM_ROLE_ARN": "{{ var.value.get('tickit_source_iam_role_arn', '') }}",
            "DBT_TICKIT_S3_REGION": "{{ var.value.get('tickit_source_s3_region', 'us-east-1') }}",
        },
    },
    schedule="@daily",
    # Note: Airflow's docs recommend a fixed start_date over a dynamic one like this.
    start_date=timezone.utcnow() - timedelta(days=1),
    tags=["dbt", "tickit", "medallion", "cosmos"],
    catchup=False,
)
```

The entire Cosmos DAG definition — a single DbtDag that turns the dbt project into seventeen Airflow tasks.

Cosmos is open source, which means you can use it whether you’re running Airflow on MWAA, on Kubernetes, or on a Raspberry Pi in your closet. There’s no lock-in, no licensing surprises, and the project has an active community keeping it current with new Airflow and dbt releases. For anyone running dbt on Airflow in 2026, there’s really no good reason not to use it.

Putting it all together on AWS

To make all of this concrete, let me walk you through the actual setup we built. The goal was to assemble a small but realistic data platform on AWS using exactly the pieces we’ve talked about so far (Airflow on MWAA, dbt for transformations, and Cosmos as the glue between them) and see how it all behaves end to end.

For the data, we used tickit, a publicly available sample dataset that AWS provides for Amazon Redshift demos. It models a fictional ticket marketplace, with seven tables covering venues, events, sales, categories, dates, users, and listings. The biggest table has just under 200,000 records, which isn’t huge by data engineering standards, but it’s more than enough to exercise the pipeline and produce some interesting business views at the end. The other nice thing about tickit is that anyone reading this blog can grab the same dataset themselves and reproduce the setup, which makes it a much better demo than something built on synthetic data nobody else has access to.

The architecture itself is straightforward. Amazon Redshift sits at the bottom of the stack as both the storage layer and the compute layer for transformations - the raw tickit data lives there, and that’s also where every SQL transformation actually runs. dbt is organized using the medallion pattern introduced in the dbt chapter: seven bronze models that mirror the raw tables one-for-one, seven silver models that handle type casting and cleanup, and three gold models that join silver tables together to produce business-facing views. Airflow runs on MWAA and orchestrates the whole thing, with Cosmos generating the DAG directly from the dbt project. The dbt process itself runs on the MWAA workers - lightweight enough to live alongside the orchestrator, while all the actual data crunching happens inside Amazon Redshift.

The three gold views were chosen to feel like real business questions:

  1. The first one calculates total revenue per venue, so you can answer “which stadium is making us the most money”.
  2. The second breaks down sales per event, which is the kind of report a marketing team would want to look at the morning after a big show.
  3. The third tracks user activity by total spend, ranking customers by how much they’ve put through the platform.

None of these are revolutionary, but they’re the kind of views every business team eventually asks for, and they make the demo feel grounded in something a non-technical stakeholder can actually relate to.

What’s worth noting about this setup is how little custom code it required to glue everything together. The infrastructure side is handled by Terraform modules. The transformation logic lives entirely in dbt as plain SQL files. The orchestration is generated by Cosmos from the dbt project itself. There’s no Python pipeline code beyond a small DAG file that wires Cosmos to the right folders. Each tool does what it’s good at, and the boundaries between them are clean enough that a new team member can work on one piece without having to understand all the others.

Architecture diagram

The end-to-end architecture: MWAA orchestrates Cosmos-generated tasks, while dbt runs the medallion transformations inside Amazon Redshift.

The end result is a platform that’s small enough to demo in a few hours, but built using the exact same patterns we’d use on a real client engagement. Same orchestrator, same transformation framework, same warehouse, same way of structuring the layers. Scaling this up from seventeen models to several hundred is a matter of adding more SQL files to the dbt project - the orchestration, the dependency tracking, and the failure handling all keep working without any additional Python code.

Who is this solution for?

The honest answer is: more teams than you’d think. The Airflow + MWAA + dbt + Cosmos stack works well across very different audiences, because each piece solves a problem someone on the team is already feeling.

For business and technical leaders, this is a low-risk way to bring a modern, production-grade data platform onto AWS. MWAA removes the operational burden of running Airflow yourself, and the rest of the stack is built on managed AWS services and mature open-source tools - no lock-in, no exotic dependencies. There’s also a longer-term angle worth keeping in mind: because the transformation logic lives in dbt rather than inside your warehouse, you’re not painting yourself into a corner when it comes to future migrations. Switching engines or running against more than one at the same time becomes mostly a question of configuration and dialect clean-up, not a year-long rewrite project. And you’re hiring into the mainstream of the industry. Airflow and dbt are a de facto standard in data engineering, so new engineers almost always know the tools already and onboard in weeks rather than months.

For data engineers, the combination of dbt and Cosmos is the part worth getting excited about. You write SQL, Cosmos turns your dbt project into a properly-orchestrated Airflow DAG with no extra glue code, and the dependency graph stays in exactly one place. If you’re already using Airflow and dbt separately, there’s really no good reason to keep wiring them together by hand.

For cloud and platform engineers, MWAA hits a nice middle ground - more capable than Step Functions for serious data workloads, and far less work than running your own Airflow on Kubernetes. You get the standard AWS experience: AWS IAM, Amazon VPC, Amazon CloudWatch, and Terraform-friendly infrastructure that fits cleanly into whatever you already have.

For data analysts, the practical impact is that the data they use is reliable, traceable, and shows up on time. Transformations live in plain SQL, the gold layer gives them clean business-ready views, and when something looks off, the whole lineage from source to report is visible in one place.

The point is that this stack doesn’t ask any one role to compromise. Engineers get a real orchestrator, analysts get clean data, leaders get something that won’t paint the company into a corner. That’s a rare combination, and it’s why we keep coming back to this exact set of tools.

Summary

Building a modern data platform doesn’t have to mean reinventing the wheel. The combination of Airflow, MWAA, dbt, and Cosmos covers the full orchestration story on AWS without forcing you to glue together half a dozen mismatched tools or commit to a single vendor’s ecosystem.

What we walked through in this blog was a small demo, but the patterns behind it scale. The same architecture supports seventeen models or seven hundred. The same setup runs equally well in a sandbox account and in an enterprise environment with multiple teams and strict access controls. And because every layer of the stack is either managed by AWS or open source with an active community, there’s nothing stopping you from picking it up tomorrow and adapting it to your own use case.
