© 2026 Serverless, Inc. All rights reserved.

Updated March 2026

The Ultimate Guide to AWS Step Functions

AWS Step Functions is a fully managed service for orchestrating serverless workflows as visual state machines. It coordinates AWS services like Lambda, DynamoDB, and SQS into reliable, repeatable processes without managing infrastructure.


AWS Step Functions Key Features

Step Functions replaces custom orchestration code with a managed service that handles sequencing, retries, parallelism, and error handling out of the box.

Core

Visual Workflow Editor

Design workflows as state machines with a drag-and-drop visual editor. See the flow of your application at a glance, making complex orchestration logic easy to understand and debug.

Compute

Lambda & 200+ Integrations

Call Lambda functions, query DynamoDB, send SQS messages, run ECS tasks, start Glue jobs, and invoke over 200 AWS services directly without writing glue code.

Reliability

Built-in Retry & Error Handling

Define retry policies with exponential backoff and catch blocks for error routing. Handle transient failures automatically without writing retry logic in your application code.

Flow Control

Branching & Parallel Execution

Use Choice states for conditional logic and Parallel states to run multiple branches simultaneously. Map states iterate over arrays, processing items concurrently at scale.

Observability

Execution Visibility

Inspect every execution with a visual timeline showing which states ran, their inputs and outputs, and where failures occurred. CloudWatch metrics and X-Ray tracing are built in.

Flexibility

Standard & Express Workflows

Standard workflows run up to one year with exactly-once semantics. Express workflows handle high-volume, short-duration tasks at a fraction of the cost.

How Step Functions Works

Step Functions operates as a state machine. You define states and transitions using the Amazon States Language, and the service executes them in order, handling retries, branching, and parallelism automatically.

1. Define

Author a state machine using the Amazon States Language (JSON). Specify Task, Choice, Parallel, Map, Wait, Pass, Succeed, and Fail states with transitions between them.

2. Execute

Start an execution with a JSON input. Step Functions walks through each state, invoking Lambda functions or AWS services, applying retry logic, and routing errors to catch blocks.

3. Observe

Track every execution in the AWS console with a visual timeline. Inspect state inputs, outputs, and durations. Debug failures by seeing exactly which state failed and why.
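
To make the define and execute steps concrete, here is a minimal sketch in Python. The definition is ordinary Amazon States Language written as a dict. The `run` function is a toy walker, not the Step Functions service: it handles only Pass, Choice (with `NumericGreaterThan`), and Succeed states. The state names and the `$.total` field are hypothetical.

```python
# A small Amazon States Language definition written as a Python dict.
# State names and the "$.total" field are hypothetical.
definition = {
    "StartAt": "CheckTotal",
    "States": {
        "CheckTotal": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.total", "NumericGreaterThan": 100, "Next": "ManualReview"}
            ],
            "Default": "AutoApprove",
        },
        "ManualReview": {"Type": "Pass", "Result": {"status": "review"}, "Next": "Done"},
        "AutoApprove": {"Type": "Pass", "Result": {"status": "approved"}, "Next": "Done"},
        "Done": {"Type": "Succeed"},
    },
}


def run(definition: dict, data: dict) -> tuple:
    """Toy walker: follows transitions for Pass, Choice, and Succeed states only."""
    visited = []
    name = definition["StartAt"]
    while True:
        state = definition["States"][name]
        visited.append(name)
        if state["Type"] == "Succeed":
            return visited, data
        if state["Type"] == "Pass":
            # With default paths, a Pass state's Result becomes its output.
            data = state.get("Result", data)
            name = state["Next"]
        elif state["Type"] == "Choice":
            name = state["Default"]
            for rule in state["Choices"]:
                field = rule["Variable"].removeprefix("$.")
                if data[field] > rule["NumericGreaterThan"]:
                    name = rule["Next"]
                    break
        else:
            raise ValueError(f"unsupported state type: {state['Type']}")


print(run(definition, {"total": 250}))
# (['CheckTotal', 'ManualReview', 'Done'], {'status': 'review'})
```

The real service adds retries, service calls, and a recorded event history on top of this basic walk from `StartAt` to a terminal state.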

AWS Service Integrations

Step Functions connects directly with over 200 AWS services. The most common integrations include:

AWS Lambda

Invoke functions for custom business logic. The most common integration for serverless workflow steps.

Amazon DynamoDB

Read, write, and query items directly from workflow states without a Lambda intermediary.

Amazon SQS & SNS

Send messages to queues or publish notifications to topics for event-driven architectures.

Amazon ECS & AWS Batch

Run containerized tasks and batch processing jobs as part of your workflow.

AWS Glue & SageMaker

Orchestrate ETL pipelines and machine learning training and inference jobs.

Nested Step Functions

Invoke child state machines to break large workflows into manageable, reusable components.
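
As a rough illustration of two of these integrations, the sketch below builds Task states for a direct DynamoDB write and a nested state machine call, then prints the resulting Amazon States Language JSON. The helper `service_task`, the table name `Orders`, the `$.orderId` field, and the account ID in the ARN are all placeholders chosen for this example.

```python
import json
from typing import Optional


def service_task(resource: str, parameters: dict, next_state: Optional[str] = None) -> dict:
    """Build a Task state that calls an AWS service integration directly."""
    state = {"Type": "Task", "Resource": resource, "Parameters": parameters}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


# Write an item to DynamoDB without a Lambda function in between.
# "Orders" and "$.orderId" are placeholder names.
save_order = service_task(
    "arn:aws:states:::dynamodb:putItem",
    {"TableName": "Orders", "Item": {"orderId": {"S.$": "$.orderId"}}},
    next_state="RunChildWorkflow",
)

# Start a nested state machine and wait for it to finish (.sync:2).
# The account ID and state machine name are placeholders.
run_child = service_task(
    "arn:aws:states:::states:startExecution.sync:2",
    {
        "StateMachineArn": "arn:aws:states:us-east-1:123456789012:stateMachine:ChildWorkflow",
        "Input": {"orderId.$": "$.orderId"},
    },
)

definition = {
    "StartAt": "SaveOrder",
    "States": {"SaveOrder": save_order, "RunChildWorkflow": run_child},
}
print(json.dumps(definition, indent=2))
```

The same structure can be written as YAML under `definition:` in serverless.yml; the execution role also needs permission for each service the workflow calls.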

Standard vs. Express Workflows

AWS offers two workflow types. Standard workflows are ideal for long-running, exactly-once processes. Express workflows are built for high-volume, short-duration tasks.

| Feature | Standard | Express |
| --- | --- | --- |
| Max duration | Up to 1 year | Up to 5 minutes |
| Execution semantics | Exactly-once | At-least-once (async) / At-most-once (sync) |
| Pricing model | Per state transition | Per request + duration |
| Execution history | Visible in console | CloudWatch Logs only |
| Max execution history | 25,000 events | No limit (duration-bound) |
| Best for | Long-running, auditable workflows | High-volume data processing, IoT ingestion |
| Cost at 1M executions (10 steps) | ~$250 | ~$1 + duration charges |

For most serverless applications with short-running tasks, Express workflows offer dramatic cost savings. Use Standard workflows when you need exactly-once guarantees or executions that run longer than the five-minute Express limit, whether for hours, days, or up to a year.
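
The guidance above can be reduced to a small decision rule. This is a simplified sketch, not an official selection algorithm: it considers only the duration limit and the exactly-once requirement from the comparison, and ignores cost and execution-history needs. The function name is made up for this example; the returned strings match the two workflow type names.

```python
EXPRESS_MAX_SECONDS = 5 * 60  # Express duration limit from the comparison above


def choose_workflow_type(max_duration_seconds: float, needs_exactly_once: bool) -> str:
    """Return "STANDARD" or "EXPRESS" using the two criteria from the comparison."""
    if needs_exactly_once or max_duration_seconds > EXPRESS_MAX_SECONDS:
        return "STANDARD"
    return "EXPRESS"


print(choose_workflow_type(30, needs_exactly_once=False))    # EXPRESS
print(choose_workflow_type(3600, needs_exactly_once=False))  # STANDARD
print(choose_workflow_type(30, needs_exactly_once=True))     # STANDARD
```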

Using Step Functions with the Serverless Framework

The Serverless Framework supports Step Functions through the serverless-step-functions plugin. Define your state machines directly in serverless.yml alongside your Lambda functions:

serverless.yml
service: my-workflow

provider:
  name: aws
  runtime: nodejs22.x

plugins:
  - serverless-step-functions

functions:
  processOrder:
    handler: handler.processOrder
  chargePayment:
    handler: handler.chargePayment
  sendConfirmation:
    handler: handler.sendConfirmation

stepFunctions:
  stateMachines:
    orderWorkflow:
      name: OrderProcessingWorkflow
      definition:
        StartAt: ProcessOrder
        States:
          ProcessOrder:
            Type: Task
            Resource:
              Fn::GetAtt: [processOrder, Arn]
            Next: ChargePayment
          ChargePayment:
            Type: Task
            Resource:
              Fn::GetAtt: [chargePayment, Arn]
            Retry:
              - ErrorEquals: [States.TaskFailed]
                IntervalSeconds: 3
                MaxAttempts: 3
                BackoffRate: 2
            Next: SendConfirmation
          SendConfirmation:
            Type: Task
            Resource:
              Fn::GetAtt: [sendConfirmation, Arn]
            End: true

The plugin handles all CloudFormation resource creation: state machine definitions, IAM roles, Lambda permissions, and CloudWatch log groups. It also supports Express workflows, API Gateway triggers, EventBridge schedules, and nested state machines.
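
The Retry block on ChargePayment in the definition above uses exponential backoff. As a quick sketch of what those three fields mean, the function below computes the wait before each retry, assuming no `MaxDelaySeconds` cap and no jitter are configured. The function name is illustrative only.

```python
def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> list:
    """Seconds waited before each retry: the interval grows by backoff_rate each time."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# Values from the ChargePayment Retry block: IntervalSeconds 3, MaxAttempts 3, BackoffRate 2.
delays = retry_delays(3, 3, 2)
print(delays)       # [3, 6, 12]
print(sum(delays))  # 21
```

`MaxAttempts` counts retries, not the initial attempt, so with these settings the task runs at most four times before the error propagates.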

Benefits of Step Functions

Orchestrate Complex Workflows Without Code

Manually coordinating ten interconnected serverless functions means hand-writing the sequencing, retry, and error-routing code for every path between them, and that code grows quickly as steps are added. Step Functions handles sequencing, retries, and error routing declaratively. You define what should happen, not how to manage it. The visual editor makes it easy to design, understand, and modify workflows that would otherwise require hundreds of lines of orchestration code.

Manage State Across Stateless Functions

Lambda functions are stateless by design. Passing data between them typically requires setting up queues, databases, or custom middleware. Step Functions provides built-in state management: each step's output automatically becomes the next step's input. You can filter, transform, and merge data between states using JSONPath expressions without any infrastructure setup.
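
One common way to merge data between states is the `ResultPath` field, which controls where a task's result lands in the state's output. The function below is a toy model of that behavior for simple dotted paths only; it is not the service's JSONPath implementation, and the `orderId` and `payment` field names are hypothetical.

```python
import copy


def apply_result_path(state_input: dict, result, result_path: str):
    """Toy model of ResultPath for simple dotted paths such as "$.payment"."""
    if result_path == "$":
        # Default behavior: the result replaces the input entirely.
        return result
    output = copy.deepcopy(state_input)  # leave the caller's input untouched
    *parents, leaf = result_path.removeprefix("$.").split(".")
    node = output
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = result
    return output


order = {"orderId": "A1"}
print(apply_result_path(order, {"charged": True}, "$.payment"))
# {'orderId': 'A1', 'payment': {'charged': True}}
print(apply_result_path(order, {"charged": True}, "$"))
# {'charged': True}
```

Keeping the original input and attaching each result under its own key lets later states see everything earlier states produced.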

Separate Workflow Logic from Business Logic

Embedding orchestration logic inside application code couples your business logic to execution flow. Step Functions moves workflow concerns (ordering, branching, retries, timeouts) into a separate declaration. Each Lambda function focuses on one task, stays small, and remains independently testable.

Scale with Parallel and Distributed Execution

Parallel states run multiple branches simultaneously, and Map states process arrays of items concurrently. Distributed Map mode can process millions of items from S3 in parallel with up to 10,000 concurrent child executions. Performance scales alongside your workload without any custom threading or queue management.

Trade-offs & Limitations

Step Functions is the right choice for most serverless orchestration, but these constraints are worth understanding upfront.

Amazon States Language complexity

ASL is a JSON-based, proprietary language optimized for machines, not humans. The syntax is verbose, and writing complex branching or error-handling logic requires significant effort. The learning curve is steep, and the skills do not transfer outside AWS.

Vendor lock-in

State machine definitions are written in a proprietary AWS format. Migrating to another cloud provider means rewriting your orchestration layer entirely. If multi-cloud portability matters, consider open-source orchestrators like Temporal or Apache Airflow.

25,000 event execution history limit

Standard workflows cap at 25,000 events per execution. Long-running workflows with many iterations can hit this ceiling. The workaround is splitting into child workflows, which adds architectural complexity.

256 KB payload limit per state

Data passed between states cannot exceed 256 KB. For larger payloads, store data in S3 or DynamoDB and pass references. This adds latency and complexity to data-heavy workflows.
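
A minimal sketch of the pass-a-reference pattern, assuming the payload is measured as UTF-8 encoded JSON (the service's own accounting may differ slightly, so leave headroom). The function names and the `inline` / `s3` keys are invented for this example, and uploading the object to S3 is left to the caller.

```python
import json

MAX_STATE_PAYLOAD_BYTES = 256 * 1024  # 256 KB limit described above


def encoded_size(payload: dict) -> int:
    """Approximate payload size as UTF-8 encoded JSON."""
    return len(json.dumps(payload, ensure_ascii=False).encode("utf-8"))


def state_input_for(payload: dict, bucket: str, key: str) -> dict:
    """Pass small payloads inline; otherwise pass only an S3 pointer.

    Uploading the payload to the given bucket and key is the caller's job.
    """
    inline = {"inline": payload}
    if encoded_size(inline) <= MAX_STATE_PAYLOAD_BYTES:
        return inline
    return {"s3": {"bucket": bucket, "key": key}}


small = state_input_for({"id": 1}, "my-bucket", "orders/1.json")
large = state_input_for({"blob": "x" * 300_000}, "my-bucket", "orders/2.json")
print(small)  # {'inline': {'id': 1}}
print(large)  # {'s3': {'bucket': 'my-bucket', 'key': 'orders/2.json'}}
```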

Cost at high transition volumes

Standard workflows charge per state transition. A workflow with 20 states running 1M times per month costs $500 in Step Functions alone. For high-volume use cases, Express workflows or direct Lambda-to-Lambda patterns may be more cost-effective.

Step Functions Pricing

Pricing differs significantly between Standard and Express workflows. Standard charges per state transition; Express charges per request and duration.

Free Tier (Never Expires)

4,000 Standard state transitions per month

Permanent: the free tier never expires (it is not limited to the first 12 months)

| Service | Price |
| --- | --- |
| Standard state transitions | $0.025 per 1,000 transitions |
| Express requests | $1.00 per 1M requests |
| Express duration (first 1,000 hrs) | $0.0600 per GB-hour |
| Express duration (next 4,000 hrs) | $0.0400 per GB-hour |
| Express duration (over 5,000 hrs) | $0.0267 per GB-hour |

Example: Image processing pipeline, 100,000 images/month

10 transitions per image x 100,000 executions = 1,000,000 transitions

Plus 10% retries: 1,100,000 transitions total

1,100,000 transitions / 1,000 = 1,100 billing units; 1,100 x $0.025 = $27.50/month (Standard, before the free tier)

Combined with Lambda compute (~$600) and data transfer (~$100), total monthly cost is approximately $727.50. Express workflows would reduce the Step Functions portion to under $2.
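
The arithmetic in this example can be reproduced with a short script. It is a sketch using only the per-transition and per-request prices listed above; Express duration charges are left out because they depend on memory and run time that the example does not state. The function names and the optional `free_transitions` parameter are this sketch's own.

```python
STANDARD_PRICE_PER_1000_TRANSITIONS = 0.025  # USD, from the pricing table
EXPRESS_PRICE_PER_MILLION_REQUESTS = 1.00    # USD, from the pricing table


def standard_cost(executions: int, transitions_per_execution: int,
                  retry_overhead: float = 0.0, free_transitions: int = 0) -> float:
    """Monthly Standard workflow cost in USD, rounded to cents."""
    transitions = executions * transitions_per_execution * (1 + retry_overhead)
    billable = max(transitions - free_transitions, 0)
    return round(billable / 1000 * STANDARD_PRICE_PER_1000_TRANSITIONS, 2)


def express_request_cost(executions: int) -> float:
    """Monthly Express request charge in USD (duration charges not included)."""
    return round(executions / 1_000_000 * EXPRESS_PRICE_PER_MILLION_REQUESTS, 2)


# Image pipeline: 100,000 executions, 10 transitions each, 10% retries.
print(f"{standard_cost(100_000, 10, retry_overhead=0.10):.2f}")  # 27.50
print(f"{express_request_cost(100_000):.2f}")                    # 0.10
# 20-state workflow run 1M times, as in the trade-offs section.
print(f"{standard_cost(1_000_000, 20):.2f}")                     # 500.00
```

Passing `free_transitions=4000` applies the monthly free tier, which lowers the image pipeline figure to $27.40.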

See the official Step Functions pricing page for current regional rates.

When to Use Step Functions

Use Step Functions when you need to coordinate multiple AWS services into a reliable workflow, want built-in retry and error handling, need visibility into execution progress, or are building ETL pipelines, order processing, user onboarding flows, or any multi-step process where steps depend on each other.

Consider alternatives when your workflow is a simple sequence of two or three Lambda functions (direct invocation or SQS may be simpler), you need sub-millisecond latency between steps (the state machine adds overhead), or you are processing extremely high volumes where per-transition costs become prohibitive. For simple scheduled tasks, EventBridge Scheduler with a single Lambda function is more appropriate. For complex data pipelines outside AWS, consider Apache Airflow or Temporal.

Learn More

Documentation

  • serverless-step-functions Plugin
  • Managing Step Functions with Serverless
  • AWS Lambda Guide
  • AWS Step Functions Docs

Related Guides

  • Amazon API Gateway
  • Amazon DynamoDB
  • Amazon EventBridge

Step Functions Limits

Key quotas to plan around. Most soft limits can be raised through AWS Support.

| Limit | Value |
| --- | --- |
| Execution history | 25,000 events max (hard limit) |
| Input/output payload | 256 KB per state |
| Maximum request size | 1 MB |
| State machines per account | 10,000 (adjustable) |
| Concurrent executions (Standard) | 1,000,000 (adjustable) |
| API rate (StartExecution, Standard) | 2,000 requests/second |
| Tags per resource | 50 |
| Execution timeout (Standard) | 1 year |
| Execution timeout (Express) | 5 minutes |
| Concurrent executions (Express) | 100,000 (adjustable) |

Step Functions Alternatives

Step Functions is not the only way to orchestrate serverless workflows. These alternatives may be a better fit depending on your requirements.

AWS Lambda (direct invocation)

For simple sequential tasks, invoke the next Lambda function directly from code. No orchestration layer needed. Works well for two or three steps with minimal branching.

Amazon SQS

Decouple services with message queues for high-throughput async processing. Best when you need reliable delivery without coordinating execution order across many steps.

Amazon EventBridge

Rule-based event routing for loosely coupled, event-driven architectures. Best when services react to events independently rather than following a prescribed sequence.

Apache Airflow (Amazon MWAA)

DAG-based workflow orchestration. Best for data engineering pipelines with complex dependency graphs, scheduling, and integration with non-AWS systems.

Temporal

Open-source workflow engine with durable execution semantics. Best for complex, long-running business processes where you want to avoid vendor lock-in and write workflow logic in application code.

Step Functions FAQ

Common questions about AWS Step Functions.

What is AWS Step Functions?
AWS Step Functions is a fully managed AWS service that lets you coordinate multiple AWS services into serverless workflows using visual state machines. You define steps, transitions, retries, and error handling in a declarative format, and Step Functions manages execution for you.
What is the difference between Standard and Express workflows?
Standard workflows run for up to one year, support exactly-once execution, and cost $0.025 per 1,000 state transitions. Express workflows run for up to five minutes, support at-least-once (async) or at-most-once (sync) execution, and are priced by number of requests and duration, making them far cheaper for high-volume, short-duration workloads.
How much do Step Functions cost?
Standard workflows cost $0.025 per 1,000 state transitions, with a permanent free tier of 4,000 transitions per month. Express workflows cost $1.00 per 1M requests plus compute duration charges. For high-volume workloads, Express workflows are significantly cheaper.
What is the Amazon States Language?
The Amazon States Language (ASL) is a JSON-based, declarative language used to define Step Functions state machines. It describes states (Task, Choice, Parallel, Map, Wait, Pass, Succeed, Fail) and the transitions between them. While powerful, the syntax is designed for machine readability and has a learning curve.
What is the maximum execution history for a workflow?
Standard workflows support up to 25,000 events in their execution history. If your workflow approaches this limit, split it into child workflows using nested state machines. Express workflows do not have this limit since they are designed for short-duration runs.
Can Step Functions call services other than Lambda?
Yes. Step Functions has direct integrations with over 200 AWS services, including DynamoDB, SQS, SNS, ECS, Batch, Glue, SageMaker, and EventBridge. You can also call any HTTP endpoint through API Gateway or Lambda wrapper functions.
How does Step Functions handle errors and retries?
Step Functions provides built-in error handling through Retry and Catch fields on Task, Parallel, and Map states. You can define retry intervals, backoff rates, and maximum attempts. Catch blocks route failures to recovery or cleanup states, giving you fine-grained control without writing retry logic in your application code.
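
As an illustration of the Retry and Catch fields described in the last answer, the sketch below writes a small definition as a Python dict and checks that every transition, including Catch routes, points at a state that exists. The Lambda ARN, the state names, and the `undefined_targets` helper are placeholders for this example; the check covers only `StartAt`, `Next`, and Catch targets.

```python
definition = {
    "StartAt": "ChargePayment",
    "States": {
        "ChargePayment": {
            "Type": "Task",
            # Placeholder ARN; substitute your own function.
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:chargePayment",
            "Retry": [
                {"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 3,
                 "MaxAttempts": 3, "BackoffRate": 2}
            ],
            # After retries are exhausted, keep the input and attach the error.
            "Catch": [
                {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "NotifyFailure"}
            ],
            "Next": "SendConfirmation",
        },
        "SendConfirmation": {"Type": "Pass", "End": True},
        "NotifyFailure": {"Type": "Fail", "Error": "PaymentFailed",
                          "Cause": "Charge did not succeed after retries"},
    },
}


def undefined_targets(definition: dict) -> list:
    """List transition targets (StartAt, Next, Catch Next) that are not defined states."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        for catcher in state.get("Catch", []):
            targets.add(catcher["Next"])
    return sorted(targets - set(states))


print(undefined_targets(definition))  # []
```

A check like this catches a renamed or deleted recovery state before deployment rather than at the moment an error is first routed to it.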

Build Your First Serverless Workflow

Deploy a Step Functions state machine with Lambda in minutes using the Serverless Framework.
