Serverless Scaling: Deploying Strands + MCP on AWS
Written by Om-Shree-0709 on .
- 1. Introduction
- 2. Deployment Options Overview
- 3. Native AWS Lambda (Stateless MCP)
- 4. Lambda + Web Adapter (Containerized MCP)
- 5. AWS Fargate (Containerized MCP)
- 6. Choosing the Right Model
- 7. Key Considerations
- 8. Next Steps
- References
In this Article, we'll explore how to deploy a Strands Agent connected to an MCP server using serverless AWS services. We'll cover three deployment models—Lambda (native & web adapter) and Fargate—and compare their pros, limitations, and recommended scenarios.
1. Introduction
Strands Agents SDK provides a convenient model-driven loop, while MCP enables dynamic tool invocation. Deploying them on AWS serverless platforms allows you to build scalable, maintainable agents without managing servers1.
2. Deployment Options Overview
Option | Benefits | Limitations |
AWS Lambda (Native) | Fast startup, easy CI/CD, unified observability | Max 15-minute execution, no streaming support |
Lambda with Web Adapter | Preserve web frameworks, serverless pay-per-use | Slower cold start (1–3 s), added complexity |
AWS Fargate (ECS/EKS) | Long-running containers, streaming support | Higher cost, container lifecycle management |
3. Native AWS Lambda (Stateless MCP)
Approach: Package your MCP server as a Lambda function using FastMCP with HTTP transport3.
How to Deploy:
Optionally, expose it via API Gateway:
Benefits:
Fast cold starts
Simplified deployment for stateless tools
Integrated with AWS native monitoring
Limitations:
No streaming support
15-minute execution timeout
No persistent state between invocations
4. Lambda + Web Adapter (Containerized MCP)
Approach: Package MCP within a web framework (FastAPI, Flask, or Express) inside a Lambda Web Adapter container. This enables web-like behavior within Lambda.
Dockerfile:
app.py Example:
Deploy via AWS CDK Example:
Benefits:
Allows existing web frameworks
Flexible HTTP routing via API Gateway
Serverless, pay-per-use
Limitations:
Added container and adapter complexity
Cold start delays (1–3 seconds)
Still no native streaming support
5. AWS Fargate (Containerized MCP)
Approach: Fully containerize the MCP server and deploy on AWS Fargate via ECS or EKS. Suitable for agents requiring persistent sessions and streaming2.
Dockerfile:
mcp_server.py Example:
CDK Deployment Example:
Benefits:
Full streaming and persistent workloads supported
Scalability with ECS or EKS
Suitable for production-grade deployments
Limitations:
More costly than Lambda for low-usage patterns
Slightly longer deploy cycles
Requires container orchestration setup
6. Choosing the Right Model
Use Native Lambda for testing, short-lived tasks, low traffic.
Add Web Adapter when integrating with web apps or frameworks.
Choose Fargate for streaming, persistent workloads, or higher performance needs43.
7. Key Considerations
Security & Observability: Lambda and Fargate integrate with X-Ray, CloudWatch, IAM, and OpenTelemetry23.
Cost & Scaling: Lambda is cost-effective for burst workloads; Fargate favors steady or stream-heavy usage4.
Developer Experience: Native Lambda offers fastest dev loop; Fargate supports production parity and long-lived workflows3.
8. Next Steps
Start with a proof-of-concept using native Lambda + FastMCP.
Expand to include frameworks via Web Adapter for structured web API support.
Move to a containerized MCP + agent deployment on Fargate via Strands’ sample projects1.
References
Written by Om-Shree-0709 (@Om-Shree-0709)