Implementation Week 2 · Tuesday

Day 9: Deployment Strategies + Enterprise Integration

Learning Objectives

- Design model cascading patterns (small -> large based on complexity)
- Understand container-based LLM deployment (ECS/EKS with GPU)
- Know edge/hybrid options (Outposts, Wavelength, Lambda@Edge)
- Implement WebSocket streaming with API Gateway + Lambda + Bedrock
- Apply the AI Gateway pattern for rate limiting and access control

Tasks

0/5 completed

Read30m
GenAI Application Builder Architecture Overview
End-to-end reference architecture for production GenAI applications.
Blog25m
Building an AI Gateway to Bedrock with API Gateway
Rate limiting, access control, usage tracking for GenAI APIs. Enterprise pattern.
Blog25m
Serverless Generative AI Architectural Patterns
API Gateway + Lambda + Bedrock foundational patterns.
Blog25m
Orchestrate GenAI Workflows with Bedrock and Step Functions
Parallel API calls, error handling, complex orchestration.
Watch20m
AWS Step Functions for Generative AI
Video overview of Step Functions integration with GenAI services.

Exam Skills

Write your understanding, then reveal the reference answer.

0/11 reviewed

Hands-On Lab

Build real muscle memory with these activities.

intermediate 45 min

Set Up API Gateway → Lambda → Bedrock Pattern

Build the foundational serverless pattern for exposing Bedrock as a REST API.

1 Create a Lambda function (Python) that calls bedrock-runtime InvokeModel with Claude
2 Add the bedrock:InvokeModel permission to the Lambda execution role
3 Create a REST API in API Gateway with a POST /chat resource
4 Configure the Lambda proxy integration
5 Test with curl: curl -X POST <api-url>/chat -d '{"prompt": "Hello"}' and verify the Bedrock response

Open Lab

intermediate 30 min

Deploy the AI Gateway Pattern with Rate Limiting

Extend the basic API Gateway pattern with usage plans and API keys for rate limiting.

1 In API Gateway, create a Usage Plan with rate limit: 100 requests/second, burst: 200
2 Create an API key and associate it with the usage plan
3 Enable API key requirement on the /chat resource
4 Test that requests without the API key return 403
5 Test that requests exceeding the rate limit return 429 (Too Many Requests)

Open Lab

Scenarios

Think through each scenario before revealing the answer.

D2: ImplementationHard

Model Cascading Cost Optimization

An enterprise has 10,000 daily FM API calls. 70% are simple FAQ lookups, 20% are moderate analysis, 10% are complex reasoning tasks. Current cost is $X/month using Claude Sonnet for everything. How do you optimize?

Think First

•What is the model cascading pattern?
•Which model handles simple queries cheaply?
•How do you add caching for repeated queries?
•What Bedrock feature automates model routing?

Practice Questions

5 questions across 3 difficulty levels.

Day 9: Deployment Strategies + Enterprise Integration

Learning Objectives

Tasks

Tasks

Exam Skills

Hands-On Lab

Set Up API Gateway → Lambda → Bedrock Pattern

Deploy the AI Gateway Pattern with Rate Limiting

Scenarios

Model Cascading Cost Optimization

Practice Questions

Foundation

Further Reading

Serverless Generative AI Architectural Patterns

Building an AI Gateway to Bedrock with API Gateway

Orchestrate GenAI Workflows with Bedrock and Step Functions

Serverless Prompt Chaining — GitHub Repo

GenAI Application Builder Architecture Overview