AIP-C01 Study Hub
Implementation Week 2 · Tuesday

Day 9: Deployment Strategies + Enterprise Integration

Learning Objectives

  • - Design model cascading patterns (small -> large based on complexity)
  • - Understand container-based LLM deployment (ECS/EKS with GPU)
  • - Know edge/hybrid options (Outposts, Wavelength, Lambda@Edge)
  • - Implement WebSocket streaming with API Gateway + Lambda + Bedrock
  • - Apply the AI Gateway pattern for rate limiting and access control

Tasks

Tasks

0/5 completed
  • Read30m

    GenAI Application Builder Architecture Overview

    End-to-end reference architecture for production GenAI applications.

  • Blog25m

    Building an AI Gateway to Bedrock with API Gateway

    Rate limiting, access control, usage tracking for GenAI APIs. Enterprise pattern.

  • Blog25m

    Serverless Generative AI Architectural Patterns

    API Gateway + Lambda + Bedrock foundational patterns.

  • Blog25m

    Orchestrate GenAI Workflows with Bedrock and Step Functions

    Parallel API calls, error handling, complex orchestration.

  • Watch20m

    AWS Step Functions for Generative AI

    Video overview of Step Functions integration with GenAI services.

Exam Skills

Write your understanding, then reveal the reference answer.

0/11 reviewed

Hands-On Lab

Build real muscle memory with these activities.

intermediate 45 min

Set Up API Gateway → Lambda → Bedrock Pattern

Build the foundational serverless pattern for exposing Bedrock as a REST API.

  1. 1 Create a Lambda function (Python) that calls bedrock-runtime InvokeModel with Claude
  2. 2 Add the bedrock:InvokeModel permission to the Lambda execution role
  3. 3 Create a REST API in API Gateway with a POST /chat resource
  4. 4 Configure the Lambda proxy integration
  5. 5 Test with curl: curl -X POST <api-url>/chat -d '{"prompt": "Hello"}' and verify the Bedrock response
Open Lab
intermediate 30 min

Deploy the AI Gateway Pattern with Rate Limiting

Extend the basic API Gateway pattern with usage plans and API keys for rate limiting.

  1. 1 In API Gateway, create a Usage Plan with rate limit: 100 requests/second, burst: 200
  2. 2 Create an API key and associate it with the usage plan
  3. 3 Enable API key requirement on the /chat resource
  4. 4 Test that requests without the API key return 403
  5. 5 Test that requests exceeding the rate limit return 429 (Too Many Requests)
Open Lab

Scenarios

Think through each scenario before revealing the answer.

D2: ImplementationHard
#9

Model Cascading Cost Optimization

An enterprise has 10,000 daily FM API calls. 70% are simple FAQ lookups, 20% are moderate analysis, 10% are complex reasoning tasks. Current cost is $X/month using Claude Sonnet for everything. How do you optimize?
Think First
  • What is the model cascading pattern?
  • Which model handles simple queries cheaply?
  • How do you add caching for repeated queries?
  • What Bedrock feature automates model routing?

Practice Questions

5 questions across 3 difficulty levels.

Further Reading

Go deeper into today's topics.