Day 6: RAG Architecture - Retrieval, Chunking, Embeddings
Learning Objectives
- Compare chunking strategies (fixed, semantic, hierarchical) and when to use each
- Select embedding models (Titan Embeddings v2, Cohere Embed v3, Nova Multimodal)
- Understand Retrieve vs RetrieveAndGenerate APIs
- Implement query expansion, decomposition, and reranking
- Use MCP for consistent vector query access patterns
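The query expansion and decomposition objective can be sketched in plain Python: fan a user query out into sub-queries, retrieve for each, then merge the ranked lists. The `expand_query` stub below is hypothetical (a real implementation would ask a foundation model to paraphrase or decompose the query); the merge step uses reciprocal-rank fusion, a common way to combine results from multiple query variants.

```python
# Sketch: expand a query into variants, retrieve per variant, merge rankings.
# expand_query is a HYPOTHETICAL stub standing in for an LLM call; the
# merging logic (reciprocal-rank fusion) is real and self-contained.

def expand_query(query: str) -> list[str]:
    # Hypothetical stub: a real version would call an FM to generate
    # paraphrases or decompose the query into independent sub-questions.
    return [query, f"{query} (paraphrase)", f"{query} (sub-question)"]

def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    # Reciprocal-rank fusion: score(chunk) = sum over lists of 1 / (k + rank).
    # Chunks that appear near the top of several lists rise to the top overall.
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: "c2" ranks highly in all three lists, so it wins the fused ranking.
merged = rrf_merge([["c1", "c2", "c3"], ["c2", "c4"], ["c5", "c2"]])
print(merged[0])  # c2
```

Reranking (also in the objectives) is a separate, model-based rescoring step that runs after a merge like this one.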
Tasks
- Read (45m)
Bedrock Knowledge Bases - How It Works
The authoritative reference for chunking, embedding, retrieval, and reranking.
- Blog (25m)
Advanced Parsing, Chunking, and Query Reformulation in KB
Deep dive into advanced chunking and query reformulation options.
- Read (20m)
Bedrock Reranker Models Documentation
How reranker models rescore retrieved chunks for better relevance.
- Blog (15m)
Amazon Nova Multimodal Embeddings
Cross-modal search: text, image, video, and audio embeddings in one model.
- Hands-on (120m)
Bedrock RAG Workshop - Chunking and Embedding Notebooks
Work through different chunking strategies and embedding model selections.
Exam Skills
Write your understanding, then reveal the reference answer.
Hands-On Lab
Build real muscle memory with these activities.
Build an End-to-End RAG Pipeline with Knowledge Bases
Create a complete RAG pipeline from document upload to retrieval-augmented generation.
- 1 Create an S3 bucket and upload 5-10 PDF documents on a specific topic
- 2 Create a Bedrock Knowledge Base with OpenSearch Serverless, Titan Embeddings v2, and fixed-size chunking (300 tokens, 20% overlap)
- 3 Sync the data source and wait for indexing to complete
- 4 Test retrieval with the Retrieve API — try different queries and check chunk relevance
- 5 Switch to RetrieveAndGenerate API and compare the augmented response quality
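Steps 4 and 5 can be sketched as request payloads for the two `bedrock-agent-runtime` operations. The knowledge base ID and model ARN below are placeholders; the payload shapes follow the Retrieve and RetrieveAndGenerate APIs, but verify field names against the current boto3 documentation.

```python
# Sketch of steps 4-5: the same question sent through Retrieve (raw chunks
# only) and RetrieveAndGenerate (chunks plus a generated answer).
# KB_ID and MODEL_ARN are placeholders.

KB_ID = "YOUR_KB_ID"
MODEL_ARN = "YOUR_MODEL_ARN"

def build_retrieve_request(query: str) -> dict:
    # Retrieve: returns the top-k chunks with relevance scores; you inspect
    # chunk relevance yourself (step 4).
    return {
        "knowledgeBaseId": KB_ID,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": 5}
        },
    }

def build_rag_request(query: str) -> dict:
    # RetrieveAndGenerate: retrieval plus generation in one call; the FM
    # answers using the retrieved chunks as context (step 5).
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": KB_ID,
                "modelArn": MODEL_ARN,
            },
        },
    }

# With AWS credentials configured, the calls look like:
# client = boto3.client("bedrock-agent-runtime")
# chunks = client.retrieve(**build_retrieve_request("What is chunk overlap?"))
# answer = client.retrieve_and_generate(**build_rag_request("What is chunk overlap?"))
```

Comparing the two outputs on the same query makes the trade-off concrete: Retrieve gives you full control over how chunks are used, while RetrieveAndGenerate handles prompt augmentation for you.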
Experiment with Chunking Strategies
Compare fixed-size, semantic, and hierarchical chunking on the same document set.
- 1 Using your existing KB, note the retrieval quality with fixed-size chunking
- 2 Create a second KB with the same data source but semantic chunking enabled
- 3 Create a third KB with hierarchical chunking (parent: 1500 tokens, child: 300 tokens)
- 4 Run the same 3 test queries against all 3 KBs and compare chunk relevance scores
- 5 Document which strategy works best for your document type and why
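The three knowledge bases above differ only in their `vectorIngestionConfiguration`. A sketch of the three chunking payloads, with token counts mirroring the lab steps; field names follow the `bedrock-agent` `create_data_source` API, but the semantic-chunking parameter values are illustrative, so verify against the current boto3 documentation.

```python
# Sketch of the three chunkingConfiguration payloads used in this lab.
# Token counts mirror the steps above; semantic parameters are illustrative.

fixed = {
    "chunkingConfiguration": {
        "chunkingStrategy": "FIXED_SIZE",
        "fixedSizeChunkingConfiguration": {
            "maxTokens": 300,         # chunk size from step 2 of lab 1
            "overlapPercentage": 20,  # 20% overlap between adjacent chunks
        },
    }
}

semantic = {
    "chunkingConfiguration": {
        "chunkingStrategy": "SEMANTIC",
        "semanticChunkingConfiguration": {
            "maxTokens": 300,
            "bufferSize": 1,                      # sentences of context per breakpoint
            "breakpointPercentileThreshold": 95,  # higher => fewer, larger chunks
        },
    }
}

hierarchical = {
    "chunkingConfiguration": {
        "chunkingStrategy": "HIERARCHICAL",
        "hierarchicalChunkingConfiguration": {
            "levelConfigurations": [
                {"maxTokens": 1500},  # parent chunks (returned to the FM)
                {"maxTokens": 300},   # child chunks (what gets embedded)
            ],
            "overlapTokens": 60,
        },
    }
}

# Each dict is passed as vectorIngestionConfiguration when creating the
# data source, e.g.:
# bedrock_agent.create_data_source(..., vectorIngestionConfiguration=fixed)
```

Note the hierarchical asymmetry: small child chunks are embedded for precise matching, but their larger parents are returned, giving the FM more surrounding context.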
Scenarios
Think through each scenario before revealing the answer.
Multilingual RAG Pipeline
- Which embedding model supports 100+ languages?
- How do you detect the user's query language?
- Should you use metadata filtering for language?
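One answer to the metadata-filtering question can be sketched as follows: tag each document with a `language` attribute at ingestion, detect the query language, and filter at retrieval time. The `detect_language` stub is hypothetical (production code might use Amazon Comprehend's DetectDominantLanguage); the filter shape follows the Retrieve API's `vectorSearchConfiguration`, but verify against the current boto3 documentation.

```python
# Sketch: language-aware retrieval via metadata filtering.
# detect_language is a HYPOTHETICAL stub; the filter payload shape follows
# the Bedrock Retrieve API.

def detect_language(query: str) -> str:
    # Hypothetical stub: keys off a couple of Spanish question words.
    # Replace with a real detector (e.g. Amazon Comprehend).
    return "es" if any(w in query.lower() for w in ("qué", "cómo")) else "en"

def build_filtered_retrieve(kb_id: str, query: str) -> dict:
    lang = detect_language(query)
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": 5,
                # Only consider chunks whose source document was tagged
                # with the matching language at ingestion time.
                "filter": {"equals": {"key": "language", "value": lang}},
            }
        },
    }

req = build_filtered_retrieve("KB123", "¿Qué es el chunking?")
print(req["retrievalConfiguration"]["vectorSearchConfiguration"]["filter"])
```

Filtering narrows the candidate set before vector search, which is usually cheaper and more precise than hoping a multilingual embedding model keeps languages cleanly separated in vector space.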
Legal RAG Chunking Fix
- Why is fixed-size chunking bad for legal documents?
- Which chunking strategy respects document structure?
- How do reranker models help after retrieval?
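On the reranking question: after the vector search returns candidate chunks, a reranker model rescores each one against the query with a deeper relevance model. A sketch of the request shape for `bedrock-agent-runtime`'s Rerank API follows; the model ARN is a placeholder and field names should be verified against the current boto3 documentation.

```python
# Sketch: rescore retrieved chunks with a Bedrock reranker model.
# RERANK_MODEL_ARN is a placeholder; verify the payload shape against the
# current Rerank API documentation.

RERANK_MODEL_ARN = "YOUR_RERANKER_MODEL_ARN"

def build_rerank_request(query: str, chunks: list[str]) -> dict:
    return {
        "queries": [{"type": "TEXT", "textQuery": {"text": query}}],
        "sources": [
            {
                "type": "INLINE",
                "inlineDocumentSource": {
                    "type": "TEXT",
                    "textDocument": {"text": chunk},
                },
            }
            for chunk in chunks
        ],
        "rerankingConfiguration": {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                "modelConfiguration": {"modelArn": RERANK_MODEL_ARN},
                "numberOfResults": 3,  # keep only the top 3 after rescoring
            },
        },
    }

req = build_rerank_request("indemnification clause", ["chunk A", "chunk B", "chunk C", "chunk D"])

# With AWS credentials configured:
# client = boto3.client("bedrock-agent-runtime")
# ranked = client.rerank(**req)
```

For the legal scenario, this matters because hierarchical chunking fixes *what* gets retrieved (structure-respecting chunks), while reranking fixes *which* of those chunks actually answer the question.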
Practice Questions
15 questions across 3 difficulty levels.
Further Reading
Go deeper into today's topics.
Advanced Parsing, Chunking, and Query Reformulation in KB
Deep dive into advanced chunking and query reformulation options for Bedrock Knowledge Bases.
End-to-End RAG with CDK
IaC deployment of full RAG stack (IAM, OpenSearch, KB) using AWS CDK.
Advanced RAG with Terraform: Chunking, Hybrid Search, Reranking
IaC deployment of advanced RAG: 4 chunking strategies, hybrid search, Cohere reranking.
re:Invent 2024 — Build Scalable RAG with Bedrock KB (AIM305)
Advanced techniques for improving RAG accuracy and cost optimization with Bedrock Knowledge Bases.
Implementing RAG with Bedrock and Lambda
Serverless RAG pattern: Lambda invokes KB retrieval, augments prompt, calls FM — production-ready architecture.