AIP-C01 Study Hub
FM Integration Week 1 · Saturday

Day 6: RAG Architecture - Retrieval, Chunking, Embeddings

Learning Objectives

  • Compare chunking strategies (fixed, semantic, hierarchical) and when to use each
  • Select embedding models (Titan Embeddings v2, Cohere Embed v3, Nova Multimodal)
  • Understand Retrieve vs RetrieveAndGenerate APIs
  • Implement query expansion, decomposition, and reranking
  • Use MCP for consistent vector query access patterns
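Every objective above rests on the same core operation: embed the query, embed the chunks, rank by cosine similarity. A minimal sketch with hand-made mock vectors (in a real pipeline these would come from an embedding model such as Titan Embeddings v2 or Cohere Embed v3):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunks most similar to the query vector."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Mock 2-d "embeddings": chunk 0 matches the query, chunk 2 nearly matches.
print(top_k([1, 0], [[1, 0], [0, 1], [0.9, 0.1]], k=2))  # → [0, 2]
```

A vector store like OpenSearch Serverless does exactly this ranking, just at scale and with approximate-nearest-neighbor indexing instead of a full scan.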

Tasks

  • Read · 45m

    Bedrock Knowledge Bases - How It Works

    The authoritative reference for chunking, embedding, retrieval, and reranking.

  • Blog · 25m

    Advanced Parsing, Chunking, and Query Reformulation in KB

    Deep dive into advanced chunking and query reformulation options.

  • Read · 20m

    Bedrock Reranker Models Documentation

    How reranker models rescore retrieved chunks for better relevance.

  • Blog · 15m

    Amazon Nova Multimodal Embeddings

    Cross-modal search: text + image + video + audio embeddings in one model.

  • Hands-on · 120m

    Bedrock RAG Workshop - Chunking and Embedding Notebooks

    Work through different chunking strategies and embedding model selections.
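The reranker reading above boils down to a two-stage pattern: vector search casts a wide net, then a reranker rescores the candidates before they reach the model. A toy sketch, using query-term overlap as a stand-in for a real reranker model:

```python
def rerank(query, chunks):
    """Rescore retrieved chunks and return them best-first.
    Query-term overlap is a crude proxy here; a Bedrock reranker model
    would assign learned relevance scores instead."""
    q_terms = set(query.lower().split())

    def score(chunk):
        c_terms = set(chunk.lower().split())
        return len(q_terms & c_terms) / max(len(q_terms), 1)

    return sorted(chunks, key=score, reverse=True)

retrieved = ["shipping times vary by region",
             "our refund policy is 30 days"]
print(rerank("refund policy", retrieved)[0])  # → "our refund policy is 30 days"
```

The point of the second stage is that the initial vector search optimizes for recall; the reranker optimizes the final ordering, which is what the generating model actually sees.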

Exam Skills

Write your understanding, then reveal the reference answer.


Hands-On Lab

Build real muscle memory with these activities.

Intermediate · 60 min

Build an End-to-End RAG Pipeline with Knowledge Bases

Create a complete RAG pipeline from document upload to retrieval-augmented generation.

  1. Create an S3 bucket and upload 5-10 PDF documents on a specific topic
  2. Create a Bedrock Knowledge Base with OpenSearch Serverless, Titan Embeddings v2, and fixed-size chunking (300 tokens, 20% overlap)
  3. Sync the data source and wait for indexing to complete
  4. Test retrieval with the Retrieve API — try different queries and check chunk relevance
  5. Switch to the RetrieveAndGenerate API and compare the augmented response quality
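The two APIs in steps 4-5 take differently shaped requests: Retrieve returns raw chunks, RetrieveAndGenerate also runs a foundation model over them. A sketch of the request payloads (the knowledge base ID and model ARN are placeholders; the actual calls go through the `bedrock-agent-runtime` boto3 client, shown commented out):

```python
KB_ID = "YOUR_KB_ID"  # placeholder -- use your Knowledge Base ID
MODEL_ARN = "YOUR_MODEL_ARN"  # placeholder -- ARN of the generating model

def retrieve_request(query, k=5):
    """Kwargs for Retrieve: returns the top-k chunks, nothing else."""
    return {
        "knowledgeBaseId": KB_ID,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": k}
        },
    }

def retrieve_and_generate_request(query):
    """Kwargs for RetrieveAndGenerate: retrieval plus an FM-written answer."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": KB_ID,
                "modelArn": MODEL_ARN,
            },
        },
    }

# import boto3
# client = boto3.client("bedrock-agent-runtime")
# chunks = client.retrieve(**retrieve_request("What is the refund policy?"))
# answer = client.retrieve_and_generate(
#     **retrieve_and_generate_request("What is the refund policy?"))
```

Rule of thumb for the exam: Retrieve when you want to control the prompt and generation yourself, RetrieveAndGenerate when you want the managed one-call RAG flow.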
Intermediate · 45 min

Experiment with Chunking Strategies

Compare fixed-size, semantic, and hierarchical chunking on the same document set.

  1. Using your existing KB, note the retrieval quality with fixed-size chunking
  2. Create a second KB with the same data source but semantic chunking enabled
  3. Create a third KB with hierarchical chunking (parent: 1500 tokens, child: 300 tokens)
  4. Run the same 3 test queries against all 3 KBs and compare chunk relevance scores
  5. Document which strategy works best for your document type and why
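To build intuition for what the service is doing behind steps 1 and 3, here is a toy sketch of fixed-size chunking (with fractional overlap) versus hierarchical parent/child chunking over a plain token list:

```python
def fixed_size_chunks(tokens, size=300, overlap=0.2):
    """Fixed-size chunking: equal windows with a fractional overlap
    between neighbours (20% overlap -> each window starts 240 tokens
    after the previous one when size=300)."""
    step = max(1, int(size * (1 - overlap)))
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

def hierarchical_chunks(tokens, parent_size=1500, child_size=300):
    """Hierarchical chunking: small child chunks are what gets embedded
    and searched; each remembers its parent so the wider parent context
    can be returned at retrieval time."""
    out = []
    for p_start in range(0, len(tokens), parent_size):
        parent = tokens[p_start:p_start + parent_size]
        for c_start in range(0, len(parent), child_size):
            out.append({"parent_start": p_start,
                        "child": parent[c_start:c_start + child_size]})
    return out

tokens = list(range(1000))          # stand-in for 1000 tokens of text
print(len(fixed_size_chunks(tokens)))   # → 5 overlapping windows
print(len(hierarchical_chunks(tokens))) # → 4 children under 1 parent
```

Semantic chunking is harder to sketch without an embedding model: it splits where the similarity between consecutive sentences drops, so boundaries follow topic shifts rather than token counts.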

Scenarios

Think through each scenario before revealing the answer.

D1: FM Integration · Hard
#1B

Multilingual RAG Pipeline

A multinational company needs their RAG system to handle queries in English, German, and Japanese. Documents are in all three languages. The system must return results in the user's language. Design the retrieval pipeline.
Think First
  • Which embedding model supports 100+ languages?
  • How do you detect the user's query language?
  • Should you use metadata filtering for language?
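If you settle on metadata filtering (the third hint), the Retrieve request can carry a language filter so only same-language chunks are searched. A sketch, assuming each document was ingested with a `language` metadata attribute (the filter shape follows the Bedrock Knowledge Bases Retrieve API; verify the exact fields against current docs):

```python
def retrieve_with_language_filter(query, kb_id, language, k=5):
    """Kwargs for a Retrieve call restricted to documents whose
    'language' metadata attribute matches the user's query language.
    Language detection of the query itself happens upstream."""
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": k,
                "filter": {
                    "equals": {"key": "language", "value": language}
                },
            }
        },
    }
```

Note the trade-off the scenario is probing: with a truly multilingual embedding model the filter is optional (cross-lingual retrieval can work), but filtering guarantees the retrieved context is in the user's language.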
D1: FM Integration · Hard
#6

Legal RAG Chunking Fix

A legal firm's RAG system retrieves contract clauses, but users complain that retrieved chunks often split mid-clause, producing incoherent context. The pipeline currently uses fixed-size chunking at 512 tokens. How do you fix this?
Think First
  • Why is fixed-size chunking bad for legal documents?
  • Which chunking strategy respects document structure?
  • How do reranker models help after retrieval?
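Once you have an answer in mind: structure-aware strategies are selected at data-source creation time, not at query time. A sketch of the hierarchical chunking configuration fragment (field names as I recall them from the `bedrock-agent` CreateDataSource API's `vectorIngestionConfiguration`; verify against current documentation before use):

```python
def hierarchical_chunking_config(parent_tokens=1500, child_tokens=300,
                                 overlap_tokens=60):
    """Chunking configuration fragment for a KB data source.
    Parent chunks preserve wider context (e.g. a whole clause);
    child chunks are what gets embedded and searched."""
    return {
        "chunkingConfiguration": {
            "chunkingStrategy": "HIERARCHICAL",
            "hierarchicalChunkingConfiguration": {
                "levelConfigurations": [
                    {"maxTokens": parent_tokens},  # level 1: parents
                    {"maxTokens": child_tokens},   # level 2: children
                ],
                "overlapTokens": overlap_tokens,
            },
        }
    }
```

The fix the scenario is steering toward combines two levers: a chunking strategy that respects clause boundaries at ingestion, plus a reranker at retrieval to push the most coherent chunks to the top.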

Practice Questions

15 questions across 3 difficulty levels.

Further Reading

Go deeper into today's topics.