Why Bedrock Changes the AI Equation for AWS Customers
Amazon Bedrock has changed the calculus for AI on AWS. You no longer need to run your own model infrastructure, negotiate GPU spot pricing, or navigate the complexity of deploying and serving open-source LLMs. Bedrock gives you API access to foundation models — Claude, Titan, Llama, Mistral — with AWS-native security, IAM, VPC integration, and CloudWatch observability baked in.
But going from a Bedrock API call to a production AI agent that handles real business workflows is still a significant engineering challenge. Here is how we approach it.
The Architecture Stack
For production Bedrock agents, we typically use:
- Amazon Bedrock Agents — orchestration layer that manages multi-step reasoning and tool calling
- Bedrock Knowledge Bases — RAG (retrieval-augmented generation) over your proprietary data, backed by OpenSearch Serverless or Aurora PostgreSQL with pgvector
- Lambda functions — action groups that give the agent the ability to interact with external systems (APIs, databases, internal tools)
- LangGraph — for complex multi-agent workflows where you need explicit state management and routing logic that Bedrock Agents alone cannot handle
- DynamoDB — session state and conversation history
- CloudWatch + X-Ray — observability across every agent step
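Wired together, the runtime entry point for this stack is a single `invoke_agent` call. A minimal sketch with boto3 — the agent and alias IDs are placeholders, and `collect_completion` just reassembles the streamed answer:

```python
def collect_completion(event_stream):
    """Concatenate the chunks streamed back by invoke_agent."""
    parts = []
    for event in event_stream:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)

def ask_agent(prompt, session_id, agent_id="YOUR_AGENT_ID", alias_id="YOUR_ALIAS_ID"):
    import boto3  # local import: only needed when actually calling AWS

    # bedrock-agent-runtime is the runtime-plane client for invoking agents
    client = boto3.client("bedrock-agent-runtime")
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=alias_id,
        sessionId=session_id,  # reuse the same ID to continue a conversation
        inputText=prompt,
    )
    # invoke_agent returns its answer as an event stream of chunks
    return collect_completion(response["completion"])
```

Session state lives server-side against the `sessionId`, which is why DynamoDB in the stack above only needs to hold your own metadata and audit copy of the history.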
The Decision That Matters Most: Bedrock Agents vs LangGraph
Bedrock Agents handles the common case well — a single agent with a knowledge base and a set of tools that needs to reason across a few steps. If your use case fits this pattern, use Bedrock Agents. The managed orchestration saves significant engineering effort and the AWS-native integration is a genuine advantage.
Use LangGraph when:
- You need multiple specialised agents working together with explicit routing logic
- Your workflow has conditional branches that depend on previous agent outputs
- You need fine-grained control over retry logic, timeout handling, and partial failure states
- You are building a human-in-the-loop workflow where an agent needs to pause and wait for approval
We often combine both — Bedrock Agents for the individual specialised agents, LangGraph as the orchestrator that routes between them.
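A sketch of that hybrid pattern, using LangGraph's `StateGraph` API. The keyword classifier here is a stand-in for a real routing model, and the two agent nodes only stub out the `invoke_agent` calls they would make in production:

```python
from typing import TypedDict

class AgentState(TypedDict):
    question: str
    route: str
    answer: str

def classify(state: AgentState) -> AgentState:
    # Stand-in router: a production version would call a cheap model here
    q = state["question"].lower()
    state["route"] = "shipping" if ("shipment" in q or "delivery" in q) else "billing"
    return state

def pick_agent(state: AgentState) -> str:
    return state["route"]

def shipping_agent_node(state: AgentState) -> AgentState:
    # Production version: invoke_agent on the shipping specialist Bedrock Agent
    state["answer"] = f"[shipping agent handles: {state['question']}]"
    return state

def billing_agent_node(state: AgentState) -> AgentState:
    # Production version: invoke_agent on the billing specialist Bedrock Agent
    state["answer"] = f"[billing agent handles: {state['question']}]"
    return state

def build_graph():
    # langgraph imported here so the routing logic above is testable on its own
    from langgraph.graph import StateGraph, END

    graph = StateGraph(AgentState)
    graph.add_node("classify", classify)
    graph.add_node("shipping", shipping_agent_node)
    graph.add_node("billing", billing_agent_node)
    graph.set_entry_point("classify")
    graph.add_conditional_edges(
        "classify", pick_agent, {"shipping": "shipping", "billing": "billing"}
    )
    graph.add_edge("shipping", END)
    graph.add_edge("billing", END)
    return graph.compile()
```

The conditional edge is the piece Bedrock Agents alone cannot express: routing is an explicit, testable function rather than something buried in a prompt.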
Knowledge Bases: The RAG Implementation Details
Most production AI agents need to reason over proprietary data. Bedrock Knowledge Bases handles the ingestion pipeline (chunking, embedding, indexing) automatically, but there are decisions you need to make explicitly:
- Chunk size: Smaller chunks (256-512 tokens) work better for factual retrieval. Larger chunks (1024-2048 tokens) work better when the agent needs surrounding context to answer correctly.
- Embedding model: Titan Embeddings v2 is our default. It is cost-effective and well-optimised for enterprise content.
- Retrieval strategy: Hybrid search (semantic + keyword) consistently outperforms pure vector search on enterprise content. Enable it.
- Re-ranking: For high-stakes use cases, adding a Cohere Rerank model after initial retrieval meaningfully improves answer quality.
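Hybrid search is a one-line override at query time. A sketch using the `bedrock-agent-runtime` `retrieve` API — the knowledge base ID is a placeholder:

```python
def hybrid_retrieval_config(top_k=5):
    # overrideSearchType="HYBRID" combines semantic and keyword scoring
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": top_k,
            "overrideSearchType": "HYBRID",
        }
    }

def retrieve_chunks(kb_id, query, top_k=5):
    import boto3  # local import: only needed when actually calling AWS

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration=hybrid_retrieval_config(top_k),
    )
    return [r["content"]["text"] for r in response["retrievalResults"]]
```

Retrieving raw chunks like this (rather than letting the agent query the knowledge base implicitly) is also where a re-ranking step would slot in, between `retrieve` and the generation call.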
Security and Compliance Considerations
Bedrock inherits AWS IAM, so your existing least-privilege policies apply. Key considerations:
- Use IAM resource policies on Bedrock models to restrict which roles can invoke which models
- Enable Bedrock Guardrails for content filtering, topic blocking, and PII detection on production agents
- Never pass sensitive data (PII, credentials) through the agent prompt — use Lambda action groups to handle sensitive lookups server-side
- Log all Bedrock invocations to CloudWatch. Enable model invocation logging at the account level for compliance audit trails
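The first point translates directly into a least-privilege identity policy. A sketch that builds one as a Python dict — the ARN shown is Claude 3 Sonnet in us-east-1; adjust the region and model list for your account:

```python
import json

def model_invoke_policy(model_arns):
    # Least-privilege: the role may invoke only the listed foundation models
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": model_arns,
            }
        ],
    }

claude_sonnet = (
    "arn:aws:bedrock:us-east-1::foundation-model/"
    "anthropic.claude-3-sonnet-20240229-v1:0"
)
print(json.dumps(model_invoke_policy([claude_sonnet]), indent=2))
```

Attach the resulting policy to the agent's execution role; anything not in `model_arns` is denied by default.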
A Real Example: Internal Document Q&A Agent
One of our clients — a 150-person logistics company — needed their operations team to query a 10-year archive of SOPs, contracts, and regulatory documents without opening a support ticket every time. We built:
- A Bedrock Knowledge Base ingesting 4,000 documents from S3, chunked at 512 tokens with hybrid search
- A Bedrock Agent with Claude 3 Sonnet as the model, connected to the Knowledge Base and a Lambda action group that could fetch real-time shipment status from their internal API
- A simple React interface authenticated via Cognito, with the agent backend on Lambda behind API Gateway
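The shipment-status action group follows the standard Lambda contract for Bedrock Agents. A sketch assuming the function-details schema, with `get_shipment_status` as a hypothetical stand-in for the client's internal API:

```python
def get_shipment_status(shipment_id):
    # Placeholder for the real call to the client's internal shipment API
    return f"Shipment {shipment_id}: in transit"

def lambda_handler(event, context):
    # Bedrock Agents passes parameters as a list of {name, type, value} dicts
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    status = get_shipment_status(params.get("shipment_id", ""))

    # Response shape Bedrock Agents expects for function-details action groups
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {
                "responseBody": {"TEXT": {"body": status}}
            },
        },
    }
```

The agent decides when to call this function based on its description in the action group schema; the Lambda itself stays a thin, auditable wrapper around the internal API.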
The agent handles 200+ queries per day. Operations ticket volume dropped 60% in the first month. Total infrastructure cost: under $400/month.
Getting Started
The barrier to production Bedrock agents has never been lower. The main investment is in the design work — defining what the agent should do, what data it should access, and what tools it needs. The implementation, when done correctly, follows from that design cleanly.
If you are evaluating Bedrock for an internal use case or a customer-facing product, get in touch. We can scope the architecture in a single session.
