Dev Unit 6 (continued): RAG & Multi-Tool Agents

Adding knowledge and chaining tools

Agentic Development Course

What We'll Cover Today

1. Quick Recap: What we built last time
2. RAG & Embeddings: Give your agent a knowledge base
3. Multi-Tool Agents: Chaining tools together
4. Conversation Memory: Multi-turn context
5. Beyond LangChain: Other frameworks
6. Production Considerations: Costs, debugging, security
7. Your Agent Project: Adding RAG and finishing up

What You Should Be Able to DO After Today

1. Implement RAG — embed documents, vector search, semantic retrieval
2. Build a multi-tool chatbot that picks the right tool for each question
3. Add conversation memory to maintain context across turns
4. Evaluate when to use an agent vs. a single LLM call
Today you complete the picture: agents with knowledge, multi-tool reasoning, and memory.

Quick Recap: What We Built Last Session

You now know how to:

  • Define tools with tool() — function + name/description + Zod schema
  • Create a ReAct agent with createReactAgent from LangChain
  • Build a calculator tool and a web search tool
  • Handle errors in tools (return messages, never throw)

Today: We add a knowledge base (RAG), chain multiple tools together, and complete your homework assignment.

RAG & Embeddings

[Image: A glowing connected library]

Part 1

Give your agent a knowledge base

Tool 3: RAG (In-Memory Vector Search)

What is RAG? Retrieval-Augmented Generation

RAG Pipeline: Embed → Vector Search → LLM + Context

Why? Give the agent access to YOUR documents — company docs, course notes, API references, anything.

What Are Embeddings?

[Image: Constellation of meaning, words as stars]

Embeddings convert text into vectors (arrays of numbers) that capture meaning:

Embeddings: similar text clusters together in vector space

Embeddings in Code


"king"    → [0.21, -0.45, 0.89, 0.12, ...]
"queen"   → [0.19, -0.42, 0.91, 0.15, ...]   ← similar!
"banana"  → [0.82, 0.33, -0.11, 0.67, ...]   ← different!
                        

Key insight: Similar meanings → similar vectors → we can search by meaning, not just keywords.


import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",  // Fast and cheap
});

const vector = await embeddings.embedQuery("What is photosynthesis?");
// → [0.021, -0.045, 0.089, ...] (1536 numbers)
                        
Note: Anthropic doesn't offer an embeddings model, so you'll need an OpenAI API key for this part regardless of which chat model you use. Alternatively, look into Voyage AI (recommended by Anthropic) or a free local option.
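How do we compare two of these vectors? The standard measure is cosine similarity. Here is a minimal, dependency-free sketch using tiny hand-made 4-dimensional vectors (real embeddings have ~1536 dimensions; the numbers below are illustrative, not real model output):

```typescript
// Cosine similarity: dot product of two vectors divided by the
// product of their magnitudes. 1 = same direction, ~0 = unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 4-dimensional "embeddings" (hand-made for illustration)
const king = [0.21, -0.45, 0.89, 0.12];
const queen = [0.19, -0.42, 0.91, 0.15];
const banana = [0.82, 0.33, -0.11, 0.67];

console.log(cosineSimilarity(king, queen));   // close to 1 — similar meaning
console.log(cosineSimilarity(king, banana));  // close to 0 — unrelated
```

This is the comparison a vector store runs under the hood; you never compute it by hand when using LangChain, but it explains why "search by meaning" works.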

Building the Vector Store


import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

// Create in-memory vector store
const vectorStore = new MemoryVectorStore(embeddings);

// Add your documents
await vectorStore.addDocuments([
  {
    pageContent: "Our API rate limit is 100 requests per minute...",
    metadata: { source: "api-docs.md", topic: "rate-limits" },
  },
  {
    pageContent: "Authentication uses JWT tokens with 24h expiry...",
    metadata: { source: "auth-docs.md", topic: "authentication" },
  },
  // ... add as many documents as you want
]);
                        

Loading Documents for Your Homework

Option 1: Create a documents.ts file with your content as an array of objects

Option 2: Use LangChain's DirectoryLoader to load files from a docs/ folder:


import { DirectoryLoader } from "langchain/document_loaders/fs/directory";
import { TextLoader } from "langchain/document_loaders/fs/text";

const loader = new DirectoryLoader("./docs", {
  ".txt": (path) => new TextLoader(path),
});
const docs = await loader.load();
await vectorStore.addDocuments(docs);
                        

Either approach works — pick whichever fits your document source.
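For Option 1, the file is just an exported array in the { pageContent, metadata } shape that addDocuments() expects. A sketch (the file name and document contents are placeholders for your own material):

```typescript
// documents.ts — a hand-written document set for the in-memory store.
// Each entry matches the shape addDocuments() expects.
export const myDocuments = [
  {
    pageContent: "Our API rate limit is 100 requests per minute per key.",
    metadata: { source: "api-docs.md", topic: "rate-limits" },
  },
  {
    pageContent: "Authentication uses JWT tokens with a 24-hour expiry.",
    metadata: { source: "auth-docs.md", topic: "authentication" },
  },
  {
    pageContent: "The starter plan costs $12/month. (Placeholder pricing.)",
    metadata: { source: "pricing.md", topic: "pricing" },
  },
];
```

Then import it where you build the store: `import { myDocuments } from "./documents";` followed by `await vectorStore.addDocuments(myDocuments);`.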

The RAG Tool


const knowledgeBase = tool(
  async ({ query }) => {
    // Semantic search — finds documents by MEANING
    const results = await vectorStore.similaritySearch(query, 3);

    if (results.length === 0) {
      return "No relevant documents found.";
    }

    return results
      .map((doc, i) =>
        `[${i + 1}] (Source: ${doc.metadata.source})\n${doc.pageContent}`
      )
      .join("\n\n");
  },
  {
    name: "knowledge_base",
    description:
      "Search the documentation knowledge base using semantic " +
      "search. Use this to find information from our docs " +
      "about APIs, authentication, configuration, etc.",
    schema: z.object({
      query: z.string().describe(
        "Natural language query about the documentation"
      ),
    }),
  }
);
                        

RAG: Why In-Memory?

| Approach | Best For | Trade-offs |
| --- | --- | --- |
| In-memory (today) | Prototypes, small datasets (<10K docs) | Fast, no setup; lost on restart |
| Pinecone / Weaviate | Production, large datasets | Persistent, scalable; costs money |
| ChromaDB | Local development | Persistent, free; single machine |
| PostgreSQL + pgvector | If you already use Postgres | Integrated; requires Postgres |

For homework: In-memory is perfect. Load your docs at startup, search them during conversation.
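If you want to demystify what similaritySearch actually does, here is a toy version of an in-memory store. Everything here is hypothetical — hand-made 3-dimensional vectors stand in for calls to an embeddings model — so the example runs without an API key:

```typescript
type Doc = { pageContent: string; vector: number[] };

// Cosine similarity between two vectors
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// A toy "vector store": a real store would get these vectors
// from an embeddings model at addDocuments() time.
const store: Doc[] = [
  { pageContent: "Rate limits: 100 req/min", vector: [0.9, 0.1, 0.0] },
  { pageContent: "JWT auth, 24h expiry",     vector: [0.1, 0.9, 0.1] },
  { pageContent: "Starter plan pricing",     vector: [0.0, 0.2, 0.9] },
];

// similaritySearch: rank all docs by similarity to the query
// vector and return the top k.
function similaritySearch(queryVector: number[], k: number): Doc[] {
  return [...store]
    .sort((x, y) => cosine(y.vector, queryVector) - cosine(x.vector, queryVector))
    .slice(0, k);
}

// A query "about authentication" would embed near [0.1, 0.9, 0.1]
const results = similaritySearch([0.2, 0.8, 0.1], 2);
console.log(results[0].pageContent); // "JWT auth, 24h expiry"
```

The real MemoryVectorStore does essentially this (plus embedding the query for you), which is why it scales fine for a homework-sized document set but not for millions of documents.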

In Practice

Multi-Tool Agents

[Image: Octopus operator juggling tools]

Part 2

Chaining tools together

Multi-Tool Agent Architecture

Multi-Tool Agent: LLM routes to Calculator, Web Search, Knowledge Base

The LLM reads tool descriptions and decides which tool(s) to call for each question.

Multi-Tool Reasoning in Action

Question: "How much does the starter plan cost per year?"

Multi-tool chain: RAG → Calculator → Final Answer
The agent chained two tools together — RAG for the fact, calculator for the math. You didn't have to tell it to do this.

Multi-Tool Reasoning: Another Example

A campus event budget app:


Question: "How much would it cost to cater 3 club events this month?"

Step 1: knowledge_base("catering pricing options")
  → "Basic pizza package: $85/event. Sandwich platter: $120/event."

Step 2: calculator("85 * 3")
  → "255"

Final answer: "The basic pizza package for 3 events would be $255.
  The sandwich platter option would be $360 (3 × $120)."
                        
The agent chained RAG (pricing knowledge) + calculator (cost math) — same multi-tool pattern.

Conversation Memory

[Image: Evidence board with connected clues]

Part 3

Multi-turn context

Adding Conversation Memory

Conversation memory: message history grows across turns

// Maintain message history across turns
let messageHistory = [];

async function chat(userMessage) {
  messageHistory.push({
    role: "user",
    content: userMessage,
  });

  const result = await agent.invoke({
    messages: messageHistory,
  });

  const assistantMessage =
    result.messages[result.messages.length - 1];

  messageHistory.push({
    role: "assistant",
    content: assistantMessage.content,
  });

  return assistantMessage.content;
}

// Multi-turn conversation
await chat("What does the starter plan cost?");
await chat("And what's that per year?");  // Remembers context!
                        

Conversation Memory: Caveats

Note: This is a naive implementation — the message array grows without limit. In production, you'd truncate or summarize history to stay within context window limits.
Note: This simplified version only stores the final response. For tool-using agents, you may want to preserve the full message history including tool calls for better context.
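A minimal sketch of the truncation idea: keep any system message plus only the most recent turns. This is a simple windowing strategy (production systems often summarize older history instead), and the Message type and function name here are ours, not LangChain APIs:

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

// Keep the system message (if any) plus the last `maxTurns` pairs of
// user/assistant messages, dropping the oldest middle of the history.
function truncateHistory(history: Message[], maxTurns: number): Message[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(-maxTurns * 2)];
}

const history: Message[] = [
  { role: "system", content: "You are a helpful agent." },
  { role: "user", content: "turn 1" },
  { role: "assistant", content: "answer 1" },
  { role: "user", content: "turn 2" },
  { role: "assistant", content: "answer 2" },
  { role: "user", content: "turn 3" },
  { role: "assistant", content: "answer 3" },
];

const trimmed = truncateHistory(history, 2);
// → system message + the last 2 user/assistant turns (5 messages total)
```

Call this on messageHistory before each agent.invoke() and the context window stays bounded no matter how long the conversation runs.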

In Practice

[Image: DJ mixing tools at a turntable]

Part 4

Beyond LangChain

Other agent frameworks

Beyond LangChain: Other Agent Frameworks

| Framework | Language | Best For |
| --- | --- | --- |
| LangChain (this course) | Python + JS/TS | Broadest ecosystem, most tutorials |
| Mastra | TypeScript only | TS-native DX, built-in production features |
| CrewAI | Python | Multi-agent teams with defined roles |
| OpenAI Agents SDK | Python | Simple agents, OpenAI-only (vendor locked) |
The key insight: They all implement the same ReAct loop under the hood. Learn one well, understand the patterns — switching frameworks is straightforward.

For this course: We use LangChain (broadest ecosystem). Explore others on your own.

Production Considerations

[Image: Hidden costs revealed]

Part 5

Costs, debugging, security

Token Costs and Iteration Limits

Every iteration = another LLM API call = more tokens = more cost

| Scenario | Typical Iterations | Cost Impact |
| --- | --- | --- |
| Simple calculation | 1-2 | Low |
| Web search + answer | 2-3 | Medium |
| Multi-tool chain | 3-5 | Higher |
| Agent stuck in loop | 10+ | Expensive! |

Control the loop:


const agent = createReactAgent({
  model: new ChatAnthropic({ model: "claude-3-5-haiku-latest" }),
  tools: tools,
});

// Limit iterations when invoking
const result = await agent.invoke(
  { messages },
  { recursionLimit: 10 }  // Prevent infinite loops
);
                        

Security Considerations

From Dev Unit 4 — still applies here:

  • Never hardcode API keys — use environment variables
  • Validate tool inputs — especially for calculator
  • Sanitize user input — prevent prompt injection through tools
  • Audit tool outputs — don't blindly trust web search results

// ❌ Dangerous — eval executes arbitrary code
const result = eval(userExpression);

// ❌ Function() is also unsafe for untrusted input (as we noted last session)

// ✅ Use a proper math library in production
import { evaluate } from "mathjs";
const result = evaluate(expression);  // Only does math, nothing else
                        
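As a complementary layer, you can reject anything that is not plainly arithmetic before it ever reaches the evaluator. A sketch with an allowlist regex (the exact character set is an assumption; widen it to match whatever syntax your calculator supports):

```typescript
// Allow only digits, whitespace, decimal points, parentheses,
// and basic arithmetic operators — reject everything else.
const SAFE_EXPRESSION = /^[\d\s.+\-*/%()]+$/;

function isSafeExpression(expr: string): boolean {
  return SAFE_EXPRESSION.test(expr) && expr.trim().length > 0;
}

console.log(isSafeExpression("85 * 3"));          // true
console.log(isSafeExpression("(100 + 20) / 4"));  // true
console.log(isSafeExpression("process.exit(1)")); // false — letters rejected
console.log(isSafeExpression("require('fs')"));   // false — letters rejected
```

This does not replace a proper math library like mathjs; it just fails fast with a clear error message before any evaluation happens.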

Troubleshooting: Common Errors (Reference Handout)

| Error | Cause | Fix |
| --- | --- | --- |
| API_KEY not set | Missing env variable | export ANTHROPIC_API_KEY="..." or export OPENAI_API_KEY="sk-..." |
| Tool not found | Typo in tool name | Check the tool name matches exactly |
| Agent returns text instead of calling tool | Bad tool description | Make the description more specific about WHEN to use the tool |
| RateLimitError | Too many API calls | Add delays, use claude-3-5-haiku-latest, reduce maxResults |
| RAG returns no results | Documents not loaded | Verify addDocuments() was awaited |
| TypeError: Cannot read properties | Forgot async/await | Web and RAG tools MUST be async |

Your Agent Project — Adding RAG and Finishing Up

[Image: Craftsperson placing the final piece]

Part 6

Your Target by the Deadline

Here's what we're working toward over the next two weeks:

  • A working agent with calculator and web search tools
  • Repo with context.md, PRD, and roadmap
  • A web UI in progress (or at minimum a working terminal interface)
  • Incremental git history showing your development process
  • Logging that shows which tool was called, arguments passed, and results returned

You have two weeks — focus on getting the agent working first, then layer in the infrastructure.
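For the logging requirement, one lightweight option is to wrap each tool's function before you register it. This helper is hypothetical (not a LangChain API) and works for any async tool body:

```typescript
// Wrap a tool function so every call logs its name, arguments,
// and result before passing the result through unchanged.
function withLogging<A, R>(
  name: string,
  fn: (args: A) => Promise<R>
): (args: A) => Promise<R> {
  return async (args: A) => {
    console.log(`[tool] ${name} called with`, JSON.stringify(args));
    const result = await fn(args);
    console.log(`[tool] ${name} returned`, JSON.stringify(result));
    return result;
  };
}

// Usage: wrap the body you would otherwise pass to tool(...)
const loggedCalculator = withLogging(
  "calculator",
  async ({ expression }: { expression: string }) => `result of ${expression}`
);
```

Because the wrapper only touches the function, the tool's name, description, and schema stay exactly as before; the agent never knows the difference.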

What to Add Now

Add a RAG tool:

  • Choose a document set (project docs, course notes, a library's docs, a company FAQ)
  • Create at least 5 documents with meaningful content
  • Use OpenAI embeddings (or Voyage AI) + in-memory vector store
  • Build a knowledge_base tool that returns relevant documents with source attribution

Add conversation memory:

  • Maintain a message history array across turns
  • Pass the full history to the agent on each invocation
  • Verify multi-turn conversations work (follow-up questions that reference earlier context)

Complete your web UI — your agent should be usable through a simple chat web page, not just a terminal.

Continue updating your roadmap and committing incrementally as you add these features.

Deliverables

1. A repo that demonstrates your development process: context.md, .gitignore, PRD, roadmap with phases checked off, logging that shows which tool was called (plus arguments passed and results returned), and incremental git history
2. Working agent — three tools (calculator, web search, RAG with 5+ real docs) with conversation memory, accessible through a web UI
3. README.md — explaining your agent, what tools it has, and how to run it
4. Demo — a 2-minute screen capture showing your web UI with at least 2-3 of your agent's tools/features

Stretch Goals (Optional)

1. Add streaming — show responses in real time in your web UI; dramatically improves the user experience
2. Add a 4th tool — file reader, database query, or API call to a service you use
3. Persistent vector store — use ChromaDB or a hosted vector DB so documents survive restarts
4. Connect to your project — Identify one feature in a project you've built that would benefit from an agent pattern (tool calling, RAG, or multi-step reasoning). Write a ~1 page proposal: what the agent would do, what tools it would need, and why it's better than a single LLM call.

Key Takeaways

1
RAG gives agents YOUR knowledge Embeddings + vector search + any document set
2
Multi-tool chaining is automatic The agent reasons about which tools to call and in what order
3
Agent frameworks share the same patterns LangChain, Mastra, CrewAI all implement the same ReAct loop
4
Not every feature needs an agent Set a clear hypothesis before building one
5
The full picture: Agents = LLM + Tools + Loop + Knowledge You can now build all of it

Resources

Now go build something amazing.

You have two weeks — start today.
