RAG (Retrieval Augmented Generation)
RAG (Retrieval Augmented Generation) enables agents to search and retrieve relevant information from documents to provide more accurate and contextual responses.
RAG Types
Astreus supports two types of RAG systems:
- Vector RAG: Uses vector embeddings for semantic search
- Document RAG: Uses keyword-based search with full-text indexing
Creating RAG Systems
Use the RAG factory function to create RAG instances:
import { createAgent, createProvider, createMemory, createDatabase, createRAG } from '@astreus-ai/astreus';
async function setupAgentWithRAG() {
const db = await createDatabase();
const memory = await createMemory({ database: db });
const provider = createProvider({ type: 'openai', model: 'gpt-4o-mini' });
// Create Vector RAG for semantic search
const vectorRAG = await createRAG({
type: 'vector',
database: db,
provider: provider,
tableName: 'vector_documents',
chunkSize: 1000,
chunkOverlap: 200
});
// Create Document RAG for keyword search
const documentRAG = await createRAG({
type: 'document',
database: db,
tableName: 'text_documents',
chunkSize: 500
});
const agent = await createAgent({
name: 'RAGAgent',
provider: provider,
memory: memory,
systemPrompt: "You are a helpful assistant with access to document search capabilities."
});
return { agent, vectorRAG, documentRAG };
}
Vector RAG Configuration
Vector RAG uses embeddings for semantic similarity search:
// Vector RAG with external vector database
const vectorRAG = await createRAG({
type: 'vector',
database: mainDatabase,
vectorDatabase: {
type: 'postgres',
connectionString: 'postgresql://user:password@localhost:5432/vector_db'
},
provider: provider,
tableName: 'knowledge_base',
chunkSize: 1000,
chunkOverlap: 200
});
// Vector RAG with same database for vectors
const singleDbVectorRAG = await createRAG({
type: 'vector',
database: database, // Same database for both documents and vectors
provider: provider,
tableName: 'documents',
chunkSize: 800,
chunkOverlap: 150
});
Vector RAG Options
Option | Type | Description | Default |
---|---|---|---|
type | string | Must be 'vector' | Required |
database | DatabaseInstance | Main database for metadata | Required |
vectorDatabase | VectorDatabaseConfig | Optional separate vector database config | Same as database |
provider | ProviderInstance | Provider for embeddings | Required |
tableName | string | Base table name (creates tableName_documents, tableName_chunks) | 'rag' |
chunkSize | number | Size of text chunks | 1000 |
chunkOverlap | number | Overlap between chunks | 200 |
Document RAG Configuration
Document RAG uses full-text search for keyword matching:
const documentRAG = await createRAG({
type: 'document',
database: database,
tableName: 'documents',
chunkSize: 500,
searchFields: ['title', 'content', 'tags']
});
Document RAG Options
Option | Type | Description | Default |
---|---|---|---|
type | string | Must be 'document' | Required |
database | DatabaseInstance | Database for storage | Required |
tableName | string | Base table name (creates tableName_documents) | 'rag' |
chunkSize | number | Size of text chunks | 1000 |
searchFields | string[] | Fields to search in | ['content'] |
Adding Documents
Add documents to your RAG system for search:
// Add a single document
const docId = await vectorRAG.addDocument({
title: "Machine Learning Guide",
content: "Machine learning is a subset of artificial intelligence...",
metadata: {
author: "John Doe",
category: "AI",
tags: ["ml", "ai", "guide"],
publishDate: "2024-01-15"
}
});
console.log(`Document added with ID: ${docId}`);
// Add multiple documents
const documents = [
{
title: "Deep Learning Basics",
content: "Deep learning uses neural networks with multiple layers...",
metadata: { category: "AI", difficulty: "beginner" }
},
{
title: "Natural Language Processing",
content: "NLP enables computers to understand human language...",
metadata: { category: "AI", difficulty: "intermediate" }
}
];
for (const doc of documents) {
await vectorRAG.addDocument(doc);
}
Searching Documents
Perform searches to retrieve relevant information:
// Vector search (semantic similarity)
const vectorResults = await vectorRAG.search("How do neural networks learn?", 5);
console.log(`Found ${vectorResults.length} relevant documents`);
vectorResults.forEach(result => {
console.log(`Score: ${result.similarity}`);
console.log(`Title: ${result.metadata.title}`);
console.log(`Content: ${result.content.substring(0, 200)}...`);
});
// Document search (keyword matching)
const documentResults = await documentRAG.search("machine learning algorithms", 3);
documentResults.forEach(result => {
console.log(`Title: ${result.metadata.title}`);
console.log(`Relevance: ${result.similarity}`);
});
RAG Tools for Agents
RAG systems automatically create search tools that agents can use:
// Get RAG tools
const vectorTools = vectorRAG.createRAGTools();
const documentTools = documentRAG.createRAGTools();
// Add tools to agent
vectorTools.forEach(tool => agent.addTool(tool));
documentTools.forEach(tool => agent.addTool(tool));
// Agent can now use RAG in conversations
const response = await agent.chat({
message: "What are the latest developments in machine learning?",
sessionId: "research-session",
stream: true,
onChunk: (chunk) => console.log(chunk)
});
// The agent will automatically search documents and include relevant information
Advanced RAG Usage
Custom Search with Metadata Filtering
// Search with metadata filters (for Document RAG)
if ('searchByMetadata' in documentRAG) {
const filteredResults = await documentRAG.searchByMetadata({
category: "AI",
difficulty: "intermediate"
}, 5);
}
// Search within date range
const recentResults = await vectorRAG.search("latest AI trends", 10);
Hybrid Search (Vector + Document)
Combine both RAG types for comprehensive search:
class HybridRAGSearch {
constructor(private vectorRAG: any, private documentRAG: any) {}
async hybridSearch(query: string, limit: number = 5) {
// Perform both searches in parallel
const [vectorResults, documentResults] = await Promise.all([
this.vectorRAG.search(query, limit),
this.documentRAG.search(query, limit)
]);
// Combine and deduplicate results
const combinedResults = new Map();
// Add vector results with semantic scores
vectorResults.forEach(result => {
combinedResults.set(result.id, {
...result,
vectorScore: result.similarity,
documentScore: 0
});
});
// Add document results with keyword scores
documentResults.forEach(result => {
if (combinedResults.has(result.id)) {
combinedResults.get(result.id).documentScore = result.similarity;
} else {
combinedResults.set(result.id, {
...result,
vectorScore: 0,
documentScore: result.similarity
});
}
});
// Calculate hybrid scores and sort
const hybridResults = Array.from(combinedResults.values())
.map(result => ({
...result,
hybridScore: (result.vectorScore * 0.7) + (result.documentScore * 0.3)
}))
.sort((a, b) => b.hybridScore - a.hybridScore)
.slice(0, limit);
return hybridResults;
}
}
// Use hybrid search
const hybridSearch = new HybridRAGSearch(vectorRAG, documentRAG);
const hybridResults = await hybridSearch.hybridSearch("machine learning algorithms", 5);
Document Management
Manage documents in your RAG system:
// Get document by ID
const document = await vectorRAG.getDocument(docId);
console.log('Document:', document);
// Update document
await vectorRAG.updateDocument(docId, {
title: "Updated Machine Learning Guide",
content: "Updated content...",
metadata: {
...document.metadata,
lastModified: new Date().toISOString()
}
});
// Delete document
await vectorRAG.deleteDocument(docId);
// List all documents
const allDocuments = await vectorRAG.listDocuments({
limit: 20,
offset: 0
});
console.log(`Total documents: ${allDocuments.length}`);
RAG with External Vector Databases
For production use, you can use external vector databases:
// Example with separate vector database
const mainDB = await createDatabase({
type: 'postgresql',
host: 'localhost',
database: 'main_app'
});
const vectorDB = await createDatabase({
type: 'postgresql',
host: 'vector-db-host',
database: 'vector_store'
});
const vectorRAG = await createRAG({
type: 'vector',
database: mainDB, // Main database for metadata
vectorDatabase: vectorDB, // Separate vector database
provider: provider,
tableName: 'knowledge_base'
});
// This setup:
// - Stores document metadata in main database
// - Stores vectors in separate vector database
// - Optimizes performance and storage
Performance Optimization
Chunking Strategy
// Optimize chunking for your content type
const technicalDocsRAG = await createRAG({
type: 'vector',
database: db,
provider: provider,
chunkSize: 1500, // Larger chunks for technical content
chunkOverlap: 300 // More overlap for context preservation
});
const chatLogsRAG = await createRAG({
type: 'vector',
database: db,
provider: provider,
chunkSize: 500, // Smaller chunks for conversational content
chunkOverlap: 100 // Less overlap for distinct messages
});
Batch Document Processing
// Process multiple documents efficiently
async function batchAddDocuments(rag: any, documents: any[]) {
const batchSize = 10;
const results = [];
for (let i = 0; i < documents.length; i += batchSize) {
const batch = documents.slice(i, i + batchSize);
// Process batch in parallel
const batchPromises = batch.map(doc => rag.addDocument(doc));
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Optional: Add delay between batches to avoid rate limits
if (i + batchSize < documents.length) {
await new Promise(resolve => setTimeout(resolve, 1000));
}
}
return results;
}
// Use batch processing
const documentIds = await batchAddDocuments(vectorRAG, largeDocumentSet);
Integration with Agents
Complete example of RAG-enabled agent:
import { createAgent, createProvider, createMemory, createDatabase, createRAG } from '@astreus-ai/astreus';
async function createRAGAgent() {
const db = await createDatabase();
const memory = await createMemory({ database: db });
const provider = createProvider({ type: 'openai', model: 'gpt-4o-mini' });
// Create RAG system
const rag = await createRAG({
type: 'vector',
database: db,
provider: provider,
tableName: 'knowledge_base',
chunkSize: 1000,
chunkOverlap: 200
});
// Create agent
const agent = await createAgent({
name: 'KnowledgeAgent',
provider: provider,
memory: memory,
systemPrompt: `You are a knowledgeable assistant with access to a comprehensive document database.
When answering questions, search for relevant information and cite your sources.`
});
// Add RAG tool to agent
const ragTool = rag.getTool();
agent.addTool(ragTool);
return { agent, rag };
}
// Use RAG-enabled agent
const { agent, rag } = await createRAGAgent();
// Add some documents
await rag.addDocument({
title: "Company Policies",
content: "Our company values include integrity, innovation, and customer focus...",
metadata: { type: "policy", department: "HR" }
});
// Agent can now search and reference documents
const response = await agent.chat({
message: "What are our company values?",
sessionId: "hr-inquiry",
stream: true,
onChunk: (chunk) => console.log(chunk)
});
Best Practices
- Chunking Strategy: Choose appropriate chunk sizes based on your content type
- Metadata Design: Include relevant metadata for filtering and organization
- Search Thresholds: Set appropriate similarity thresholds for vector search
- Document Updates: Regularly update documents to maintain accuracy
- Performance Monitoring: Monitor search performance and optimize as needed
- Hybrid Approach: Consider combining vector and document RAG for comprehensive search
- External Databases: Use separate vector databases for production workloads
Migration Notes
The RAG system has been optimized for better performance and database usage:
- External Vector Databases: When using separate vector databases, only minimal tables are created in the main database
- Improved Chunking: Better text chunking with configurable overlap
- Enhanced Search: More accurate similarity search with metadata filtering
- Tool Integration: Automatic tool creation for seamless agent integration
How is this guide?