clawdesk-memory

Memory and retrieval-augmented generation (RAG) pipeline. Combines BM25 keyword search, vector embeddings, and hybrid re-ranking to provide contextual memory for agent conversations.

Dependencies

Internal: clawdesk-types, clawdesk-storage

External: tokio, serde, tracing, async-trait

Modules

| Module | Description |
| --- | --- |
| bm25 | BM25 keyword-based search scoring |
| embedding | Text embedding generation (via provider or local model) |
| hybrid | Hybrid search combining BM25 + vector similarity |
| ingest | Document ingestion pipeline (chunk, embed, store) |
| manager | MemoryManager: high-level memory operations |
| pipeline | RAG pipeline orchestration |
| reranker | Cross-encoder re-ranking for result refinement |
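
The ingest module is described as chunking documents before embedding and storing them. This page does not say how chunks are formed, so the following is only a minimal sketch of one common approach: fixed-size word windows with overlap. The function name `chunk_words` and its `size` and `overlap` parameters are hypothetical and not part of the crate's API.

```rust
/// Hypothetical helper (not from clawdesk-memory): split text into
/// overlapping word windows.
/// `size` is the number of words per chunk; `overlap` is the number of
/// words each chunk shares with the previous one.
pub fn chunk_words(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "need size > 0 and overlap < size");
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + size).min(words.len());
        chunks.push(words[start..end].join(" "));
        // Stop once the final word has been emitted, to avoid a redundant tail chunk.
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either side, at the cost of storing some words twice.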

Key Types

/// High-level memory manager
pub struct MemoryManager {
    embedding: Arc<dyn EmbeddingProvider>,
    vector_store: Arc<dyn VectorStore>,
    bm25_index: BM25Index,
    reranker: Option<Reranker>,
}

impl MemoryManager {
    /// Ingest a document into memory
    pub async fn ingest(&self, doc: &Document) -> Result<(), MemoryError> { /* ... */ }

    /// Search memory using hybrid retrieval
    pub async fn search(&self, query: &str, top_k: usize) -> Result<Vec<MemoryResult>, MemoryError> { /* ... */ }

    /// Get relevant context for a conversation
    pub async fn recall(&self, session: &Session, query: &str) -> Result<Vec<MemoryResult>, MemoryError> { /* ... */ }
}

/// BM25 search index
pub struct BM25Index {
    documents: Vec<TokenizedDoc>,
    avg_doc_len: f32,
    k1: f32,
    b: f32,
}

/// Hybrid search combining keyword and semantic search
pub struct HybridSearch {
    bm25: BM25Index,
    vector_store: Arc<dyn VectorStore>,
    alpha: f32, // Weight for vector vs keyword (0.0 = all BM25, 1.0 = all vector)
}

/// Memory search result
#[derive(Debug, Clone)]
pub struct MemoryResult {
    pub content: String,
    pub score: f32,
    pub source: String,
    pub metadata: serde_json::Value,
}
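
The `k1`, `b`, and `avg_doc_len` fields of `BM25Index` match the parameters of the standard Okapi BM25 formula. As an illustration of how they typically interact, here is a minimal sketch of that formula. It is not the crate's implementation: the `TokenizedDoc` layout, the `Bm25Params` struct, the `bm25_score` function, and the smoothed IDF variant are all assumptions made for this example.

```rust
/// Hypothetical stand-in for the crate's `TokenizedDoc`: a plain list of tokens.
pub struct TokenizedDoc {
    pub tokens: Vec<String>,
}

/// Illustrative parameters mirroring the `k1`, `b`, and `avg_doc_len` fields.
pub struct Bm25Params {
    pub k1: f32,
    pub b: f32,
    pub avg_doc_len: f32,
}

/// Score one document against a query with Okapi BM25.
/// `docs` is the whole corpus, needed to compute document frequencies.
pub fn bm25_score(
    query: &[&str],
    doc: &TokenizedDoc,
    docs: &[TokenizedDoc],
    p: &Bm25Params,
) -> f32 {
    let n_docs = docs.len() as f32;
    let doc_len = doc.tokens.len() as f32;
    let mut score = 0.0;
    for term in query {
        // Term frequency in this document.
        let tf = doc.tokens.iter().filter(|t| t.as_str() == *term).count() as f32;
        if tf == 0.0 {
            continue;
        }
        // Number of documents that contain the term at least once.
        let df = docs
            .iter()
            .filter(|d| d.tokens.iter().any(|t| t.as_str() == *term))
            .count() as f32;
        // Smoothed IDF (assumed variant); it never goes negative.
        let idf = (1.0 + (n_docs - df + 0.5) / (df + 0.5)).ln();
        // Length normalization: `b` controls how strongly long documents are penalized.
        let norm = 1.0 - p.b + p.b * doc_len / p.avg_doc_len;
        // `k1` controls how quickly repeated occurrences saturate.
        score += idf * tf * (p.k1 + 1.0) / (tf + p.k1 * norm);
    }
    score
}
```

Commonly cited defaults are `k1` around 1.2 and `b` around 0.75; the values clawdesk-memory actually uses are not stated here.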

RAG Pipeline

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Query   │───▶│  Hybrid  │───▶│ Reranker │───▶│ Context  │
│          │    │  Search  │    │          │    │ Assembly │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
                   │    │
              ┌────┘    └────┐
              ▼              ▼
           ┌──────┐     ┌────────┐
           │ BM25 │     │ Vector │
           └──────┘     └────────┘
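
The stages in the diagram can be sketched as three steps applied in order: retrieve candidates, re-rank them, and assemble a context string. The code below is a simplified, synchronous illustration under stated assumptions. The `Candidate` struct, the `run_pipeline` function, its closure parameters, and the byte budget are hypothetical; the crate's real pipeline is async and its signatures are not shown on this page.

```rust
/// Hypothetical candidate passed between pipeline stages.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub content: String,
    pub score: f32,
}

/// Run retrieve -> re-rank -> assemble.
/// `retrieve` stands in for hybrid search and `rerank_score` for a
/// cross-encoder; both are assumptions made for this sketch.
pub fn run_pipeline<R, S>(
    query: &str,
    retrieve: R,
    rerank_score: S,
    top_k: usize,
    max_bytes: usize,
) -> String
where
    R: Fn(&str) -> Vec<Candidate>,
    S: Fn(&str, &str) -> f32,
{
    // Stage 1: hybrid search produces a candidate pool.
    let mut candidates = retrieve(query);

    // Stage 2: rescore every candidate against the query, best first.
    for c in candidates.iter_mut() {
        c.score = rerank_score(query, &c.content);
    }
    candidates.sort_by(|a, b| b.score.total_cmp(&a.score));
    candidates.truncate(top_k);

    // Stage 3: context assembly, appending passages until the budget is reached.
    let mut context = String::new();
    for c in &candidates {
        if context.len() + c.content.len() + 1 > max_bytes {
            break;
        }
        context.push_str(&c.content);
        context.push('\n');
    }
    context
}
```

Re-ranking only the pool returned by the first stage keeps the expensive scorer off the full corpus, which is the usual reason for placing a cross-encoder after retrieval.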

Example Usage

use clawdesk_memory::{MemoryManager, Document};

// Runs inside an async function that returns a Result.
// `embedding_provider` and `vector_store` are assumed to be constructed elsewhere.
let manager = MemoryManager::new(embedding_provider, vector_store);

// Ingest documents
manager.ingest(&Document {
    content: "ClawDesk uses hexagonal architecture...".into(),
    source: "docs/architecture.md".into(),
    metadata: Default::default(),
}).await?;

// Search with hybrid retrieval, returning the top 5 results
let results = manager.search("hexagonal architecture", 5).await?;

// Print each result as "[score] content"
for result in &results {
    println!("[{:.2}] {}", result.score, result.content);
}
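
Tip

Adjust the alpha parameter in HybridSearch to balance keyword vs. semantic matching (0.0 is all BM25, 1.0 is all vector). Values around 0.6–0.7, which weight vector similarity somewhat above keyword matching, work well for technical documentation.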
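
To make the effect of alpha concrete, here is a minimal sketch of a weighted blend that follows the convention in the `HybridSearch` comment (0.0 = all BM25, 1.0 = all vector). It assumes each score list is min-max normalized to [0, 1] before blending, because raw BM25 scores are unbounded while similarity scores usually are not. Whether the crate normalizes this way is not stated, and the `normalize` and `blend` functions are hypothetical.

```rust
/// Min-max normalize scores into [0, 1]; a constant list maps to all zeros.
pub fn normalize(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    scores
        .iter()
        .map(|s| if range > 0.0 { (s - min) / range } else { 0.0 })
        .collect()
}

/// Blend per-document scores: alpha = 0.0 is pure BM25, alpha = 1.0 is pure vector.
/// Both slices must describe the same candidates in the same order.
pub fn blend(bm25: &[f32], vector: &[f32], alpha: f32) -> Vec<f32> {
    let (b, v) = (normalize(bm25), normalize(vector));
    b.iter()
        .zip(v.iter())
        .map(|(b, v)| alpha * v + (1.0 - alpha) * b)
        .collect()
}
```

With alpha at 0.65, a document that ranks first on vector similarity but last on BM25 scores 0.65 after blending, while the reverse case scores 0.35, so semantic matches win ties against exact keyword matches.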