clawdesk-memory
Memory and retrieval-augmented generation (RAG) pipeline. Combines BM25 keyword search, vector embeddings, and hybrid re-ranking to provide contextual memory for agent conversations.
Dependencies
Internal: clawdesk-types, clawdesk-storage
External: tokio, serde, tracing, async-trait
Modules
| Module | Description |
|---|---|
| bm25 | BM25 keyword-based search scoring |
| embedding | Text embedding generation (via provider or local model) |
| hybrid | Hybrid search combining BM25 + vector similarity |
| ingest | Document ingestion pipeline (chunk, embed, store) |
| manager | MemoryManager: high-level memory operations |
| pipeline | RAG pipeline orchestration |
| reranker | Cross-encoder re-ranking for result refinement |
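The chunking step of the ingest pipeline is not specified here. As a rough illustration only, a word-window chunker with overlap might look like the sketch below. The function name `chunk_words` and its `chunk_size` and `overlap` parameters are hypothetical and not part of the crate's documented API.

```rust
/// Hypothetical helper: split `text` into chunks of at most `chunk_size`
/// whitespace-separated words, where consecutive chunks share `overlap` words.
pub fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(
        chunk_size > 0 && overlap < chunk_size,
        "chunk_size must be positive and larger than overlap"
    );
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the final window reached the end of the text
        }
        start += step;
    }
    chunks
}
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some words twice.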
Key Types
/// High-level memory manager
pub struct MemoryManager {
    embedding: Arc<dyn EmbeddingProvider>,
    vector_store: Arc<dyn VectorStore>,
    bm25_index: BM25Index,
    reranker: Option<Reranker>,
}

impl MemoryManager {
    /// Ingest a document into memory
    pub async fn ingest(&self, doc: &Document) -> Result<(), MemoryError> { /* ... */ }

    /// Search memory using hybrid retrieval
    pub async fn search(&self, query: &str, top_k: usize) -> Result<Vec<MemoryResult>, MemoryError> { /* ... */ }

    /// Get relevant context for a conversation
    pub async fn recall(&self, session: &Session, query: &str) -> Result<Vec<MemoryResult>, MemoryError> { /* ... */ }
}

/// BM25 search index
pub struct BM25Index {
    documents: Vec<TokenizedDoc>,
    avg_doc_len: f32,
    k1: f32,
    b: f32,
}

/// Hybrid search combining keyword and semantic search
pub struct HybridSearch {
    bm25: BM25Index,
    vector_store: Arc<dyn VectorStore>,
    alpha: f32, // Weight for vector vs keyword (0.0 = all BM25, 1.0 = all vector)
}

/// Memory search result
#[derive(Debug, Clone)]
pub struct MemoryResult {
    pub content: String,
    pub score: f32,
    pub source: String,
    pub metadata: serde_json::Value,
}
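To make the roles of `k1`, `b`, and `avg_doc_len` concrete, here is a minimal BM25 scoring sketch. It is not the crate's implementation: `Bm25Sketch` only mirrors the fields of `BM25Index`, the `TokenizedDoc` shape (a plain token list) is assumed, and the IDF term uses the common non-negative variant `ln(1 + (N - n + 0.5) / (n + 0.5))`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `TokenizedDoc`: a plain token list.
pub struct TokenizedDoc {
    pub tokens: Vec<String>,
}

/// Illustrative index mirroring the fields of `BM25Index`.
pub struct Bm25Sketch {
    pub documents: Vec<TokenizedDoc>,
    pub avg_doc_len: f32,
    pub k1: f32, // term-frequency saturation, commonly around 1.2
    pub b: f32,  // strength of document-length normalization, commonly around 0.75
}

impl Bm25Sketch {
    pub fn new(documents: Vec<TokenizedDoc>, k1: f32, b: f32) -> Self {
        let total: usize = documents.iter().map(|d| d.tokens.len()).sum();
        let avg_doc_len = if documents.is_empty() {
            0.0
        } else {
            total as f32 / documents.len() as f32
        };
        Self { documents, avg_doc_len, k1, b }
    }

    /// Number of documents that contain `term` at least once.
    fn doc_freq(&self, term: &str) -> usize {
        self.documents
            .iter()
            .filter(|d| d.tokens.iter().any(|t| t == term))
            .count()
    }

    /// BM25 score of document `doc_idx` for the given query terms.
    pub fn score(&self, query: &[&str], doc_idx: usize) -> f32 {
        let doc = &self.documents[doc_idx];
        let n = self.documents.len() as f32;
        let len_norm = if self.avg_doc_len > 0.0 {
            doc.tokens.len() as f32 / self.avg_doc_len
        } else {
            0.0
        };

        // Count how often each token occurs in this document.
        let mut tf: HashMap<&str, f32> = HashMap::new();
        for t in &doc.tokens {
            *tf.entry(t.as_str()).or_insert(0.0) += 1.0;
        }

        query
            .iter()
            .map(|term| {
                let f = tf.get(term).copied().unwrap_or(0.0);
                if f == 0.0 {
                    return 0.0; // term absent from this document
                }
                let df = self.doc_freq(term) as f32;
                let idf = (1.0 + (n - df + 0.5) / (df + 0.5)).ln();
                idf * f * (self.k1 + 1.0)
                    / (f + self.k1 * (1.0 - self.b + self.b * len_norm))
            })
            .sum()
    }
}
```

Raising `k1` lets repeated occurrences of a term keep adding to the score for longer, while `b` closer to 1.0 penalizes long documents more strongly.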
RAG Pipeline
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Query   │───▶│  Hybrid  │───▶│ Reranker │───▶│ Context  │
│          │    │  Search  │    │          │    │ Assembly │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
                  │      │
             ┌────┘      └────┐
             ▼                ▼
         ┌──────┐        ┌────────┐
         │ BM25 │        │ Vector │
         └──────┘        └────────┘
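The last two stages of the diagram can be sketched roughly as follows. This is an illustration under assumptions, not the crate's pipeline: `Hit` is a simplified stand-in for `MemoryResult` (metadata omitted so the example needs only the standard library), the `rescore` closure stands in for a cross-encoder, and `rerank`, `assemble_context`, and the `max_chars` budget are hypothetical names.

```rust
/// Simplified stand-in for `MemoryResult` (metadata omitted).
#[derive(Debug, Clone)]
pub struct Hit {
    pub content: String,
    pub score: f32,
    pub source: String,
}

/// Reranker stage (hypothetical): overwrite each candidate's score with
/// `rescore` and sort the candidates best-first.
pub fn rerank<F>(mut hits: Vec<Hit>, rescore: F) -> Vec<Hit>
where
    F: Fn(&Hit) -> f32,
{
    for h in hits.iter_mut() {
        let s = rescore(&*h);
        h.score = s;
    }
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits
}

/// Context assembly stage (hypothetical): append hits in order, stopping at
/// the first one that would push the output past `max_chars` characters.
pub fn assemble_context(hits: &[Hit], max_chars: usize) -> String {
    let mut out = String::new();
    for h in hits {
        let entry = format!("[{}] {}\n", h.source, h.content);
        if out.chars().count() + entry.chars().count() > max_chars {
            break;
        }
        out.push_str(&entry);
    }
    out
}
```

Stopping at the first hit that overflows the budget, rather than skipping it and trying smaller ones, keeps the assembled context strictly in rank order.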
Example Usage
use clawdesk_memory::{MemoryManager, Document};

// Assumes an async context that returns a Result (e.g. inside a Tokio task),
// with `embedding_provider` and `vector_store` already constructed.
let manager = MemoryManager::new(embedding_provider, vector_store);

// Ingest documents
manager.ingest(&Document {
    content: "ClawDesk uses hexagonal architecture...".into(),
    source: "docs/architecture.md".into(),
    metadata: Default::default(),
}).await?;

// Search with hybrid retrieval
let results = manager.search("hexagonal architecture", 5).await?;
for result in &results {
    println!("[{:.2}] {}", result.score, result.content);
}
Tip: Adjust the alpha parameter in HybridSearch to balance keyword against semantic matching (0.0 = all BM25, 1.0 = all vector). Values around 0.6–0.7, which lean toward vector similarity, work well for technical documentation.
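How `alpha` combines the two signals is not spelled out here. One plausible reading, sketched below, is a linear blend of normalized scores. The min-max normalization step and the function names `min_max` and `blend` are assumptions made for illustration, not the crate's documented behavior.

```rust
/// Min-max normalize scores into [0, 1]. If all scores are equal (or the
/// slice is empty) every output is 0.0.
pub fn min_max(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().copied().fold(f32::INFINITY, f32::min);
    let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    scores
        .iter()
        .map(|s| if range > 0.0 { (s - min) / range } else { 0.0 })
        .collect()
}

/// Blend per-document BM25 and vector scores:
/// `alpha * vector + (1 - alpha) * bm25`, so 0.0 is all BM25 and 1.0 is all vector.
/// Both slices must be aligned: index `i` refers to the same document in each.
pub fn blend(bm25: &[f32], vector: &[f32], alpha: f32) -> Vec<f32> {
    assert_eq!(bm25.len(), vector.len(), "score slices must be aligned");
    let alpha = alpha.clamp(0.0, 1.0);
    let kw = min_max(bm25);
    let sem = min_max(vector);
    kw.iter()
        .zip(sem.iter())
        .map(|(k, v)| alpha * v + (1.0 - alpha) * k)
        .collect()
}
```

Normalizing first matters because raw BM25 scores are unbounded while cosine similarities are not, so blending them unscaled would let the keyword score dominate regardless of `alpha`.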