Memory & Skills

ClawDesk features a built-in memory system for long-term knowledge retention and a composable skill system for extending agent capabilities with reusable prompt fragments and tool bindings.


Memory System

The memory system gives agents persistent knowledge beyond the conversation context window. It combines BM25 keyword search, vector embeddings, and hybrid retrieval to find relevant information from past conversations and ingested documents.

Architecture

Key Components

| Component | Module | Description |
|-----------|--------|-------------|
| MemoryManager | pipeline | Top-level orchestrator for memory operations |
| HybridSearcher | hybrid | Combines BM25 and vector search with RRF |
| BatchPipeline | pipeline | Batch ingestion with chunking and embedding |
| Bm25Index | bm25 | BM25 keyword index (Okapi BM25F) |
| EmbeddingProvider | embedding | Abstraction over embedding backends |
| Reranker | reranker | Cross-encoder reranking for precision |

Configuration

```toml
[memory]
enabled = true
data_dir = "${CLAWDESK_DATA_DIR}/memory"

# Chunking settings
[memory.chunking]
strategy = "semantic"      # "fixed" | "sentence" | "semantic"
chunk_size = 512           # target tokens per chunk
chunk_overlap = 64         # overlap between chunks
max_chunk_size = 1024      # hard limit

# Embedding configuration
[memory.embedding]
provider = "ollama"        # "ollama" | "openai" | "local"
model = "nomic-embed-text" # embedding model
dimensions = 768           # embedding dimensions
batch_size = 32            # chunks per embedding batch

# BM25 configuration
[memory.bm25]
k1 = 1.2                   # term frequency saturation
b = 0.75                   # document length normalization
avg_doc_length = 256       # estimated average document length

# Hybrid search
[memory.search]
strategy = "hybrid"        # "bm25" | "vector" | "hybrid"
bm25_weight = 0.3          # weight for BM25 in RRF
vector_weight = 0.7        # weight for vector search in RRF
top_k = 10                 # results to retrieve
rerank = true              # enable cross-encoder reranking
rerank_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"

# Auto-ingestion
[memory.auto_ingest]
enabled = true
conversations = true       # ingest conversation turns
min_message_length = 50    # skip short messages
exclude_channels = []      # channel IDs to exclude
```
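To illustrate how `chunk_size` and `chunk_overlap` interact, here is a minimal sketch of the `"fixed"` chunking strategy. It approximates tokens by whitespace-split words — an assumption for illustration only; the real pipeline would count tokens with the embedding model's tokenizer:

```rust
/// Minimal sketch of fixed-size chunking with overlap.
/// Tokens are approximated by whitespace-split words (assumption:
/// the actual pipeline uses the embedding model's tokenizer).
fn chunk_fixed(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");
    let tokens: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap; // how far each window advances
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + chunk_size).min(tokens.len());
        chunks.push(tokens[start..end].join(" "));
        if end == tokens.len() {
            break;
        }
        start += step;
    }
    chunks
}

fn main() {
    // 10 words, chunk_size = 4, overlap = 1 -> each window advances 3 words.
    let chunks = chunk_fixed("a b c d e f g h i j", 4, 1);
    assert_eq!(chunks, vec!["a b c d", "d e f g", "g h i j"]);
    println!("{} chunks", chunks.len());
}
```

The overlap ensures that a sentence straddling a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated tokens in the index.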

The BM25 module provides fast keyword-based retrieval using the Okapi BM25F algorithm:

```rust
pub struct Bm25Index {
    documents: Vec<IndexedDocument>,
    inverted_index: HashMap<String, Vec<(DocId, f32)>>,
    config: Bm25Config,
}

impl Bm25Index {
    /// Add a document to the index
    pub fn index(&mut self, doc: Document) -> DocId;

    /// Search with a text query
    pub fn search(&self, query: &str, top_k: usize) -> Vec<SearchResult>;

    /// Remove a document
    pub fn remove(&mut self, doc_id: &DocId);

    /// Persist the index to disk
    pub fn save(&self, path: &Path) -> Result<()>;

    /// Load from disk
    pub fn load(path: &Path) -> Result<Self>;
}
```

BM25 excels at exact keyword matches, complementing vector search on factual queries. It is particularly effective for queries like:

- "What is the API key for Anthropic?" (exact terms: "API key", "Anthropic")
- "Error code 429" (exact match: "429")
- "John's phone number" (exact name: "John")
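The roles of the `k1` and `b` parameters can be seen in a standalone sketch of the Okapi BM25 per-term score (not ClawDesk's actual implementation — just the textbook formula with the config's default values):

```rust
/// Okapi BM25 score contribution of a single query term (sketch).
/// `k1` controls term-frequency saturation; `b` controls document-length
/// normalization -- the same knobs exposed under [memory.bm25].
fn bm25_term_score(tf: f32, idf: f32, doc_len: f32, avg_doc_len: f32, k1: f32, b: f32) -> f32 {
    let norm = 1.0 - b + b * doc_len / avg_doc_len;
    idf * (tf * (k1 + 1.0)) / (tf + k1 * norm)
}

fn main() {
    let (k1, b, avg) = (1.2, 0.75, 256.0);
    // Term frequency saturates: the second occurrence adds less than the first.
    let once = bm25_term_score(1.0, 2.0, 256.0, avg, k1, b);
    let twice = bm25_term_score(2.0, 2.0, 256.0, avg, k1, b);
    assert!(twice > once && twice - once < once);
    // At equal tf, longer documents are penalized.
    let long_doc = bm25_term_score(1.0, 2.0, 1024.0, avg, k1, b);
    assert!(long_doc < once);
    println!("once={once:.3} twice={twice:.3} long={long_doc:.3}");
}
```

Raising `k1` makes repeated terms keep paying off longer; raising `b` penalizes long documents more aggressively.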

Embedding Provider

The embedding module supports multiple backends:

```rust
pub enum EmbeddingProvider {
    Ollama(OllamaEmbedder),
    OpenAI(OpenAIEmbedder),
    Local(LocalEmbedder),
}

impl EmbeddingProvider {
    /// Embed a single text
    pub async fn embed(&self, text: &str) -> Result<Vec<f32>>;

    /// Embed a batch of texts
    pub async fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>;

    /// Get the embedding dimensions
    pub fn dimensions(&self) -> usize;
}
```

Supported embedding models:

| Provider | Model | Dimensions | Speed |
|----------|-------|------------|-------|
| Ollama | nomic-embed-text | 768 | Fast |
| Ollama | mxbai-embed-large | 1024 | Medium |
| OpenAI | text-embedding-3-small | 1536 | Fast |
| OpenAI | text-embedding-3-large | 3072 | Medium |
| Local | all-MiniLM-L6-v2 | 384 | Fastest |
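At query time, the stored embeddings are compared against the query embedding. A common choice of similarity metric is cosine similarity — an assumption here, since the document does not pin down the exact metric the vector store uses:

```rust
/// Cosine similarity between two embeddings (sketch).
/// Assumption: the vector store ranks chunks by cosine similarity;
/// the exact metric is not specified in this document.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must share dimensions");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let a = [1.0, 0.0, 1.0];
    let b = [1.0, 0.0, 1.0];
    let c = [0.0, 1.0, 0.0];
    assert!((cosine_similarity(&a, &b) - 1.0).abs() < 1e-6); // same direction
    assert!(cosine_similarity(&a, &c).abs() < 1e-6);         // orthogonal
}
```

Note that embeddings from different models (or different `dimensions`) are not comparable, so switching the configured `model` generally requires re-embedding the entire store.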

Hybrid Search (RRF)

The HybridSearcher combines BM25 and vector search using Reciprocal Rank Fusion:

```rust
pub struct HybridSearcher {
    bm25: Arc<Bm25Index>,
    vector_store: Arc<VectorStore>,
    embedder: Arc<EmbeddingProvider>,
    reranker: Arc<Reranker>,
    config: HybridConfig,
}

impl HybridSearcher {
    pub async fn search(&self, query: &str, top_k: usize) -> Result<Vec<SearchResult>> {
        // 1. Run BM25 search
        let bm25_results = self.bm25.search(query, top_k * 2);

        // 2. Run vector search
        let embedding = self.embedder.embed(query).await?;
        let vector_results = self.vector_store
            .search(&embedding, top_k * 2)
            .await?;

        // 3. Fuse with RRF
        let fused = reciprocal_rank_fusion(
            &bm25_results,
            &vector_results,
            self.config.bm25_weight,
            self.config.vector_weight,
            60, // RRF constant k
        );

        // 4. Rerank (optional)
        if self.config.rerank {
            self.reranker.rerank(query, &fused, top_k).await
        } else {
            Ok(fused.into_iter().take(top_k).collect())
        }
    }
}
```

The RRF formula:

$$ \text{RRF}(d) = \sum_{r \in R} \frac{w_r}{k + \text{rank}_r(d)} $$

where $R$ is the set of result lists, $w_r$ is the weight for list $r$, $k$ is a constant (default 60), and $\text{rank}_r(d)$ is the rank of document $d$ in list $r$.
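The weighted fusion above can be sketched as a pure function over two ranked lists of document IDs (plain strings here, for illustration):

```rust
use std::collections::HashMap;

/// Weighted Reciprocal Rank Fusion over two ranked lists of doc IDs
/// (sketch matching the RRF formula above; ranks are 1-based).
fn rrf(bm25: &[&str], vector: &[&str], w_bm25: f32, w_vec: f32, k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for (rank, id) in bm25.iter().enumerate() {
        *scores.entry(id.to_string()).or_default() += w_bm25 / (k + (rank + 1) as f32);
    }
    for (rank, id) in vector.iter().enumerate() {
        *scores.entry(id.to_string()).or_default() += w_vec / (k + (rank + 1) as f32);
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    // "b" ranks near the top of both lists, so it wins the fusion even
    // though it is not first in the BM25 list.
    let fused = rrf(&["a", "b", "c"], &["b", "d", "a"], 0.3, 0.7, 60.0);
    assert_eq!(fused[0].0, "b");
}
```

Because only ranks (not raw scores) enter the formula, RRF sidesteps the problem that BM25 scores and cosine similarities live on incomparable scales.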

tip

Hybrid search consistently outperforms either BM25 or vector search alone. The default weights (0.3 BM25/0.7 vector) work well for most use cases. Increase BM25 weight for more factual/keyword-heavy workloads.

Reranker

The cross-encoder reranker provides a second pass of scoring for improved precision:

```rust
pub struct Reranker {
    model: CrossEncoderModel,
}

impl Reranker {
    pub async fn rerank(
        &self,
        query: &str,
        candidates: &[SearchResult],
        top_k: usize,
    ) -> Result<Vec<SearchResult>> {
        let mut scored: Vec<(f32, &SearchResult)> = Vec::new();

        for candidate in candidates {
            let score = self.model.score(query, &candidate.text).await?;
            scored.push((score, candidate));
        }

        scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
        Ok(scored.into_iter().take(top_k).map(|(_, r)| r.clone()).collect())
    }
}
```

Ingestion Pipeline

```rust
pub struct BatchPipeline {
    chunker: Box<dyn Chunker>,
    embedder: Arc<EmbeddingProvider>,
    bm25: Arc<RwLock<Bm25Index>>,
    vector_store: Arc<VectorStore>,
}

impl BatchPipeline {
    /// Ingest a document
    pub async fn ingest(&self, document: Document) -> Result<IngestResult> {
        // 1. Chunk the document
        let chunks = self.chunker.chunk(&document)?;

        // 2. Generate embeddings (batched)
        let texts: Vec<&str> = chunks.iter().map(|c| c.text.as_str()).collect();
        let embeddings = self.embedder.embed_batch(&texts).await?;

        // 3. Index in BM25
        {
            let mut bm25 = self.bm25.write().await;
            for chunk in &chunks {
                bm25.index(chunk.into());
            }
        }

        // 4. Store embeddings
        self.vector_store.insert_batch(&chunks, &embeddings).await?;

        Ok(IngestResult {
            chunks_created: chunks.len(),
            document_id: document.id,
        })
    }
}
```

CLI Operations

```bash
# Ingest a file
clawdesk memory ingest ./docs/manual.pdf

# Ingest a directory
clawdesk memory ingest ./knowledge-base/ --recursive

# Search memory
clawdesk memory search "How do I configure Telegram?"

# Show memory stats
clawdesk memory stats
# Output:
# Documents: 142
# Chunks: 3,847
# BM25 index size: 2.1 MB
# Vector store size: 48.3 MB
# Embedding model: nomic-embed-text (768 dims)

# Clear memory
clawdesk memory clear --confirm

# Export memory
clawdesk memory export --format jsonl > backup.jsonl

# Import memory
clawdesk memory import backup.jsonl
```

Skills System

Skills are composable units of agent capability. Each skill is a combination of a prompt fragment, tool bindings, parameters, and dependencies—packaged for reuse and hot-reloading.

Skill Structure

Defining a Skill

Skills are defined in TOML files:

```toml
# skills/customer-support.toml
[skill]
name = "customer_support"
version = "1.0.0"
description = "Customer support agent with ticket management"
author = "Acme Corp"

[skill.prompt]
fragment = """
You are a customer support agent for Acme Corp.
Always be polite and professional.
When a customer reports an issue:
1. Acknowledge the problem
2. Search the knowledge base for solutions
3. If no solution found, create a support ticket
4. Provide the ticket number to the customer
"""
position = "prepend" # "prepend" | "append" | "replace"

[skill.tools]
required = ["knowledge_base", "ticket_create", "ticket_status"]
optional = ["email_send"]

[skill.params]
company_name = { type = "string", default = "Acme Corp" }
escalation_email = { type = "string", required = true }
max_ticket_priority = { type = "integer", default = 3 }
auto_assign = { type = "boolean", default = true }

[skill.deps]
knowledge_base = ">=1.0.0"
```

Skill Selection: Weighted Knapsack

When multiple skills are available, ClawDesk selects a subset that fits within the token budget using a greedy weighted-knapsack heuristic: candidates are ranked by relevance per token and taken in that order until the budget is exhausted.

Selection runs in $O(k \log k)$ time (dominated by the sort), where $k$ is the number of candidate skills:

```rust
pub struct SkillSelector {
    token_budget: usize,
}

impl SkillSelector {
    /// Select skills that fit within the token budget
    pub fn select(
        &self,
        candidates: &[ScoredSkill],
        budget: usize,
    ) -> Vec<&ScoredSkill> {
        // Sort by score/token_cost ratio (descending)
        let mut ranked: Vec<_> = candidates.iter()
            .map(|s| (s, s.relevance_score / s.token_cost as f32))
            .collect();
        ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());

        // Greedy knapsack
        let mut selected = Vec::new();
        let mut remaining_budget = budget;

        for (skill, _ratio) in ranked {
            if skill.token_cost <= remaining_budget {
                selected.push(skill);
                remaining_budget -= skill.token_cost;
            }
        }

        selected
    }
}
```
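A worked example of the greedy selection, as a standalone sketch (the skill names, scores, and budget below are hypothetical, chosen to show the score-per-token ordering):

```rust
/// Standalone sketch of greedy skill selection by score/cost ratio.
/// All names and numbers are hypothetical, for illustration only.
#[derive(Debug)]
struct ScoredSkill {
    name: &'static str,
    relevance_score: f32,
    token_cost: usize,
}

fn select(candidates: &[ScoredSkill], budget: usize) -> Vec<&'static str> {
    let mut ranked: Vec<_> = candidates.iter()
        .map(|s| (s, s.relevance_score / s.token_cost as f32))
        .collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());

    let mut selected = Vec::new();
    let mut remaining = budget;
    for (skill, _ratio) in ranked {
        if skill.token_cost <= remaining {
            selected.push(skill.name);
            remaining -= skill.token_cost;
        }
    }
    selected
}

fn main() {
    let skills = [
        ScoredSkill { name: "math", relevance_score: 0.9, token_cost: 150 },
        ScoredSkill { name: "customer_support", relevance_score: 0.8, token_cost: 400 },
        ScoredSkill { name: "code_review", relevance_score: 0.3, token_cost: 350 },
    ];
    // Budget of 600 tokens: "math" (best ratio) and "customer_support" fit;
    // "code_review" would overflow the remaining 50 tokens and is skipped.
    assert_eq!(select(&skills, 600), vec!["math", "customer_support"]);
}
```

Like any greedy knapsack heuristic, this can miss the globally optimal subset, but it is fast and predictable, which matters when selection runs on every turn.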

Configuration

```toml
[skills]
enabled = true
directory = "${CLAWDESK_DATA_DIR}/skills"
hot_reload = true
reload_interval_secs = 5

# Token budget for skills (from the total context window)
token_budget = 2048

# Skill-specific overrides
[skills.overrides.customer_support]
enabled = true
params.company_name = "My Company"
params.escalation_email = "support@mycompany.com"

[skills.overrides.code_review]
enabled = false # disable this skill
```

Hot Reloading

Skills support hot-reloading. When a skill file changes on disk, ClawDesk automatically picks up the changes:

```bash
# Watch for skill changes (in gateway mode, this happens automatically)
$ clawdesk gateway
[INFO] Watching skills directory: /home/user/.clawdesk/data/skills
[INFO] Loaded 5 skills: customer_support, code_review, writing, research, math
# ... edit a skill file ...
[INFO] Skill reloaded: customer_support (v1.0.0 → v1.1.0)
```

info

Hot-reload only applies to TOML/YAML skill definitions. If a skill has compiled Rust tool bindings, those require a restart or plugin hot-reload.
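The `reload_interval_secs` setting suggests a polling model. A change detector along those lines can be sketched as a pure comparison of modification times between scans — an assumption about the mechanism; the real watcher may use OS file notifications instead:

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::time::SystemTime;

/// Sketch of a polling-based change detector: compare current mtimes
/// against the previous scan and report skill files that are new or
/// modified. (Assumption: the real watcher may use OS notifications.)
fn changed_files(
    last: &HashMap<PathBuf, SystemTime>,
    current: &HashMap<PathBuf, SystemTime>,
) -> Vec<PathBuf> {
    current.iter()
        .filter(|(path, mtime)| last.get(*path) != Some(*mtime))
        .map(|(path, _)| path.clone())
        .collect()
}

fn main() {
    let t0 = SystemTime::UNIX_EPOCH;
    let t1 = t0 + std::time::Duration::from_secs(5);
    let last: HashMap<_, _> = [(PathBuf::from("customer-support.toml"), t0)].into();
    let current: HashMap<_, _> = [
        (PathBuf::from("customer-support.toml"), t1), // edited since last scan
        (PathBuf::from("math.toml"), t0),             // newly created file
    ].into();
    let mut changed = changed_files(&last, &current);
    changed.sort();
    assert_eq!(changed, vec![
        PathBuf::from("customer-support.toml"),
        PathBuf::from("math.toml"),
    ]);
}
```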

Creating a Skill

```bash
# Scaffold a new skill
clawdesk skills create my-skill

# This creates:
# skills/my-skill/
# ├── skill.toml   # Skill definition
# ├── prompt.md    # Prompt fragment (optional, can inline in TOML)
# └── README.md    # Documentation
```

Built-in Skills

| Skill | Description | Token Cost |
|-------|-------------|------------|
| general_assistant | General-purpose helpful assistant | ~200 |
| code_review | Code review with best practices | ~350 |
| writing | Writing assistance with style guides | ~300 |
| research | Web research and summarization | ~250 |
| math | Mathematical reasoning and computation | ~150 |
| customer_support | Customer support with ticket management | ~400 |
| data_analysis | Data analysis and visualization | ~300 |

Skill Dependencies

Skills can depend on other skills, creating a dependency graph:

```toml
# skills/advanced-support.toml
[skill]
name = "advanced_support"

[skill.deps]
customer_support = ">=1.0.0"
knowledge_base = ">=1.0.0"

[skill.prompt]
fragment = """
In addition to standard support procedures, you can:
- Escalate to engineering with detailed technical reports
- Access the internal documentation system
- Review customer subscription and billing information
"""
```

When a skill with dependencies is selected, all its dependencies are automatically included (and their token costs counted against the budget).
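Transitive inclusion can be sketched as a closure over the dependency graph. This standalone sketch omits version-constraint checking and cycle reporting for brevity:

```rust
use std::collections::{HashMap, HashSet};

/// Sketch: given a skill dependency graph, compute the full set of
/// skills that must be loaded when `root` is selected (transitive
/// closure). Version-constraint checks are omitted for brevity.
fn resolve(root: &str, deps: &HashMap<&str, Vec<&str>>) -> HashSet<String> {
    let mut included = HashSet::new();
    let mut stack = vec![root.to_string()];
    while let Some(skill) = stack.pop() {
        // `insert` returns false if already present, so cycles terminate.
        if included.insert(skill.clone()) {
            for dep in deps.get(skill.as_str()).into_iter().flatten() {
                stack.push(dep.to_string());
            }
        }
    }
    included
}

fn main() {
    let mut deps = HashMap::new();
    deps.insert("advanced_support", vec!["customer_support", "knowledge_base"]);
    deps.insert("customer_support", vec!["knowledge_base"]);
    let included = resolve("advanced_support", &deps);
    // Shared dependencies are included exactly once.
    assert_eq!(included.len(), 3);
    assert!(included.contains("knowledge_base"));
}
```

Because the closure deduplicates shared dependencies, a skill pulled in by two parents is counted against the token budget only once.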


Integration: Memory + Skills

Memory and skills work together to create powerful agent experiences:

```toml
# A skill that leverages memory
[skill]
name = "contextual_helper"

[skill.prompt]
fragment = """
Before answering, always search memory for relevant context.
Use the knowledge_base tool to find previous conversations
and documentation that might be relevant.
Cite your sources when using information from memory.
"""

[skill.tools]
required = ["knowledge_base"]

[skill.params]
search_top_k = { type = "integer", default = 5 }
include_sources = { type = "boolean", default = true }
```

tip

For best results, pair the hybrid search memory system with skills that instruct the agent on how and when to query memory. This gives the agent both the capability (tools) and the knowledge (prompt) to effectively use long-term memory.