Memory & Skills
ClawDesk features a built-in memory system for long-term knowledge retention and a composable skill system for extending agent capabilities with reusable prompt fragments and tool bindings.
Memory System
The memory system gives agents persistent knowledge beyond the conversation context window. It combines BM25 keyword search, vector embeddings, and hybrid retrieval to find relevant information from past conversations and ingested documents.
Architecture
Key Components
| Component | Module | Description |
|---|---|---|
| MemoryManager | pipeline | Top-level orchestrator for memory operations |
| HybridSearcher | hybrid | Combines BM25 and vector search with RRF |
| BatchPipeline | pipeline | Batch ingestion with chunking and embedding |
| Bm25Index | bm25 | BM25 keyword index (Okapi BM25F) |
| EmbeddingProvider | embedding | Abstraction over embedding backends |
| Reranker | reranker | Cross-encoder reranking for precision |
Configuration
[memory]
enabled = true
data_dir = "${CLAWDESK_DATA_DIR}/memory"
# Chunking settings
[memory.chunking]
strategy = "semantic" # "fixed" | "sentence" | "semantic"
chunk_size = 512 # target tokens per chunk
chunk_overlap = 64 # overlap between chunks
max_chunk_size = 1024 # hard limit
# Embedding configuration
[memory.embedding]
provider = "ollama" # "ollama" | "openai" | "local"
model = "nomic-embed-text" # embedding model
dimensions = 768 # embedding dimensions
batch_size = 32 # chunks per embedding batch
# BM25 configuration
[memory.bm25]
k1 = 1.2 # term frequency saturation
b = 0.75 # document length normalization
avg_doc_length = 256 # estimated average document length
# Hybrid search
[memory.search]
strategy = "hybrid" # "bm25" | "vector" | "hybrid"
bm25_weight = 0.3 # weight for BM25 in RRF
vector_weight = 0.7 # weight for vector search in RRF
top_k = 10 # results to retrieve
rerank = true # enable cross-encoder reranking
rerank_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
# Auto-ingestion
[memory.auto_ingest]
enabled = true
conversations = true # ingest conversation turns
min_message_length = 50 # skip short messages
exclude_channels = [] # channel IDs to exclude
BM25 Search
The BM25 module provides fast keyword-based retrieval using the Okapi BM25F algorithm:
pub struct Bm25Index {
documents: Vec<IndexedDocument>,
inverted_index: HashMap<String, Vec<(DocId, f32)>>,
config: Bm25Config,
}
impl Bm25Index {
/// Add a document to the index
pub fn index(&mut self, doc: Document) -> DocId;
/// Search with a text query
pub fn search(&self, query: &str, top_k: usize) -> Vec<SearchResult>;
/// Remove a document
pub fn remove(&mut self, doc_id: &DocId);
/// Persist the index to disk
pub fn save(&self, path: &Path) -> Result<()>;
/// Load from disk
pub fn load(path: &Path) -> Result<Self>;
}
BM25 excels at finding exact keyword matches and is complementary to vector search for factual queries:
# BM25 is great for:
# - "What is the API key for Anthropic?" (exact term: "API key", "Anthropic")
# - "Error code 429" (exact match: "429")
# - "John's phone number" (exact name: "John")
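The scoring behind these matches can be sketched as a simplified single-field Okapi BM25 term score, using the k1 and b values from the configuration above. The function name and flat-argument signature here are illustrative, not the actual Bm25Index internals:

```rust
/// Simplified single-field Okapi BM25 score for one term in one document.
/// tf: term frequency in the document; df: documents containing the term;
/// n_docs: total documents; dl / avg_dl: this document's length vs. the average.
fn bm25_term_score(tf: f32, df: f32, n_docs: f32, dl: f32, avg_dl: f32, k1: f32, b: f32) -> f32 {
    // IDF with +1 inside the log so the result stays non-negative
    let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    // Length normalization: b = 0 ignores length, b = 1 fully normalizes
    let norm = 1.0 - b + b * (dl / avg_dl);
    idf * (tf * (k1 + 1.0)) / (tf + k1 * norm)
}
```

A document's score for a query is the sum of this term score over the query's terms: k1 controls how quickly repeated occurrences saturate, and b controls how strongly long documents are penalized.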
Embedding Provider
The embedding module supports multiple backends:
pub enum EmbeddingProvider {
Ollama(OllamaEmbedder),
OpenAI(OpenAIEmbedder),
Local(LocalEmbedder),
}
impl EmbeddingProvider {
/// Embed a single text
pub async fn embed(&self, text: &str) -> Result<Vec<f32>>;
/// Embed a batch of texts
pub async fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>;
/// Get the embedding dimensions
pub fn dimensions(&self) -> usize;
}
Supported embedding models:
| Provider | Model | Dimensions | Speed |
|---|---|---|---|
| Ollama | nomic-embed-text | 768 | Fast |
| Ollama | mxbai-embed-large | 1024 | Medium |
| OpenAI | text-embedding-3-small | 1536 | Fast |
| OpenAI | text-embedding-3-large | 3072 | Medium |
| Local | all-MiniLM-L6-v2 | 384 | Fastest |
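On the vector side, relevance is a similarity between query and chunk embeddings. A minimal sketch assuming cosine similarity (the source does not specify the VectorStore distance metric):

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns a value in [-1, 1]; 0.0 for a zero vector to avoid division by zero.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}
```

Some models emit unit-length vectors, in which case the dot product alone yields the same ranking.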
Hybrid Search (RRF)
The HybridSearcher combines BM25 and vector search using Reciprocal Rank Fusion:
pub struct HybridSearcher {
    bm25: Arc<Bm25Index>,
    vector_store: Arc<VectorStore>,
    embedder: Arc<EmbeddingProvider>,
    reranker: Arc<Reranker>,
    config: HybridConfig,
}
impl HybridSearcher {
pub async fn search(&self, query: &str, top_k: usize) -> Result<Vec<SearchResult>> {
// 1. Run BM25 search
let bm25_results = self.bm25.search(query, top_k * 2);
// 2. Run vector search
let embedding = self.embedder.embed(query).await?;
let vector_results = self.vector_store
.search(&embedding, top_k * 2)
.await?;
// 3. Fuse with RRF
let fused = reciprocal_rank_fusion(
&bm25_results,
&vector_results,
self.config.bm25_weight,
self.config.vector_weight,
60, // RRF constant k
);
// 4. Rerank (optional)
if self.config.rerank {
self.reranker.rerank(query, &fused, top_k).await
} else {
Ok(fused.into_iter().take(top_k).collect())
}
}
}
The RRF formula:
$$ \text{RRF}(d) = \sum_{r \in R} \frac{w_r}{k + \text{rank}_r(d)} $$
where $R$ is the set of result lists, $w_r$ is the weight for list $r$, $k$ is a constant (default 60), and $\text{rank}_r(d)$ is the rank of document $d$ in list $r$.
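A minimal sketch of this fusion over plain document-ID lists (the real reciprocal_rank_fusion above operates on SearchResults; ranks here are 1-based, matching the formula):

```rust
use std::collections::HashMap;

/// Weighted Reciprocal Rank Fusion: each list contributes w / (k + rank)
/// for every document it contains, and per-document contributions are summed.
fn rrf(lists: &[(&[&str], f32)], k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for (list, weight) in lists {
        for (i, doc) in list.iter().enumerate() {
            // rank is 1-based: the top result contributes w / (k + 1)
            *scores.entry(doc.to_string()).or_insert(0.0) += weight / (k + (i as f32) + 1.0);
        }
    }
    let mut fused: Vec<(String, f32)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1));
    fused
}
```

With the defaults (weights 0.3/0.7, k = 60), a document near the top of the vector list tends to dominate unless BM25 strongly disagrees.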
Hybrid search typically outperforms either BM25 or vector search alone. The default weights (0.3 BM25 / 0.7 vector) work well for most workloads; increase the BM25 weight for keyword-heavy, factual queries.
Reranker
The cross-encoder reranker provides a second pass of scoring for improved precision:
pub struct Reranker {
model: CrossEncoderModel,
}
impl Reranker {
pub async fn rerank(
&self,
query: &str,
candidates: &[SearchResult],
top_k: usize,
) -> Result<Vec<SearchResult>> {
let mut scored: Vec<(f32, &SearchResult)> = Vec::new();
for candidate in candidates {
let score = self.model.score(query, &candidate.text).await?;
scored.push((score, candidate));
}
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
Ok(scored.into_iter().take(top_k).map(|(_, r)| r.clone()).collect())
}
}
Ingestion Pipeline
pub struct BatchPipeline {
chunker: Box<dyn Chunker>,
embedder: Arc<EmbeddingProvider>,
bm25: Arc<RwLock<Bm25Index>>,
vector_store: Arc<VectorStore>,
}
impl BatchPipeline {
/// Ingest a document
pub async fn ingest(&self, document: Document) -> Result<IngestResult> {
// 1. Chunk the document
let chunks = self.chunker.chunk(&document)?;
// 2. Generate embeddings (batched)
let texts: Vec<&str> = chunks.iter().map(|c| c.text.as_str()).collect();
let embeddings = self.embedder.embed_batch(&texts).await?;
// 3. Index in BM25
{
let mut bm25 = self.bm25.write().await;
for chunk in &chunks {
bm25.index(chunk.into());
}
}
// 4. Store embeddings
self.vector_store.insert_batch(&chunks, &embeddings).await?;
Ok(IngestResult {
chunks_created: chunks.len(),
document_id: document.id,
})
}
}
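Step 1 above depends on the configured chunking strategy. The "fixed" strategy can be sketched as a sliding window with overlap; whitespace tokens stand in for the real tokenizer here:

```rust
/// Fixed-size chunking with overlap, using whitespace "tokens" for illustration.
/// chunk_size and overlap correspond to the [memory.chunking] settings.
fn chunk_fixed(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size);
    let tokens: Vec<&str> = text.split_whitespace().collect();
    let mut chunks = Vec::new();
    let step = chunk_size - overlap; // advance by size minus overlap each window
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + chunk_size).min(tokens.len());
        chunks.push(tokens[start..end].join(" "));
        if end == tokens.len() { break; }
        start += step;
    }
    chunks
}
```

The semantic strategy would instead cut at topic boundaries, but the window/overlap bookkeeping is the same.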
CLI Operations
# Ingest a file
clawdesk memory ingest ./docs/manual.pdf
# Ingest a directory
clawdesk memory ingest ./knowledge-base/ --recursive
# Search memory
clawdesk memory search "How do I configure Telegram?"
# Show memory stats
clawdesk memory stats
# Output:
# Documents: 142
# Chunks: 3,847
# BM25 index size: 2.1 MB
# Vector store size: 48.3 MB
# Embedding model: nomic-embed-text (768 dims)
# Clear memory
clawdesk memory clear --confirm
# Export memory
clawdesk memory export --format jsonl > backup.jsonl
# Import memory
clawdesk memory import backup.jsonl
Skills System
Skills are composable units of agent capability. Each skill packages a prompt fragment, tool bindings, parameters, and dependencies for reuse and hot-reloading.
Skill Structure
Defining a Skill
Skills are defined in TOML files:
# skills/customer-support.toml
[skill]
name = "customer_support"
version = "1.0.0"
description = "Customer support agent with ticket management"
author = "Acme Corp"
[skill.prompt]
fragment = """
You are a customer support agent for Acme Corp.
Always be polite and professional.
When a customer reports an issue:
1. Acknowledge the problem
2. Search the knowledge base for solutions
3. If no solution found, create a support ticket
4. Provide the ticket number to the customer
"""
position = "prepend" # "prepend" | "append" | "replace"
[skill.tools]
required = ["knowledge_base", "ticket_create", "ticket_status"]
optional = ["email_send"]
[skill.params]
company_name = { type = "string", default = "Acme Corp" }
escalation_email = { type = "string", required = true }
max_ticket_priority = { type = "integer", default = 3 }
auto_assign = { type = "boolean", default = true }
[skill.deps]
knowledge_base = ">=1.0.0"
Skill Selection: Weighted Knapsack
When multiple skills are available, ClawDesk uses a greedy weighted-knapsack heuristic to select a high-value subset that fits within the token budget:
The selection algorithm runs in $O(k \log k)$ where $k$ is the number of candidate skills:
pub struct SkillSelector {
token_budget: usize,
}
impl SkillSelector {
/// Select skills that fit within the token budget
pub fn select(
&self,
candidates: &[ScoredSkill],
budget: usize,
) -> Vec<&ScoredSkill> {
// Sort by score/token_cost ratio (descending)
let mut ranked: Vec<_> = candidates.iter()
.map(|s| (s, s.relevance_score / s.token_cost as f32))
.collect();
        ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
// Greedy knapsack
let mut selected = Vec::new();
let mut remaining_budget = budget;
for (skill, _ratio) in ranked {
if skill.token_cost <= remaining_budget {
selected.push(skill);
remaining_budget -= skill.token_cost;
}
}
selected
}
}
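A worked example of the greedy pass, with a trimmed-down stand-in for ScoredSkill (names and numbers are illustrative). With a 500-token budget, the selector takes math and research first on relevance-per-token, then skips customer_support because it no longer fits, even though its absolute score is higher than research's:

```rust
/// Illustrative stand-in for ScoredSkill with only the fields selection uses.
struct Skill { name: &'static str, relevance_score: f32, token_cost: usize }

/// Greedy knapsack: rank by score/token ratio, take whatever still fits.
fn select(candidates: &[Skill], budget: usize) -> Vec<&'static str> {
    let mut ranked: Vec<_> = candidates.iter()
        .map(|s| (s, s.relevance_score / s.token_cost as f32))
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    let mut selected = Vec::new();
    let mut remaining = budget;
    for (skill, _ratio) in ranked {
        if skill.token_cost <= remaining {
            selected.push(skill.name);
            remaining -= skill.token_cost;
        }
    }
    selected
}
```

Like any greedy knapsack, this can miss the true optimum, but it is fast and predictable, which matters when selection runs on every turn.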
Configuration
[skills]
enabled = true
directory = "${CLAWDESK_DATA_DIR}/skills"
hot_reload = true
reload_interval_secs = 5
# Token budget for skills (from the total context window)
token_budget = 2048
# Skill-specific overrides
[skills.overrides.customer_support]
enabled = true
params.company_name = "My Company"
params.escalation_email = "support@mycompany.com"
[skills.overrides.code_review]
enabled = false # disable this skill
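The overrides above layer on top of the defaults declared in each skill's [skill.params] table. A minimal sketch of that merge, with plain string values for illustration (real parameters are typed and validated):

```rust
use std::collections::HashMap;

/// Merge declared parameter defaults with user overrides; overrides win.
fn effective_params(
    defaults: &HashMap<&str, &str>,
    overrides: &HashMap<&str, &str>,
) -> HashMap<String, String> {
    // Start from the skill's defaults...
    let mut merged: HashMap<String, String> = defaults.iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    // ...then replace any key the operator overrode in [skills.overrides.*]
    for (k, v) in overrides {
        merged.insert(k.to_string(), v.to_string());
    }
    merged
}
```

Parameters marked required = true with no default would be validated at load time under this scheme; that check is omitted here.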
Hot Reloading
Skills support hot-reloading. When a skill file changes on disk, ClawDesk automatically picks up the changes:
# Watch for skill changes (in gateway mode, this happens automatically)
$ clawdesk gateway
[INFO] Watching skills directory: /home/user/.clawdesk/data/skills
[INFO] Loaded 5 skills: customer_support, code_review, writing, research, math
# ... edit a skill file ...
[INFO] Skill reloaded: customer_support (v1.0.0 → v1.1.0)
Hot-reload only applies to TOML/YAML skill definitions. If a skill has compiled Rust tool bindings, those require a restart or plugin hot-reload.
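One way to implement the polling loop implied by reload_interval_secs is a modification-time diff against the last poll. A sketch of just the comparison step (the actual watcher implementation is not shown here; it may well use OS file notifications instead):

```rust
use std::collections::HashMap;
use std::time::SystemTime;

/// Given last-seen mtimes and freshly polled mtimes, return paths that
/// changed or appeared since the previous poll, in sorted order.
fn changed_paths(
    last: &HashMap<String, SystemTime>,
    current: &HashMap<String, SystemTime>,
) -> Vec<String> {
    let mut changed: Vec<String> = current.iter()
        // a path is "changed" if it is new or its mtime differs
        .filter(|(path, mtime)| last.get(*path).map_or(true, |prev| prev != *mtime))
        .map(|(path, _)| path.clone())
        .collect();
    changed.sort();
    changed
}
```

Each changed path would then be re-parsed and swapped into the skill registry, producing the reload log line shown above.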
Creating a Skill
# Scaffold a new skill
clawdesk skills create my-skill
# This creates:
# skills/my-skill/
# ├── skill.toml # Skill definition
# ├── prompt.md # Prompt fragment (optional, can inline in TOML)
# └── README.md # Documentation
Built-in Skills
| Skill | Description | Token Cost |
|---|---|---|
| general_assistant | General-purpose helpful assistant | ~200 |
| code_review | Code review with best practices | ~350 |
| writing | Writing assistance with style guides | ~300 |
| research | Web research and summarization | ~250 |
| math | Mathematical reasoning and computation | ~150 |
| customer_support | Customer support with ticket management | ~400 |
| data_analysis | Data analysis and visualization | ~300 |
Skill Dependencies
Skills can depend on other skills, creating a dependency graph:
# skills/advanced-support.toml
[skill]
name = "advanced_support"
[skill.deps]
customer_support = ">=1.0.0"
knowledge_base = ">=1.0.0"
[skill.prompt]
fragment = """
In addition to standard support procedures, you can:
- Escalate to engineering with detailed technical reports
- Access the internal documentation system
- Review customer subscription and billing information
"""
When a skill with dependencies is selected, all its dependencies are automatically included (and their token costs counted against the budget).
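Pulling in those dependencies amounts to a transitive closure over the dependency graph. A minimal sketch using names only (version-constraint checking and cycle reporting omitted):

```rust
use std::collections::{HashMap, HashSet};

/// Collect a skill and all of its transitive dependencies.
/// `deps` maps a skill name to the names it directly depends on.
fn resolve(root: &str, deps: &HashMap<&str, Vec<&str>>) -> HashSet<String> {
    let mut included = HashSet::new();
    let mut stack = vec![root.to_string()];
    while let Some(name) = stack.pop() {
        // insert() returns false if already visited, which also breaks cycles
        if included.insert(name.clone()) {
            if let Some(children) = deps.get(name.as_str()) {
                for child in children {
                    stack.push(child.to_string());
                }
            }
        }
    }
    included
}
```

Because dependencies count against the budget, a cheap skill with an expensive transitive dependency can still be rejected by the selector.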
Integration: Memory + Skills
Memory and skills work together to create powerful agent experiences:
# A skill that leverages memory
[skill]
name = "contextual_helper"
[skill.prompt]
fragment = """
Before answering, always search memory for relevant context.
Use the knowledge_base tool to find previous conversations
and documentation that might be relevant.
Cite your sources when using information from memory.
"""
[skill.tools]
required = ["knowledge_base"]
[skill.params]
search_top_k = { type = "integer", default = 5 }
include_sources = { type = "boolean", default = true }
For best results, pair the hybrid search memory system with skills that instruct the agent on how and when to query memory. This gives the agent both the capability (tools) and the knowledge (prompt) to effectively use long-term memory.