# Storage Layer
ClawDesk uses SochDB as its embedded ACID-compliant vector database. SochDB provides MVCC transactions, columnar storage, and hybrid search (B-Tree + HNSW) — all within the same process, eliminating the need for external database infrastructure.
## Architecture Overview

### Port → Adapter Mapping

| Port Trait (clawdesk-storage) | Adapter (clawdesk-sochdb) | SochDB Feature |
|---|---|---|
| SessionStore | SochSessionStore | B-Tree key-value |
| ConversationStore | SochConversationStore | Columnar LSCS |
| ConfigStore | SochConfigStore | B-Tree key-value |
| VectorStore | SochVectorStore | HNSW index |
## SochDB Fundamentals

### Storage Engine: LSCS

SochDB uses a Log-structured Columnar Storage (LSCS) engine that combines the append-friendly write path of LSM trees with a column-oriented on-disk layout.
### MVCC + SSI Transactions

SochDB implements Multi-Version Concurrency Control (MVCC) with Serializable Snapshot Isolation (SSI):

```rust
// MVCC transaction model in SochDB
pub struct Transaction {
    /// Handle to the owning database (used for conflict checks at commit)
    db: Database,
    /// Unique, monotonically increasing transaction ID
    tx_id: TxId,
    /// Snapshot timestamp — reads see data as of this point
    snapshot_ts: Timestamp,
    /// Write set — tracks all modifications
    write_set: HashSet<Key>,
    /// Read set — tracks all reads (for SSI conflict detection)
    read_set: HashSet<Key>,
    /// Transaction state
    state: TxState,
}

#[derive(Debug)]
pub enum TxState {
    Active,
    Committed,
    Aborted,
}
```
#### Isolation Levels

| Level | Reads | Writes | Anomalies Prevented |
|---|---|---|---|
| Snapshot Isolation | Consistent snapshot | Deferred conflict check | Dirty reads, non-repeatable reads, phantoms |
| Serializable (SSI) | Consistent snapshot + read tracking | Write-write + read-write conflict detection | All anomalies (including write skew) |
SochDB defaults to SSI for all ClawDesk transactions. This prevents all concurrency anomalies including write skew, at the cost of occasional transaction aborts under high contention.
#### Conflict Detection
SSI detects conflicts using read-write intersection analysis:
$$ \text{conflict}(T_1, T_2) = (\text{readSet}(T_1) \cap \text{writeSet}(T_2) \neq \emptyset) \lor (\text{writeSet}(T_1) \cap \text{readSet}(T_2) \neq \emptyset) $$
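The check reduces to two set intersections. A minimal std-only sketch of the predicate (the `Key` alias and set layout here are illustrative, not SochDB's actual internals):

```rust
use std::collections::HashSet;

/// Illustrative key type; SochDB's real key representation differs.
type Key = String;

/// conflict(T1, T2) = readSet(T1) ∩ writeSet(T2) ≠ ∅
///                  ∨ writeSet(T1) ∩ readSet(T2) ≠ ∅
fn ssi_conflict(
    reads_t1: &HashSet<Key>,
    writes_t1: &HashSet<Key>,
    reads_t2: &HashSet<Key>,
    writes_t2: &HashSet<Key>,
) -> bool {
    !reads_t1.is_disjoint(writes_t2) || !writes_t1.is_disjoint(reads_t2)
}
```

`HashSet::is_disjoint` makes each intersection test O(min(|A|, |B|)) expected time, so the commit-time check is linear in the size of the tracked sets.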
When a conflict is detected, the transaction attempting to commit is aborted (first-committer-wins):
```rust
// Conflict detection during commit
impl Transaction {
    pub fn commit(mut self) -> Result<(), TransactionError> {
        // Check for write-write conflicts
        for key in &self.write_set {
            if self.db.was_written_since(key, self.snapshot_ts)? {
                return Err(TransactionError::WriteConflict {
                    key: key.clone(),
                    tx_id: self.tx_id,
                });
            }
        }
        // Check for read-write conflicts (SSI)
        for key in &self.read_set {
            if self.db.was_written_since(key, self.snapshot_ts)? {
                return Err(TransactionError::SerializationFailure {
                    key: key.clone(),
                    tx_id: self.tx_id,
                });
            }
        }
        // All checks pass — durably commit
        self.db.wal.append_commit(self.tx_id, &self.write_set)?;
        self.state = TxState::Committed;
        Ok(())
    }
}
```
### Write-Ahead Log (WAL)

All writes are first appended to the WAL for crash recovery:

```rust
/// WAL entry format
#[derive(Debug, Serialize, Deserialize)]
pub struct WalEntry {
    /// Transaction ID
    pub tx_id: TxId,
    /// Entry sequence number (monotonic)
    pub lsn: LogSequenceNumber,
    /// Checksum for integrity
    pub checksum: u32,
    /// Operation
    pub op: WalOp,
}

#[derive(Debug, Serialize, Deserialize)]
pub enum WalOp {
    Put { key: Vec<u8>, value: Vec<u8> },
    Delete { key: Vec<u8> },
    Commit { tx_id: TxId },
    Abort { tx_id: TxId },
}
```
The WAL uses fsync() after each commit to ensure durability. This means committed data survives process crashes and power failures. The trade-off is increased write latency (~1-2ms per sync on SSD).
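The append-then-fsync discipline can be sketched with std file APIs alone. This is a simplified stand-in (length-prefixed frames and a trivial additive checksum instead of a real CRC), not SochDB's on-disk format:

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Trivial stand-in checksum; a production WAL would use CRC32/CRC64.
fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}

/// Append one record and force it to stable storage before returning.
fn wal_append(path: &Path, payload: &[u8]) -> std::io::Result<()> {
    let mut f = OpenOptions::new().create(true).append(true).open(path)?;
    f.write_all(&(payload.len() as u32).to_le_bytes())?; // length prefix
    f.write_all(&checksum(payload).to_le_bytes())?;      // integrity check
    f.write_all(payload)?;                               // record body
    f.sync_all()?; // fsync(): durable once this returns
    Ok(())
}
```

`File::sync_all` is what accounts for the per-commit latency: it blocks until both data and metadata reach the disk, which is why group-committing multiple transactions per sync is a common optimization.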
## Vector Search: HNSW

SochDB includes a built-in Hierarchical Navigable Small World (HNSW) index for approximate nearest neighbor search:

```rust
/// HNSW index configuration
pub struct HnswConfig {
    /// Maximum number of connections per node at layers above 0
    pub m: usize, // default: 16
    /// Maximum number of connections per node at layer 0
    pub m_max: usize, // default: 32
    /// Size of the dynamic candidate list during construction
    pub ef_construction: usize, // default: 200
    /// Size of the dynamic candidate list during search
    pub ef_search: usize, // default: 100
    /// Distance metric
    pub metric: DistanceMetric, // default: Cosine
}

#[derive(Debug, Clone, Copy)]
pub enum DistanceMetric {
    Cosine,
    Euclidean,
    DotProduct,
}
```
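The three `DistanceMetric` variants map to simple vector kernels. A minimal sketch of what each one computes (not SochDB's optimized implementation):

```rust
/// Dot product: larger means more similar.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance: smaller means more similar.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

/// Cosine distance: 1 - cos(a, b); 0 for identical directions, up to 2 for opposite.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let denom = dot(a, a).sqrt() * dot(b, b).sqrt();
    if denom == 0.0 { 1.0 } else { 1.0 - dot(a, b) / denom }
}
```

Cosine is the usual default for text embeddings because it ignores vector magnitude; dot product is equivalent to cosine when embeddings are pre-normalized to unit length, and is cheaper to compute.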
### Search Performance
The HNSW index provides sub-linear search time:
$$ T_{\text{search}} = O(\log n \cdot ef) $$
Where $n$ is the number of indexed vectors and $ef$ is the search beam width.
| Dataset Size | ef=50 | ef=100 | ef=200 | Recall@10 |
|---|---|---|---|---|
| 1K vectors | 0.2ms | 0.4ms | 0.6ms | 99.5% |
| 10K vectors | 0.8ms | 1.5ms | 2.8ms | 99.2% |
| 100K vectors | 2.1ms | 3.8ms | 7.2ms | 98.8% |
| 1M vectors | 4.5ms | 8.2ms | 15.1ms | 98.1% |
### Vector Store Implementation

```rust
// crates/clawdesk-sochdb/src/vector_store.rs
pub struct SochVectorStore {
    db: Database,
    hnsw: HnswIndex,
    dimension: usize,
}

#[async_trait]
impl VectorStore for SochVectorStore {
    async fn upsert_embedding(
        &self,
        id: &str,
        embedding: &[f32],
        metadata: serde_json::Value,
    ) -> Result<(), StorageError> {
        // Embedding dimension must match the index configuration
        assert_eq!(embedding.len(), self.dimension);
        let txn = self.db.begin_write()?;
        // Store the raw embedding + metadata
        let mut table = txn.open_table("embeddings")?;
        let record = EmbeddingRecord {
            id: id.to_string(),
            vector: embedding.to_vec(),
            metadata,
            updated_at: Utc::now(),
        };
        table.insert(id.as_bytes(), &bincode::serialize(&record)?)?;
        // Update the HNSW index
        self.hnsw.insert(id, embedding)?;
        txn.commit()?;
        Ok(())
    }

    async fn search_similar(
        &self,
        query_embedding: &[f32],
        top_k: usize,
        filter: Option<VectorFilter>,
    ) -> Result<Vec<VectorMatch>, StorageError> {
        // Over-fetch so metadata filtering can still fill top_k
        let candidates = self.hnsw.search(query_embedding, top_k * 2)?;
        // Apply metadata filters
        let txn = self.db.begin_read()?;
        let table = txn.open_table("embeddings")?;
        let mut results = Vec::with_capacity(top_k);
        for candidate in candidates {
            if let Some(bytes) = table.get(candidate.id.as_bytes())? {
                let record: EmbeddingRecord = bincode::deserialize(&bytes)?;
                if let Some(ref f) = filter {
                    if !f.matches(&record.metadata) {
                        continue;
                    }
                }
                results.push(VectorMatch {
                    id: record.id,
                    score: candidate.distance,
                    metadata: record.metadata,
                });
                if results.len() >= top_k {
                    break;
                }
            }
        }
        Ok(results)
    }
}
```
## Context Queries and TOON Format
SochDB's Path API enables structured context queries using the ContextQueryBuilder. Results can be formatted in TOON (Token-Optimized Object Notation), which uses 58–67% fewer tokens than equivalent JSON.
### ContextQueryBuilder

```rust
/// Builder for context-aware queries that combine
/// conversation history, vector search, and metadata filtering.
pub struct ContextQueryBuilder<'a> {
    db: &'a Database,
    session_key: Option<&'a SessionKey>,
    query_text: Option<String>,
    query_embedding: Option<Vec<f32>>,
    time_range: Option<(DateTime<Utc>, DateTime<Utc>)>,
    max_results: usize,
    output_format: OutputFormat,
}

impl<'a> ContextQueryBuilder<'a> {
    pub fn new(db: &'a Database) -> Self {
        Self {
            db,
            session_key: None,
            query_text: None,
            query_embedding: None,
            time_range: None,
            max_results: 10,
            output_format: OutputFormat::Toon,
        }
    }

    pub fn session(mut self, key: &'a SessionKey) -> Self {
        self.session_key = Some(key);
        self
    }

    pub fn text_query(mut self, text: impl Into<String>) -> Self {
        self.query_text = Some(text.into());
        self
    }

    pub fn vector_query(mut self, embedding: Vec<f32>) -> Self {
        self.query_embedding = Some(embedding);
        self
    }

    pub fn time_range(mut self, from: DateTime<Utc>, to: DateTime<Utc>) -> Self {
        self.time_range = Some((from, to));
        self
    }

    pub fn max_results(mut self, n: usize) -> Self {
        self.max_results = n;
        self
    }

    pub fn format(mut self, format: OutputFormat) -> Self {
        self.output_format = format;
        self
    }

    pub async fn execute(&self) -> Result<ContextResult, StorageError> {
        // ... combines BM25, vector search, and metadata filters
        todo!()
    }
}

#[derive(Debug, Clone, Copy)]
pub enum OutputFormat {
    Json,
    Toon,
    Markdown,
}
```
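Each setter takes `mut self` and returns `Self`, so calls chain without intermediate borrows. A self-contained miniature of the same consuming-builder pattern (the names here are illustrative, not the real ContextQueryBuilder API):

```rust
#[derive(Debug)]
struct Query {
    text: Option<String>,
    max_results: usize,
}

struct QueryBuilder {
    text: Option<String>,
    max_results: usize,
}

impl QueryBuilder {
    /// Defaults mirror the pattern above: sensible values, overridden per call.
    fn new() -> Self {
        Self { text: None, max_results: 10 }
    }

    /// Consuming setter: takes ownership of the builder and returns it.
    fn text_query(mut self, t: impl Into<String>) -> Self {
        self.text = Some(t.into());
        self
    }

    fn max_results(mut self, n: usize) -> Self {
        self.max_results = n;
        self
    }

    fn build(self) -> Query {
        Query { text: self.text, max_results: self.max_results }
    }
}
```

Because the builder is consumed at each step, a half-configured builder can never be reused accidentally; the compiler enforces linear use.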
### TOON Format

TOON is a compact serialization format designed for LLM context windows:

```text
# JSON (153 tokens)
{
  "messages": [
    {
      "role": "user",
      "content": "What is ClawDesk?",
      "timestamp": "2026-01-15T10:30:00Z"
    },
    {
      "role": "assistant",
      "content": "ClawDesk is a multi-channel AI agent gateway.",
      "timestamp": "2026-01-15T10:30:05Z"
    }
  ]
}

# TOON equivalent (58 tokens — 62% reduction)
messages[
  {role:user content:"What is ClawDesk?" ts:2026-01-15T10:30:00Z}
  {role:assistant content:"ClawDesk is a multi-channel AI agent gateway." ts:2026-01-15T10:30:05Z}
]
```
Key TOON optimizations:
| Feature | JSON | TOON | Saving |
|---|---|---|---|
| Quotes on keys | Required | Omitted | ~15% |
| Commas between items | Required | Whitespace-delimited | ~5% |
| Null values | Explicitly null | Omitted entirely | ~10% |
| Datetime format | ISO 8601 string | Compact notation | ~8% |
| Arrays | [item, item] | name[item item] | ~5% |
| Nested objects | {"a": {"b": 1}} | {a.b:1} (path notation) | ~20% |
For a 10-message conversation with tool call history, TOON typically saves 600–1,200 tokens compared to JSON. At scale this significantly increases the amount of useful context that fits within a model's context window.
## Hybrid Search (Memory Subsystem)
The clawdesk-memory crate implements hybrid search combining BM25 lexical search and vector semantic search:
### Reciprocal Rank Fusion (RRF)
RRF merges results from BM25 and vector search using a rank-based scoring function:
$$ \text{RRF}(d) = \sum_{r \in R} \frac{1}{k + r(d)} $$
Where $R$ is the set of result lists, $r(d)$ is the rank of document $d$ in list $r$, and $k$ is a smoothing constant (default: 60).
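The fusion step itself is only a few lines. A std-only sketch over plain ID lists (the real `reciprocal_rank_fusion` in clawdesk-memory operates on richer result structs):

```rust
use std::collections::HashMap;

/// Merge ranked lists with Reciprocal Rank Fusion.
/// `lists` holds document IDs in rank order (rank 1 first);
/// each list contributes 1 / (k + rank) to a document's score.
fn reciprocal_rank_fusion(lists: &[Vec<&str>], k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for list in lists {
        for (i, id) in list.iter().enumerate() {
            // rank r(d) is 1-based
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f32);
        }
    }
    let mut merged: Vec<_> = scores.into_iter().collect();
    merged.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    merged
}
```

Because only ranks are used, RRF needs no score normalization between BM25 (unbounded scores) and vector search (bounded distances), which is exactly why it suits hybrid retrieval.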
```rust
// crates/clawdesk-memory/src/hybrid.rs
pub struct HybridSearcher {
    bm25: Bm25Index,
    vector_store: Arc<dyn VectorStore>,
    embedding_model: Arc<dyn EmbeddingModel>,
    reranker: Option<Arc<dyn Reranker>>,
    rrf_k: f32, // default: 60.0
}

impl HybridSearcher {
    pub async fn search(
        &self,
        query: &str,
        token_budget: usize,
    ) -> Result<ContextChunk, MemoryError> {
        // 1. BM25 lexical search
        let bm25_results = self.bm25.search(query, 20);

        // 2. Vector similarity search
        let embedding = self.embedding_model.embed(query).await?;
        let vector_results = self.vector_store
            .search_similar(&embedding, 20, None)
            .await?;

        // 3. Reciprocal Rank Fusion
        let merged = reciprocal_rank_fusion(
            &bm25_results,
            &vector_results,
            self.rrf_k,
        );

        // 4. Optional reranking
        let ranked = match &self.reranker {
            Some(reranker) => reranker.rerank(query, merged).await?,
            None => merged,
        };

        // 5. Truncate to token budget
        let mut accumulated_tokens = 0;
        let mut selected = Vec::new();
        for result in ranked {
            let tokens = estimate_tokens(&result.content);
            if accumulated_tokens + tokens > token_budget {
                break;
            }
            accumulated_tokens += tokens;
            selected.push(result);
        }

        Ok(ContextChunk {
            results: selected,
            total_tokens: accumulated_tokens,
        })
    }
}
```
### BM25 Index

```rust
// crates/clawdesk-memory/src/bm25.rs
pub struct Bm25Index {
    /// Inverted index: term → (doc_id, term_frequency)
    index: HashMap<String, Vec<(DocId, f32)>>,
    /// Document lengths for normalization
    doc_lengths: HashMap<DocId, usize>,
    /// Average document length
    avg_doc_length: f32,
    /// Total number of documents
    doc_count: usize,
    /// BM25 parameters
    k1: f32, // default: 1.2
    b: f32,  // default: 0.75
}
```
The BM25 scoring formula:
$$ \text{BM25}(q, d) = \sum_{t \in q} \text{IDF}(t) \cdot \frac{f(t, d) \cdot (k_1 + 1)}{f(t, d) + k_1 \cdot \left(1 - b + b \cdot \frac{|d|}{\text{avgdl}}\right)} $$
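As a sanity check on the formula, here is a direct std-only transcription scoring a single query term (IDF computation and tokenization are simplified away relative to the real index):

```rust
/// BM25 contribution of one term t in document d.
/// tf = f(t, d), doc_len = |d|, avgdl = average document length.
fn bm25_term(idf: f32, tf: f32, doc_len: f32, avgdl: f32, k1: f32, b: f32) -> f32 {
    idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * doc_len / avgdl))
}
```

The `k1` parameter controls term-frequency saturation (repeating a term helps less and less), and `b` controls how strongly longer documents are penalized; the defaults above (1.2 and 0.75) match the struct fields.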
## Database Compaction

SochDB periodically compacts SSTables to reclaim space and merge levels:

```rust
/// Compaction runs on spawn_blocking to avoid blocking the async runtime.
pub async fn schedule_compaction(
    db: Database,
    interval: Duration,
    token: CancellationToken,
) {
    let mut ticker = tokio::time::interval(interval);
    loop {
        tokio::select! {
            _ = ticker.tick() => {
                let db = db.clone();
                match tokio::task::spawn_blocking(move || db.compact()).await {
                    Ok(Ok(stats)) => {
                        tracing::info!(
                            reclaimed_bytes = stats.reclaimed_bytes,
                            merged_tables = stats.merged_tables,
                            "compaction completed"
                        );
                    }
                    Ok(Err(e)) => {
                        tracing::error!("compaction failed: {e}");
                    }
                    Err(e) => {
                        tracing::error!("compaction task panicked: {e}");
                    }
                }
            }
            _ = token.cancelled() => break,
        }
    }
}
```
## Storage Metrics
| Metric | Typical Value | Notes |
|---|---|---|
| Point read latency (p50) | < 0.1ms | B-Tree, mmap |
| Point read latency (p99) | < 1ms | Including cache miss |
| Write latency (p50) | < 1ms | WAL append + fsync |
| Vector search (10K, ef=100) | < 2ms | HNSW |
| Compaction throughput | ~100 MB/s | LZ4 compression |
| Compression ratio | 3–5x | LZ4 on columnar data |
| WAL write amplification | 1.0x | Single write to WAL |
| Total write amplification | 2–3x | WAL + compaction |