
clawdesk-media

Media processing pipeline for handling audio, video, and image attachments in messages. Provides a MediaProcessor trait for extensible format support and handles transcoding, thumbnail generation, and metadata extraction.

Dependencies

Internal: clawdesk-types

External: tokio, serde, tracing, thiserror

Modules

| Module | Description |
| ------ | ----------- |
| audio | Audio processing: transcription prep, format conversion |
| video | Video processing: frame extraction, transcoding |
| image | Image processing: resize, thumbnail, format conversion |

Key Types

/// Core media processing trait
#[async_trait]
pub trait MediaProcessor: Send + Sync {
    /// Process a media payload and return the processed result
    async fn process(&self, input: &MediaPayload) -> Result<ProcessedMedia, MediaError>;

    /// Extract metadata from a media file
    async fn metadata(&self, input: &MediaPayload) -> Result<MediaMetadata, MediaError>;

    /// Generate a thumbnail for preview
    async fn thumbnail(
        &self,
        input: &MediaPayload,
        size: ThumbnailSize,
    ) -> Result<Vec<u8>, MediaError>;
}
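
The trait signatures above reference `ThumbnailSize` and `MediaError`, neither of which is defined in this section. The following is only a sketch of what such types might look like. The variant names, the pixel values, and the `max_edge` helper are all assumptions made for illustration, not the crate's actual definitions. The crate lists `thiserror` as a dependency, which would normally derive the `Display` and `Error` impls; they are written by hand here so the sketch needs only the standard library.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical thumbnail presets (assumed, not taken from the crate).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThumbnailSize {
    Small,
    Medium,
    Large,
}

impl ThumbnailSize {
    /// Longest edge in pixels. The concrete values are assumptions.
    pub fn max_edge(self) -> u32 {
        match self {
            ThumbnailSize::Small => 128,
            ThumbnailSize::Medium => 256,
            ThumbnailSize::Large => 512,
        }
    }
}

/// Hypothetical error type for media processing failures.
#[derive(Debug, PartialEq)]
pub enum MediaError {
    /// The MIME type is not handled by any processor.
    UnsupportedFormat(String),
    /// The payload exceeds a configured size limit.
    TooLarge { size_bytes: usize, limit_bytes: usize },
    /// The bytes could not be decoded as the declared format.
    Decode(String),
}

impl fmt::Display for MediaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MediaError::UnsupportedFormat(mime) => {
                write!(f, "unsupported format: {}", mime)
            }
            MediaError::TooLarge { size_bytes, limit_bytes } => write!(
                f,
                "payload of {} bytes exceeds limit of {} bytes",
                size_bytes, limit_bytes
            ),
            MediaError::Decode(reason) => write!(f, "decode failed: {}", reason),
        }
    }
}

impl Error for MediaError {}
```

Because `MediaError` implements `std::error::Error`, callers can propagate it with `?` into a `Box<dyn Error>` or wrap it in their own error type.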

/// Media payload from a message
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MediaPayload {
    pub media_type: MediaType,
    pub data: Vec<u8>,
    pub filename: Option<String>,
    pub mime_type: String,
    pub size_bytes: usize,
}
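
`MediaPayload` stores `size_bytes` alongside `data`, so the two can drift apart if the struct is built by hand. One way to avoid that is a small constructor that derives the size from the bytes. The `from_bytes` helper below is hypothetical and is not part of the documented API. The types are mirrored locally, without the serde derives and with `PartialEq` added for the example, so the sketch compiles on its own.

```rust
/// Local mirror of the documented enum (serde derives omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaType {
    Image,
    Audio,
    Video,
    Document,
}

/// Local mirror of the documented struct (serde derives omitted).
#[derive(Debug, Clone)]
pub struct MediaPayload {
    pub media_type: MediaType,
    pub data: Vec<u8>,
    pub filename: Option<String>,
    pub mime_type: String,
    pub size_bytes: usize,
}

impl MediaPayload {
    /// Hypothetical helper: builds a payload and sets `size_bytes` from `data`.
    pub fn from_bytes(
        media_type: MediaType,
        data: Vec<u8>,
        filename: Option<String>,
        mime_type: impl Into<String>,
    ) -> Self {
        // Read the length before `data` is moved into the struct.
        let size_bytes = data.len();
        Self {
            media_type,
            data,
            filename,
            mime_type: mime_type.into(),
            size_bytes,
        }
    }
}
```

A call might look like `MediaPayload::from_bytes(MediaType::Image, bytes, Some("photo.jpg".into()), "image/jpeg")`.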

/// Supported media types
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub enum MediaType {
    Image,
    Audio,
    Video,
    Document,
}
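
A payload carries both a `media_type` and a `mime_type`, so callers often need to derive one from the other. The sketch below shows one possible mapping from a MIME string to `MediaType`. The `from_mime` helper is hypothetical, and treating every non-image, non-audio, non-video type as `Document` is an assumption, not documented behavior.

```rust
/// Local mirror of the documented enum (serde derives omitted,
/// `PartialEq` added for the example).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaType {
    Image,
    Audio,
    Video,
    Document,
}

impl MediaType {
    /// Hypothetical helper: classify a MIME string by its top-level type.
    pub fn from_mime(mime: &str) -> MediaType {
        // Drop parameters such as "; codecs=opus", then keep the part before '/'.
        let essence = mime.split(';').next().unwrap_or("").trim();
        let top_level = essence.split('/').next().unwrap_or("");
        match top_level.to_ascii_lowercase().as_str() {
            "image" => MediaType::Image,
            "audio" => MediaType::Audio,
            "video" => MediaType::Video,
            // Assumed fallback: anything else is treated as a document.
            _ => MediaType::Document,
        }
    }
}
```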

/// Processed media result
#[derive(Debug, Clone)]
pub struct ProcessedMedia {
    pub data: Vec<u8>,
    pub mime_type: String,
    pub metadata: MediaMetadata,
    pub thumbnail: Option<Vec<u8>>,
}

/// Media metadata
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MediaMetadata {
    pub width: Option<u32>,
    pub height: Option<u32>,
    pub duration_secs: Option<f64>,
    pub format: String,
    pub size_bytes: usize,
}
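
The optional `width` and `height` fields are what a thumbnail step would need in order to pick target dimensions. As an illustration only, the sketch below scales the dimensions to fit a maximum edge while preserving aspect ratio. The `fit_within` function is hypothetical, and nothing in this section says the crate computes thumbnail sizes this way.

```rust
/// Local mirror of the documented struct (serde derives omitted).
#[derive(Debug, Clone)]
pub struct MediaMetadata {
    pub width: Option<u32>,
    pub height: Option<u32>,
    pub duration_secs: Option<f64>,
    pub format: String,
    pub size_bytes: usize,
}

/// Hypothetical helper: scale (width, height) so the longer edge is at most
/// `max_edge`, preserving aspect ratio. Returns `None` when the dimensions
/// are unknown (for example, audio) or when any input is zero.
pub fn fit_within(meta: &MediaMetadata, max_edge: u32) -> Option<(u32, u32)> {
    let (w, h) = (meta.width?, meta.height?);
    if w == 0 || h == 0 || max_edge == 0 {
        return None;
    }
    let longest = w.max(h);
    if longest <= max_edge {
        // Already small enough; this sketch never upscales.
        return Some((w, h));
    }
    // Integer math in u64 avoids overflow; round to nearest, keep at least 1 px.
    let scale = |edge: u32| -> u32 {
        let scaled = (edge as u64 * max_edge as u64 + longest as u64 / 2) / longest as u64;
        scaled.max(1) as u32
    };
    Some((scale(w), scale(h)))
}
```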

Example Usage

use clawdesk_media::{MediaProcessor, MediaPayload, MediaType, ThumbnailSize};
// `ImageProcessor` must also be in scope; its module path is not shown here.

let processor = ImageProcessor::new();

// Read the length first: `image_bytes` (assumed to be a Vec<u8>) is moved
// into the payload below and cannot be used afterwards.
let size_bytes = image_bytes.len();

let payload = MediaPayload {
    media_type: MediaType::Image,
    data: image_bytes,
    filename: Some("photo.jpg".into()),
    mime_type: "image/jpeg".into(),
    size_bytes,
};

// Process the image (resize, optimize)
let processed = processor.process(&payload).await?;

// Generate a thumbnail
let thumb = processor.thumbnail(&payload, ThumbnailSize::Small).await?;

Note: Media processing happens before messages enter the agent pipeline. Large media files are processed asynchronously to avoid blocking the message flow.
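
To make the "large files are processed asynchronously" idea concrete, here is a rough sketch of size-based dispatch. It uses a standard-library thread and channel as a stand-in for a tokio task, so it is not how the crate does it. The `LARGE_MEDIA_BYTES` threshold, the `Dispatch` enum, and the `process_bytes` placeholder are all assumptions made for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Assumed threshold above which work is moved off the message path.
pub const LARGE_MEDIA_BYTES: usize = 1024 * 1024;

/// Placeholder for real processing: just reports the byte count.
fn process_bytes(data: &[u8]) -> usize {
    data.len()
}

/// Result of handing a payload to the (hypothetical) dispatcher.
pub enum Dispatch {
    /// Small payload: processed on the calling thread, result ready now.
    Inline(usize),
    /// Large payload: processed in the background, result arrives later.
    Background(mpsc::Receiver<usize>),
}

/// Process small payloads inline and large ones on a background thread,
/// so the caller is not blocked while a large file is handled.
pub fn dispatch(data: Vec<u8>) -> Dispatch {
    if data.len() < LARGE_MEDIA_BYTES {
        return Dispatch::Inline(process_bytes(&data));
    }
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the receiver was dropped; ignore it.
        let _ = tx.send(process_bytes(&data));
    });
    Dispatch::Background(rx)
}
```

In a tokio-based service the background branch would more likely spawn a task (or a blocking task for CPU-heavy transcoding) and return a handle the caller can await.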