Vidix
A native macOS application that embeds AI directly into your workflow. Select text or an image in any app, trigger a shortcut, and get AI-powered results without switching context. Built on a multi-agent architecture with RAG-powered recipes, MCP server integration, multi-provider support, and a strict privacy-first approach where zero data touches our servers.
Visit vidix.app
Eliminating the context switch tax.
Every time you use an AI tool, you pay a context switch tax: leave your application, open a browser, navigate to a chat interface, paste your content, wait for a response, copy the result, switch back to your app, paste it in. That's 8 steps minimum, repeated dozens of times per day. For knowledge workers — developers, writers, analysts, project managers — this friction adds up to hours of lost productivity weekly.
Existing solutions were either web-based (still requiring a switch) or locked to a single AI provider. None offered the deep system-level integration needed to truly make AI invisible: capturing content from any application, processing it through the right model with the right prompt, and delivering results exactly where you need them.
We needed to build a tool that lived at the OS level — accessible from every application via a single shortcut, supporting multiple AI providers and custom workflows, with an absolute commitment to privacy: no data stored, no intermediary servers, no tracking. Just AI where you work.
A multi-agent system inside your Mac.
Vidix isn't a simple API wrapper. It's a coordinated system of specialized agents, each responsible for a different aspect of the AI workflow, orchestrated by a central engine that routes requests to the right agent with the right context.
Capture Agent
Interfaces with macOS Accessibility APIs to capture selected text, images, or screen regions from any running application. Handles the complexity of different app frameworks — native Cocoa, Electron, web views, Terminal — with fallback strategies for apps that don't expose standard accessibility hooks. Detects content type automatically and routes to the appropriate processing pipeline.
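The capture path can be sketched with the Accessibility API directly. This is an illustrative fragment, not Vidix's actual implementation: it reads the focused element's selected text and returns nil for apps that don't expose standard hooks, where a fallback strategy (such as a clipboard round-trip) would take over.

```swift
import ApplicationServices

// Read the current selection from whatever app has focus, via the
// Accessibility API. Returns nil when the app exposes no standard hooks.
func capturedSelection() -> String? {
    let system = AXUIElementCreateSystemWide()

    var focused: CFTypeRef?
    guard AXUIElementCopyAttributeValue(system,
                                        kAXFocusedUIElementAttribute as CFString,
                                        &focused) == .success,
          let element = focused else { return nil }

    var selection: CFTypeRef?
    guard AXUIElementCopyAttributeValue(element as! AXUIElement,
                                        kAXSelectedTextAttribute as CFString,
                                        &selection) == .success,
          let text = selection as? String, !text.isEmpty else { return nil }
    return text
}
```

Note that this requires the app to hold the Accessibility permission (System Settings → Privacy & Security → Accessibility).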
Router Agent
The orchestration layer. Receives captured content and determines how to process it: which recipe to apply, which AI provider to use, what system prompt to inject. Uses a RAG-indexed recipe library to match content type and user intent to the right processing pipeline. Handles provider failover — if Claude is unavailable, it can route to GPT or a local Ollama model based on user-defined fallback preferences.
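The failover behavior can be sketched as a simple ordered chain. Names and types here are assumptions for illustration, not Vidix's internal API: the router walks a user-defined fallback order and returns the first provider's successful response.

```swift
// Illustrative failover chain: try each provider in the user's
// preferred order until one responds.
enum Provider: String { case claude, gpt, ollama }

struct FallbackConfig {
    var order: [Provider] = [.claude, .gpt, .ollama]
}

func route(_ request: String,
           config: FallbackConfig,
           send: (Provider, String) throws -> String) -> String? {
    for provider in config.order {
        // First provider that doesn't throw wins.
        if let reply = try? send(provider, request) { return reply }
    }
    return nil  // all providers unavailable
}
```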
Recipe Agents
Each recipe is effectively a mini-agent with its own system prompt, provider preference, temperature setting, and output format. Built-in recipes cover common use cases: "Improve writing," "Explain code," "Translate to Spanish," "Extract key points." Users create custom recipe agents without code — defining the prompt, selecting a provider, and assigning a keyboard shortcut. Each recipe agent manages its own conversation context.
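A recipe definition of this kind might look like the following struct. Field names are assumptions based on the description above, not Vidix's actual schema:

```swift
// Hypothetical shape of a no-code recipe agent definition.
struct Recipe: Codable {
    var name: String           // e.g. "Improve writing"
    var systemPrompt: String   // injected before the captured content
    var provider: String       // "claude", "gpt", or a local Ollama model
    var temperature: Double
    var outputMode: String     // "replace", "type", or "editor"
    var shortcut: String?      // optional keyboard shortcut, e.g. "⌃⌥I"
}
```

Making the definition `Codable` keeps custom recipes serializable, so they can be stored locally and indexed alongside the built-in ones.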
Vision Agent
Handles image inputs using Claude's vision capabilities and GPT-4V. Users can capture a screen region and ask questions about it, extract text from images (OCR), describe visual content, analyze charts and diagrams, or convert mockups into code. The agent automatically selects the best vision-capable provider based on the task and user's API key availability.
Output Agent
Manages how AI responses are delivered back to the user. Three modes: direct replacement (swaps selected text), character-by-character typing (for apps that block paste, like certain terminals and form fields), and editor mode (opens a markdown buffer where users can iterate — "make it shorter," "add bullet points" — before inserting). Handles formatting preservation and clipboard management.
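The character-by-character typing mode can be approximated with synthesized keyboard events. This is a sketch of the technique, not Vidix's exact code: each character is posted as a unicode key-down/key-up pair, which works even in apps that block programmatic paste.

```swift
import CoreGraphics

// Type text into the frontmost app one character at a time by
// synthesizing keyboard events (for paste-blocking targets).
func typeOut(_ text: String) {
    for scalar in text.unicodeScalars {
        var utf16 = Array(String(scalar).utf16)

        let down = CGEvent(keyboardEventSource: nil, virtualKey: 0, keyDown: true)
        down?.keyboardSetUnicodeString(stringLength: utf16.count, unicodeString: &utf16)
        down?.post(tap: .cghidEventTap)

        let up = CGEvent(keyboardEventSource: nil, virtualKey: 0, keyDown: false)
        up?.keyboardSetUnicodeString(stringLength: utf16.count, unicodeString: &utf16)
        up?.post(tap: .cghidEventTap)
    }
}
```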
MCP Bridge Agent
Integrates with Model Context Protocol (MCP) servers to extend Vidix's capabilities beyond text and image processing. Users can connect MCP servers for database queries, API calls, file system operations, and custom tools — all accessible through the same shortcut-driven interface. The bridge agent handles MCP server discovery, connection management, and tool routing.
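At the protocol level, MCP servers speak JSON-RPC 2.0, and tool discovery uses the `tools/list` method. A minimal sketch of the discovery request (transport setup omitted; struct names are illustrative):

```swift
import Foundation

// Minimal JSON-RPC 2.0 request envelope, as used by MCP.
struct JSONRPCRequest: Codable {
    var jsonrpc = "2.0"
    var id: Int
    var method: String
}

// Ask a connected MCP server which tools it exposes.
let discover = JSONRPCRequest(id: 1, method: "tools/list")
let payload = try? JSONEncoder().encode(discover)
// `payload` would be sent over the server's transport (stdio or HTTP).
```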
A knowledge-driven recipe engine.
The recipe system is powered by RAG, enabling intelligent recipe matching and context-aware suggestions that go beyond simple keyword search.
The Palette (command launcher) uses vector search to match user queries to recipes. Typing "make this email more professional" finds the "Improve Writing" recipe even without exact keyword matches. The recipe library is indexed in a local vector store for instant retrieval.
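The core of that matching is cosine similarity over embeddings. A minimal sketch of the idea, independent of any particular vector store or embedding model:

```swift
// Cosine similarity between two embedding vectors.
func cosine(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map(*).reduce(0, +)
    let na = a.map { $0 * $0 }.reduce(0, +).squareRoot()
    let nb = b.map { $0 * $0 }.reduce(0, +).squareRoot()
    return (na > 0 && nb > 0) ? dot / (na * nb) : 0
}

// Rank recipes by similarity to the embedded query and return the best.
func bestMatch(query: [Double],
               recipes: [(name: String, embedding: [Double])]) -> String? {
    recipes.max { cosine(query, $0.embedding) < cosine(query, $1.embedding) }?.name
}
```

With embeddings, "make this email more professional" and "Improve Writing" land near each other in vector space even though they share no keywords.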
The Router Agent analyzes captured content and suggests the most relevant recipes. Select code and it surfaces development-related recipes; select prose and it suggests writing recipes. Suggestions are ranked by content type, frequency of use, and the active application.
Users build their own recipe agents without writing code: define a system prompt, choose a provider, set parameters, assign a shortcut. Custom recipes are automatically indexed into the RAG store for semantic search alongside built-in recipes.
The editor mode maintains conversation context, enabling iterative refinement. The system stores recent interaction history in a local RAG index, allowing follow-up prompts like "now translate that to French" to work seamlessly across sessions.
Zero trust. Zero data retention.
Vidix was designed from the ground up with a non-negotiable privacy constraint: we never see, store, or process user data. The entire architecture enforces this at every level.
All processing happens locally on the user's Mac. When AI is needed, content routes directly from the application to the user's chosen provider — Claude, GPT, Gemini, or a local model via Ollama. No intermediary server, no proxy, no logging. Users bring their own API keys, stored in the macOS Keychain.
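Storing a bring-your-own API key in the Keychain looks roughly like this. The service and account names are illustrative, not Vidix's actual identifiers:

```swift
import Foundation
import Security

// Save a provider API key as a generic password item in the macOS
// Keychain, replacing any existing entry for that provider.
func saveAPIKey(_ key: String, provider: String) -> Bool {
    let item: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: "com.vidix.apikeys",  // assumed service name
        kSecAttrAccount as String: provider,
        kSecValueData as String: Data(key.utf8),
    ]
    SecItemDelete(item as CFDictionary)  // remove stale entry, if any
    return SecItemAdd(item as CFDictionary, nil) == errSecSuccess
}
```

Keychain items are encrypted at rest by the OS, so the key never exists in plaintext on disk.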
For users who can't send data to any external API, Ollama support enables fully offline AI processing with local models. The same recipes, the same interface, the same workflow — but nothing leaves the machine.
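Local routing targets Ollama's HTTP API on its default port. A sketch of building such a request (the model name is an example, not a Vidix default):

```swift
import Foundation

// Build a request against a local Ollama instance's generate endpoint.
// Nothing in this path leaves the machine.
func ollamaRequest(prompt: String) -> URLRequest {
    var request = URLRequest(url: URL(string: "http://localhost:11434/api/generate")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = ["model": "llama3", "prompt": prompt, "stream": false]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)
    return request
}
```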
Built with Claude Code.
A native Swift application, with Claude Code driving the entire development workflow, from architecture design to App Store submission.
Used Claude Code's plan mode to design the agent architecture in Swift, mapping how native macOS APIs (Accessibility, Keychain, Pasteboard) would interface with the multi-provider AI layer.
A project-specific slash command that scaffolds new recipe agents: generates the prompt template, configures provider settings, creates test fixtures, and indexes the recipe into the local vector store.
Project rules enforce Swift naming conventions, ensure proper async/await patterns for API calls, mandate error handling for all provider interactions, and maintain privacy-first patterns across the codebase.
Claude Code's subagent capabilities enabled parallel development of the MCP Bridge Agent alongside the core application, testing server discovery and tool routing in isolation.
The full system.
Native macOS
AI & Providers
Development
Need AI embedded in your workflow?
We build native applications with deep AI integration, multi-provider support, and privacy-first architecture. Let's talk about your use case.