AI Agents, Image Generation, RAG, Claude

Phopet

An AI-powered platform that transforms pet photos into styled portraits. Behind a simple user experience lies a multi-agent system with RAG-based style recommendations, Claude-powered quality assessment, autonomous GPU orchestration, and a support chatbot. The production pipeline has processed 10,000+ images without human intervention.

Visit phopet.com

10K+

Images generated

40+

Themed albums

6

Autonomous agents

0

Manual steps

The Challenge

Scaling creative production from hours to seconds.

Custom pet portraits are wildly popular but fundamentally unscalable. Each portrait requires a skilled designer: understanding the pet's features, selecting the right artistic style, composing the image, iterating on details, and delivering the final piece. That's 4+ hours of manual work per image — a handful per day at best, with backlogs growing weekly.

The challenge wasn't just automating image generation — that's the easy part. The real complexity was building a system that could: validate that uploaded photos are actually usable for training, train a custom model that captures the unique features of each individual pet, recommend styles that work well for each pet's breed and coloring, assess output quality without human review, handle payment processing and delivery, and provide customer support — all autonomously.

We needed to replace an entire creative studio with an AI-powered pipeline that handles every step from photo upload to delivery — with quality that matches or exceeds manual work.

Architecture

Six agents, one orchestrator, zero humans in the loop.

The system runs on a multi-agent architecture where each agent owns a specific domain. A central orchestrator — built on Claude — coordinates the pipeline and handles edge cases autonomously.

Agent 01

Validation Agent

Analyzes uploaded photos using computer vision. Checks for minimum resolution, detects whether the subject is actually a pet, evaluates lighting and focus quality, identifies duplicate uploads, and rejects images that won't produce good training results. Uses a Claude-powered assessment for borderline cases — "is this photo usable?" — with structured output for consistent decision-making.
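As a rough illustration of what "structured output" can mean for borderline cases, the sketch below parses a JSON verdict and rejects anything malformed before it can drive a decision. The field names (`decision`, `reason`, `confidence`) and the `PhotoVerdict` type are assumptions made for this example, not the platform's actual schema, and the call to Claude itself is left out.

```python
import json
from dataclasses import dataclass

# Hypothetical set of decisions the validation step is allowed to act on.
ALLOWED_DECISIONS = {"accept", "reject"}


@dataclass(frozen=True)
class PhotoVerdict:
    decision: str      # "accept" or "reject"
    reason: str        # short human-readable explanation
    confidence: float  # 0.0 to 1.0


def parse_verdict(raw: str) -> PhotoVerdict:
    """Parse a JSON verdict and raise ValueError if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"verdict is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")

    decision = data.get("decision")
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unexpected decision: {decision!r}")

    reason = data.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        raise ValueError("reason must be a non-empty string")

    confidence = data.get("confidence")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    return PhotoVerdict(decision, reason.strip(), float(confidence))
```

Validating the verdict in code keeps the decision consistent even when the model's reply drifts from the requested shape: a malformed reply becomes an explicit error rather than a silent accept.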

Agent 02

Training Agent

Manages the custom model training pipeline. Preprocesses validated photos (background removal, normalization, augmentation), configures training parameters based on pet type and photo set size, dispatches training jobs to GPU workers, monitors training progress, and validates the resulting model with test generations before marking it as ready.

Agent 03

Style Recommender (RAG)

A RAG-powered agent that recommends album themes based on the pet's characteristics. It retrieves from a knowledge base of breed-specific style data, past generation results, and user preference patterns, so a Golden Retriever gets different style recommendations than a Persian cat. The retriever embeds the pet's features as vectors and matches them against a curated style database stored in Pinecone.
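The retrieval step can be pictured as a nearest-neighbor search over embeddings. The sketch below is a minimal in-memory stand-in that ranks styles by cosine similarity; it does not use Pinecone, and the three-dimensional vectors and style names are made up for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_styles(
    pet_vector: Sequence[float],
    style_index: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[str]:
    """Return the names of the top_k styles most similar to the pet vector."""
    scored = [
        (cosine_similarity(pet_vector, vec), name)
        for name, vec in style_index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical style embeddings; real ones would come from an embedding model.
STYLE_INDEX = {
    "renaissance_oil": [0.9, 0.1, 0.2],
    "watercolor_meadow": [0.2, 0.9, 0.1],
    "neon_cyberpunk": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    print(recommend_styles([0.8, 0.2, 0.1], STYLE_INDEX))
```

A managed vector database replaces the linear scan with an approximate index, but the ranking idea is the same.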

Agent 04

Generation Agent

Handles the actual image generation pipeline. Combines the pet's custom model with album-specific prompts and style parameters, manages GPU resource allocation, implements generation batching for efficiency, runs automatic quality scoring on outputs using Claude's vision capabilities, and retries or adjusts parameters when results don't meet the quality threshold.
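The "score, then retry with adjusted parameters" behavior can be sketched as a small loop. Everything here is an assumption for illustration: `generate` and `score` are caller-supplied callables standing in for the image model and the Claude-based scorer, and the threshold, attempt limit, and adjustment rule in `adjust_params` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Params = Dict[str, float]


@dataclass
class GateResult:
    image: Optional[str]  # best image seen (an identifier in this sketch)
    score: float
    attempts: int
    passed: bool


def adjust_params(params: Params) -> Params:
    """Hypothetical adjustment: slightly more guidance and more steps."""
    updated = dict(params)
    updated["guidance_scale"] = params.get("guidance_scale", 7.0) + 0.5
    updated["steps"] = params.get("steps", 30) + 10
    return updated


def generate_with_quality_gate(
    generate: Callable[[Params], str],
    score: Callable[[str], float],
    params: Params,
    threshold: float = 0.7,
    max_attempts: int = 3,
) -> GateResult:
    """Generate until the score meets the threshold or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    best_image, best_score = None, float("-inf")
    current = dict(params)
    for attempt in range(1, max_attempts + 1):
        image = generate(current)
        value = score(image)
        if value > best_score:
            best_image, best_score = image, value
        if value >= threshold:
            return GateResult(image, value, attempt, True)
        current = adjust_params(current)
    # No attempt passed: return the best one so the caller can decide.
    return GateResult(best_image, best_score, max_attempts, False)
```

Capping the attempts bounds GPU cost per image, and returning the best failed attempt lets a caller escalate instead of silently dropping the image.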

Agent 05

Delivery Agent

Manages post-generation processing: image upscaling for print-quality output, watermark application for previews, CDN upload and cache optimization, user notification delivery, and integration with the merchandise API for wall art and product ordering. Handles the full lifecycle from raw generation to user-ready assets.

Agent 06

Support Chatbot

A Claude-powered customer support chatbot that handles user inquiries about their orders, generation status, photo requirements, and billing questions. Uses RAG to retrieve order-specific context and account history, enabling personalized responses. Escalates complex issues to human support with full conversation context. Supports English, Spanish, and Japanese.

The Pipeline

From photo upload to delivered portraits.

Step 01 — Upload & Validation

Smart photo ingestion

Users upload 10–20 reference photos. The Validation Agent runs each through a multi-stage check: resolution validation, pet detection via computer vision, focus and lighting scoring, duplicate detection, and a final Claude-powered assessment for edge cases. Users receive real-time feedback on which photos passed and why any were rejected — with specific suggestions for better alternatives.
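Two of the cheaper stages, resolution validation and duplicate detection, can be sketched with per-photo feedback as follows. The minimum dimensions, the `Upload` type, and the feedback wording are assumptions; the sketch catches only byte-identical duplicates via SHA-256 (near-duplicates would need something like perceptual hashing), and pet detection, focus scoring, and the Claude assessment are omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Set

# Hypothetical minimum dimensions for a usable training photo.
MIN_WIDTH = 512
MIN_HEIGHT = 512


@dataclass
class Upload:
    filename: str
    width: int
    height: int
    content: bytes


@dataclass
class CheckResult:
    filename: str
    passed: bool
    feedback: str


def validate_uploads(uploads: List[Upload]) -> List[CheckResult]:
    """Run resolution and exact-duplicate checks, with feedback per photo."""
    seen_digests: Set[str] = set()
    results: List[CheckResult] = []
    for upload in uploads:
        digest = hashlib.sha256(upload.content).hexdigest()
        if upload.width < MIN_WIDTH or upload.height < MIN_HEIGHT:
            feedback = (
                f"Resolution {upload.width}x{upload.height} is below "
                f"{MIN_WIDTH}x{MIN_HEIGHT}; try the original, uncropped photo."
            )
            results.append(CheckResult(upload.filename, False, feedback))
        elif digest in seen_digests:
            feedback = "Duplicate of an earlier upload; try a different pose or angle."
            results.append(CheckResult(upload.filename, False, feedback))
        else:
            seen_digests.add(digest)
            results.append(CheckResult(upload.filename, True, "Passed"))
    return results
```

Running the inexpensive checks first means the costlier vision and Claude stages only see photos that have a chance of passing.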

Step 02 — Model Training

Custom model per pet

The Training Agent preprocesses approved photos (background removal, color normalization, data augmentation) and trains a LoRA model specific to that pet. Training parameters are adjusted dynamically based on the number and quality of input photos. The agent monitors loss curves in real time and stops training once the loss has converged. A validation generation confirms the model captures the pet's distinctive features before it is marked as ready.
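One common way to decide when to stop training from a loss curve is patience-based early stopping, sketched below. This is a generic heuristic shown for illustration, not necessarily the criterion the Training Agent uses; `patience` and `min_delta` are hypothetical settings.

```python
from typing import Optional, Sequence


def should_stop(
    losses: Sequence[float], patience: int = 3, min_delta: float = 1e-3
) -> bool:
    """True if the last `patience` losses did not beat the earlier best by min_delta."""
    if len(losses) <= patience:
        return False  # not enough history to judge
    best_before = min(losses[:-patience])
    recent_best = min(losses[-patience:])
    return recent_best > best_before - min_delta


def stopping_step(
    losses: Sequence[float], patience: int = 3, min_delta: float = 1e-3
) -> Optional[int]:
    """Return the 1-based step at which training would stop, or None."""
    for step in range(1, len(losses) + 1):
        if should_stop(losses[:step], patience, min_delta):
            return step
    return None


if __name__ == "__main__":
    curve = [1.0, 0.7, 0.5, 0.5, 0.5, 0.5]
    print(stopping_step(curve))
```

A larger `patience` tolerates noisy loss curves at the cost of extra GPU time, and a larger `min_delta` stops sooner on slow plateaus.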

Step 03 — Generation

GPU-orchestrated batch production

When the user selects an album, the Generation Agent combines the pet's model with curated prompts and style parameters. Jobs are queued with priority scoring, batched for GPU efficiency, and dispatched to the worker pool. Each output gets an automatic quality score from Claude's vision capabilities, and images below the threshold are regenerated with adjusted parameters. The entire album (50+ images) generates in minutes, not hours.
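Priority queuing with batching can be sketched with a binary heap. The `GenerationQueue` class, the integer priorities, and the batch size are assumptions for this example; the production system is described as using queue workers, which this in-memory sketch does not model.

```python
import heapq
import itertools
from typing import List, Tuple


class GenerationQueue:
    """In-memory priority queue that hands out jobs in batches."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # Monotonic counter keeps first-in, first-out order within a priority.
        self._counter = itertools.count()

    def submit(self, job_id: str, priority: int) -> None:
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), job_id))

    def next_batch(self, batch_size: int) -> List[str]:
        """Pop up to batch_size job ids, highest priority first."""
        batch: List[str] = []
        while self._heap and len(batch) < batch_size:
            _, _, job_id = heapq.heappop(self._heap)
            batch.append(job_id)
        return batch

    def __len__(self) -> int:
        return len(self._heap)
```

A worker would call `next_batch` with the number of images one GPU pass can handle, so higher-priority jobs are served first while each pass stays full.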

Step 04 — Delivery & Commerce

From digital to physical

Generated images are upscaled to print resolution, optimized for web delivery, and uploaded to the CDN. Users can browse their gallery, favorite images, and order physical products (canvas prints, wall art, merchandise) through integrated commerce. The Delivery Agent handles the full chain from raw generation output to user-ready assets and order fulfillment.

Claude Integration

Claude as a first-class system component.

Claude isn't just used for one task — it's woven throughout the entire system as a decision-making engine, quality gate, and customer interface.

Quality assessment

Claude's vision capabilities score every generated image for quality, consistency with the pet's features, and artistic merit. Below-threshold images are automatically regenerated.

Photo validation

For borderline uploads, Claude provides nuanced assessment — "this photo has good pose but insufficient lighting" — with structured output that drives validation decisions.

Prompt optimization

Claude analyzes generation results and dynamically adjusts prompts to improve output quality for specific pet types and styles, forming a feedback loop that improves over time.

Customer support

The support chatbot uses Claude with RAG-retrieved order context to handle inquiries in three languages, resolving 80%+ of tickets without human escalation.

Orchestration decisions

The central orchestrator uses Claude for edge-case routing — when a job fails, when quality is borderline, when a user request doesn't fit standard patterns.
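One way to picture edge-case routing is a function that handles clear-cut cases with fixed rules and defers only the ambiguous ones to the model. The sketch below is an assumption about how such a split might look: the statuses, threshold, margin, and retry limit are hypothetical, it retries failed jobs before deferring, and the actual call to Claude is not shown.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    CONTINUE = "continue"    # quality is clearly fine, move on
    RETRY = "retry"          # clear failure with retries left
    ASK_MODEL = "ask_model"  # ambiguous: defer the decision to Claude


def route_job(
    status: str,
    quality: Optional[float],
    retries: int,
    threshold: float = 0.7,
    margin: float = 0.05,
    max_retries: int = 2,
) -> Route:
    """Route a job with fixed rules, deferring ambiguous cases to the model."""
    if status == "failed":
        return Route.RETRY if retries < max_retries else Route.ASK_MODEL
    if status != "completed" or quality is None:
        return Route.ASK_MODEL  # does not fit a standard pattern
    if abs(quality - threshold) < margin:
        return Route.ASK_MODEL  # borderline quality
    if quality >= threshold:
        return Route.CONTINUE
    return Route.RETRY if retries < max_retries else Route.ASK_MODEL
```

Keeping the common paths rule-based makes them cheap and predictable, and reserves model calls for the cases the rules cannot settle.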

Development

Built with Claude Code.

The entire system was developed using Claude Code as the primary engineering tool — from architecture design to production deployment.

Architecture planning

Used Claude Code's plan mode to design the multi-agent architecture, define agent responsibilities, and map data flows before writing a single line of code.

Custom slash commands

Created project-specific slash commands for common operations: /deploy for zero-downtime deployments, /test-agent for running agent integration tests, /gpu-status for monitoring the worker pool.

CLAUDE.md rules

Project rules enforce coding standards, ensure proper error handling in agent code, mandate structured logging, and maintain consistency across the multi-agent codebase.

Parallel agent development

Claude Code's subagent capabilities enabled parallel development of independent agents — building and testing the Validation Agent and Training Agent simultaneously.

Tech Stack

The full system.

Backend & Infrastructure

Laravel, Python, Redis, PostgreSQL, Queue Workers, GPU Orchestration

AI & ML

Claude API, Stable Diffusion, LoRA Training, Pinecone, Computer Vision, RAG Pipeline

Platform

Stripe, CDN, Image Upscaling, i18n (EN/ES/JA), Merchandise API, Claude Code

Need an autonomous AI pipeline?

We build multi-agent systems that handle high-volume AI workloads end-to-end. Let's talk about your use case.
