Specifications by Theme
Specifications are organized by architectural concern to show how the different parts of the system work together.
🎬 Core Agent System
The seven agents that implement the multi-agent video production pipeline.
| Agent | Role | Key Innovation |
|---|---|---|
| ProducerAgent | Budget planning, pilot strategies | Competitive pilot system with dynamic budget allocation |
| ScriptWriterAgent | Scene breakdown from concept | Provider-aware prompt optimization using learnings |
| VideoGeneratorAgent | Video generation | Pluggable providers with unified interface |
| AudioGeneratorAgent | TTS narration | Multi-provider TTS (ElevenLabs, OpenAI, Google) |
| QAVerifierAgent | Vision-based verification | Claude Vision analysis of video frames |
| CriticAgent | Quality evaluation | Gap analysis, learning extraction |
| EditorAgent | EDL creation | Selects best scenes from multiple pilots |
Design Philosophy: Each agent has a single, well-defined responsibility. They communicate through structured data, enabling parallel execution and easy testing.
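To illustrate the structured-data handoff described above, here is a minimal sketch that assumes plain frozen dataclasses as the message format. The names `Scene`, `Clip`, `write_script`, `generate_video`, and `run_pipeline` are hypothetical stand-ins, not the project's actual classes or the Strands SDK API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """Hypothetical structured message passed between agents."""
    index: int
    prompt: str


@dataclass(frozen=True)
class Clip:
    """Hypothetical result produced for one scene."""
    scene_index: int
    uri: str


def write_script(concept: str, scene_count: int) -> list[Scene]:
    """Stand-in for a script-writing step: split a concept into scenes."""
    return [Scene(i, f"{concept} (scene {i + 1})") for i in range(scene_count)]


def generate_video(scene: Scene) -> Clip:
    """Stand-in for a video-generation step; makes no network calls."""
    return Clip(scene.index, f"mock://clip/{scene.index}")


def run_pipeline(concept: str, scene_count: int = 3) -> list[Clip]:
    scenes = write_script(concept, scene_count)
    # Scenes are independent immutable values, so they can be processed in parallel.
    # pool.map returns results in input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(generate_video, scenes))
```

Because each step takes and returns plain values, any step can be unit-tested in isolation by constructing a `Scene` by hand.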
🏗️ System Architecture
Foundation and infrastructure specifications.
- PROJECT_VISION.md - Core mission statement
  - Original “inception” use case
  - Competitive pilot innovation
  - Budget-aware production
- ARCHITECTURE.md - System design
  - Data flow diagrams
  - Agent orchestration patterns
  - Integration points
- STRANDS_INTEGRATION.md - Agent orchestration
  - Why Strands SDK
  - Agent initialization
  - Message passing
🔌 Provider System
Pluggable providers for video, audio, image, music, and storage.
- PROVIDERS_COMPLETE.md - Provider architecture
  - Unified interface design
  - Cost estimation
  - Error handling patterns
  - Mock mode for testing
- LUMA_PROVIDER_SPEC.md - Luma AI implementation
  - Text-to-video
  - Image-to-video
  - Extend/interpolate
  - Comprehensive API mapping
- AUDIO_SYSTEM.md - Audio production tiers
  - Tier 1: No audio
  - Tier 2: Background music
  - Tier 3+: Full TTS narration
- SEED_ASSETS.md - Multi-modal inputs
  - User-provided images/video
  - Brand consistency
  - Voice cloning
Key Innovation: Providers are hot-swappable. The same script can be produced with Luma, Runway, or Pika just by changing a CLI flag.
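The hot-swap idea above can be sketched as a shared base class plus a name-to-class registry. This is an illustration under stated assumptions, not the project's actual interface: `VideoProvider`, `MockProvider`, `PlaceholderProvider`, the `--provider` flag name, and the per-second rate are all hypothetical, and neither provider calls a real API.

```python
import argparse
from abc import ABC, abstractmethod


class VideoProvider(ABC):
    """Hypothetical unified interface shared by every provider."""

    name = "base"
    cost_per_second = 0.0  # illustrative rate, not a real price

    def estimate_cost(self, duration_s: float) -> float:
        """Estimate the cost of a clip before generating it."""
        return round(self.cost_per_second * duration_s, 2)

    @abstractmethod
    def generate(self, prompt: str, duration_s: float) -> str:
        """Return a URI for the generated clip."""


class MockProvider(VideoProvider):
    """Free provider for tests; returns a fake URI."""

    name = "mock"

    def generate(self, prompt: str, duration_s: float) -> str:
        return f"{self.name}://clip-{duration_s:.0f}s"


class PlaceholderProvider(MockProvider):
    """Stands in for a paid backend; still makes no network calls."""

    name = "placeholder"
    cost_per_second = 0.5


# Registry keyed by the value passed on the command line.
PROVIDERS = {cls.name: cls for cls in (MockProvider, PlaceholderProvider)}


def main(argv: list[str]) -> str:
    parser = argparse.ArgumentParser()
    # The flag name is an assumption made for this sketch.
    parser.add_argument("--provider", choices=sorted(PROVIDERS), default="mock")
    args = parser.parse_args(argv)
    provider = PROVIDERS[args.provider]()
    return provider.generate("a sunrise over mountains", duration_s=5.0)
```

Calling `main(["--provider", "placeholder"])` instead of `main(["--provider", "mock"])` switches backends without touching the calling code, which is the property the section describes.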
🧠 Memory & Learning System
Continuous improvement through learning and knowledge integration.
- MEMORY_AND_DASHBOARD.md - Initial memory design
  - Provider-specific learnings
  - Tips, gotchas, preferences
  - Web dashboard
- MULTI_TENANT_MEMORY_ARCHITECTURE.md - Enterprise multi-tenancy
  - Namespace hierarchy: SESSION → USER → ORG → PLATFORM
  - Security model and isolation
  - Learning promotion based on validation
- KNOWLEDGE_TO_VIDEO.md - Knowledge base system
  - Document ingestion (PDF, Markdown)
  - Atomic concept extraction
  - Context-aware video generation
- DOCUMENT_TO_VIDEO.md - Document pipeline
  - PDF atomization
  - Semantic chunking
  - Research paper to video
Evolution: The system started with simple prompt optimization and evolved into full knowledge management and an enterprise memory system.
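The namespace hierarchy and validation-based promotion listed above might look roughly like the sketch below. The `Scope` enum mirrors the SESSION → USER → ORG → PLATFORM ordering from the spec list. Everything else is an assumption made for illustration: the `Learning` record, the `validations` counter, the default threshold of 3, and the choice to reset the counter after each promotion.

```python
from dataclasses import dataclass, replace
from enum import IntEnum


class Scope(IntEnum):
    """Namespaces ordered from narrowest to widest."""
    SESSION = 0
    USER = 1
    ORG = 2
    PLATFORM = 3


@dataclass(frozen=True)
class Learning:
    """Hypothetical record for one stored learning."""
    text: str
    scope: Scope
    validations: int = 0  # times the learning was confirmed (assumed metric)


def promote(learning: Learning, threshold: int = 3) -> Learning:
    """Move a learning one level up once it has enough validations."""
    if learning.validations < threshold or learning.scope is Scope.PLATFORM:
        return learning
    # Reset the count so the wider namespace must validate the learning again.
    return replace(learning, scope=Scope(learning.scope + 1), validations=0)
```

Promoting one level at a time keeps a learning from jumping straight from a single session to the whole platform, which fits the isolation goal in the spec list.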
🛠️ Developer Experience
Tools and infrastructure for building and testing.
- TESTING_AND_PROVIDERS.md - Testing philosophy
  - Mock providers for testing
  - Integration testing strategies
  - Cost-aware testing
- DOCKER_DEV_ENVIRONMENT.md - Containerized development
  - FFmpeg dependencies
  - Python environment
  - Hot reload
- CLI_INTROSPECTION.md - CLI enhancements
  - Package introspection
  - Command structure
  - Configuration management
Focus: Make it easy to develop without incurring API costs. Mock mode first, live mode when ready.
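One way to make "mock mode first" the default is to require an explicit opt-in for live calls and to cap spend during test runs. The sketch below is an assumption-laden illustration: the `VIDEO_LIVE_MODE` environment variable, `resolve_mode`, `CostGuard`, and `BudgetExceeded` are hypothetical names, not part of the project.

```python
import os
from typing import Mapping, Optional


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the configured limit."""


def resolve_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "live" only on explicit opt-in; otherwise default to "mock"."""
    env = os.environ if env is None else env
    # The variable name is an assumption made for this sketch.
    return "live" if env.get("VIDEO_LIVE_MODE") == "1" else "mock"


class CostGuard:
    """Tracks spend and refuses charges that exceed the limit."""

    def __init__(self, limit_usd: float) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd: float) -> None:
        if self.spent_usd + amount_usd > self.limit_usd:
            raise BudgetExceeded(
                f"charge of {amount_usd} would exceed limit {self.limit_usd}"
            )
        self.spent_usd += amount_usd
```

A test suite could call `guard.charge(provider_estimate)` before each live request, so a runaway integration test fails fast instead of accumulating API costs.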
View By Timeline
| Timeline View → | Evolution Story → |