Knowledge-to-Video Pipeline Specification
Overview
A multi-stage pipeline that transforms knowledge sources (papers, articles, datasets, notes) into video content through iterative refinement and adversarial synthesis. The knowledge base is the product; videos are views into it.
CORE PHILOSOPHY: "Knowledge Base → Multiple Productions"
┌─────────────────────────────────────────────────────────────────┐
│ KNOWLEDGE BASE │
│ Papers, articles, datasets, notes, connections, insights │
└─────────────────────────────────────────────────────────────────┘
│
┌─────────────────────┼─────────────────────┐
▼ ▼ ▼
Production 1 Production 2 Production 3
"Intro to X" "Deep dive Y" "X + Y synthesis"
(static tier) (broll tier) (full tier)
Production Tiers
v1 STATIC: Audio + single image $0.50 2 min
v2 TEXT: + text overlays (quotes) $1.00 5 min
v3 ANIMATED: + Ken Burns, motion graphics $2.00 10 min
v4 BROLL: + AI video (Luma) $5.00 15 min
v5 AVATAR: + virtual presenter $15.00 20 min
v6 FULL: All of the above $30.00 30 min
Part 1: Knowledge Base Model
Core Data Structures
KnowledgeProject
@dataclass
class KnowledgeProject:
"""A growing knowledge base that can spawn multiple videos"""
project_id: str
name: str
description: str
created_at: datetime
updated_at: datetime
# Sources (multiple documents, notes, links)
sources: Dict[str, KnowledgeSource]
# Unified knowledge graph across all sources
knowledge_graph: KnowledgeGraph
# User additions
notes: Dict[str, Note]
artifacts: Dict[str, Artifact] # Design docs, one-pagers created
connections: Dict[str, Connection] # Explicit links between concepts
# Generated outputs
productions: Dict[str, Production]
series: Dict[str, ProductionSeries] # Adversarial series
# Asset library (reusable across productions)
asset_library: Dict[str, GeneratedAsset]
# Memory namespace
memory_namespace: str
def get_source(self, source_id: str) -> KnowledgeSource: ...
def get_atoms(self, source_ids: List[str] = None) -> List[DocumentAtom]: ...
def get_figures(self, source_ids: List[str] = None) -> List[DocumentAtom]: ...
def search(self, query: str, limit: int = 10) -> List[DocumentAtom]: ...
KnowledgeSource
class SourceType(Enum):
PAPER = "paper" # Academic paper (PDF)
ARTICLE = "article" # News article (URL or text)
DATASET = "dataset" # CSV, JSON data
NOTE = "note" # Your observations
CODE_EXPERIMENT = "code" # Results from trying code
DESIGN_DOC = "design" # One-pagers, specs you created
TRANSCRIPT = "transcript" # Interview, podcast transcript
URL = "url" # Generic web content
@dataclass
class KnowledgeSource:
"""A single source of knowledge"""
source_id: str
source_type: SourceType
# Original content
title: str
raw_content: Optional[str]
file_path: Optional[str]
url: Optional[str]
# Extracted structure
document_graph: Optional[DocumentGraph] # For documents
data_summary: Optional[DataSummary] # For datasets
# Metadata
authors: List[str]
date: Optional[datetime]
tags: List[str]
# Processing state
added_at: datetime
processed_at: Optional[datetime]
extraction_method: str # "pandoc", "marker", "manual"
def get_atoms(self) -> List[DocumentAtom]: ...
def get_figures(self) -> List[DocumentAtom]: ...
def get_key_quotes(self) -> List[DocumentAtom]: ...
DocumentGraph (from Stage 0)
@dataclass
class DocumentAtom:
"""Smallest unit of extracted knowledge"""
atom_id: str
source_id: str # Which source this came from
atom_type: AtomType
# Content
content: str # Text content or description
raw_data: Optional[bytes] # For figures/tables: the actual image
# Location in source
source_page: Optional[int]
source_location: Optional[Rect]
# Semantic metadata
topics: List[str]
entities: List[str]
claims: List[str] # Factual claims made
# Importance
importance_score: float
# For figures/tables
caption: Optional[str]
figure_number: Optional[str]
data_summary: Optional[str] # LLM description of what it shows
class AtomType(Enum):
# Text atoms
TITLE = "title"
ABSTRACT = "abstract"
SECTION_HEADER = "section_header"
PARAGRAPH = "paragraph"
QUOTE = "quote"
CITATION = "citation"
CLAIM = "claim" # A specific factual claim
# Visual atoms
FIGURE = "figure"
CHART = "chart"
TABLE = "table"
EQUATION = "equation"
DIAGRAM = "diagram"
# Data atoms (from datasets)
DATA_COLUMN = "data_column"
DATA_ROW = "data_row"
DATA_INSIGHT = "data_insight" # LLM-generated insight from data
# Meta atoms
AUTHOR = "author"
DATE = "date"
KEYWORD = "keyword"
@dataclass
class KnowledgeGraph:
"""Unified graph across all sources"""
atoms: Dict[str, DocumentAtom]
# Relationships
hierarchy: Dict[str, List[str]] # Parent -> children
references: Dict[str, List[str]] # Atom -> atoms it references
contradicts: Dict[str, List[str]] # Atoms that contradict each other
supports: Dict[str, List[str]] # Atoms that support each other
same_topic: Dict[str, List[str]] # Atoms about same topic
# Cross-source connections
cross_source_links: List[CrossSourceLink]
# Summaries
topics: List[str] # All topics across sources
entities: List[str] # All entities
key_claims: List[str] # Most important claims
def get_atoms_by_topic(self, topic: str) -> List[DocumentAtom]: ...
def get_related_atoms(self, atom_id: str) -> List[DocumentAtom]: ...
def get_contradicting_atoms(self, atom_id: str) -> List[DocumentAtom]: ...
def search(self, query: str) -> List[DocumentAtom]: ...
@dataclass
class CrossSourceLink:
"""A relationship between atoms from different sources"""
link_id: str
source_atom_id: str
target_atom_id: str
relationship: str # "supports", "contradicts", "extends", "applies"
confidence: float
auto_detected: bool # True if LLM found it, False if user created
User Additions
@dataclass
class Note:
"""User's observation or insight"""
note_id: str
title: str
content: str
# What this note is about
related_sources: List[str]
related_atoms: List[str]
# Metadata
created_at: datetime
updated_at: datetime
tags: List[str]
@dataclass
class Artifact:
"""Something the user created (design doc, one-pager, etc.)"""
artifact_id: str
artifact_type: str # "design_doc", "one_pager", "code", "diagram"
title: str
# Content
content: str
file_path: Optional[str]
# What it's based on
source_ids: List[str]
atom_ids: List[str]
# Can be treated as a source itself
as_source: Optional[KnowledgeSource]
created_at: datetime
@dataclass
class Connection:
"""An explicit relationship the user has identified"""
connection_id: str
# What's connected
source_atoms: List[str] # Atom IDs (can be from different sources)
# The insight
relationship_type: str # "builds_on", "contradicts", "applies_to", "combines_with", "explains"
title: str
description: str
# Supporting reasoning
reasoning: str
evidence: List[str] # Atom IDs that support this connection
# Metadata
created_at: datetime
confidence: float # User's confidence in this connection
# Can become an atom itself for use in scripts
as_atom: Optional[DocumentAtom]
Dataset Support
@dataclass
class DataSummary:
"""Summary of a dataset source"""
source_id: str
# Structure
columns: List[ColumnInfo]
row_count: int
# Extracted insights
insights: List[DataInsight]
# Key statistics
statistics: Dict[str, Any]
# Queryable
def query(self, sql: str) -> pd.DataFrame: ...
def get_column(self, name: str) -> List[Any]: ...
@dataclass
class ColumnInfo:
name: str
dtype: str
description: str # LLM-generated
sample_values: List[Any]
statistics: Dict[str, Any] # min, max, mean, etc.
@dataclass
class DataInsight:
"""An insight extracted from data"""
insight_id: str
insight_type: str # "trend", "outlier", "correlation", "comparison"
description: str
# Evidence
columns_involved: List[str]
rows_involved: List[int]
statistical_support: Dict[str, Any]
# As an atom
as_atom: DocumentAtom
Part 2: Production Model
Production
@dataclass
class Production:
"""A video/podcast generated from the knowledge base"""
production_id: str
project_id: str
# What this production draws from
source_ids: List[str]
atom_ids: List[str] # Specific atoms to include
connection_ids: List[str] # Connections to highlight
note_ids: List[str] # User notes to incorporate
# The creative direction
production_prompt: str # "Explain how A and B connect"
target_audience: str # "general", "technical", "academic"
tone: str # "educational", "persuasive", "balanced"
# Target specs
target_duration: float
target_tier: ProductionTier
# Generated content
script: Optional[VideoScript]
assets: Dict[str, GeneratedAsset]
timeline: Optional[Timeline]
# Output
output_path: Optional[str]
# Versioning
version: int
previous_version_id: Optional[str]
# Cost tracking
generation_cost: float
generation_time: float
# State
status: str # "draft", "generating", "complete", "failed"
created_at: datetime
completed_at: Optional[datetime]
class ProductionTier(Enum):
STATIC = "static"
TEXT_OVERLAY = "text"
ANIMATED = "animated"
BROLL = "broll"
AVATAR = "avatar"
FULL = "full"
VideoScript
@dataclass
class VideoScript:
"""Complete script for video production"""
script_id: str
production_id: str
version: int
# Content
segments: List[ScriptSegment]
# Asset manifest
required_assets: List[AssetRequirement]
# Metadata
total_duration_estimate: float
figures_used: List[str]
sources_referenced: List[str]
# For iterative refinement
locked_segments: List[str] # Segment IDs that shouldn't change
@dataclass
class ScriptSegment:
"""A segment of the script"""
segment_id: str
segment_type: SegmentType
# Timing
start_time: Optional[float]
duration_estimate: float
# Dialogue/Narration
dialogue: str
speaker: str # "narrator", "host", "expert"
# Visual specification (tier-dependent)
visual: VisualSpec
# Source attribution
source_atoms: List[str] # Which atoms this draws from
source_quotes: List[str] # Direct quotes used
# For refinement
locked: bool = False
notes: Optional[str] = None
# Generated assets
tts_asset_id: Optional[str]
visual_asset_id: Optional[str]
class SegmentType(Enum):
INTRO = "intro"
SECTION_HEADER = "section_header"
EXPLANATION = "explanation"
FIGURE_CALLOUT = "figure_callout"
DATA_HIGHLIGHT = "data_highlight"
QUOTE = "quote"
COMPARISON = "comparison" # Comparing two sources/claims
CONNECTION = "connection" # Highlighting a user connection
TRANSITION = "transition"
RECAP = "recap"
OUTRO = "outro"
@dataclass
class VisualSpec:
"""What should be shown - tier-aware"""
# Tier 1: Static
static_image: Optional[str] = None
# Tier 2: Text overlay
text_overlay: Optional[TextOverlay] = None
# Tier 3: Animation
animation: Optional[AnimationSpec] = None
# Tier 4: B-roll
broll: Optional[BrollSpec] = None
# Tier 5: Avatar
avatar: Optional[AvatarSpec] = None
# Compositing
layout: str = "full"
def get_for_tier(self, tier: ProductionTier) -> 'VisualSpec':
"""Downgrade visual spec to match production tier"""
...
@dataclass
class TextOverlay:
text: str
style: str # "quote", "title", "bullet", "statistic"
position: str # "center", "lower_third", "full"
animation: str # "fade", "typewriter", "slide"
source_citation: Optional[str]
@dataclass
class AnimationSpec:
animation_type: str # "ken_burns", "parallax", "morph", "chart_build"
source_image: str # Atom ID
motion_params: Dict
@dataclass
class BrollSpec:
prompt: str
seed_image: Optional[str]
style: str
continuity_group: Optional[str]
duration: float
@dataclass
class AvatarSpec:
avatar_id: str
emotion: str
gesture: Optional[str]
position: str
Part 3: Adversarial Series Model
Overview
An adversarial series generates multiple productions from the same sources, exploring different perspectives, then uses LLM-as-judge to evaluate and synthesize a stronger final production.
Sources ──┬── Perspective A ("Pro") ────┐
│ │
├── Perspective B ("Con") ────┼── Judge ── Synthesis
│ │
└── Perspective C ("Skeptic")─┘
Data Structures
@dataclass
class ProductionSeries:
"""A set of related productions exploring a topic from multiple angles"""
series_id: str
project_id: str
name: str
# The shared foundation
source_ids: List[str]
base_prompt: str # The topic/question being explored
# Configuration
target_tier: ProductionTier
target_duration: float
# Perspectives to explore
perspectives: Dict[str, Perspective]
# Judgment rounds
judgments: List[Judgment]
# Synthesis
synthesis_production_id: Optional[str]
# State
current_round: int
status: str # "defining", "generating", "judging", "synthesizing", "complete"
created_at: datetime
completed_at: Optional[datetime]
@dataclass
class Perspective:
"""A specific angle/stance to argue"""
perspective_id: str
series_id: str
name: str # "Pro", "Con", "Skeptic", "Optimist", "Technical"
# Instructions for this perspective
stance: str # "Argue in favor of X"
framing: str # "From an economic perspective..."
# Guidance on source usage
atoms_to_emphasize: List[str] # Atoms to highlight
atoms_to_minimize: List[str] # Atoms to downplay (for adversarial)
required_atoms: List[str] # Must include these
# Constraints
tone: str # "confident", "cautious", "provocative"
target_audience: str
# Generated production
production_id: Optional[str]
production: Optional[Production]
# Evaluation
judgment_scores: Dict[str, float] # From judge rounds
created_at: datetime
@dataclass
class Judgment:
"""LLM evaluation of productions in a series"""
judgment_id: str
series_id: str
round: int
# What was evaluated
perspective_ids: List[str]
# Per-perspective evaluation
evaluations: Dict[str, PerspectiveEvaluation]
# Cross-perspective analysis
gaps: List[Gap] # What all perspectives missed
contradictions: List[Contradiction] # Conflicting claims
tensions: List[Tension] # Valid disagreements
# Synthesis recommendations
synthesis_recommendations: List[SynthesisRecommendation]
# Overall assessment
strongest_perspective: str
most_balanced_perspective: str
recommended_approach: str
judged_at: datetime
judge_model: str # Which LLM did the judging
@dataclass
class PerspectiveEvaluation:
"""Evaluation of a single perspective"""
perspective_id: str
# Scores
argument_strength: float # 0-100
evidence_usage: float # 0-100
logical_consistency: float # 0-100
source_fidelity: float # 0-100 (does it accurately represent sources?)
persuasiveness: float # 0-100
overall_score: float
# Detailed feedback
strengths: List[StrengthWeakness]
weaknesses: List[StrengthWeakness]
# Evidence analysis
well_used_evidence: List[str] # Atom IDs used effectively
misrepresented_evidence: List[str] # Atom IDs misrepresented
ignored_evidence: List[str] # Relevant atoms not used
# Logical analysis
sound_arguments: List[str]
logical_fallacies: List[str]
unsupported_claims: List[str]
@dataclass
class StrengthWeakness:
description: str
evidence: List[str] # Atom IDs supporting this assessment
severity: str # "minor", "moderate", "major"
segment_ids: List[str] # Which segments this applies to
@dataclass
class Gap:
"""Something all perspectives missed"""
description: str
relevant_atoms: List[str] # Evidence that should have been used
importance: str # "critical", "important", "minor"
suggested_incorporation: str # How to include in synthesis
@dataclass
class Contradiction:
"""Conflicting claims between perspectives"""
description: str
perspective_a: str
perspective_b: str
claim_a: str
claim_b: str
relevant_evidence: List[str] # Atoms that bear on this
resolution: str # Which is better supported, or both valid
@dataclass
class Tension:
"""A valid disagreement (not resolvable by evidence)"""
description: str
perspectives_involved: List[str]
nature: str # "values", "priorities", "interpretation"
synthesis_approach: str # How to handle in synthesis
@dataclass
class SynthesisRecommendation:
"""Recommendation for the synthesis production"""
recommendation: str
priority: str # "essential", "recommended", "optional"
atoms_to_incorporate: List[str]
perspectives_to_draw_from: List[str]
segment_type_suggested: str
Synthesis Production
@dataclass
class SynthesisSpec:
"""Specification for generating a synthesis production"""
series_id: str
# What to incorporate
from_judgments: List[str] # Judgment IDs to incorporate
# Structure guidance
structure: str # "balanced", "thesis_antithesis", "comprehensive"
# Content guidance
must_address_gaps: List[str] # Gap IDs that must be addressed
must_acknowledge_tensions: List[str] # Tension IDs to acknowledge
resolution_approach: str # "adjudicate", "present_both", "meta_analysis"
# Reuse from perspectives
segments_to_reuse: Dict[str, List[str]] # perspective_id -> segment_ids
assets_to_reuse: List[str]
# Target
target_tier: ProductionTier
target_duration: float
target_audience: str
Part 4: CLI Interface
Project Management
# Create knowledge base
claude-studio kb create "Project Name" [--description "..."]
# Add sources
claude-studio kb add <project> --paper paper.pdf [--title "..."]
claude-studio kb add <project> --article article.pdf
claude-studio kb add <project> --url "https://..."
claude-studio kb add <project> --dataset data.csv [--description "..."]
claude-studio kb add <project> --note "My observation..." [--title "..."]
claude-studio kb add <project> --artifact design_doc.md --type design_doc
# Explore knowledge base
claude-studio kb show <project>
claude-studio kb sources <project>
claude-studio kb atoms <project> [--source <source_id>] [--type figures]
claude-studio kb search <project> "query"
claude-studio kb graph <project> [--output graph.png]
# Add connections
claude-studio kb connect <project> \
--atoms "source_a:atom_1" "source_b:atom_2" \
--relationship "builds_on" \
--description "A's method could improve B's results" \
[--reasoning "Because..."]
# View connections
claude-studio kb connections <project>
Single Production
# Generate production from knowledge base
claude-studio kb produce <project> \
[--sources source_1 source_2] \
[--atoms atom_1 atom_2] \
[--connections conn_1] \
--prompt "Explain the key findings" \
--tier static \
[--duration 120] \
[--audience general]
# Upgrade production tier
claude-studio kb upgrade <project> --production <prod_id> \
--to-tier broll \
[--keep-audio] \
[--regenerate-segments 3,4,5]
# Refine specific segments
claude-studio kb regen <project> --production <prod_id> \
--segment 4 \
[--prompt "Make this more technical"] \
[--provider runway]
# Lock good segments
claude-studio kb lock <project> --production <prod_id> --segments 1,2,6
# Preview
claude-studio kb preview <project> --production <prod_id> [--segment 3]
Adversarial Series
# Create series
claude-studio kb series create <project> \
--name "Series Name" \
--prompt "What is the impact of X?" \
--sources source_1 source_2 source_3 \
[--tier static] \
[--duration 120]
# Add perspectives
claude-studio kb series perspective <series> \
--name "Pro" \
--stance "Argue that X is beneficial" \
[--emphasize atom_1 atom_2] \
[--tone confident]
claude-studio kb series perspective <series> \
--name "Con" \
--stance "Argue that X has significant costs" \
[--emphasize atom_3 atom_4] \
[--tone cautious]
claude-studio kb series perspective <series> \
--name "Skeptic" \
--stance "Question the methodology and conclusions" \
[--tone provocative]
# Generate perspective productions
claude-studio kb series generate <series> [--perspective Pro]
# Run judge
claude-studio kb series judge <series> [--round 1]
# View judgment
claude-studio kb series judgment <series> [--round 1]
# Generate synthesis
claude-studio kb series synthesize <series> \
[--structure balanced] \
[--tier broll]
# Iterate
claude-studio kb series perspective <series> \
--name "Devil's Advocate" \
--stance "Challenge the synthesis"
claude-studio kb series judge <series> --round 2
claude-studio kb series synthesize <series> --round 2
# Export series report
claude-studio kb series report <series> --output report.md
Inspection & Analysis
# Show production details
claude-studio kb production show <project> <prod_id>
# Show script
claude-studio kb production script <project> <prod_id>
# Show assets
claude-studio kb production assets <project> <prod_id>
# Compare productions
claude-studio kb compare <project> <prod_id_1> <prod_id_2>
# Cost analysis
claude-studio kb cost <project> [--production <prod_id>] [--series <series_id>]
# Export knowledge base
claude-studio kb export <project> --output project_export.json
# Import/merge knowledge bases
claude-studio kb import <project> --from other_project_export.json
Part 5: Execution Model
Phase-Based Execution
@dataclass
class ExecutionPlan:
"""Plan for generating a production"""
production_id: str
phases: List[ExecutionPhase]
@dataclass
class ExecutionPhase:
phase_id: str
phase_type: str # "extraction", "script", "audio", "video", "assembly"
groups: List[TaskGroup]
depends_on: List[str] # Phase IDs
skip_if: Optional[Callable] # Condition to skip
estimated_duration: float
estimated_cost: float
@dataclass
class TaskGroup:
group_id: str
tasks: List[str] # Task/scene IDs
mode: ExecutionMode # PARALLEL or SEQUENTIAL
provider: str
# For sequential (chaining)
chain_type: Optional[str] # "luma_extend", "keyframe"
# Example execution plan for BROLL tier production:
plan = ExecutionPlan(
production_id="prod_123",
phases=[
# Phase 1: Script (if not cached)
ExecutionPhase(
phase_id="script",
phase_type="script",
groups=[
TaskGroup(
tasks=["generate_script"],
mode=SEQUENTIAL,
provider="llm",
),
],
),
# Phase 2: Audio generation
ExecutionPhase(
phase_id="audio",
phase_type="audio",
groups=[
TaskGroup(
tasks=["tts_seg_1", "tts_seg_2", "tts_seg_3", ...],
mode=PARALLEL,
provider="elevenlabs",
),
],
depends_on=["script"],
),
# Phase 3: Video generation (parallel groups, sequential within chains)
ExecutionPhase(
phase_id="video",
phase_type="video",
groups=[
# Ken Burns on figures
TaskGroup(
group_id="figures",
tasks=["ken_burns_fig1", "ken_burns_fig2"],
mode=PARALLEL,
provider="ffmpeg",
),
# Luma B-roll chain A
TaskGroup(
group_id="broll_chain_a",
tasks=["broll_1", "broll_3"],
mode=SEQUENTIAL,
provider="luma",
chain_type="luma_extend",
),
# Luma B-roll chain B
TaskGroup(
group_id="broll_chain_b",
tasks=["broll_2"],
mode=PARALLEL,
provider="luma",
),
],
depends_on=["script"],
),
# Phase 4: Assembly
ExecutionPhase(
phase_id="assembly",
phase_type="assembly",
groups=[
TaskGroup(
tasks=["build_timeline", "render"],
mode=SEQUENTIAL,
provider="ffmpeg",
),
],
depends_on=["audio", "video"],
),
],
)
Adversarial Series Execution
# Series execution plan:
series_plan = SeriesExecutionPlan(
series_id="series_456",
stages=[
# Stage 1: Generate all perspectives (parallel)
SeriesStage(
stage_type="generate_perspectives",
perspective_ids=["pro", "con", "skeptic"],
parallel=True,
),
# Stage 2: Judge
SeriesStage(
stage_type="judge",
round=1,
depends_on=["generate_perspectives"],
),
# Stage 3: Synthesize
SeriesStage(
stage_type="synthesize",
round=1,
depends_on=["judge"],
),
# Optional: More rounds
# Stage 4: New perspective based on synthesis
# Stage 5: Judge round 2
# Stage 6: Synthesize round 2
],
)
Part 6: Memory Integration
Namespace Structure
/org/{org}/kb/{project}/
├── config.json # Project configuration
├── sources/
│ ├── {source_id}/
│ │ ├── metadata.json
│ │ ├── raw_content.txt
│ │ ├── document_graph.json
│ │ └── figures/
│ │ ├── figure_1.png
│ │ └── ...
│ └── ...
├── knowledge_graph.json # Unified graph
├── user/
│ ├── notes/
│ │ └── {note_id}.json
│ ├── artifacts/
│ │ └── {artifact_id}.json
│ └── connections/
│ └── {connection_id}.json
├── productions/
│ └── {production_id}/
│ ├── config.json
│ ├── script.json
│ ├── assets/
│ │ ├── tts_seg_1.mp3
│ │ └── ...
│ ├── timeline.json
│ └── output/
│ └── final.mp4
├── series/
│ └── {series_id}/
│ ├── config.json
│ ├── perspectives/
│ │ ├── pro.json
│ │ └── con.json
│ ├── judgments/
│ │ ├── round_1.json
│ │ └── round_2.json
│ └── synthesis/
│ └── round_1.json
└── learnings/
├── source_insights.json # What we learned about sources
├── production_patterns.json # What production approaches worked
└── series_learnings.json # Adversarial insights
Cross-Project Learning
/org/{org}/learnings/
├── provider/{provider}/ # Provider-specific learnings
├── format/{format}/ # Format-specific (podcast, explainer)
├── topic/{topic}/ # Topic-specific patterns
└── adversarial/ # Adversarial synthesis patterns
├── effective_perspectives.json
├── judge_patterns.json
└── synthesis_strategies.json
Part 7: Example Workflows
Workflow 1: Single Paper Explainer
# Create project
claude-studio kb create "CRISPR Cancer Treatment"
# Add paper
claude-studio kb add crispr --paper crispr_paper.pdf
# Quick static production
claude-studio kb produce crispr \
--prompt "Explain this paper's key findings for a general audience" \
--tier static \
--duration 180
# Review, then upgrade
claude-studio kb upgrade crispr --production prod_001 \
--to-tier broll \
--keep-audio
Workflow 2: Multi-Source Analysis
# Create project
claude-studio kb create "AI Regulation Analysis"
# Add diverse sources
claude-studio kb add ai_reg --paper eu_ai_act.pdf
claude-studio kb add ai_reg --article nyt_coverage.txt
claude-studio kb add ai_reg --url "https://techcrunch.com/..."
claude-studio kb add ai_reg --dataset enforcement_actions.csv
# Add your analysis
claude-studio kb add ai_reg --note "The EU approach differs from US in these ways..."
# Connect insights across sources
claude-studio kb connect ai_reg \
--atoms "eu_ai_act:article_5" "enforcement:row_47" \
--relationship "contradicts" \
--description "The act requires X but enforcement data shows Y"
# Generate synthesis video
claude-studio kb produce ai_reg \
--prompt "Analyze the gap between AI regulation intent and enforcement reality" \
--tier animated
Workflow 3: Adversarial Policy Analysis
# Create project with news + data
claude-studio kb create "Immigration Economics"
claude-studio kb add immig --url "https://nytimes.com/..."
claude-studio kb add immig --url "https://wsj.com/..."
claude-studio kb add immig --dataset census_data.csv
claude-studio kb add immig --dataset border_stats.csv
# Create adversarial series
claude-studio kb series create immig \
--name "Immigration Economic Impact" \
--prompt "What is the economic impact of immigration?"
# Define perspectives
claude-studio kb series perspective immig_series \
--name "Pro-Immigration" \
--stance "Argue immigration is economically beneficial"
claude-studio kb series perspective immig_series \
--name "Immigration-Skeptic" \
--stance "Argue immigration has significant economic costs"
# Generate and judge
claude-studio kb series generate immig_series
claude-studio kb series judge immig_series
# Review judgment, then synthesize
claude-studio kb series judgment immig_series
claude-studio kb series synthesize immig_series --tier broll
Workflow 4: Iterative Research Project
# Start with one paper
claude-studio kb create "Transformer Architectures"
claude-studio kb add transformers --paper attention_is_all_you_need.pdf
# Initial video
claude-studio kb produce transformers \
--prompt "Explain the transformer architecture" \
--tier static
# Later: Add related paper
claude-studio kb add transformers --paper bert.pdf
# Connect them
claude-studio kb connect transformers \
--atoms "attention:encoder" "bert:bidirectional" \
--relationship "extends"
# New video about the connection
claude-studio kb produce transformers \
--sources attention bert \
--connections conn_001 \
--prompt "How BERT built on and extended the original transformer"
# Even later: Add your implementation notes
claude-studio kb add transformers \
--note "Implemented attention from scratch, key insight was..." \
--artifact my_implementation.py --type code
# Video incorporating your experience
claude-studio kb produce transformers \
--prompt "Practical lessons from implementing transformers" \
--notes note_001
Part 8: Future Extensions
Multi-Language Support
# Generate in multiple languages
production = kb.produce(
prompt="...",
languages=["en", "es", "zh"], # Parallel TTS in each
)
Collaborative Knowledge Bases
# Share knowledge base
kb.share(users=["collaborator@example.com"], permission="edit")
# Merge another user's additions
kb.merge(from_user="collaborator", changes=["note_005", "conn_003"])
Live Source Updates
# Monitor sources for updates
kb.watch(source_id="news_feed", check_interval="1h")
# Auto-regenerate when sources change
production.auto_update = True
Interactive Exploration
# Chat with knowledge base
claude-studio kb chat <project>
> What does paper A say about X?
> How does that compare to the dataset?
> Generate a video segment explaining this
Summary
This spec enables:
- Knowledge-first approach - Build understanding, videos follow
- Multi-source synthesis - Papers, articles, data, notes
- User insights as first-class - Your connections and notes matter
- Adversarial reasoning - Explore multiple perspectives, stronger synthesis
- Iterative refinement - Cheap drafts → polished productions
- Asset reuse - TTS, B-roll cached across productions
- Learning over time - Patterns improve future productions
The knowledge base is the product. Videos are just one way to view it.