AI Podcast Generator is an end‑to‑end podcast generation pipeline that ingests curated learning materials (PDFs/MDX/JSON outlines), builds a two‑host conversational script, and produces an MP3 via TTS. This is an opinionated MVP: minimal surface area, clear defaults, no yak‑shaving.
What this is
- A clean, modular FastAPI backend that demonstrates the full flow: ingest → script builder → TTS → storage/API.
- Pluggable providers: default OpenAI for script generation and Google TTS for audio (easily swappable).
- A Vue 3 frontend (Vite, TypeScript, Pinia, Tailwind) to create podcasts, list/play episodes, filter, and delete.
- Sensible errors & debug routes to inspect state while developing.
What this isn’t
- Not a no‑code product. It expects prepared inputs and uses simple heuristics in model prompts.
- Not a generalized copyright crawler. It does not fetch from the web; it uses local materials you provide.
Quickstart (backend)
Requirements: Python 3.11+, poetry, and optionally an OpenAI API key.
- Install
poetry install
- Configure .env (example)
PORT=3000
BASE_URL=http://localhost:3000
OUTPUT_DIR=./storage/episodes
DATABASE_URL=postgresql+psycopg://user:pass@localhost:5432/podcast
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY= # optional
MATERIALS_DIRS=./materials/*,./materials/??
- Run
poetry run uvicorn app.api.v1.podcast:app --reload --port 3000
- Smoke test (create a podcast from local materials)
curl -X POST "http://localhost:3000/api/v1/podcasts" \
-H 'content-type: application/json' \
-d '{
"title": "Sample Episode",
"description": "Short intro",
"materials_glob": "materials/**/*.json",
"script_style": "two-host",
"voice": "en-US-Standard-A"
}'
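As a rough illustration of how the .env keys from the Configure step could be read at startup, here is a minimal sketch. The `Settings` class, the `load_settings` helper, and the fallback defaults are assumptions for this example (the defaults simply mirror the sample values above); the project's real config module may look different.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names mirror the .env keys."""
    port: int
    base_url: str
    output_dir: str
    database_url: Optional[str]
    openai_api_key: Optional[str]
    materials_dirs: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    port = int(env.get("PORT", "3000"))
    raw_dirs = env.get("MATERIALS_DIRS", "")
    return Settings(
        port=port,
        base_url=env.get("BASE_URL", f"http://localhost:{port}"),
        output_dir=env.get("OUTPUT_DIR", "./storage/episodes"),
        # Empty strings are treated the same as "not set".
        database_url=env.get("DATABASE_URL") or None,
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        # MATERIALS_DIRS is a comma-separated list of glob patterns.
        materials_dirs=[d.strip() for d in raw_dirs.split(",") if d.strip()],
    )
```

Passing the environment in as a mapping keeps the loader easy to exercise with a plain dict instead of the real process environment.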
Quickstart (frontend)
Frontend lives in frontend/ and is built with Vue 3, TypeScript, Pinia, and TailwindCSS.
- Install deps
pnpm install
- Configure .env
VITE_API_BASE=http://localhost:3000/api
- Run dev
pnpm dev
- Build
pnpm build
The app provides a UI for creating podcasts, previewing the list of episodes, playing audio, filtering, and deleting.
API (selected)
POST /v1/podcasts → create a background job
{
"title": "Episode title",
"description": "optional",
"materials_glob": "materials/**/*.json",
"voice": "en-US-Standard-A",
"script_style": "two-host"
}
GET /v1/podcasts/{id} → returns status, progress, and result (MP3 URL) once ready
GET /v1/podcasts?status=done → list finished items
DELETE /v1/podcasts/{id} → cancel or remove
Debug routes (optional): /api/debug/raw returns a raw state dump for troubleshooting.
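The create-then-poll flow behind these endpoints can be sketched with the standard library alone. This is an illustration under assumptions, not the project's client: the `id` and `status` response fields, the `failed` status value, and the helper names `http_json` and `create_and_wait` are hypothetical; only the paths, the `done` status, and the base URL come from this README.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

API_BASE = "http://localhost:3000/api"  # same value as VITE_API_BASE in the quickstart
TERMINAL_STATUSES = {"done", "failed"}  # "failed" is an assumed status name


def http_json(method: str, url: str, body: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Send a JSON request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def create_and_wait(
    payload: Dict[str, Any],
    request: Callable[..., Dict[str, Any]] = http_json,
    interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Create a podcast job, then poll it until it reaches a terminal status."""
    job = request("POST", f"{API_BASE}/v1/podcasts", payload)
    job_id = job["id"]  # assumed response field
    for _ in range(max_polls):
        state = request("GET", f"{API_BASE}/v1/podcasts/{job_id}", None)
        if state.get("status") in TERMINAL_STATUSES:
            return state
        sleep(interval)
    raise TimeoutError(f"podcast {job_id} did not finish after {max_polls} polls")
```

Calling `create_and_wait({"title": "Sample Episode", "materials_glob": "materials/**/*.json"})` against a running backend would block until the job settles. The `request` and `sleep` parameters are injectable so the loop can be tested without a server.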
How materials ingestion works
- The service scans the selected directory for content pillar JSON or raw items (e.g. content_outline.json).
- Materials can point to PDFs/MDX/TXT/JSON; the service builds a combined source text.
- A prompt builds a two‑host script (A/B roles), using OpenAI‑based generation when an API key is present and deterministic generation otherwise.
- Output is stored under OUTPUT_DIR and served at /media/<id>.mp3 (or via your storage adapter).
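The steps above can be sketched roughly as follows. This is a simplified illustration, not the service's actual code: the function names are hypothetical, PDF extraction is deliberately left out, and the deterministic builder simply alternates paragraphs between hosts A and B, which is one possible reading of "deterministic generation".

```python
import glob
import json
from pathlib import Path
from typing import Any, Iterator, List, Tuple

TEXT_SUFFIXES = {".txt", ".md", ".mdx"}


def collect_strings(node: Any) -> Iterator[str]:
    """Yield every string found in a nested JSON structure, in document order."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from collect_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_strings(item)


def load_material(path: Path) -> str:
    """Return the text of one material file; unsupported types yield an empty string."""
    suffix = path.suffix.lower()
    if suffix in TEXT_SUFFIXES:
        return path.read_text(encoding="utf-8")
    if suffix == ".json":
        data = json.loads(path.read_text(encoding="utf-8"))
        return "\n".join(collect_strings(data))
    return ""  # PDFs would need a separate extractor, omitted in this sketch


def build_source_text(materials_glob: str) -> str:
    """Combine all matching materials into one source text, in sorted path order."""
    paths = sorted(glob.glob(materials_glob, recursive=True))
    parts = [load_material(Path(p)).strip() for p in paths]
    return "\n\n".join(part for part in parts if part)


def deterministic_script(source_text: str, max_turns: int = 6) -> List[Tuple[str, str]]:
    """Alternate paragraphs between hosts A and B without calling a model."""
    paragraphs = [p.strip() for p in source_text.split("\n\n") if p.strip()]
    return [
        ("A" if i % 2 == 0 else "B", paragraph)
        for i, paragraph in enumerate(paragraphs[:max_turns])
    ]
```

A model-backed builder would take the same combined text and return the same list of (host, line) pairs, which keeps the two generation paths interchangeable.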
TTS
Default implementation uses Google TTS via gTTS (simple, no API key required; gTTS calls Google's online endpoint, so it needs network access). You can swap it for another engine in app/services/tts_service.py.
- Output files are stored in OUTPUT_DIR and served at /media/<id>.mp3.
- Swap voices by changing a label/URL; add SSML if your engine supports it.
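To make the "swappable engine" idea concrete, here is a minimal sketch of what a provider interface could look like. The `TTSEngine` protocol, the `GTTSEngine` wrapper, and `render_episode` are hypothetical names for this example and are not taken from app/services/tts_service.py; the only library call used is gTTS's `gTTS(text=..., lang=...).save(path)`.

```python
from pathlib import Path
from typing import Protocol


class TTSEngine(Protocol):
    """Anything that can turn text into an audio file at out_path."""

    def synthesize(self, text: str, out_path: Path) -> None: ...


class GTTSEngine:
    """Assumed default engine: a thin wrapper around the gTTS package."""

    def __init__(self, lang: str = "en") -> None:
        self.lang = lang

    def synthesize(self, text: str, out_path: Path) -> None:
        # Imported lazily so alternative engines do not require gTTS.
        from gtts import gTTS

        gTTS(text=text, lang=self.lang).save(str(out_path))


def render_episode(engine: TTSEngine, episode_id: str, script_text: str, output_dir: Path) -> str:
    """Write <output_dir>/<episode_id>.mp3 and return the path it is served at."""
    output_dir.mkdir(parents=True, exist_ok=True)
    engine.synthesize(script_text, output_dir / f"{episode_id}.mp3")
    return f"/media/{episode_id}.mp3"
```

Swapping engines then means passing a different object with a `synthesize` method, for example one that accepts SSML, while the storage and URL logic stay unchanged.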
Frontend features
- Create Podcast form.
- List View with progress indicator and integrated audio player per episode.
- Filtering by status (e.g. active/complete).
- Delete episodes.
- Automatic polling while a job is running.
Architecture (high level)
Materials (JSON/MDX/PDF) → Script builder (OpenAI/local) → TTS (Google TTS by default) → MP3 → Storage/API
↑
Vue 3 Frontend
Roadmap ideas
- Streaming TTS + advanced playback controls.
- Better error reporting and user notifications.
- Progress bars and richer status updates.
- Mobile‑friendly UI improvements.