Upload any PDF. Pick an angle — or write your own. ListenIQ produces a structured brief and a two-host podcast tailored to exactly what you care about. Built on E2E Cloud's TIR Platform, powered by Llama-3.1-405B and a six-engine TTS fleet.
ListenIQ is what a serious AI agent looks like in production. It demonstrates the orchestration patterns your AI team will eventually build, packaged as a product your business can use today.
Six TTS engines. Indic voice support through AI4Bharat Parler TTS. Llama-3.1-405B-AWQ orchestrating extraction, persona lensing, and dialogue authoring in parallel. Not a demo — a production pattern your AI team can fork and rebrand.
Upload a PDF. Pick an angle — CXO, sales, engineer, marketer, investor — or write your own. Come back to a structured brief and a two-host podcast you can play while you drive to the next meeting.
One source PDF, many briefs. A single research paper can become five different briefings for five different audiences, without re-uploading and without re-prompting from scratch.
Every upload runs through the same pipeline. Insight extraction fans out across chunks concurrently — vLLM batches them server-side. Brief writing and dialogue authoring use the strongest model for the job.
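The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not ListenIQ's actual worker code: `extract_insights` is a hypothetical stub standing in for a request to the model server, and the `max_in_flight` cap is an assumed parameter. The client only issues the requests concurrently; batching happens on the server.

```python
import asyncio


async def extract_insights(chunk: str) -> str:
    """Hypothetical stand-in for one request to the model server."""
    await asyncio.sleep(0.01)  # simulates network and inference latency
    return f"insight({len(chunk)} chars)"


async def fan_out(chunks: list[str], max_in_flight: int = 8) -> list[str]:
    """Issue one request per chunk concurrently, capping open requests."""
    gate = asyncio.Semaphore(max_in_flight)  # assumed limit, tune per deployment

    async def guarded(chunk: str) -> str:
        async with gate:
            return await extract_insights(chunk)

    # gather preserves input order, so results line up with their chunks.
    return await asyncio.gather(*(guarded(c) for c in chunks))


def run_fan_out(chunks: list[str]) -> list[str]:
    """Synchronous entry point for callers outside an event loop."""
    return asyncio.run(fan_out(chunks))
```

Calling `run_fan_out(chunks)` returns one result per chunk in the original order, which keeps later stages simple because they never need to re-sort.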
[1] PDF EXTRACTION: pdfplumber · OCR fallback · page sampling per length tier
 ↓
[2] INSIGHT EXTRACTION: Llama-3.1-405B-AWQ × N parallel chunks (32k context, 80GB A100)
 ↓
[3] AUTO-TAGGING: Llama-3.3-70B-Instruct (fast, cheap, accurate)
 ↓
[4] PERSONA-SHAPED BRIEF: Llama-3.1-405B-AWQ, user's angle as system prompt
 ├→ [BRIEF PDF]
 ↓
[5] TWO-HOST DIALOGUE: Llama-3.1-405B-AWQ writes natural conversation
 ↓
[6] TTS SYNTHESIS: VibeVoice (default) · Dia · Kokoro · XTTS · F5-TTS · Parler
 ↓
[OUTPUT] MP3 + transcript + brief PDF, all in E2E Object Storage (S3-compatible)
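As a rough mental model, the six stages compose as a straight line with one branch at the brief. The sketch below is an assumption-laden illustration: every function is a hypothetical stub (no real PDF parsing, model calls, or audio synthesis), and only the stage order and the data handed between stages reflect the pipeline described here.

```python
from dataclasses import dataclass


@dataclass
class Outputs:
    brief: str        # persona-shaped brief (rendered to PDF in the real system)
    transcript: str   # two-host dialogue script
    audio: bytes      # synthesized audio


def extract_text(pdf_bytes: bytes) -> str:            # [1] stub for PDF extraction
    return pdf_bytes.decode("utf-8", errors="ignore")


def split_chunks(text: str, size: int = 2000) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]


def extract_insight(chunk: str) -> str:                # [2] stub for the LLM call
    return chunk[:40].strip()


def auto_tag(insights: list[str]) -> list[str]:        # [3] stub for tagging
    return ["untagged"] if insights else []


def write_brief(insights: list[str], tags: list[str], angle: str) -> str:  # [4]
    return f"[{angle}] tags={','.join(tags)} :: " + " ".join(insights)


def write_dialogue(brief: str) -> str:                 # [5] stub for dialogue authoring
    return f"HOST A: {brief}\nHOST B: Tell me more."


def synthesize(script: str, engine: str) -> bytes:     # [6] stub for TTS
    return f"{engine}:{len(script)}".encode()


def run_pipeline(pdf_bytes: bytes, angle: str, engine: str = "VibeVoice") -> Outputs:
    text = extract_text(pdf_bytes)
    insights = [extract_insight(c) for c in split_chunks(text)]
    tags = auto_tag(insights)
    brief = write_brief(insights, tags, angle)   # branch point: brief is also an output
    script = write_dialogue(brief)
    audio = synthesize(script, engine)
    return Outputs(brief=brief, transcript=script, audio=audio)
```

Because the brief and the script are kept as separate outputs, a later voice swap only needs to repeat the final `synthesize` step.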
ListenIQ is deployed as three tiers on the E2E TIR Platform. Each tier comes pre-configured, and the stack has been tested end-to-end on a 423-page document. Customize per workload: scale horizontally for parallelism, vertically for context length.
| Brief / Insights | Llama-3.1-405B-AWQ | 1× NVIDIA A100 80GB · 32k context |
| Tagging / Fallback | Llama-3.3-70B-Instruct | 4× NVIDIA A40 |
| Vision (optional) | Qwen2-VL-7B | 1× NVIDIA L4 · for chart extraction |
| Default engine | VibeVoice 1.5B | Long-form multi-speaker |
| Alternates | Dia · Kokoro · XTTS · F5-TTS · Parler | 21 languages, voice cloning, Indic |
| Hardware | 4× NVIDIA V100 32GB | All six engines co-resident |
| API + Worker | FastAPI · Postgres · Python 3.10 | 1× 4-vCPU / 8 GB RAM VM |
| Storage | E2E Object Storage | PDFs · briefs · audio · transcripts |
| Edge | Nginx · Let's Encrypt SSL | Google SSO + admin approval |
| Queue model | Always-on worker, batch-friendly | Schedule larger jobs for off-peak GPU |
| Single-doc range | 10 – 500+ pages | Tested end-to-end at 423 pages |
| Re-render audio | 3–5 min per voice swap | Reuses existing brief and script |
Each preset is a starting prompt your users can edit before submitting. Pick the closest match, refine the wording, queue the brief. The same source PDF can be briefed from any angle, on demand.
All personas live in a Postgres table — add, edit, or remove via SQL.
Or skip the presets entirely with the Custom… option.
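Managing personas through SQL might look something like the sketch below. The table name `personas` and its columns (`slug`, `label`, `prompt`) are hypothetical, since the actual schema is not documented here. An in-memory SQLite database stands in for Postgres so the example runs anywhere.

```python
import sqlite3

# In-memory SQLite as a stand-in for the production Postgres database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE personas (      -- hypothetical schema
        slug   TEXT PRIMARY KEY, -- short identifier, e.g. 'cxo'
        label  TEXT NOT NULL,    -- name shown in the picker
        prompt TEXT NOT NULL     -- starting prompt the user can edit
    )
    """
)


def add_persona(slug: str, label: str, prompt: str) -> None:
    conn.execute(
        "INSERT INTO personas (slug, label, prompt) VALUES (?, ?, ?)",
        (slug, label, prompt),
    )


def edit_prompt(slug: str, prompt: str) -> None:
    conn.execute("UPDATE personas SET prompt = ? WHERE slug = ?", (prompt, slug))


def remove_persona(slug: str) -> None:
    conn.execute("DELETE FROM personas WHERE slug = ?", (slug,))


def list_slugs() -> list[str]:
    return [row[0] for row in conn.execute("SELECT slug FROM personas ORDER BY slug")]


add_persona("cxo", "CXO", "Summarize strategic risk and cost impact.")
add_persona("engineer", "Engineer", "Focus on architecture and trade-offs.")
edit_prompt("cxo", "Lead with the decision the board needs to make.")
remove_persona("engineer")
```

The same three statements (INSERT, UPDATE, DELETE) carry over to Postgres. With a driver such as psycopg, the placeholder style changes from `?` to `%s`.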
Built around an extensible pipeline. New input sources, new output channels, new model integrations — drop them in without restructuring.
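One common way to get this kind of drop-in extensibility is a small registry keyed by name. The sketch below shows the idea for TTS engines. It is an illustrative pattern only, not ListenIQ's internal design, and the names `register_engine`, `TTS_ENGINES`, and the stub engines are all hypothetical.

```python
from typing import Callable, Dict

# Maps an engine name to a function that turns a script into audio bytes.
TTS_ENGINES: Dict[str, Callable[[str], bytes]] = {}


def register_engine(name: str):
    """Decorator that adds a synthesis function to the registry."""
    def decorator(fn: Callable[[str], bytes]) -> Callable[[str], bytes]:
        TTS_ENGINES[name] = fn
        return fn
    return decorator


@register_engine("vibevoice")
def vibevoice_stub(script: str) -> bytes:
    # Placeholder: a real engine would return encoded audio.
    return b"VV:" + script.encode()


@register_engine("kokoro")
def kokoro_stub(script: str) -> bytes:
    return b"KO:" + script.encode()


def synthesize(script: str, engine: str = "vibevoice") -> bytes:
    """Look up the requested engine and run it, failing loudly on typos."""
    try:
        fn = TTS_ENGINES[engine]
    except KeyError:
        known = ", ".join(sorted(TTS_ENGINES))
        raise ValueError(f"unknown engine {engine!r}; known: {known}") from None
    return fn(script)
```

Adding a new engine then means writing one decorated function, and the calling code stays unchanged.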
Access is via Google SSO. New accounts require admin approval before first use — typically same-business-day during the PoC phase.