A deployable AI agent for E2E Cloud customers

Persona-driven document briefings — delivered as audio, on your GPUs.

Upload any PDF. Pick an angle — or write your own. ListenIQ produces a structured brief and a two-host podcast tailored to exactly what you care about. Built on E2E Cloud's TIR Platform, powered by Llama-3.1-405B and a six-engine TTS fleet.

Sales teams briefing on long customer reports Research leaders skimming the field at 2× speed AI teams shipping agentic patterns into production

What you get

One system, three audiences.

ListenIQ is what a serious AI agent looks like in production. It demonstrates the orchestration patterns your AI team will eventually build, packaged as a product your business can use today.

Showcases what TIR Platform can build

Six TTS engines. Indic voice support through AI4Bharat Parler TTS. Llama-3.1-405B-AWQ orchestrating extraction, persona lensing, and dialogue authoring in parallel. Not a demo — a production pattern your AI team can fork and rebrand.

Hours of reading → minutes of listening

Upload a PDF. Pick an angle — CXO, sales, engineer, marketer, investor — or write your own. Come back to a structured brief and a two-host podcast you can play while you drive to the next meeting.

One document. Every angle.

Same source PDF, a dozen briefs. A single research paper turns into five different briefings for five different audiences — without re-uploading, without re-prompting from scratch.

How it works

Five-stage pipeline. Parallel where it matters.

Every upload runs through the same pipeline. Insight extraction fans out across chunks concurrently — vLLM batches them server-side. Brief writing and dialogue authoring use the strongest model for the job.

[1] PDF EXTRACTION            pdfplumber · OCR fallback · page sampling per length tier
       ↓
[2] INSIGHT EXTRACTION        Llama-3.1-405B-AWQ × N parallel chunks (32k context, 80GB A100)
       ↓
[3] AUTO-TAGGING              Llama-3.3-70B-Instruct (fast, cheap, accurate)
       ↓
[4] PERSONA-SHAPED BRIEF      Llama-3.1-405B-AWQ — user's angle as system prompt
       ↓               ↓
   [BRIEF PDF]     [5] TWO-HOST DIALOGUE   Llama-3.1-405B-AWQ writes natural conversation
                            ↓
                    [6] TTS SYNTHESIS      VibeVoice (default) · Dia · Kokoro · XTTS · F5-TTS · Parler
                            ↓
                    [OUTPUT]               MP3 + transcript + brief PDF, all in E2E Object Storage (S3-compatible storage)

Pipeline stage

Stack note

Reference deployment

Build/Deploy your own ListenIQ agent using TIR.

ListenIQ deployed as three tiers across E2E TIR Platform. Pre-configured, benchmarked at 423-page documents end-to-end. Customize per workload — scale horizontally for parallelism, vertically for context length.

LLM Tier

Brief / Insights	Llama-3.1-405B-AWQ 1× NVIDIA A100 80GB · 32k context
Tagging / Fallback	Llama-3.3-70B-Instruct 4× NVIDIA A40
Vision (optional)	Qwen2-VL-7B 1× NVIDIA L4 · for chart extraction

TTS Tier

Default engine	VibeVoice 1.5B Long-form multi-speaker
Alternates	Dia · Kokoro · XTTS · F5-TTS · Parler 21 languages, voice cloning, Indic
Hardware	4× NVIDIA V100 32GB All six engines co-resident

Application Tier

API + Worker	FastAPI · Postgres · Python 3.10 1× 4-vCPU / 8 GB RAM VM
Storage	E2E Object Storage PDFs · briefs · audio · transcripts
Edge	Nginx · Let's Encrypt SSL Google SSO + admin approval

Operational

Queue model	Always-on worker, batch-friendly Schedule larger jobs for off-peak GPU
Single-doc range	10 – 500+ pages Tested end-to-end at 423 pages
Re-render audio	3–5 min per voice swap Reuses existing brief and script

briefs/hour per VM

~20

briefs in 8-hour overnight batch

33 min

end-to-end on a 423-page report

Built-in personas

Ten angles out of the box. Plus your own.

Each preset is a starting prompt your users can edit before submitting. Pick the closest match, refine the wording, queue the brief. The same source PDF can be briefed from any angle, on demand.

All personas live in a Postgres table — add, edit, or remove via SQL. Or skip the presets entirely with the Custom… option.

On the roadmap

Where ListenIQ is going.

Built around an extensible pipeline. New input sources, new output channels, new model integrations — drop them in without restructuring.

⌗

REST API

Programmatic upload from enterprise document pipelines.

▤

SharePoint / Drive / Confluence

Auto-brief documents as they're added to shared workspaces.

Slack & Teams bots

Drop a PDF in a channel, get a brief and audio back in DM.

⊕

Private RSS feeds

Subscribe a team to a personalized briefing stream.

⌘

Multi-language briefs

Hindi, Tamil, Telugu, Bengali — leveraging Indic Parler-TTS.

⊞

Vision-aware extraction

Chart and table understanding via Qwen2-VL on a dedicated L4.