A 234-node autonomous n8n pipeline that researches, scripts, voices, animates, and publishes a daily AI-presenter investing video to YouTube and TikTok — entirely on self-hosted infrastructure.

The brief: a faceless video studio that runs itself.
The client wanted to operate a short-form investing channel with the cadence and polish of a full production team — a daily YouTube Short and TikTok built around a single, recognisable on-screen presenter — without employing writers, voice talent, editors, or an uploader. The presenter is an AI persona with a fixed voice and personality: fast-paced, contrarian, anti-establishment, the kind of host who stops the scroll in two seconds. Every video also had to do quiet commercial work, surfacing one of four ETFs when — and only when — the story genuinely connected to it.
We delivered this as a 234-node n8n workflow, the largest and most complex build in this portfolio, orchestrating roughly a dozen external AI and media services into one continuous, self-healing assembly line. Like our other autonomous systems, it is split into discrete webhook-chained stages so each can be tested, retried, and reasoned about independently — essential at this node count.
Stage 1 — Trend detection and story selection. The pipeline pulls fresh financial headlines via SerpApi's Google News endpoint, then an AI content strategist scores every candidate against five short-form criteria: timeliness, emotional hook, contrarian angle, broad appeal, and whether the core idea is explainable in under sixty seconds. It returns a primary pick and a backup, plus two distinct video angles: one scroll-stopping, one analytical.
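The selection step can be pictured as a simple ranking. The sketch below is illustrative only: it assumes the strategist returns a 0-10 score per criterion and that the five scores are summed with equal weight, neither of which is specified above. `Candidate`, `CRITERIA`, and `pick_primary_and_backup` are hypothetical names, not nodes from the workflow.

```python
from dataclasses import dataclass

# The five short-form criteria named above; the key spellings are our own.
CRITERIA = (
    "timeliness",
    "emotional_hook",
    "contrarian_angle",
    "broad_appeal",
    "explainable_in_60s",
)


@dataclass
class Candidate:
    headline: str
    scores: dict[str, int]  # assumed 0-10 per criterion

    def total(self) -> int:
        # Equal weighting is an assumption, not the production formula.
        return sum(self.scores[c] for c in CRITERIA)


def pick_primary_and_backup(candidates: list[Candidate]) -> tuple[Candidate, Candidate]:
    """Return the two highest-scoring candidates as (primary, backup)."""
    if len(candidates) < 2:
        raise ValueError("need at least two scored candidates")
    ranked = sorted(candidates, key=lambda c: c.total(), reverse=True)
    return ranked[0], ranked[1]
```

Keeping a backup alongside the primary pick means a later stage that cannot use the first story (for example, an article that fails to fetch) has somewhere to fall back to without re-running trend detection.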
Stage 2 — Fund-connection analysis. A second agent evaluates the chosen story against four ETF profiles, assigning exactly one "YES" (strongest link), optional "MID" connections, and "NO" to the rest — with a short pitch for how the winning fund could be woven in naturally. This gating is what keeps product mentions credible rather than spammy.
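A gate like this is only as reliable as the check on the agent's output. As a minimal sketch, assuming the agent returns one verdict string per fund, a validator along these lines would reject malformed responses before they reach the scriptwriter. The function name and the fund keys used in the usage note are hypothetical.

```python
ALLOWED = {"YES", "MID", "NO"}


def winning_fund(verdicts: dict[str, str], expected_funds: int = 4) -> str:
    """Validate the agent's verdicts and return the single 'YES' fund."""
    if len(verdicts) != expected_funds:
        raise ValueError(f"expected {expected_funds} funds, got {len(verdicts)}")
    unknown = set(verdicts.values()) - ALLOWED
    if unknown:
        raise ValueError(f"unexpected verdicts: {sorted(unknown)}")
    winners = [fund for fund, verdict in verdicts.items() if verdict == "YES"]
    if len(winners) != 1:
        # Zero or several "YES" verdicts both break the exactly-one rule.
        raise ValueError(f"expected exactly one YES, got {len(winners)}")
    return winners[0]
```

For example, `winning_fund({"FUND_A": "NO", "FUND_B": "YES", "FUND_C": "MID", "FUND_D": "NO"})` returns `"FUND_B"`, while a response with two "YES" verdicts raises instead of letting an ambiguous pick through.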
Stage 3 — Source reading and scripting. The article is fetched full-text via Jina Reader, stripped to clean body text by an extraction agent, then handed to the scriptwriting agent that embodies the host persona. That agent uses live Brave Search to pull one or two fresh supporting stats, then writes a 150–300-word teleprompter-ready script in a consistent voice, choosing one of three opening styles and closing on either a fund call-to-action or a channel-subscribe call-to-action.
Stage 4 — Expressive voice synthesis. Because flat narration kills retention, a dedicated agent tags the script with a controlled vocabulary of emotion, tone, and audio-effect markers for Fish Audio's S1 TTS engine — calibrated to tag 30–50% of sentences while preserving the text character-for-character. The result is voiceover with genuine cadence and emphasis.
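Both constraints on the tagging agent (text preserved character-for-character, 30–50% of sentences tagged) are mechanically checkable. The sketch below is one way to do that, under loud assumptions: markers are taken to be parenthesised words placed before the text they colour, the three marker names are placeholders rather than the real controlled vocabulary, and sentences are split naively on terminal punctuation.

```python
import re

# Placeholder vocabulary; the real controlled list is not given here.
MARKERS = ("excited", "serious", "whispering")
# Assumed syntax: "(marker)" followed by optional whitespace.
MARKER_RE = re.compile(r"\((?:" + "|".join(map(re.escape, MARKERS)) + r")\)\s*")


def strip_markers(tagged: str) -> str:
    """Remove every known marker, leaving the spoken text only."""
    return MARKER_RE.sub("", tagged)


def tagged_ratio(tagged: str) -> float:
    """Fraction of sentences that carry at least one marker."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", tagged.strip()) if s]
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if MARKER_RE.search(s))
    return hits / len(sentences)


def check_tagging(original: str, tagged: str, low: float = 0.30, high: float = 0.50) -> bool:
    """True if the text is unchanged and the tag density is within range."""
    return strip_markers(tagged) == original and low <= tagged_ratio(tagged) <= high
```

Only markers from the known vocabulary are stripped, so ordinary parentheses in the script survive the comparison and a rewritten word still causes the check to fail.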
Stage 5 — Segmentation and dynamic layout. A "scissors not a pen" segmentation agent splits the script into 4–6 balanced segments without altering a single word. For each segment, a layout specialist selects from seven Creatomate video templates (talking head, image triptych, Q&A card, search-bar visual, bold statement, and more) based on the segment's tone and position, then a generation agent produces the exact on-screen text and image prompts each template needs.
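The "scissors not a pen" rule lends itself to a direct check: the segments, rejoined, must reproduce the script word for word. A minimal sketch, assuming segments arrive as a list of strings and that differences in whitespace between words are acceptable; `is_pure_cut` and `balance` are hypothetical helper names, and the balance measure is our own illustration rather than the workflow's definition of "balanced".

```python
def is_pure_cut(
    script: str,
    segments: list[str],
    min_segments: int = 4,
    max_segments: int = 6,
) -> bool:
    """True if the segments are the script's words, in order, in 4-6 pieces."""
    if not (min_segments <= len(segments) <= max_segments):
        return False
    rejoined = [word for seg in segments for word in seg.split()]
    return rejoined == script.split()


def balance(segments: list[str]) -> float:
    """Shortest-to-longest word-count ratio; 1.0 means perfectly even."""
    counts = [len(seg.split()) for seg in segments]
    if not counts or max(counts) == 0:
        return 0.0
    return min(counts) / max(counts)
```

Comparing word lists rather than raw strings tolerates line breaks introduced at the cut points while still catching any added, dropped, or reordered word.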
Stage 6 — Visuals and AI avatar. Image prompts are rendered via Google Gemini image generation, while the presenter's segments are animated as a lip-synced talking avatar using Kling AI Avatar (via fal.run), driven by the synthesised voice track. Assets are assembled and stitched with ffmpeg, then stored and shared from Google Drive.
Stage 7 — SEO and multi-platform publishing. A metadata-optimisation agent runs live YouTube searches via SerpApi to engineer a sub-70-character title, a hook-first description, and five tiered keywords. The finished video is then published through the Late API from the host's connected accounts: uploaded immediately to YouTube and scheduled for TikTok.
Resilience by design. Google Sheets serves as the orchestration spine and status ledger across all stages, and Gmail "workflow failure" alerts with wait-for-response gates are wired throughout, so a failure in any single stage notifies the operator rather than silently breaking the chain. The entire system runs on self-hosted infrastructure, keeping a complex, multi-vendor media stack under the owner's control.
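The ledger-plus-alert pattern can be sketched in a few lines. This is a simplified illustration, not the workflow itself: an in-memory dict stands in for the Google Sheets ledger, a callback stands in for the Gmail alert, the wait-for-response gate is not modelled, and `run_chain` and the stage names in the usage note are hypothetical.

```python
from typing import Any, Callable

# A stage is a (name, function) pair; the function maps payload -> payload.
Stage = tuple[str, Callable[[Any], Any]]


def run_chain(
    stages: list[Stage],
    payload: Any,
    ledger: dict[str, str],
    notify: Callable[[str, str], None],
) -> Any:
    """Run stages in order, recording status; stop and alert on first failure."""
    for name, step in stages:
        ledger[name] = "running"
        try:
            payload = step(payload)
        except Exception as exc:
            ledger[name] = "failed"
            notify(name, str(exc))  # stand-in for the "workflow failure" email
            return None  # later stages never start
        ledger[name] = "done"
    return payload
```

If a "voice" stage raises in a three-stage chain, the ledger ends with the earlier stage marked "done" and "voice" marked "failed", the alert fires once, and the publishing stage is never entered, so a half-built video cannot reach the upload step.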