AI Cloud Agent System
Autonomous agents handling interview coordination, reader engagement, scheduling, and live site chat — running a defense-in-depth guardrail stack
The Problem
Job searching means dozens of conversations happening at once — recruiters, hiring managers, scheduling coordinators. Every email needs a thoughtful, timely response. But I was also building, writing, and running operations full-time.
I needed a system that could represent me accurately when I wasn’t available — not a chatbot, but an agent that actually knew my work and could hold a real conversation.
The Solution
Three specialized email agents, each handling a different domain, plus a fourth agent for live chat on the site:
- Interview Agent ([email protected]): Answers questions about my experience, technical skills, and approach to operations. Maintains a comprehensive knowledge base of my professional history and can discuss specific projects in detail.
- Substack Agent ([email protected]): Engages with readers of my writing. Understands the voice and themes of published posts and can have genuine conversations about the ideas.
- Calendar Agent ([email protected]): Coordinates meeting scheduling with live Google Calendar integration. Checks real-time availability and proposes specific times.
- Site Chat Agent: The live chat on this website. Same shared identity core, same guardrails, answering in seconds for anyone evaluating the work. You can try it right now.
How It Works
Inbound Email → Cloudflare Email Routing → Worker
→ Vercel Serverless Function → Claude API → Resend → Reply
Conversation Memory: Each sender gets a persistent conversation thread stored in Redis — up to 20 message pairs, identities hashed, with a tight 3-day retention window because conversation content is PII and nothing should accumulate by default.
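The retention rules can be sketched roughly as follows. This is a minimal in-memory illustration, not the deployed code: the names `MessagePair`, `appendPair`, and `isExpired` are hypothetical, sender hashing is left out, and it assumes the 3-day window restarts on each write. With Redis the same effect would normally come from setting an expiry on the key.

```typescript
// Hypothetical shape of one exchange: the sender's message and the agent's reply.
interface MessagePair {
  inbound: string;
  reply: string;
}

const MAX_PAIRS = 20; // cap from the text: up to 20 message pairs per sender
const RETENTION_SECONDS = 3 * 24 * 60 * 60; // 3-day retention window

// Append a pair and keep only the most recent maxPairs entries.
// Returns a new array so the caller decides when to persist it.
function appendPair(
  thread: MessagePair[],
  pair: MessagePair,
  maxPairs: number = MAX_PAIRS
): MessagePair[] {
  const next = [...thread, pair];
  return next.length > maxPairs ? next.slice(next.length - maxPairs) : next;
}

// Assumption: a thread counts as expired once its last write is older
// than the retention window.
function isExpired(lastWriteEpochSeconds: number, nowEpochSeconds: number): boolean {
  return nowEpochSeconds - lastWriteEpochSeconds > RETENTION_SECONDS;
}
```

Trimming on write keeps the prompt size bounded, and the expiry means an idle thread disappears without any cleanup job.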
Smart Escalation: The system monitors conversations for signals that need human attention — job offers, salary discussions, complex scheduling, or sensitive topics. When detected, it sends priority notifications with full conversation context so I can step in seamlessly.
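One simple way to detect such signals is pattern matching on the inbound text. The sketch below is an assumption about how that could look, not a description of the deployed detector: the signal names and every pattern in `SIGNAL_PATTERNS` are hypothetical, and a real system might rely on the model itself or on a tuned list.

```typescript
type EscalationSignal = "job_offer" | "salary" | "complex_scheduling" | "sensitive";

// Hypothetical patterns for illustration only.
const SIGNAL_PATTERNS: Array<[EscalationSignal, RegExp]> = [
  ["job_offer", /\b(offer letter|job offer|extend(ing)? an offer)\b/i],
  ["salary", /\b(salary|compensation|equity|base pay)\b/i],
  ["complex_scheduling", /\b(reschedul\w*|time ?zones?|panel interview|on-?site)\b/i],
  ["sensitive", /\b(legal|visa|background check|confidential)\b/i],
];

// Returns every signal whose pattern matches, in the order listed above.
// An empty array means the agent can keep handling the thread on its own.
function detectSignals(text: string): EscalationSignal[] {
  return SIGNAL_PATTERNS
    .filter(([, pattern]) => pattern.test(text))
    .map(([signal]) => signal);
}
```

A non-empty result would trigger the priority notification, with the stored conversation attached as context.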
Calendar Integration: The scheduling agent fetches live free/busy data from Google Calendar via OAuth2, injecting real availability into each response. Token refresh is automatic with a 5-minute cache buffer.
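The refresh rule can be illustrated with a small check. This sketch assumes the "5-minute cache buffer" means a cached token is treated as stale once it is within five minutes of expiry; `CachedToken` and `needsRefresh` are hypothetical names, and the OAuth2 refresh request itself is omitted.

```typescript
const REFRESH_BUFFER_MS = 5 * 60 * 1000; // 5-minute buffer from the text

// Hypothetical cache entry for an OAuth2 access token.
interface CachedToken {
  accessToken: string;
  expiresAtMs: number; // absolute expiry time in epoch milliseconds
}

// True when there is no cached token, or when the token expires within
// the buffer and should be refreshed before the next calendar request.
function needsRefresh(
  token: CachedToken | null,
  nowMs: number,
  bufferMs: number = REFRESH_BUFFER_MS
): boolean {
  if (token === null) return true;
  return nowMs >= token.expiresAtMs - bufferMs;
}
```

Refreshing early avoids the case where a token is valid when checked but expires while the free/busy request is in flight.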
Guardrails (the part that makes it production, not a demo): Every inbound message is treated as data, never instructions. Untrusted text is wrapped in a structural envelope that the system prompts are written to expect, with forged envelope boundaries stripped from the message. Every outbound reply passes three scans before a single byte leaves: a link allowlist (off-list domains, IP literals, and punycode all block the send), a secret/PII detector (API-key shapes, Luhn-valid card numbers, SSNs, live env values), and a prompt-leak check (marker phrases plus verbatim-line matching). A blocked reply never sends; it escalates to me for review instead.
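Two of the outbound scans can be sketched as pure functions. This is an illustrative approximation under stated assumptions: `ALLOWED_HOSTS` is a placeholder allowlist, all function names are hypothetical, and only the link check and the Luhn part of the PII detector are shown. The prompt-leak check and the other secret shapes are omitted.

```typescript
// Placeholder allowlist; the real approved domains are not given in the text.
const ALLOWED_HOSTS = new Set<string>(["example.com", "www.example.com"]);
const IPV4_LITERAL = /^\d{1,3}(\.\d{1,3}){3}$/;

// Returns why a reply's links are rejected, or null when every link passes.
function checkLinks(reply: string, allowed: Set<string> = ALLOWED_HOSTS): string | null {
  const urlPattern = /https?:\/\/([^\/\s?#]+)/gi;
  let match: RegExpExecArray | null;
  while ((match = urlPattern.exec(reply)) !== null) {
    const authority = match[1].toLowerCase();
    if (authority.includes("@")) return "userinfo in URL"; // e.g. allowed.com:x@evil.test
    if (authority.startsWith("[")) return "IP literal"; // bracketed IPv6
    const host = authority
      .replace(/[.,;:!)\]>'"]+$/, "") // trailing sentence punctuation
      .replace(/:\d+$/, ""); // port
    if (IPV4_LITERAL.test(host)) return "IP literal";
    if (host.split(".").some((label) => label.startsWith("xn--"))) return "punycode host";
    if (!allowed.has(host)) return "off-list domain: " + host;
  }
  return null;
}

// Standard Luhn checksum over a string of 13 to 19 digits.
function isLuhnValid(digits: string): boolean {
  if (!/^\d{13,19}$/.test(digits)) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Finds digit runs (optionally separated by spaces or hyphens) that pass Luhn.
function containsCardNumber(reply: string): boolean {
  const candidates = reply.match(/\b\d(?:[ -]?\d){12,18}\b/g) ?? [];
  return candidates.some((c) => isLuhnValid(c.replace(/[ -]/g, "")));
}

// A blocked reply is never sent; the reason would go into the escalation.
function scanOutbound(reply: string): { send: boolean; reason: string | null } {
  const linkIssue = checkLinks(reply);
  if (linkIssue !== null) return { send: false, reason: linkIssue };
  if (containsCardNumber(reply)) return { send: false, reason: "card number" };
  return { send: true, reason: null };
}
```

The link check fails closed: anything it cannot confidently match against the allowlist blocks the send, which trades a few false positives for a simpler guarantee.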
Abuse economics: Three independent rate layers — per-sender daily caps, a per-channel sub-cap, and a global circuit breaker across all agents — so a spoofed-address flood hits a hard cost ceiling. A loop guard silently drops auto-responders, bounces, list mail, and the system’s own addresses, because replying to a mail loop is the loop.
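The three rate layers can be sketched with in-memory counters. The limits, key formats, and the name `createLimiter` are hypothetical, and the sketch reads "per-channel sub-cap" as a daily cap per agent channel, which is an assumption. A Redis-backed version would more likely use atomic increments on keys with a TTL.

```typescript
// Hypothetical limits; the text names the three layers but not their values.
interface RateLimits {
  perSenderDaily: number;
  perChannelDaily: number;
  globalDaily: number;
}

function createLimiter(limits: RateLimits) {
  const counts = new Map<string, number>();

  // Returns the name of the layer that blocked the message, or null if it
  // may proceed. Counters only advance when all three layers allow it.
  return function check(senderHash: string, channel: string, day: string): string | null {
    const layers: Array<{ name: string; key: string; cap: number }> = [
      { name: "sender", key: `sender:${day}:${senderHash}`, cap: limits.perSenderDaily },
      { name: "channel", key: `channel:${day}:${channel}`, cap: limits.perChannelDaily },
      { name: "global", key: `global:${day}`, cap: limits.globalDaily },
    ];
    for (const layer of layers) {
      if ((counts.get(layer.key) ?? 0) >= layer.cap) return layer.name;
    }
    for (const layer of layers) {
      counts.set(layer.key, (counts.get(layer.key) ?? 0) + 1);
    }
    return null;
  };
}
```

Because the global layer counts every accepted message regardless of sender, rotating spoofed addresses gets past the per-sender cap but still runs into the channel and global ceilings.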
The Architecture
- Cloudflare Workers — Email routing layer. Parses inbound emails by address and forwards to the correct Vercel endpoint. Copies everything to a monitoring inbox.
- Vercel Serverless Functions — 11 endpoints handling the three agents plus health checks, escalation management, and a SAVANT Lite integration.
- Upstash Redis — Conversation state, rate limiting, and escalation tracking. All with TTLs so nothing accumulates indefinitely.
- Claude API — Each agent has its own system prompt with domain-specific knowledge, voice, and boundaries.
- Resend: Outbound email delivery through the @withjhinna.com domain.
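The routing step in the Worker comes down to mapping a recipient address to an endpoint. A minimal sketch of that lookup follows. The local parts and endpoint paths in `ROUTES` are hypothetical placeholders, since the real addresses and paths are not given here, and the Worker's email handler and forwarding call are left out.

```typescript
// Hypothetical local parts and endpoint paths, for illustration only.
const ROUTES = new Map<string, string>([
  ["interview", "/api/interview"],
  ["substack", "/api/substack"],
  ["calendar", "/api/calendar"],
]);

// Maps a recipient address to the endpoint path that should handle it,
// or null when the address does not belong to any agent.
function routeFor(recipient: string): string | null {
  const at = recipient.lastIndexOf("@");
  if (at <= 0) return null; // no "@" or empty local part
  const localPart = recipient.slice(0, at).trim().toLowerCase();
  return ROUTES.get(localPart) ?? null;
}
```

Returning null for unknown recipients lets the caller decide whether to drop the message or pass it only to the monitoring inbox.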
What This Demonstrates
This isn’t a wrapper around an API. It’s a production system that handles real conversations with real people — recruiters, readers, collaborators. The architecture decisions matter: conversation persistence so context isn’t lost, escalation detection so nothing important slips through, rate limiting so the system stays stable.
Built in a single focused session using Claude Code as a development partner. Scoped the problem, designed the architecture, shipped to production. That’s the workflow: identify what needs to exist, build it, deploy it.