Project Case Study

AI Cold Email Outreach Pipeline

Automated Email Pipeline / Email Scraping / Not Agentic Solution

End-to-end automated lead generation and personalized outreach for a student-run AI company

The Problem

Summit Intelligent Systems needs a steady pipeline of new clients to grow. Cold outreach to local Philadelphia small businesses is one of the most effective ways to generate leads — but doing it manually is brutal. Finding businesses, visiting their websites, researching what they do, writing a personalized email for each one, and actually sending it takes hours per day and doesn't scale.

The challenge was to build a system that could do all of that automatically — discovery, research, personalized writing, and delivery — while still keeping a human in the loop so no embarrassing or off-brand email ever goes out without review.

The Solution

I built an automated, 7-module Python pipeline that runs on a daily schedule. Every morning it discovers new local businesses across 15 categories, scrapes their websites for contact info and context, uses an LLM to write a personalized cold email for each one, and queues the drafts for human review. After approval, a separate process sends each email through Gmail with randomized delays to avoid tripping spam filters, then notifies the owner with a summary of what went out.

The entire system costs nothing to run. It uses DuckDuckGo for business discovery (no API key), Groq's free inference tier for writing, Google Sheets as the database, and the Gmail API, which is free to use, for sending. The only human step is approval in a terminal UI with a single keypress per draft; each email takes under five seconds to review.

System Architecture — 7 Modules

Architecture diagram: see the attached outreach-pipeline-architecture.png.

Key Design Decisions

Human approval as a hard gate

No email is ever sent without a human reviewing and approving it first. This was a non-negotiable design constraint. A single bad cold email — one that sounds robotic, gets the business name wrong, or makes a weird claim — can damage Summit's reputation permanently. The approve.py terminal UI makes review so fast (under 5 seconds per draft) that it doesn't slow anyone down, but it ensures every outgoing email has been seen by a human.
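The approval gate can be sketched as a loop that records one decision per draft and passes only explicit approvals on to the sender. This is a minimal illustration, not the actual approve.py: the `Draft` fields, the `review_drafts` function, and the choice of `a` as the approve key are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Draft:
    # Hypothetical draft record; the real columns live in Google Sheets.
    business: str
    subject: str
    body: str
    status: str = "pending"  # pending -> approved | rejected


def review_drafts(drafts: Iterable[Draft], read_key: Callable[[], str]) -> List[Draft]:
    """Show each draft, record a one-key decision, and return the approved ones."""
    approved = []
    for draft in drafts:
        print(f"\nTo: {draft.business}\nSubject: {draft.subject}\n\n{draft.body}")
        key = read_key().strip().lower()
        # Only an explicit 'a' approves. Any other key, including an
        # accidental one, rejects the draft, so the gate fails closed.
        draft.status = "approved" if key == "a" else "rejected"
        if draft.status == "approved":
            approved.append(draft)
    return approved
```

A simple way to try it is `review_drafts(drafts, read_key=lambda: input("[a]pprove / [r]eject: "))`. Note that `input` waits for Enter; a true single-keypress reader would need platform-specific terminal handling, which is left out here.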

Dual-track outreach for maximum lead capture

Not every business website lists a public email address. Rather than dropping those prospects entirely, the pipeline routes them to a separate Phone Outreach tab in Google Sheets with a short AI-generated business summary. No lead is wasted — businesses without emails become phone call candidates instead.
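The routing rule described above is small enough to show directly. The sketch below assumes each prospect is a dictionary with an optional `email` key; the field name and the `route_prospects` function are illustrative, not taken from the project code.

```python
from typing import Dict, List, Optional, Tuple

Prospect = Dict[str, Optional[str]]


def route_prospects(prospects: List[Prospect]) -> Tuple[List[Prospect], List[Prospect]]:
    """Split prospects into an email track and a phone track."""
    email_track: List[Prospect] = []
    phone_track: List[Prospect] = []
    for prospect in prospects:
        # Treat a missing, None, or whitespace-only email as "no email found".
        email = (prospect.get("email") or "").strip()
        if email:
            email_track.append(prospect)
        else:
            phone_track.append(prospect)
    return email_track, phone_track
```

Prospects in the second list would be the ones written to the Phone Outreach tab along with their generated summary.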

Spam-safe sending design

The sender module uses the Gmail API, authenticated with OAuth 2.0, rather than raw SMTP, which is more reliable and less likely to trigger spam filters. It sends plain text only (HTML emails tend to be flagged more aggressively), caps sending at 10 emails per day, and waits a random 5 to 10 minutes between sends. The pattern resembles a person sending emails by hand rather than a script blasting in bulk.
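The sending rules above (plain text, a cap of 10 per day, 5 to 10 minutes between sends) can be sketched without touching the network. The function names and constants below are assumptions for illustration. The one external fact relied on is that the Gmail API's `users.messages.send` method accepts a message as a base64url-encoded string in a `raw` field; the authenticated API call itself is omitted.

```python
import base64
import random
from email.message import EmailMessage
from typing import Dict, List

DAILY_CAP = 10                              # from the text: 10 emails per day
MIN_DELAY_S, MAX_DELAY_S = 5 * 60, 10 * 60  # from the text: 5 to 10 minutes


def build_raw_message(sender: str, to: str, subject: str, body: str) -> Dict[str, str]:
    """Build a plain-text message in the shape the Gmail API expects."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)  # set_content with a str produces text/plain, no HTML part
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    return {"raw": raw}


def plan_delays(n_approved: int, rng=random) -> List[float]:
    """Return the random waits, in seconds, between consecutive sends."""
    n_to_send = min(n_approved, DAILY_CAP)
    # n emails have n - 1 gaps between them.
    return [rng.uniform(MIN_DELAY_S, MAX_DELAY_S) for _ in range(max(n_to_send - 1, 0))]
```

With the full cap of 10 emails there are 9 gaps, so a day's sending run takes between 45 and 90 minutes.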

Zero-cost infrastructure

Every component of the pipeline uses a free tier or free service: DuckDuckGo requires no API key, Groq's free tier handles all LLM inference, Google Sheets replaces a paid database, and Gmail provides email delivery. The total monthly cost to run this system is $0.

Graceful failure at every step

A scraping failure on one prospect doesn't stop the run. A Groq rate limit is caught, the retry-after time is parsed from the error response, and the pipeline waits exactly that long before continuing. A failed notification email is caught and does not abort the run. Each module is designed to degrade gracefully so a single bad business or temporary API issue never kills the entire pipeline.
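The failure handling described above can be sketched as a per-prospect loop that waits out rate limits and skips anything else that breaks. Everything here is hypothetical: `RateLimitError` stands in for whatever exception the LLM client raises, and the `try again in 7.5s` message format is an assumption about the error text, not a documented contract.

```python
import logging
import re
import time

log = logging.getLogger("pipeline")


class RateLimitError(Exception):
    """Stand-in for the LLM client's rate limit exception (hypothetical)."""


def parse_retry_after(message: str, default: float = 30.0) -> float:
    """Extract a wait time in seconds from an error message.

    Assumes text like '... try again in 7.5s'; falls back to `default`
    when the message does not match that shape.
    """
    match = re.search(r"try again in (\d+(?:\.\d+)?)s", message)
    return float(match.group(1)) if match else default


def process_all(prospects, handle, sleep=time.sleep, max_retries=3):
    """Run `handle` on every prospect; one failure never stops the rest."""
    done, failed = [], []
    for prospect in prospects:
        for _attempt in range(max_retries):
            try:
                done.append(handle(prospect))
                break
            except RateLimitError as exc:
                # Wait for the time the service asked for, then retry.
                sleep(parse_retry_after(str(exc)))
            except Exception:
                # Any other error: log it, skip this prospect, keep going.
                log.exception("Skipping prospect %r", prospect)
                failed.append(prospect)
                break
        else:
            # Still rate limited after max_retries attempts.
            failed.append(prospect)
    return done, failed
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.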

Prompt Engineering

The email generation prompt enforces a rigid five-paragraph structure: a personalized greeting using the business name, a specific observation about something on their website, a brief introduction to Summit, a concrete proof point (a deployed project), and a clear call to action. The model is explicitly instructed to avoid bullet points, stay within five paragraphs, always mention a free consultation, and make the subject line specific to the business rather than generic.
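To make the structure concrete, here is a sketch of how such a prompt could be assembled, plus a cheap check on the returned draft. The template wording is invented for illustration and is not the production prompt. It reads the paragraph limit as "at most five", since the structure itself has five parts, and `check_draft` is an assumed helper rather than something the pipeline is known to include.

```python
import re
from typing import List

# Hypothetical wording; only the five-part structure and the rules come from the text.
PROMPT_TEMPLATE = """\
Write a cold email from Summit Intelligent Systems to {business_name}.

Website notes:
{site_context}

Use exactly this structure, one paragraph each:
1. A personalized greeting that uses the business name.
2. A specific observation about something on their website.
3. A brief introduction to Summit.
4. One concrete proof point: a deployed project.
5. A clear call to action that mentions a free consultation.

Rules: no bullet points, no more than five paragraphs, and a subject
line specific to this business rather than a generic one.
"""


def build_prompt(business_name: str, site_context: str) -> str:
    """Fill the template with one business's name and scraped context."""
    return PROMPT_TEMPLATE.format(
        business_name=business_name, site_context=site_context.strip()
    )


def check_draft(body: str) -> List[str]:
    """Return a list of rule violations found in a generated email body."""
    problems = []
    paragraphs = [p for p in re.split(r"\n\s*\n", body.strip()) if p.strip()]
    if len(paragraphs) > 5:
        problems.append("more than five paragraphs")
    if re.search(r"^\s*[-*•]\s+", body, flags=re.MULTILINE):
        problems.append("contains bullet points")
    if "free consultation" not in body.lower():
        problems.append("no mention of a free consultation")
    return problems
```

A check like this would not replace human review; it could only flag obvious rule breaks before a draft reaches the approval queue.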

The quality of the output depends almost entirely on how well the scraper extracted site context. Businesses with descriptive, well-structured websites get the most personalized emails — businesses with thin or JavaScript-heavy sites get more generic drafts, which is why the human approval step is critical for catching the latter.

Results & Impact

Pipeline runs autonomously every morning — no manual intervention required beyond the approval step

Up to 150 new prospects discovered and processed per daily run across 15 Philadelphia business categories

10 approved, personalized emails sent per day with spam-safe delivery patterns

Dual-track system ensures no prospect is lost — businesses without emails become phone leads

Total infrastructure cost: $0 per month

The pipeline is what powers Summit's active client acquisition — every new client conversation starts here

What I Learned

Building a pipeline that touches this many external services — DuckDuckGo, arbitrary business websites, Groq, Google Sheets, Gmail — taught me that robustness is the hardest engineering problem. Every integration point is a failure point, and real-world websites are far messier than any test case. The scraper had to handle JavaScript-heavy sites, paywalled pages, broken HTML, and dozens of URL patterns for contact pages.
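Two of the scraper problems mentioned above, guessing contact-page URLs and pulling addresses out of messy HTML, can be illustrated with the standard library alone. The path list, the regular expression, and both function names are assumptions made for this sketch; the real scraper handles many more patterns.

```python
import re
from typing import List
from urllib.parse import urljoin

# Deliberately simple pattern; it will not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CONTACT_PATHS = ("contact", "contact-us", "about", "about-us")  # illustrative only
IGNORED_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


def candidate_urls(base_url: str) -> List[str]:
    """Return the homepage plus a few likely contact-page URLs."""
    root = base_url if base_url.endswith("/") else base_url + "/"
    return [base_url] + [urljoin(root, path) for path in CONTACT_PATHS]


def extract_emails(html: str) -> List[str]:
    """Find unique email addresses in raw HTML, in order of appearance."""
    seen, result = set(), []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        # Skip asset names such as 'logo@2x.png' that look like addresses.
        if email.endswith(IGNORED_SUFFIXES) or email in seen:
            continue
        seen.add(email)
        result.append(email)
    return result
```

This only sees addresses present in the fetched HTML, so a site that renders its contact details with JavaScript would still come back empty.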

The prompt engineering work was more iterative than I expected. The first version of the email prompt produced drafts that were technically correct but felt generic. Getting the model to write something that actually sounds like it was written by a person who visited the website — and cares about that specific business — required many rounds of refinement, example outputs, and explicit negative constraints in the prompt.

The biggest lesson was about system design: a pipeline where one failure stops everything is useless in production. Every module had to be designed to catch, log, and move past errors independently. That discipline — writing code that fails gracefully rather than loudly — is something I now apply to everything I build.