HVAC Leads — AI pipeline that ranks contractor websites by sales opportunity
A five-stage autonomous pipeline that finds HVAC contractors in a target city, screenshots their websites, audits them with Claude Sonnet 4.5 vision, and outputs a prioritized Google Sheet with a specific sales angle per business.
- Role: Solo (pipeline design, scraping, vision prompt, scoring, sheet export)
- Year: 2026
Problem
To sell website redesigns to small-business owners, you need a shortlist of businesses whose websites are actually costing them money — not a random scrape. HVAC is a near-perfect niche: high ticket size per job, phone-driven conversion, and a long tail of contractors still running sites that look like 2012. But the signal that matters isn't "do they have a website" — it's "does the website convert, and can I tell a specific story about why it doesn't." Hand-auditing a city's worth of contractors takes a day per city. I wanted a pipeline that produces the same shortlist in under an hour, with a concrete sales angle already written per lead.
Approach
Five stages, one command (bun run all), under $20 per city.
AI Lead Generation Pipeline: Discover → Capture → Audit → AI Score → Output
- Discover. Google Places API (New) with six search queries per city (HVAC contractor, AC repair, heating cooling, furnace repair, air conditioning service, heating contractor), biased to a 40 km radius around the city centroid. Results are deduplicated by placeId, and closed or permanently closed businesses are filtered out.
- Capture. Puppeteer takes full-page desktop (1280×800 viewport) and mobile (iPhone 17 UA) screenshots. Navigation waits for networkidle2 with a domcontentloaded fallback, then does a mid-page scroll to trigger lazy-loaded hero media and returns to the top before the shot, so the fold looks the way a real visitor would see it.
- Audit. PageSpeed Insights API on mobile + desktop for an honest Core Web Vitals read that Cloudflare WAFs can't fake.
- AI Score. Claude Sonnet 4.5 gets both screenshots and a 7-factor conversion rubric: CTA prominence, mobile UX, trust signals, visual quality, conversion optimization, improvement potential, and a free-text salesAngle. Structured JSON out, no markdown.
- Output. Scores composite into a final rank (vision signals + PageSpeed + reviews + rating), exported as CSV and imported into a Google Sheet with Tier A / B / C coloring and the generated sales angle per row. Ready for outreach.
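The merge step at the end of Discover can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the PlaceResult shape and the mergeDiscoverResults name are assumptions, and only placeId comes from the write-up. The businessStatus values mirror the ones the Places API reports, and treating both temporary and permanent closures as "closed" is also an assumption.

```typescript
// Hypothetical shape of one Places result. Only placeId is named in the
// write-up; the other fields are assumptions for illustration.
type BusinessStatus = "OPERATIONAL" | "CLOSED_TEMPORARILY" | "CLOSED_PERMANENTLY";

interface PlaceResult {
  placeId: string;
  name: string;
  businessStatus?: BusinessStatus;
}

// Merge the result lists from every search query into one list:
// keep the first occurrence of each placeId and drop closed businesses.
function mergeDiscoverResults(batches: PlaceResult[][]): PlaceResult[] {
  const byId = new Map<string, PlaceResult>();
  for (const batch of batches) {
    for (const place of batch) {
      const closed =
        place.businessStatus === "CLOSED_TEMPORARILY" ||
        place.businessStatus === "CLOSED_PERMANENTLY";
      if (closed || byId.has(place.placeId)) continue;
      byId.set(place.placeId, place);
    }
  }
  return Array.from(byId.values());
}

// Usage: two queries that both surface the same contractor.
const merged = mergeDiscoverResults([
  [{ placeId: "a1", name: "Acme Heating", businessStatus: "OPERATIONAL" }],
  [
    { placeId: "a1", name: "Acme Heating", businessStatus: "OPERATIONAL" },
    { placeId: "b2", name: "Old Furnace Co", businessStatus: "CLOSED_PERMANENTLY" },
    { placeId: "c3", name: "Cool Air" },
  ],
]);
// merged holds a1 and c3: the duplicate and the closed business are gone.
```

Keying a Map by placeId keeps the merge linear in the number of results and preserves first-seen order, which matters when six overlapping queries return many of the same businesses.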
Technical deep-dive
- The prompt is the product. Claude isn't asked "is this a good website"; it's asked "is this website costing this HVAC contractor phone calls." The rubric is explicit about lead generation over aesthetics ("a site can LOOK good but still be a poor converter"), weights mobile CTA visibility heavily, and produces a salesAngle string tailored to the actual failure: "No sticky call button on mobile", "Hero has no CTA", "Phone number hidden in footer." That string is the opening line of the outreach email; if the prompt drifts, the whole pipeline degrades.
- Dual-image vision with resize guard. Each lead sends two full-page screenshots in one call so Claude can compare desktop and mobile directly. Full-page mobile screenshots routinely exceed Claude's image dimension limits, so sharp resizes anything over 7500 px on the long edge before base64 encoding. It caps without cropping: the whole page is still in frame, just downsampled.
- Blocked-page detection. Before scoring, the capture step sniffs the DOM for Cloudflare challenge text, 403 copy, and suspiciously empty bodies. If a page is flagged, Claude is asked to confirm from the screenshot and set blocked: true, and the lead gets a -1 score and a "needs manual review" breakdown rather than garbage scores that pollute the ranking.
- Resumable by design. Every processed lead is keyed by placeId in leads-scored.json. On restart, already-scored entries skip the full audit. The save function is also guarded against overlapping writes with an in-flight / pending flag so concurrent workers can't corrupt the output file mid-write.
- Concurrency capped at 3. p-limit(3) on the whole pipeline. That's the sweet spot for Puppeteer (memory), PageSpeed (rate limits), and Anthropic (cost burst) combined: any higher and I hit rate-limit 429s on PageSpeed, any lower and a full city takes an hour longer.
- Costs, observed. Roughly $10-15 in Google Places per city (~240 businesses), $0 on PageSpeed (free tier), $2-5 on Anthropic for vision analysis. Total stays under $20 per city, which is cheaper than a single hour of my time spent auditing by hand.
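The arithmetic behind the resize guard is small enough to show on its own. The sketch below only computes the target dimensions; the Size type and the fitLongEdge name are hypothetical, and the 7500 px cap is the one quoted in the list above.

```typescript
interface Size {
  width: number;
  height: number;
}

// 7500 px is the cap named in the write-up; kept as a parameter so it can be tuned.
const MAX_LONG_EDGE = 7500;

// Scale a screenshot so its long edge fits under the cap, preserving aspect
// ratio. Images already within the cap are returned unchanged (no upscaling).
function fitLongEdge(size: Size, maxEdge: number = MAX_LONG_EDGE): Size {
  const longEdge = Math.max(size.width, size.height);
  if (longEdge <= maxEdge) return { ...size };
  const scale = maxEdge / longEdge;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

// Usage: a 390 x 15000 full-page mobile capture.
const resized = fitLongEdge({ width: 390, height: 15000 });
// resized is { width: 195, height: 7500 }
```

In practice sharp can apply the same rule directly through its resize options (fit: "inside" together with withoutEnlargement: true), so a helper like this is mostly useful for logging or for deciding whether a resize is needed at all.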
Google Sheet output: 77 HVAC leads for Tulsa, OK — tier A / B coloring, scores 80-115, sales-ready columns
Pipeline running in PowerShell: Sonnet 4.5 scoring leads with per-factor breakdown and generated sales angles
Outcome
Tulsa, OK — 77 contractors scraped, scored, and ranked in under an hour. The top tier (score 90+) is every business with a real footprint on Google whose website is demonstrably leaking leads: slow on mobile, weak CTA, no trust signals, stale visual design, or some mix. Each row in the sheet has a specific sales angle written by Claude — the outreach email almost writes itself from there. The same pipeline runs against any US city by editing three env vars and re-running bun run all.
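A rough sketch of how ranked rows could map to tiers, under stated assumptions: the 90+ cutoff for Tier A comes from the paragraph above, while the Tier B cutoff, the "Review" label for blocked leads, and the names ScoredLead, tierFor, and rankLeads are placeholders rather than the pipeline's real values.

```typescript
type Tier = "A" | "B" | "C" | "Review";

interface ScoredLead {
  placeId: string;
  score: number; // composite score; -1 marks a blocked page
}

const TIER_A_MIN = 90; // from the write-up: top tier is 90+
const TIER_B_MIN = 70; // assumption: placeholder cutoff, not from the source

function tierFor(score: number): Tier {
  if (score < 0) return "Review"; // blocked pages need a manual look
  if (score >= TIER_A_MIN) return "A";
  if (score >= TIER_B_MIN) return "B";
  return "C";
}

// Highest score first; blocked leads (-1) sink to the bottom of the sheet.
function rankLeads(leads: ScoredLead[]): Array<ScoredLead & { tier: Tier }> {
  return [...leads]
    .sort((a, b) => b.score - a.score)
    .map((lead) => ({ ...lead, tier: tierFor(lead.score) }));
}

// Usage with three made-up leads.
const ranked = rankLeads([
  { placeId: "b2", score: 82 },
  { placeId: "x9", score: -1 },
  { placeId: "a1", score: 104 },
]);
// ranked order: a1 (A), b2 (B), x9 (Review)
```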
Stack & handoff
TypeScript on Bun, Puppeteer for captures, Claude Sonnet 4.5 via the Anthropic SDK for vision scoring, Google Places API (New) + PageSpeed Insights for discovery and performance signals, Sharp for image pre-processing, p-limit for concurrency. Output is a CSV that imports cleanly into Google Sheets with conditional formatting on the Tier column. Internal tool — source private, but the prompt and rubric are the IP, and both are one file.
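Because the sales angle is free text, a CSV that "imports cleanly" depends on quoting fields that contain commas, quotes, or line breaks. A minimal sketch of that escaping, following the usual RFC 4180 convention; the column names and the csvField / csvRow helpers are illustrative assumptions, not the tool's actual export code.

```typescript
// Quote a field when it contains a comma, double quote, or line break;
// embedded double quotes are doubled.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function csvRow(fields: Array<string | number>): string {
  return fields.map(csvField).join(",");
}

// Hypothetical column set; the real sheet's columns are not listed in full.
const header = csvRow(["name", "tier", "score", "salesAngle"]);
const row = csvRow([
  "Acme Heating, LLC",
  "A",
  104,
  'Hero has no CTA; phone says "call us" only in footer',
]);
// row === '"Acme Heating, LLC",A,104,"Hero has no CTA; phone says ""call us"" only in footer"'
```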