Website Lead Enricher

Heuristic B2B lead enrichment for the Apify platform — emails, phones, socials, WHOIS, email patterns, and a live HTTP API.

Pipeline at a glance

Input: URLs. Up to 1,000 per run; domains auto-prefixed.
Step 1: Scrape. Cheerio + Axios with rotating user agents.
Step 2: Email Pattern Finder. DNS + SMTP probe, plus opt-in Hunter.io.
Step 3: Classify & Validate. Emails, phones, socials, and company type.
Score: Quality 0–100. Completeness, validity, and B2B social presence.
Apify Dataset: one row per URL, CRM-ready columns.
CRM CSV: HubSpot, Salesforce, Pipedrive.
Standby HTTP API: /leads and /leads/{domain}, for agents & apps.
KV Store: runSummary envelope (OUTPUT_RUN_SUMMARY).
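
To make Step 2 more concrete, here is a minimal sketch of pattern detection and address prediction for the three patterns this README names (first.last, flast, first). It is an illustration under stated assumptions, not the Actor's actual implementation: the function names, the tuple-based input, and the "share of matching emails" figure are all hypothetical, and the DNS and SMTP probes are left out. The share it returns is not the Actor's patternConfidence value.

```python
from collections import Counter

# Hypothetical formatters for the three patterns named in this README.
PATTERNS = {
    "first.last": lambda first, last: f"{first}.{last}",
    "flast": lambda first, last: f"{first[:1]}{last}",
    "first": lambda first, last: first,
}


def detect_pattern(known):
    """Return (pattern, share) for the most common pattern, or (None, 0.0).

    `known` is a list of (first_name, last_name, email) tuples found on a page.
    `share` is the fraction of known emails that match the winning pattern.
    """
    votes = Counter()
    for first, last, email in known:
        local = email.split("@", 1)[0].lower()
        for name, fmt in PATTERNS.items():
            if local == fmt(first.lower(), last.lower()):
                votes[name] += 1
                break
    if not votes:
        return None, 0.0
    pattern, count = votes.most_common(1)[0]
    return pattern, count / len(known)


def generate_emails(pattern, domain, people):
    """Predict one address per (first, last) pair using the given pattern."""
    fmt = PATTERNS[pattern]
    return [f"{fmt(first.lower(), last.lower())}@{domain}" for first, last in people]


known = [("Jan", "Curry", "jan.curry@acme.com")]
pattern, share = detect_pattern(known)
print(pattern, share)  # first.last 1.0
# "Sam Lee" is a made-up team member used only for illustration.
print(generate_emails(pattern, "acme.com", [("Sam", "Lee")]))  # ['sam.lee@acme.com']
```

A predicted address is only a guess until it is verified, which is why the pipeline pairs pattern detection with a catch-all probe before marking anything as sendable.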

Why this Actor

🎯
Outreach-safe by default
Per-record isSendable flag + per-domain bounceRiskBucket so Instantly, Smartlead, and Apollo can filter out high-risk domains before spending sending credits.
📧
5–10× email coverage
Step 2 detects first.last / flast / first patterns from the emails found on the page, runs a single SMTP catch-all probe, and generates 2–10 predicted team emails per domain.
🔌
Live HTTP API
Standby mode exposes /leads, /leads/{domain}, /stats, /health — drop-in endpoint for AI agents, MCP integrations, and embedded B2B tools.
📊
CRM-ready exports
HubSpot, Salesforce, and Pipedrive column shapes built in. No mapping required — pick csvMode and import the CSV directly.
🤖
Heuristic, not AI
Deterministic rules: no LLM cost, no external API keys required (the Hunter.io lookup in Step 2 is opt-in), fully auditable. methodology: "heuristic, not AI" is stamped on every record.
🛡️
No silent failures
Per-step error isolation: one bad step never kills the record. Every step carries ok / error + structured {code, message} on failure.
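
The per-step error isolation described under "No silent failures" can be sketched as follows. This is an assumption-laden illustration, not the Actor's code: run_step, enrich, and failing_whois are hypothetical names, and using the exception class name as the error code is an assumption. Only the ok / error fields and the {code, message} shape come from this README.

```python
def run_step(name, fn, record):
    """Run one enrichment step and capture a failure instead of raising it."""
    try:
        return {"step": name, "ok": True, "error": None, "data": fn(record)}
    except Exception as exc:  # keep any failure contained to this step
        return {
            "step": name,
            "ok": False,
            # Assumed convention: the exception class name serves as the code.
            "error": {"code": type(exc).__name__, "message": str(exc)},
            "data": None,
        }


def enrich(record, steps):
    """Run every step in order; a failing step never aborts the others."""
    return [run_step(name, fn, record) for name, fn in steps]


def failing_whois(record):
    # Stand-in for a step that fails at runtime.
    raise TimeoutError("WHOIS lookup timed out")


results = enrich(
    {"domain": "acme.com"},
    [("scrape", lambda r: {"title": "Acme Corp"}), ("whois", failing_whois)],
)
for result in results:
    print(result["step"], result["ok"], result["error"])
# scrape True None
# whois False {'code': 'TimeoutError', 'message': 'WHOIS lookup timed out'}
```

Because each step returns its own status, the record still carries the scrape result even though the WHOIS step failed.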

Sample output — JSON & CSV from the same record

Below is the same enriched record for acme.com exported two ways: first as JSON (Apify Dataset default), then as CSV (HubSpot import mode).

{
  "url": "https://acme.com",
  "domain": "acme.com",
  "scrapedAt": "2026-06-28T14:00:00Z",
  "company": {
    "name": "Acme Corp",
    "registrant": "Acme Corp",
    "createdAt": "2005-03-15T00:00:00Z"
  },
  "address": {
    "street": "123 Market Street",
    "city": "San Francisco",
    "postal_code": "94103",
    "country_code": "US"
  },
  "contacts": {
    "emails": [
      { "address": "jan.curry@acme.com", "type": "corporate" },
      { "address": "info@acme.com",     "type": "generic" },
      { "address": "sales@acme.com",    "type": "generic" }
    ],
    "emails_corporate": "jan.curry@acme.com",
    "emails_generic":   "info@acme.com|sales@acme.com",
    "phones": ["+14155551234"]
  },
  "socials": {
    "linkedin":  "https://linkedin.com/company/acme-corp",
    "facebook":  "https://facebook.com/acme",
    "instagram": "https://instagram.com/acme",
    "twitter":   "https://x.com/acme",
    "youtube":   "https://youtube.com/@acme"
  },
  "qualityScore": { "total": 92, "breakdown": { "completeness": 90, "emailValidity": 100, "phoneValidity": 100, "socialPresence": 75 } },
  "companyType": "saas",
  "companyTypeConfidence": "high",
  "isSendable": true,
  "contactFormDetected": true,
  "contactFormUrl": "https://acme.com/contact",
  "emailPattern": "first.last",
  "patternConfidence": 0.92,
  "generatedEmails": [
    { "address": "jan.curry@acme.com", "name": "Jan Curry", "source": "page-discovered" }
  ],
  "patternAnalysis": {
    "mxValid": true,
    "isCatchAll": false,
    "bounceRiskBucket": "low"
  },
  "scrapeError": null
}

First Name,Last Name,Email,Phone,Company,Job Title,Website,Address,City,State/Province,Zip/Postal Code,Country,LinkedIn Profile URL
Jan,Curry,jan.curry@acme.com,+14155551234,Acme Corp,,https://acme.com,123 Market Street,San Francisco,,94103,US,https://linkedin.com/company/acme-corp
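
As a rough illustration of how a sending tool might use the isSendable and bounceRiskBucket fields from the record above, the sketch below keeps only corporate addresses from records that pass both checks. The function name sendable_addresses, the allowed_buckets parameter, the second record, and the bucket value "high" are assumptions made for the example; only "low" appears in the sample output.

```python
def sendable_addresses(records, allowed_buckets=("low",)):
    """Return corporate addresses from records that pass both outreach checks."""
    addresses = []
    for record in records:
        bucket = (record.get("patternAnalysis") or {}).get("bounceRiskBucket")
        if not record.get("isSendable") or bucket not in allowed_buckets:
            continue  # skip high-risk or unsendable domains entirely
        for email in (record.get("contacts") or {}).get("emails", []):
            if email.get("type") == "corporate":
                addresses.append(email["address"])
    return addresses


records = [
    {
        "domain": "acme.com",
        "isSendable": True,
        "patternAnalysis": {"bounceRiskBucket": "low"},
        "contacts": {"emails": [
            {"address": "jan.curry@acme.com", "type": "corporate"},
            {"address": "info@acme.com", "type": "generic"},
        ]},
    },
    # Hypothetical second record; "high" is an assumed bucket value.
    {
        "domain": "example.org",
        "isSendable": False,
        "patternAnalysis": {"bounceRiskBucket": "high"},
        "contacts": {"emails": [{"address": "ceo@example.org", "type": "corporate"}]},
    },
]
print(sendable_addresses(records))  # ['jan.curry@acme.com']
```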

HTTP API (Standby mode)

GET /health
GET /leads?limit=500&offset=0
GET /leads/acme.com
GET /stats

OpenAPI schema: .actor/openapi.json. Import into Postman, Insomnia, or any OpenAPI generator to scaffold a client in seconds.
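
For a quick look at the endpoints listed above without generating a client, a standard-library sketch might look like this. BASE_URL is a placeholder rather than a real address, the helper names are hypothetical, and passing an API token as a bearer header is an assumption; the response shape is not documented here, so the sketch only decodes JSON and makes no claims about its fields.

```python
import json
import urllib.parse
import urllib.request

# Placeholder: replace with the Standby URL of your own Actor deployment.
BASE_URL = "https://example.invalid"


def leads_url(base_url, limit=500, offset=0):
    """Build the paginated /leads URL."""
    query = urllib.parse.urlencode({"limit": limit, "offset": offset})
    return f"{base_url.rstrip('/')}/leads?{query}"


def lead_url(base_url, domain):
    """Build the /leads/{domain} URL, escaping the domain as a path segment."""
    return f"{base_url.rstrip('/')}/leads/{urllib.parse.quote(domain, safe='')}"


def get_json(url, token=None, timeout=30):
    """GET a URL and decode the JSON body.

    Sending the token as a bearer header is an assumption about how the
    deployment is secured; adjust to match your setup.
    """
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


print(leads_url(BASE_URL))             # https://example.invalid/leads?limit=500&offset=0
print(lead_url(BASE_URL, "acme.com"))  # https://example.invalid/leads/acme.com
```

Calling get_json(leads_url(BASE_URL)) against a real Standby URL would then return whatever the /leads endpoint serves; check .actor/openapi.json for the actual schema.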