Playbook

Data Enrichment Waterfall Architecture and Sequencing

Single-provider enrichment caps at 65-70% coverage. Multi-provider waterfalls hit 90%+. Here is how to build one.

Why Waterfalls Exist

No single data provider covers the market. Clay covers roughly 65% of US B2B contact emails. Apollo covers a different 65% with significant overlap but also unique hits. Lusha covers another slice. Each provider has pockets of strength: Apollo indexes startups better, Cognism covers Europe better, Lusha has stronger direct dial data for enterprise contacts.

Waterfalls call providers sequentially: if provider 1 returns a result, stop. If not, try provider 2. Then provider 3. This maximizes coverage while minimizing cost because you only pay for expensive providers when cheaper ones miss.
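The stop-on-first-hit logic above can be sketched in a few lines. This is a minimal illustration, not any vendor's API: the two provider functions and their lookup tables are hypothetical stand-ins for real calls, and the CALLS list exists only to show which providers were actually billed.

```python
from typing import Callable, Optional

# Hypothetical data standing in for real provider indexes.
CHEAP_INDEX = {("Ada Lovelace", "example.com"): "ada@example.com"}
PRICEY_INDEX = {("Grace Hopper", "example.org"): "grace@example.org"}
CALLS: list[str] = []  # records which providers were invoked

def cheap_provider(contact: dict) -> Optional[str]:
    CALLS.append("cheap")
    return CHEAP_INDEX.get((contact["name"], contact["domain"]))

def pricey_provider(contact: dict) -> Optional[str]:
    CALLS.append("pricey")
    return PRICEY_INDEX.get((contact["name"], contact["domain"]))

# Ordered cheapest first, as the playbook recommends.
PROVIDERS: list[tuple[str, Callable[[dict], Optional[str]]]] = [
    ("cheap", cheap_provider),
    ("pricey", pricey_provider),
]

def run_waterfall(contact: dict, providers=PROVIDERS):
    """Try each provider in order and stop at the first non-empty result."""
    for name, lookup in providers:
        result = lookup(contact)
        if result:
            return result, name  # later (more expensive) providers never fire
    return None, None  # complete miss

email, source = run_waterfall({"name": "Ada Lovelace", "domain": "example.com"})
print(email, source, CALLS)
```

Because the function returns on the first hit, the expensive provider is only called for contacts the cheap one misses.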

The difference between single-provider and waterfall enrichment: 65% vs 92% email coverage. On a list of 1,000 accounts, that is 270 additional contacts your competitors don't reach. At a 3% meeting rate, that is 8 more meetings from the same target list with zero additional prospecting effort.
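The arithmetic behind those numbers, as a small sketch you can rerun with your own list size and meeting rate. The function name is illustrative; the inputs are the figures quoted above.

```python
def waterfall_lift(list_size: int, single_coverage: float,
                   waterfall_coverage: float, meeting_rate: float):
    """Extra contacts and meetings gained by moving from one provider to a waterfall."""
    extra_contacts = round(list_size * (waterfall_coverage - single_coverage))
    extra_meetings = extra_contacts * meeting_rate
    return extra_contacts, extra_meetings

contacts, meetings = waterfall_lift(1000, 0.65, 0.92, 0.03)
print(contacts, round(meetings, 1))  # 270 extra contacts, about 8 extra meetings
```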

Core Pattern

Input normalization: Clean names (proper case, remove titles like "Mr./Dr."), validate domains (strip www, http, trailing slashes), deduplicate by email or name+company. Dirty input data cascades errors through every downstream provider. Spend 15 minutes cleaning before you spend credits enriching.
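A rough sketch of that cleaning pass, assuming records are plain dicts with name, company, and an optional email. The title list and field names are assumptions, and simple capitalization will mishandle names such as "McDonald" or "O'Neil", so treat it as a starting point.

```python
import re

# Honorifics to strip from the front of a name (extend as needed).
TITLE_PATTERN = re.compile(r"^(mr|mrs|ms|dr|prof)\.?\s+", re.IGNORECASE)

def clean_name(raw: str) -> str:
    """Strip a leading title and convert to proper case."""
    name = TITLE_PATTERN.sub("", raw.strip())
    return " ".join(part.capitalize() for part in name.split())

def clean_domain(raw: str) -> str:
    """Strip protocol, www, and any path or trailing slash."""
    domain = raw.strip().lower()
    domain = re.sub(r"^https?://", "", domain)
    domain = re.sub(r"^www\.", "", domain)
    return domain.split("/")[0]

def dedupe(records: list[dict]) -> list[dict]:
    """Keep the first record per email, or per name+company when email is missing."""
    seen = set()
    unique = []
    for rec in records:
        email = rec.get("email")
        key = email.lower() if email else (rec["name"].lower(), rec["company"].lower())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

print(clean_name("  Dr. jane DOE "), clean_domain("https://www.Example.com/"))
```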

Provider sequencing: Cheapest provider with acceptable accuracy first. Expensive providers only fire on misses. This ordering decision alone determines whether your per-record cost is $0.05 or $0.25.

Result validation: Every email result gets verified before it enters your outbound pipeline. Every phone number gets format-checked. Every LinkedIn URL gets domain-matched against the target company. Unvalidated data wastes downstream credits and hurts deliverability.

Output routing: Verified results push to your CRM and outbound tool. Unverified results (catch-all domains) route to a LinkedIn-only outreach track. Complete misses (no data from any provider) route to a manual research queue for high-value accounts or get archived for low-priority ones.
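The routing rules above reduce to a small decision function. The status values ("valid", "catch_all") and the destination labels are hypothetical; map them to whatever your verifier and CRM actually return. Treating an invalid email the same as a complete miss is an assumption of this sketch.

```python
def route(record: dict) -> str:
    """Decide where an enriched record goes next."""
    email = record.get("email")
    status = record.get("verification")  # assumed values: "valid", "catch_all", "invalid", None

    if email and status == "valid":
        return "crm_and_outbound"
    if email and status == "catch_all":
        return "linkedin_only"
    # No usable email: high-value accounts get manual research, the rest are archived.
    if record.get("tier") == "A":
        return "manual_research"
    return "archive"

print(route({"email": "ada@example.com", "verification": "valid", "tier": "B"}))
```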

Email Waterfall

Stage 1: Clay built-in enrichment. Hit rate: 60-65%. Cost: covered by the credits already included in your Clay plan, with no separate provider fee for basic email lookup. Clay aggregates data from multiple underlying sources, making it a strong first-pass provider. Run this on every record.

Stage 2: Apollo email finder. Adds 15-20 percentage points of coverage on top of Stage 1. Cost: $0.01-0.03 per credit (varies by plan). Apollo's strength is startup and SMB coverage. It often finds emails that Clay misses for smaller companies and recently hired contacts.

Stage 3: Lusha or Cognism. Adds another 5-10 percentage points. Cost: $0.10-0.30 per credit. More expensive but covers different pockets: Lusha is strong in US mid-market, Cognism excels in European contacts. Choose based on your target geography.

Optional Stage 4: FullEnrich or specialty providers. Adds 2-5 percentage points by finding the hardest-to-reach emails, the contacts that no mainstream provider indexes. Cost varies but is typically $0.15-0.50 per lookup. Only worth running on Tier A accounts where the missing contact is a specific decision-maker.

Cumulative coverage: Stage 1 alone: 63%. Stages 1-2: 82%. Stages 1-3: 90%. Stages 1-4: 93%. The marginal cost per additional percentage point increases sharply after Stage 2, which is why you sequence by cost.
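To see why marginal cost climbs, here is a rough calculation on a 1,000-record list using the cumulative coverage figures above. The per-lookup prices are assumptions picked from inside the quoted ranges, Stage 1 is treated as $0 incremental, and the sketch assumes you pay for every lookup rather than only for hits, which varies by provider.

```python
CUMULATIVE = [0.63, 0.82, 0.90, 0.93]      # coverage after stages 1-4 (from the text)
PRICE_PER_LOOKUP = [0.00, 0.02, 0.20, 0.30]  # assumed prices within the quoted ranges

def stage_economics(list_size: int = 1000) -> list[dict]:
    """Lookups, finds, spend, and cost per found contact for each stage."""
    rows = []
    covered_before = 0.0
    for stage, (cum, price) in enumerate(zip(CUMULATIVE, PRICE_PER_LOOKUP), start=1):
        lookups = round(list_size * (1 - covered_before))  # only misses reach this stage
        found = round(list_size * (cum - covered_before))  # contacts this stage adds
        spend = lookups * price
        rows.append({
            "stage": stage,
            "lookups": lookups,
            "found": found,
            "spend": round(spend, 2),
            "cost_per_found": round(spend / found, 3),
        })
        covered_before = cum
    return rows

for row in stage_economics():
    print(row)
```

Under these assumptions the cost per found contact rises from about $0.04 at Stage 2 to $0.45 at Stage 3 and $1.00 at Stage 4, even though each later stage runs on fewer records.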

Phone Waterfall

Phone data is harder to source than email. Expect lower coverage and higher costs across the board.

Stage 1: Apollo. 30-40% direct dial coverage for US contacts. Free with your Apollo plan. Good for mobile numbers on SDR and mid-level contacts. Weaker for C-suite.

Stage 2: Lusha direct dials. 10-15% additional coverage. $0.15-0.30 per credit. Lusha's strength is verified direct dials (mobile numbers, not switchboards). Particularly strong for enterprise contacts that Apollo misses.

Stage 3: Cognism Diamond Data. 5-10% additional. $0.20-0.40 per credit. Cognism's Diamond Data is phone-verified, meaning a human confirmed the number works within the last 90 days. Highest accuracy but highest cost.

Total coverage: 50-60% for US contacts. 35-45% for European contacts. Phone coverage will never match email coverage, which is why multi-channel outbound (email + phone + LinkedIn) is essential. Don't build a phone-only outbound strategy.

Company Waterfall

Clay's built-in company enrichment covers 85-90% of US B2B companies. Apollo adds 5-8% for smaller companies. Clearbit fills gaps for enterprise accounts. Company data is the least waterfall-dependent data type because baseline coverage is already high.

Focus your company waterfall budget on data freshness rather than coverage. Employee counts change quarterly. Funding data updates monthly. Tech stacks shift continuously. Re-enriching company data every 90 days prevents your scoring models from operating on stale inputs.

Cost Optimization

Filter before enriching. Run ICP scoring on company-level data first (cheap, 2-3 credits per account). Eliminate Tier D accounts before spending 10-15 credits per account on contact enrichment. This single step cuts your total enrichment spend by 30-40%.

Batch by priority. Tier A accounts get the full 4-stage waterfall. Tier B gets stages 1-2. Tier C gets stage 1 only. This tiered approach aligns spend with expected return. Don't spend $0.50 per record enriching accounts that will enter a nurture sequence and probably never convert.
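The tier-to-depth mapping is simple enough to express as a lookup. A minimal sketch, assuming each account dict carries a domain and a tier letter; Tier D and unknown tiers get no contact enrichment.

```python
# How many waterfall stages each tier is allowed to run (from the rules above).
STAGES_BY_TIER = {"A": 4, "B": 2, "C": 1}

def stages_to_run(tier: str) -> int:
    """Tier D and anything unrecognized is filtered out before enrichment."""
    return STAGES_BY_TIER.get(tier, 0)

def plan_batch(accounts: list[dict]) -> dict[str, int]:
    """Map each account worth enriching to its allowed waterfall depth."""
    return {
        acct["domain"]: stages_to_run(acct["tier"])
        for acct in accounts
        if stages_to_run(acct["tier"]) > 0
    }

print(plan_batch([
    {"domain": "bigco.com", "tier": "A"},
    {"domain": "midco.com", "tier": "B"},
    {"domain": "tinyco.com", "tier": "D"},
]))
```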

Cache results. Before running any enrichment, check if you already have data from the last 90 days. For contacts that haven't changed jobs (verifiable via the start date of the current role on their LinkedIn profile), re-enrichment is wasted spend. Build a lookup against your CRM or a master enrichment database before calling any provider.
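A sketch of that pre-enrichment check, assuming each cached record stores the date it was last enriched and, where known, the start date of the contact's current role. Both field names are hypothetical.

```python
from datetime import date, timedelta

def needs_enrichment(record: dict, today: date, max_age_days: int = 90) -> bool:
    """Return True only when cached data is missing, too old, or predates a job change."""
    enriched_on = record.get("enriched_on")
    if enriched_on is None:
        return True  # never enriched
    if today - enriched_on > timedelta(days=max_age_days):
        return True  # cache is older than the freshness window
    job_start = record.get("job_start")
    # A role that started after the last enrichment means the cached data is for the old job.
    return job_start is not None and job_start > enriched_on

today = date(2026, 1, 1)
print(needs_enrichment({"enriched_on": date(2025, 11, 15)}, today))
```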

Negotiate volume pricing. At 5,000+ records per month, most providers offer 20-40% discounts. Apollo's annual plans are 50-60% cheaper than monthly. Lusha offers custom pricing above 500 credits/month. Your negotiating power increases with volume. Get quotes from 3 providers before committing.

Clay Implementation

Column 1: Clay built-in email enrichment. Runs on every row automatically.

Column 2: Conditional Apollo lookup. Formula: "If Column 1 is empty, call Apollo email finder." This prevents double-spending on contacts Clay already found.

Column 3: Conditional Lusha lookup. Formula: "If Column 1 AND Column 2 are both empty, call Lusha." Third-tier provider only fires on double misses.

Column 4: COALESCE formula. Returns the first non-empty value from Columns 1-3. This is your "best available email" column that downstream systems reference.

Column 5: Email verification via ZeroBounce, NeverBounce, or MillionVerifier ($0.003-0.005 per verification). Flag bounced, catch-all, and invalid results. Only verified emails proceed to outbound.

Column 6: Source tracking formula. Records which provider returned the winning result. This data feeds your quarterly provider performance review. If Apollo consistently catches what Clay misses for a specific segment, that insight informs future waterfall ordering.
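The logic of Columns 4 and 6 can be expressed outside Clay as well. This is a Python sketch of the idea, not Clay formula syntax, and the column names are made up for illustration.

```python
# Columns in waterfall order; names are illustrative.
EMAIL_COLUMNS = ("clay_email", "apollo_email", "lusha_email")

def best_email_with_source(row: dict, columns=EMAIL_COLUMNS):
    """Return the first non-empty email and the column it came from."""
    for col in columns:
        value = (row.get(col) or "").strip()
        if value:
            return value, col
    return None, None  # every provider missed

row = {"clay_email": "", "apollo_email": "sam@acme.io", "lusha_email": None}
print(best_email_with_source(row))
```

Keeping the source alongside the value is what makes the quarterly provider review possible: without it you know an email was found, but not who found it.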

A 500-row batch completes in 10-30 minutes depending on provider response times. Template 2 in the Clay templates library implements this exact structure.

Common Waterfall Mistakes

Wrong provider ordering. Running Cognism ($0.25/credit) before Apollo ($0.02/credit) costs roughly 12x more on every record that Apollo could have found. Always test providers on a 200-record sample to measure hit rate and accuracy before setting the waterfall order.

Skipping verification. Unverified emails from any provider bounce at 8-15%. Those bounces damage your sender reputation and can take weeks to recover from. Verification costs $0.003-0.005 per email. There is no scenario where skipping it saves money.

Over-enriching low-priority accounts. Running a 4-stage waterfall on 5,000 Tier C accounts because "we might as well" burns $2,500-5,000 in credits for leads that enter a nurture sequence with 0.5% conversion probability. Tier your enrichment depth to match account priority.

Not measuring per-provider accuracy. Quarterly, pull 50 random records from each provider's results. Manually verify 10-15 of them. If a provider's accuracy drops below 85%, move them down the waterfall or replace them. Provider data quality shifts over time.

Measuring Performance

Monthly dashboard: Coverage rate by waterfall stage (what percentage does each provider contribute?). Incremental coverage per dollar (diminishing returns analysis). Total cost per enriched record (target: under $0.10 for email, under $0.25 for phone). Verification pass rate by provider.
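Most of those dashboard numbers fall out of the source-tracking column. A minimal sketch, assuming each row records the winning provider in a source field (empty on a miss) and a boolean verified flag; the field names and the total spend figure are hypothetical inputs.

```python
from collections import Counter

def dashboard(rows: list[dict], total_spend: float) -> dict:
    """Coverage by provider, cost per enriched record, and verification pass rate."""
    total = len(rows)
    enriched = [r for r in rows if r.get("source")]
    by_source = Counter(r["source"] for r in enriched)
    verified = Counter(r["source"] for r in enriched if r.get("verified"))
    return {
        "coverage_by_source": {s: n / total for s, n in by_source.items()},
        "cost_per_enriched": total_spend / len(enriched) if enriched else None,
        "verification_pass_rate": {s: verified[s] / n for s, n in by_source.items()},
    }

rows = (
    [{"source": "clay", "verified": True}] * 5
    + [{"source": "clay", "verified": False}]
    + [{"source": "apollo", "verified": True}] * 2
    + [{"source": None}] * 2
)
print(dashboard(rows, total_spend=0.40))
```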

Quarterly audit: Manual accuracy check on 50 records per provider. Compare provider hit rates vs 6 months ago. Review downstream conversion rates by enrichment source (do Apollo-sourced contacts convert differently than Lusha-sourced contacts?). Adjust waterfall ordering based on data.

Annual review: Evaluate new providers entering the market. Test against your existing waterfall on a 500-record sample. Replace underperforming providers. Renegotiate contracts based on actual usage data.

Provider Pricing Comparison

Clay (included with plan): Built-in enrichment runs on your Clay credit balance. Explorer plan ($149/month) includes 5,000 credits. Pro plan ($349/month) includes 10,000 credits. Basic email lookup costs 1-2 credits. Company enrichment costs 2-3 credits. Clay is your cheapest first-stage provider because the enrichment is bundled with your orchestration platform.

Apollo ($49/month Professional): Email finder at $0.01-0.03 per credit. Strong coverage for US startups and mid-market companies. Free tier gives 50 credits/month for testing. Annual plans cut per-credit cost by 50-60%. Apollo's strength: fast turnaround, broad US coverage, good for companies under 500 employees.

Lusha ($36/month Pro): Direct dials and verified emails. Per-credit cost: $0.10-0.30 depending on plan. Lusha is expensive but fills gaps other providers miss, particularly for mid-market and enterprise contacts. Strong in US market. Use only as Stage 3 after Clay and Apollo miss.

Cognism ($25,000-50,000/year): European and global coverage. Diamond Data (phone-verified numbers) is their differentiator. Per-contact cost: $0.20-0.40. Only justified if European contacts represent 30%+ of your outbound or phone-verified data is critical for your sales motion.

FullEnrich (custom pricing): Multi-source waterfall provider. Aggregates 15+ data sources in a single API call. Per-lookup cost: $0.15-0.50 depending on volume. Best as a Stage 4 provider for hard-to-find contacts at Tier A accounts where the missing email is a specific VP or C-suite target.

ZeroBounce/NeverBounce ($0.003-0.008 per verification): Email verification is non-negotiable. At 1,000 emails, verification costs $3-8. That investment prevents bounce-rate damage that costs weeks of deliverability recovery. Never skip verification to save $5.

Waterfall Implementation Checklist

Before running your first production batch:

1. Input data cleaned: proper-case names, validated domains, duplicates removed.
2. Clay table built with conditional enrichment columns (Stage 1 fires on every row, Stage 2 fires only on Stage 1 misses, etc.).
3. COALESCE formula column returning the best available result from all stages.
4. Email verification column running on every non-empty email result.
5. Source tracking column recording which provider returned the winning result.
6. CRM push configured for verified results (HubSpot, Salesforce, or CRM of choice).
7. Manual research queue for high-value accounts with complete misses.
8. Provider cost tracking in place (credits consumed per batch, cost per enriched record).
9. 200-record test batch run with manual accuracy check (10-15 records verified by hand).
10. Monthly dashboard template created for coverage rates, costs, and provider performance.

Data Decay and Re-Enrichment

Contact data decays at 2-3% per month. After 6 months, 12-18% of your enriched data is stale: people changed jobs, companies were acquired, phone numbers were reassigned. Stale data wastes outbound credits and damages deliverability.
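The 12-18% figure is the simple monthly rate multiplied by six. If decay compounds, so that each month removes 2-3% of what is still valid, the six-month number comes out slightly lower. A quick sketch of both calculations; the function name is illustrative.

```python
def stale_fraction(monthly_decay: float, months: int, compounding: bool = True) -> float:
    """Share of records expected to be stale after a number of months."""
    if compounding:
        # Each month removes monthly_decay of the records that are still valid.
        return 1 - (1 - monthly_decay) ** months
    return monthly_decay * months  # straight-line approximation

for rate in (0.02, 0.03):
    linear = stale_fraction(rate, 6, compounding=False)
    compound = stale_fraction(rate, 6)
    print(f"{rate:.0%}/month: linear {linear:.1%}, compounding {compound:.1%}")
```

Either way, the practical conclusion is the same: somewhere above a tenth of a six-month-old list is likely to be wrong.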

Re-verification schedule: Re-verify all emails every 60 days. Email verification is cheap ($0.003-0.005 per check). A 5,000-record verification run costs $15-25 and catches the 2-3% that became invalid since your last check. Remove any emails that fail re-verification immediately.

Full re-enrichment schedule: Re-enrich your active pipeline every 90 days. For contacts not in active sequences, re-enrich before reactivating any dormant list. Running a year-old enrichment list through outbound without re-enrichment produces 10-15% bounce rates. That damages your sender reputation for weeks.

Job change detection: Clay's LinkedIn enrichment can detect title changes. Run a monthly scan on your top 500 accounts. Contacts who changed companies need re-enrichment at the new company. Contacts who got promoted may need re-segmentation into a different persona template.

Frequently Asked Questions

How many providers in a waterfall?

3-4. Provider 1 handles 60-70% of the list. Provider 2 adds 15-20 percentage points. Provider 3 adds 5-10 more. Beyond 4 providers, incremental coverage per dollar drops too low.

What order?

Cheapest with acceptable accuracy first. For email: Clay built-in first, then Apollo ($0.01-0.03/credit), then Lusha or Cognism ($0.10-0.30/credit).

How to measure coverage?

Coverage rate, accuracy rate, cost per enriched record. Test 200 known-good records. Benchmarks: 85%+ coverage, 90%+ accuracy, under $0.10/record for email.

Verify after waterfall?

Always verify emails. It costs $0.003-0.005 per address. Never send to unverified addresses. Add phone validation for calling campaigns.

How often re-enrich?

Data decays 2-3%/month. Active pipeline every 90 days. Re-verify emails every 60 days. Full re-enrichment before reactivating dormant lists.

Source: State of GTM Engineering Report 2026 (n=228). Salary data combines survey responses from 228 GTM Engineers across 32 countries with analysis of 3,342 job postings.
