Account Scoring Model Guide: Building from Enrichment Data
Stop treating all accounts equally. A weighted scoring model turns enrichment data into prioritized pipeline.
The Problem with Flat Account Lists
Your enrichment pipeline produced 2,000 accounts. Your team can work 200 per month. Which 200? Without scoring, you are guessing. A Tier A account that would close in 14 days gets the same attention as a Tier D account that will never buy. Account scoring assigns each account a numeric value based on ICP fit and buying signals.
The math is simple. A sales rep working unsorted accounts books meetings at 2-3% of outreach volume. The same rep working Tier A scored accounts books at 5-8%. That is a 2-3x improvement in pipeline from the same headcount and the same tools. Once built, the model itself costs nothing to run beyond the enrichment credits you are already spending. It just requires the right data and the right weights.
Layer 1: Firmographic Fit (0-30 points)
Employee count match (0-10): Perfect match to your ICP range = 10. Adjacent bucket (one step larger or smaller) = 5. Two+ buckets away = 0. Use the ranges from your ICP definition. If your best customers are 51-200 employees, a 201-500 company scores 5 and a 1,000+ company scores 0.
Industry match (0-10): Exact industry match = 10. Adjacent industry (e.g., fintech for a SaaS targeting financial services) = 5. No match = 0. Map to SIC/NAICS codes for consistency. Clay enrichment returns industry classifications for 90%+ of US B2B companies.
Revenue/funding fit (0-10): Revenue in your sweet spot = 10. Adjacent range = 5. For early-stage companies without revenue data, use funding stage as a proxy: Series A-B = 10, Seed = 7, Series C+ = 5 (for most startup-focused products). Funding raised in the last 18 months adds 3 bonus points because fresh capital correlates with tool purchasing.
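To make the layer concrete, here is a minimal Python sketch of the Layer 1 rules, assuming enrichment output lands in a plain dict. The bucket boundaries, target industries, and field names (employee_count, industry, funding_stage, months_since_funding) are illustrative stand-ins, not Clay's actual schema.

```python
# Layer 1 sketch: firmographic fit, 0-30 points.
# Thresholds mirror the rules above; field names are assumptions.

EMPLOYEE_BUCKETS = [(1, 10), (11, 50), (51, 200), (201, 500), (501, 10**6)]
ICP_BUCKET = 2  # 51-200 employees, per the example above
EXACT_INDUSTRIES = {"Financial Services"}   # your ICP industries
ADJACENT_INDUSTRIES = {"Fintech"}           # one step out

def employee_bucket(n: int) -> int:
    for i, (lo, hi) in enumerate(EMPLOYEE_BUCKETS):
        if lo <= n <= hi:
            return i
    return -99  # unknown or out of range: scores 0 below

def firmographic_score(a: dict) -> int:
    score = 0
    # Employee count: exact bucket = 10, adjacent = 5, two+ away = 0
    dist = abs(employee_bucket(a.get("employee_count", 0)) - ICP_BUCKET)
    score += {0: 10, 1: 5}.get(dist, 0)
    # Industry: exact = 10, adjacent = 5, no match = 0
    if a.get("industry") in EXACT_INDUSTRIES:
        score += 10
    elif a.get("industry") in ADJACENT_INDUSTRIES:
        score += 5
    # Funding stage as revenue proxy: Series A-B = 10, Seed = 7, C+ = 5
    stage = a.get("funding_stage") or ""
    if stage in ("Series A", "Series B"):
        score += 10
    elif stage == "Seed":
        score += 7
    elif stage.startswith("Series"):
        score += 5
    # Fresh-capital bonus; cap the layer at its nominal 30
    if a.get("months_since_funding", 99) <= 18:
        score += 3
    return min(score, 30)
```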
Layer 2: Technographic Fit (0-25 points)
Complementary tools (0-10): Build a list of 5-8 tools your best customers run alongside your product. Each match scores 2 points, capped at 10. If 70% of your closed-won accounts use HubSpot, HubSpot presence is a strong positive signal. Clay technographics detect 200+ tools including CRM, marketing automation, and sales engagement platforms.
Stack density (0-8): Total number of GTM tools detected. 1-2 tools = 2 points. 3-4 tools = 5 points. 5+ tools = 8 points. Stack-dense companies have established budget for tooling, ops support to manage integrations, and the sophistication to evaluate new solutions. They're easier to sell to.
Competitor usage (0-7): Accounts running a direct competitor score 7. They already understand the problem space, have budget allocated, and know what they want to improve. Displacement campaigns targeting competitor users convert at 1.5-2x greenfield rates. Detection methods: Clay technographics for web-based tools, job posting mentions for less visible tools.
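Layer 2 reduces to set operations, as in this sketch. The complementary-tool and competitor lists are hypothetical placeholders for your own; technographics are assumed to arrive as a set of detected tool names.

```python
# Layer 2 sketch: technographic fit, 0-25 points.
COMPLEMENTARY_TOOLS = {"HubSpot", "Outreach", "Gong", "Segment", "Slack"}
COMPETITORS = {"CompetitorX", "CompetitorY"}  # hypothetical names

def technographic_score(detected_tools: set[str]) -> int:
    score = 0
    # Complementary tools: 2 points per match, capped at 10
    score += min(10, 2 * len(detected_tools & COMPLEMENTARY_TOOLS))
    # Stack density: 1-2 tools = 2, 3-4 = 5, 5+ = 8
    n = len(detected_tools)
    score += 8 if n >= 5 else 5 if n >= 3 else 2 if n >= 1 else 0
    # Competitor usage: flat 7 points
    if detected_tools & COMPETITORS:
        score += 7
    return score
```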
Layer 3: Behavioral Signals (0-25 points)
Hiring activity (0-10): The strongest predictive signal outside of inbound intent. Companies hiring for roles your product supports close at 2.4x the rate of cold accounts. Hiring a VP of Sales while you sell CRM? That is 10 points. Hiring SDRs (adjacent) = 6 points. General hiring growth (10+ open roles) = 3 points. See the buying signal guide for detection methods.
Engagement signals (0-8): Website visits = 3 points. Content downloads = 5 points. Pricing page visit = 8 points. These require marketing automation or website analytics integration. If you don't have first-party intent data, skip this sub-score and redistribute points to hiring and third-party signals.
Third-party intent (0-7): 6sense, Bombora, or G2 buyer intent data. High intent on your category = 7. Medium = 4. None = 0. Intent data subscriptions run $30,000-100,000/year, so this layer is only justified for companies with $20M+ ARR and broad TAMs. For earlier-stage teams, the free signals (hiring, funding, tech stack changes) cover 80% of the same ground.
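Layer 3 under the same assumptions. Because the engagement sub-score caps at 8, this sketch counts only the strongest single engagement signal, which is one reasonable reading of the rules above; the boolean flag names are invented for illustration.

```python
# Layer 3 sketch: behavioral signals, 0-25 points.
def behavioral_score(a: dict) -> int:
    score = 0
    # Hiring: exact role match = 10, adjacent role = 6, 10+ open roles = 3
    if a.get("hiring_exact_role"):
        score += 10
    elif a.get("hiring_adjacent_role"):
        score += 6
    elif a.get("open_roles", 0) >= 10:
        score += 3
    # Engagement: strongest single signal (pricing 8 > download 5 > visit 3)
    if a.get("visited_pricing"):
        score += 8
    elif a.get("downloaded_content"):
        score += 5
    elif a.get("visited_site"):
        score += 3
    # Third-party intent: high = 7, medium = 4, none = 0
    score += {"high": 7, "medium": 4}.get(a.get("intent_level"), 0)
    return score
```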
Layer 4: Deal Potential (0-20 points)
Estimated ACV (0-10): Based on company size and your pricing model. If your average deal for 51-200 employee companies is $25,000/year, companies in that range score 10. Companies that would likely be smaller deals (under $10,000) score 3-5. The point is to prioritize accounts that justify the sales effort.
Expansion potential (0-5): Multi-department potential = 5. Multiple geographies = 3. Single-team use case = 1. Companies with expansion potential have higher lifetime value, making them worth more upfront investment in the sales process.
No disqualifiers (0-5): Start with 5 points and subtract for red flags. Government procurement process = -3. Active contract with competitor (known from intel) = -2. Company in financial distress (layoffs, negative news) = -5. This prevents high-scoring accounts from wasting rep time on deals that won't close regardless of fit.
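Layer 4, including the start-at-5-and-subtract disqualifier logic, might look like the following. The ACV threshold for the middle band and the red-flag field names are assumptions.

```python
# Layer 4 sketch: deal potential, 0-20 points.
def deal_score(a: dict) -> int:
    score = 0
    # Estimated ACV: sweet spot = 10; smaller deals = 3-5 per the text
    acv = a.get("estimated_acv", 0)
    score += 10 if acv >= 25_000 else 5 if acv >= 10_000 else 3
    # Expansion: multi-department = 5, multi-geo = 3, single team = 1
    if a.get("multi_department"):
        score += 5
    elif a.get("multi_geo"):
        score += 3
    else:
        score += 1
    # Disqualifiers: start at 5, subtract per red flag, floor at 0
    penalty = (3 * bool(a.get("gov_procurement"))
               + 2 * bool(a.get("competitor_contract"))
               + 5 * bool(a.get("financial_distress")))
    score += max(0, 5 - penalty)
    return score
```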
Implementing in Clay
Create an enrichment table with all data points as columns. Use Clay's built-in enrichment for firmographics and technographics. Add manual or webhook-fed columns for behavioral signals (engagement data from your marketing platform, hiring data from scheduled Clay job posting checks).
Score column structure: Create 4 sub-columns: firmographic_score, tech_score, behavioral_score, deal_score. Each uses an IF/formula chain based on the criteria above. A fifth column sums all four. A sixth column assigns the tier using IF logic: A (70-100), B (50-69), C (30-49), D (0-29).
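In plain Python the six-column structure reduces to four sub-score calls, a sum, and a tier cut; function names reference the layer sketches above.

```python
# Master score and tier columns as plain functions.
def tier(total: int) -> str:
    if total >= 70:
        return "A"
    if total >= 50:
        return "B"
    if total >= 30:
        return "C"
    return "D"

def score_account(account: dict, tools: set[str]) -> dict:
    sub = {
        "firmographic_score": firmographic_score(account),
        "tech_score": technographic_score(tools),
        "behavioral_score": behavioral_score(account),
        "deal_score": deal_score(account),
    }
    sub["total_score"] = sum(sub.values())
    sub["tier"] = tier(sub["total_score"])
    return sub
```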
Why sub-columns matter: When a Tier B account looks like it should be Tier A, you need to see which dimension pulled the score down. Maybe the firmographics are perfect but there are zero behavioral signals. That tells you the account is a fit but not in a buying cycle. Sub-columns make debugging instant instead of a guessing game.
Automated routing: Use Clay's action columns or webhook integrations to route scored accounts. Tier A: create in CRM (HubSpot/Salesforce) with owner assignment + Slack notification to the rep. Tier B: push to Instantly or Smartlead for standard outbound sequences. Tier C: tag in CRM as nurture with a 90-day re-score trigger. Tier D: archive. No manual sorting required.
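Expressed outside Clay, the routing step is a simple dispatch on tier. The webhook URLs below are placeholders; in Clay this would be HTTP API or native integration columns rather than hand-rolled requests.

```python
import requests

# Hypothetical endpoints; substitute your CRM, sequencer, and Slack hooks.
ROUTES = {
    "A": "https://hooks.example.com/crm-create-and-slack",  # CRM + rep alert
    "B": "https://hooks.example.com/sequencer-enroll",      # standard outbound
    "C": "https://hooks.example.com/crm-nurture-tag",       # 90-day re-score
}

def route(scored: dict) -> None:
    url = ROUTES.get(scored["tier"])
    if url is None:  # Tier D: archive, no downstream action
        return
    requests.post(url, json=scored, timeout=10).raise_for_status()
```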
Calibration
Backtest first. Run 50 closed-won deals through the model. At least 70% should score Tier A. If fewer than 60% do, your weights are off. The most common error: under-weighting technographic fit and over-weighting firmographics.
Check false positives. Run 50 closed-lost deals through the same model. Fewer than 30% should score Tier A. If 40%+ of your losses score Tier A, the model isn't discriminating well enough. Tighten criteria or add disqualifiers.
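Both backtests reduce to one helper, assuming each historical deal has already been run through score_account:

```python
def tier_a_share(scored_deals: list[dict]) -> float:
    # Fraction of a deal sample that scores Tier A
    return sum(d["tier"] == "A" for d in scored_deals) / len(scored_deals)

# tier_a_share(closed_won)  -> want >= 0.70; weights are off below 0.60
# tier_a_share(closed_lost) -> want < 0.30; poor discrimination at 0.40+
```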
A/B test in production. Run scored outbound alongside unscored outbound for 30 days. Track meeting rates, pipeline generated, and cycle length for each group. Tier A should produce 2-3x the meeting rate of unsorted accounts. If the gap is less than 1.5x, recalibrate.
Quarterly recalibration: Pull fresh closed-won data every quarter. Compare actual conversion rates across tiers. If Tier B converts at the same rate as Tier A, your weights need adjustment. Shift points from dimensions that don't predict toward dimensions that do. Most models need 2-3 calibration cycles before they stabilize.
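The quarterly comparison is a small aggregation, assuming each closed deal record carries the tier it held when scored and a won flag:

```python
from collections import defaultdict

def win_rate_by_tier(deals: list[dict]) -> dict[str, float]:
    totals, wins = defaultdict(int), defaultdict(int)
    for d in deals:
        totals[d["tier"]] += 1
        wins[d["tier"]] += bool(d["won"])
    return {t: wins[t] / totals[t] for t in sorted(totals)}

# If the Tier B rate roughly matches Tier A, shift points toward the
# dimensions that actually separate the two tiers.
```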
Scoring at Scale
Clay can score 1,000+ accounts per month with zero manual intervention. The bottleneck is enrichment credits. Each account consumes 10-20 credits depending on waterfall depth and the number of enrichment columns.
Credit planning: At 1,000 accounts/month with 15 credits average per account, you need 15,000 credits. Clay Pro ($349/month) includes 10,000 credits. Clay Team ($720/month) includes 25,000. Map your per-account credit consumption before committing to a plan.
Cost optimization: Score in two phases. Phase 1: firmographic scoring only (2-3 credits per account). Filter out Tier D accounts. Phase 2: full enrichment on remaining accounts (10-15 credits each). This cuts total credit usage by 30-40% because you never spend deep-enrichment credits on accounts that don't pass basic fit criteria.
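The savings figure checks out on the article's own numbers; the 40% Tier D filter rate below is an added assumption.

```python
accounts = 1_000
single_phase = accounts * 15               # 15,000 credits/month, all-at-once

phase1 = accounts * 3                      # firmographic-only pass
survivors = int(accounts * (1 - 0.40))     # assume 40% filtered out as Tier D
phase2 = survivors * 12                    # deep enrichment, within 10-15 range
two_phase = phase1 + phase2                # 3,000 + 7,200 = 10,200

savings = 1 - two_phase / single_phase     # ~0.32, inside the 30-40% claim
```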
See Template 3 in the Clay templates library for the table structure.
Common Scoring Mistakes
Too many criteria. Models with 12+ scoring dimensions create noise. Every dimension adds complexity and makes debugging harder. Stick to 5-8 weighted criteria. If a criterion doesn't predict conversion after 30 days of data, drop it.
Equal weighting. Giving every dimension the same weight assumes they all predict equally well. They don't. Technographic fit and hiring signals predict better than pure firmographics in most B2B contexts. Start with equal weights for the first 30 days, then shift points toward the dimensions that correlate with conversion.
No backtesting. Launching a scoring model without running historical deals through it first is guessing. Pull 50 closed-won and 50 closed-lost deals. If the model can't separate them into different tiers, the weights are wrong.
Scoring without acting. A score that sits in a spreadsheet produces zero pipeline. The model must connect to routing: Tier A accounts get immediate, personalized outbound. Tier B enters standard sequences. Tier C goes to nurture. Tier D gets archived. If reps still pick accounts manually, the scoring model is decoration.
Ignoring decay. Scores are snapshots. An account that scored 85 three months ago may score 60 today because the hiring signal expired and the funding is old news. Re-score active pipeline accounts every 30 days. Re-score the full database every 90 days.
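Both cadences reduce to a one-line check; the field names are assumptions.

```python
from datetime import date, timedelta

def needs_rescore(last_scored: date, in_pipeline: bool) -> bool:
    cadence = 30 if in_pipeline else 90  # active pipeline: 30 days; full database: 90
    return date.today() - last_scored > timedelta(days=cadence)
```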
Scoring Tool Pricing
Clay Explorer ($149/month): 5,000 credits. Enough to score 300-500 accounts with full enrichment at the 10-15 credit average above, more with two-phase scoring. Formula columns handle all scoring logic natively. No additional tools needed for most teams.
Clay Pro ($349/month): 10,000 credits. For teams scoring 1,000+ accounts/month with deep waterfall enrichment. The additional credits justify the upgrade when your Tier A conversion rate proves the model works.
HubSpot native scoring (included in Professional, $800/month): Built-in lead scoring with property-based rules. Limited to CRM data. No enrichment integration without additional tools. Good for basic firmographic scoring, weak for technographic and behavioral signals.
Salesforce Einstein Lead Scoring (included in Enterprise, $165/user/month): ML-based scoring that learns from your data. Requires 500+ closed deals to train effectively. Below that sample size, rule-based scoring outperforms ML.
MadKudu ($20,000-50,000/year): Purpose-built scoring platform. ML models trained on your data with predictive analytics. Only justified at $10M+ ARR with 500+ closed deals per year and a dedicated RevOps team to manage it.
Implementation Checklist
Before going live with scored outbound:
1. Defined 5-8 scoring criteria across firmographic, technographic, behavioral, and deal potential dimensions.
2. Assigned point values with documented rationale for each weight.
3. Built sub-columns in Clay for each dimension (firmographic_score, tech_score, behavioral_score, deal_score).
4. Created a master score column summing all sub-columns.
5. Created a tier column with IF logic (A: 70-100, B: 50-69, C: 30-49, D: 0-29).
6. Backtested against 50 closed-won deals (70%+ should score Tier A).
7. Backtested against 50 closed-lost deals (under 30% should score Tier A).
8. Configured automated routing: Tier A to CRM + Slack notification, Tier B to standard sequences, Tier C to nurture, Tier D archived.
9. Set a 30-day calendar reminder for the first calibration review.
10. Set a quarterly reminder for full recalibration with fresh closed-won data.
What Good Looks Like
A well-calibrated scoring model produces three measurable outcomes. First, Tier A accounts convert to meetings at 2-3x the rate of unscored outbound. Second, average deal velocity (days to close) for Tier A is 20-40% faster than lower tiers. Third, reps stop wasting time on accounts that never close and focus their energy where it compounds.
One more benefit that's hard to quantify: rep morale. Reps working scored lists trust the data and execute sequences with more confidence. Reps working unsorted lists burn out faster because 80% of their outreach goes nowhere. Scoring doesn't just improve pipeline. It improves retention of your best salespeople.
Frequently Asked Questions
What is the difference between account scoring and lead scoring?
Account scoring evaluates companies (firmographics, technographics). Lead scoring evaluates contacts (behavior, demographics). Build both: account scoring filters companies, lead scoring prioritizes contacts within them.
How many data points does a reliable model need?
5-8 weighted criteria. More than 12 creates noise. Key signals: employee count fit, industry, tech stack, funding recency, hiring, engagement.
Should I use machine learning?
Not until 500+ closed-won data points. Below that, rule-based scoring outperforms ML. Most Series A-B teams have 50-200 deals, enough for rules but not supervised learning.
How often should I recalibrate?
Quarterly. Compare actual win rates across tiers. If Tier B converts at the same rate as Tier A, weights need adjustment.
What tools do I need?
A CRM with closed-won data, an enrichment tool (Clay or Apollo), and a spreadsheet for the initial model. Implement the scoring in Clay formula columns or your CRM's native scoring. No code required.