In this tutorial, you'll build a practical lead enrichment flow that takes a simple list of company names and websites, then automatically enriches each lead with detailed information and an AI-generated quality score.
What You'll Build
A complete lead enrichment pipeline that:
- Starts with basic lead data (company name + website)
- Analyzes each company's website
- Extracts key business information
- Generates an AI-powered lead score
- Saves enriched data back to your dataset
- Deploys as an API for CRM integration
Step 1: Prepare Your Dataset
Create Input Dataset
1. Go to Datasets: navigate to the Datasets section.
2. Create a new dataset: name it "Leads - Raw".
3. Add columns: company, website, source, created_at.
4. Import or add sample data: add 5-10 test leads.
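Before importing a CSV, it can help to check that it has the four columns listed above. The following is a minimal sketch using only the Python standard library; the function name `check_leads_csv` and the specific checks are illustrative assumptions, not part of the platform.

```python
import csv
import io

# Columns from step 3 above.
REQUIRED_COLUMNS = ["company", "website", "source", "created_at"]

def check_leads_csv(text):
    """Return a list of problems found in a leads CSV (empty list means OK)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # the header is line 1
        if not (row["company"] or "").strip():
            problems.append(f"line {line_no}: empty company")
        if not (row["website"] or "").startswith(("http://", "https://")):
            problems.append(f"line {line_no}: website should start with http:// or https://")
    return problems

sample = """company,website,source,created_at
Acme Corp,https://acme.com,webform,2024-01-15
TechStart Inc,techstart.io,linkedin,2024-01-15
"""
print(check_leads_csv(sample))
```

The second sample row is deliberately missing its URL scheme, so the check reports it.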
Sample Data
company,website,source,created_at
Acme Corp,https://acme.com,webform,2024-01-15
TechStart Inc,https://techstart.io,linkedin,2024-01-15
Global Solutions,https://globalsolutions.com,referral,2024-01-16
Step 2: Create Enrichment Prompts
Prompt 1: Business Analyzer
In the Playground, create this prompt:
Analyze this company's website and provide structured information:
Company: {{companyName}}
Website Content: {{websiteContent}}
Provide a JSON response with:
{
"industry": "Primary industry",
"companySize": "estimated employee count range",
"products": ["list", "of", "main products/services"],
"targetMarket": "B2B, B2C, or Both",
"technologyStack": ["detected", "technologies"],
"description": "2-sentence company overview"
}
Be factual and concise.
Test it, then save as "Business Analyzer".
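When testing the prompt, note that models sometimes wrap the JSON in extra text. The sketch below shows one way to parse such a reply and confirm that all six fields requested by the prompt are present. The helper name `parse_analysis` is hypothetical, and the approach (taking the text between the first `{` and the last `}`) is a simple assumption rather than the platform's own parsing.

```python
import json

# Keys requested by the Business Analyzer prompt above.
REQUIRED_KEYS = {
    "industry", "companySize", "products",
    "targetMarket", "technologyStack", "description",
}

def parse_analysis(raw):
    """Parse the model's reply into a dict, tolerating text around the JSON object."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response")
    data = json.loads(raw[start:end + 1])
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data

reply = """Here is the analysis:
{"industry": "Manufacturing", "companySize": "200-500",
 "products": ["Industrial equipment"], "targetMarket": "B2B",
 "technologyStack": ["React"], "description": "Acme builds machines."}
"""
print(parse_analysis(reply)["industry"])
```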
Prompt 2: Lead Scorer
Score this lead from 1-10 based on these factors:
Company: {{companyName}}
Industry: {{industry}}
Company Size: {{companySize}}
Products: {{products}}
Target Market: {{targetMarket}}
Source: {{leadSource}}
Scoring criteria:
- Relevance to our ideal customer profile
- Company size (prefer 50-500 employees)
- Technology adoption (modern stack is better)
- B2B focus (higher score)
- Website quality and professionalism
Provide:
{
"score": 1-10,
"reasoning": "brief explanation",
"nextSteps": "recommended action",
"priority": "high/medium/low"
}
Test and save as "Lead Scorer".
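Both prompts use `{{variable}}` placeholders that are filled from mapped values when the flow runs. To preview a prompt locally, the substitution can be approximated as below. This is a sketch under the assumption that placeholders are simple word-like names; how the platform itself performs the substitution is not specified here, and `render_prompt` is a hypothetical helper.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def render_prompt(template, variables):
    """Replace each {{name}} with its value; raise if a variable is unmapped."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("no value mapped for " + match.group(0))
        value = variables[name]
        # Lists (such as products) are joined into a readable string.
        return ", ".join(value) if isinstance(value, list) else str(value)
    return PLACEHOLDER.sub(substitute, template)

template = "Company: {{companyName}}\nProducts: {{products}}\nSource: {{leadSource}}"
print(render_prompt(template, {
    "companyName": "Acme Corp",
    "products": ["Industrial equipment", "Automation solutions"],
    "leadSource": "webform",
}))
```

Raising on an unmapped variable makes a missing mapping visible before the prompt is sent, rather than leaving a literal placeholder in the text.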
Step 3: Build the Flow
Flow Structure
Dataset Source ("Leads - Raw")
↓
Website Mapper (discover pages)
↓
Page Scraper (get homepage)
↓
HTML Text Extractor (clean content)
↓
Prompt: Business Analyzer (extract info)
↓
Prompt: Lead Scorer (score & prioritize)
↓
Dataset Sink (UPDATE "Leads - Raw")
Detailed Configuration
1. Dataset Source
   - Select the "Leads - Raw" dataset
   - Expose fields: company, website, source
   - Filter: enriched_at IS NULL (only process new leads)
2. Website Mapper
   - Map: out.website → url
   - Max pages: 1 (we only need the homepage)
3. Page Scraper
   - Map: out.urls[0] → url
   - Selector: "body" (get the full page)
4. HTML Text Extractor
   - Map: out.html → html
   - Mode: Standard
5. Prompt: Business Analyzer
   - Select the "Business Analyzer" prompt
   - Map: out.company → companyName
   - Map: HTMLExtractor.out.text → websiteContent
6. Prompt: Lead Scorer
   - Select the "Lead Scorer" prompt
   - Map: out.company → companyName
   - Map: BusinessAnalyzer.out.industry → industry
   - Map: BusinessAnalyzer.out.companySize → companySize
   - Map: BusinessAnalyzer.out.products → products
   - Map: BusinessAnalyzer.out.targetMarket → targetMarket
   - Map: out.source → leadSource
7. Dataset Sink
   - Target: "Leads - Raw"
   - Mode: UPDATE
   - Match on: id
   - Map fields: industry, company_size, products, target_market, lead_score, priority, next_steps, enriched_at
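The prompts return camelCase keys (companySize, targetMarket, nextSteps, score), while the dataset columns are snake_case. The sketch below spells out that mapping in plain Python to make the Dataset Sink configuration concrete. The function `to_dataset_row` is illustrative only, and the pairing of `score` with `lead_score` is inferred from the field names in this tutorial.

```python
from datetime import datetime, timezone

def to_dataset_row(analysis, scoring, now=None):
    """Combine both prompt outputs into the columns written by the Dataset Sink."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "industry": analysis["industry"],
        "company_size": analysis["companySize"],
        "products": analysis["products"],
        "target_market": analysis["targetMarket"],
        "lead_score": scoring["score"],
        "priority": scoring["priority"],
        "next_steps": scoring["nextSteps"],
        # UTC timestamp, so the "enriched_at IS NULL" filter skips this lead next run.
        "enriched_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

analysis = {"industry": "Manufacturing", "companySize": "200-500 employees",
            "products": ["Industrial equipment"], "targetMarket": "B2B"}
scoring = {"score": 8, "priority": "high", "nextSteps": "Schedule demo"}
print(to_dataset_row(analysis, scoring))
```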
Step 4: Test Your Flow
Run with Sample Data
1. Select 2-3 leads: don't run all of them at once initially.
2. Click "Run Flow": watch the execution.
3. Monitor progress: check each node's output.
4. Verify results: check the updated dataset.
Expected Output
Your dataset should now have enriched data:
{
"company": "Acme Corp",
"website": "https://acme.com",
"source": "webform",
"industry": "Manufacturing",
"company_size": "200-500 employees",
"products": ["Industrial equipment", "Automation solutions"],
"target_market": "B2B",
"lead_score": 8,
"priority": "high",
"next_steps": "Schedule demo, emphasize automation ROI",
"enriched_at": "2024-01-15T10:30:00Z"
}
Step 5: Handle Array Processing
Add Batch Capabilities
To process multiple leads efficiently:
Dataset Source
↓
Array Splitter (parallel: 5)
↓
[Individual processing per lead]
↓
Array Flatten
↓
Dataset Sink (batch update)
Parallel Configuration
- Concurrency: 5 (safe for most APIs)
- Error handling: Skip and continue
- Timeout: 60s per lead
- Retry: 2 attempts on failure
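To make these four settings concrete, here is a rough asyncio equivalent: a semaphore caps concurrency at 5, each attempt has a 60-second timeout, and a failing lead is skipped after its retries are used up. This is a sketch, not the platform's implementation. It reads "2 attempts on failure" as two retries after the first try, which is an assumption, and `enrich` stands in for whatever per-lead processing you run.

```python
import asyncio

async def process_all(leads, enrich, concurrency=5, timeout=60.0, retries=2):
    """Enrich leads concurrently; failed leads are skipped and reported."""
    semaphore = asyncio.Semaphore(concurrency)

    async def process_one(lead):
        async with semaphore:  # at most `concurrency` leads in flight
            last_error = None
            for _ in range(1 + retries):  # first try plus retries
                try:
                    result = await asyncio.wait_for(enrich(lead), timeout)
                    return lead, result, None
                except Exception as exc:  # includes timeouts
                    last_error = exc
            return lead, None, last_error

    outcomes = await asyncio.gather(*(process_one(lead) for lead in leads))
    enriched = [result for _, result, error in outcomes if error is None]
    failed = [(lead, error) for lead, _, error in outcomes if error is not None]
    return enriched, failed

# Hypothetical stand-in for the real per-lead pipeline.
async def fake_enrich(lead):
    if lead["company"] == "Broken Co":
        raise RuntimeError("site unreachable")
    return {**lead, "lead_score": 7}

leads = [{"company": "Acme Corp"}, {"company": "Broken Co"}]
enriched, failed = asyncio.run(process_all(leads, fake_enrich))
print(len(enriched), "enriched,", len(failed), "skipped")
```

The sketch retries immediately; in practice a short delay between attempts is usually kinder to the sites being scraped.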
Step 6: Add Quality Control
Validation Node
Add a validation step before Dataset Sink:
Lead Scorer output
↓
Validation: Check required fields
- lead_score exists and is between 1 and 10
- priority is high/medium/low
- industry is not "Unknown"
↓
If valid: → Dataset Sink
If invalid: → Error log + skip
Fallback Strategy
If website scraping fails:
→ Try alternative data source
→ Or set default values
→ Mark as "needs_manual_review"
→ Still save to dataset
Step 7: Deploy as API
Add API Nodes
API Input (single lead)
fields: company, website, source
↓
[Processing pipeline]
↓
API Output
return: enriched lead data
API Integration
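import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
# Assumption: the API key is supplied via an environment variable of this name.
api_key = os.environ["EVALIGO_API_KEY"]

# CRM integration
def enrich_new_lead(company, website, source):
    response = requests.post(
        "https://api.evaligo.com/flows/lead-enrichment/execute",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "company": company,
            "website": website,
            "source": source,
        },
        timeout=60,  # avoid hanging on a slow enrichment
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    enriched_data = response.json()
    # Update CRM with enriched data (crm is your own CRM client, defined elsewhere)
    crm.update_lead(enriched_data)
    # Route high-priority leads (notify_sales_team is your own helper)
    if enriched_data.get("priority") == "high":
        notify_sales_team(enriched_data)
    return enriched_data

# Webhook from web form
@app.route('/webhook/new-lead', methods=['POST'])
def new_lead_webhook():
    lead = request.json
    enriched = enrich_new_lead(
        lead['company'],
        lead['website'],
        'webform'
    )
    return jsonify(enriched)
Step 8: Monitor and Optimize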
Key Metrics to Track
- Enrichment success rate
- Average processing time
- Cost per lead
- Lead score distribution
- Conversion rate by score
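If you export run results, these five metrics can be computed in a few lines. The sketch below assumes each run is recorded as a dict with `ok`, `seconds`, `cost`, `score`, and `converted` fields. That record shape is hypothetical, so adapt the keys to whatever your logs actually contain.

```python
from collections import Counter

def summarize_runs(runs):
    """Summarize enrichment runs.

    Each run is a dict with keys: ok (bool), seconds (float), cost (float),
    score (int, or None when enrichment failed), converted (bool).
    """
    if not runs:
        raise ValueError("no runs to summarize")
    total = len(runs)
    succeeded = [r for r in runs if r["ok"]]
    by_score = Counter(r["score"] for r in succeeded)
    converted = Counter(r["score"] for r in succeeded if r["converted"])
    return {
        "success_rate": len(succeeded) / total,
        "avg_seconds": sum(r["seconds"] for r in runs) / total,
        "cost_per_lead": sum(r["cost"] for r in runs) / total,
        "score_distribution": dict(by_score),
        "conversion_rate_by_score": {s: converted[s] / n for s, n in by_score.items()},
    }

runs = [
    {"ok": True, "seconds": 10.0, "cost": 0.25, "score": 8, "converted": True},
    {"ok": True, "seconds": 20.0, "cost": 0.25, "score": 8, "converted": False},
    {"ok": True, "seconds": 15.0, "cost": 0.25, "score": 5, "converted": False},
    {"ok": False, "seconds": 15.0, "cost": 0.25, "score": None, "converted": False},
]
print(summarize_runs(runs))
```

Cost and time are averaged over all runs, including failed ones, since a failed enrichment still consumes both.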
Optimization Strategies
Week 1 Baseline:
Success rate: 85%
Avg time: 15s per lead
Cost: $0.25 per lead
Optimizations:
1. Cache website data (same domain)
→ Cost: $0.18 per lead (-28%)
2. Parallel processing (5 concurrent)
→ Time: 3s per lead (-80%)
3. Refined scoring prompt
→ Better prioritization
Week 4 Results:
Success rate: 92%
Avg time: 3s per lead
Cost: $0.18 per lead
Conversion rate (high priority): +35%
Advanced Features
Re-enrichment Schedule
Run flow weekly:
Filter: enriched_at older than 30 days
→ Update company data
→ Recalculate scores
→ Detect changes (funding, growth, etc.)
Multi-Source Enrichment
Source 1: Website analysis
Source 2: LinkedIn company data
Source 3: News mentions
Source 4: Tech stack detection
→ Combine all sources
→ Generate comprehensive profile
Smart Routing
High-score leads (8-10):
→ Immediate sales notification
→ Priority in CRM
Mid-score leads (5-7):
→ Marketing nurture campaign
→ Weekly follow-up
Low-score leads (1-4):
→ Generic nurture sequence
→ Monthly check-in
Troubleshooting
Common Issues
Low Enrichment Success Rate
Problem: Many leads fail to enrich
Solutions:
- Verify websites are accessible
- Increase timeout for slow sites
- Add retry logic
- Implement fallback data sources
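The fallback idea can be sketched as trying each data source in order and, if none returns content, still producing a record flagged as "needs_manual_review" (the status named in the Fallback Strategy in Step 6). The function and the `(name, fetch)` source list are illustrative assumptions; the fetchers here are stand-ins, not real scrapers.

```python
def enrich_with_fallback(lead, sources):
    """Try each (name, fetch) source in order; flag the lead if all of them fail."""
    errors = []
    for name, fetch in sources:
        try:
            content = fetch(lead["website"])
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        if content:
            return {**lead, "content": content, "content_source": name, "status": "ok"}
        errors.append(f"{name}: empty content")
    # Nothing worked: keep the lead so it is still saved, but mark it for review.
    return {**lead, "content": None, "content_source": None,
            "status": "needs_manual_review", "errors": errors}

# Hypothetical fetchers for demonstration.
def failing_scraper(url):
    raise TimeoutError("site did not respond")

def cached_snapshot(url):
    return "Acme builds industrial equipment."

lead = {"company": "Acme Corp", "website": "https://acme.com"}
print(enrich_with_fallback(lead, [("scraper", failing_scraper), ("snapshot", cached_snapshot)]))
print(enrich_with_fallback(lead, [("scraper", failing_scraper)])["status"])
```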
Inconsistent Scores
Problem: Lead scores seem random
Solutions:
- Use structured output schema
- Add more specific criteria to prompt
- Lower temperature (0.3-0.5)
- Test prompt with edge cases
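One way to catch inconsistent scorer output early is to check it against the same rules as the validation step in Step 6: a score from 1 to 10, a priority of high, medium, or low, and an industry other than "Unknown". The sketch below is a plain-Python version of those checks; `validate_scoring` is a hypothetical helper, not a built-in node.

```python
ALLOWED_PRIORITIES = {"high", "medium", "low"}

def validate_scoring(scoring, industry):
    """Return a list of validation problems (an empty list means the lead is valid)."""
    problems = []
    score = scoring.get("score")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 10:
        problems.append("score must be an integer from 1 to 10")
    if scoring.get("priority") not in ALLOWED_PRIORITIES:
        problems.append("priority must be high, medium, or low")
    if not industry or industry == "Unknown":
        problems.append("industry is missing or Unknown")
    return problems

print(validate_scoring({"score": 8, "priority": "high"}, "Manufacturing"))
print(validate_scoring({"score": "8/10", "priority": "urgent"}, "Unknown"))
```

Running edge cases such as a string score or an unexpected priority through a check like this makes prompt regressions easy to spot.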
Slow Processing
Problem: Takes too long
Solutions:
- Enable parallel processing
- Cache website data
- Use async execution
- Optimize prompt length
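Caching website data can be as simple as keying fetched pages by domain, so that two leads pointing at the same site trigger one fetch. The sketch below shows the idea with an in-memory dict. `DomainCache` is a hypothetical helper, it assumes URLs include a scheme such as https://, and a production cache would also need expiry.

```python
from urllib.parse import urlparse

class DomainCache:
    """Cache fetched content per domain so repeated domains are fetched once."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URL and returning page content
        self._pages = {}
        self.misses = 0

    @staticmethod
    def domain_key(url):
        host = urlparse(url).hostname or ""
        # Treat www.example.com and example.com as the same site.
        return host[4:] if host.startswith("www.") else host

    def get(self, url):
        key = self.domain_key(url)
        if key not in self._pages:
            self.misses += 1
            self._pages[key] = self._fetch(url)
        return self._pages[key]

# Hypothetical fetcher for demonstration.
cache = DomainCache(lambda url: f"content of {url}")
cache.get("https://acme.com")
cache.get("https://www.acme.com/about")
cache.get("https://techstart.io")
print(cache.misses)
```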
Next Steps
Enhance Your Flow
- Add competitor analysis
- Integrate with LinkedIn API
- Extract contact information
- Detect recent funding/news
- Build custom ICP matching
Scale to Production
- Process full lead database
- Set up automated scheduling
- Integrate with CRM webhooks
- Build enrichment dashboard
- Track ROI and conversion rates