API Integrations

Use without Airtable in Lambda, Airflow, Glue, GCP, and more

Processing Modes

Choose how to use the enricher:

| Mode | Use Case | Needs Airtable? |
|------|----------|-----------------|
| batch | Read from Airtable, enrich, write back | ✅ Yes: requires apiKey, baseId, tableId |
| single | Enrich specific Airtable records | ✅ Yes: same as batch, plus recordIds |
| api | Standalone enrichment (no Airtable) | ❌ No: just provide a companies array |
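
For example, a minimal sketch of a single-mode payload, assuming recordIds sits at the top level next to the airtable block (check the actor's input schema for the exact placement):

# Hypothetical single-mode payload: same credentials as batch, plus explicit record IDs.
single_config = {
    "mode": "single",
    "airtable": {
        "apiKey": "patXXXXXXXXXXXXXX",
        "baseId": "appXXXXXXXXXXXXXX",
        "tableId": "tblXXXXXXXXXXXXXX",
    },
    # Placement of recordIds is an assumption; adjust to the actor's input schema.
    "recordIds": ["recXXXXXXXXXXXXXX"],
}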

Quick Start (API Mode)

Use without Airtable - perfect for integrations:

POST https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items?token=YOUR_TOKEN

{
  "mode": "api",
  "companies": [
    {"companyName": "Acme Corp", "website": "https://acme.example"}
  ]
}

Limits: up to 1,000 companies per run in API mode, and up to 100 records per run in Airtable mode (batch/single).
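
If you have more than 1,000 companies, split them across runs. A minimal sketch using the same endpoint and payload as the Quick Start above (chunk size is the documented API-mode limit):

import requests

def enrich_in_chunks(companies, token, chunk_size=1000):
    # Issue one run per chunk of at most 1,000 companies (the API-mode limit).
    results = []
    for i in range(0, len(companies), chunk_size):
        response = requests.post(
            "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
            params={"token": token},
            json={"mode": "api", "companies": companies[i:i + chunk_size]},
            timeout=300,
        )
        results.extend(response.json())
    return results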

Integration Examples

Ready-to-use code for common platforms. All examples in examples/ directory.

๐Ÿ“ API Keys: Pass your LLM API key directly in the JSON config under "llm": {"apiKey": "sk-..."}. Examples show reading from infrastructure env vars - that's optional and only for your own code.
- AWS Lambda: Python • Node.js
- GCP Functions: Python
- AWS Glue: ETL Job
- AWS ECS: Task Definition
- Docker: Run Anywhere
- GitHub Actions: Workflow
- Airflow: DAG

AWS Lambda (Python)

File: examples/lambda_python.py

import json
import os
import requests

def lambda_handler(event, context):
    config = {
        "mode": "api",
        "companies": [
            {"companyName": "Acme Corp", "website": "https://acme.example"}
        ],
        "llm": {
            "enabled": True,
            "provider": "openai",
            "apiKey": os.environ['OPENAI_API_KEY']
        }
    }

    response = requests.post(
        "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
        params={"token": os.environ['APIFY_TOKEN']},
        json=config,
        timeout=300
    )

    results = response.json()
    enriched = [r for r in results if r.get('type') == 'ENRICHED_COMPANY']

    return {'statusCode': 200, 'body': json.dumps({'enriched': enriched})}

Environment Variables: APIFY_TOKEN, OPENAI_API_KEY

AWS Lambda (Node.js)

File: examples/lambda_node.js

exports.handler = async (event) => {
  const config = {
    mode: "api",
    companies: [
      { companyName: "Acme Corp", website: "https://acme.example" }
    ],
    llm: {
      enabled: true,
      provider: "openai",
      apiKey: process.env.OPENAI_API_KEY
    }
  };

  const response = await fetch(
    `https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items?token=${process.env.APIFY_TOKEN}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(config)
    }
  );

  const results = await response.json();
  const enriched = results.filter(r => r.type === 'ENRICHED_COMPANY');

  return { statusCode: 200, body: JSON.stringify({ enriched }) };
};

GCP Cloud Functions (Python)

File: examples/gcp_function_python.py

import functions_framework
import requests
import json
import os

@functions_framework.http
def enrich_leads(request):
    request_json = request.get_json(silent=True) or {}  # silent=True returns None on bad JSON
    companies = request_json.get('companies', [])

    config = {
        "mode": "api",
        "companies": companies,
        "llm": {
            "enabled": True,
            "provider": "openai",
            "apiKey": os.environ['OPENAI_API_KEY']
        }
    }

    response = requests.post(
        "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
        params={"token": os.environ['APIFY_TOKEN']},
        json=config,
        timeout=300
    )

    results = response.json()
    enriched = [r for r in results if r.get('type') == 'ENRICHED_COMPANY']
    return json.dumps({'enriched': enriched})

AWS Glue Job

File: examples/glue_job.py

import sys

from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
import requests

args = getResolvedOptions(sys.argv, ['JOB_NAME', 'APIFY_TOKEN', 'OPENAI_API_KEY'])
sc = SparkContext()
glueContext = GlueContext(sc)

# Read input data
input_df = glueContext.create_dynamic_frame.from_catalog(
    database="your_database",
    table_name="companies"
).toDF()

# Enrich each company (collect() pulls all rows to the driver; fine for small tables)
for row in input_df.collect():
    config = {
        "mode": "api",
        "companies": [{"companyName": row.company_name, "website": row.website}],
        "llm": {"enabled": True, "provider": "openai", "apiKey": args['OPENAI_API_KEY']}
    }

    response = requests.post(
        "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
        params={"token": args['APIFY_TOKEN']},
        json=config
    )
    # Process results...
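
The result handling is elided above. One way to finish it, assuming you flatten each record's input and output fields and write everything back out (the S3 path is a placeholder):

def collect_enriched(items):
    # Keep only ENRICHED_COMPANY records; merge input and output fields flat.
    return [{**item['input'], **item['output']}
            for item in items
            if item.get('type') == 'ENRICHED_COMPANY']

# Inside the loop:  enriched_rows.extend(collect_enriched(response.json()))
# After the loop:
#   glueContext.spark_session.createDataFrame(enriched_rows) \
#       .write.mode('overwrite').parquet('s3://your-bucket/enriched/')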

AWS ECS/Fargate

File: examples/ecs_task_definition.json

{
  "family": "lead-enricher",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [{
    "name": "lead-enricher",
    "image": "python:3.11-slim",
    "command": ["python", "-c", "import requests; ..."],
    "secrets": [
      {"name": "APIFY_TOKEN", "valueFrom": "arn:aws:secretsmanager:..."},
      {"name": "OPENAI_API_KEY", "valueFrom": "arn:aws:secretsmanager:..."}
    ]
  }]
}

Docker (EC2, Local, Anywhere)

File: examples/docker_run.sh

docker run --rm \
  -e APIFY_TOKEN="${APIFY_TOKEN}" \
  -e OPENAI_API_KEY="${OPENAI_API_KEY}" \
  python:3.11-slim \
  bash -c "pip install requests && python -c 'import requests; ...'"
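
The inline python -c payload is elided above; in practice you would bake a small script into the image or mount it. A minimal sketch of what that script could look like (enrich.py is a hypothetical name):

# enrich.py - hypothetical script for the container to run instead of the inline -c payload
import os
import requests

config = {
    "mode": "api",
    "companies": [{"companyName": "Acme Corp", "website": "https://acme.example"}],
    "llm": {"enabled": True, "provider": "openai", "apiKey": os.environ["OPENAI_API_KEY"]},
}

response = requests.post(
    "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
    params={"token": os.environ["APIFY_TOKEN"]},
    json=config,
    timeout=300,
)
print(response.json())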

GitHub Actions

File: examples/github_actions.yml

name: Enrich Leads
on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM

jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - name: Enrich Leads
        env:
          APIFY_TOKEN: ${{ secrets.APIFY_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          curl -X POST \
            "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items?token=${APIFY_TOKEN}" \
            -H "Content-Type: application/json" \
            -d '{"mode":"api","companies":[...]}'

Apache Airflow

File: examples/airflow_dag.py

from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
import os
import requests

def enrich_leads(**context):
    config = {
        "mode": "api",
        "companies": [{"companyName": "Acme", "website": "https://acme.example"}],
        "llm": {"enabled": True, "provider": "openai", "apiKey": os.environ['OPENAI_API_KEY']}
    }

    response = requests.post(
        "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
        params={"token": os.environ['APIFY_TOKEN']},
        json=config
    )

    return response.json()

with DAG('enrich_leads', start_date=datetime(2025, 1, 1), schedule_interval='0 2 * * *', catchup=False) as dag:
    PythonOperator(task_id='enrich', python_callable=enrich_leads)

Configuration

Airtable Mode Config

{
  "mode": "batch",
  "airtable": {
    "apiKey": "patXXXXXXXXXXXXXX",
    "baseId": "appXXXXXXXXXXXXXX",
    "tableId": "tblXXXXXXXXXXXXXX",
    "inputFields": {
      "companyName": "Company Name",  // Map to your column
      "website": "Website"
    },
    "outputFields": {
      "email": "Contact Email",       // Map to your column
      "phone": "Phone Number",
      "leadScore": "Lead Score"
    }
  }
}
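
The Airtable modes run against the same actor endpoint as API mode; only the payload differs. A minimal sketch, assuming batch_config holds the JSON object above as a Python dict:

import os
import requests

response = requests.post(
    "https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items",
    params={"token": os.environ["APIFY_TOKEN"]},
    json=batch_config,  # the "mode": "batch" object shown above
    timeout=300,
)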

API Mode Config

{
  "mode": "api",
  "companies": [
    {"companyName": "Acme", "website": "https://acme.example"}
  ],
  "llm": {
    "enabled": true,
    "provider": "openai",  // or "anthropic", "bedrock"
    "apiKey": "sk-...",
    "model": "gpt-4o"      // optional
  },
  "enrichment": {
    "sources": ["google_maps", "website", "hunter"],
    "hunter": {"enabled": true, "apiKey": "..."}
  },
  "scoring": {
    "enabled": true,
    "icpCriteria": "B2B SaaS, 50-500 employees, US-based"
  }
}

LLM Providers

| Provider | Model | API Key |
|----------|-------|---------|
| OpenAI | gpt-4o | Get key |
| Anthropic | claude-haiku-4-5 | Get key |
| Bedrock | claude-haiku-4-5 | Setup |
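
Switching providers is just a config change. For example, a sketch of an Anthropic llm block (same shape as the API Mode Config above; the model field stays optional):

# Anthropic instead of OpenAI: different provider, key, and model name.
llm_config = {
    "enabled": True,
    "provider": "anthropic",
    "apiKey": "sk-ant-...",
    "model": "claude-haiku-4-5",  # optional, per the table above
}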

CRM Integrations

Integrate with Salesforce, HubSpot, Pipedrive, Zoho, and other CRMs.

Method 1: Webhooks

Receive enriched data via POST webhook:

{
  "webhookUrl": "https://your-crm.com/webhook/enriched-leads"
}
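
On the receiving side, any HTTPS endpoint works. A minimal sketch of a receiver, assuming the webhook body carries records in the Response Format shown below (Flask is just one option):

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook/enriched-leads", methods=["POST"])
def enriched_leads():
    payload = request.get_json(force=True)
    items = payload if isinstance(payload, list) else [payload]  # shape is an assumption
    for item in items:
        if item.get("type") == "ENRICHED_COMPANY":
            pass  # push item["output"] into your CRM here
    return "", 204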

Method 2: Automation Platforms

| Platform | Use Case |
|----------|----------|
| Zapier | New CRM lead → Enrich → Update CRM |
| Make (Integromat) | Schedule enrichment, sync to CRM |
| n8n | Self-hosted workflow automation |

Method 3: Direct API

Call from CRM automation (Salesforce Apex, HubSpot workflows, etc.):

// HubSpot Workflow Custom Code
const response = await fetch(
  'https://api.apify.com/v2/acts/datahq~airtable-lead-enricher/run-sync-get-dataset-items?token=YOUR_TOKEN',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      mode: 'api',
      companies: [{ companyName: company.name, website: company.website }]
    })
  }
);

// Update HubSpot contact with enriched data
const enriched = await response.json();
// ... update logic ...

Example: Salesforce โ†’ Enrich โ†’ Update

Using Make.com:

  1. Trigger: New Salesforce Lead created
  2. HTTP Module: POST to Apify API with lead data
  3. Salesforce Module: Update lead with enriched data (email, phone, score)

Example: HubSpot Auto-Enrichment

Using Zapier:

  1. Trigger: New HubSpot contact added to list "Needs Enrichment"
  2. Webhooks by Zapier: POST to Apify API
  3. HubSpot: Update contact properties with enriched data
  4. HubSpot: Remove from "Needs Enrichment" list

Response Format

[
  {
    "type": "RUN_STATS",
    "stats": {"companiesProcessed": 1, "enrichmentSuccessful": 1, "llmEnabled": true}
  },
  {
    "type": "ENRICHED_COMPANY",
    "input": {"companyName": "Acme Corp", "website": "https://acme.example"},
    "output": {
      "email": "contact@acme.example",
      "phone": "+1 555 0100",
      "leadScore": "Good",
      "icpScore": "Fair",
      "summary": "...",
      "techStack": ["React", "Node.js"],
      "enrichedAt": "2025-12-21T10:00:00.000Z"
    },
    "success": true
  }
]
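
A small sketch of splitting that array into run stats and enriched records:

results = response.json()  # the array shown above

stats = next((r["stats"] for r in results if r.get("type") == "RUN_STATS"), None)
enriched = [r for r in results if r.get("type") == "ENRICHED_COMPANY" and r.get("success")]

for record in enriched:
    print(record["input"]["companyName"], "->", record["output"].get("email"))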

Output Fields

| Field | Source |
|-------|--------|
| email | Website / Hunter |
| phone | Google Maps / Website |
| leadScore | AI (Excellent/Good/Fair/Poor/Bad) |
| icpScore | AI (Excellent/Good/Fair/Poor/Bad) |
| linkedinUrl | Hunter |
| techStack | Website |
| summary | AI |
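
Continuing the parsing sketch above, filtering on those AI scores:

# leadScore is one of Excellent / Good / Fair / Poor / Bad (see table above).
strong = [r for r in enriched if r["output"].get("leadScore") in ("Excellent", "Good")]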
