Website Change Monitoring API: The Complete Developer Guide

You need to know when a web page changes. Maybe it is a competitor's pricing page, a regulatory document, a product listing, or a documentation site your application depends on. The question is whether you build the monitoring infrastructure yourself or use an API that handles it for you.

This guide covers how website change monitoring APIs work, how to integrate them, and how to build automated workflows on top of change detection.

What a Monitoring API Does

A website change monitoring API handles four things you would otherwise build yourself:

  1. Scheduled fetching - Opens pages on a configurable schedule using a real browser (handling JavaScript rendering, dynamic content, and bot protection)
  2. Change detection - Compares each fetch with the previous one, identifying what specifically changed
  3. Diff generation - Produces structured diffs in multiple formats (text, HTML, markdown, visual screenshots)
  4. Event delivery - Sends webhooks, emails, or push notifications when changes are detected

Without a monitoring API, you are writing cron jobs, managing headless browsers, storing snapshots, building comparison logic, and wiring up notifications. That is a substantial engineering investment for what should be a utility.
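
To make the do-it-yourself cost concrete, here is a minimal sketch of just the comparison step, using only the Python standard library. The function names and sample snapshots are illustrative assumptions; a real pipeline would also need fetching, rendering, storage, and scheduling around it.

```python
import difflib
import hashlib


def fingerprint(text: str) -> str:
    """Return a stable SHA-256 hash of a page snapshot's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_change(previous: str, current: str) -> list[str]:
    """Return unified-diff lines between two snapshots, or [] if unchanged."""
    # Cheap equality check first: identical hashes mean nothing changed.
    if fingerprint(previous) == fingerprint(current):
        return []
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )


# Illustrative snapshots (not real fetched pages)
old_snapshot = "Pro plan: $49/month\nEnterprise: contact us"
new_snapshot = "Pro plan: $59/month\nEnterprise: contact us"

diff_lines = detect_change(old_snapshot, new_snapshot)
print("\n".join(diff_lines))
```

Even this small piece leaves open how to ignore noise such as timestamps or rotating ads, which is where most of the engineering effort tends to go.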

Setting Up Your First Monitor

Most monitoring APIs follow a similar pattern. Here is how it works with PageCrawl's REST API:

1. Create a monitor:

curl -X POST "https://pagecrawl.io/api/track-simple" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://competitor.com/pricing",
    "tracking_mode": "fullpage",
    "frequency": 60
  }'

2. Set up a webhook:

curl -X POST "https://pagecrawl.io/api/hooks" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "target_url": "https://your-app.com/webhooks/page-change",
    "match_type": "all",
    "events": ["change_detected"]
  }'

3. Process incoming changes:

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/page-change", methods=["POST"])
def handle_change():
    data = request.json

    # Read fields with .get() so a field missing from the payload does not raise
    print(f"Page changed: {data.get('title')}")
    print(f"AI summary: {data.get('ai_summary')}")
    print(f"Diff: {data.get('markdown_difference')}")

    # Your application logic here
    return "", 200

That is the entire setup. The API handles scheduling, browser rendering, comparison, and delivery.
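
If you would rather script steps 1 and 2 in Python than in curl, the same two requests can be sketched with the standard library. The helper name `build_request` is an assumption for illustration; the paths, headers, and JSON bodies are copied from the curl examples above. The requests are built but not sent, so the snippet runs without a token or network access.

```python
import json
import urllib.request

API_BASE = "https://pagecrawl.io/api"  # base URL taken from the curl examples


def build_request(path: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{API_BASE}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Step 1: create a monitor
monitor_req = build_request(
    "track-simple",
    "YOUR_TOKEN",
    {"url": "https://competitor.com/pricing", "tracking_mode": "fullpage", "frequency": 60},
)

# Step 2: register a webhook
webhook_req = build_request(
    "hooks",
    "YOUR_TOKEN",
    {
        "target_url": "https://your-app.com/webhooks/page-change",
        "match_type": "all",
        "events": ["change_detected"],
    },
)

# To actually send a request (needs a real token and network access):
# with urllib.request.urlopen(monitor_req, timeout=10) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```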

Tracking Modes

Different content types require different detection strategies. A pricing page needs different handling than a terms of service document or an RSS feed.

| Mode | Best for | What it tracks |
| --- | --- | --- |
| fullpage | General pages, ToS, policies | All visible text |
| content_only | News, blogs, articles | Main content (strips navigation and sidebars) |
| reader | Editorial content | Reader-mode extracted text |
| price | E-commerce, product pages | Auto-detected prices and availability |
| specific_text | Any page | Text of a specific CSS/XPath element |
| specific_number | Dashboards, stats | Numeric value from a specific element |
| feed | Job boards, listings | Repeating items (additions, removals, changes) |
| seo | Marketing, SEO | Title, meta, canonical, robots, OG tags |

For most use cases, fullpage is the right default. Use price for e-commerce pages, content_only for editorial content, and specific_text when you only care about one element on the page.
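
When you create many monitors programmatically, that rule of thumb can be encoded as a small lookup. This is only a sketch: the content categories on the left are hypothetical labels you would assign yourself, while the mode names on the right come from the table above.

```python
# Hypothetical content categories (keys) mapped to tracking modes (values).
MODE_BY_CONTENT = {
    "ecommerce": "price",
    "article": "content_only",
    "single_element": "specific_text",
    "metric": "specific_number",
    "listing": "feed",
    "seo": "seo",
}


def pick_tracking_mode(content_kind: str) -> str:
    """Return a tracking mode for a content kind, defaulting to fullpage."""
    return MODE_BY_CONTENT.get(content_kind, "fullpage")


print(pick_tracking_mode("ecommerce"))
print(pick_tracking_mode("terms_of_service"))
```

Unknown categories fall back to fullpage, matching the default suggested above.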

Webhook Payload Structure

When a change is detected, the webhook delivers a JSON payload with structured data about the change. You can customize which fields are included.

{
  "id": 12345,
  "title": "Competitor Pricing Page",
  "status": "ok",
  "changed_at": "2026-05-04T14:30:00Z",
  "contents": "Pro plan: $59/month...",
  "difference": 8,
  "human_difference": "Text difference of 8%",
  "ai_summary": "The Pro plan price increased from $49/mo to $59/mo. The Enterprise plan now includes SSO.",
  "ai_priority_score": 85,
  "markdown_difference": "- Pro plan: $49/month\n+ Pro plan: $59/month",
  "page": {
    "id": 100,
    "name": "Competitor Pricing Page",
    "url": "https://competitor.com/pricing",
    "slug": "competitor-pricing"
  }
}

Key fields for developers:

  • markdown_difference - Machine-parseable diff showing additions and removals
  • ai_summary - Natural language summary of what changed (generated automatically)
  • ai_priority_score - 0-100 importance score (higher = more significant change)
  • contents - Current value of the tracked element
  • page_screenshot_image - Signed URL to a full-page screenshot (not shown in the sample payload above)
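
As a sketch of what "machine-parseable" can mean in practice, the snippet below splits markdown_difference into added and removed lines. It assumes the format shown in the sample payload, where additions start with "+ " and removals with "- "; real diffs may contain other markdown (for example, list items that also begin with "- "), so treat this as a starting point. The function name `split_diff` is illustrative.

```python
def split_diff(markdown_difference: str) -> tuple[list[str], list[str]]:
    """Split a diff string into (added, removed) lines by their +/- prefix."""
    added, removed = [], []
    for line in markdown_difference.splitlines():
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return added, removed


# Trimmed copy of the sample webhook payload
payload = {
    "title": "Competitor Pricing Page",
    "ai_priority_score": 85,
    "markdown_difference": "- Pro plan: $49/month\n+ Pro plan: $59/month",
}

# "or ''" guards against a missing or null diff field
added, removed = split_diff(payload.get("markdown_difference") or "")
print("Added:", added)
print("Removed:", removed)
```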

Diff Formats

Beyond webhooks, you can retrieve diffs on demand in multiple formats:

| Format | Endpoint | Use case |
| --- | --- | --- |
| PNG image | GET /api/pages/{id}/checks/{checkId}/diff.png | Embed in reports, emails |
| HTML | GET /api/pages/{id}/checks/{checkId}/diff.html | Render in web UI |
| Markdown | GET /api/pages/{id}/checks/{checkId}/diff.markdown | Feed to LLMs, log in text |
| Patch | GET /api/pages/{id}/checks/{checkId}/diff.patch | Apply as unified diff |
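
A small helper can build these URLs from a page ID and check ID. This sketch assumes the endpoints live under the same https://pagecrawl.io host used in the curl examples; the check ID below is a made-up placeholder, and the function name is illustrative.

```python
API_BASE = "https://pagecrawl.io/api"  # assumed: same host as the curl examples
DIFF_FORMATS = {"png", "html", "markdown", "patch"}  # extensions from the table


def diff_url(page_id: int, check_id: int, fmt: str) -> str:
    """Return the diff endpoint URL for one check in the requested format."""
    if fmt not in DIFF_FORMATS:
        raise ValueError(f"unsupported diff format: {fmt!r}")
    return f"{API_BASE}/pages/{page_id}/checks/{check_id}/diff.{fmt}"


# 100 is the page id from the sample payload; 67890 is a placeholder check id
print(diff_url(100, 67890, "markdown"))
```

Send the resulting GET request with the same Authorization header as in the setup examples.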

Notification Rules

Not every change deserves an alert. Use notification rules to filter which changes trigger webhooks:

{
  "rules_enabled": true,
  "rules": [
    {"type": "contains", "value": "price"},
    {"type": "text_difference", "value": 5}
  ],
  "rules_and": false
}

Available rule types include contains, added, removed, gt, lt, increased, decreased, and more. See the full API reference for all options.
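
If you also want to filter inside your own webhook receiver, the same configuration shape can be evaluated locally. The sketch below is not how the API evaluates rules; it assumes that contains means a case-insensitive substring match on the diff text, that text_difference means "changed by at least this percentage", and that rules_and switches between requiring all rules and requiring any rule. Check the API reference for the actual semantics.

```python
def rule_matches(rule: dict, diff_text: str, difference_pct: float) -> bool:
    """Evaluate one rule locally. These semantics are assumptions, not the API's."""
    if rule["type"] == "contains":
        return str(rule["value"]).lower() in diff_text.lower()
    if rule["type"] == "text_difference":
        return difference_pct >= float(rule["value"])
    raise ValueError(f"rule type not covered by this sketch: {rule['type']}")


def should_notify(config: dict, diff_text: str, difference_pct: float) -> bool:
    """Combine rule results with AND or OR depending on rules_and."""
    rules = config.get("rules", [])
    if not config.get("rules_enabled") or not rules:
        return True  # no filtering configured: every change passes
    results = [rule_matches(r, diff_text, difference_pct) for r in rules]
    return all(results) if config.get("rules_and") else any(results)


config = {
    "rules_enabled": True,
    "rules": [
        {"type": "contains", "value": "price"},
        {"type": "text_difference", "value": 5},
    ],
    "rules_and": False,
}

diff_text = "- Pro plan: $49/month\n+ Pro plan: $59/month"
print(should_notify(config, diff_text, 8))
```

Under these assumptions the sample change passes because the 8% difference clears the threshold, even though the word "price" never appears in the diff.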

Common Integration Patterns

Database logging

Store every detected change in your own database for historical analysis:

@app.route("/webhooks/page-change", methods=["POST"])
def log_change():
    data = request.json
    # db is assumed to be an open DB-API connection that accepts "?" placeholders (e.g. sqlite3)
    db.execute(
        "INSERT INTO page_changes (monitor_id, url, changed_at, summary, diff, priority) VALUES (?, ?, ?, ?, ?, ?)",
        [data['page']['id'], data['page']['url'], data['changed_at'],
         data.get('ai_summary'), data.get('markdown_difference'),
         data.get('ai_priority_score')]
    )
    db.commit()  # persist the row; skip if your connection autocommits
    return "", 200
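
The handler above assumes a page_changes table already exists. Here is one possible schema with a self-contained insert, using sqlite3 and an in-memory database so it runs as-is. The column types and the `save_change` helper are assumptions chosen to match the INSERT statement, not a schema prescribed by the API.

```python
import sqlite3

# Assumed schema: column names match the INSERT statement in the handler
SCHEMA = """
CREATE TABLE IF NOT EXISTS page_changes (
    monitor_id INTEGER NOT NULL,
    url        TEXT NOT NULL,
    changed_at TEXT NOT NULL,
    summary    TEXT,
    diff       TEXT,
    priority   INTEGER
)
"""


def save_change(db: sqlite3.Connection, data: dict) -> None:
    """Insert one webhook payload as a row in page_changes."""
    with db:  # commits on success, rolls back if the insert raises
        db.execute(
            "INSERT INTO page_changes "
            "(monitor_id, url, changed_at, summary, diff, priority) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (
                data["page"]["id"],
                data["page"]["url"],
                data["changed_at"],
                data.get("ai_summary"),
                data.get("markdown_difference"),
                data.get("ai_priority_score"),
            ),
        )


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
save_change(db, {
    "changed_at": "2026-05-04T14:30:00Z",
    "ai_summary": "The Pro plan price increased from $49/mo to $59/mo.",
    "ai_priority_score": 85,
    "markdown_difference": "- Pro plan: $49/month\n+ Pro plan: $59/month",
    "page": {"id": 100, "url": "https://competitor.com/pricing"},
})
rows = db.execute("SELECT monitor_id, url, priority FROM page_changes").fetchall()
print(rows)
```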

Slack alerting with priority filtering

Only send high-priority changes to Slack:

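import requests

# SLACK_WEBHOOK is your Slack incoming webhook URL, defined elsewhere

@app.route("/webhooks/page-change", methods=["POST"])
def alert_slack():
    data = request.json
    # "or 0" also covers a null score, which .get()'s default would not
    priority = data.get("ai_priority_score") or 0

    if priority < 50:
        return "", 200  # Skip low-priority noise

    requests.post(SLACK_WEBHOOK, json={
        "text": f"*{data['title']}* changed (priority: {priority}/100)\n{data.get('ai_summary') or 'No summary'}"
    }, timeout=10)  # a timeout keeps a slow Slack call from hanging the handler
    return "", 200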

Trigger CI/CD on documentation changes

Re-deploy when your documentation source changes:

@app.route("/webhooks/page-change", methods=["POST"])
def trigger_rebuild():
    data = request.json
    # Trigger GitHub Actions workflow via a repository_dispatch event.
    # REPO ("owner/name") and GITHUB_TOKEN are assumed to be defined elsewhere.
    requests.post(
        f"https://api.github.com/repos/{REPO}/dispatches",
        headers={
            "Authorization": f"token {GITHUB_TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        json={"event_type": "docs-changed", "client_payload": {"url": data['page']['url']}},
        timeout=10,
    )
    return "", 200

MCP Server for AI Assistants

If you use Claude, ChatGPT, or Cursor, the PageCrawl MCP server lets AI assistants manage monitors through conversation:

  • "Monitor docs.example.com and alert me when the API reference changes"
  • "What changed across all my monitors today?"
  • "Show me the diff for the last change on the pricing page"

This is particularly useful for developers who want to set up monitoring without leaving their IDE.

Getting Started

Start with the API quick-start guide to create your first monitor and webhook in under a minute. The interactive API reference has all endpoints with schemas and example responses.

PageCrawl was built with developers in mind from day one, with a full REST API, customizable webhooks, an MCP server for AI assistants, and a downloadable OpenAPI spec. The free tier includes 6 monitors with AI summaries and webhooks, so you can build and test your integration before committing to a paid plan.

Last updated: 13 May, 2026
