# Website Change Monitoring API: The Complete Developer Guide

Source: PageCrawl.io Blog
URL: https://pagecrawl.io/blog/website-change-monitoring-api-developer-guide

---

You need to know when a web page changes. Maybe it is a competitor's pricing page, a regulatory document, a product listing, or a documentation site your application depends on. The question is whether you build the monitoring infrastructure yourself or use an API that handles it for you.

This guide covers everything a developer needs to know about website change monitoring APIs: how they work, how to integrate them, and how to build automated workflows on top of change detection.

### What a Monitoring API Does

A website change monitoring API handles four things you would otherwise build yourself:

1. **Scheduled fetching** - Opens pages on a configurable schedule using a real browser (handling JavaScript rendering, dynamic content, and bot protection)
2. **Change detection** - Compares each fetch with the previous one, identifying what specifically changed
3. **Diff generation** - Produces structured diffs in multiple formats (text, HTML, markdown, visual screenshots)
4. **Event delivery** - Sends webhooks, emails, or push notifications when changes are detected

Without a monitoring API, you are writing cron jobs, managing headless browsers, storing snapshots, building comparison logic, and wiring up notifications. That is a substantial engineering investment for what should be a utility.
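
To make the scope of that do-it-yourself work concrete, here is a minimal sketch of only the comparison and diff steps, using Python's standard library. The function names `snapshot_hash` and `text_diff` are illustrative, and the sketch assumes the page text has already been fetched and rendered. Scheduling, browser rendering, snapshot storage, and delivery would all still need to be built.

```python
import difflib
import hashlib


def snapshot_hash(text: str) -> str:
    """Return a stable fingerprint for a page snapshot."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def text_diff(previous: str, current: str) -> list:
    """Return only the added ("+") and removed ("-") lines between two snapshots."""
    if snapshot_hash(previous) == snapshot_hash(current):
        return []  # Identical snapshots: nothing to report
    diff = difflib.unified_diff(
        previous.splitlines(), current.splitlines(), lineterm="", n=0
    )
    # Skip the two file-header lines, then drop the "@@" hunk markers.
    # With n=0 there are no context lines, so only +/- lines remain.
    body = list(diff)[2:]
    return [line for line in body if not line.startswith("@@")]


old = "Pro plan: $49/month\nTeam plan: $99/month"
new = "Pro plan: $59/month\nTeam plan: $99/month"
print(text_diff(old, new))
```

Even this small piece ignores whitespace noise, rotating ads, and timestamps, which is where most of the real effort in change detection tends to go.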

### Setting Up Your First Monitor

Most monitoring APIs follow a similar pattern. Here is how it works with PageCrawl's [REST API](/developers):

**1. Create a monitor:**

```bash
curl -X POST "https://pagecrawl.io/api/track-simple" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://competitor.com/pricing",
    "tracking_mode": "fullpage",
    "frequency": 60
  }'
```

**2. Set up a webhook:**

```bash
curl -X POST "https://pagecrawl.io/api/hooks" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "target_url": "https://your-app.com/webhooks/page-change",
    "match_type": "all",
    "events": ["change_detected"]
  }'
```

**3. Process incoming changes:**

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/page-change", methods=["POST"])
def handle_change():
    data = request.json  # Parsed JSON body sent by the webhook

    print(f"Page changed: {data['title']}")
    print(f"AI summary: {data.get('ai_summary')}")
    print(f"Diff: {data.get('markdown_difference')}")

    # Your application logic here
    return "", 200  # Acknowledge receipt with an empty 200 response
```

That is the entire setup. The API handles scheduling, browser rendering, comparison, and delivery.
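
If you would rather create monitors from application code than from the shell, the first curl call above translates directly. The following is a minimal sketch using only the standard library. The helper name `build_create_monitor_request` and the `API_BASE` constant are illustrative, and the endpoint and body fields are copied from the curl example rather than from the full API reference.

```python
import json
import urllib.request

API_BASE = "https://pagecrawl.io/api"  # Base URL taken from the curl examples


def build_create_monitor_request(token, url, tracking_mode="fullpage", frequency=60):
    """Build (but do not send) the POST request that creates a monitor."""
    body = json.dumps({
        "url": url,
        "tracking_mode": tracking_mode,
        "frequency": frequency,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/track-simple",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_create_monitor_request("YOUR_TOKEN", "https://competitor.com/pricing")
# To actually send it: urllib.request.urlopen(req, timeout=10)
print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the helper easy to unit test without touching the network.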

### Tracking Modes

Different content types require different detection strategies. A pricing page needs different handling than a terms of service document or an RSS feed.

| Mode | Best for | What it tracks |
|------|----------|----------------|
| `fullpage` | General pages, ToS, policies | All visible text |
| `content_only` | News, blogs, articles | Main content (strips navigation and sidebars) |
| `reader` | Editorial content | Reader-mode extracted text |
| `price` | E-commerce, product pages | Auto-detected prices and availability |
| `specific_text` | Any page | Text of a specific CSS/XPath element |
| `specific_number` | Dashboards, stats | Numeric value from a specific element |
| `feed` | Job boards, listings | Repeating items (additions, removals, changes) |
| `seo` | Marketing, SEO | Title, meta, canonical, robots, OG tags |

For most use cases, `fullpage` is the right default. Use `price` for e-commerce pages, `content_only` for editorial content, and `specific_text` when you only care about one element on the page.
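
When you create many monitors programmatically, it can help to encode that guidance once. This is a hedged sketch: the content category names (`"policy"`, `"product"`, and so on) and the helper `pick_tracking_mode` are made up for illustration, while the mode names come from the table above.

```python
from typing import Optional

# Illustrative categories mapped to the tracking modes from the table.
MODE_BY_CONTENT = {
    "policy": "fullpage",
    "article": "content_only",
    "editorial": "reader",
    "product": "price",
    "dashboard": "specific_number",
    "listing": "feed",
    "seo": "seo",
}


def pick_tracking_mode(content_type: str, selector: Optional[str] = None) -> str:
    """Choose a tracking mode for a page based on a rough content category."""
    mode = MODE_BY_CONTENT.get(content_type)
    if mode is not None:
        return mode
    # Unknown category: watch a single element if a selector was supplied,
    # otherwise fall back to the general-purpose default.
    return "specific_text" if selector else "fullpage"


print(pick_tracking_mode("product"))
print(pick_tracking_mode("other", selector="#changelog"))
```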

### Webhook Payload Structure

When a change is detected, the webhook delivers a JSON payload with structured data about the change. You can customize which fields are included.

```json
{
  "id": 12345,
  "title": "Competitor Pricing Page",
  "status": "ok",
  "changed_at": "2026-05-04T14:30:00Z",
  "contents": "Pro plan: $59/month...",
  "difference": 8,
  "human_difference": "Text difference of 8%",
  "ai_summary": "The Pro plan price increased from $49/mo to $59/mo. The Enterprise plan now includes SSO.",
  "ai_priority_score": 85,
  "markdown_difference": "- Pro plan: $49/month\n+ Pro plan: $59/month",
  "page": {
    "id": 100,
    "name": "Competitor Pricing Page",
    "url": "https://competitor.com/pricing",
    "slug": "competitor-pricing"
  }
}
```

Key fields for developers:
- `markdown_difference` - Machine-parseable diff showing additions and removals
- `ai_summary` - Natural language summary of what changed (generated automatically)
- `ai_priority_score` - 0-100 importance score (higher = more significant change)
- `contents` - Current value of the tracked element
- `page_screenshot_image` - Signed URL to a full-page screenshot (not included in the sample payload above)
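
As an illustration of working with `markdown_difference`, the sketch below splits it into added and removed lines. It assumes every changed line is prefixed with `+ ` or `- ` exactly as in the sample payload; the real format may carry more structure, so treat `split_diff` as a hypothetical starting point.

```python
from typing import Optional


def split_diff(markdown_difference: Optional[str]):
    """Split a diff string into (added, removed) lists of lines.

    Assumes the "+ " / "- " line prefixes seen in the sample payload.
    """
    added, removed = [], []
    for line in (markdown_difference or "").splitlines():
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return added, removed


payload = {
    "markdown_difference": "- Pro plan: $49/month\n+ Pro plan: $59/month",
}
added, removed = split_diff(payload.get("markdown_difference"))
print("Added:", added)
print("Removed:", removed)
```

Because the function tolerates a missing or null field, it is safe to call even when you have customized the payload to omit the diff.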

### Diff Formats

Beyond webhooks, you can retrieve diffs on demand in multiple formats:

| Format | Endpoint | Use case |
|--------|----------|----------|
| PNG image | `GET /api/pages/{id}/checks/{checkId}/diff.png` | Embed in reports, emails |
| HTML | `GET /api/pages/{id}/checks/{checkId}/diff.html` | Render in web UI |
| Markdown | `GET /api/pages/{id}/checks/{checkId}/diff.markdown` | Feed to LLMs, log in text |
| Patch | `GET /api/pages/{id}/checks/{checkId}/diff.patch` | Apply as unified diff |
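
The four endpoints differ only in their file extension, so a small helper can build any of them. This is a sketch under assumptions: `diff_url` is an illustrative name, the base URL is inferred from the curl examples earlier in this guide, and the IDs in the usage line are placeholders.

```python
API_BASE = "https://pagecrawl.io/api"  # Inferred from the curl examples
DIFF_FORMATS = {"png", "html", "markdown", "patch"}


def diff_url(page_id: int, check_id: int, fmt: str = "markdown") -> str:
    """Return the on-demand diff URL for one check of one page."""
    if fmt not in DIFF_FORMATS:
        raise ValueError(f"Unsupported diff format: {fmt!r}")
    return f"{API_BASE}/pages/{page_id}/checks/{check_id}/diff.{fmt}"


# Placeholder IDs; send the request with the same Bearer token header as other calls.
print(diff_url(100, 7, "patch"))
```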

### Notification Rules

Not every change deserves an alert. Use notification rules to filter which changes trigger webhooks:

```json
{
  "rules_enabled": true,
  "rules": [
    {"type": "contains", "value": "price"},
    {"type": "text_difference", "value": 5}
  ],
  "rules_and": false
}
```

Available rule types include `contains`, `added`, `removed`, `gt`, `lt`, `increased`, `decreased`, and more. See the [full API reference](/developers) for all options.
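
If you generate rule sets in code, a small builder keeps the JSON shape consistent. The helper below is hypothetical. In particular, mapping `match_all` onto `rules_and` assumes that flag controls whether all rules must match rather than any one of them, which you should confirm in the API reference.

```python
import json


def notification_rules(rules, match_all=False):
    """Build a rules config in the shape shown above.

    `rules` is a list of (type, value) pairs. Treating `rules_and` as
    "require every rule to match" is an assumption, not documented here.
    """
    return {
        "rules_enabled": True,
        "rules": [{"type": rule_type, "value": value} for rule_type, value in rules],
        "rules_and": match_all,
    }


config = notification_rules([("contains", "price"), ("text_difference", 5)])
print(json.dumps(config, indent=2))
```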

### Common Integration Patterns

#### Database logging

Store every detected change in your own database for historical analysis:

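```python
import sqlite3
from contextlib import closing
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/page-change", methods=["POST"])
def log_change():
    data = request.json
    # Assumes a SQLite file with an existing page_changes table; swap in your own database.
    # closing() closes the connection; the inner `db` context commits or rolls back.
    with closing(sqlite3.connect("changes.db")) as db, db:
        db.execute(
            "INSERT INTO page_changes (monitor_id, url, changed_at, summary, diff, priority) VALUES (?, ?, ?, ?, ?, ?)",
            [data['page']['id'], data['page']['url'], data['changed_at'],
             data.get('ai_summary'), data.get('markdown_difference'),
             data.get('ai_priority_score')]
        )
    return "", 200
```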

#### Slack alerting with priority filtering

Only send high-priority changes to Slack:

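```python
import os
import requests
from flask import Flask, request

app = Flask(__name__)
# Assumed environment variable holding your Slack incoming-webhook URL
SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK_URL"]

@app.route("/webhooks/page-change", methods=["POST"])
def alert_slack():
    data = request.json
    priority = data.get("ai_priority_score") or 0  # Treat a missing or null score as 0

    if priority < 50:
        return "", 200  # Skip low-priority noise

    summary = data.get("ai_summary") or "No summary"
    requests.post(SLACK_WEBHOOK, json={
        "text": f"*{data['title']}* changed (priority: {priority}/100)\n{summary}"
    }, timeout=10)
    return "", 200
```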

#### Trigger CI/CD on documentation changes

Re-deploy when your documentation source changes:

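```python
import os
import requests
from flask import Flask, request

app = Flask(__name__)
REPO = os.environ["GITHUB_REPO"]  # Assumed env var in "owner/name" form
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]  # Assumed env var; token with access to the repo

@app.route("/webhooks/page-change", methods=["POST"])
def trigger_rebuild():
    data = request.json
    # Send a repository_dispatch event; workflows listening for "docs-changed" will run
    requests.post(
        f"https://api.github.com/repos/{REPO}/dispatches",
        headers={"Authorization": f"token {GITHUB_TOKEN}",
                 "Accept": "application/vnd.github+json"},
        json={"event_type": "docs-changed", "client_payload": {"url": data['page']['url']}},
        timeout=10,
    )
    return "", 200
```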

### MCP Server for AI Assistants

If you use Claude, ChatGPT, or Cursor, the [PageCrawl MCP server](/help/integrations/article/mcp-server-ai-tools.md) lets AI assistants manage monitors through conversation:

- "Monitor docs.example.com and alert me when the API reference changes"
- "What changed across all my monitors today?"
- "Show me the diff for the last change on the pricing page"

This is particularly useful for developers who want to set up monitoring without leaving their IDE.

### Getting Started

Start with the [API quick-start guide](/help/features/article/api-webhooks-for-custom-integrations.md) to create your first monitor and webhook in under a minute. The [interactive API reference](/developers) has all endpoints with schemas and example responses.

PageCrawl was built with developers in mind from day one, with a full REST API, customizable webhooks, an MCP server for AI assistants, and a downloadable OpenAPI spec. The free tier includes 6 monitors with AI summaries and webhooks, so you can build and test your integration before committing to a paid plan.

