How to Integrate PageCrawl.io with Home Assistant: Complete Guide to Web Change Detection Automation

How to Integrate PageCrawl.io with Home Assistant: Complete Guide to Web Change Detection Automation

Home Assistant users love automation. Whether it's turning on lights when you arrive home or adjusting the thermostat based on weather forecasts, the power of Home Assistant lies in connecting different services and triggering actions automatically.

But what about monitoring websites? Home Assistant has a built-in scrape sensor that works great for simple, static HTML pages. However, many sites you actually want to monitor—Raspberry Pi stock on rpilocator.com, Ubiquiti UniFi gear, sneaker drops, protected member portals—require something more powerful.

Why use PageCrawl.io instead of the built-in scrape sensor?

  • Real browser rendering — Modern sites load content via JavaScript. The scrape sensor only sees raw HTML, missing dynamic prices, stock status, and interactive elements. PageCrawl uses a real browser.
  • Anti-bot protection bypass — Sites like Ubiquiti, NVIDIA, and many retailers use Cloudflare, DataDome, or similar protection that blocks simple HTTP requests. PageCrawl handles this automatically.
  • Don't get your IP blocked — Scraping from your home IP can get you blocked or rate-limited. PageCrawl uses rotating proxies and residential IPs.
  • Don't overload servers — Running frequent checks from your HA instance puts load on both your server and the target site. PageCrawl manages check frequency and caching intelligently.
  • Login authentication — Monitor pages behind login walls using browser-based authentication that the scrape sensor can't handle.
  • AI-powered summaries — Get intelligent change summaries instead of raw diffs.
  • Visual screenshots — See exactly what changed with before/after screenshots.

PageCrawl.io is a web monitoring and change detection service built for the sites that are hard to monitor. In this guide, we'll show you how to integrate PageCrawl with Home Assistant using two methods:

  1. Webhooks (Push) — PageCrawl sends notifications directly to Home Assistant when changes are detected. Available on the free plan.
  2. REST API Polling (Pull) — Home Assistant periodically queries PageCrawl for status updates. Requires paid plan.

Both approaches have their use cases, and we'll cover everything from basic setup to advanced automation examples.

Prerequisites

Before you begin, you'll need:

  • A PageCrawl.io account with at least one monitored page
  • Home Assistant installed and accessible (local or Nabu Casa cloud)
  • Basic familiarity with Home Assistant YAML configuration
  • For webhooks: A way to expose Home Assistant to the internet (Nabu Casa, reverse proxy, or Cloudflare Tunnel)
  • For REST API polling: A paid PageCrawl plan (API access is not available on the free tier)

Webhooks are the recommended approach for real-time notifications. When PageCrawl detects a change on your monitored page, it immediately sends an HTTP POST request to your Home Assistant instance with all the change details.

Basic Webhook Setup

Step 1: Create a Webhook Automation in Home Assistant

First, create an automation that listens for incoming webhooks. Add this to your automations.yaml or create it through the UI:

automation:
  - id: pagecrawl_change_detected
    alias: "PageCrawl - Change Detected"
    description: "Triggered when PageCrawl detects a website change"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_change_notification
        allowed_methods:
          - POST
        local_only: false
    action:
      - service: notify.persistent_notification
        data:
          title: "Website Changed: {{ trigger.json.title }}"
          message: >
            {{ trigger.json.page.name }} has changed!

            URL: {{ trigger.json.page.url }}
            Changed at: {{ trigger.json.changed_at }}
            Difference: {{ trigger.json.human_difference }}
            {% if trigger.json.ai_summary %}

            AI Summary: {{ trigger.json.ai_summary }}
            {% endif %}

Step 2: Get Your Webhook URL

Your webhook URL will be in one of these formats:

With Nabu Casa:

https://YOUR_NABU_CASA_URL.ui.nabu.casa/api/webhook/pagecrawl_change_notification

With external access (reverse proxy, Cloudflare Tunnel, etc.):

https://your-homeassistant-domain.com/api/webhook/pagecrawl_change_notification

Important: Since PageCrawl is a cloud-hosted service, your Home Assistant instance must be accessible from the internet to receive webhooks. Local-only URLs (like homeassistant.local) will not work.

Step 3: Configure the Webhook in PageCrawl

  1. Log in to PageCrawl.io
  2. Go to Settings → Webhooks
  3. Click Add Webhook
  4. Enter your Home Assistant webhook URL
  5. Select the pages you want to trigger this webhook
  6. Choose which fields to include in the payload
  7. Save and test the webhook

Webhook Payload Structure

PageCrawl sends a comprehensive JSON payload with each webhook. Here's the complete structure:

{
  "id": 12345,
  "title": "Product Price Monitor",
  "status": "ok",
  "content_type": "text/html",
  "visual_diff": 15,
  "changed_at": "2024-01-15T10:30:00Z",

  "contents": "Current page content...",
  "original": "Previous page content...",
  "difference": 25.5,
  "human_difference": "Medium change detected",
  "elements": 5,

  "page_screenshot_image": "https://pagecrawl.io/api/webhook/...",
  "text_difference_image": "https://pagecrawl.io/api/webhook/...",
  "html_difference": "<div class='diff'>...</div>",
  "markdown_difference": "## Changes\n- Price changed from $99 to $79",

  "page": {
    "id": 123,
    "name": "Amazon Product Page",
    "url": "https://amazon.com/product/...",
    "url_tld": "amazon.com",
    "slug": "amazon-product-page",
    "link": "https://pagecrawl.io/app/pages/amazon-product-page",
    "folder": "Price Monitors"
  },

  "page_elements": [
    {
      "id": 456,
      "label": "Price",
      "type": "css",
      "contents": "$79.99",
      "original": "$99.99",
      "difference": 20.0,
      "human_difference": "Significant change",
      "changed": true
    }
  ],

  "ai_summary": "The product price dropped from $99.99 to $79.99, a 20% discount",
  "ai_priority_score": 85.5,

  "previous_check": {
    "id": 12344,
    "changed_at": "2024-01-14T10:30:00Z",
    "contents": "...",
    "difference": 5.2
  }
}

Key Fields for Automations

Field Description Use Case
title Page name Notification titles
status Page status (ok, error) Error alerting
difference Change percentage (0-100) Threshold-based triggers
human_difference Readable change description Notifications
ai_summary AI-generated change summary Smart notifications
ai_priority_score Importance score (0-100) Priority filtering
page.url Monitored URL Deep linking
page_elements[].contents Specific tracked element value Price/stock monitoring
changed_at ISO 8601 timestamp Time-based logic

Advanced Webhook Automations

Price Drop Alert with Mobile Notification

automation:
  - id: pagecrawl_price_drop_alert
    alias: "PageCrawl - Price Drop Alert"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_change_notification
        allowed_methods:
          - POST
        local_only: false
    condition:
      # Only trigger for significant changes with high AI priority
      - condition: template
        value_template: "{{ trigger.json.ai_priority_score | float > 70 }}"
      - condition: template
        value_template: "{{ trigger.json.difference | float > 10 }}"
    action:
      - service: notify.mobile_app_your_phone
        data:
          title: "Price Alert: {{ trigger.json.title }}"
          message: "{{ trigger.json.ai_summary | default('Change detected') }}"
          data:
            url: "{{ trigger.json.page.url }}"
            clickAction: "{{ trigger.json.page.url }}"
            image: "{{ trigger.json.page_screenshot_image }}"
            priority: high
            ttl: 0
            actions:
              - action: OPEN_URL
                title: "View Page"
                uri: "{{ trigger.json.page.url }}"
              - action: VIEW_PAGECRAWL
                title: "View in PageCrawl"
                uri: "{{ trigger.json.page.link }}"

Raspberry Pi Stock Alert (rpilocator.com)

Monitor rpilocator.com for Raspberry Pi 5 availability and get instant alerts when stock appears:

automation:
  - id: pagecrawl_rpi_stock_alert
    alias: "PageCrawl - Raspberry Pi Back in Stock"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_rpi_stock
        allowed_methods:
          - POST
        local_only: false
    condition:
      - condition: template
        value_template: >
          {% set content = trigger.json.contents | lower %}
          {{ 'in stock' in content or 'add to cart' in content or 'buy now' in content }}
    action:
      - service: notify.all_devices
        data:
          title: "Raspberry Pi In Stock!"
          message: "{{ trigger.json.title }} is available! {{ trigger.json.page.url }}"
      - service: light.turn_on
        target:
          entity_id: light.office_alert
        data:
          color_name: green
          brightness: 255
      - delay:
          seconds: 30
      - service: light.turn_off
        target:
          entity_id: light.office_alert

Multi-Page Router with Different Actions

Route different monitors to different actions based on the page folder or URL:

automation:
  - id: pagecrawl_router
    alias: "PageCrawl - Notification Router"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_change_notification
        allowed_methods:
          - POST
        local_only: false
    action:
      - choose:
          # Home Assistant / ESPHome / Zigbee2mqtt releases (GitHub)
          - conditions:
              - condition: template
                value_template: "{{ 'github.com' in trigger.json.page.url }}"
            sequence:
              - service: notify.mobile_app_phone
                data:
                  title: "New Release: {{ trigger.json.title }}"
                  message: "{{ trigger.json.ai_summary | default('New version available') }}"
                  data:
                    url: "{{ trigger.json.page.url }}"
                    priority: high

          # Hardware stock alerts (rpilocator, Ubiquiti store, ThePiHut)
          - conditions:
              - condition: template
                value_template: "{{ trigger.json.page.folder == 'Stock Alerts' }}"
            sequence:
              - service: notify.all_devices
                data:
                  title: "In Stock: {{ trigger.json.title }}"
                  message: "{{ trigger.json.page.url }}"
              - service: light.turn_on
                target:
                  entity_id: light.office_notification
                data:
                  color_name: green
                  brightness: 255

          # Firmware updates (Shelly, Tasmota)
          - conditions:
              - condition: template
                value_template: "{{ trigger.json.page.folder == 'Firmware' }}"
            sequence:
              - service: tts.speak
                target:
                  entity_id: tts.home_assistant_cloud
                data:
                  message: "Firmware update available for {{ trigger.json.title }}"
                  media_player_entity_id: media_player.office_speaker

        # Default action for any other page
        default:
          - service: notify.persistent_notification
            data:
              title: "PageCrawl: {{ trigger.json.title }}"
              message: "{{ trigger.json.human_difference }}"

Track Changes in Input Text Helper

automation:
  - id: pagecrawl_store_price
    alias: "PageCrawl - Store Latest Price"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_price_tracker
        allowed_methods:
          - POST
        local_only: false
    action:
      - service: input_text.set_value
        target:
          entity_id: input_text.tracked_product_price
        data:
          value: "{{ trigger.json.page_elements[0].contents }}"
      - service: input_datetime.set_datetime
        target:
          entity_id: input_datetime.price_last_updated
        data:
          datetime: "{{ now().strftime('%Y-%m-%d %H:%M:%S') }}"

Method 2: REST API Polling

REST API polling is useful when you need to:

  • Check status on a schedule regardless of changes
  • Monitor page health and error states
  • Build dashboard sensors
  • Integrate with systems that can't receive webhooks

Getting Your API Token

  1. Log in to PageCrawl.io
  2. Navigate to SettingsAPI
  3. Copy your API token (or generate a new one)

Your API token is a 60-character string that authenticates your requests.

Basic REST Sensor Setup

Add this to your configuration.yaml:

rest:
  - resource: "https://pagecrawl.io/api/changes/YOUR_PAGE_ID"
    scan_interval: 300  # Poll every 5 minutes
    headers:
      Authorization: "Bearer YOUR_API_TOKEN"
      Accept: "application/json"
    sensor:
      - name: "PageCrawl Example Page"
        unique_id: "pagecrawl_example_page_status"
        value_template: >
          {% if value_json.disabled %}
            disabled
          {% elif value_json.unseen > 0 %}
            changed
          {% elif value_json.failed > 0 %}
            error
          {% else %}
            ok
          {% endif %}
        json_attributes:
          - name
          - url
          - last_checked_at
          - unseen
          - failed
          - disabled
          - frequency

Advanced REST Sensors

Complete Page Status with All Attributes

rest:
  - resource: "https://pagecrawl.io/api/changes/123"
    scan_interval: 300
    headers:
      Authorization: "Bearer YOUR_API_TOKEN"
      Accept: "application/json"
    sensor:
      - name: "Product Monitor Status"
        unique_id: "pagecrawl_product_monitor"
        icon: mdi:web-sync
        value_template: >
          {% if value_json.disabled %}disabled
          {% elif value_json.pending %}checking
          {% elif value_json.unseen > 0 %}changed
          {% elif value_json.failed > 0 %}error
          {% else %}ok{% endif %}
        json_attributes:
          - id
          - name
          - url
          - status
          - last_checked_at
          - unseen
          - failed
          - disabled
          - pending
          - frequency
          - screenshots
          - latest

    binary_sensor:
      - name: "Product Monitor Has Changes"
        unique_id: "pagecrawl_product_monitor_changed"
        icon: mdi:alert-circle
        value_template: "{{ value_json.unseen > 0 }}"
        device_class: problem

      - name: "Product Monitor Has Errors"
        unique_id: "pagecrawl_product_monitor_error"
        icon: mdi:alert
        value_template: "{{ value_json.failed > 0 }}"
        device_class: problem

      - name: "Product Monitor Active"
        unique_id: "pagecrawl_product_monitor_active"
        icon: mdi:eye
        value_template: "{{ not value_json.disabled }}"

Latest Check Details with AI Summary

rest:
  - resource: "https://pagecrawl.io/api/changes/123/zapier/poll"
    scan_interval: 300
    headers:
      Authorization: "Bearer YOUR_API_TOKEN"
      Accept: "application/json"
    sensor:
      - name: "Product Monitor Latest Check"
        unique_id: "pagecrawl_product_latest"
        icon: mdi:clipboard-text-clock
        value_template: "{{ value_json[0].id }}"
        json_attributes_path: "$[0]"
        json_attributes:
          - title
          - status
          - changed_at
          - difference
          - human_difference
          - ai_summary
          - ai_priority_score
          - contents
          - page

      - name: "Product Monitor AI Priority"
        unique_id: "pagecrawl_product_ai_priority"
        icon: mdi:robot
        unit_of_measurement: "%"
        value_template: "{{ value_json[0].ai_priority_score | float(0) | round(1) }}"

      - name: "Product Monitor Difference"
        unique_id: "pagecrawl_product_difference"
        icon: mdi:compare
        unit_of_measurement: "%"
        value_template: "{{ value_json[0].difference | float(0) | round(1) }}"

Extract Specific Element Value (e.g., Price)

rest:
  - resource: "https://pagecrawl.io/api/changes/123/zapier/poll"
    scan_interval: 600
    headers:
      Authorization: "Bearer YOUR_API_TOKEN"
    sensor:
      - name: "Tracked Product Price"
        unique_id: "pagecrawl_product_price"
        icon: mdi:currency-usd
        unit_of_measurement: "$"
        value_template: >
          {% set price_element = value_json[0].page_elements | selectattr('label', 'equalto', 'Price') | first %}
          {% if price_element %}
            {{ price_element.contents | regex_replace('[^0-9.]', '') | float(0) }}
          {% else %}
            unknown
          {% endif %}
        json_attributes_path: "$[0]"
        json_attributes:
          - changed_at
          - ai_summary

Monitoring Multiple Pages

Template for Multiple Pages

Monitor multiple tinkerer-relevant pages with YAML anchors to reduce duplication:

rest:
  # Home Assistant Releases (GitHub)
  - resource: "https://pagecrawl.io/api/changes/101"
    scan_interval: 300
    headers:
      Authorization: !secret pagecrawl_api_token
    sensor:
      - name: "HA Releases Monitor"
        unique_id: "pagecrawl_ha_releases"
        icon: mdi:home-assistant
        value_template: &status_template >
          {% if value_json.disabled %}disabled
          {% elif value_json.unseen > 0 %}changed
          {% elif value_json.failed > 0 %}error
          {% else %}ok{% endif %}
        json_attributes: &common_attributes
          - name
          - url
          - last_checked_at
          - unseen
          - failed

  # Raspberry Pi Stock (rpilocator.com)
  - resource: "https://pagecrawl.io/api/changes/102"
    scan_interval: 300
    headers:
      Authorization: !secret pagecrawl_api_token
    sensor:
      - name: "RPi Stock Monitor"
        unique_id: "pagecrawl_rpi_stock"
        icon: mdi:raspberry-pi
        value_template: *status_template
        json_attributes: *common_attributes

  # Zigbee2mqtt Releases (GitHub)
  - resource: "https://pagecrawl.io/api/changes/103"
    scan_interval: 600
    headers:
      Authorization: !secret pagecrawl_api_token
    sensor:
      - name: "Z2M Releases Monitor"
        unique_id: "pagecrawl_z2m_releases"
        icon: mdi:zigbee
        value_template: *status_template
        json_attributes: *common_attributes

Store your API token in secrets.yaml:

pagecrawl_api_token: "Bearer your_60_character_api_token_here"

Dashboard Card for Multiple Monitors

Create a custom dashboard card showing all your monitors:

type: entities
title: Tinkerer Monitors
entities:
  - entity: sensor.ha_releases_monitor
    name: Home Assistant Releases
    secondary_info: last-changed
  - entity: sensor.rpi_stock_monitor
    name: Raspberry Pi Stock
    secondary_info: last-changed
  - entity: sensor.z2m_releases_monitor
    name: Zigbee2mqtt Releases
    secondary_info: last-changed

Or use a more visual approach with conditional formatting:

type: custom:auto-entities
card:
  type: glance
  title: PageCrawl Monitors
filter:
  include:
    - entity_id: sensor.pagecrawl_*
      options:
        tap_action:
          action: url
          url_path: "{{ state_attr(config.entity, 'url') }}"

Advanced Automation Examples

Price History Tracking with Statistics

sensor:
  - platform: statistics
    name: "Product Price Stats"
    entity_id: sensor.tracked_product_price
    state_characteristic: mean
    max_age:
      days: 30

  - platform: statistics
    name: "Product Price Min 30d"
    entity_id: sensor.tracked_product_price
    state_characteristic: value_min
    max_age:
      days: 30

automation:
  - id: pagecrawl_price_at_30day_low
    alias: "PageCrawl - Price at 30-Day Low"
    trigger:
      - platform: template
        value_template: >
          {{ states('sensor.tracked_product_price') | float ==
             states('sensor.product_price_min_30d') | float }}
    action:
      - service: notify.mobile_app_phone
        data:
          title: "30-Day Low Price!"
          message: >
            {{ state_attr('sensor.tracked_product_price', 'name') }}
            is at its lowest price in 30 days: ${{ states('sensor.tracked_product_price') }}

Combine Webhook and Polling for Reliability

# REST sensor for baseline status
rest:
  - resource: "https://pagecrawl.io/api/changes/123"
    scan_interval: 900  # 15 minutes as backup
    headers:
      Authorization: !secret pagecrawl_api_token
    sensor:
      - name: "Monitor Backup Status"
        unique_id: "pagecrawl_backup_sensor"
        value_template: "{{ value_json.unseen }}"

# Webhook for real-time notifications
automation:
  - id: pagecrawl_webhook_handler
    alias: "PageCrawl - Webhook Handler"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_notification
        allowed_methods:
          - POST
    action:
      - service: input_number.set_value
        target:
          entity_id: input_number.pagecrawl_unseen_count
        data:
          value: "{{ trigger.json.unseen | default(1) }}"
      - service: notify.mobile_app_phone
        data:
          title: "{{ trigger.json.title }}"
          message: "{{ trigger.json.ai_summary | default(trigger.json.human_difference) }}"

# Alert if no updates received for too long
  - id: pagecrawl_stale_check
    alias: "PageCrawl - Stale Data Alert"
    trigger:
      - platform: template
        value_template: >
          {{ (as_timestamp(now()) - as_timestamp(state_attr('sensor.monitor_backup_status', 'last_checked_at'))) > 7200 }}
    action:
      - service: notify.admin
        data:
          title: "PageCrawl Monitor Stale"
          message: "No check received in over 2 hours"

Trigger Manual Check via Home Assistant

rest_command:
  pagecrawl_trigger_check:
    url: "https://pagecrawl.io/api/changes/{{ page_id }}/check"
    method: PUT
    headers:
      Authorization: !secret pagecrawl_api_token
      Content-Type: "application/json"

script:
  check_all_monitors:
    alias: "Trigger All PageCrawl Checks"
    sequence:
      - service: rest_command.pagecrawl_trigger_check
        data:
          page_id: 101
      - delay:
          seconds: 2
      - service: rest_command.pagecrawl_trigger_check
        data:
          page_id: 102
      - delay:
          seconds: 2
      - service: rest_command.pagecrawl_trigger_check
        data:
          page_id: 103

Visual Dashboard with Change Screenshots

# Download and cache screenshot when change detected
automation:
  - id: pagecrawl_cache_screenshot
    alias: "PageCrawl - Cache Screenshot"
    trigger:
      - platform: webhook
        webhook_id: pagecrawl_notification
        allowed_methods:
          - POST
    condition:
      - condition: template
        value_template: "{{ trigger.json.page_screenshot_image is defined }}"
    action:
      - service: downloader.download_file
        data:
          url: "{{ trigger.json.page_screenshot_image }}"
          filename: "pagecrawl_latest_{{ trigger.json.page.id }}.png"
          overwrite: true

# Display in dashboard using local image
camera:
  - platform: local_file
    name: "PageCrawl Latest Screenshot"
    file_path: /config/downloads/pagecrawl_latest_123.png

Troubleshooting

Webhook Issues

Problem: Webhooks not arriving

  • Verify your Home Assistant is accessible from the internet
  • Check the webhook URL is correctly formatted
  • Test with curl:
    curl -X POST https://your-ha-url/api/webhook/pagecrawl_test \
      -H "Content-Type: application/json" \
      -d '{"test": "data"}'
  • Check Home Assistant logs for webhook errors

Problem: Webhook automation not triggering

  • Ensure local_only: false is set in your trigger
  • Verify the webhook_id matches exactly
  • Check that allowed_methods includes POST

REST API Issues

Problem: Authentication errors (401)

  • Verify your API token is correct
  • Ensure the Authorization header format is Bearer YOUR_TOKEN (with space)
  • Check the token hasn't been rotated

Problem: Empty or null values

  • Use default filters in templates: {{ value_json.field | default('N/A') }}
  • Check if the page ID/slug exists in your PageCrawl account

Problem: Rate limiting (429)

  • Increase scan_interval to reduce polling frequency
  • Recommended: 300 seconds (5 minutes) minimum

Template Errors

Problem: Template errors in value_template

  • Use the Template Editor in Developer Tools to test
  • Wrap in {% if %} checks for optional fields
  • Use | default() filter for nullable values

When to Use PageCrawl vs. Native HA Sensors

Use HA's built-in scrape or rest sensors when:

  • The site is simple, static HTML without JavaScript
  • No bot protection (Cloudflare, etc.)
  • Public JSON APIs with stable endpoints
  • GitHub Atom feeds (/releases.atom)
  • You don't mind your home IP making requests

Use PageCrawl when the site is hard to monitor:

Protected E-commerce & Stock Alerts

These sites actively block scrapers and need real browser rendering:

  • Ubiquiti Store — Heavy Cloudflare protection, dynamic inventory
  • NVIDIA GeForce Store — Bot detection, JavaScript-loaded stock status
  • rpilocator.com — Aggregates Raspberry Pi stock across retailers
  • Sneaker drops (Nike SNKRS, Shopify stores) — Aggressive bot protection
  • Limited releases — PS5/Xbox restocks, GPU drops

Behind-Login Content

PageCrawl can authenticate and monitor pages that require login:

  • Company career pages — New job postings behind applicant portals
  • Customer portals — Order status, account changes
  • School/University portals — Grades, schedules, announcements
  • ISP status dashboards — Outage info requiring login
  • Member-only forums — Specific threads or categories

Use PageCrawl's browser steps feature to log in before capturing content.

JavaScript-Heavy Sites

Modern sites that load content dynamically after page load:

  • Single-page applications (React, Vue, Angular sites)
  • Lazy-loaded pricing — Price appears after JavaScript executes
  • Interactive dashboards — Content loaded via API calls
  • Infinite scroll pages — Content not in initial HTML

Sites You Don't Want to Overload

Avoid hammering servers from your home IP or getting blocked:

  • Small business websites — Don't overload their hosting
  • Government portals — Rate limiting concerns
  • Competitor monitoring — Don't reveal your IP
  • Frequent checks — Let PageCrawl manage the rate limiting

Conclusion

Integrating PageCrawl with Home Assistant opens up powerful automation possibilities for tinkerers and home automation enthusiasts.

Webhooks provide instant notifications with rich payload data—perfect for time-sensitive alerts like Raspberry Pi stock drops or Ubiquiti UniFi availability. REST API polling gives you dashboard-ready sensors for monitoring Home Assistant releases, Zigbee2mqtt updates, or Shelly firmware changes.

Whether you're tracking hardware availability on rpilocator.com, monitoring GitHub releases for your favorite smart home projects, watching for deals on Zigbee devices, or keeping tabs on any web content that matters to your setup, this integration brings web change detection into your smart home ecosystem.

Keywords: Home Assistant webhook integration, web change detection, website monitoring automation, PageCrawl Home Assistant, REST API sensor, smart home website alerts, price drop notification Home Assistant, web scraping automation

Last updated: 3 February, 2026