PageCrawl is designed to make website change monitoring and management seamless. Page Discovery automatically finds new pages on a website and starts monitoring them for you, so your coverage stays up to date as the site grows. The fastest way to get started is to add a whole website from the Discovered Pages page, then fine-tune how it works from the template that gets created for you.
Add a Website to Discover
Start from the Discovered Pages page and click Add Website. This is the quickest way to begin: enter a website and PageCrawl handles the rest.
In the dialog, set:
- Website URL - the site you want to monitor (for example, a competitor's store or your own marketing site).
- What to track - choose to track every discovered page, only top-level pages (like /pricing and /about), or review pages yourself before anything is monitored.
- Check Frequency - how often discovered pages are checked for changes.
- Notify me via - the channels that should receive change alerts.
Click Add Website. PageCrawl scans the site, lists everything it finds on the Discovered Pages page, and creates a template with sensible defaults that controls how this website is monitored.
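To make the "only top-level pages" option concrete, here is a minimal Python sketch of one way such a check could work. It is an illustration, not PageCrawl's implementation: the `is_top_level` helper, the `example.com` URLs, and the rule that a top-level page has exactly one path segment (so the homepage itself does not count) are all assumptions.

```python
from urllib.parse import urlparse


def is_top_level(url: str) -> bool:
    """Hypothetical rule: a top-level page has exactly one path segment."""
    path = urlparse(url).path
    # Drop empty pieces so "/about/" and "/about" are treated the same.
    segments = [part for part in path.split("/") if part]
    return len(segments) == 1


# Made-up discovered URLs for illustration.
urls = [
    "https://example.com/pricing",
    "https://example.com/about/",
    "https://example.com/blog/2024/launch",
    "https://example.com/",
]

top_level = [u for u in urls if is_top_level(u)]
print(top_level)
```

Under these assumptions, /pricing and /about/ are kept, while the nested blog post and the bare homepage are not.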
Adjust Advanced Configuration via the Template
Adding a website automatically creates a configured template for it. The template is where you fine-tune scanning, filters, and what gets tracked. Open it from Templates settings (or the "Templates settings" link in the Add Website dialog) and edit the template for your website.
Choose a Scanning Method
The template's Discover New Pages setting controls how PageCrawl looks for new pages. The available methods are:
- Automatic (recommended): Combines sitemap and link discovery to find pages using the best method for each website. This is the default setting.
- Homepage Links Only: Discovers new pages by following links on the homepage. Available as a daily or weekly check. Useful if you want to focus on pages directly linked from the main page.
- Sitemap Only: Discovers pages listed in the website's sitemap. Most websites have a sitemap to help search engines find their pages, making this an efficient method for large sites.
- Follow Links 2 Levels Deep: Follows links on the homepage, then follows links on those pages too. Available as a weekly check. Note: Only available on Enterprise and Ultimate plans.
- Follow Links 3 Levels Deep: Follows links on the homepage, then follows links two more levels deep. Available as a weekly check. Note: Only available on Enterprise and Ultimate plans.
- Deep Scan: Visits every accessible page on the website, so that no new links go unnoticed, even on deeply nested pages. Note: Only available on Enterprise and Ultimate plans.
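The differences between these methods can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not PageCrawl's actual crawler: the `urls_from_sitemap` and `follow_links` helpers, the sample sitemap, and the in-memory `LINKS` map (which stands in for real HTTP requests) are all hypothetical. The final union is only one plausible way sitemap and link discovery might be combined.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text: str) -> set[str]:
    """Collect every <loc> entry from a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text}


def follow_links(links: dict[str, list[str]], homepage: str, max_depth: int) -> set[str]:
    """Breadth-first walk; max_depth=1 means links on the homepage only."""
    found: set[str] = set()
    seen = {homepage}
    queue = deque([(homepage, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # Do not follow links beyond the configured depth.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                found.add(target)
                queue.append((target, depth + 1))
    return found


# Made-up sample data standing in for a real website.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/docs/api</loc></url>
</urlset>"""

HOME = "https://example.com/"
LINKS = {
    HOME: ["https://example.com/pricing", "https://example.com/blog"],
    "https://example.com/blog": ["https://example.com/blog/launch"],
    "https://example.com/blog/launch": ["https://example.com/blog/launch/faq"],
}

homepage_only = follow_links(LINKS, HOME, max_depth=1)
two_levels = follow_links(LINKS, HOME, max_depth=2)
sitemap_only = urls_from_sitemap(SITEMAP)
combined = sitemap_only | homepage_only  # Assumed way of merging both sources.
```

In this sample, the sitemap surfaces /docs/api, which no page links to, while link following surfaces /blog, which the sitemap omits. That gap is the practical reason for combining the two sources.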
Apply Include and Exclude Filters
Use the template's filters to control exactly which discovered pages are monitored, so you avoid tracking irrelevant pages.
- Include rules: Specify keywords or patterns that a page must match to be tracked. Useful for tracking specific types of content, such as /product/ pages only.
- Exclude rules: Define keywords or patterns that should be skipped. Ideal for ignoring pages you do not care about, such as /tag/ or /archive/ URLs.
- Track All Pages: Track every discovered page without filtering.
- Tracked Page Limit: Cap how many pages are auto-tracked to keep usage under control.
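As a rough mental model of how these settings interact, the Python sketch below applies include rules, exclude rules, and a page limit to a list of discovered URLs. It is an assumption-laden illustration rather than PageCrawl's actual matching logic: the `select_pages` helper is hypothetical, glob-style patterns are assumed for the rules, and exclude rules are assumed to take precedence over include rules.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def select_pages(urls, include=(), exclude=(), limit=None):
    """Return the URLs that would be tracked under the given rules."""
    selected = []
    for url in urls:
        if limit is not None and len(selected) >= limit:
            break  # Tracked page limit reached.
        path = urlparse(url).path
        # Assumption: an exclude match always wins.
        if any(fnmatchcase(path, pattern) for pattern in exclude):
            continue
        # With no include rules, every non-excluded page qualifies.
        if include and not any(fnmatchcase(path, pattern) for pattern in include):
            continue
        selected.append(url)
    return selected


# Made-up discovered URLs for illustration.
discovered = [
    "https://example.com/product/blue-mug",
    "https://example.com/tag/sale",
    "https://example.com/product/archive/old-mug",
    "https://example.com/product/red-mug",
    "https://example.com/product/green-mug",
]

tracked = select_pages(
    discovered,
    include=["*/product/*"],
    exclude=["*/tag/*", "*/archive/*"],
    limit=2,
)
print(tracked)
```

Here the /tag/ and /archive/ URLs are dropped, and the limit of two stops selection before the third product page. Calling `select_pages(discovered)` with no rules corresponds to tracking every discovered page.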
Configure Tracked Elements
The template also defines what is tracked on each discovered page. You can monitor all pages, or only those with a specific structure (for example, only product pages).
- To monitor whole pages, set the tracked element to Full-page Text.
- To monitor pages with a specific layout, configure multiple tracked elements, such as product title, price, and description. If these elements do not exist on a discovered page, that page is simply skipped.
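The skipping behavior for structured pages can be pictured with a short Python sketch. This is a simplified model, not how PageCrawl evaluates tracked elements: the `ClassCollector` and `has_all_elements` helpers are hypothetical, CSS class names stand in for whatever selectors a template uses, and a page is assumed to be skipped when any one of the configured elements is missing.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Gather every CSS class name that appears in a document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def has_all_elements(html: str, required_classes) -> bool:
    """True when every required class is present somewhere in the page."""
    collector = ClassCollector()
    collector.feed(html)
    return set(required_classes) <= collector.classes


# Hypothetical element names and page markup for illustration.
REQUIRED = ["product-title", "price", "description"]
pages = {
    "/product/mug": (
        '<h1 class="product-title">Mug</h1>'
        '<span class="price">$12</span>'
        '<p class="description">A blue mug.</p>'
    ),
    "/about": '<h1 class="page-title">About us</h1><p>Hello</p>',
}

monitored = [path for path, html in pages.items() if has_all_elements(html, REQUIRED)]
print(monitored)
```

Only the product page contains all three elements, so the /about page is left out of monitoring in this model.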
Save the template, and PageCrawl will keep discovering and monitoring matching pages automatically. If too many irrelevant pages are discovered, tighten the include/exclude filters and remove the pages you do not want.
