Web Archives That Actually Hold Up

When a regulator, court, or auditor asks you to produce a webpage exactly as it appeared on a specific date, a screenshot isn't enough. PageCrawl captures every detected change as a full, replayable web archive sealed with three independent cryptographic proofs that you can verify without trusting PageCrawl.

Included on Ultimate plans. Optional eIDAS qualified timestamps from a Qualified Trust Service Provider on the EU Trusted List are available on Custom plans.

Why a screenshot isn't enough

A screenshot is a single flat image of the visible viewport at one moment. It can be cropped, edited, or generated wholesale by a modern AI model. When the timing or content of a webpage matters legally or commercially, what you actually need is the full page captured as it was, plus independent proof that it existed in that form at that moment.

| | Screenshot | PageCrawl WACZ archive |
| --- | --- | --- |
| Captures the full page | Visible viewport only | HTML, CSS, JavaScript, images, linked PDFs |
| Replayable as it was | No, flat image | Yes, navigate the page in ReplayWeb.page |
| Third-party attested time | No, just file metadata | Yes, three independent providers |
| Detectable if modified | Easily, undetectably edited | Any byte changed invalidates the proofs |
| Distinguishable from AI fakes | No, indistinguishable | Yes, AI cannot forge the proofs |
| Open archival format | PNG / JPEG (cosmetic) | WACZ, used by Internet Archive and Library of Congress |

When you need a web archive, not a screenshot

Legal disputes and litigation

When the contents of a public webpage are at issue in a dispute, opposing counsel will challenge anything a party could have created themselves. An archive sealed by independent providers shifts the question from "do we believe you" to "verify it yourself".

Regulatory inspections

Compliance regimes (SEC 17a-4, FDA 21 CFR Part 11, HIPAA, DORA, GDPR) increasingly expect tamper-evident records of public-facing content with provable timestamps. Screenshots filed in a folder don't meet that bar; cryptographic proofs do.

Anti-fraud and brand protection

Capturing a fraudulent landing page, a counterfeit listing, or a brand impersonation site as a sealed archive at the moment of detection means the evidence remains usable even after the offending site is taken down or altered.

Long-term institutional memory

Public records, regulatory guidance, and policy pages are edited or removed without notice. A WACZ archive captured at the moment a decision was made preserves the actual context of that decision for as long as you need it.

What's in a PageCrawl Web Archive

Every archive is a self-contained WACZ file (Web Archive Collection Zipped), the open-format standard developed by Webrecorder and used by the Internet Archive, the Library of Congress, and major eDiscovery platforms.
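Because a WACZ is an ordinary ZIP file, its layout can be inspected with standard tooling. The sketch below is illustrative only: the demo archive is a stand-in built in memory, and its folder names (`archive/`, `indexes/`, `pages/`) follow the layout described in the WACZ specification rather than any PageCrawl-specific structure.

```python
import io
import zipfile
from collections import Counter


def summarize_wacz(wacz_bytes: bytes) -> Counter:
    """Count the entries of a WACZ (a ZIP file) by top-level folder."""
    counts: Counter = Counter()
    with zipfile.ZipFile(io.BytesIO(wacz_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top = name.split("/", 1)[0] if "/" in name else "(root)"
            counts[top] += 1
    return counts


def build_layout_demo() -> bytes:
    """Build a tiny stand-in archive; the contents are placeholders."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("datapackage.json", "{}")
        zf.writestr("archive/data.warc.gz", b"")
        zf.writestr("indexes/index.cdx", b"")
        zf.writestr("pages/pages.jsonl", b"")
    return buf.getvalue()


for folder, count in sorted(summarize_wacz(build_layout_demo()).items()):
    print(folder, count)
```

To inspect a real download, read the file's bytes and pass them to `summarize_wacz` in place of the demo archive.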

Timestamped Captures

Every archive carries an ISO 8601 capture timestamp recorded inside the WARC record headers. The timestamp is part of the archive structure, not a sidecar file that can be edited.
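As a rough illustration of where that timestamp lives, the sketch below parses the header block of a single WARC record and reads its `WARC-Date` field. The sample record is made up for the example, and real WARC files inside a WACZ are usually gzip-compressed, so they need decompressing before a parser like this one sees the headers.

```python
from datetime import datetime, timezone

# A made-up header block for one WARC record (illustrative values only).
SAMPLE_HEADERS = """WARC/1.1
WARC-Type: response
WARC-Target-URI: https://example.com/
WARC-Date: 2024-05-01T12:30:00Z
Content-Length: 0
"""


def parse_warc_headers(header_block: str) -> dict[str, str]:
    """Parse 'Name: value' lines that follow the WARC version line."""
    headers: dict[str, str] = {}
    for line in header_block.strip().splitlines()[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def capture_time(headers: dict[str, str]) -> datetime:
    """Return the WARC-Date header as a timezone-aware datetime."""
    raw = headers["WARC-Date"]
    # Map a trailing 'Z' to an explicit offset for older Python versions.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


print(capture_time(parse_warc_headers(SAMPLE_HEADERS)).isoformat())
```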

Cryptographic Hashes Per Resource

The WACZ datapackage includes a SHA-256 hash of every captured resource: HTML, images, scripts, stylesheets, linked PDFs. Any modification to the archive contents invalidates the hashes and is detectable.
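A recipient can recompute these hashes independently. The following is a minimal sketch, assuming the manifest is `datapackage.json` with a `resources` list whose entries carry `path` and `hash` fields in `sha256:<hex>` form, as laid out in the WACZ specification. The demo archive is built in memory; `build_hash_demo` is a hypothetical helper for this example only.

```python
import hashlib
import io
import json
import zipfile


def verify_resources(wacz_bytes: bytes) -> list[str]:
    """Return the paths whose SHA-256 does not match datapackage.json."""
    mismatched = []
    with zipfile.ZipFile(io.BytesIO(wacz_bytes)) as zf:
        manifest = json.loads(zf.read("datapackage.json"))
        for resource in manifest["resources"]:
            digest = hashlib.sha256(zf.read(resource["path"])).hexdigest()
            if resource["hash"] != f"sha256:{digest}":
                mismatched.append(resource["path"])
    return mismatched


def build_hash_demo(stored: bytes, declared: bytes) -> bytes:
    """Build a demo archive whose manifest declares the hash of `declared`
    while the ZIP actually stores `stored` (hypothetical helper)."""
    path = "archive/data.warc"
    declared_hash = "sha256:" + hashlib.sha256(declared).hexdigest()
    manifest = {"resources": [{"path": path, "hash": declared_hash}]}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(path, stored)
        zf.writestr("datapackage.json", json.dumps(manifest))
    return buf.getvalue()


intact = build_hash_demo(b"original capture", b"original capture")
edited = build_hash_demo(b"edited capture", b"original capture")
print(verify_resources(intact))
print(verify_resources(edited))
```

An empty list means every listed file matches its recorded hash; any returned path identifies a file that changed after the manifest was written.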

Self-Contained Replay

Open the WACZ in ReplayWeb.page, the open-source replay viewer, and navigate the archived page as it existed at capture time. No PageCrawl account required for the recipient. Works offline.

Linked Documents Captured

PDFs, Excel files, Word documents, and other linked artefacts referenced from the page are captured into the same archive. One file holds the full evidentiary picture, not just the host page.

Tamper-Evident Structure

The WACZ datapackage manifest cross-references every resource hash. Edit any byte of the archive and the manifest fails verification. Tamper-evidence is structural, not policy.

Open Format, No Lock-in

WACZ is an open specification. Archives can be ingested into any compliant archival or eDiscovery system. Your audit trail is portable, regulator-friendly, and not tied to PageCrawl.

Three Independent Integrity Layers

A self-stored screenshot is something the firm could have generated at any moment. To be useful as evidence, an archive needs independent third-party attestation that it existed in its current form at a specific point in time. Every WACZ archive on the Ultimate plan ships with three such attestations from unrelated providers. Any modification to the archive, even a single byte, invalidates all three.

Aligned with FRE 902(13)/(14) self-authenticating evidence in US courts. SOC 2 / HIPAA-ready. Optional eIDAS qualified RFC 3161 timestamps from a QTSP on the EU Trusted List are available on Custom plans.

Domain-identity signature (Let's Encrypt)

Each WACZ is signed with a Let's Encrypt certificate issued to PageCrawl's monitoring domain, embedded inside the archive per the WACZ Auth specification. ReplayWeb.page reads the signature and renders an integrity badge natively.

RFC 3161 timestamp from a commercial Trust Service Provider

Each WACZ also carries an RFC 3161 timestamp issued by a commercial Trust Service Provider. The TSP's signature binds the archive's SHA-256 hash to a specific moment in time, verifiable with `openssl ts` and any standard PKI tooling against the TSP's certificate chain.
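The check with `openssl ts` can be scripted. The sketch below wraps the command in Python under a few assumptions that the text above does not confirm: that the timestamp is delivered as a full RFC 3161 response file, that it covers the bytes of the WACZ file itself, and that the TSP's CA certificate is available as a PEM file. All file names are hypothetical.

```python
import shutil
import subprocess


def openssl_ts_verify_command(data_file: str, response_file: str, ca_file: str) -> list[str]:
    """Build the `openssl ts -verify` argument list for a timestamp response."""
    return [
        "openssl", "ts", "-verify",
        "-data", data_file,     # the file whose hash was timestamped
        "-in", response_file,   # the RFC 3161 timestamp response
        "-CAfile", ca_file,     # trusted root of the TSP's certificate chain
    ]


def verify_timestamp(data_file: str, response_file: str, ca_file: str) -> bool:
    """Run the verification; True when openssl exits successfully."""
    if shutil.which("openssl") is None:
        raise RuntimeError("openssl was not found on PATH")
    cmd = openssl_ts_verify_command(data_file, response_file, ca_file)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0


# Hypothetical file names, shown only to illustrate the command line.
print(" ".join(openssl_ts_verify_command("capture.wacz", "capture.tsr", "tsp-ca.pem")))
```

If the timestamp is supplied as a bare token rather than a full response, `openssl ts -verify` also needs the `-token_in` flag; if it covers a precomputed digest instead of the file, `-digest` replaces `-data`.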

Bitcoin blockchain anchor via OpenTimestamps

Each archive's hash is submitted to an OpenTimestamps calendar server and anchored to the Bitcoin blockchain within hours. Anyone with an OpenTimestamps client and access to Bitcoin block headers (for example, from their own node) can verify the proof against the public blockchain. The anchor is independent of PageCrawl, of any commercial TSP, and of any single jurisdiction's PKI.
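Conceptually, an OpenTimestamps proof is a list of simple operations (append bytes, prepend bytes, hash) that lead from the archive's digest to a value committed in a Bitcoin block. The sketch below replays such a list against a toy two-leaf commitment. It is a simplified model for illustration: it does not parse the real `.ots` file format and does not contact the blockchain, and the tuple-based proof representation is an assumption of this example.

```python
import hashlib


def replay(digest: bytes, ops: list[tuple[str, bytes]]) -> bytes:
    """Apply a list of commitment operations to a starting digest."""
    value = digest
    for op, arg in ops:
        if op == "append":
            value = value + arg
        elif op == "prepend":
            value = arg + value
        elif op == "sha256":
            value = hashlib.sha256(value).digest()
        else:
            raise ValueError(f"unsupported operation: {op}")
    return value


# Toy commitment: the root is the hash of two archive digests joined together.
leaf_a = hashlib.sha256(b"archive A").digest()
leaf_b = hashlib.sha256(b"archive B").digest()
root = hashlib.sha256(leaf_a + leaf_b).digest()

# Proof for archive A: append the sibling digest, then hash the result.
proof_a = [("append", leaf_b), ("sha256", b"")]

print(replay(leaf_a, proof_a) == root)
print(replay(hashlib.sha256(b"archive A, edited").digest(), proof_a) == root)
```

In a real verification, the final value is compared with data in a Bitcoin block header, and the block's time gives the upper bound on when the archive existed.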

One-Click Integrity Verification

Verify the archive's manifest hash at any time. The verify endpoint re-extracts the WACZ datapackage, recomputes hashes, and confirms no byte has shifted since capture. Demonstrates tamper-evidence to auditors in seconds.

Access Audit Log

Every download, view, verify, and export is logged with user, IP, timestamp, and action. HIPAA, SOX, and SEC 17a-4 audit trails ask exactly this question: who accessed which record, when. The answer is a queryable log entry.
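To make the "who accessed which record, when" question concrete, here is a small sketch of what such a log entry and query could look like. The `AccessEvent` class, its `archive_id` field, and the `accesses_for` helper are hypothetical names for this example and do not describe PageCrawl's actual log schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    user: str
    ip: str
    timestamp: datetime
    action: str       # "download", "view", "verify", or "export"
    archive_id: str   # hypothetical identifier for the archive accessed


def accesses_for(events: list[AccessEvent], archive_id: str, since: datetime) -> list[AccessEvent]:
    """Return events for one archive at or after `since`, oldest first."""
    matching = [e for e in events if e.archive_id == archive_id and e.timestamp >= since]
    return sorted(matching, key=lambda e: e.timestamp)


# Made-up entries for illustration.
LOG = [
    AccessEvent("alice", "203.0.113.7", datetime(2024, 3, 2, 9, 0, tzinfo=timezone.utc), "view", "arch-1"),
    AccessEvent("bob", "203.0.113.9", datetime(2024, 3, 1, 8, 0, tzinfo=timezone.utc), "download", "arch-1"),
    AccessEvent("carol", "203.0.113.5", datetime(2024, 3, 3, 10, 0, tzinfo=timezone.utc), "export", "arch-2"),
]

for event in accesses_for(LOG, "arch-1", datetime(2024, 3, 1, tzinfo=timezone.utc)):
    print(event.timestamp.isoformat(), event.user, event.action)
```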

Review Boards for Signoff

Move detected changes through Kanban-style review boards. Compliance, legal, and policy teams mark items reviewed, attach internal notes, and produce a clean record for the next committee meeting or regulatory inspection.

Optional eIDAS qualified timestamp (Custom plan add-on)

Custom-plan archives can additionally carry an RFC 3161 timestamp issued by a Qualified Trust Service Provider listed on the European Union Trusted List, in accordance with the eIDAS Regulation (EU) No 910/2014. Qualified timestamps are recognized as evidence in all EU and EEA member states and carry the legal presumption of date/time accuracy and data integrity established under Article 41, with a reversed burden of proof in disputes.

This is offered as an optional Custom-plan add-on because the QTSP charges per stamp and provisioning requires a customer-specific trust chain. We scope the QTSP, agreement, and rollout on a sales call.

Talk to Sales

Built for Regulated Industries

Compliance and legal teams across financial services, life sciences, healthcare, and law use PageCrawl archives as part of their evidentiary record.

Broker-Dealers (SEC 17a-4)

Capture fee schedules, customer agreements, disclosures, and Form CRS as they appear publicly. Continuous capture means you can produce the actual record on any date in the three-year retention window. See the SEC 17a-4 guide.

Life Sciences (FDA 21 CFR Part 11)

Maintain audit-trail-bearing records of FDA guidance, supplier quality portals, and clinical trial registries. The archive structure satisfies the audit-trail and tamper-evidence controls in 21 CFR 11.10. See the Part 11 guide.

Legal & eDiscovery

WACZ archives are accepted by major eDiscovery platforms and have been used in court as contemporaneous records. Replay the page exactly as it appeared on a date in dispute, with the integrity story baked into the format. See the legal monitoring use case.

Healthcare (HIPAA)

Track Notice of Privacy Practices changes, breach notification availability, and business-associate sub-processor lists with an audit trail OCR investigators recognize. See the HIPAA guide.

EU Financial Entities (DORA)

Continuous capture of critical ICT third-party portals, ESA guidance, and DPAs satisfies DORA's continuous-update register obligation under Article 28. See the DORA guide.

Privacy & GDPR (Article 30 ROPA)

Capture processor and sub-processor lists and DPA versions on change. The archive demonstrates when each ROPA-affecting change was first detectable, an evidentiary asset during supervisory inspections. See the GDPR ROPA guide.

How It Works

1. Continuous Capture

Every monitored page is checked at your chosen frequency. When PageCrawl detects a change, the page is captured into a fresh archive that is timestamped, hashed, and linked to the detected change record.

2. Hash & Manifest

The WACZ datapackage manifest records SHA-256 hashes for every captured resource: HTML, images, scripts, linked PDFs. The manifest itself is hashed, so any modification to the archive structure is detectable.
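The two-level structure (a hash per file, then a hash of the manifest) can be sketched in a few lines. This is an illustration under assumptions, not PageCrawl's capture pipeline: the file contents are placeholders, and the digest record mirrors the `path` and `hash` fields that the WACZ specification uses for `datapackage-digest.json`.

```python
import hashlib
import json


def sha256_label(data: bytes) -> str:
    """Format a digest as 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_manifest(files: dict[str, bytes]) -> bytes:
    """Serialize a manifest listing each file's path, hash, and size."""
    resources = [
        {"path": path, "hash": sha256_label(data), "bytes": len(data)}
        for path, data in sorted(files.items())
    ]
    return json.dumps({"resources": resources}, indent=2, sort_keys=True).encode("utf-8")


def manifest_digest(manifest_bytes: bytes) -> dict[str, str]:
    """Hash the manifest itself, so any change to it is detectable too."""
    return {"path": "datapackage.json", "hash": sha256_label(manifest_bytes)}


original = {"archive/data.warc": b"captured page", "pages/pages.jsonl": b"{}"}
edited = dict(original, **{"archive/data.warc": b"captured page, edited"})

print(manifest_digest(build_manifest(original))["hash"])
print(manifest_digest(build_manifest(edited))["hash"])
```

Changing any file changes its entry in the manifest, which in turn changes the manifest digest, so a single top-level hash is enough to anchor the whole archive.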

3. Replay & Inspect

Open any archive in ReplayWeb.page directly from PageCrawl, or download the WACZ file for offline review. Navigate the archived page as it appeared at capture time, including interactive elements and linked documents.

4. Export as Evidence

Download a single archive, the full change history for a monitor, or a tagged subset of monitors. Hand off to legal, compliance, eDiscovery, or a regulator. The archive is self-contained and verifiable without PageCrawl access.

What's Included By Plan

| Plan | What you get | Retention |
| --- | --- | --- |
| Free | Screenshot of the most recent 3 changes | Last 3 only |
| Standard | Screenshot per detected change | 1 year |
| Enterprise | Screenshot per detected change | Unlimited |
| Ultimate | Screenshot per detected change. WACZ archive capture with manifest hashes, replay viewer, and linked-resource capture is opt-in per page. | Unlimited |

Need a custom retention period or a specific WACZ specification version? Contact sales.


Ready to Build Your Audit Trail?

Start capturing public-facing pages and disclosures with audit-grade archives today. Upgrade to Ultimate when WACZ becomes part of your compliance program.