Web Archives That Actually Hold Up
When a regulator, court, or auditor asks you to produce a webpage exactly as it appeared on a specific date, a screenshot isn't enough. PageCrawl captures every detected change as a full, replayable web archive sealed with three independent cryptographic proofs that you can verify without trusting PageCrawl.
Included on Ultimate plans. Optional eIDAS qualified timestamps from a Qualified Trust Service Provider on the EU Trusted List are available on Custom plans.
Why a screenshot isn't enough
A screenshot is a single flat image of the visible viewport at one moment. It can be cropped, edited, or generated wholesale by a modern AI model. When the timing or content of a webpage matters legally or commercially, what you actually need is the full page captured as it was, plus independent proof that it existed in that form at that moment in time.
When you need a web archive, not a screenshot
Legal disputes and litigation
When the contents of a public webpage are at issue in a dispute, opposing counsel will challenge anything a party could have created themselves. An archive sealed by independent providers shifts the question from "do we believe you" to "verify it yourself".
Regulatory inspections
Compliance regimes (SEC 17a-4, FDA 21 CFR Part 11, HIPAA, DORA, GDPR) increasingly expect tamper-evident records of public-facing content with provable timestamps. Screenshots filed in a folder don't meet that bar; cryptographic proofs do.
Anti-fraud and brand protection
Capturing a fraudulent landing page, a counterfeit listing, or a brand impersonation site as a sealed archive at the moment of detection means the evidence remains usable even after the offending site is taken down or altered.
Long-term institutional memory
Public records, regulatory guidance, and policy pages are edited or removed without notice. A WACZ archive captured at the moment a decision was made preserves the actual context of that decision for as long as you need it.
What's in a PageCrawl Web Archive
Every archive is a self-contained WACZ file (Web Archive Collection Zipped), the open-format standard developed by Webrecorder and used by the Internet Archive, the Library of Congress, and major eDiscovery platforms.
Timestamped Captures
Every archive carries an ISO 8601 capture timestamp recorded inside the WARC record headers. The timestamp is part of the archive structure, not a sidecar file that can be edited.
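For readers who want to see where that timestamp lives, here is a minimal sketch that reads the `WARC-Date` header from each record of a WARC stream using only the Python standard library. The function name `iter_warc_dates` is hypothetical, and the parser assumes well-formed records with a `Content-Length` header; a production tool would more likely use a dedicated WARC library.

```python
from datetime import datetime
from typing import BinaryIO, Iterator


def iter_warc_dates(stream: BinaryIO) -> Iterator[datetime]:
    """Yield the WARC-Date of each record in an uncompressed WARC stream.

    Hypothetical helper: assumes every record carries Content-Length.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.startswith(b"WARC/"):
            continue  # blank separator lines between records
        headers = {}
        # The header block ends at the first empty line.
        for raw in iter(stream.readline, b""):
            if raw in (b"\r\n", b"\n"):
                break
            name, _, value = raw.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Skip the record body so the next iteration starts at the next record.
        stream.read(int(headers.get("content-length", "0")))
        if "warc-date" in headers:
            # WARC-Date is an ISO 8601 UTC timestamp, e.g. 2024-05-01T12:00:00Z
            yield datetime.fromisoformat(headers["warc-date"].replace("Z", "+00:00"))
```

WARC files inside a WACZ are usually gzip-compressed, so in practice the stream would be wrapped first, for example with `gzip.GzipFile(fileobj=...)` around the ZIP member.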
Cryptographic Hashes Per Resource
The WACZ datapackage includes a SHA-256 hash of every captured resource: HTML, images, scripts, stylesheets, linked PDFs. Any modification to the archive contents invalidates the hashes and is detectable.
Self-Contained Replay
Open the WACZ in ReplayWeb.page, the open-source replay viewer, and navigate the archived page as it existed at capture time. No PageCrawl account required for the recipient. Works offline.
Linked Documents Captured
PDFs, Excel files, Word documents, and other linked artefacts referenced from the page are captured into the same archive. One file holds the full evidentiary picture, not just the host page.
Tamper-Evident Structure
The WACZ datapackage manifest cross-references every resource hash. Edit any byte of the archive and the manifest fails verification. Tamper-evidence is structural, not policy.
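The manifest check can be reproduced independently. The sketch below assumes the layout described in the public WACZ specification: a `datapackage.json` at the root of the ZIP whose `resources` entries each carry a `path` and a `hash` of the form `sha256:<hex>`. The function name `verify_wacz_resources` is hypothetical, and this is not PageCrawl's own verifier.

```python
import hashlib
import json
import zipfile


def verify_wacz_resources(wacz_path: str) -> list[str]:
    """Return the paths of resources whose SHA-256 no longer matches the manifest."""
    mismatched = []
    with zipfile.ZipFile(wacz_path) as wacz:
        manifest = json.loads(wacz.read("datapackage.json"))
        for resource in manifest["resources"]:
            algorithm, _, expected = resource["hash"].partition(":")
            if algorithm != "sha256":
                # Unsupported digest type: report it rather than trust it.
                mismatched.append(resource["path"])
                continue
            digest = hashlib.sha256()
            try:
                with wacz.open(resource["path"]) as member:
                    # Hash in 1 MiB chunks so large WARC files stay out of memory.
                    for chunk in iter(lambda: member.read(1 << 20), b""):
                        digest.update(chunk)
            except KeyError:
                # Listed in the manifest but missing from the ZIP.
                mismatched.append(resource["path"])
                continue
            if digest.hexdigest() != expected:
                mismatched.append(resource["path"])
    return mismatched
```

An empty list means every file listed in the manifest still hashes to its recorded value. Any returned path identifies exactly which part of the archive changed.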
Open Format, No Lock-in
WACZ is an open specification. Archives can be ingested into any compliant archival or eDiscovery system. Your audit trail is portable, regulator-friendly, and not tied to PageCrawl.
Three Independent Integrity Layers
A self-stored screenshot is something the firm could have generated at any moment. To be useful as evidence, an archive needs independent third-party attestation that it existed in its current form at a specific point in time. Every WACZ archive on the Ultimate plan ships with three such attestations from unrelated providers. Any modification to the archive, even a single byte, invalidates all three.
Aligned with FRE 902(13)/(14) self-authenticating evidence in US courts. SOC 2 / HIPAA-ready. Optional eIDAS qualified RFC 3161 timestamps from a QTSP on the EU Trusted List are available on Custom plans.
Domain-identity signature (Let's Encrypt)
Each WACZ is signed with a Let's Encrypt certificate issued to PageCrawl's monitoring domain, embedded inside the archive per the WACZ Auth specification. ReplayWeb.page reads the signature and renders an integrity badge natively.
RFC 3161 timestamp from a commercial Trust Service Provider
Each WACZ also carries an RFC 3161 timestamp issued by a commercial Trust Service Provider. The TSP's signature binds the archive's SHA-256 hash to a specific moment in time, verifiable with `openssl ts` and any standard PKI tooling against the TSP's certificate chain.
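As an illustration of that verification path, the sketch below hashes an archive and hands the digest to `openssl ts -verify`. It rests on assumptions not stated above: that the timestamp covers the SHA-256 of the whole `.wacz` file, and that the timestamp response and the TSP's CA certificates are available as separate files. The file names and function names are hypothetical.

```python
import hashlib
import subprocess


def sha256_hex(path: str) -> str:
    """SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_ts_verify_command(archive_path: str, response_path: str, ca_file: str) -> list[str]:
    """Build the `openssl ts -verify` invocation for a timestamp response.

    Assumption: the response imprint is the SHA-256 of the whole archive file.
    """
    return [
        "openssl", "ts", "-verify",
        "-digest", sha256_hex(archive_path),  # hex digest compared to the imprint
        "-in", response_path,                 # RFC 3161 timestamp response
        "-CAfile", ca_file,                   # trusted root(s) for the TSP chain
    ]


def verify_timestamp(archive_path: str, response_path: str, ca_file: str) -> bool:
    """Run OpenSSL and report whether verification succeeded (exit code 0)."""
    result = subprocess.run(
        build_ts_verify_command(archive_path, response_path, ca_file),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Depending on how the TSP delivers its certificates, intermediate certificates may need to be passed with `-untrusted`, and a bare token (rather than a full response) needs `-token_in`.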
Bitcoin blockchain anchor via OpenTimestamps
Each archive's hash is submitted to the OpenTimestamps calendar and anchored to the Bitcoin blockchain within hours. Verifiable offline against the public blockchain by anyone with an OpenTimestamps client. Independent of PageCrawl, of any commercial TSP, and of any single jurisdiction's PKI.
One-Click Integrity Verification
Verify the archive's manifest hash at any time. The verify endpoint re-extracts the WACZ datapackage, recomputes hashes, and confirms no byte has shifted since capture. Demonstrates tamper-evidence to auditors in seconds.
Access Audit Log
Every download, view, verify, and export is logged with user, IP, timestamp, and action. HIPAA, SOX, and SEC 17a-4 audit trails ask exactly this question: who accessed which record, when. The answer is a queryable log entry.
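To make the "who accessed which record, when" question concrete, here is a small sketch of how such an entry and query might be modelled. The field names follow the list above; `archive_id`, the class name, and the `accesses_for` helper are illustrative assumptions, not PageCrawl's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class AccessLogEntry:
    """One immutable audit record (hypothetical shape)."""
    user: str
    ip: str
    timestamp: datetime
    action: str       # "download", "view", "verify", or "export"
    archive_id: str   # assumed identifier for the archive that was accessed


def accesses_for(
    entries: Iterable[AccessLogEntry],
    archive_id: str,
    since: Optional[datetime] = None,
) -> list[AccessLogEntry]:
    """Return entries for one archive, oldest first, optionally from `since` onward."""
    matches = [
        entry for entry in entries
        if entry.archive_id == archive_id
        and (since is None or entry.timestamp >= since)
    ]
    return sorted(matches, key=lambda entry: entry.timestamp)
```

Freezing the dataclass mirrors the append-only nature of an audit trail: an entry is written once and never mutated in place.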
Review Boards for Signoff
Move detected changes through Kanban-style review boards. Compliance, legal, and policy teams mark items reviewed, attach internal notes, and produce a clean record for the next committee meeting or regulatory inspection.
Optional eIDAS qualified timestamp (Custom plan add-on)
Custom-plan archives can additionally carry an RFC 3161 timestamp issued by a Qualified Trust Service Provider listed on the European Union Trusted List, in accordance with eIDAS Regulation (EU) 910/2014. Qualified timestamps are recognized as evidence in all EU and EEA member states and carry the legal presumption of date/time accuracy and data integrity established under Article 41, with reversed burden of proof in disputes.
This sits as an optional Custom-plan add-on because the QTSP charges per-stamp and provisioning requires a customer-specific trust chain. We scope the QTSP, agreement, and rollout on a sales call.
Built for Regulated Industries
Compliance and legal teams across financial services, life sciences, healthcare, and law use PageCrawl archives as part of their evidentiary record.
Broker-Dealers (SEC 17a-4)
Capture fee schedules, customer agreements, disclosures, and Form CRS as they appear publicly. Continuous capture means you can produce the actual record on any date in the three-year retention window. See the SEC 17a-4 guide.
Life Sciences (FDA 21 CFR Part 11)
Maintain audit-trail-bearing records of FDA guidance, supplier quality portals, and clinical trial registries. The archive structure satisfies the audit-trail and tamper-evidence controls in 11.10. See the Part 11 guide.
Legal & eDiscovery
WACZ archives are accepted by major eDiscovery platforms and have been used in court as contemporaneous records. Replay the page exactly as it appeared on a date in dispute, with the integrity story baked into the format. See the legal monitoring use case.
Healthcare (HIPAA)
Track Notice of Privacy Practices changes, breach notification availability, and business-associate sub-processor lists with an audit trail OCR investigators recognize. See the HIPAA guide.
EU Financial Entities (DORA)
Continuous capture of critical ICT third-party portals, ESA guidance, and DPAs satisfies DORA's continuous-update register obligation under Article 28. See the DORA guide.
Privacy & GDPR (Article 30 ROPA)
Capture processor sub-processor lists and DPA versions on change. The archive demonstrates when each ROPA-affecting change was first detectable, an evidentiary asset during supervisory inspections. See the GDPR ROPA guide.
How It Works
Continuous Capture
Every monitored page is checked at your chosen frequency. When PageCrawl detects a change, the page is captured into a fresh archive that is timestamped, hashed, and linked to the detected change record.
Hash & Manifest
The WACZ datapackage manifest records SHA-256 hashes for every captured resource: HTML, images, scripts, linked PDFs. The manifest itself is hashed, so any modification to the archive structure is detectable.
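The "manifest itself is hashed" step can also be checked by hand. This sketch assumes the layout in the public WACZ specification, where `datapackage-digest.json` records the hash of `datapackage.json` as `sha256:<hex>`. The function name is hypothetical, and checking the signature over that digest is a separate step not shown here.

```python
import hashlib
import json
import zipfile


def manifest_digest_matches(wacz_path: str) -> bool:
    """Check that datapackage.json still hashes to the value in datapackage-digest.json."""
    with zipfile.ZipFile(wacz_path) as wacz:
        manifest_bytes = wacz.read("datapackage.json")
        digest_doc = json.loads(wacz.read("datapackage-digest.json"))
    algorithm, _, expected = digest_doc["hash"].partition(":")
    # Only SHA-256 is handled in this sketch; anything else is treated as a failure.
    return algorithm == "sha256" and hashlib.sha256(manifest_bytes).hexdigest() == expected
```

Hashing the manifest means the per-resource hashes cannot be quietly rewritten to match altered content: changing them changes the manifest, which breaks this digest and, in turn, anything that signs or timestamps it.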
Replay & Inspect
Open any archive in ReplayWeb.page directly from PageCrawl, or download the WACZ file for offline review. Navigate the archived page as it appeared at capture time, including interactive elements and linked documents.
Export as Evidence
Download a single archive, the full change history for a monitor, or a tagged subset of monitors. Hand off to legal, compliance, eDiscovery, or a regulator. The archive is self-contained and verifiable without PageCrawl access.
What's Included By Plan
Need a custom retention period or a specific WACZ specification version? Contact sales.
Ready to Build Your Audit Trail?
Start capturing public-facing pages and disclosures with audit-grade archives today. Upgrade to Ultimate when WACZ becomes part of your compliance program.
