Most of the internet's monitoring tools were built for an era where a screenshot counted as evidence. That era is over. Regulators no longer accept it. Courts no longer accept it. AI agents that quote pages they cannot retrieve no longer have a defence. The web needs a layer underneath monitoring that is signed, timestamped, replayable, and verifiable without trusting the operator.
We built that layer. We call it the Web Evidence Layer. It is available on the Ultimate plan, enabled per monitor, for the teams that need a defensible record of what the web said and when.
This post is the definitive reference: what the Web Evidence Layer is, the components PageCrawl runs in production, and how each one is independently verifiable against a public standard.
Note: the Web Evidence Layer is an Ultimate-plan feature and is enabled on a per-monitor basis. It is not on by default and it is not applied to every monitored page. PageCrawl's broader feature set (MCP server, auth'd pages, three engines, multi-channel delivery, AI summaries, queryable history) ships on the rest of the plans without it.
Why a screenshot is no longer enough
Three forces collided in the last twenty-four months.
Regulators tightened the rules. The EU Digital Services Act, MiCA, product-recall regimes, and SEC disclosure expectations now demand verifiable captures backed by trusted timestamps. Marketing copy, terms of service, and disclosure language became regulated artefacts.
AI agents started acting on web content at machine speed. When an agent reprices a portfolio because a competitor changed a page, the firm needs a defensible record of what the competitor actually published. "The agent saw it" is not a control. "The model summarised it" is not a citation.
The volume of public-web disputes kept compounding. Brand impersonation, comparative advertising, dark-pattern enforcement, ESG-claim challenges: all of them turn on what a page said on a specific date. A screenshot proves nothing. A monitoring alert proves nothing. You need the page itself, captured deterministically, signed by a domain identity, timestamped by a trusted third party, anchored to a public ledger, and replayable in a browser.
That is the Web Evidence Layer. PageCrawl built it, and runs it for Ultimate-plan customers who enable it on the monitors that need it.
What web evidence actually means
Four properties separate web evidence from a saved PDF or a screenshot.
- Deterministic capture. When the Web Evidence Layer is enabled on a monitor, PageCrawl captures the full HTML, every sub-resource (CSS, JS, fonts, images, XHR responses), and the response headers. Replaying it later serves the same recorded responses, byte for byte, so the page renders as it did at capture time.
- Content addressing. Every file inside the archive is referenced by a cryptographic hash. Change one pixel, the hash changes. The archive's manifest is itself hashed, so the entire bundle has a single fingerprint.
- Signed timestamps. Independent parties attest that the fingerprint existed at a specific moment. We request several attestations in parallel because different attestations carry different legal weight.
- Replay. Anyone with the archive can open it in a standard viewer and interact with the page as it existed at capture time, even if the original site is offline.
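As a rough illustration of content addressing, the Python sketch below hashes each captured file, lists the hashes in a manifest, and then hashes the manifest to get one fingerprint for the whole bundle. The file names, the manifest layout, and the helper names are assumptions made up for this example, not PageCrawl's actual format.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: dict[str, bytes]) -> dict:
    """List every captured file by path, with the hash of its contents."""
    return {
        "resources": [
            {"path": path, "hash": "sha256:" + sha256_hex(body), "bytes": len(body)}
            for path, body in sorted(files.items())
        ]
    }


def fingerprint(manifest: dict) -> str:
    """Hash a canonical JSON encoding of the manifest: one digest per bundle."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return "sha256:" + sha256_hex(canonical.encode("utf-8"))


# Hypothetical capture: two files from one page.
capture = {
    "index.html": b"<html><body>Price: 10 EUR</body></html>",
    "style.css": b"body { color: black; }",
}
original = fingerprint(build_manifest(capture))

# Changing a single byte in any file changes the bundle fingerprint.
tampered = {**capture, "index.html": capture["index.html"].replace(b"10", b"11")}
changed = fingerprint(build_manifest(tampered))
print(original != changed)
```

Because the manifest is encoded with sorted keys and fixed separators, the same set of files always yields the same fingerprint, which is what makes it usable as the value that later gets timestamped.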
The point is not "we have a copy". The point is that we have a copy and can prove three independent things about it without anyone trusting PageCrawl: who packaged it, when it existed, and that it has not changed since.
The PageCrawl Web Evidence stack
WACZ archives, built on the WARC standard
WARC is the ISO standard (ISO 28500, first published in 2009) that the Internet Archive and other web archives use to store crawled pages. It records the full HTTP exchange (request, response, headers, body) for every resource on the page.
WACZ wraps WARC files with an index, a datapackage manifest, and a signature block into a single replayable .wacz file. On Ultimate-plan monitors with the Web Evidence Layer enabled, PageCrawl outputs WACZ on each check that produces a change. The file is self-contained. You email it to a regulator, attach it to a court filing, hand it to opposing counsel. They open it locally and see exactly what you saw.
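A recipient does not have to take the archive on faith. The sketch below shows one way to re-check a .wacz file locally with nothing but the Python standard library, assuming the manifest layout described in the WACZ specification: a datapackage.json with a resources list whose entries carry a path and a hash of the form sha256:<hex>. The demo archive is a tiny in-memory stand-in, not a real capture, and verify_wacz is a hypothetical helper name.

```python
import hashlib
import io
import json
import zipfile


def verify_wacz(wacz: zipfile.ZipFile) -> list[str]:
    """Return the paths whose contents do not match the manifest hash."""
    manifest = json.loads(wacz.read("datapackage.json"))
    mismatched = []
    for resource in manifest["resources"]:
        algorithm, _, expected = resource["hash"].partition(":")
        actual = hashlib.new(algorithm, wacz.read(resource["path"])).hexdigest()
        if actual != expected:
            mismatched.append(resource["path"])
    return mismatched


def make_demo_archive(tamper: bool = False) -> zipfile.ZipFile:
    """Build a tiny stand-in archive in memory (placeholder data only)."""
    warc = b"WARC/1.1\r\nplaceholder record\r\n"
    manifest = {
        "profile": "data-package",
        "resources": [
            {
                "name": "data.warc",
                "path": "archive/data.warc",
                "hash": "sha256:" + hashlib.sha256(warc).hexdigest(),
                "bytes": len(warc),
            }
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("datapackage.json", json.dumps(manifest))
        # Optionally alter the stored file after its hash was recorded.
        zf.writestr("archive/data.warc", warc + (b"x" if tamper else b""))
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


print(verify_wacz(make_demo_archive()))
print(verify_wacz(make_demo_archive(tamper=True)))
```

For a file on disk the call would be verify_wacz(zipfile.ZipFile("capture.wacz")), where the file name is a placeholder. This checks integrity only; the signature and timestamps are verified separately.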
Domain-identity signature (Let's Encrypt)
The first attestation is the domain-identity signature. PageCrawl signs each archive using a certificate issued by Let's Encrypt, proving the archive was packaged by an entity controlling a specific domain at the time of signing. It is the same trust chain your browser already uses for HTTPS. It does not prove the content is authentic to the original publisher; it proves the archiver's identity is auditable.
RFC 3161 timestamp from a commercial TSP
RFC 3161 defines the Time-Stamp Protocol. PageCrawl submits the archive hash to a commercial Trust Service Provider, which returns a signed token attesting the hash existed at that moment.
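To make the exchange concrete, here is a minimal sketch of the request side: it hashes the archive with SHA-256 and wraps the digest in a DER-encoded TimeStampReq as defined by RFC 3161. It is an illustration under assumptions, not PageCrawl's implementation: the helper names are hypothetical, the optional nonce and policy fields are left out for brevity, and no particular TSP is implied.

```python
import hashlib

# DER encoding of AlgorithmIdentifier { id-sha256 (2.16.840.1.101.3.4.2.1), NULL }.
SHA256_ALGORITHM_ID = bytes.fromhex("300d06096086480165030402010500")


def der(tag: int, content: bytes) -> bytes:
    """Encode one DER element; short-form lengths only (content under 128 bytes)."""
    if len(content) >= 128:
        raise ValueError("this sketch only supports short-form DER lengths")
    return bytes([tag, len(content)]) + content


def build_timestamp_request(archive_bytes: bytes) -> bytes:
    """Build a TimeStampReq: version 1, SHA-256 message imprint, certReq TRUE."""
    digest = hashlib.sha256(archive_bytes).digest()
    version = der(0x02, b"\x01")  # INTEGER 1
    message_imprint = der(0x30, SHA256_ALGORITHM_ID + der(0x04, digest))
    cert_req = der(0x01, b"\xff")  # BOOLEAN TRUE: ask the TSA to include its cert
    return der(0x30, version + message_imprint + cert_req)


request = build_timestamp_request(b"placeholder archive bytes")
print(len(request), request.hex())
```

The resulting bytes would be sent to the TSP over HTTP POST with the Content-Type application/timestamp-query; the signed token comes back as application/timestamp-reply. Only the hash leaves your systems, never the archive itself. Production requests commonly add a random nonce so that a reply cannot be replayed.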
RFC 3161 timestamps are routinely accepted in civil litigation, regulatory filings, and audit trails. They are the workhorse of the evidence stack. If you only get one timestamp, this is the one to get. PageCrawl issues one on each captured change for monitors that have the Web Evidence Layer enabled on the Ultimate plan.
Bitcoin blockchain anchor via OpenTimestamps
OpenTimestamps anchors the archive hash to the Bitcoin blockchain. After roughly an hour, anyone in the world can verify the timestamp by re-deriving the Merkle path against the Bitcoin block headers.
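The Merkle-path check can be sketched in a few lines of Python. This shows the principle only: real OpenTimestamps proofs are stored in a binary .ots file as a sequence of append, prepend, and hash operations, and Bitcoin's own transaction tree uses double SHA-256. The function names and the four-leaf tree are assumptions for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def apply_path(digest: bytes, path: list[tuple[str, bytes]]) -> bytes:
    """Walk from a leaf digest to the root by joining with siblings and hashing."""
    current = digest
    for side, sibling in path:
        if side == "left":  # sibling sits to the left of the current node
            current = sha256(sibling + current)
        elif side == "right":  # sibling sits to the right of the current node
            current = sha256(current + sibling)
        else:
            raise ValueError(f"unknown side: {side!r}")
    return current


def verify_anchor(digest: bytes, path: list[tuple[str, bytes]], root: bytes) -> bool:
    """True if the digest, combined along the path, reproduces the published root."""
    return apply_path(digest, path) == root


# Hypothetical four-leaf tree; leaf 2 stands in for our archive hash.
leaves = [sha256(b"archive-%d" % i) for i in range(4)]
node_01 = sha256(leaves[0] + leaves[1])
node_23 = sha256(leaves[2] + leaves[3])
root = sha256(node_01 + node_23)

path_for_leaf_2 = [("right", leaves[3]), ("left", node_01)]
print(verify_anchor(leaves[2], path_for_leaf_2, root))
```

The verifier needs only the archive hash, the short path, and the root committed on-chain. It never needs the other leaves, which is why one Bitcoin transaction can anchor many archives at once.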
Why this is in the stack: a Bitcoin anchor survives the TSP. If a TSP disappears in twenty years, the RFC 3161 token remains cryptographically valid, but verification depends on surviving root certificates. A Bitcoin anchor remains verifiable as long as Bitcoin's block headers are recoverable. That is roughly the same trust assumption as "the internet exists".
PageCrawl anchors every captured change on monitors with the Web Evidence Layer enabled. The Bitcoin anchor is included with the Web Evidence Layer on the Ultimate plan, not a separate line item.
Qualified timestamp from a QTSP on the EU Trusted List
Under eIDAS, qualified Trust Service Providers (QTSPs) on the EU Trusted List issue qualified electronic timestamps. A qualified timestamp carries a legal presumption of accuracy across the EU. Courts and regulators in member states must treat it as authoritative unless rebutted with specific evidence.
PageCrawl offers a qualified timestamp from a QTSP on the EU Trusted List as the eIDAS Custom add-on. We reserve the word "qualified" strictly for that case. The other timestamps in our stack are strong evidence, but they are not qualified in the eIDAS sense.
Who buys the Web Evidence Layer
Compliance and regulated industries
Pharma, financial services, gambling, alcohol, and supplements all run pages that must match an approved master copy. With the Web Evidence Layer enabled on the Ultimate plan, PageCrawl keeps the signed, timestamped archive of every captured version of those pages. When a regulator asks what the page said on the 12th of March, the monitoring alert is the smoke and the WACZ is the receipt.
Legal and litigation support
Trademark infringement, comparative-advertising disputes, breach-of-contract claims tied to published terms, dark-pattern enforcement, brand-impersonation takedowns. All of them turn on what a public page said on a specific date. PageCrawl's WACZ files are designed to be filed as primary exhibits. They replay the page as the user would have seen it, with working links and interactive elements.
AI agent builders
If your agent quotes, summarises, or acts on web content, you need provenance for its outputs. PageCrawl's MCP server at /mcp-server is available on every plan and gives the agent live access to monitors, diffs, and history. For the changes that need a defensible citation, enable the Web Evidence Layer on the Ultimate plan and the agent can hand the reviewer a WACZ archive, an RFC 3161 timestamp, and an OpenTimestamps anchor alongside the answer.
Journalists and researchers
Investigations cite pages that publishers later edit or delete. PageCrawl's archives, anchored to Bitcoin via OpenTimestamps, let a reader, an editor, or a fact-checker verify the source years after publication, even when the original site is offline.
Auth'd pages, not just public URLs
The pages worth proving are usually behind a login. Supplier portals, gated competitor pricing, internal dashboards, member-only industry sites. PageCrawl monitors auth'd pages from Standard and above, and on Ultimate-plan monitors with the Web Evidence Layer enabled those captures flow into the same WACZ pipeline as public pages, with the domain-identity signature, the RFC 3161 timestamp, and the Bitcoin anchor. The evidence layer does not stop at the login wall.
Three engines, one evidence pipeline
PageCrawl runs three capture engines (Fast, Default, Stealth) so the right tool handles each page. Fast is curl-grade for static HTML and feeds. Default is full browser rendering for ordinary modern sites. Stealth handles the hardest, most defended pages. PDF, RSS/Atom, JSON feeds, and inbound email are first-class sources too.
When the Web Evidence Layer is enabled on an Ultimate-plan monitor, whichever engine fires, the output flows into the same evidence pipeline: WACZ archive, domain-identity signature, RFC 3161 timestamp, OpenTimestamps anchor, and on the eIDAS Custom add-on a QTSP qualified timestamp.
Glossary, for the record
WACZ (Web Archive Collection Zipped). PageCrawl's archive output format. Replayable in any compliant viewer.
WARC (Web ARChive, ISO 28500). The underlying record format inside WACZ.
RFC 3161 (Time-Stamp Protocol). The IETF standard PageCrawl uses for commercial-TSP timestamps.
TSP / TSA. Trust Service Provider / Time-Stamping Authority. PageCrawl uses accredited commercial TSPs.
OpenTimestamps (opentimestamps.org). The Bitcoin-anchored timestamp protocol PageCrawl runs on each captured change for Ultimate-plan monitors that have the Web Evidence Layer enabled.
eIDAS (Regulation (EU) No 910/2014). The EU regulation governing qualified electronic timestamps. PageCrawl's eIDAS Custom add-on issues them.
QTSP. Qualified Trust Service Provider on the EU Trusted List. The only entities that can issue qualified electronic timestamps. PageCrawl partners with one.
Content addressing. Referencing a file by the hash of its contents. Change the file, change the address.
Chain of custody. The unbroken record of who handled an artefact and when. PageCrawl produces it for monitors with the Web Evidence Layer enabled on the Ultimate plan.
Start with the Web Evidence Layer
A monitoring alert is the doorbell. The Web Evidence Layer is the building. PageCrawl ships the full stack on the Ultimate plan, enabled per monitor: deterministic capture, content addressing, three parallel attestations, and replay, with a fourth, qualified timestamp available through the eIDAS Custom add-on. The rest of the product (auth'd pages on Standard and above, three engines, MCP for agents, multi-channel delivery, AI summaries, queryable history) is there whether you turn the evidence layer on or not.

