Last Week on My Mac: The mystery of Safari’s Web Archives
It’s both a joy and a curse that so many tell me of bugs they encounter. The joy is that it enables me to investigate and report them here, but the curse is when I can’t reproduce the problem. This week’s curse has been Safari’s webarchives, a topic that I had wisely avoided for several years. Search this blog using the search tool at the top right of any page and you’ll see just four articles here that mention webarchives, and this is now the second in the last ten days.
While I’m writing about searching this blog, I should point out that tool doesn’t take you out to Google or any general search engine, but confines its scope to articles published here. Although precious few seem to use it, I find it invaluable when preparing articles, and strongly recommend it.
Not only had I avoided tackling this topic, but I see from my own local search that I have seldom used webarchives myself, although not as a result of any unreliability.
In principle, Safari’s webarchives should rarely cause a problem. They’re written by converting what Safari already holds in memory for a webpage into a binary property list, a process termed serialisation, which is used effectively by a great many apps in more challenging circumstances. There may be occasions when this doesn’t quite work right, and it does require Safari to retain backward compatibility to ensure it can load and display property lists written some years ago. But by and large it should prove robust.
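As a rough illustration of what that serialisation produces, here is a minimal Python sketch that reads a webarchive as a property list and summarises its contents. The key names (WebMainResource, WebSubresources, WebSubframeArchives, WebResourceURL and so on) are those commonly seen in Safari’s webarchives, but treat them as assumptions to check against your own files. The function names and the sample archive are invented for this example, and Python’s plistlib reads binary and XML property lists alike.

```python
import plistlib


def summarise_webarchive(data: bytes) -> dict:
    """Summarise the raw bytes of a webarchive (hypothetical helper)."""
    # plistlib.loads() detects binary or XML property lists automatically.
    archive = plistlib.loads(data)
    main = archive.get("WebMainResource", {})
    return {
        "url": main.get("WebResourceURL"),
        "mime_type": main.get("WebResourceMIMEType"),
        "main_bytes": len(main.get("WebResourceData", b"")),
        "subresources": len(archive.get("WebSubresources", [])),
        "subframes": len(archive.get("WebSubframeArchives", [])),
    }


def build_sample_archive() -> bytes:
    """Build a tiny synthetic archive, so the sketch runs without a real file."""
    sample = {
        "WebMainResource": {
            "WebResourceURL": "https://example.com/",
            "WebResourceMIMEType": "text/html",
            "WebResourceTextEncodingName": "UTF-8",
            "WebResourceFrameName": "",
            "WebResourceData": b"<html><body>Hello</body></html>",
        },
        "WebSubresources": [
            {
                "WebResourceURL": "https://example.com/style.css",
                "WebResourceMIMEType": "text/css",
                "WebResourceData": b"body { margin: 0 }",
            },
        ],
    }
    return plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)
```

To try this on a real file, read it with open(path, "rb").read() and pass the bytes to summarise_webarchive(). If the property list parses cleanly, the file itself is unlikely to be the reason Safari refuses to display it.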
In practice, there are quite a few who appear unable to get this to work with many versions of Safari, yet I can’t repeat that here. For one reader, the most recent version of Safari that can reliably open their webarchives is 18.6, which is the only version I have experienced problems with. Running in macOS Ventura 13.7.8 here, that version appears unable to open the webarchives it creates, or those from later versions of Safari. Meanwhile Safari 26.1 running in macOS 26.1 has no trouble opening any webarchive I’ve tried from 2009 onwards.
For the last three years, Safari and its supporting libraries including WebKit have been provided to macOS in a cryptex, where they can’t be modified. The only way the user can go beyond Safari’s settings to change its behaviour is using Safari Extensions, which are controlled by Apple. There doesn’t appear to be any way for the user to prevent WebKit and Safari from loading webarchives correctly, intentionally or inadvertently.
Cursed by my inability to reproduce the problems reported, I have immersed myself in a couple of lengthy log extracts. One documents Safari 18.6 failing to open a webarchive it created, the other shows Safari 26.1 successfully opening the same webarchive.
Safari 18.6 seems to have been making good progress opening the webarchive until it came to loading the main frame. It then needed PolicyForNavigationAction before it could go any further:
01.154639 com.apple.WebKit Loading Safari WebKit 0x14c19b818 - [pageProxyID=21, webPageID=22, PID=596] WebPageProxy::decidePolicyForNavigationAction: listener called: frameID=24, isMainFrame=1, navigationID=26, policyAction=0, safeBrowsingWarning=0, isAppBoundDomain=0, wasNavigationIntercepted=0
01.154642 com.apple.WebKit Loading Safari WebKit 0x14c19b818 - [pageProxyID=21, webPageID=22, PID=596] WebPageProxy::receivedNavigationActionPolicyDecision: frameID=24, isMainFrame=1, navigationID=26, policyAction=0
01.154666 com.apple.WebKit Loading Safari WebKit 0x14c19b818 - [pageProxyID=21, webPageID=22, PID=596] WebPageProxy::isQuarantinedAndNotUserApproved: failed to initialize quarantine file with path.
01.154666 com.apple.WebKit Loading Safari WebKit 0x14c19b818 - [pageProxyID=21, webPageID=22, PID=596] WebPageProxy::receivedNavigationActionPolicyDecision: file cannot be opened because it is from an unidentified developer.
01.154799 Error Safari Safari Web view (pid: 596) did fail provisional navigation (Error Domain=NSURLErrorDomain Code=-999 "(null)")
So loading the main frame was halted with those chilling words “file cannot be opened because it is from an unidentified developer”, with which we’re only too familiar. The webarchive was in quarantine, it seems, and that put a stop to its loading. Only that isn’t quite accurate: there was no com.apple.quarantine xattr present, but one of those ubiquitous com.apple.macl xattrs instead. Safari had been stopped by its own security, didn’t even have the courtesy to inform us, and just sat there with an empty window going nowhere.
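If you want to check which of those two xattrs a webarchive carries, something like the following Python sketch could help. It assumes the xattr command-line tool is available, as it is on macOS, and that it prints one attribute name per line when given a single file. The function names are made up for this example, and the result only tells you which xattrs are present, not whether they are what stopped Safari.

```python
import subprocess

QUARANTINE = "com.apple.quarantine"
MACL = "com.apple.macl"


def list_xattrs(path: str) -> list[str]:
    """Return the xattr names on one file, using the macOS xattr tool."""
    result = subprocess.run(
        ["xattr", path], capture_output=True, text=True, check=True
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def describe_xattrs(names: list[str]) -> str:
    """Classify a list of xattr names (hypothetical helper)."""
    if QUARANTINE in names:
        return "quarantined"
    if MACL in names:
        return "macl only"
    return "neither"
```

On a Mac, describe_xattrs(list_xattrs("/path/to/page.webarchive")) would report "macl only" for a file like the one described here, where the path is a placeholder for your own file.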
Safari 26.1 shows how it should have been done:
00.740168 com.apple.WebKit 0xa4bda0718 - [pageProxyID=19, webPageID=20, PID=1035] WebPageProxy::decidePolicyForNavigationAction: listener called: frameID=4294967298, isMainFrame=1, navigationID=25, policyAction=Use, isAppBoundDomain=0, wasNavigationIntercepted=0
00.740172 com.apple.WebKit 0xa4bda0718 - [pageProxyID=19, webPageID=20, PID=1035] WebPageProxy::receivedNavigationActionPolicyDecision: frameID=4294967298, isMainFrame=1, navigationID=25, policyAction=Use
00.740233 com.apple.WebKit 0xa4bda0718 - [pageProxyID=19, webPageID=20, PID=1035] WebPageProxy::receivedNavigationActionPolicyDecision: Swapping in non-persistent websiteDataStore for web archive.
From then, WebKit moves apace and the archived webpage is soon displayed.
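When comparing lengthy log extracts like these, it can help to pull out the fields from each WebPageProxy entry and to flag the messages that marked the failure. This Python sketch does both. The marker strings are copied from the Safari 18.6 entries quoted above, while the function names are invented here, and other versions of WebKit may word their entries differently.

```python
import re

# Matches key=value pairs such as policyAction=Use or isMainFrame=1.
FIELD = re.compile(r"(\w+)=(\w+)")

# Messages that accompanied the blocked load in the Safari 18.6 extract.
BLOCK_MARKERS = (
    "isQuarantinedAndNotUserApproved",
    "file cannot be opened because it is from an unidentified developer",
)


def policy_fields(line: str) -> dict:
    """Return the key=value pairs that follow 'WebPageProxy::' in a log line."""
    # Partitioning first skips the bracketed pageProxyID/webPageID/PID prefix.
    _, _, tail = line.partition("WebPageProxy::")
    return dict(FIELD.findall(tail))


def blocked_lines(log_text: str) -> list[str]:
    """Return the log lines that contain any of the known block markers."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in BLOCK_MARKERS)
    ]
```

Run over the two extracts, policy_fields() makes it easy to see that the entries differ in how they record policyAction, and blocked_lines() picks out the two entries in the 18.6 log that precede the failed navigation.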
This doesn’t of course mean that every failure by Safari to open and display a webarchive results from a navigation policy decision that the file can’t be opened because of this security problem, but I suspect this isn’t the only time this has occurred. The vagaries of com.apple.macl xattrs are well known, and their propensity to cause other innocent actions to be blocked is only too familiar. Unfortunately, the only reliable workaround is to knock a hole through macOS security by disabling SIP. But for Safari to block a webarchive without displaying any information to the user is unforgivable.
Other apps that access Safari’s webarchives don’t appear tainted by this behaviour. Michael Tsai of C-Command Software tells me that his EagleFiler app hasn’t had such problems since its introduction in 2006. If you’ve been struggling to open webarchives in Safari, you might like to consider whether that could address those problems. In the meantime, I can see what I’ll be doing over Christmas.
I’m very grateful to Michael Tsai of C-Command Software for information and discussion.