From quarantine to provenance: how xattrs are copied

By: hoakley
13 September 2024 at 14:30

In the previous article, I outlined what extended attributes do, and how they work in macOS. I also started to explain how some are considered ephemeral, while others are persistent. This article continues from there, by documenting how macOS decides what to do with them when a file containing xattrs is copied.

Although Apple does now explain a little about this in the context of the FileProvider framework and syncing with cloud services, the only useful documentation is provided in man xattr_name_with_flags, and two source code files that are part of the open source copyfile component.

In 2013, as part of its enhancements for iCloud in particular, Apple added support for flags on xattrs to indicate how those xattrs should be handled when the file is copied in various ways. Rather than change the file system, Apple opted for what’s perhaps best seen as an elegant kludge: appending characters to the end of the xattr’s name.

If you work with xattrs, you’ve probably already seen this in those whose name ends with a hash # then one or more characters: that’s actually the flags, not part of the name, what Apple refers to as a ‘property list’. To avoid confusion I won’t use that term here, but refer to them as xattr flags. A common example of this is com.apple.lastuseddate#PS, which is seen quite widely. In recent years, Apple has added one flag, B, and there’s another to come with Sequoia.

Xattr flags

Flags can be upper or lower case letters C, N, P, S or B, and invariably follow the # separator, which is presumably otherwise forbidden from use in a xattr’s name. Upper case sets or enables that property, while lower case clears or disables that property. There are currently (macOS 14.6.1) five properties:

  • C: XATTR_FLAG_CONTENT_DEPENDENT, which ties the xattr to the file contents, so the xattr is rewritten when the file data changes. This is normally used for checksums and hashes, text encoding, and position information. The xattr is preserved for copy and share, but not in a safe save.
  • P: XATTR_FLAG_NO_EXPORT, which doesn’t export or share the xattr, but normally preserves it during copying.
  • N: XATTR_FLAG_NEVER_PRESERVE, which ensures the xattr is never copied, even when copying the file.
  • S: XATTR_FLAG_SYNCABLE, which ensures the xattr is preserved during syncing with services such as iCloud Drive. Default behaviour is for xattrs to be stripped during syncing, to minimise the amount of data to be transferred, but this will override that.
  • B: XATTR_FLAG_ONLY_BACKUP, which keeps the xattr only in backups, including Time Machine (added recently).

These operate within another general restriction of xattrs: their name cannot exceed a maximum of 127 UTF-8 characters.
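
To make the naming scheme concrete, here's a minimal sketch in Python of how a flagged name could be split into its base name and flags. It isn't Apple's code: the function name split_xattr_name is my own, treating the first # as the separator is an assumption, as is counting the 127 limit in UTF-8 bytes, and rejecting unknown letters is simply a choice made for this sketch.

```python
# Illustrative sketch only, not Apple's implementation in copyfile.
FLAG_NAMES = {
    "C": "XATTR_FLAG_CONTENT_DEPENDENT",
    "P": "XATTR_FLAG_NO_EXPORT",
    "N": "XATTR_FLAG_NEVER_PRESERVE",
    "S": "XATTR_FLAG_SYNCABLE",
    "B": "XATTR_FLAG_ONLY_BACKUP",
}
MAX_NAME_LENGTH = 127  # limit given in the text; counted in UTF-8 bytes here (assumption)


def split_xattr_name(full_name):
    """Return (base_name, set_flags, cleared_flags) for a name such as 'x.y#PS'."""
    if len(full_name.encode("utf-8")) > MAX_NAME_LENGTH:
        raise ValueError("xattr name is longer than 127 bytes")
    # Assumption: everything after the first '#' is the flag suffix.
    base, _, suffix = full_name.partition("#")
    set_flags, cleared_flags = set(), set()
    for letter in suffix:
        if letter.upper() not in FLAG_NAMES:
            # Rejecting unknown letters is a choice made for this sketch.
            raise ValueError(f"unknown xattr flag {letter!r}")
        if letter.isupper():
            set_flags.add(letter)              # upper case sets the property
        else:
            cleared_flags.add(letter.upper())  # lower case clears it
    return base, set_flags, cleared_flags
```

Calling split_xattr_name("com.apple.lastuseddate#PS") separates the base name com.apple.lastuseddate from the two flags P and S that are set.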

Defaults

macOS provides a standard ‘whitelist’ of default flag settings for different types of xattr. These aren’t contained in a configuration file, but are baked into the xattr flag code, where as of macOS 14.6.1 the following default flags are set for different types of xattr (* here represents the wild card):

  • com.apple.quarantine – PCS
  • com.apple.TextEncoding – CS
  • com.apple.metadata:kMDItemCollaborationIdentifier – B
  • com.apple.metadata:kMDItemIsShared – B
  • com.apple.metadata:kMDItemSharedItemCurrentUserRole – B
  • com.apple.metadata:kMDItemOwnerName – B
  • com.apple.metadata:kMDItemFavoriteRank – B
  • com.apple.metadata:* (except those above) – PS
  • com.apple.security.* – S
  • com.apple.ResourceFork – PCS
  • com.apple.FinderInfo – PCS
  • com.apple.root.installed – PC
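
The list above can be modelled as an ordered lookup table, sketched here in Python. The table contents come straight from that list; the function name default_flags and the use of a trailing * for prefix matching are my own conventions for illustration, not anything taken from Apple's source.

```python
# Default flags as listed above for macOS 14.6.1.
# Order matters: the specific com.apple.metadata entries come before the wild card.
DEFAULT_FLAGS = [
    ("com.apple.quarantine", "PCS"),
    ("com.apple.TextEncoding", "CS"),
    ("com.apple.metadata:kMDItemCollaborationIdentifier", "B"),
    ("com.apple.metadata:kMDItemIsShared", "B"),
    ("com.apple.metadata:kMDItemSharedItemCurrentUserRole", "B"),
    ("com.apple.metadata:kMDItemOwnerName", "B"),
    ("com.apple.metadata:kMDItemFavoriteRank", "B"),
    ("com.apple.metadata:*", "PS"),
    ("com.apple.security.*", "S"),
    ("com.apple.ResourceFork", "PCS"),
    ("com.apple.FinderInfo", "PCS"),
    ("com.apple.root.installed", "PC"),
]


def default_flags(name):
    """Return the default flag letters for an xattr name, or '' if it isn't listed."""
    for pattern, flags in DEFAULT_FLAGS:
        if pattern.endswith("*"):
            # Wild card entry: match on the prefix before the '*'.
            if name.startswith(pattern[:-1]):
                return flags
        elif name == pattern:
            return flags
    return ""
```

With this table, com.apple.metadata:kMDItemWhereFroms falls through to the wild card and gets PS, while com.apple.metadata:kMDItemIsShared matches its own entry first and gets B.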

Copy intents

Also contained in the source code is a table of intents, that explains how different types of copy are affected by different combinations of xattr flag. Currently, those are:

  • XATTR_OPERATION_INTENT_COPY – a simple copy, preserves xattrs that don’t have flag N or B
  • XATTR_OPERATION_INTENT_SAVE – save, where the content may be changing, preserves xattrs that don’t have flag C or N or B
  • XATTR_OPERATION_INTENT_SHARE – share or export, preserves xattrs that don’t have flag P or N or B
  • XATTR_OPERATION_INTENT_SYNC – sync to a service such as iCloud Drive, preserves xattrs if they have flag S, or have neither N nor B
  • XATTR_OPERATION_INTENT_BACKUP – back up, e.g. using Time Machine, preserves xattrs that don’t have flag N
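
Those five rules are simple enough to express as a single function. This Python sketch encodes only the table as described above, using short intent names of my own choosing; it doesn't attempt to model any additional rules that FileProvider or iCloud Drive might apply.

```python
def is_preserved(flags, intent):
    """Decide whether an xattr with these flags survives a given copy intent.

    flags  -- the upper-case flag letters in effect, e.g. "PCS" or {"S"}
    intent -- one of "copy", "save", "share", "sync", "backup"
              (hypothetical short names for the XATTR_OPERATION_INTENT_* values)
    """
    flags = set(flags)
    if intent == "copy":
        return not (flags & {"N", "B"})
    if intent == "save":
        return not (flags & {"C", "N", "B"})
    if intent == "share":
        return not (flags & {"P", "N", "B"})
    if intent == "sync":
        return "S" in flags or not (flags & {"N", "B"})
    if intent == "backup":
        return "N" not in flags
    raise ValueError(f"unknown intent {intent!r}")
```

For example, the quarantine xattr with its default PCS flags is preserved by a simple copy, but not by a save or a share, while a xattr flagged B is preserved only by a backup.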

Use

If you want a xattr preserved when it passes through iCloud, you therefore need to give it a name ending in the xattr flag S, such as co.eclecticlight.MyTest#S. Sure enough, when xattrs with that flag are passed through iCloud Drive, those xattrs are preserved even if the default rule would treat them differently. Similarly, to have a xattr that is stripped even when you just make a local copy of that file, append #N to its name.

There’s a further limit imposed on xattrs synced by FileProvider, including those for iCloud Drive, that strips all individual xattrs that are larger than a certain size. Apple gives that as “about 32KiB total for each item”, and my measurements performed in the recent past put that at about 32,650 bytes, slightly less than 32,767.

In itself, this information is valuable if you ever use any metadata stored in xattrs. It’s used in my integrity-checking utilities Dintch, Fintch and cintch to ensure the xattr containing a file’s hash isn’t stripped by passage through iCloud Drive, for instance. On Tuesday morning next week, once Sequoia has been released, I’ll explain how Apple has extended this system to achieve something that many have been wishing for.

From quarantine to provenance: extended attributes

By: hoakley
12 September 2024 at 14:30

One of the innovative features in classic Mac OS was its use of resource forks, allowing structured metadata to be attached to any file. When Mac OS X merged that with the more traditional Unix approach adopted by NeXTSTEP, those were nearly lost. Classic Mac apps had stored most of their components, including their executable code, in their resource fork; Mac OS X flattened those into an app bundle consisting of a hierarchy of separate files in folders, without any resources.

For the first four years of Mac OS X resource forks were reluctantly tolerated, until the solution came in 10.4 with the introduction of extended attributes, including one to contain what had previously been stored in the resource fork, which became an extended attribute or xattr with the name com.apple.ResourceFork.

All files in HFS+ and APFS (and other file systems) contain a fairly standard set of metadata known as attributes, information about a file such as its name, datestamps and permissions. Xattrs are extensions to those that contain almost any other type of metadata, the first notable xattr coming in Mac OS X 10.5, named com.apple.quarantine. That contains quarantine information for apps and other files downloaded from the internet, in a format so ancient that the quarantine flag is stored not in binary but as text.

The quarantine xattr provides a good demonstration of some of the valuable properties of xattrs: it can be attached to any file (or folder) without changing its data, and isn’t included when calculating CDHashes for code signatures. It can thus be added safely without any danger of altering the app or its code, although it does change the way in which macOS handles the code, by triggering security checks used to verify it isn’t malicious. Once those have been run, the flag inside the quarantine xattr can be changed to indicate it has been checked successfully.

Far from being a passing phase, or dying out as some had expected, xattrs have flourished since those early days. This has happened largely unseen by the user: few change anything revealed in the Finder’s Get Info dialog, although they’re used to store some forms of visible metadata such as Finder tags, and the URL used to download items from the internet. Editing xattrs is normally performed silently: you’re not made aware of changes in the quarantine xattr, and in most cases the only way to manage xattrs is to use the xattr command tool, or one of very few apps like xattred that can edit and manage them.

Examples

Among the well-known and important xattrs you can encounter are:

  • com.apple.quarantine the quarantine xattr, containing a quarantine flag
  • com.apple.rootless marks items individually protected by System Integrity Protection (SIP)
  • com.apple.provenance contains data about the origin of apps that have been quarantined
  • com.apple.metadata:kMDItemCopyright records copyright info
  • com.apple.metadata:kMDItemWhereFroms the origin of downloaded file as a URL
  • com.apple.metadata:_kMDItemUserTags Finder tags
  • com.apple.TextEncoding reveals text file encoding
  • com.apple.ResourceFork a classic Mac resource fork

Storage

In APFS and HFS+, xattrs aren’t stored with file data, nor with a file or folder’s normal attributes.

For smaller extended attributes up to 3,804 bytes, their data is stored with the xattr in the file system metadata. Larger extended attributes are stored as data streams, with separate records, but still separately from the file data. Apple doesn’t give a limit on the maximum size of xattrs, but they can certainly exceed 200 KB, and each file and folder can have an effectively unlimited number of them.

Persistence

Most file systems to which macOS can write either handle xattrs natively (HFS+, APFS), or macOS uses a scheme to preserve them, as in the hidden files written to FAT and ExFAT volumes. NFS is an important exception, and files copied to NFS will have all their xattrs stripped. Neither are extended attributes unique to Macs: most file systems used by Linux support them, and even Windows can at a push.

Because xattrs contain a wide range of metadata, some are treated as being ephemeral, others as persistent. Moving files with xattrs around within the same volume shouldn’t affect their xattrs, as that takes place within the same file system. Copying files to another volume, even if both use APFS, may leave some xattrs behind if they’re considered to be ephemeral.

The most complex situation is when a file with xattrs is moved to iCloud Drive. The Mac that originated that file is likely to retain most if not all of its xattrs, because the local copy remains within the same volume and file system. However, not all xattrs are copied up to iCloud storage, so other Macs accessing that file may only see a small selection of them. The rules for which xattrs are to be preserved during file copying, including in iCloud Drive, are baked into macOS, and the subject of the next article.

A brief history of Time Machine

By: hoakley
7 September 2024 at 15:00

In the days before Mac OS X, Apple didn’t provide a serious backup utility, and by the time we were starting to move up from Classic Mac OS the standard choice was normally Dantz Development’s Retrospect, first released in 1989 and still available today in version 19.

Time Machine wasn’t the first utility in Mac OS to back up local storage. In 2004, Apple’s first cloud subscription service .Mac included a Backup app that backed up local files to iDisk in the cloud, something that still isn’t supported today with iCloud.

In the following years, AirPort Wi-Fi systems flourished, and Apple decided to launch a consumer NAS incorporating an AirPort Extreme Base Station with a 500 GB or 1 TB hard disk. Software to support that was dubbed Time Machine, and was released in Mac OS X 10.5 Leopard on 26 October 2007, 17 years ago. The first Time Capsule was announced in January 2008, and shipped a month later.

Time Machine’s pane in System Preferences changed little until Ventura’s System Settings replaced it.

The application’s restore interface featured a single Finder-like window, much like today’s. Internally, Time Machine scheduled its backups using a system timer and launchd, making backups every hour regardless of what else the Mac might have been doing at the time.

The initial version of Time Machine was both praised and slated. Unlike Mike Bombich’s rival Carbon Copy Cloner, it couldn’t create bootable backups, and there were problems with FileVault encryption, which at that time could only encrypt Home folders, rather than whole volumes. Despite those, its introduction transformed the way that many used their Macs, and made it more usual for users to have backups.

From its release, Time Machine was dependent on features of the HFS+ file system to create its Finder illusion. Every hour the backup service examined the record of changes made to the file system since the last backup was made, using its FSEvents database. It thus worked out what had changed and needed to be copied into the backup. During the backup phase itself, it only copied across those files that had been created or changed since the last backup was made.

It did this by using hard links in the backup, and Apple added a new feature to its HFS+ file system to support this, directory hard links. Where an entire folder had remained unchanged since the last backup, Time Machine simply created a hard link to the existing folder in that backup. Where an existing file had been changed, though, the new file was written to the backup inside a changed folder, which in turn could contain hard links to its unchanged contents.

This preserved the illusion that each backup consisted of the complete contents of the source, while only requiring the copying of changed files, and creation of a great many hard links to files and folders. It was also completely dependent on the backup volume using the HFS+ file system, to support those directory hard links.

Without directory hard links, backups would quickly have become overwhelmed by hard links to files. If you had a million files and folders on the backup source volume, every hourly backup would have had to create a total of a million copied files or hard links. Directory hard links thus enabled the efficiency needed for this novel scheme to work.
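
To put rough numbers on that, here's a toy model in Python. It's purely illustrative and bears no relation to how Time Machine was actually implemented: a folder is a dict, a file is a bool saying whether it has changed since the last backup, and the function counts how many items (copied files, new folders or hard links) one backup would have to create.

```python
def has_changes(tree):
    """True if any file within this file or folder has changed."""
    if isinstance(tree, bool):
        return tree
    return any(has_changes(child) for child in tree.values())


def entries_needed(tree, directory_hard_links=True):
    """Count the items one backup must create for this file or folder.

    tree is a bool for a file (True means changed since the last backup)
    or a dict of name -> subtree for a folder.
    """
    if isinstance(tree, bool):
        return 1  # either one copied file, or one hard link to the old copy
    if directory_hard_links and not has_changes(tree):
        return 1  # a single hard link to the whole unchanged folder
    # A changed folder (or any folder, without directory hard links) is
    # created afresh, and each of its contents needs its own entry.
    return 1 + sum(entries_needed(child, directory_hard_links)
                   for child in tree.values())


source = {
    "Documents": {"a.txt": False, "b.txt": True},
    "Pictures": {f"img{i}.jpg": False for i in range(1000)},
}
```

In this small example, entries_needed(source, directory_hard_links=False) comes to 1,005 items, but with directory hard links it falls to just 5, as the whole unchanged Pictures folder is covered by one link.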

Apple later introduced what it termed Mobile Time Machine, intended for notebooks that could be away from their normal backup destination for some time. In around 10,000 lines of code, Mac OS X came to create something like a primitive snapshot, only on HFS+.

When macOS introduced its new DAS-CTS scheduling and dispatch system for background activities, in (about) Sierra, Time Machine’s backups were added to that. That proved unfortunate at the time because of a bug in that system, which failed on Macs left running continuously for several days, when backups could become infrequent and irregular.

When Apple released the first version of APFS for the Mac in High Sierra, its new snapshot feature was immediately incorporated into Time Machine to replace the earlier Mobile variant. Initially, APFS snapshots were also used instead of the FSEvents database to determine what should be backed up. Since then, making each backup of an APFS volume has involved creating a snapshot that’s stored locally on the APFS volume being backed up. In High Sierra and Mojave, the structure of backups themselves didn’t change, so they still required an HFS+ volume and relied on directory hard links.

Catalina introduced a more complicated scheme to replace snapshots as the usual means for determining what to back up. This was presumably because computing a snapshot delta had proved slow. As the backup destination remained in HFS+ format that couldn’t use snapshots, it continued to rely on directory hard links.

Big Sur and its successors with Signed System Volumes (SSVs) retained the option to continue backing up to HFS+ volumes, but added the ability to back up APFS volumes to APFS backup storage at last.

When backing up to APFS, Time Machine reverses the design used in High Sierra: instead of using snapshots to determine what needed to be backed up before creating a backup using traditional hard links, most of the time Time Machine determines what has changed using the original method with FSEvents, then creates each backup as a synthetic snapshot on the backup store. Unlike earlier versions, in Big Sur and later Time Machine can’t back up the System volume.

Once Time Machine has made a detailed assessment of the items to be backed up, it forecasts the total size to be copied. The local snapshot is copied to an .inprogress folder on the backup volume, and backup copying proceeds. Where possible, only changed blocks of files are copied, rather than having to copy the whole of every file’s data, an option termed delta-copying that can result in significant savings. Old backups are removed both according to age, and to maintain sufficient free space on the backup volume, in what Time Machine refers to as age-based and space-based thinning.

Data copied to assemble the backup on the backup volume is formed into a synthetic snapshot used to present the contents of that backup both in the Time Machine app and the Finder. Those snapshots are presented in /Volumes/.timemachine/ although they’re still stored on the backup volume.

Although modern Time Machine backups to APFS are both quicker and more space-efficient, the structure of backup storage poses problems. Copying backup stores on HFS+ was never easy, but there are currently no tools that can transfer those on APFS to another disk.

Behind the familiar interfaces of its app and settings, Time Machine has come a long way over the last few years, from building an illusion using huge numbers of hard links to creating synthetic snapshots.

What is Macintosh HD now?

By: hoakley
2 September 2024 at 14:30

Perhaps you just tried to save a document, only to be told you don’t have sufficient permissions to do so, or attempted to make another change to what’s on your Mac’s internal storage, with similar results. You then select the Macintosh HD disk in the Finder and Get Info. No wonder that didn’t work, as you only have read-only access to that disk. But if you unlock it and try to make any changes to permissions, you’re shown an error alert instead.

What’s going on?

Between macOS Mojave, with its single system volume, and Big Sur, the structure of the Mac system or boot volume has changed, with Catalina as an intermediate. Instead of Macintosh HD (or whatever you might have renamed it to) being one volume on your boot disk, it’s now two intertwined and joined together. What you see now as Macintosh HD isn’t even a regular APFS volume, but a read-only snapshot containing the current macOS. No wonder you can’t change it.

Root

Select the boot disk Macintosh HD in the Finder, and it appears to have four visible folders, Applications, Library, System and Users, just like it always did. Press Command-Shift-. to reveal hidden folders and all the usual suspects like bin, opt and usr are still where they should be. That’s the root of the combined System and Data volumes, and what’s shown there is a combination of folders on both volumes, with the top level or root on the Sealed System Volume (SSV).

The contents of those folders are also the result of both volumes being merged together using what Apple terms firmlinks:

  • Applications contains apps installed in your own Applications folder on the Data volume, and those bundled in macOS on the SSV. You can see just the latter in the path System/Applications, where they appear to be duplicated, but aren’t really.
  • Library comes only from the Data volume, and all its contents are on that volume. But inside it, in the path Library/Apple/System/Library are some components that should appear in the main System/Library.
  • System comes only from the SSV, although it has some contents merged into it using firmlinks, such as those folders in Library.
  • Users also comes only from the Data volume, and includes all Home folders for users.

So while the root of Macintosh HD might be in the SSV, much of its contents are on the Data volume, and can be written to, even though the root is a read-only snapshot, thanks to those firmlinks.

Data volume

There are two places that mounted volumes are listed in the Finder: the hidden top-level folder Volumes, where Macintosh HD is just a link back to the root complete with its merged volumes, and in System/Volumes, where what’s shown as Macintosh HD is in fact not the merged volumes, but only the Data volume. You can confirm that by looking at what’s in System/Volumes/Macintosh HD/System, where you only see the parts of the System folder that are stored on the Data volume, and not those stored on the SSV.

What is more confusing there is that System/Volumes/Macintosh HD/Applications is the same merged folder containing both user and bundled apps as in the top-level Applications folder. That’s an artefact resulting from the way that its firmlink works.

But if you open the Get Info dialog on System/Volumes/Macintosh HD, you’ll see the same as with the root Macintosh HD disk, information about the root and not the Data volume.

Mounted in System/Volumes are several other volumes like VM and Preboot, and (depending on whether this is an Intel or Apple silicon Mac) folders such as Recovery and xarts, that you really don’t want to mess with.

Permissions problems

Tackling problems that appear to be the result of incorrect permissions is best done at the lowest folder level. If you’re trying to save a document to the Documents folder inside your Home folder, select that and Get Info on it. Chances are that you are the owner and have both Read & Write permissions as you should. In that case, the problem most likely rests with privacy protection as in Privacy & Security settings. You then suffer Catch-22, as you can only effect changes to those by closing and opening the app, and as you can’t save your document before closing the app, you’re at risk of losing its contents. You may have better luck trying a different folder, creating a new one inside your Home folder, or using the Save As… command instead (which may be revealed by holding the Option key when opening the File menu).

Full layout

In case you’re wondering exactly which folders are merged into the hybrid Macintosh HD ‘volume’, those are shown below in increasing levels of detail, starting with the broad layout.

[Diagram: broad layout of the boot volume group]

Then to a simplified version of the full layout.

[Diagram: simplified version of the full layout]

Finally, in complete detail.

[Diagram: full layout in complete detail]

Happy navigating!

Links, aliases and the decline of paths

By: hoakley
26 August 2024 at 14:30

One of the fundamental requirements of any decent operating system and its primary file system is to provide objects that link to files and directories in storage. These allow code and users to access those files and folders indirectly, by referencing them from elsewhere. In most systems there are two methods of achieving this, by referencing the directory path and name of the file/folder, or using a reference to an identity number unique to that file system and part of that file’s attributes, normally the inode number. These form symbolic and hard links respectively.

Symlinks

A symlink is a separate file containing the path to the file or folder that it links to. It thus has its own File System Object, and its own tiny data containing the path to the original. This works well as long as that path is preserved in its entirety. The inode of the original file is unaware of any symlinks to it, so deleting the original file also breaks every symlink to that file.

Symlinks are easily created in Terminal using a command like
ln -s /Users/myname/Movies/myMovie.mov /Users/myname/Documents/Project1/myNewMovie.mov
which creates a tiny file myNewMovie.mov containing the path /Users/myname/Movies/myMovie.mov referring to the file it links to.
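
The same behaviour can be checked from code. This Python sketch, using only the standard library and working in a temporary folder rather than the paths above, creates a symlink, reads back the path stored inside it, then deletes the original to show that the symlink remains but no longer resolves. The function name symlink_demo is just for illustration.

```python
import os
import tempfile


def symlink_demo():
    """Create a symlink, then delete its original, and report what happens."""
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "myMovie.mov")
        link = os.path.join(root, "myNewMovie.mov")
        with open(original, "w") as f:
            f.write("movie data")

        os.symlink(original, link)              # same as: ln -s original link
        stored_path = os.readlink(link)         # the path held inside the symlink
        resolves_before = os.path.exists(link)  # exists() follows the link

        os.remove(original)                     # delete the original file
        return {
            "stores_original_path": stored_path == original,
            "resolves_before": resolves_before,
            "link_remains": os.path.lexists(link),   # the symlink itself is still there
            "resolves_after": os.path.exists(link),  # but it now points at nothing
        }
```

Running symlink_demo() should report that the link stored the original's path and resolved at first, then remained in place but broken once the original had gone.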

You can’t create symlinks using the Finder but can using third-party apps, and they should now work fully with QuickLook thumbnails and previews. They essentially take no space on disk. Paths used can be shorter and relative, allowing an enclosing folder to be moved within that volume, which would break a full absolute path. They can also specify files on other volumes, so long as the path to them remains correct.

Paths in symlinks work most reliably in the macOS SSV, as it’s both structured and static. Although the disciplined user can use them effectively in their Home folder, they are prone to unintended effects such as the renaming of an intermediate directory, moving any component in the path, and even changing the Unicode normalisation of a character in the path. As symlinks only contain a path, there’s no fallback to try resolving a broken path, making them inherently fragile, not robust.

Hard links

In APFS, a hard link is actually a single file, with one File System Object, that has two or more references in the form of Siblings that are associated with the file inode. That object has a single set of File Extents, so each of the siblings refers to exactly the same file and data, although siblings will normally have different paths. In the object, the reference count equals the number of siblings; when siblings are deleted, that count is decremented, and the object is only removed when the count reaches zero.

Hard links are easily created in Terminal using a command like
ln /Users/myname/Movies/myMovie.mov /Users/myname/Documents/Project1/myNewMovie.mov
to create a new hard link named myNewMovie.mov to the original myMovie.mov.

You can’t create hard links in the Finder, and because they use a single set of File Extents, they essentially take no space on disk.
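
Here's a matching Python sketch for hard links, again using only the standard library in a temporary folder. It relies on the generic POSIX inode number and link count reported by os.stat rather than anything specific to APFS, so it only illustrates the reference counting described above; hard_link_demo is a name of my own.

```python
import os
import tempfile


def hard_link_demo():
    """Create a hard link, delete the original name, and report what happens."""
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "myMovie.mov")
        link = os.path.join(root, "myNewMovie.mov")
        with open(original, "w") as f:
            f.write("movie data")

        os.link(original, link)  # same as: ln original link
        same_inode = os.stat(original).st_ino == os.stat(link).st_ino
        count_with_both = os.stat(link).st_nlink      # two names, one file

        os.remove(original)                           # remove one of the names
        count_after_delete = os.stat(link).st_nlink   # count is decremented
        with open(link) as f:
            data = f.read()                           # data is still intact
        return same_inode, count_with_both, count_after_delete, data
```

Both names share one inode with a link count of 2; deleting the original name drops the count to 1 and leaves the data fully readable through the remaining name.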

Hard links look and work exactly like the original, and can be moved around freely within the same volume as they’re not dependent on paths. Copy one to another volume, though, and the copy will be a complete unlinked file. Hard links to files and to directories were one of the essential ingredients of Time Machine backups on HFS+, but as APFS doesn’t support directory hard links, Time Machine now has to use a different backup format for storage on APFS.

Aliases and bookmarks

Aliases originated in System 7, when Mac OS lacked the BSD/Unix features that were to come in Mac OS X. At that time, paths were seldom used, making symlinks implausible, so the original Finder alias was instead built on the equivalent of an inode number. They have evolved into a combination offering path information and inode number that should be able to resolve links that would break the path in symlinks, and can extend to other volumes, unlike hard links.

Despite their robustness and versatility, macOS provides no bundled command tools for the creation of aliases or their resolution into paths, making them of no use in shell scripts or the command line, although the free alisma addresses these. Neither have they been integrated into APFS as distinct objects in the file system, where they’re just another file.

Bookmarks are a variant of aliases intended to be used in files for similar purposes. For example, the list of files shown in the Open Recent menu command in apps is assembled from a file containing bookmarks to each of those documents. Most lists of apps and files built by Launch Services and other sub-systems are similarly reliant on bookmarks.

Demonstration

To demonstrate how these three different types of link behave, create a folder in your Home Documents folder (~/Documents), and copy a test document into that. Next to that file, create another folder named links, and within that create a symlink (using a full absolute path), a hard link, and an alias to the original file.

Copy the folder containing the links to another volume, and note how the hard link becomes a separate local file instead of a link, but the symlink and alias continue to link to the original, whose path is unchanged.

Rename the folder enclosing the test document and links folder, and note how the symlink is then broken, although QuickLook may still show a cached thumbnail for it. Note how this also breaks the copy of the symlink on the other volume, as the path used by both of them no longer exists.

Finally, move the folder of links to be alongside that containing the original document, in the ~/Documents folder, and note how the symlink is broken there too.

Of the three types of link, the only one that proves robust to all these changes is the alias.
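
Part of that demonstration can be scripted. This Python sketch covers only the renaming step, and only the symlink and hard link, as the standard library has no support for Finder aliases. Folder and file names are arbitrary choices for illustration, and it works in a temporary folder rather than ~/Documents.

```python
import os
import tempfile


def rename_demo():
    """Rename the enclosing folder and report which links still resolve."""
    with tempfile.TemporaryDirectory() as root:
        project = os.path.join(root, "Project")
        links = os.path.join(project, "links")
        os.makedirs(links)
        document = os.path.join(project, "test.txt")
        with open(document, "w") as f:
            f.write("test")

        # A symlink using a full absolute path, and a hard link.
        os.symlink(document, os.path.join(links, "symlink"))
        os.link(document, os.path.join(links, "hardlink"))

        # Rename the folder enclosing both the document and the links folder.
        renamed = os.path.join(root, "Renamed")
        os.rename(project, renamed)
        new_links = os.path.join(renamed, "links")

        symlink_ok = os.path.exists(os.path.join(new_links, "symlink"))
        hardlink_ok = os.path.exists(os.path.join(new_links, "hardlink"))
        return symlink_ok, hardlink_ok
```

After the rename, the symlink no longer resolves because its stored path has gone, while the hard link is unaffected as it doesn't depend on any path.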

Dual-booting your Mac with multiple versions of APFS

By: hoakley
22 August 2024 at 14:30

Since its first public release seven years ago in High Sierra, APFS has changed greatly. Connect a bootable external disk with Sonoma installed to a Mac running macOS 10.13 and it won’t know what to make of it. That’s because many of the features used by APFS today didn’t exist until Catalina and Big Sur. This article explains how your Mac can cope with running such different versions of APFS, and how to avoid the problems that can arise when a Mac can start up in two or more different versions of macOS.

Fences

Disk structures, APFS and macOS fall into three main phases:

  • High Sierra and Mojave, which can run 32-bit code, boot from a single integrated volume, and don’t understand Signed System Volumes (SSVs); they run APFS versions 748.x.x to 945.x.x.
  • Catalina (macOS 10.15), which can’t run any 32-bit code, and boots from a group of volumes where the system is contained in its own read-only volume; it runs APFS version 1412.x.x.
  • Big Sur and later, which can’t run any 32-bit code, and boot from an SSV that’s a specially sealed snapshot on the System volume; they run APFS version 1677.x.x and later.

Fuller and finer details of the changes with different versions of macOS are given in the Appendix at the end.

If you need your Mac to run more than one version of macOS, it’s easiest and most compatible if the versions installed come from only one of those phases. Two phases are more of a challenge, and all three need great care to ensure that one version doesn’t damage the file systems of another.

When you do need to run macOS from two or more phases, the cleanest and safest way is to only mount disks and volumes from one phase at a time. For example, when running Mojave, if you unmount disks and volumes for all later versions of macOS, then you won’t see any notifications warning you of “Incompatible Disk. This disk uses features that are not supported on this version of macOS.” Unfortunately, unless all your boot systems are on external disks, this isn’t easy to achieve.

Warnings

Generally speaking, APFS tools including those run by Disk Utility (including fsck_apfs, used by First Aid) are backward-compatible, but may not be as reliable when a tool from an older version of macOS is run on a file system from a newer version.

First Aid and fsck_apfs normally report versions of APFS tools used on the volumes and containers they’re checking. These draw attention to any potential for incompatibilities, such as:
The volume [volume name] was formatted by diskmanagementd (1412.141.1) and last modified by apfs_kext (945.250.134).
telling you that volume was created by macOS 10.15, and last changed by macOS 10.14.

Other messages you might see include less helpful warnings like:
warning: container has been mounted by APFS version 2236.141.1, which is newer than 1412.141.3.7.2

If you’re going to run multiple versions of macOS on the same Mac, you’ll have to get used to those. For reference, APFS versions decode to macOS versions as:

  • 249.x.x is the beta release from macOS 10.12 Sierra
  • 748.x.x is 10.13 High Sierra
  • 945.x.x is 10.14 Mojave
  • 1412.x.x is 10.15 Catalina
  • 1677.x.x is 11 Big Sur
  • 1933.x.x is 12 Monterey up to 12.2
  • 1934.x.x is 12.3 Monterey and later
  • 2142.x.x is 13 Ventura
  • 2235.x.x is 14 Sonoma up to 14.3
  • 2236.x.x is 14.4 Sonoma and later
  • 2311.x.x and later is macOS 15 Sequoia beta.
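
As a quick way to apply that list, here is a minimal Python sketch that looks up the macOS release for an APFS version string taken from a First Aid or fsck_apfs message. The table is copied from the list above; the function name macos_for_apfs is my own, and treating every major version from 2311 upwards as Sequoia is an assumption based on the last entry.

```python
# APFS major version -> macOS release, copied from the list above.
APFS_TO_MACOS = {
    249: "10.12 Sierra (beta APFS)",
    748: "10.13 High Sierra",
    945: "10.14 Mojave",
    1412: "10.15 Catalina",
    1677: "11 Big Sur",
    1933: "12 Monterey (up to 12.2)",
    1934: "12.3 Monterey and later",
    2142: "13 Ventura",
    2235: "14 Sonoma (up to 14.3)",
    2236: "14.4 Sonoma and later",
}


def macos_for_apfs(version: str) -> str:
    """Return the macOS release for a version string such as '945.250.134'."""
    # Only the major number (before the first dot) identifies the release.
    major = int(version.split(".")[0])
    if major >= 2311:
        # Assumption: anything from 2311 upwards belongs to Sequoia or later.
        return "15 Sequoia (beta) or later"
    return APFS_TO_MACOS.get(major, "unknown")


if __name__ == "__main__":
    # The two versions quoted in the example First Aid message.
    for v in ("1412.141.1", "945.250.134"):
        print(v, "->", macos_for_apfs(v))
```

Minor and patch numbers are deliberately ignored here, as only the major number maps cleanly to a release.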

Dangers

To minimise the risk of any problems arising, any manipulation of the file system should be performed by Disk Utility, the diskutil command tool or others, from the same or a later version of macOS. Thus, if you have Mojave and Big Sur installed, you could use tools from either to maintain Mojave file systems, but should only use those from Big Sur when maintaining Big Sur’s file systems.

This becomes more difficult when file system maintenance needs to be performed in Recovery mode. Once the Mac has started up in Recovery, check the version of macOS before opening Disk Utility, using fsck_apfs, diskutil or anything similar.

This is one situation where versions of macOS with an SSV can become tricky, because of their different associations with Recovery systems. On Apple silicon Macs, Big Sur doesn’t use a paired Recovery volume, whereas Monterey and later do. To be confident that the Mac starts up in the correct version of Recovery, it should have been running that version normally before you start it up in Recovery. Once in Recovery, check the version of macOS before proceeding to work with its file systems.

Appendix: APFS and macOS version details

APFS major version numbers change with each major version of macOS:

  • macOS 10.12 has APFS version 0.3 or 249.x.x, which shouldn’t be used at all.
  • 10.13 has 748.x.x, which doesn’t support Fusion Drives, but has basic support for volume roles.
  • 10.14 has 945.x.x, the first version to support Fusion Drives.
  • 10.15 has 1412.x.x, the first version to support the multi-volume boot group, and introduces extended support for volume roles, including Data, Backup and Prelogin.
  • 11 has 1677.x.x, the first version to support the SSV, and Apple silicon. On M1 Macs, it doesn’t support the paired Recovery volume, though.
  • 12 has 1933.x.x until 12.2, thereafter 1934.x.x, which support the paired Recovery volume on Apple silicon.
  • 13 has 2142.x.x, and is probably the first to support trimming of UDRW disk images and their storage as sparse files.
  • 14 has 2235.x.x, until 14.3, thereafter 2236.x.x.

Minor version numbers increment according to the minor version of macOS, and patch numbers wander without pattern.

What’s the time Mr. APFS? With a new version of Precize

By: hoakley
12 August 2024 at 14:30

It’s the middle of August, the time when, if you’re not an Apple engineer or third-party developer, you’re most likely to be on holiday. To help you enjoy that, today I’m going to engage you in a little game I call What’s the time, Mr. APFS?

To play along with me, all you need is a copy of the latest version of Precize, version 1.15, available from here: precize115, from Downloads above, from its Product Page, or via its auto-update mechanism.

Precize update

Thanks to some questions from applice, who should really be spending more time watching the Olympics than worrying about file systems, this new version of Precize comes with significant enhancements in its datestamp reporting. First, it now gives datestamps with decimal seconds, for those who want to be truly precise, and it adds two new time fields, for Attribute Modification, and Access, as shown in the screenshot below.

[Screenshot: precize92]

Once you’ve downloaded the Zip archive, unzip it where it is, and move the Precize app from its folder into one of your Applications folders (that ensures it doesn’t undergo app translocation). Keep the Zip archive it came in, as we’ll use that in the game, whose rules are simple: all you have to do is roughly guess the times in each of Precize’s datestamp fields, for different files and bundles.

What are the times?

APFS stores four datestamps for each file and directory in its volumes, and the four fields shown in this version of Precize correspond to those four datestamps. There is a fifth that the Finder can display, the date and time that an item was added to its current location. That’s not derived from the file’s attributes, but from directories, and isn’t shown in Precize.

The four datestamps are:

  • Created, taken from the create_time value, the time that this item was created, given as the clock time corresponding to an unsigned 64-bit integer giving the number of nanoseconds since January 1, 1970 at 00:00 UTC, disregarding leap seconds.
  • Modified, taken from the mod_time value, the time that this item was last modified, likewise.
  • Attr Mod, taken from the change_time value, the time that the item’s attributes were last modified, likewise. Those attributes include permissions, for example, and changes made to extended attributes should also update this value.
  • Accessed, taken from the access_time value, the time that this item was last accessed, likewise.
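
If you’d like to see the same four values without Precize, the following Python sketch reads them through os.stat and converts nanoseconds since 1970 into clock times with decimal seconds. It isn’t how Precize works internally; the function names are my own, and the mapping of st_birthtime, st_mtime_ns, st_ctime_ns and st_atime_ns onto Created, Modified, Attr Mod and Accessed is the usual one on macOS. The creation time is only available where the platform provides st_birthtime, and Python returns it as a float, so it may lose some precision.

```python
import os
from datetime import datetime, timezone


def ns_to_utc(ns: int) -> str:
    """Convert nanoseconds since 1970-01-01 00:00 UTC to a string with decimal seconds."""
    seconds, nanos = divmod(ns, 1_000_000_000)
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{stamp:%Y-%m-%d %H:%M:%S}.{nanos:09d} UTC"


def datestamps(path: str) -> dict:
    """Return the datestamps of an item, named as in the list above."""
    st = os.stat(path)
    result = {
        "Modified": ns_to_utc(st.st_mtime_ns),
        "Attr Mod": ns_to_utc(st.st_ctime_ns),
        "Accessed": ns_to_utc(st.st_atime_ns),
    }
    # st_birthtime exists on macOS but not on every platform, and is a float.
    birth = getattr(st, "st_birthtime", None)
    if birth is not None:
        result["Created"] = ns_to_utc(int(birth * 1_000_000_000))
    return result


if __name__ == "__main__":
    for name, value in datestamps(".").items():
        print(f"{name:9} {value}")
```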

By default, APFS is relaxed over how it records the last of these. Some other filesystems are strict, and change the access_time value every time a file is read. Instead, APFS only updates this if its value is prior to (earlier than) the mod_time value. You can change this behaviour by setting an APFS volume’s APFS_FEATURE_STRICTATIME flag, for instance when mounting it by using a command like
mount -u -o strictatime

However, that’s not the default for APFS volumes. For example, inspect your Mac’s Data volume using the command
mount | grep '/System/Volumes/Data '
and you should see something like
/dev/disk6s1 on /System/Volumes/Data (apfs, local, journaled, nobrowse, root data)
without any mention of the strictatime flag. List all mounted volumes using
mount
and you’re unlikely to see any using strictatime.
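
The difference between the two behaviours can be modelled in a few lines of Python. This is only a sketch of the rule as described above, not APFS code, and the function name and parameters are hypothetical.

```python
def should_update_atime(access_time_ns: int, mod_time_ns: int, strict: bool = False) -> bool:
    """Model whether reading a file would update its access_time.

    strict=True stands for a volume mounted with strictatime, where every
    read updates the value. Otherwise the relaxed default applies: update
    only if access_time is earlier than mod_time.
    """
    if strict:
        return True
    return access_time_ns < mod_time_ns


if __name__ == "__main__":
    # Last accessed before the most recent modification: updated on next read.
    print(should_update_atime(access_time_ns=100, mod_time_ns=200))
    # Last accessed after the most recent modification: left alone by default.
    print(should_update_atime(access_time_ns=300, mod_time_ns=200))
    # With strictatime, every read updates it.
    print(should_update_atime(access_time_ns=300, mod_time_ns=200, strict=True))
```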

File datestamps

Open the new version of Precize, so it’s poised waiting for you to drop a file onto its icon in the Dock. Then take a screenshot, and drop that file onto Precize’s icon. What do you expect the four datestamps to be?

This is an easy one for starters: the four times should be very close, and almost match the time given in the screenshot’s name. The first two should be Created and Modified, the latter perhaps a thousandth of a second behind, reflecting the time it takes from creation of the screenshot file to completing the save. Next comes Accessed, when macOS opened the file to perform its animation of the screenshot sliding away. Last of all should be Attr Mod, recording when all its attributes such as permissions were set in its destination folder.

Now copy that file to another volume. What do you expect those four datestamps to be there?

Created and Modified should remain the same, as those are preserved when the file is moved around in local file systems. But Attr Mod and Accessed should record the time that file arrived on the second volume, reflecting its new attributes and its access for the copy.

Now try a couple of harder questions. Drop the Zip archive that contained Precize onto the running app’s icon. What do you expect those four datestamps to be?

All four datestamps should record times close to when you downloaded that Zip file from here. Just as with the screenshot, Created leads, closely followed by Modified, with Accessed and Attr Mod trailing. Note that all four are very recent, subsequent to its download to your Mac.

Unzip that Precize Zip archive again and drop the Precize app within it onto the running app’s icon. What do you expect those four datestamps to be?

Now the Created and Modified values are taken from that app when it left my Mac here, but Attr Mod should be more recent, and Accessed is likely to be older unless you have accessed the app bundle since then.

What’s the time, Mr. APFS?

All the best games involve learning. I hope these have shown you how those four datestamps are set and changed. In summary:

  • Software like the Finder, when copying between local volumes, also copies most of a file’s attributes, including Created and Modified datestamps for consistency, even though it creates a new file in the destination. Attr Mod and Accessed are linked to the copy event, though.
  • Software like browsers, when copying from a remote server to a local volume, doesn’t preserve the file’s attributes, so all four datestamps correspond to the local arrival and creation of the download.
  • Archives such as Zip files preserve most of a file’s attributes, and reconstitute its Created and Modified values to those saved when the archive was created.
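
The contrast between copies that preserve datestamps and those that don’t can be reproduced with Python’s standard library, which isn’t what the Finder or a browser uses, but shows the same two behaviours. In this sketch shutil.copy writes the data only, so the copy gets a fresh Modified time, while shutil.copy2 also copies metadata where it can. The file names and the old datestamp are arbitrary, and how the creation time is handled depends on the platform, so only Modified is compared.

```python
import os
import shutil
import tempfile


def compare_copies() -> tuple:
    """Return the Modified times (in ns) of an original, a plain copy and a preserving copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "original.txt")
        with open(src, "w") as f:
            f.write("datestamp test\n")

        # Backdate the original's Accessed and Modified times to September 2001.
        old = 1_000_000_000 * 1_000_000_000  # nanoseconds since 1970
        os.utime(src, ns=(old, old))

        # Plain copy: data only, so Modified is the time of the copy.
        plain = shutil.copy(src, os.path.join(tmp, "plain.txt"))
        # Preserving copy: data plus metadata, so Modified matches the original.
        kept = shutil.copy2(src, os.path.join(tmp, "preserved.txt"))

        return (
            os.stat(src).st_mtime_ns,
            os.stat(plain).st_mtime_ns,
            os.stat(kept).st_mtime_ns,
        )


if __name__ == "__main__":
    original, plain, preserved = compare_copies()
    print("original :", original)
    print("plain    :", plain)
    print("preserved:", preserved)
```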

As a final exercise, using Precize, determine whether copying files by AirDrop between Macs behaves conservatively like copying between local volumes, or resets datestamps as downloading from the internet does. Why do you think that Apple designed it to work that way?

Have fun!

Copy speeds of large and sparse files

By: hoakley
7 August 2024 at 14:30

I have recently seen reports of very low speeds when copying large files such as virtual machines, in some cases extending to more than a day, even when they should have been sparse files, so requiring less time than would be expected for their full size. This article teases out some tests and checks that you can use to investigate such unexpectedly poor performance.

Expected performance

Time taken to copy or duplicate files varies greatly in APFS. Copies and duplicates made within the same volume should, when performed correctly, be cloned, so should happen in the twinkling of an eye, and without any penalties for size. This is regardless of whether the original is a sparse file, or a reasonably sized bundle or folder, whose contents should normally be cloned too. If cloning doesn’t occur, then the method used to copy or duplicate should be suspected. Apple explains how this is accomplished using the Foundation API of FileManager, using a copyItem() method. This is also expected behaviour for the Finder’s Duplicate command.

Copying a file to a different volume, whether it’s in the same container, or even on a different disk, should proceed as expected, according to the full size of the file, unless the original is a sparse file and both source and destination use the APFS file system. When an appropriate method is used to perform the copy between APFS volumes, sparse file format should be preserved. This results in distinctive behaviour in the Finder: at first, its progress dialog reflects the full (non-sparse) size of the file to be copied, and the bar proceeds at the speed expected for that size. When the bar reaches a point equivalent to the actual (sparse) size of the file being copied, it suddenly shoots to 100% completion.

Copying a sparse file to a file system other than APFS will always result in it expanding to its full, non-sparse size, and the whole of that size will then be transferred during copying. There is no option to explode to full size on the destination, nor to convert format on the fly.
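
To check whether a copy has kept its sparse format, you can compare a file’s logical size with the space actually allocated to it. This Python sketch does that using os.stat, on the assumption that st_blocks counts 512-byte units, as it does on macOS and Linux. The function names are my own, and the test is only a heuristic: a compressed file can also occupy less space than its logical size, so tools such as Precize and Sparsity remain more reliable.

```python
import os


def allocated_bytes(st: os.stat_result) -> int:
    """Space allocated on disk, assuming st_blocks is in 512-byte units."""
    return st.st_blocks * 512


def looks_sparse(logical_size: int, allocated: int) -> bool:
    """Heuristic: less space allocated than the logical size suggests a sparse file."""
    return allocated < logical_size


def report(path: str) -> str:
    """Summarise the logical and allocated sizes of a file."""
    st = os.stat(path)
    allocated = allocated_bytes(st)
    verdict = "possibly sparse" if looks_sparse(st.st_size, allocated) else "not sparse"
    return f"{path}: logical {st.st_size} bytes, allocated {allocated} bytes, {verdict}"


if __name__ == "__main__":
    import sys
    for p in sys.argv[1:]:
        print(report(p))
```

Run against the original and its copy, a copy that has exploded to full size will show allocated space close to its logical size.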

External SSDs

When copying very large files, external disk performance can depart substantially from that measured using relatively small transfer sizes. While some SSDs will achieve close to their benchmark write speed, others will slow greatly. Factors that can determine that include:

  • a full SLC cache,
  • failure to Trim,
  • small write caches/buffers in the SSD,
  • thermal throttling.

Many SSDs are designed to use fast single-level cell (SLC) write caching to deliver impressive benchmarks and perform well in everyday use. When very large files are written to them, they can exceed the capacity of the SLC cache, and write speed then collapses to less than a quarter of that seen in their benchmark performance. The only solution is to use a different SSD with a larger SLC cache.
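
A rough calculation shows why this matters for very large files. The Python sketch below assumes a simple two-speed model: data is written at the benchmark speed until the SLC cache is full, then at the slower speed for the remainder. All the figures are hypothetical, and the model ignores the cache emptying during the copy, so treat it as an illustration rather than a prediction for any particular SSD.

```python
def copy_seconds(file_gb: float, cache_gb: float, fast_gb_s: float, slow_gb_s: float) -> float:
    """Estimate the time to write a file when speed drops once the SLC cache is full."""
    fast_part = min(file_gb, cache_gb)       # written at benchmark speed
    slow_part = max(file_gb - cache_gb, 0)   # written after the cache fills
    return fast_part / fast_gb_s + slow_part / slow_gb_s


if __name__ == "__main__":
    # Hypothetical SSD: 50 GB SLC cache, 2.0 GB/s in cache, 0.5 GB/s beyond it.
    for size in (40, 500):
        seconds = copy_seconds(size, cache_gb=50, fast_gb_s=2.0, slow_gb_s=0.5)
        print(f"{size} GB file: about {seconds:.0f} s")
```

With those figures, a 500 GB file takes far longer than its benchmark speed would suggest, as nine-tenths of it is written after the cache has filled.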

Trimming is also an insidious problem, as macOS by default will only Trim HFS+ and APFS volumes when they’re mounted if the disk they’re stored on has an NVMe interface, and won’t Trim volumes on SSDs with a SATA interface. The trimforce command may be able to force Trimming on SATA disks, although that isn’t clear, and its man page is forbidding.

Trimming ensures that storage blocks no longer required by the file system are reported to the SSD firmware so they can be marked as unused, erased and returned for use. If Trimming isn’t performed by APFS at the time of mounting, those storage blocks are normally reclaimed during housekeeping performed by the SSD firmware, but that may be delayed or unreliable. If those blocks aren’t released, write speed will fall noticeably, and in the worst case blocks will need to be erased during writing.

For best performance, SATA SSDs should be avoided, and NVMe used instead. NVMe is standard for USB 3.2 Gen 2 10 Gb/s, USB4 and Thunderbolt SSDs, which should all Trim correctly by default.

Disk images

Since Monterey, disk images with internal APFS (or HFS+) file systems have benefitted from an ingenious combination of Trimming and sparse file format when stored on APFS volumes. This can result in great savings in disk space used by disk images, provided that they’re handled as sparse files throughout.

When a standard read-write (UDIF) disk image is first created, it occupies its full size in storage. When that disk image is mounted, APFS performs its usual Trim, which in the case of a disk image gathers all free space into contiguous storage. The disk image is then written out to storage in sparse file format, which normally requires far less than its full non-sparse size.

This behaviour can save GB of disk space in virtual machines, but like other sparse files, is dependent on the file remaining on APFS file systems, otherwise it will explode to its full non-sparse size. Any app that attempts to copy the disk image will also need to use the correct calls to preserve its format and avoid explosion.

Tools

Precize reports whether a file is a clone, is sparse, and provides other useful information including full sizes and inode numbers.

Sparsity can create test sparse files of any size, and can scan for them in folders.
Mints can inspect APFS log entries to verify Trimming on mounting, as detailed here.
Stibium is my own storage performance benchmark tool that is far more flexible than others. Performing its Gold Standard test is detailed here and in its Help book.

Snapshots aren’t backups

By: hoakley
10 July 2024 at 14:30

Most backup utilities now make snapshots of volumes they’re backing up, and Time Machine goes further by using snapshots in its backup process, and creating backups as snapshots. How then should we include snapshots in our backup plans? Could we rely on snapshots rather than conventional backups?

APFS snapshots

Each snapshot contains a complete set of the file system metadata for that volume at the time the snapshot was made, and all the extents required to reinstate that volume. Although tied to that volume, the snapshot is stored alongside the current volume metadata, in the same APFS container, in the same disk partition.

Extents list the storage blocks containing the data that composes every file that existed within that volume at the time the snapshot was made. Many of those will be the same as in the current volume, but the remainder will refer to data that has been deleted since the snapshot was made, but is being retained to enable the volume to be rolled back to its previous state. Those old extents can only be removed when that snapshot is deleted, which thus frees all those storage blocks at the same time. This is illustrated in the diagram.

[Diagram: snapobject]

This shows the same file in a snapshot and the current volume. Extents for the data of the earlier version of that file contained in the snapshot are shown at the top, and consist of blocks EA, EB, EC and ED. After that snapshot was made, the file was edited and then consists of the blocks shown at the bottom, EA, FB, FC and ED.

Thus, the extents listed for that file in the snapshot consist of two blocks, EA and ED, that are currently in use and included in its extents now, and two blocks, EB and EC, that were deleted after the snapshot was made. However, as those are referenced in the snapshot’s extents, those storage blocks are retained to enable the snapshot to restore the volume’s previous state. When that snapshot is deleted, blocks EB and EC will then be returned to the pool of free blocks for erasure and re-use.
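
The same bookkeeping can be expressed as set arithmetic. This Python sketch uses the block names from the example above, and covers only a single file and a single snapshot; the function name is my own, and real APFS has to account for references from every snapshot of the volume.

```python
# Blocks referenced by the file's extents in the snapshot and in the current volume.
snapshot_blocks = {"EA", "EB", "EC", "ED"}
current_blocks = {"EA", "FB", "FC", "ED"}


def blocks_freed_on_delete(snapshot: set, current: set) -> set:
    """Blocks retained only for the snapshot, so released when it's deleted."""
    return snapshot - current


# Blocks used by both the snapshot and the current version of the file.
shared = snapshot_blocks & current_blocks

if __name__ == "__main__":
    print("shared:", sorted(shared))
    print("freed when snapshot is deleted:",
          sorted(blocks_freed_on_delete(snapshot_blocks, current_blocks)))
```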

This demonstrates how snapshots depend on their own file system metadata, and a combination of currently used storage blocks and those awaiting return for re-use.

Rolling back to a snapshot

The most popular reason for wanting to roll back to a snapshot is to recover from problems following a macOS update. As the System volume isn’t included among backup snapshots, and the process of rolling back would be so complex that it’s probably not possible, there’s no point in considering the use of snapshots for that purpose.

In other circumstances, where part or all of a user volume is to be restored from a snapshot, once that snapshot has been mounted, the procedure is the same as for restoring from a backup. What is different is that restoring a whole volume from a snapshot is a one-way trip, and there is no undo. This is because snapshots subsequent to that used to restore from will be removed, and you won’t then be able to ‘roll forward’ to a later snapshot. That contrasts with a normal backup, where items remain available from any other backup that is retained in the backup store.

Snapshot robustness

Long before the introduction of APFS, Time Machine created its own form of snapshots when a Mac was away from its normal backup store. Those enabled anyone travelling without an external drive to have limited ability to restore lost or damaged files despite their lack of conventional backups. Experience showed this to be useful in the event of human error, such as accidental deletion of files, but of limited or no use when there were file system errors.

Because snapshots share the same container as the current volume, and share many file extents with them, they are prone to common errors. In particular, common file extents make it more likely that faults occurring in extents and data storage will affect them both. This is particularly important as one of the most common file system errors that corrupts data in files occurs when extents for two separate files overlap. A snapshot is thus more vulnerable than a backup on a different disk, or even one in a different container on the same physical store.

Snapshots are whole-volume

Snapshots do have one specific advantage over backups when it comes to their coverage. As they include the whole file system metadata for the volume, no items present in that volume are excluded from its snapshots. If you want to restore an item that has been excluded from backups made of any volume, you can therefore do that from its latest snapshot, if that item was present in the volume at the time that was made.

The only disadvantage to this is that snapshots can be disproportionately large compared to volume backups. If large and ever-changing files such as Virtual Machines are excluded from backing up, but are in a volume for which snapshots are made, those snapshots may appear excessively large. The solution is to move those files to a separate volume, for which snapshots aren’t made. This behaviour can also raise privacy concerns over data that aren’t encrypted on disk, as they too will be accessible in that volume’s snapshots even when excluded from its backups.

Conclusions

  • Because snapshots contain metadata and data common to the current version of a volume, and share the same container, they are prone to common errors.
  • Restoring a snapshot can’t be undone, and removes later snapshots.
  • Snapshots are a useful addition to conventional backups made to separate storage, but can’t replace them.
  • When conventional backups aren’t available, snapshots can be invaluable.
  • Snapshots can be used to restore items that were excluded from backups.
  • Snapshots can’t be used to roll back from a bad macOS update.

Reference

APFS snapshots in detail
