More updates for Tahoe: Aliases (Alifix), special files (Sparsity), file types (UTIutility) and language (Nalaprop)

By: hoakley
3 July 2025 at 14:30

This week I have another group of four little utilities whose windows have been overhauled, and have new app icons to meet the requirements of macOS Tahoe. Each of these new versions requires macOS Big Sur or later.

Finder aliases

If you have old Finder aliases that need to be checked and repaired, Alifix will do that job for you. Use it to scan a folder containing those aliases, and it will warn you which can’t be resolved any longer, and can rewrite those that need to be updated.

Alifix version 1.4 is now available from here: alifix14
and from its Product Page. As it seldom needs updating, it doesn’t use the auto-update mechanism.

APFS sparse and clone files

As you can tell by its name, Sparsity started off as a means of creating APFS sparse files for test purposes. In addition to that, it has a valuable scanning feature that will detect and report details of all sparse, clone and purgeable files in a selected volume or folder. Information reported includes both the nominal and actual size of each file, so you can see which sparse files are saving the most space on disk.

Sparsity version 1.4 is now available from here: sparsity14
and from its Product Page. It too doesn’t use auto-update.

UTI file types

Give UTIutility a filename extension and it will tell you its Uniform Type Identifier (UTI, also UTType), traditional Mac OSType, MIME type, Pasteboard type, and a list of UTIs it conforms to. You can also find the same information from those other properties. This too has a crawler that will search through a volume or folder and compile a list of all the UTIs it encounters there. Its Help book contains an extensive reference to UTIs to help you get the most out of them.

UTIutility version 1.4 is now available from here: utiutil14
and from its Product Page. It doesn’t use auto-update.

Natural language

For many years, macOS has had built-in features to handle and parse natural languages including French, Spanish and German. Nalaprop uses these features to analyse text files, or text pasted into the left view in its main window. That text can then be parsed by downloadable linguistics modules supplied by Apple, and each word displayed in colour according to that word’s part of speech or grammatical type. From that it can automatically construct dictionaries or concordances of words used in that text, arranged by part of speech, and giving word frequency for each.

Nalaprop comes with a multilingual demonstration file to show how well it copes with language transitions.

Here it has parsed and coloured the text in the middle according to part of speech, for two languages, English and French. To the right of those is the dictionary it has compiled, ending verbs and starting the list of nouns. At the far right is a colour key for parts of speech.

In this demonstration, Charles Dickens’ novel David Copperfield has been parsed, a total of nearly 360,000 words. Currently such large documents are analysed in the main thread, so you’re likely to see a spinning beachball during parsing, but can still switch freely to other apps when that’s taking place. Those with Apple silicon Macs will see that analysis is performed in a single thread running on one P core, so all the other P cores remain free to run other tasks. I was hoping to use different threads for this, but it proved too complicated to incorporate in this particular version.

Nalaprop version 1.4 is now available from here: nalaprop14
from its Product Page, and via its auto-update mechanism.

Enjoy!

Here are the 21 icons for those of my apps so far ported to be compatible with Tahoe.

You don’t have to collect all in the series, though.

How keys are used in FileVault and encryption

By: hoakley
25 June 2025 at 14:30

We rely on FileVault and APFS to protect our secrets by encrypting the volumes containing our documents and data. How they do that is a mystery to many, and raises important questions such as the role our passwords play, and how recovery keys work. This article attempts to demystify them.

Naïve encryption

A simple scheme to encrypt a disk or volume might be to take the user password, somehow turn it into a key suitable for the encryption method to be used, and employ that to encrypt and decrypt the data as it’s transferred between disk storage and memory.

There are lots of weaknesses and difficulties with that. Even using a ‘robust’ user password, it’s not going to be memorable, sufficiently long or hard to crack, and there’s no scope for recovery if that password is lost or forgotten.

FileVault base encryption

In Macs with T2 or Apple silicon chips when FileVault is disabled, everything in the Data volume stored on their internal SSD is still encrypted, but without any user password. This is performed in the Secure Enclave, which both handles the keys and performs the encryption/decryption. That ensures the keys used never leave the Secure Enclave, so are as well-protected as possible.

Generating the key used to encrypt the volume, the Volume Encryption Key or VEK, requires two huge numbers, a hardware key unique to that Mac, and the xART key generated by the Secure Enclave as a random number. The former ties the encryption to that Mac, and the latter ensures that an intruder can’t repeat generation of the same VEK even if it does know the hardware key. When you use Erase All Content and Settings (EACAS), the VEK is securely erased, rendering the encrypted data inaccessible, and there’s no means to either recover or recreate it.

This scheme lets the Mac automatically unlock decryption, but doesn’t put that in the control of the user, who therefore needs to enable FileVault to get full protection.

FileVault full encryption

Rather than trying to incorporate a user password or other key into the VEK, like many other encryption systems FileVault does this by encrypting the VEK using a Key Encryption Key or KEK, a process known as wrapping.

filevaultpasswords1

When you enter your FileVault password, that’s passed to the Secure Enclave, where it’s combined with the hardware key to generate the KEK, and that’s then used together with hardware and xART keys to decrypt or unwrap the VEK used for decryption/encryption.

This has several important benefits. As the KEK can be changed without producing a new VEK, the user password can be changed without the contents of the protected volume having to be fully decrypted and encrypted again. It’s also possible to generate multiple KEKs to support the use of recovery keys that can be used to unlock the VEK when the user’s password is lost or forgotten. Institutional keys can be created to unlock multiple KEKs and VEKs where an organisation might need access to protected storage in multiple Macs.
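
The first of those benefits can be sketched with the openssl command tool. This is only a toy illustration under loud assumptions: it derives the KEK from a password alone using PBKDF2, and uses AES-CBC for both wrapping and data encryption, neither of which is claimed to match what the Secure Enclave does. The file names and the wrap and unwrap helpers are hypothetical.

```shell
#!/bin/sh
# Toy key-wrapping sketch using openssl; NOT Apple's implementation.
set -eu
command -v openssl >/dev/null 2>&1 || { echo "openssl is needed for this sketch"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Random 256-bit VEK and 128-bit IV, as hex strings.
vek=$(openssl rand -hex 32)
iv=$(openssl rand -hex 16)

# Encrypt the 'volume' once, using the VEK directly.
printf 'my documents' | openssl enc -aes-256-cbc -K "$vek" -iv "$iv" -out "$work/volume.enc"

# wrap KEY PASSWORD: store KEY encrypted under a KEK derived from PASSWORD.
wrap() {
    printf '%s' "$1" | openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$2" -out "$work/vek.wrapped"
}

# unwrap PASSWORD: print the VEK recovered using PASSWORD.
unwrap() {
    openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$1" -in "$work/vek.wrapped"
}

wrap "$vek" 'old password'
before=$(cksum < "$work/volume.enc")

# Change the password: unwrap with the old one, wrap again with the new one.
wrap "$(unwrap 'old password')" 'new password'
after=$(cksum < "$work/volume.enc")

# The encrypted volume is untouched, yet the new password unlocks it.
plain=$(openssl enc -d -aes-256-cbc -K "$(unwrap 'new password')" -iv "$iv" -in "$work/volume.enc")
echo "volume unchanged: $([ "$before" = "$after" ] && echo yes || echo no); contents: $plain"
```

Only the small wrapped key file is rewritten when the password changes. Wrapping the same VEK a second time under a different secret, in a second file, would be the equivalent of adding a recovery key.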

APFS encryption

True FileVault requires all keys to be stored in the Secure Enclave, and never released outside it. Intel Macs without T2 chips, and other protected volumes such as those on external storage can’t use that, and in the case of removable storage need an alternative that stays on the disk. For that, APFS uses the AES Key Wrap Specification in RFC 3394, using a secret such as a password to maintain confidentiality of every key.

APFS also uses separate VEKs and KEKs, so enabling the use of multiple KEKs for a single VEK, and the potential to change a KEK without having to decrypt and re-encrypt the whole volume, as in FileVault. In APFS, VEKs and KEKs are stored in and accessed from Keybags associated with both containers and volumes. The Container Keybag contains wrapped VEKs for each encrypted volume within that container, together with the location of each encrypted volume’s keybag. The Volume Keybag contains one or more wrapped KEKs for that volume, and an optional passphrase hint. These are shown in the diagram below.

apfsencryption1

Apple’s documentation refers to several secrets that can be used to wrap a KEK, including a user password, an individual recovery key, an institutional recovery key, and an unspecified mechanism implemented through iCloud. Currently, for normal software encryption in APFS, only two of those appear accessible: a user password is supported in both Disk Utility and diskutil’s apfs verb, while diskutil also supports use of an institutional recovery key through its -recoverykeychain options. Individual and iCloud recovery keys only appear available when using FileVault, in this case implemented in software, either on Intel Macs without a T2 chip, or on all Macs when encrypting an external volume.

Because keybags are stored on the disk containing the encrypted volume, if the disk is connected to another Mac, when macOS tries to mount that volume, the user will be prompted to enter its password, and can then gain access to its contents. When FileVault is used to protect a Data volume on the internal SSD of a T2 or Apple silicon Mac, that volume can only be unlocked through the Secure Enclave of that Mac, and it isn’t possible to unlock it from another Mac (that’s also true when FileVault hasn’t been enabled on that volume).

Boot disk structure in macOS, iOS and iPadOS, and AI cryptexes

By: hoakley
20 June 2025 at 14:30

Volume structure of internal startup disks has grown increasingly complex during the transition from Intel to Apple silicon Macs. There also seems to be little information on iOS and iPadOS to compare against. This article briefly reviews structures of macOS 15 Sequoia on Apple silicon, iOS 18 and iPadOS 18.

Information for macOS is derived from the diskutil command tool, and from APFS entries in the log when booting a Mac mini M4 Pro in macOS 15.5. That for iOS is drawn from APFS entries in the log when booting an iPhone 15 Pro in iOS 18.5. That for iPadOS is drawn from APFS entries in the log when booting an iPad Pro 11-inch (4th generation)(Wi-Fi) in iPadOS 18.5. All three had Apple Intelligence enabled prior to booting. iOS and iPadOS logs were obtained from sysdiagnoses, and all logs were read using LogUI.

macOS 15 (Apple silicon)

The boot volume group consists of six volumes in a single container (partition). Two other containers are normally hidden from the user:

  • the first container of around 524 MB is reserved for preboot and secure boot support.
  • another container of about 5.4 GB is used for fallback recovery frOS, and in Big Sur was the primary recovery system, until the introduction of paired recovery volumes in macOS 12 Monterey.

The boot volume group contains:

  • System, left unmounted after booting from its Signed System Volume (SSV) snapshot;
  • Data, the only encrypted volume in the group, with numerous cryptexes grafted into it, and firmlinked to the SSV at multiple points;
  • paired, primary Recovery, containing a disk image of the Recovery system;
  • VM, the backing store for virtual memory;
  • Preboot, for early stages in the secure boot process, with cryptexes grafted into it;
  • Update, used as a working volume for macOS updates.

There are two groups of cryptexes grafted onto those volumes:

  • system cryptexes, including the large SystemCryptex or os.dmg of about 4.3 GB mainly containing dyld caches, and the smaller AppCryptex or app.dmg containing Safari and supporting components;
  • PFK volumes containing support components for Apple Intelligence features. These are numerous, and some are listed in the Appendix at the end.

If you’re wondering what a PFK volume might be, so am I. But this is what Google’s AI had to say: “Mac PFK” likely refers to a combination of MAC knives and Practical Fishkeeping (PFK) magazine. MAC knives are known for their high-quality, sharp blades and are popular among chefs and home cooks. Practical Fishkeeping is a magazine focused on fishkeeping, covering various aspects of the hobby.

These are summarised, without the help of Practical Fishkeeping, in this diagram.

iPadOS 18

In contrast to macOS since Big Sur, iPadOS and iOS only appear to have two containers (partitions) on their internal storage. The first is presumed to be similar in purpose to that in macOS, in supporting preboot and secure boot, although there is an xART volume in the boot volume group. In iPadOS, this container is smaller, at around 367 MB.

The boot volume group contains a slightly different range of volumes:

  • there is no Recovery volume;
  • there is no VM volume, as iPadOS doesn’t ordinarily support swapping/paging, although M-series models can in certain circumstances;
  • User, a second encrypted volume, appears unique to iPadOS;
  • xART and Hardware volumes are additional.

Cryptexes appear similar, with both system cryptexes and PFK volumes.

These are summarised below.

iOS 18

This is similar to iPadOS, with a first container/partition of around 351 MB, and the following differences in the boot volume group:

  • there is no User volume, and no Update volume;
  • Baseband Data is additional.

Cryptexes appear similar, with both system cryptexes and PFK volumes.

These are summarised below.

Conclusions

  • Volume structure of internal startup disks differs considerably between macOS, iPadOS and iOS.
  • As would be expected, iPadOS and iOS are most similar, but even they have substantial differences.
  • They each run their systems from a Signed System Volume firmlinked to an encrypted Data volume.
  • They each graft on two sets of cryptexes, one supplementing the system with dyld caches and Safari, the other providing components for AI.
  • There are now at least 24 cryptexes used to support AI.

I welcome corrections and explanations, please.

Appendix: Some PFK volumes from cryptexes

  • UC_FM_LANGUAGE_INSTRUCT_300M_BASE_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_300M_BAUC_FM_LANGUAGE_INSTRUCT_300M_BASE_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_DRAFTS_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_CONCISE_TONE_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MAIL_REPLY_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MAIL_REPLY_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_BASE_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_PROFESSIONAL_TONE_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_SUMMARIZATION_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_EVENT_EXTRACTION_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_PROOFREADING_REVIEW_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_HANDWRITING_SYNTHESIS_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_EVENT_EXTRACTION_MULTILINGUAL_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MESSAGES_REPLY_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_AUTONAMING_MESSAGES_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_URGENCY_CLASSIFICATION_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_FRIENDLY_TONE_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_HANDWRITING_SYNTHESIS_MULTILINGUAL_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_VISUAL_IMAGE_DIFFUSION_V1_BASE_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_PERSON_EXTRACTION_MULTILINGUAL_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • SECUREPKITRUSTSTOREASSETS_SECUREPKITRUSTSTORE_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_PERSON_EXTRACTION_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MAGIC_REWRITE_DRAFT_GENERIC_GENERIC_H14G_Cryptex.dmg
  • UC_IF_PLANNER_NLROUTER_BASE_EN_GENERIC_H14G_Cryptex.dmg

Source: iPadOS 18.5, configured with AI enabled for British English.

When and how should you run First Aid in Disk Utility?

By: hoakley
17 June 2025 at 14:30

If you used a Mac before 2017, you’ll be accustomed to running First Aid in Disk Utility as part of your routine housekeeping. Now all recent versions of macOS start up from a boot volume group in APFS, you might wonder whether that practice is still needed, and how it should be done.

Reliability of APFS

Like any file system, APFS can still acquire errors, but it’s designed to be as reliable as possible, particularly when used on SSDs.

Its file system uses Fletcher 64 checksums to ensure its integrity, although it doesn’t use checksums or hashes to check the integrity of file data. When each partition (container) is mounted, APFS finds the superblock with the most recent transaction identifier, and checks that it and its contents are valid. Each volume then has a quick check run by fsck_apfs (called by Disk Utility to perform First Aid) before it’s mounted. APFS also performs a Trim on each volume during the mount process.

APFS was designed to be the file system for Apple’s devices, including iPhones, where self-maintenance and reliability are essential, and users can’t perform routine maintenance on the file system.

Boot volume group

All Macs now start up from a boot volume group, consisting of the Signed System Volume (SSV), and a conjoined Data volume. Since Big Sur, the SSV is a snapshot of the system that’s sealed using a tree of hashes to verify the integrity of its entire contents. As a snapshot, it’s read-only, and its hierarchy of hashes is checked progressively as its contents are accessed. Any error found in its hashes will normally result in macOS being reinstalled and a fresh snapshot made.

Although Disk Utility will attempt to check the Data volume while it’s still ‘live’, you should avoid that whenever possible. If you want to run First Aid or fsck_apfs on the Data volume, then it’s best to do so in Recovery mode, so the volume can be properly unmounted for checking. Safe mode is no substitute, though, and doesn’t perform any more thorough checks than normal user mode.

Time Machine backups

All Time Machine backups are made as read-only snapshots. Local snapshots are also made at the same time to each volume being backed up, and those too are read-only. Local snapshots are routinely deleted when they’re 24 hours old, although you can delete them manually before then. As Time Machine uses the most recent local snapshot when making the next of its backups, avoid deleting that if possible.

Indications for First Aid

The most reliable indicator of a problem with a file system is an alert reporting a file system error. Severe errors may be responsible for a kernel panic, although that’s now unusual unless a disk is in the process of failing altogether.

Running First Aid is no longer recommended as a routine practice in the way that it was when using HFS+. macOS updates are performed with the Data volume unmounted for much of the time. They update the files on the System volume (which is unmounted during normal running), make a snapshot of it, and build the hash tree to verify its contents. When that’s completed, the top-level hash is hashed again to produce the SSV’s signature, which is then compared with that set for the specific version of macOS. If there is any discrepancy, macOS has to be reinstalled and the process of creating the SSV repeated.

Warnings or errors?

Both First Aid and fsck_apfs may report warnings as well as errors, which are distinguished by the term given in the report. Warnings are almost invariably benign, and not a call to take any action. For example
warning: inode (id 1234567): Resource Fork xattr is missing or empty for compressed file
doesn’t normally require any repair, as it’s a warning of a condition that may be normal.
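
When a First Aid or fsck_apfs report has been saved as text, the two kinds of line can be tallied with a short filter. This is a sketch that assumes each such line begins with warning: or error:, as in the example above; the tally_report function name is hypothetical.

```shell
#!/bin/sh
# tally_report: count warning and error lines in a report read from stdin.
# Assumes lines start with 'warning:' or 'error:' (optionally indented).
tally_report() {
    awk '
        /^[ \t]*warning:/ { warnings++ }
        /^[ \t]*error:/   { errors++ }
        END { printf "warnings=%d errors=%d\n", warnings, errors }
    '
}

printf '%s\n' \
    'warning: inode (id 1234567): Resource Fork xattr is missing or empty for compressed file' \
    | tally_report
# prints: warnings=1 errors=0
```

A report with warnings but no errors is one that doesn’t normally call for any action.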

First Aid may refer to performing “deferred repairs”; those appear to be repairs to errors that APFS has already decided it needs to make, but didn’t at the time. Sometimes they and other errors require “full space verification”, normally performed when you run First Aid on that container rather than the volume. In that case, when you check and repair the container, any error should be fixed.

Sequence

Apple recommends that check and repair using First Aid or fsck_apfs is performed first on each volume within a container (partition), then on the container itself, and finally on the disk. This sequence may differ from that used on HFS+ and other file systems.

Run First Aid on the Data volume

If you have evidence suggesting that there’s an error affecting your Mac’s Data volume, start it up in Recovery mode, select Options, click Continue underneath it then select Disk Utility in the window and click Continue.

Set its View tool to Show All Devices, then select the Data or Macintosh HD – Data volume in the boot group you want to check, and click First Aid.

diskutil05

diskutil06

Once that’s complete, select the container above that Data volume, and click First Aid.

diskutil07

diskutil08

Finally select the disk above that container, and click First Aid.

diskutil09

When you need to run First Aid on an external disk, you should be able to follow the same sequence when running in normal mode, rather than in Recovery. If First Aid complains of an error because one or more volumes are mounted, select the volume in the list at the left and click the Unmount tool. Once that volume has unmounted, select it again, and try First Aid. If that also fails for the same reason, select the disk itself, click on Unmount and then First Aid again. If you don’t have any joy, start your Mac up in Recovery and work from there instead.

Summary

  • Run First Aid in Disk Utility, or fsck_apfs, when there has been an error or incident that makes you suspect there may be an error in the file system.
  • Check and repair volumes in your current boot group in Recovery.
  • Set View to Show All Devices so you can see containers.
  • Run First Aid on volumes first, then containers, then disks last of all.
  • If any fail because they’re mounted, select the volume or disk, unmount it, and try again.
  • Warnings aren’t errors, and are likely to be normal.

How to tell the difference between copies, clone files, and hard links

By: hoakley
13 May 2025 at 14:30

APFS has several ways of creating copies or links to files that can be confused. These are:

  • conventional copy, to create a completely separate file
  • clone file, a separate file that has common data with the original
  • symbolic link or symlink, that’s just a path pointing to the original
  • hard link, that’s really exactly the same file in disguise
  • Finder alias, a more complex bookmark pointing to the original.

Symlinks and Finder aliases are easy to distinguish, as their icons have an arrow superimposed, and Get Info tells you they’re an Alias. While symlinks take almost no space at all, Finder aliases take a bit more. But at first sight, copies, clones and hard links all look identical. This article explores how you can tell them apart without resorting to Terminal’s command line.

First, a warning of a longstanding problem in the Finder: it can’t tell them apart, and can’t account correctly for the space they take on disk. To see what I mean, create a folder and a chunky file inside it, which I’ll call MyBigFile.tiff. In Terminal, create four hard links to it, numbered 2-5, using commands like
ln MyBigFile.tiff MyBigFile2.tiff
Then for good measure, clone MyBigFile.tiff twice by duplicating it in the Finder to create MyBigFileclone.tiff and MyBigFileclone2.tiff.

Select all seven files, and press Control-Command-I or Option-Command-I to Get Info on multiple items, and you’ll see the Finder thinks each of those seven files takes the same space on disk, in my case totalling 70 GB, even though we know that there’s only 10 GB of data stored between them. This has been a persistent shortcoming in the Finder since long before the introduction of APFS, and applies to both clone files and hard links.
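
The same discrepancy can be seen in miniature from the command line. In this sketch, which uses a 1 MB file in a temporary folder rather than the 10 GB example, adding up the size of every name gives five times the data, while du, which counts a file with several hard links only once, reports roughly the true figure. The folder and variable names are hypothetical, and clone files are left out because the command to create them differs between systems.

```shell
#!/bin/sh
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir"

# A 1 MB file of random data, plus four hard links numbered 2-5.
dd if=/dev/urandom of=MyBigFile.tiff bs=1024 count=1024 2>/dev/null
for n in 2 3 4 5; do
    ln MyBigFile.tiff "MyBigFile$n.tiff"
done

# Naive total: add up the size of every name in the folder.
naive=0
for f in MyBigFile*.tiff; do
    naive=$((naive + $(wc -c < "$f")))
done

# du counts a file with several hard links only once.
actual_kb=$(du -sk . | awk '{print $1}')

echo "naive total: $((naive / 1024)) KB, du reports: $actual_kb KB"
```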

Conventional copy

fileobject1

When we make a conventional copy of a file, a new inode is created for it, and each of the items that make up that file are copied, including the data stored in its file extents. This requires the same amount of additional disk space as used by the original file, as there’s nothing in common between the two files.

Clone file

fileobject3

Instead of duplicating everything, only the inode and its attributes (blue and pink) are duplicated to create a clone file, together with their file extent information. You can verify this by inspecting the numbers of those inodes, as they’re different, and information in the attributes such as the file’s name will also be different. There’s a flag in the file’s attributes to indicate that cloning has taken place.

Hard link

In hard links, exactly the same file is accessed through two different file paths. Although other file systems may handle this differently, according to Apple’s reference to APFS, this is how it handles hard links.

fileobject6

When you create a hard link to a file (blue), APFS creates two siblings (purple) with their own IDs and links, including different paths and names as appropriate. Those don’t replace the original inode, and there remains a single file object for the whole of that hardlinked file. Inode attributes keep a count of the number of links they have to siblings in their link (or reference) count. Normally, when a file has no hard links that’s one, and there are no sibling files. When a file is to be deleted, if its link count is only 1, the file and all its associated components can be removed, subject to the requirements of any clones and applicable snapshots. If the link count is greater than 1, then only the sibling being removed is deleted.
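
Those link counts can be observed directly. A minimal sketch, using hypothetical file names in a temporary folder: ls -i gives the inode number and the second column of ls -l gives the link count, so hard links should share one inode number and show a count that falls as each is removed.

```shell
#!/bin/sh
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir"

inode_of() { ls -i "$1" | awk '{print $1}'; }   # inode number
links_of() { ls -l "$1" | awk '{print $2}'; }   # link (reference) count

echo 'some data' > original.txt
echo "no hard links:     inode $(inode_of original.txt), link count $(links_of original.txt)"

ln original.txt sibling.txt
echo "after ln:          inode $(inode_of sibling.txt), link count $(links_of sibling.txt)"

# Removing one name only removes that sibling; the data stays reachable.
rm original.txt
echo "after rm original: link count $(links_of sibling.txt), contents: $(cat sibling.txt)"
```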

Using Precize

As the Finder can’t tell us which are hard links and which are clone files, we can resort to a utility like my free Precize. Drop the file onto its app icon, and these are what you should see.

This is the original file, which has now got four hard links and has two clones as well. If you drop any of those hard links onto Precize, you’ll see they’re the same file, with the same inode number given at the top in the volfs path and FileRefURL, in this case 8513451. Look at the bottom, and their Ref count is given as 5, because all five are hardlinked together to the same file. Because we’ve also cloned this file, the Clone checkbox at the bottom is ticked.

This is one of the two clone files made from that original. Because this is a different file that just happens to point to the same data, it has a different inode number in the volfs path and FileRefURL. Its Clone checkbox is ticked, as it is a clone, but it only has a single Ref count, as none of the hard links point to this clone file.

The same goes for the second clone, with its own inode number, ticked as a Clone, and single Ref count.

Are they identical?

The final question you might ask is whether files are identical. In the case of hard links, the answer is simple: as they’re the same file in disguise, yes, they are absolutely identical.

Clones require a bit more work, as they will continue to be shown as clones even though their contents may be quite different by then. The best answer is to compute the SHA-256 hash of the file’s data, and compare that between two clones. If you’re interested in any of their metadata contained in their extended attributes, then you’ll need to check those as well.

How disk images can become sparse files

By: hoakley
24 April 2025 at 14:30

It’s no miracle that a 10 GB disk image can be shrunk down to a few MB of sparse file. This article explains how APFS works that magic, first on a normal read-write disk image, then in the disk image inside a Virtual Machine.

What is a sparse file?

In essence a sparse file is any file whose allocated storage is smaller than the nominal size of that file. In APFS, this imposes two requirements:

  • the INODE_IS_SPARSE flag is set for that file’s inode,
  • the sparse byte count is given in its extended field.

As a result, the total size of storage allocated to that file’s data in its file extents is smaller than the total required to store the file at its nominal size. This is because the file contains empty data that isn’t stored on disk, saving disk space. This becomes clearer when we consider how this works with regular read-write disk images.

How a disk image becomes sparse

To demonstrate how this works, create a read-write disk image, which APFS will then turn into a sparse file. For the sake of simplicity, I’ll ignore all overheads such as the file system in that disk image.

APFS uses 4 KB storage blocks on SSDs. Creating a 4 GB disk image using DropDMG or Disk Utility therefore uses one million blocks. For this example I number those starting from 0000 0001 in hexadecimal, rising to 000F 4240 at the end of that disk image file, a million blocks later.

Once that has been created, copy a 4 MB file into the disk then unmount it, and mount it again. When it’s mounted that second time, APFS Trims it, and marks all its storage blocks apart from those 4 MB as being unused. That leaves my file occupying blocks 0000 0001 to 0000 03E8, and 0000 03E9 to 000F 4240 unallocated. APFS therefore sets the disk image file’s INODE_IS_SPARSE flag to TRUE and writes the sparse byte count to its extended field: the disk image is now a sparse file.
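
The block numbers in this example follow from simple arithmetic, sketched here in shell. To match the round figures used in the text it treats a 4 KB block as 4,000 bytes, which is a simplification, as real APFS blocks are 4,096 bytes; blocks_for is a hypothetical helper.

```shell
#!/bin/sh
# blocks_for BYTES: number of blocks needed, rounding up, treating a 4 KB block
# as 4000 bytes to match the round figures in the text (not the real 4096).
blocks_for() {
    echo $(( ($1 + 3999) / 4000 ))
}

image=$(blocks_for 4000000000)   # 4 GB disk image
file=$(blocks_for 4000000)       # 4 MB file copied into it

printf 'image: %d blocks, last block 0x%08X\n' "$image" "$image"
printf 'file:  %d blocks, last block 0x%08X\n' "$file" "$file"
printf 'free:  %d blocks, from 0x%08X to 0x%08X\n' "$((image - file))" "$((file + 1))" "$image"
```

This gives 0x000F4240 for the last block of the disk image and 0x000003E8 for the last block of the file, leaving 999,000 blocks from 0x000003E9 onwards unallocated.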

Creating a VM disk image

Unlike that read-write disk image, a disk image used for a Virtual Machine (on Apple silicon, at least) is created as a sparse file in the first instance, using code like
let diskFd = open(diskImagePath, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)
var result = ftruncate(diskFd, sizeDisk)
result = close(diskFd)

(error handling omitted) where sizeDisk is the size in bytes. Similar can be achieved in Terminal using the command
dd if=/dev/zero of=Disk.img bs=1m count=0 seek=10240
where the number given for seek is the size in blocks of the size set by bs, here 1 MB, making a 10 GB disk image.
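
Whether the result really is sparse can be checked by comparing its nominal size with the space allocated to it. A minimal sketch using a 10 MB file in a temporary folder: the variable names are hypothetical, bs is given in bytes so that it doesn’t depend on which size suffixes a particular dd accepts, and on a file system that doesn’t support sparse files the two figures would be much the same.

```shell
#!/bin/sh
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
img="$dir/Disk.img"

# Seek 10 blocks of 1048576 bytes into the output, writing no data at all.
dd if=/dev/zero of="$img" bs=1048576 count=0 seek=10 2>/dev/null

nominal=$(wc -c < "$img" | tr -d ' ')             # nominal size, in bytes
allocated_kb=$(du -k "$img" | awk '{print $1}')   # space taken on disk, in KB

echo "nominal size: $nominal bytes, allocated: $allocated_kb KB"
```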

Maintaining the VM disk image

A read-write disk image is mounted and Trimmed by APFS on the host Mac. That used for a VM is different, as it’s the guest OS that has the task of Trimming the disk image from inside, and that works just the same as when macOS is booted on a Mac.

Read the entries made in your Mac’s log by APFS during startup. Those appear early with the start of APFS, when the version is given:
33.263 apfs_module_start:3403: load: com.apple.filesystems.apfs, v2332.101.1, apfs-2332.101.1, 2025/04/11

A little later, APFS Space Manager (Spaceman) Trims the first partition/container with log entries like:
34.012 spaceman_scan_free_blocks:4106: disk1 scan took 0.002064 s, trims took 0.000443 s
34.012 spaceman_scan_free_blocks:4110: disk1 101104 blocks free in 47 extents, avg 2151.14
34.012 spaceman_scan_free_blocks:4119: disk1 101104 blocks trimmed in 47 extents (9 us/trim, 106094 trims/s)
34.012 spaceman_scan_free_blocks:4122: disk1 trim distribution 1:14 2+:18 4+:8 16+:2 64+:1 256+:4

A couple of seconds later it trims a second partition:
36.391 spaceman_scan_free_blocks:4106: disk3 scan took 1.749635 s, trims took 1.147491 s
36.391 spaceman_scan_free_blocks:4110: disk3 351308484 blocks free in 319729 extents, avg 1098.76
36.391 spaceman_scan_free_blocks:4119: disk3 351308484 blocks trimmed in 319729 extents (3 us/trim, 278633 trims/s)
36.391 spaceman_scan_free_blocks:4122: disk3 trim distribution 1:118376 2+:48105 4+:82602 16+:42698 64+:24673 256+:3275
36.391 spaceman_scan_free_blocks:4130: disk3 trims dropped: 10469 blocks 10469 extents, avg 1.00

The following matching entries are taken from a macOS VM as it boots on an Apple silicon Mac:
02.557 apfs_module_start:3403: load: com.apple.filesystems.apfs, v2332.101.1, apfs-2332.101.1, 2025/04/11

03.278 spaceman_scan_free_blocks:4106: disk2 scan took 0.001036 s, trims took 0.000770 s
03.278 spaceman_scan_free_blocks:4110: disk2 126731 blocks free in 15 extents, avg 8448.73
03.278 spaceman_scan_free_blocks:4119: disk2 126731 blocks trimmed in 15 extents (51 us/trim, 19480 trims/s)
03.278 spaceman_scan_free_blocks:4122: disk2 trim distribution 1:4 2+:4 4+:5 16+:0 64+:0 256+:2

03.570 spaceman_scan_free_blocks:4106: disk4 scan took 0.295283 s, trims took 0.285527 s
03.570 spaceman_scan_free_blocks:4110: disk4 19188027 blocks free in 9939 extents, avg 1930.57
03.570 spaceman_scan_free_blocks:4119: disk4 19188027 blocks trimmed in 9939 extents (28 us/trim, 34809 trims/s)
03.570 spaceman_scan_free_blocks:4122: disk4 trim distribution 1:6010 2+:1072 4+:1775 16+:700 64+:244 256+:138
03.570 spaceman_scan_free_blocks:4130: disk4 trims dropped: 4252 blocks 4252 extents, avg 1.00
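
The amount of storage covered by those Trims can be estimated from the ‘blocks trimmed’ entries. A sketch in shell, assuming 4,096-byte blocks and that the entries have been saved as plain text in the format shown above; trimmed_bytes is a hypothetical helper.

```shell
#!/bin/sh
# trimmed_bytes: read log entries on stdin and print "<disk> <bytes trimmed>"
# for each 'blocks trimmed' line, assuming 4096-byte blocks.
trimmed_bytes() {
    sed -n 's/.* \(disk[0-9][0-9]*\) \([0-9][0-9]*\) blocks trimmed in .*/\1 \2/p' |
    while read -r disk blocks; do
        echo "$disk $((blocks * 4096))"
    done
}

trimmed_bytes <<'EOF'
34.012 spaceman_scan_free_blocks:4110: disk1 101104 blocks free in 47 extents, avg 2151.14
34.012 spaceman_scan_free_blocks:4119: disk1 101104 blocks trimmed in 47 extents (9 us/trim, 106094 trims/s)
03.570 spaceman_scan_free_blocks:4119: disk4 19188027 blocks trimmed in 9939 extents (28 us/trim, 34809 trims/s)
EOF
```

On those assumptions, the disk1 entry from the host comes to about 0.41 GB, and the disk4 entry from the VM to about 78.6 GB.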

Just as the Trims performed on the host free up unused blocks of storage on the boot disk, so those in the VM do the same for the VM disk image. To demonstrate how those maintain the VM disk image in sparse format, I wrote two files inside the VM when it was running. One was a plain 10 GB file taking 10 GB on disk, the other a 10 GB sparse file taking a few MB. I then closed the VM and measured its size on disk, opened it again, deleted those two files and closed it again. During this I also took screenshots to verify changes recorded by Disk Utility in the free space inside the VM.

Before writing the two test files, the VM’s disk image size on the host was 107 GB, and it took 23.98 GB on disk as a sparse file. When it contained the two test files, its size remained the same, and it took 34.01 GB on disk. After deleting the files inside the VM, the disk image’s size remained the same, but it only took 24.01 GB on disk, and internally the VM reported that it had returned to 78.9 GB of storage available, the same as it had started with.

As expected, when the VM Trimmed it freed up storage space no longer used by the deleted files, as a result of which the VM disk image required less space on disk.

How the magic works

  • Read-write disk images are created as normal files. They’re Trimmed by APFS on each subsequent mount, and may then become sparse files when there’s sufficient unused space in them.
  • VM disk images are created as sparse files. They’re Trimmed by APFS in the VM during each boot and on demand, maintaining their sparse format when they have sufficient unused space.
