Disk Images: Bands, Compaction and Space Efficiency

By: hoakley
21 October 2024 at 14:30

Of the two most dependable types of disk image, read-write (UDRW) disk images are the simpler. The only options you can set for one are its internal file system, whether the container is encrypted, and its maximum capacity. From then on, macOS automatically Trims the disk image when it’s mounted, and adjusts the size it takes on disk accordingly.

Sparse bundles can be more complex to configure in the first place, and may need routine maintenance to ensure their size on disk doesn’t grow steadily with use. This article considers how significant those complications are.

Band size

Most if not all storage divides data into blocks; in the case of SSDs, the default block size in APFS is 4,096 bytes, and that’s effectively the unit of storage in the single file of a read-write disk image.

Its equivalent in a sparse bundle is the maximum size that each of its band files can reach, normally set to 8.4 MB in macOS Sequoia and its predecessors, and equivalent to slightly more than 2,000 SSD blocks. Smaller band sizes are more efficient in the space required to store the contents of a sparse bundle, but require more band files for the same quantity of data stored. Experience with older versions of macOS suggests that it’s not a good idea for the total number of band files to exceed 100,000, but I’m not aware of any recent evidence to support that.
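As a rough sketch of the arithmetic behind those figures, the following assumes that the 8.4 MB default corresponds to 16,384 sectors of 512 bytes (8,388,608 bytes); that sector count is an assumption, not a figure given above, and the function names are purely illustrative.

```python
BLOCK_SIZE = 4096              # default APFS block size on SSDs, in bytes
SECTOR_SIZE = 512              # assumed unit in which band sizes are expressed
DEFAULT_BAND_SECTORS = 16384   # assumed default: 16384 x 512 = 8,388,608 bytes

def band_bytes(sectors=DEFAULT_BAND_SECTORS):
    """Maximum size of one band file, in bytes."""
    return sectors * SECTOR_SIZE

def blocks_per_band(sectors=DEFAULT_BAND_SECTORS):
    """Number of 4,096-byte blocks that fit in one full band."""
    return band_bytes(sectors) // BLOCK_SIZE

def max_bands(image_bytes, sectors=DEFAULT_BAND_SECTORS):
    """Upper limit on band files for an image of the given capacity."""
    return -(-image_bytes // band_bytes(sectors))  # integer ceiling division

print(blocks_per_band())        # 2048
print(max_bands(125 * 10**9))   # 14902
print(max_bands(10**12))        # 119210
```

On those assumptions, a completely full 1 TB sparse bundle at the default band size would need more than 100,000 band files, while a 125 GB one stays well below that.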

When creating very large sparse bundles, of the order of terabytes, macOS may now adjust the band size to limit the number of band files. Some users report that those sparse bundles perform better as a result, for instance when being used to store backups on a network. There don’t appear to be any advantages to using band sizes less than 8.4 MB.

Neither DropDMG nor Disk Utility currently appears to allow any control over band size. Spundle and hdiutil not only let you specify a custom band size when creating a sparse bundle, but also allow you to change band size by copying an existing sparse bundle into a new one, which could be useful if you were changing the maximum size of a sparse bundle.
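For hdiutil, those two operations might look something like the commands assembled below. This is a sketch only: it assumes the `sparse-band-size` image key (measured in 512-byte sectors) and the `-tgtimagekey` option as described in the hdiutil man page, and the paths, sizes and function names are hypothetical. The functions only build argument lists and don't run anything, so check them against `man hdiutil` on your own system before use.

```python
import shlex

def create_sparsebundle_cmd(path, size, band_sectors, fs="APFS", volname="Untitled"):
    """Arguments to create a sparse bundle with a custom band size (assumed flags)."""
    return ["hdiutil", "create",
            "-size", size,
            "-type", "SPARSEBUNDLE",
            "-fs", fs,
            "-volname", volname,
            "-imagekey", f"sparse-band-size={band_sectors}",
            path]

def reband_copy_cmd(src, dst, band_sectors):
    """Arguments to copy an existing sparse bundle into a new one with a different band size."""
    return ["hdiutil", "convert", src,
            "-format", "UDSB",
            "-tgtimagekey", f"sparse-band-size={band_sectors}",
            "-o", dst]

# Hypothetical example: 32768 sectors x 512 bytes = 16,777,216 bytes per band
print(shlex.join(create_sparsebundle_cmd("test.sparsebundle", "125g", 32768)))
print(shlex.join(reband_copy_cmd("old.sparsebundle", "new.sparsebundle", 32768)))
```

To run one of these, the list could be passed to `subprocess.run(cmd, check=True)`.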

Compaction

Each time a writeable APFS or HFS+ file system is mounted in a disk image, including both sparse bundle and read-write types, storage blocks that are no longer in use should be returned for reuse by the process of Trimming. This is automatic, and in the case of single-file read-write disk images, is the only maintenance that occurs.

Because sparse bundles add another layer of band files over SSD storage, they also need periodic maintenance by compaction. To compact a sparse bundle, macOS scans the band files and removes those that are no longer being used by the file system in the sparse bundle. This only applies to sparse bundles with APFS or HFS+ file systems, though, and isn’t guaranteed to free up any space. One small catch is that, by default, compaction won’t take place on a notebook running on battery power; hdiutil and Spundle provide an option to override that.
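With hdiutil, compaction and the battery override could be invoked along these lines. The sketch assumes the `compact` verb and its `-batteryallowed` option as documented in the hdiutil man page; the bundle path and function name are hypothetical, and the image should be unmounted first.

```python
def compact_cmd(path, battery_allowed=False):
    """Arguments to compact a sparse bundle, optionally even when on battery power."""
    cmd = ["hdiutil", "compact", path]
    if battery_allowed:
        cmd.append("-batteryallowed")  # assumed flag overriding the battery check
    return cmd

print(compact_cmd("test.sparsebundle"))
print(compact_cmd("test.sparsebundle", battery_allowed=True))
```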

Space efficiency in use

I’ve been unable to find any objective comparison between the space efficiency of modern read-write disk images and sparse bundles, so have performed my own on a USB4 external SSD connected to my Mac Studio M1 Max. I created two 125 GB test images on a freshly-formatted APFS volume, one a sparse bundle with the default 8.4 MB band size, the other a read-write disk image (UDRW). In macOS Sequoia 15.0.1, the initial size they took on disk was 14.8 MB for the sparse bundle, and 336 MB for the disk image.

Using my utility Stibium, I then wrote and deleted a series of files in each, as follows:

  1. 160 files of 2 MB to 2 GB written totalling 53 GB
  2. 40 largest of those, sizes 600 MB to 2 GB, were then deleted, leaving 120
  3. another 160 files written totalling 53 GB
  4. 40 largest of those deleted, leaving a total of 240
  5. another 160 files written totalling 53 GB
  6. 40 largest of those deleted, leaving a total of 360
  7. another 160 files written totalling 53 GB
  8. 40 largest of those deleted, leaving a total of 480
  9. another 160 files written totalling 53 GB, leaving a total of 640 files.
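The running totals in that sequence can be checked with a short tally. This is only a bookkeeping sketch using the counts listed above; the function and parameter names are illustrative.

```python
def tally(rounds=5, written_per_round=160, deleted_per_round=40, gb_per_round=53):
    """Return (files written, files deleted, files remaining, GB written)."""
    written = deleted = 0
    for r in range(1, rounds + 1):
        written += written_per_round
        if r < rounds:                   # nothing is deleted after the final write
            deleted += deleted_per_round
        print(f"after round {r}: {written - deleted} files remain")
    return written, deleted, written - deleted, rounds * gb_per_round

print(tally())   # (800, 160, 640, 265)
```

That gives 800 files written in total, amounting to about 265 GB, of which 160 were deleted to leave 640.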

At that point, the sparse bundle had 104.08 GB stored in 12,408 band files, and the read-write disk image had 91.63 GB stored in its single file. I then deleted all files and folders in each of the images, and unmounted and remounted each image several times to ensure Trimming had completed. The sparse bundle then had 1.12 GB in 135 band files, and offered 124.66 GB of free space when mounted; the read-write disk image was slightly smaller at 937.9 MB on disk and offered exactly the same amount of free space when mounted.
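One way to read those figures is as an average size per band file. The sketch below simply divides the reported size on disk by the number of bands, treating GB as 10^9 bytes and ignoring the bundle's small non-band files; both are assumptions, and the results are only as precise as the rounded figures quoted.

```python
def mean_band_mb(total_gb, band_count):
    """Average band file size in MB, from total size in GB and number of bands."""
    return total_gb * 1e9 / band_count / 1e6

full = mean_band_mb(104.08, 12408)   # after all five rounds of writes
empty = mean_band_mb(1.12, 135)      # after deleting everything and Trimming

print(f"{full:.2f} MB per band when in use")
print(f"{empty:.2f} MB per band after emptying")
```

On these numbers the bands in use average close to the 8.4 MB maximum, and the 135 that remain after emptying are only slightly smaller on average.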

Compacting the empty sparse bundle was reported as reclaiming 52.3 MB, and reduced the disk space taken by the band files slightly to 1.06 GB while retaining the same number of bands, and still offering 124.66 GB of free space when mounted.

In this case, after writing 800 files totalling over 250 GB, and deleting a total of 160 of them, compaction made remarkably little difference to the free space returned once all files had been removed. There is also very little difference in the space efficiency of sparse bundles and modern read-write disk images.

One consistent difference observed throughout was in write speeds: they remained constant at 3.2 GB/s for the sparse bundle, and 1.0 GB/s for the read-write disk image, as reported here previously.

Conclusions

  • Sparse bundles are more complicated than read-write disk images (UDRW), with band size to be set, and compaction to be performed.
  • Default band size appears to work well, and manually setting band size should seldom be necessary.
  • Both types appear highly efficient in their use of disk space, with only small differences between them.
  • Although it might be important to compact sparse bundles in some cases, the amount of free space returned by compaction is unlikely to be significant in many circumstances.

Previous articles

Introduction
Tools
How read-write disk images have gone sparse
Performance