Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

How to search document versions

By: hoakley
25 May 2026 at 14:30

Document versioning built into macOS is an unfinished masterpiece that promises much but never seems to have been developed as fully as it deserved. This article looks at how macOS can’t search saved versions, and how you can work around that.

In essence versioning is simple: apps that support it, and a great many do now, save a series of versions to the volume’s hidden and sealed database in the .DocummentRevisions-V100 folder at its top level. To access those versions you’d normally use the Time-Machine-like browser provided by the Browse All Versions command in the Revert To item in the app’s File menu. Whenever the app saves a document, the open document becomes the current version, and its saved state becomes the previous version. This works for manual saves, and for any automatic timed saves the app might make.

Unlike all other versioning systems, this is all handled automatically by macOS, and neither you the user nor the app developer has to make any effort to create or manage those versions. It really does come for free.

Unfortunately, all those saved versions in the version database fall outside the scope of Spotlight indexing, and Spotlight search can’t look inside any of the old versions saved in a volume’s version database. Surprisingly, the version browser doesn’t offer any search facilities either, as that’s presumably another feature intended for a future that never came.

This is a serious omission, as I access old versions not infrequently, and being able to search for them saves me laborious browsing. It might be a few hours or days after I removed a section from a document, that I realise I need it back. By that time it may well have vanished from Time Machine’s hourly backups, or the section may have been too transient to be retained there. But the chances are that the missing content will be safe inside a saved version, if only I can find it.

Pulling tricks with the hidden .DocumentRevisions-V100 folder isn’t a good approach to solve this. It’s clear from its contents that previous versions aren’t saved as discrete files, but it uses a chunking system to store what has changed between versions, for economy in space. Access supported by macOS is strictly limited to looking up saved versions for any given file in that volume, and there’s no way to search their contents like that.

One way around this is to save each document version as a separate file, allow Spotlight to extract their contents and add those to its indexes for that volume, then to search those files. This is quick and simple using my free utility Versatility. To demonstrate this, I picked two documents with a substantial number of versions:

  • a Swift source code file edited in Xcode with 112 versions, with just one of them containing a function named loadAppexIndexer;
  • a large Pages document with a mere 49 versions, where I was looking for the first containing the placename Hulverstone.

In both cases I started by dropping the current document onto Veratility’s window, and saving individual archived versions to a new folder alongside that original document. I then opened that archive folder in a Finder window, and converted that to a Find window with that command in the Finder’s File menu. I entered my term, loadAppexIndexer or Hulverstone, into the search box, and changed the search scope from This Mac to the open archive folder.

In the Swift code, Spotlight immediately found the term in the file numbered 033 by Versatility, and all versions from the file numbered 023 in the case of the Pages document.

With that Finder window still open I was then able to locate those versions in the original documents:

  • Using my free Revisionist, the version numbers start from 1, whereas Versatility starts them at 000. So I added 1 to the number in the filename, and previewed that version in Revisionist. In both cases that’s sufficient to copy content that had gone missing from the current version of the document, for example.
  • Using the version browser in XCode or Pages, I looked back for that version’s datestamp, given in the Finder window as its Date Created, and brought that old version up in the browser.

Once happy I had done what I wanted with that old version, I then trashed the archive folder created by Versatility.

To summarise the sequence:

  1. drop the versioned document on Versatility’s window;
  2. save the archive folder alongside the original document;
  3. open a Finder Find window on that archive folder;
  4. using the search box and find bars, locate the version(s) in the archive folder;
  5. to open the versioned document in Revisionist, add 1 to its file version number;
  6. to open in an app version browser, select the date of that version as shown in the Finder window;
  7. on completion delete the archive folder.

Happy finding!

Last Week on My Mac: Intel Macs will be stuck with bugs

By: hoakley
24 May 2026 at 15:00

Just over six months ago a series of weird bugs came to light in Spotlight indexing. The first report was that plain text files beginning with the characters LG are never indexed, so their contents can never be found by Spotlight search. The mystery deepened when the same was discovered for text files beginning with the characters NPA or Draw. It was appropriately Drew who worked out the common factor behind this apparently bizarre connected behaviour: all three files are identified as not being text by the old Unix utility file(1), used to recognise file types by ‘sniffing’ their contents.

You can verify that by creating a plain text file with any of those three sets of characters at its start, then running the command file on that file. In the case of one beginning with Draw, file will identify it as RISC OS Draw file data, even though the file has an extension of txt or text and a UTI of public.plain-text. At that the RichText mdimporter, which analyses all text-based files for metadata to enter into Spotlight’s indexes, throws its hands up in horror and refuses to index the file’s contents. Change those opening characters in that file, perhaps by adding a leading space, and all of a sudden the mdimporter works as expected.

Following our collaborative effort here, particularly Drew’s insight, we realised this bug has been silently blocking the indexing of seemingly random text files for the last three years or more. What remained unanswered at the time was what that mdimporter was doing running file(1) on files whose UTI made it clear that they were in plain text, not some long-forgotten binary vector graphics format from 1989. I believe I now have an answer, thanks to my recent work on QuickLook’s qlgenerators.

QuickLook’s generators take advantage of the hierarchical structure of UTIs. Rather than accepting the most specific UTIs such as public.jpeg, Image.qlgenerator works with all files whose UTI conforms to the generic UTI of public.image, and then undertakes its own format detection. This enables it to generate correct thumbnails and previews of HEIC images that have been given the incorrect extension of jpg, for instance.

Similarly, a Swift source-code file with the extension of swift and the UTI of public.swift-source is handled by the Text.qlgenerator because public.swift-source conforms to public.plain-text, the UTI required for use of that generator.

What if Spotlight’s mdimporters were to work the same?

We know the built-in RichText.mdimporter is used to extract metadata for a wide range of files containing text, which all conform to the generic UTI of public.text. It then classifies them on the basis of their contents to work out what to index. What if that’s performed using file(1), so rejecting perfectly valid text files as ancient binary vector graphics files, and so on?

We can’t get the same direct evidence from the log that I obtained for QuickLook, as Spotlight is far less informative in its log entries. We can get clues from looking at output from mdimport and mdls, though. While a non-deviant text file contains a metadata attribute extracted by its importer as kMDItemTextContent containing the text in the file’s data, that’s missing from a text file starting with any of the three known triggers. In turn that’s associated with the attribute _kMDItemPrimaryTextEmbedding containing ‘vec_data’ listed by mdls, which is also missing for the deviant files.

There is hope that a third party might be able to undercut RichText.mdimporter by providing a bug-free importer for public.plain-text, but that relies on the built-in importer targeting public.text rather than public.plain-text. The best solution would be for Apple to fix the identification of text files instead of relying on file(1), which dates from 1973. Given that these deviant files work perfectly with QuickLook’s generator, it appears Apple has already solved this problem there. So I suspect this bug in RichText.mdimporter will never be fixed in Sequoia or Tahoe.

With the first beta-release of macOS 27 just a couple of weeks away, this leaves those using the last Intel Macs stuck with Spotlight indexing that will never work on some text files, assuming that at some point in the not too distant future this bug is finally fixed in an Arm-only macOS. This is all sadly familiar from the loss of 32-bit support in the transition from Mojave to Catalina, when little if any effort was devoted to making Mojave as free of bugs as possible before it was abandoned in the rush forward to 64-bit.

It would have been far better to be able to look back in fondness with macOS that worked better, than looking back in anger at what never got fixed.

One last thing to remember is that, when Apple does fix this bug, you’ll have to force Spotlight indexes to be rebuilt on each of your Mac’s volumes to ensure that the contents of these files are incorporated. We learned that last time there was a serious bug in the same importer, which failed to index the contents of RTF files.

Fun with UTIs, QuickLook and Spotlight

By: hoakley
23 May 2026 at 15:00

In case you haven’t got the message from the last few weeks looking at Spotlight and QuickLook, UTIs (Uniform Type Identifiers) are important, but not always as critical as they could be. To understand how macOS copes with misleading UTIs, I have a little demonstration you can try in the privacy of your own Mac.

All you need for this is an image with some Exif metadata. Those taken by an iPhone are particularly suitable, as they usually contain rich Exif information about which model took that image, focal length, aperture, exposure time and more. In my case, the image is in HEIC format. If you have my apps UTIutility and SpotTest, you can also explore UTIs more thoroughly, and inspect the metadata from images that gets indexed by Spotlight, but those are optional extras.

A file with the extension HEIC or heic is assumed to have the UTI public.heic, which conforms to public.heif-standard, and that in turn conforms to public.image, the parent of most image formats in macOS. The Help book for UTIutility shows these in a dense diagram.

Select that image in the Finder’s Column view to inspect its public metadata. While the image is selected, open Show Preview Options in the View menu and enable all the metadata listed there to be shown in previews.

You should then have a good preview pane with lots of metadata below it.

Next open a new Finder window and set it to Find. Using the search criterion popup menu, enable the Device model attribute, or another your image has metadata for, and search for that attribute, here iPhone XR, and you should see your image among the hits.

If you have SpotTest to hand, drop your image on its Drop Window. Being an image, it will crash mdimport, so the information you’ll see will be the metadata fields from Spotlight’s indexes, which should include the Device model as kMDItemAcquisitionModel.

So far, everything has worked as expected, but we’re now going to throw a spanner in the works by changing the extension on that image from HEIC to jpg, which changes the image’s UTI to public.jpeg, although that still conforms to public.image.

Its basic thumbnail icon now changes to a generic JPEG icon, so we’ve managed to confuse the basic thumbnailing scheme in QuickLook. But it’s still shown in the preview pane correctly, with all its metadata intact.

This is because that image has its larger thumbnails and previews generated by the qlgenerator for public.image, and that goes out of its way to parse the file data correctly, and recognise this is really a HEIC not a JPEG. If you’ve left the Finder Find window open, you’ll see that continues to find the image as if nothing had happened, as Spotlight also imports metadata using a common mdimporter for public.image, rather than relying on the more specific UTIs of public.jpeg or public.heic.

Finally, change the file’s extension to text, and you’ll see a preview of its text content, and it vanishes from the Find window too. That’s because text files are handled by their UTI of public.text, which includes public.rtf and others. Those don’t check the file data to ensure they’re not images, so the file is now being handled by the wrong qlgenerator and mdimporter, and won’t make any sense. As public.text formats don’t support Exif data, that isn’t extracted either, as you can see in SpotTest.

Change the extension back to heic, and you’ll see how quickly the right qlgenerator and mdimporter correct its thumbnails, previews, and search discovery, thanks to UTIs.

How to search Time Machine backups?

By: hoakley
22 May 2026 at 14:30

For the last 15 years or so, local Time Machine backup storage has been required to be included in the volumes that are indexed by Spotlight. We also know too well that they have been indexed, as it has been common for their indexing to take longer than the backup they have just made. Some time around the release of Sonoma, those indexing sessions became less noticeable, but unless you tried to search your backups, you probably didn’t notice any change. For, as far as I can tell, Spotlight doesn’t currently appear able to search Time Machine backups reliably, at least not in Sequoia or Tahoe, although this may not be universal.

For most purposes, the ability to search backups is essential. If you have a series of more than 100 backups over the last couple of years, finding a lost file by inspecting each backup individually is a frustrating waste of time, and requires you to know where to look in each. Even if full content and metadata searching aren’t feasible, the ability to search on file attributes such as name, extension and datestamps is surely fundamental.

As we’ve come to expect, Apple’s documentation isn’t in the least bit helpful. What is surprising is that the instructions given are almost identical for every version of macOS from Mojave to Tahoe.

That page opens with a bold promise: “If you use Time Machine to back up your Mac, you can use Spotlight to initiate a search of Time Machine to recover lost or deleted items.”

That’s just what I’m looking for, so how do I do that? “On your Mac, open a Finder window, then type a search word or phrase in the search field in the upper-right corner. Refine the results by specifying search criteria using the search bar.”

Everything’s good so far, but as the document I’m looking isn’t there, how do I search for it in my backups? “Click the Time Machine icon in the menu bar, then choose ‘Browse Time Machine backups’.”

That opens the Time Machine app, blows away my search criteria, and lists the volumes including Macintosh HD and my backup storage, as of now. So how do I search for the file that I accidentally deleted a couple of hours ago?

“Use the arrows and timeline to browse the Time Machine backups.” But that’s looking for the missing file, not searching those backups for it.

If I now step back through my backups to reach one that I know contains that file, I can restore it. But if I type anything into the search field, nothing is found. If I change the scope of the search to that backup, the window title changes but its contents remain blank, and there isn’t even a busy spinner to indicate a search is in progress.

With a little fiddling, I managed to get some results for other searches. Here’s an example where I was looking for files whose name contains logui with the extension swift.

Here I ended up with 102 hits, all of them old Fortran source files, none of which meets either of those criteria.

This time the two items found had appropriate names, but a completely different extension.

Undeterred, I left my Mac for over 24 hours, and tried again, only to discover the hourly backup containing my missing file had already been deleted. However, searching for files whose name contains logui with the extension swift proved just as futile. As I can’t disable Spotlight indexing on that volume without macOS telling me that it’s required to be indexed by Spotlight, neither can I force that volume to be reindexed.

There are third-party alternatives, including BackupLoupe and Find Any File (FAF). The former tellingly needs to create and maintain its own indexes, and FAF appears to work fairly reliably but takes an age to search each backup in turn.

In case this was a problem with one set of backups, I have now created a new backup set that suffers identical problems, and have reproduced this in both Sequoia and Tahoe, running on vastly different hardware. My conclusion is that using Spotlight to search Time Machine backups no longer works, and the instructions given by Apple are also broken. If you’ve managed to get reliable search working across your Time Machine backups without resorting to a third-party product, I’d be very grateful if you could explain how you did it.

Last Week on My Mac: Syncing metadata in iCloud Drive

By: hoakley
17 May 2026 at 15:00

How would you cope if you started your Mac up to find every datestamp on every file had been set to 1 January 1970 at 00:00? Or that every Finder Tag had vanished? What if all the Exif information for your images had been wiped out? Metadata like those aren’t optional extras we can afford to lose, although other metadata, like Finder Info and the URL from which a file was downloaded, are more ephemeral.

Xattr flags

Back in 2013, when iCloud was just two years old, Apple’s engineers implemented a scheme to determine how metadata stored as extended attributes (xattrs) should be copied, to take into account its permanence. This was important at that time to ensure only the right xattrs were preserved when files were stored in iCloud Drive. They did this by adding flags to the xattr’s name to indicate which actions should result in preservation of that type of xattr, and the metadata it contains.

Apple provides an extensive suite of xattr types to store metadata such as author names, keywords and comments, whose names all start with com.apple.metadata:. These are matched by corresponding metadata attribute keys, for example the xattr com.apple.metadata:kMDItemKeywords bears the metadata content accessed by the key kMDItemKeywords. Rather than appending flags to them all, this scheme laid down a table of default flags to be applied.

Being exemplary engineers, this was all detailed in the source code for copyfile, which is also open source. However, as this was just after Apple stopped creating much of its technical documentation, and little has ever been published about iCloud Drive, that’s as far as that information went.

For iCloud Drive, the relevant flag is S, XATTR_FLAG_SYNCABLE, defined as “This indicates that the extended attribute should be copied when the file is synced on services like iCloud Drive. Sync services may enforce additional restrictions on the acceptable size and number of extended attributes.”

Default flags set for xattr types com.apple.metadata:* (except for five of the less used types) are PS to preserve them during copy, save, sync and backup. You can read the macOS 26 version of those source files in Apple’s Open Source GitHub.

FileProvider

For the next decade or so, iCloud Drive mostly respected those behaviours, although without documentation it all appeared puzzling until you read the copyfile source. Then Apple decided to modernise it by implementing its FileProvider framework, a major architectural change for the better. Although the framework is intended to be common to all cloud storage services, there remains ample scope for iCloud Drive to offer enhancements.

Current FileProvider documentation for developers states:
“The extended file attributes are part of the item’s metadata. The system sets extended attributes on dataless files, and preserves them on files that it renders dataless. The system decides which attributes to sync. To sync an attribute, it calls the xattr_name_with_flags(_:_:) method and passes the XATTR_FLAG_SYNCABLE flag. Some older attributes are also synced.”

While FileProvider appears intended to conform to the xattr flag scheme, there’s no mention of the tables of default flags listed in the copyfile source code.

Implementation

Last week I set out to test and document how well iCloud Drive manages metadata, following the xattr flag system and copying up all those invaluable com.apple.metadata:* xattrs, only to discover it doesn’t any more. Of Apple’s many standard xattrs, the only ones I found that synced reliably over iCloud Drive in macOS 26.4.1 were:

  • com.apple.metadata:_kMDItemUserTags (Finder Tag)
  • com.apple.lastuseddate#PS
  • com.apple.quarantine
  • com.apple.TextEncoding

The good news is that iCloud Drive through FileProvider does respect the S flag, but only where it’s explicit, rather than being set in that table of defaults. For those of us who have our own xattr types, such as those used for integrity checking by Dintch, Fintch and cintch, can still rely on iCloud Drive to sync those, provided they bear that S flag.

The bad news is that none of the most popular metadata-bearing xattrs are synced, apart from Finder Tags. All those com.apple.metadata:kMDItem* xattrs are blocked, as FileProvider doesn’t recognise them as having their default PS flags.

Workarounds

I have since looked at two workarounds, explicitly adding the S flag to those types that should be accorded it by default, and changing the xattr names from com.apple.metadata:kMDItem* to com.apple.metadata:_kMDItem*.

Adding xattrs like com.apple.metadata:kMDItemKeywords#S with an explicit S flag does ensure FileProvider syncs those xattrs. However, as neither Spotlight indexing nor the Finder appear to understand that flag isn’t really part of the type name, those preserved xattrs are of no use: they aren’t recognised by Spotlight for indexing, nor are they displayed in the Finder’s Get Info dialog or Preview pane.

Inserting an underscore into the xattr’s name was even worse, as it didn’t lead to their preservation, and was ignored by Spotlight indexing and the Finder.

This raises another interesting issue, in that there doesn’t appear to be a way to extend Spotlight indexing to encompass additional xattr types. There are extensive third-party type libraries, such as org.openmetainfo:* from Tom Andersen of Ironic Software and OpenMeta. But those appear to require their own indexing, search and presentation support.

Error

When I first realised the impact on metadata in iCloud Drive, I assumed this arose because the engineers implementing FileProvider, or those porting iCloud Drive to use that framework, had’t been aware of the xattr flag system and its primary purpose. But if that had been true, they couldn’t have respected explicit use of the S flag. Thus, I’m left with two plausible explanations:

  • Apple’s FileProvider/iCloud Drive engineers are unaware of the system defaults table in copyfile, so assumed that com.apple.metadata:* xattrs weren’t intended to be preserved when syncing.
  • Apple decided to end default sync support for com.apple.metadata:* xattrs in iCloud Drive.

The second of those is even more erroneous than the first, as it reduces iCloud Drive support for metadata to a level common with third-party cloud providers. In this respect, users will see no difference from the behaviour of Dropbox or Microsoft OneDrive, which isn’t the marketing choice I’d have expected from Apple.

Conclusion

iCloud Drive no longer syncs much metadata stored in xattrs, in particular that stored in com.apple.metadata:* xattrs, apart from Finder Tags. There is no workaround, and unless Apple restores that feature, it limits the use of iCloud Drive with Macs.

Chinese whispers in PDF metadata

By: hoakley
15 May 2026 at 14:30

Chinese whispers is an old children’s game where everyone sits in a circle, and one child whispers into the ear of the next on their right a sentence like Send reinforcements, we’re going to advance. That child then whispers the message they heard to the child on their right, until it reaches the one who started it, who says out loud what they heard, classically Send three-and-fourpence, we’re going to a dance, as a demonstration of how messages can so easily become corrupted. What this has to do with China remains one of childhood’s mysteries. I should also explain that three-and-fourpence was idiomatic British English in the days before our currency was ‘decimalised’, and meant three shillings and four (old) pence, about 17 (new) pence, sufficient at one time to enjoy a good night out.

In this article I’m going to do much the same with metadata for a PDF document, tracing what gets indexed by Spotlight, so becoming discoverable by search, and what is displayed in the Finder. This relies on several of my utilities, most of which are available from this page.

Source PDF

I prepared a completely unrelated PDF using my favourite PDF editor, PDF Expert, by adding metadata to be saved in the file’s data. As you might expect, there are several ways that could be stored in the PDF format, including XMP metadata, but in this case for simplicity they were added in the document information dictionary.

I inspected that in a source view in Podofyllin, which found the following fields:
/Author (Author name in pdf)
/Creator (Pages)
/Keywords (keyword1 pdf)
/Subject (Subject in pdf)
/Title (0PDFtest1accessdefault)

When rendered in macOS, those are ‘flattened’ by its Quartz PDF engine, to
/Author (Author name in pdf)
/Creator (Pages)
/Keywords (keyword1 pdf)
/AAPL:Keywords [(keyword1 pdf)]
/Subject (Subject in pdf)
/Title (0PDFtest1accessdefault)

Note the copying of keywords into a new attribute AAPL:Keywords.

Extended attributes

I then added seven extended attributes using Metamer, with names such as com.apple.metadata:kMDItemAuthors, as shown below in xattred.

Spotlight import

I then inspected the file in SpotTest’s new Drop Window, which reported the following attributes found by mdimport:
":EA:kMDItemAuthors" = "author in xattr";
":EA:kMDItemComment" = "xattr comment";
":EA:kMDItemCreator" = "xattr creator";
":EA:kMDItemDescription" = "xattr description";
":EA:kMDItemKeywords" = "keyword1,xattr";
":EA:kMDItemSubject" = "xattr subject";
":EA:kMDItemTitle" = "xattr title";

all from the extended attributes, while those derived from the PDF data were
kMDItemAuthors = (Pages);
kMDItemCreator = Pages;
kMDItemDescription = "Subject in pdf";
kMDItemKeywords = ("keyword1 pdf");
kMDItemTitle = 0PDFtest1accessdefault;

Those attributes have already changed, with PDF Subject becoming kMDItemDescription, Creator being duplicated into kMDItemAuthors, and the loss of PDF Author.

Spotlight indexes

Attributes reported by mdls changed again to
kMDItemAuthors = (Pages)
kMDItemComment = "xattr comment"
kMDItemCreator = "Pages"
kMDItemDescription = "Subject in pdf"
kMDItemKeywords = ("keyword1,xattr")
kMDItemSubject = "xattr subject"
kMDItemTitle = "0PDFtest1accessdefault"

This has lost the xattr attributes kMDItemAuthors, kMDItemCreator, kMDItemDescription and kMDItemTitle, and the PDF kMDItemKeywords. That list of 7 attributes should then be searchable using Spotlight.

The Finder

The final step was to discover which of those could be displayed in the Finder, either in its Get Info dialog, or in the Preview panel of a Finder window.

Only 5 of those attributes survived in the Finder, and were given as
Authors: Pages
Content Creator: Pages
Description: Subject in pdf
Keywords: keyword1,xattr
Title: 0PDFtest1accessdefault

Of those, 4 are taken from the metadata in the PDF file, and only the Keywords were taken from its extended attribute. The attribute named as Authors contains a duplicate of what had originally been in the PDF Creator field, but neither of the PDF Author or xattr kMDItemAuthors fields. Those paths are traced in the diagram below.

Conclusions

Of the total of 12 distinct metadata attributes added in the PDF data and extended attributes, only 6 different items were indexed by Spotlight, and 4 were displayed in the Finder (allowing for the duplication of Authors and Content Creator).

Before relying on metadata for search and access in the Finder, it’s essential to verify that the attributes you intend using are successfully indexed and displayed. Choose the wrong attributes and you’ll never find anything.

SpotTest 1.2 can display Spotlight metadata directly

By: hoakley
14 May 2026 at 14:30

As promised, here is a new version of my free Spotlight test utility SpotTest, which will now display full information from two Spotlight command tools listing metadata for files.

To open its new Drop Window, either click on the new tool at the right end of the toolbar in its main window, or use the command in its Window menu. Then drag and drop files you want to inspect onto that window.

The app then runs two command tools on those files:
mdimport -t -d2 filename
to list all known metadata recognised by the mdimporter used, and
mdls filename
to list all indexed metadata.

That mdimport command currently crashes on most images, so won’t return any information about their metadata until Apple fixes the bug.

If you want to save the output in this Drop Window, select the file(s) output in its display, copy and paste it into a text editor or similar. You don’t have to keep the app main window open, and could use the Drop Window alone as a convenient way to inspect metadata.

As you’ll see, the length of mdimport output is significantly greater than that from mdls. Although there are matching entries, there’s no simple way to align those matches. Of all the possible layouts, I found this linked arrangement, where both outputs scroll alongside one another, the most effective for comparing their contents. It also allows you to view output for several files in the single window.

SpotTest version 1.2 is now available from here: spottest12
from Downloads above, and from its Product Page.

Tomorrow I’ll be using it to trace the paths of metadata from a source file to their display in the Finder.

What gets synced in iCloud Drive?

By: hoakley
12 May 2026 at 14:30

Following yesterday’s surprise about syncing of extended attributes via iCloud Drive, this article summarises what does and does not get synced in iCloud Drive in macOS Tahoe.

Attributes and data

File attributes are generally synced correctly and retained locally through eviction and download of each file. As those are saved in the inode data that is stored locally even when a file is dataless, they should remain reliable.

File data, complete with any embedded metadata such as EXIF fields and the optional Info section in RTF files, is retained and kept in sync with the copy in iCloud Drive while that file is kept downloaded. That is the only option when Optimise Mac Storage is turned off, and iCloud Drive is operating in replicated mode. When in nonreplicated mode, data is removed if the file is evicted from local storage, and synced down from iCloud Drive when it’s downloaded or materialised.

Extended attributes

As demonstrated in yesterday’s article, only the following xattrs are expected to sync down from iCloud Drive:

  • com.apple.metadata:_kMDItemUserTags (Finder Tag)
  • com.apple.lastuseddate#PS
  • com.apple.quarantine
  • com.apple.TextEncoding
  • other xattrs explicitly assigned the S flag, with #S appended to their name.

Other xattrs including those with names starting with com.apple.metadata:kMDItem aren’t likely to sync down from iCloud Drive without explicit use of the S flag.

Versions

Document versions saved in that volume’s version database in the hidden .DocumentRevisions-V100 folder will normally only be saved locally, when the local copy of that file is saved, and aren’t synced up to iCloud Drive at all. This has complicated effects best illustrated in the example below.

In this example, two Macs, A and B, are connected to the same iCloud Drive. At the start, Mac A creates and saves a new file, locally Version 1, in an iCloud Drive folder. This is synced up to iCloud Drive, from where it syncs down to Mac B, as the first version of that file.

Mac A then edits that file, saving it in two further versions. Each of those is synced up to iCloud Drive, and synced down to Mac B. However, as none of those versions is saved locally on Mac B, each is shown as the current and only version until the next one arrives.

Mac A is shut down after it has saved its version 3 of the file, and that has been synced up to iCloud Drive. Once that has synced down to Mac B, the document is edited there, and two versions are added to the first. Mac B’s Version 3 of that file is synced up to iCloud Drive, Mac B is shut down, and the file is opened and saved on Mac A, where that becomes its local Version 4.

At that stage, Mac A’s version database contains 4 versions, but no copy of Mac B’s Version 2. Mac B has 3 saved versions, including its Version 2 that isn’t saved in Mac A. iCloud Drive only has a single version, its Version 5, the same as Mac A’s Version 4 and Mac B’s Version 3.

The rules are simple:

  • Each Mac only adds a version to its own local version database when that file is saved locally.
  • At any moment, only the latest version is stored in iCloud Drive.

However, the behaviour can appear baffling, particularly when a document is observed in its QuickLook thumbnail as it’s being edited and saved on another Mac. You can then see its contents evolving with each save, but none of those are added to that Mac’s version database unless that version is saved there.

The end result is an incomplete version history on each Mac. The only way to unify those is in a third-party utility such as Versatility or Revisionist to consolidate them. As this doesn’t require any intervention by the File Provider framework, this is likely to be similar in third-party cloud services using that framework, such as Dropbox and Microsoft OneDrive.

Spotlight index content

No indexes from Spotlight are synced across iCloud Drive, but once a file has been synced down from iCloud Drive, its contents should be indexed locally, on the Data volume. Those indexed contents should remain accessible to search even when the file has been evicted from local storage.

QuickLook previews

QuickLook thumbnails and previews are also generated and stored locally. Those that have been cached to local storage remain accessible even when the file has been evicted. When there’s no cached version available, a generic icon for that file type will be displayed until an evicted file has been downloaded again. At one time, selecting an evicted file and calling for its thumbnail or preview was a simple way to force the file to be downloaded, but that has now been fixed.

How to check whether Spotlight is getting the right metadata

By: hoakley
8 May 2026 at 14:30

Spotlight can only search the metadata it has entered in its indexes. As I demonstrated a couple of days ago in two test cases, some metadata may be present in a file and available for indexing, but may not be added to those indexes. This normally occurs because of a problem or bug in the mdimporter responsible for extracting metadata and passing it for storage in the indexes. Fortunately, macOS provides a method of identifying that, using two command tools.

Commands

The first command
mdimport -t -d2 filename
lists all its known metadata recognised by the mdimporter used. Currently, that may crash persistently for some types of file such as images.

The second command
mdls filename
lists all indexed metadata for that file, and shouldn’t crash.

mdimport – aspirations

Output from mdimport is a long catalogue of all the metadata attached to and associated with that file. This starts with a statement of the file examined, tells you its type as a UTI, and reveals which mdimporter was used:
Imported '/Users/hoakley/Documents/0xattrtests/testtext1.text' of type 'public.plain-text' with plugIn /System/Library/Spotlight/RichText.mdimporter.

It then tells you how many metadata attributes it found
35 attributes returned

Those are listed, starting with those found in extended attributes, prefaced by :EA:
":EA:kMDItemLastUsedDate" = "2026-05-04 18:52:32 +0000";

Then come standard file metadata
":MD:kMDItemPath" = "/Users/hoakley/Documents/0xattrtests/testtext1.text";
"_kMDItemContentChangeDate" = "2026-05-04 11:34:50 +0000";

The main body lists all the rest with the prefix kMDItem common to metadata
kMDItemContentCreationDate = "2026-05-04 11:34:49 +0000";

Among those are the UTI of the file, and its more general types in the UTI tree. These can explain why a file appears to have been processed by the wrong mdimporter
kMDItemContentType = "public.plain-text";
kMDItemContentTypeTree = ("public.plain-text", "public.text", "public.data", "public.item", "public.content");

There’s a long series of entries giving the long form of the file type in multiple languages
kMDItemKind = {en = "Plain Text Document"; };

Text content that has been indexed isn’t given in this form of the command, but a summary is:
kMDItemTextContent = "<<< Text content of 4968 characters >>>";

Those are the metadata that should then be passed to Spotlight to be stored in its indexes, but not necessarily what does get stored. To discover that, we need the mdls output. Note that additional metadata obtained by mediaanalysisd and the CGPDF Service aren’t included in this, as they operate separately from mdimporters and normally after significant delay.

mdls – reality

This output is far shorter, and contains entries in Spotlight’s indexes for that file, except for indexed text content. The only way to assess that is by searching for text it should contain.

This should match metadata attributes seen in the mdimport output, such as
_kMDItemDisplayNameWithExtensions = "testtext1.text"
kMDItemContentCreationDate = 2026-05-04 11:34:49 +0000
kMDItemContentType = "public.plain-text"
kMDItemKind = "Plain Text Document"

Examples

Plain text file with extended attributes

mdimport:
“:EA:kMDItemAuthors” = “Andy Bill Charlie”;
“:EA:kMDItemComment” = “A. regular comment.”;
“:EA:kMDItemDescription” = “A description.”;
“:EA:kMDItemKeywords” = “keyword1,ketwird2,keyword3”;
“:EA:kMDItemSubject” = “The subject.”;

mdls:
kMDItemAuthors = (“Andy Bill Charlie”)
kMDItemComment = “A. regular comment.”
kMDItemDescription = “A description.”
kMDItemKeywords = (“keyword1,ketwird2,keyword3”)
kMDItemSubject = “The subject.”

Metadata attributes were faithfully added to Spotlight’s indexes.

RTF file with extended attributes

mdimport:
“:EA:kMDItemAuthors” = “Andy Bill Charlie”;
“:EA:kMDItemComment” = “A. regular comment.”;
“:EA:kMDItemDescription” = “A description.”;
“:EA:kMDItemKeywords” = “keyword1,ketwird2,keyword3”;
“:EA:kMDItemSubject” = “The subject.”;
kMDItemAuthors = “<null>”;
kMDItemComment = “<null>”;
kMDItemKeywords = “<null>”;
kMDItemSubject = “<null>”;

The last four are those obtained from the (absent) Info metadata embedded in the file data, and conflict with those from four of the extended attributes.

mdls:
kMDItemComment = “A. regular comment.”
kMDItemDescription = “A description.”
kMDItemKeywords = (“keyword1,ketwird2,keyword3”)
kMDItemSubject = “The subject.”

These reveal that Spotlight’s indexes captured four of the five extended attributes, and ignored the null values for the Info metadata. However, kMDItemAuthors is missing, presumably because of a bug in the mdimporter.

I’m considering whether it might be useful to add these to SpotTest, to help diagnose problems.

How macOS can ignore and hide metadata

By: hoakley
6 May 2026 at 14:30

When I was researching yesterday’s article about storing and managing metadata, I encountered some unexpected behaviour that merited deeper investigation. This article explains how that can lead to macOS ignoring and hiding metadata.

Spotlight is about more than just search, as it runs the metadata services for macOS. Those ingest information about files, contents of their extended attributes, metadata extracted from file data such as EXIF in images, and indexed text content. When the Finder needs to populate the fields of a Get Info dialog, or those shown in a Preview pane, it calls on Spotlight’s volume indexes to provide the data. The process of analysing and extracting metadata is thus important to a wider range of features beyond search, and it’s concerning when metadata known to be present is unexpectedly missing.

To determine what might be going wrong, I took six test files in macOS 26.4.1:

  • two plain text files, indexed by /System/Library/Spotlight/RichText.mdimporter
  • one RTF file, indexed by /System/Library/Spotlight/RichText.mdimporter
  • one PDF file, indexed by /System/Library/Spotlight/PDF.mdimporter
  • one JPEG and one PNG screenshot, indexed by /System/Library/Spotlight/Image.mdimporter.

To each I attached five xattrs containing distinctive test text:

  • com.apple.metadata:kMDItemAuthors, known in the Finder’s Find search terms as Authors;
  • com.apple.metadata:kMDItemComment, known as Comment, and distinct from Finder or Spotlight Comment;
  • com.apple.metadata:kMDItemDescription, known as Description;
  • com.apple.metadata:kMDItemKeywords, known as Keywords;
  • com.apple.metadata:kMDItemSubject, known as Subject of this item.

I then used the Finder’s Find window to search for part of the contents of each of the extended attributes for each of the test files. I also viewed each file in a Preview pane, with maximal display of information, as described here, and in a Get Info dialog.

For each file, I ran the following two commands:
mdimport -t -d2 filename
to list all its known metadata recognised by mdimporter. That crashed for both the image files, a bug that has been long present. I also ran
mdls filename
to list indexed metadata, which didn’t crash.

Search results

Search worked as expected and proved successful for all files with Description, Keywords and Subject. However, I was unable to find Authors metadata for the RTF file, and Comment metadata couldn’t be found for the JPEG or PNG images.

Finder preview pane

The two image formats displayed none of the metadata used. Otherwise Keywords was shown in each, Subject and Comment were omitted from PDF, and Authors and Description omitted from RTF.



These are summarised in the table below.

There are two serious failures shown here, failure to index Authors metadata correctly in the RTF file, and failure to index Comment metadata correctly in the JPG and PNG files. Otherwise all failures appear to be the result of design decisions over which types of metadata should be displayed in the Finder.

Failure to index Authors in RTF

From the mdimport listings, all five xattrs were recognised correctly and should have been indexed. However, there were additional [null] listings for four of them, including Authors, presumably derived from the absent RTF Info section, an optional feature in the format and often absent. Separate testing revealed that, for the Authors metadata alone, the content extracted from the Info section of the RTF overrode that obtained from the xattr, even when it was null. Adding the authors to the Author (note the singular) entry in Info restored its indexing and search, but using the text in RTF Info rather than the xattr.

There are thus two related bugs here:

  • The Author field from RTF Info metadata should be indexed into its own field, to avoid conflict with any com.apple.metadata:kMDItemAuthors xattr present.
  • Only com.apple.metadata:kMDItemAuthors xattr content should be indexed into Authors.

Failure to index Comment in JPG/PNG

No mdimport listing was available for these image files because of the tool crashing persistently. However, it’s clear that the contents of Comment was taken not from the xattr, but from the EXIF UserComment, which had been set by macOS to Screenshot when that screenshot had been made. That metadata should surely have been indexed into a separate category such as a new kMDItemUserComment to avoid such collisions.

There are again two related bugs:

  • The UserComment field from EXIF metadata shouldn’t be indexed into Comment, but into its own field.
  • Only com.apple.metadata:kMDItemComment xattr content should be indexed into Comment.

Conclusions

  • There appear to be indexing bugs in /System/Library/Spotlight/RichText.mdimporter and /System/Library/Spotlight/Image.mdimporter. This is the third bug I have investigated in RichText.mdimporter.
  • Those can lead to incorrect indexing in kMDItemAuthors (Authors) and kMDItemComment (Comment) respectively.
  • They limit significantly the usefulness of those xattrs as metadata in macOS.
  • Of the five metadata xattrs tested here, only one, kMDItemKeywords (Keywords), appears both reliable for search and displayed generally in the Finder.
  • kMDItemDescription or Description, and kMDItemSubject or Subject are reliable for search, but aren’t displayed well in the Finder.
  • Aside from these bugs, poor display coverage in the Finder limits the usefulness of these xattrs.
  • Investigating bugs in Spotlight mdimporters can be challenging, particularly when mdimport is so prone to crash.

Finder comments, steganography and malware

By: hoakley
28 April 2026 at 14:30

Parts of macOS go back a long way, and have scars to prove it. Among those are Finder comments: text you can easily add to a file’s metadata in the Finder’s Get Info dialog. Like all good metadata, it’s also searchable in Spotlight, so can be used to organise and categorise your documents. However, there are some catches that can make that unreliable.

Finder comment

To see what I mean, you’ll need an app like my free Metamer, a simple metadata editor, or xattred, to display extended attributes. Pick an old file you can sacrifice for this purpose. Select it in the Finder, and Get Info for it. In the Comments section, type in a comment.

comment1

Open that file in Metamer, and select the FinderComment item in its combo box menu.

comment2

Your freshly written Finder Comment has already been added as an extended attribute.

Open the file in xattred, and you’ll see that text as a String in a property list for that extended attribute.

comment3

Now change the name of that file, perhaps by adding a short suffix. Get Info and the Finder Comment is still there.

comment4

Look at the extended attribute using Metamer or xattred, and you’ll see that the String in the property list has been deleted, and the extended attribute no longer contains the text still displayed in the Get Info dialog.

comment5

This is because

  • the primary copy of the Finder Comment goes into the hidden .DS_Store file in the same folder as the document, and that’s the one used by the Finder;
  • a secondary copy is saved in a xattr of type com.apple.metadata:kMDItemFinderComment for the file, which the Finder knows nothing about.

You can experience other strange effects from this dissociation between those two copies, and their different behaviours. As a result I don’t recommend use of Finder comments, because of their unreliability.

Steganography

Finder comments and other metadata provide a scheme for steganography in macOS. This was first used by Wdef, one of the original viruses that affected Classic Mac OS in about 1990, and was resurrected 30 years later in 2020, when Phil Stokes of Sentinel Labs reported a Bundlore variant abusing the Resource fork, then as now an extended attribute of type com.apple.ResourceFork.

Just recently, William Charles Gibson and Ryan Conry of Cisco Talos have described how a similar technique could be used “to stage payloads in a way that evades static file analysis”. They propose using a payload script, encoded using Base64, and written to Finder comment metadata.

Writing the malicious Finder comment is simple to achieve using the AppleScript code
set comment of newFile to "$PAYLOAD"
and retrieval, decoding and execution by the command
mdls -name kMDItemFinderComment -raw ~/Desktop/remote_test.txt | base64 -D | bash
also run from within AppleScript.

There are three disadvantages in using a Finder comment for this purpose, its fragility as shown above, its visibility to the user in the Finder’s Get Info dialog, and reliance on Spotlight indexing and search. If you want to use file metadata for steganography, then you’re better off using an extended attribute directly, accessing it with the xattr command tool, which is robust, remains with the file rather than in a hidden .DS_Store file, and doesn’t need to be indexed by Spotlight or found using mdls.

That also gives you a wider choice of extended attribute. If traceability isn’t important, a custom xattr can be used with the PS flags to ensure its persistence in copies, saves, syncs and backups. There’s a long list of those supported in macOS, including kMDItemProjects, which is stored in the com.apple.metadata:kMDItemProjects xattr and treated as having PS flags, and most importantly isn’t exposed to the user in the Finder’s Get Info dialog.

At this stage you might want to check how thoroughly your malware protection checks extended attributes for malicious content. As a rough guess, I suspect the answer is not at all.

❌
❌