The Eclectic Light Company

Chinese whispers in PDF metadata

By: hoakley
15 May 2026 at 14:30

Chinese whispers is an old children’s game where everyone sits in a circle, and one child whispers into the ear of the next on their right a sentence like ‘Send reinforcements, we’re going to advance’. That child then whispers the message they heard to the child on their right, and so on until it reaches the one who started it, who says out loud what they heard, classically ‘Send three-and-fourpence, we’re going to a dance’, as a demonstration of how easily messages can become corrupted. What this has to do with China remains one of childhood’s mysteries. I should also explain that three-and-fourpence was idiomatic British English in the days before our currency was ‘decimalised’, and meant three shillings and four (old) pence, about 17 (new) pence, sufficient at one time to enjoy a good night out.

In this article I’m going to do much the same with metadata for a PDF document, tracing what gets indexed by Spotlight, so becoming discoverable by search, and what is displayed in the Finder. This relies on several of my utilities, most of which are available from this page.

Source PDF

I prepared a completely unrelated PDF using my favourite PDF editor, PDF Expert, by adding metadata to be saved in the file’s data. As you might expect, there are several ways that metadata could be stored in the PDF format, including XMP metadata, but in this case for simplicity it was added to the document information dictionary.

I inspected that in a source view in Podofyllin, which found the following fields:
/Author (Author name in pdf)
/Creator (Pages)
/Keywords (keyword1 pdf)
/Subject (Subject in pdf)
/Title (0PDFtest1accessdefault)

When rendered in macOS, those are ‘flattened’ by its Quartz PDF engine, to
/Author (Author name in pdf)
/Creator (Pages)
/Keywords (keyword1 pdf)
/AAPL:Keywords [(keyword1 pdf)]
/Subject (Subject in pdf)
/Title (0PDFtest1accessdefault)

Note the copying of keywords into a new attribute AAPL:Keywords.
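
The fields listed above can be pulled out of a text dump of the document information dictionary with a few lines of Python. This is only a minimal sketch: it assumes each entry is a simple literal string of the form /Key (value) with no escaped or nested parentheses, and parse_info_fields is a hypothetical helper, not part of Podofyllin or any PDF library. Array values such as AAPL:Keywords are deliberately skipped.

```python
import re

# Matches entries such as "/Title (0PDFtest1accessdefault)".
# Assumption: values are simple literal strings without escaped
# or nested parentheses; real PDFs can be more complicated.
FIELD_PATTERN = re.compile(r"/([A-Za-z:]+)\s*\(([^()]*)\)")


def parse_info_fields(dump: str) -> dict[str, str]:
    """Return a {key: value} mapping for each '/Key (value)' entry in dump."""
    return {key: value for key, value in FIELD_PATTERN.findall(dump)}


# The flattened fields quoted in the article.
SOURCE_DUMP = """
/Author (Author name in pdf)
/Creator (Pages)
/Keywords (keyword1 pdf)
/AAPL:Keywords [(keyword1 pdf)]
/Subject (Subject in pdf)
/Title (0PDFtest1accessdefault)
"""

fields = parse_info_fields(SOURCE_DUMP)
for key, value in fields.items():
    print(f"{key}: {value}")
```

A full PDF parser would also need to handle hexadecimal strings, escapes and indirect objects, none of which is attempted here.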

Extended attributes

I then added seven extended attributes using Metamer, with names such as com.apple.metadata:kMDItemAuthors, as shown below in xattred.
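
For those who prefer the command line, a similar xattr could in principle be prepared with the xattr tool. The sketch below only builds the command without running it. How Metamer encodes its values isn't stated here, so the payload format is an assumption: a binary property list holding a single string. The file name test.pdf and the helper build_xattr_command are both hypothetical.

```python
import plistlib

PREFIX = "com.apple.metadata:"


def build_xattr_command(path: str, item: str, value: str) -> list[str]:
    """Build (but do not run) an xattr command that writes one metadata xattr."""
    # Assumption: the value is stored as a binary property list
    # containing a single string.
    payload = plistlib.dumps(value, fmt=plistlib.FMT_BINARY)
    # -w writes the attribute, -x says the value is given as hexadecimal.
    return ["xattr", "-w", "-x", PREFIX + item, payload.hex(), path]


command = build_xattr_command("test.pdf", "kMDItemAuthors", "author in xattr")
print(command[3])           # the full xattr name
print(len(command[4]) // 2, "bytes of payload")
```

Keeping the command as a list makes it easy to inspect before passing it to subprocess.run on a Mac.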

Spotlight import

I then inspected the file in SpotTest’s new Drop Window, which reported the following attributes found by mdimport:
":EA:kMDItemAuthors" = "author in xattr";
":EA:kMDItemComment" = "xattr comment";
":EA:kMDItemCreator" = "xattr creator";
":EA:kMDItemDescription" = "xattr description";
":EA:kMDItemKeywords" = "keyword1,xattr";
":EA:kMDItemSubject" = "xattr subject";
":EA:kMDItemTitle" = "xattr title";

all from the extended attributes, while those derived from the PDF data were
kMDItemAuthors = (Pages);
kMDItemCreator = Pages;
kMDItemDescription = "Subject in pdf";
kMDItemKeywords = ("keyword1 pdf");
kMDItemTitle = 0PDFtest1accessdefault;

Those attributes have already changed, with PDF Subject becoming kMDItemDescription, Creator being duplicated into kMDItemAuthors, and the loss of PDF Author.

Spotlight indexes

Attributes reported by mdls changed again to
kMDItemAuthors = (Pages)
kMDItemComment = "xattr comment"
kMDItemCreator = "Pages"
kMDItemDescription = "Subject in pdf"
kMDItemKeywords = ("keyword1,xattr")
kMDItemSubject = "xattr subject"
kMDItemTitle = "0PDFtest1accessdefault"

This has lost the xattr attributes kMDItemAuthors, kMDItemCreator, kMDItemDescription and kMDItemTitle, and the PDF kMDItemKeywords. That list of 7 attributes should then be searchable using Spotlight.
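
Which source won for each indexed attribute can be worked out mechanically from the three listings. The following is a small sketch using only the values quoted above, with single-element arrays written as plain strings for simplicity; trace_sources is a hypothetical helper.

```python
# Values derived from the PDF data, as reported after import.
FROM_PDF = {
    "kMDItemAuthors": "Pages",
    "kMDItemCreator": "Pages",
    "kMDItemDescription": "Subject in pdf",
    "kMDItemKeywords": "keyword1 pdf",
    "kMDItemTitle": "0PDFtest1accessdefault",
}
# Values taken from the extended attributes.
FROM_XATTR = {
    "kMDItemAuthors": "author in xattr",
    "kMDItemComment": "xattr comment",
    "kMDItemCreator": "xattr creator",
    "kMDItemDescription": "xattr description",
    "kMDItemKeywords": "keyword1,xattr",
    "kMDItemSubject": "xattr subject",
    "kMDItemTitle": "xattr title",
}
# Values reported by mdls.
INDEXED = {
    "kMDItemAuthors": "Pages",
    "kMDItemComment": "xattr comment",
    "kMDItemCreator": "Pages",
    "kMDItemDescription": "Subject in pdf",
    "kMDItemKeywords": "keyword1,xattr",
    "kMDItemSubject": "xattr subject",
    "kMDItemTitle": "0PDFtest1accessdefault",
}


def trace_sources(indexed: dict, from_pdf: dict, from_xattr: dict) -> dict:
    """Label each indexed attribute with the source whose value it matches."""
    sources = {}
    for name, value in indexed.items():
        if from_pdf.get(name) == value:
            sources[name] = "pdf"
        elif from_xattr.get(name) == value:
            sources[name] = "xattr"
        else:
            sources[name] = "unknown"
    return sources


sources = trace_sources(INDEXED, FROM_PDF, FROM_XATTR)
lost_xattrs = sorted(n for n in FROM_XATTR if sources.get(n) != "xattr")
lost_pdf = sorted(n for n in FROM_PDF if sources.get(n) != "pdf")
print("xattr values lost:", lost_xattrs)
print("PDF values lost:", lost_pdf)
```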

The Finder

The final step was to discover which of those could be displayed in the Finder, either in its Get Info dialog, or in the Preview panel of a Finder window.

Only 5 of those attributes survived in the Finder, and were given as
Authors: Pages
Content Creator: Pages
Description: Subject in pdf
Keywords: keyword1,xattr
Title: 0PDFtest1accessdefault

Of those, 4 are taken from the metadata in the PDF file, and only the Keywords were taken from its extended attribute. The attribute named as Authors contains a duplicate of what had originally been in the PDF Creator field, but neither of the PDF Author or xattr kMDItemAuthors fields. Those paths are traced in the diagram below.

Conclusions

Of the total of 12 distinct metadata attributes added in the PDF data and extended attributes, only 6 different items were indexed by Spotlight, and 4 were displayed in the Finder (allowing for the duplication of Authors and Content Creator).
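
Those totals can be checked with simple arithmetic on the listings. In this sketch, duplicated values (Authors and Content Creator both holding Pages) are collapsed by counting distinct values, which is one reasonable reading of ‘different items’.

```python
PDF_FIELDS = ["Author", "Creator", "Keywords", "Subject", "Title"]
XATTR_FIELDS = [
    "kMDItemAuthors", "kMDItemComment", "kMDItemCreator", "kMDItemDescription",
    "kMDItemKeywords", "kMDItemSubject", "kMDItemTitle",
]
# Values reported by mdls, in the order listed in the article.
INDEXED_VALUES = [
    "Pages", "xattr comment", "Pages", "Subject in pdf",
    "keyword1,xattr", "xattr subject", "0PDFtest1accessdefault",
]
# Values shown in the Finder.
FINDER_VALUES = [
    "Pages", "Pages", "Subject in pdf", "keyword1,xattr", "0PDFtest1accessdefault",
]

total_added = len(PDF_FIELDS) + len(XATTR_FIELDS)
distinct_indexed = len(set(INDEXED_VALUES))
distinct_in_finder = len(set(FINDER_VALUES))
print(total_added, distinct_indexed, distinct_in_finder)  # prints: 12 6 4
```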

Before relying on metadata for search and access in the Finder, it’s essential to verify that the attributes you intend using are successfully indexed and displayed. Choose the wrong attributes and you’ll never find anything.
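
That verification can be scripted. The sketch below compares the values you intended to set against those Spotlight reports. It assumes that mdls accepts -plist - to write its output as a property list to standard output, as described in its man page, and both function names are hypothetical. The worked example uses values from the listings above rather than a live call, so it can be followed without the test file.

```python
import plistlib
import subprocess


def read_indexed_attributes(path: str) -> dict:
    """Return the attributes mdls reports for path (macOS only)."""
    # Assumption: "mdls -plist - <path>" writes a property list to stdout.
    result = subprocess.run(
        ["mdls", "-plist", "-", path], capture_output=True, check=True
    )
    return plistlib.loads(result.stdout)


def find_mismatches(expected: dict, indexed: dict) -> dict:
    """Map each attribute whose indexed value differs to (expected, indexed)."""
    return {
        name: (want, indexed.get(name))
        for name, want in expected.items()
        if indexed.get(name) != want
    }


# Intended xattr values against what mdls reported in this test.
expected = {"kMDItemTitle": "xattr title", "kMDItemSubject": "xattr subject"}
indexed = {"kMDItemTitle": "0PDFtest1accessdefault", "kMDItemSubject": "xattr subject"}
print(find_mismatches(expected, indexed))
```

On a Mac, indexed could instead come from read_indexed_attributes, bearing in mind that some attributes are returned as arrays rather than strings.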

How to store and manage metadata in macOS

By: hoakley
5 May 2026 at 14:30

One of the design features of macOS is that a file’s metadata can be stored separately from that file’s data. This is normally achieved by extended attributes, xattrs, and to aid that there’s a rich and extendable range of them. This article explains what you can and can’t do with them, as of macOS 26.4.1.

More generally, most metadata are stored within a file’s data, to accommodate operating systems and file systems that don’t have such rich features. Most image formats, for example, incorporate standard collections of metadata such as EXIF information, which are saved with the image data. PDF and Word documents have similar features. These have the disadvantage that changing the metadata results in the file data being altered, and that makes it difficult to track and to guarantee the data’s integrity. This can result in damage or corruption to the data in a PDF file during editing of comment metadata, for example. When possible it’s far better to separate metadata.

Distinguishing between data and metadata can also be tricky at times. This is easiest with non-verbal data like images and audio, where text is clearly separate from data, but it can appear more arbitrary with written documents. Some content such as copyright information or an index is universally accepted as metadata, but abstracts and appendixes may vary. Some types of document, such as official standards, draw explicit distinctions, using qualifiers like normative and informative, to assist.

Finder Comments and Tags

These are the two standard types of metadata currently best supported in macOS, both being readily accessible in the Finder and applicable to any file or folder.

The main problem with Finder Comments is that they are primarily stored separately in the hidden .DS_Store file in the same directory as the item, although a secondary copy is written to a xattr attached to that file/folder. They’re easily accessed in the Finder’s Get Info dialog, and can be shown in List view windows, although that doesn’t work for multi-line comments. On balance, their strange storage makes them fragile and unsuitable for many uses.

Finder Tags are stored properly in a xattr, and are most widely distinguished by the coloured tag displayed. Although they can be repurposed to store text, their main value remains in categorisation. There’s a practical limit of 20-25 characters before their text label is cropped in most views, and labels for multiple items can only be shown in the Finder’s List view layout. For the majority, they are best used for allocating items to a limited number of categories, distinguished foremost by their tag colour, and aren’t suitable for more substantial text like even a brief summary.

Properties

To be generally suitable for storing text metadata for a file or folder in macOS, metadata items should:

  • be attached to the file or folder as a xattr;
  • be capable of storing and displaying up to 3,804 bytes of UTF-8 text, the upper limit of data stored alongside the xattr;
  • use xattr flags to control their persistence;
  • be indexed by Spotlight so their contents can be searched;
  • be preserved in iCloud Drive, and when copied to a different volume;
  • be displayed and edited easily, ideally in the Finder, without the need for third-party software.
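
The size requirement is measured in bytes rather than characters, so accented and non-Latin text reaches the limit sooner. The following minimal sketch checks text against the 3,804 byte figure given above. It measures the raw UTF-8 text only, and assumes that any wrapping added when the xattr is written isn't counted; fits_in_xattr is a hypothetical helper.

```python
MAX_XATTR_TEXT_BYTES = 3804  # limit quoted in the list above


def utf8_size(text: str) -> int:
    """Return the number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_xattr(text: str, limit: int = MAX_XATTR_TEXT_BYTES) -> bool:
    """Report whether text fits within limit bytes of UTF-8."""
    return utf8_size(text) <= limit


print(utf8_size("résumé"))           # 8: each é takes two bytes
print(fits_in_xattr("a" * 3804))     # True
print(fits_in_xattr("é" * 1903))     # False: 3,806 bytes
```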

Which xattrs?

Of the dozens of xattrs available, I’m aware of just three that come closest to meeting all those requirements:

  • com.apple.metadata:kMDItemComment, known in the Spotlight search menu as Comment, and different from Spotlight Comment, which is synonymous with Finder Comment;
  • com.apple.metadata:kMDItemKeywords, Spotlight Keywords;
  • com.apple.metadata:kMDItemSubject, Spotlight Subject.

These are sufficiently persistent as to be preserved in iCloud Drive, and when transferred between Macs using AirDrop. Although they can be displayed in the Finder, they each require third-party software for creation and management. As explained below, this doesn’t apply to image files, which are expected to store their metadata in EXIF information within the file, and not in xattrs.

Management

All three can be created and managed using my free Metamer, as well as with the more extensive features of xattred.

Metamer is a lightweight drag-and-drop utility to create, edit and view many different types of xattr, including the three recommended. It uses a Combo box offering 16 of the most commonly used xattrs, and you can enter the full name of any other if you need. Although it can be used to edit multi-line text, it’s designed to work best with single lines.

xattred is a general xattr editor with more extensive capabilities, and ideal for checking all the xattrs attached to a file or folder. It doesn’t offer the conveniences of Metamer, though.
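
Without either utility, the contents of one of these xattrs can still be inspected from a hexadecimal dump, such as that printed by xattr -p -x. This sketch assumes the xattr holds a binary property list; decode_hex_plist is a hypothetical helper, and the sample data are generated in the script rather than read from a real file.

```python
import plistlib


def decode_hex_plist(hex_dump: str):
    """Decode a property list from a hex dump, ignoring spaces and newlines."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    return plistlib.loads(raw)


# Build a sample dump in the spaced style that hex tools typically print.
sample = plistlib.dumps(["keyword1", "xattr"], fmt=plistlib.FMT_BINARY)
spaced = " ".join(f"{byte:02X}" for byte in sample)

print(decode_hex_plist(spaced))  # prints: ['keyword1', 'xattr']
```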

Display

Each of these three xattrs is displayed in the More Info section of the Get Info dialog, and depending on settings, they can also be shown in the Finder’s preview pane when enabled in Preview Options for that file type. The latter is explained in more detail here, and can appear counter-intuitive at times. However, they can only be added and displayed for supported file types. For example, JPEG and PNG image files can’t display these three xattrs in Get Info dialogs or in the preview pane, but PDF, RTF, text and many other file types can.

None of the three can be listed for multiple files in any of the Finder’s view layouts.

Spotlight

All three appear in the list of search terms offered by the Other item at the foot of the search term menu in a Finder Find window, under the names listed above.

But

When researching this article, I discovered some odd behaviours that render some xattrs both invisible and undiscoverable by Spotlight search. I hope to describe those fully tomorrow if I can get my head around them.
