Last Week on My Mac: Making better use of security extended attributes

This week brought a timely revisit to remind myself just how common the three security extended attributes (xattr) have become, and to see whether we can make use of any of them for our own purposes.

How common?

Checking through one of my ~/Documents folders containing a modest 57,884 files, nearly 60% of them have at least one xattr. By far the most common is com.apple.quarantine on 48%, followed by com.apple.provenance at 14%. Some way behind those, but still one of the most frequent, is com.apple.macl on 2.8%.
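For anyone wanting to run a similar survey, here is a minimal sketch in Python. It is not how the figures above were obtained. It assumes the macOS `xattr` command tool is at `/usr/bin/xattr`, because Python's own `os.listxattr` is only available on Linux. The function names and the `(any xattr)` key are made up for illustration.

```python
import subprocess
from collections import Counter
from typing import Dict, Iterable, List

SECURITY_XATTRS = (
    "com.apple.quarantine",
    "com.apple.provenance",
    "com.apple.macl",
)


def list_xattr_names(path: str) -> List[str]:
    # Assumption: running `xattr <file>` on macOS prints one xattr name per line.
    # Returns an empty list if the tool reports an error for this file.
    result = subprocess.run(
        ["/usr/bin/xattr", path], capture_output=True, text=True
    )
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line]


def xattr_prevalence(names_per_file: Iterable[Iterable[str]]) -> Dict[str, float]:
    # Percentage of files carrying any xattr, and each security xattr.
    total = 0
    with_any = 0
    counts: Counter = Counter()
    for names in names_per_file:
        total += 1
        unique = set(names)
        if unique:
            with_any += 1
        counts.update(unique)
    if total == 0:
        return {}
    report = {"(any xattr)": 100.0 * with_any / total}
    for name in SECURITY_XATTRS:
        report[name] = 100.0 * counts[name] / total
    return report
```

On a Mac, you could walk a folder with `os.walk`, pass each file path to `list_xattr_names`, and feed the results to `xattr_prevalence`.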

Having explained what I think macOS does with all those xattrs, the next step is to ask whether we can use them for our direct benefit. Of the three, quarantine seems least useful to anything beyond Gatekeeper. MACL is like a boil on the bum, and the only time you’ll notice it is when it gets in your way. I can’t make sense of its contents either, but as it’s protected by SIP, there’s little a utility could do to alleviate our suffering, so we just have to learn to live with it. I’m surprised how uncommon it is in comparison with the nuisance it can cause.

Promising provenance

It’s the provenance xattr that looks most promising, and Koh M. Nakagawa has followed up his recent research into its function with an open-source command tool, ShowProvenanceInfo. This takes provenance IDs found on files and looks them up in the ExecPolicy database’s Provenance Tracking table, although access to that table requires root privileges.

Apps and executable code signed by third parties rather than Apple are added to that table when they’ve successfully completed their first run. Each is given a unique provenance ID number that is attached to it in a com.apple.provenance xattr. When it then performs any of 11 types of file operation, such as creating a file or opening one in write mode, that app’s provenance ID is attached to the file in its own provenance xattr.

As apps and other executables that have been entered into that table have their own provenance xattrs, it shouldn’t be too much of a burden to build an independent database from those, together with other information about the executable with that provenance ID. That can then be used to examine provenance xattrs on arbitrary files, to identify which app last worked with that file, the primary task of a new GUI utility I’ve already dubbed Providable.

In addition to telling you which app with a provenance ID last changed a document, there are other functions that Providable could perform. Those property lists forming the basis for Background Items listed in Login Items & Extensions settings are normally created and changed by their owning app. When a third-party app with a provenance ID does that, the property list gains a provenance xattr that can be used to identify the app responsible. That can in turn provide information that’s often sadly lacking from the list of extensions, including the location of the app rather than that of the property list.

One obvious hole in this plan is the fact that apps that are signed by Apple, including those bundled in macOS and everything installed from the App Store, don’t get assigned provenance IDs. They therefore can’t be traced and identified from the files they create or change, as they operate outside the provenance tracking system.

Providable

My outline design for Providable is therefore to inspect provenance IDs saved as xattrs to the apps in the Data volume, and to display their details in a list you can refer to. Alongside that, a window lets you drop files on it for checking. Each will be examined for a provenance xattr, and those that have one will then be associated with the app in the list, providing its path and other details.
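The matching step in that outline could be sketched as below. This is an illustration in Python, not Providable's actual design. It makes two assumptions that aren't confirmed above: that the bytes of the provenance xattr are identical on an app and on the files it has changed, so the whole value can be used as an opaque key, and that `xattr -px` is available to print that value as hex. All function names are hypothetical.

```python
import subprocess
from typing import Dict, Iterable, Optional, Tuple

PROVENANCE_XATTR = "com.apple.provenance"


def read_provenance(path: str) -> Optional[bytes]:
    # Assumption: `xattr -px <name> <file>` prints the value as hex bytes.
    # Returns None when the file has no provenance xattr or the tool fails.
    result = subprocess.run(
        ["/usr/bin/xattr", "-px", PROVENANCE_XATTR, path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # bytes.fromhex skips the spaces and newlines in the tool's output.
    return bytes.fromhex(result.stdout)


def build_app_index(apps: Iterable[Tuple[str, Optional[bytes]]]) -> Dict[bytes, str]:
    # Map each app's provenance value to its path; apps without one are skipped.
    index: Dict[bytes, str] = {}
    for app_path, payload in apps:
        if payload:
            index.setdefault(payload, app_path)
    return index


def identify_app(index: Dict[bytes, str], file_payload: Optional[bytes]) -> Optional[str]:
    # None means no provenance xattr, or an ID that isn't in the index.
    if not file_payload:
        return None
    return index.get(file_payload)
```

A file with no provenance xattr, such as one last changed by an app signed by Apple, simply returns `None`.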

Provenance IDs can also be assigned to command tools and other executables, and a later version will allow you to check those in popular locations, and add them to the database so files can be matched to them as well. I don’t currently know how useful that might be, but we should get a better idea once the first version of Providable is in use.

I invite your ideas and comments, please, before I start coding.

Explainer: Data and metadata

Files, documents and everything else we store on our Macs consist of data. For an image, those are the pixels that have to be displayed for that image, for an illustrated book it’s the laid-out pages of text and pictures.

Associated with each of those is additional information about what’s in the data, such as the datestamp of its creation both as data and as that file, details of its creator, and about how it was created, such as the camera used. Those are data about its data, thus metadata.

Until 1984 and the first Mac, it was almost universal that most metadata was contained in the same file as the data, although some, such as a file’s datestamps, were stored separately in that file’s record in the file system, in its attributes. The Mac tried to change that by introducing a second fork to files, their resource fork, intended to contain metadata. Unfortunately, while that became standard on Macs, dominant operating systems like MS-DOS didn’t change, and continued to embed data and metadata together in flat files.

A lot has changed over the nearly 42 years since the first Mac, and now macOS has multiple sources of metadata for its files.

File attributes

APFS file records contain an extensive set of attributes, including

  • time of creation
  • time of modification of data
  • time of attribute modification
  • time of access
  • file name
  • owner, group and permissions.

These are largely common to other modern file systems.
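As a rough illustration, most of these attributes can be read through the standard `stat` interface. The Python sketch below reads them using that generic POSIX view rather than anything specific to APFS. The dictionary keys are my own labels, and `st_birthtime` is only present on some platforms, macOS among them, so it is fetched defensively.

```python
import os
import stat
from datetime import datetime, timezone
from typing import Optional


def file_attributes(path: str) -> dict:
    st = os.stat(path)

    def as_utc(seconds: Optional[float]) -> Optional[datetime]:
        if seconds is None:
            return None
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "name": os.path.basename(path),
        # Creation time: None where the platform doesn't expose st_birthtime.
        "created": as_utc(getattr(st, "st_birthtime", None)),
        "data_modified": as_utc(st.st_mtime),
        # On Unix-like systems st_ctime is the time of attribute change.
        "attributes_modified": as_utc(st.st_ctime),
        "accessed": as_utc(st.st_atime),
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "permissions": stat.filemode(st.st_mode),
    }
```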

Extended attributes

Mac OS X brought the extension of classic resource forks to other metadata objects, as extended attributes (xattr), named using a reverse-DNS scheme, such as com.apple.FinderInfo containing metadata for the Finder. In this scheme, the traditional resource fork becomes a xattr of type com.apple.ResourceFork. Many of these are now used by macOS for security and privacy protection, but the user can add xattrs containing copyright information, names of creators, an arbitrary description, a text headline, and others. Anyone can define their own type of xattr, and some apps make good use of them for storing metadata.

Their main disadvantages are:

  • Xattrs rarely transfer to other platforms, making most Mac-only.
  • They’re commonly stripped when transferred even between Macs, or when shared in iCloud Drive. Apple has a system of tags to determine which xattrs should be stripped and which retained, but those aren’t as widely used as they deserve.
  • Most xattrs aren’t shown by the Finder, either in Preview panes or in Get Info dialogs.

For largely historical reasons, even Apple doesn’t take fullest advantage of xattrs. For example, Finder Comments, which are shown in Get Info, are primarily stored in a folder’s hidden .DS_Store file and only secondarily in a com.apple.metadata:kMDItemFinderComment xattr.

Embedded metadata

Because so few file systems use extended attributes or their equivalents, most file-specific metadata is now embedded in file data. For some file formats, such as those widely used by word processors and spreadsheets, this is relatively straightforward, particularly when using XML-based formats.

It becomes more complicated and less reliable when used with images, as in Exchangeable Image File Format, EXIF. Although usually treated as a metadata standard, in fact EXIF encompasses both data and metadata formats embedded in a single file for convenience.

EXIF metadata can include camera settings such as aperture and shutter speed, image metrics such as colourspace, date and time of creation, location and copyright information, and a thumbnail version of the image (which is arguably data rather than metadata). How the metadata is embedded with data is determined by the format of the image data. For JPEG data, EXIF metadata is stored in an Application Segment of the image, but for TIFF data there’s a sub-image file directory that can spread the metadata anywhere within the data.
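To show how the JPEG case works, here is a minimal Python sketch that walks the segments at the start of a JPEG and reports where the EXIF payload sits. It relies on two details not spelled out above: the Application Segment used is APP1 (marker bytes FF E1), and its body begins with the six bytes `Exif` followed by two zero bytes. The function name is hypothetical. It stops at the start of the compressed image data and doesn't parse the TIFF-style directory inside the payload.

```python
import struct
from typing import Optional, Tuple

EXIF_HEADER = b"Exif\x00\x00"


def find_exif_segment(jpeg: bytes) -> Optional[Tuple[int, int]]:
    # Returns (offset, length) of the EXIF payload that follows the
    # six-byte header, or None if no EXIF segment precedes the image data.
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing Start Of Image marker")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            return None  # lost sync with the segment structure
        marker = jpeg[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # Start Of Scan or End Of Image
            return None
        # The big-endian length counts its own two bytes but not the marker.
        (seg_len,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if seg_len < 2:
            return None
        body_start = pos + 4
        body_end = pos + 2 + seg_len
        if marker == 0xE1 and jpeg[body_start:body_start + 6] == EXIF_HEADER:
            payload_start = body_start + 6
            return payload_start, body_end - payload_start
        pos = body_end
    return None
```

An editor that rewrites the file without copying this segment across will silently drop the EXIF metadata.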

The danger is that apps that edit data can inadvertently damage or remove the EXIF metadata, and that’s all too common among image editors that may need to rewrite the whole of the data when saving an edited image. Fortunately macOS relies on its own QuickLook thumbnails rather than embedded EXIF thumbnails, as some image editors don’t update the latter reliably.

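There are more subtle disadvantages to embedding metadata with data. In Apple’s preferred model, there are separate datestamps in file attributes for saved changes to data and to extended attributes, allowing them to be distinguished. When metadata is embedded, any change to it is also a change to the file’s data, so the two can no longer be told apart.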
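The separation of those two datestamps can be demonstrated with a small Python sketch. As Python has no portable way to write xattrs, it uses a change of permissions as a stand-in for an attribute change. That is my assumption for illustration, on the expectation that writing an xattr on macOS updates the same attribute-change datestamp. The function name is hypothetical.

```python
import os
import stat
import time


def timestamps_after_attribute_change(path: str) -> dict:
    # Note: this toggles the group-read permission bit on the file.
    before = os.stat(path)
    time.sleep(0.05)  # let the clock move on so any change is visible
    os.chmod(path, stat.S_IMODE(before.st_mode) ^ 0o040)
    after = os.stat(path)
    return {
        # Data modification time should be untouched by an attribute change.
        "data_time_changed": after.st_mtime_ns != before.st_mtime_ns,
        # Attribute modification time should have moved on.
        "attribute_time_changed": after.st_ctime_ns != before.st_ctime_ns,
    }
```

Rewriting a file to alter its embedded metadata would instead change the data modification time, just as an edit to its contents would.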

Summary

In macOS, metadata can be stored

  • in the file system as attributes,
  • as extended attributes,
  • embedded with the file data.

It’s hardly surprising how often it goes missing, or is overlooked.
