Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Using and troubleshooting Spotlight in Sequoia: summary

By: hoakley
29 November 2024 at 15:30

Over the last couple of weeks, I have looked at several aspects of Spotlight in macOS Sequoia. This article draws those together in a summary that I hope will prove a useful reference.

Spotlight and apps like HoudahSpot that work with it aren’t the only way to search for files and their contents. There are third-party alternatives, some of which can use Spotlight for some searches, while most can also search independently. These range from the free EasyFind, through FindAnyFile, to high-end FoxTrot Professional Search. If you want to know more about those, please consult their documentation.

How Spotlight works

At its heart, Spotlight consists of several components:

  • A file indexing system reliant on mdimporter modules supplied with macOS and third-party products, that enable mdworker processes to extract data from each file to be added to the index.
  • A hidden folder of indexes in .Spotlight-V100 at the top level of each volume that it indexes. These are compiled using the data extracted by mdworker processes during indexing, and maintained by processes including mds and mds_stores.
  • A search interface, including the Spotlight item in the menu bar, and the Finder’s Find command for its windows. Interfaces are also built into many apps.

There was a time when there was just one Spotlight, and searching in the Finder could reach almost any file on each indexed volume. Confusingly, different flavours of Spotlight have access to different sections of its indexes. One prime example is the search feature in Mail, which is the only way that you can search messages in that app, other than the Spotlight item in the menu bar. This is known as Core Spotlight, and prevents message contents from being found in the Finder, or in third-party apps like HoudahSpot. This operates independently of TCC privacy controls, and those can’t be used to give access to Core Spotlight indexes.

Recent versions of macOS can extract additional information from certain types of files such as images and PDFs. This may include text recognised within images using Live Text, and object recognition and classification, performed in the background by other services such as photoanalysisd. Those are slower and take significantly more processing, are normally run after traditional mdimporter extraction, and in some cases may take many days to complete. As some of that search content is only accessible to Core Spotlight, this is almost impossible to troubleshoot.

This is summarised, with pointers to common failures, in the diagram below.

spotlightsteps1

Importing to the indexes

To confirm that a file is being indexed correctly, there are three useful command tools:

  • mdimport -L lists known Spotlight mdimporter plugins.
  • mdimport -t -d3 [filepath] reports the mdimporter used for a file, and data extracted from that file.
  • mddiagnose -f [path] runs full Spotlight diagnostics.

My free Mints can also be used to investigate most Spotlight problems.

Exclusions from indexing

There have been three methods of excluding folders from indexing and search by naming, although only two of them still work reliably:

  • appending the extension .noindex to the folder name (this previously worked using .no_index instead);
  • making the folder invisible to the Finder by prefixing a dot ‘.’ to its name;
  • putting an empty file named .metadata_never_index inside the folder; that no longer works in recent macOS.

System Settings offers Spotlight Privacy settings in two sections. Items listed under Search results won’t normally prevent their indexing, but will block them from appearing in search results. Spotlight’s indexing exclusion list is accessed from the Search Privacy… button, where listed items won’t be indexed at all.

Re-indexing

In general use, the only way to force Spotlight to update or correct its indexes is to force them to be rebuilt. The only good indication for rebuilding Spotlight indexes on a volume is when they are known to be damaged or corrupted, in which case rebuilding offers the best chance of restoring normal search function to that volume.

To force a volume to be re-indexed, open Spotlight (or Siri & Spotlight) in System Settings and click the Search Privacy… or Spotlight Privacy… button at the bottom. Click the + button at the foot, select the volume and add it to the list, then click Done. Pause thirty seconds or so, click the Search Privacy… button again, select that volume in the list, and click the – button to remove it from the list. You don’t normally need to close System Settings or restart between adding and removing the volume.

If you prefer, you can instead use the mdutil command in Terminal. The command
mdutil -E /
erases the indexes on the Data volume and forces them to be rebuilt, and you can use the same option on other volumes given the correct path. Provided that you use the -E option, there’s no need to turn off indexing, or to turn it back on again, nor is it necessary to delete that volume’s hidden .Spotlight-V100 folder.

As Spotlight indexes are maintained and stored on each volume for its contents, you’ll need to include each volume on which you want to be able to search files by their contents. Unless you have been able to identify which volume’s indexes merit rebuilding, you may have to rebuild indexes on every mounted volume, which can take many hours or days to complete.

The simplest way to check that re-indexing is taking place is to open Activity Monitor, and in its CPU view check that processes with names starting with md are taking plenty of CPU. These should include mds_stores, mdworker (often multiple copies) and mds itself. On Apple silicon, those processes run almost exclusively on E cores, and are usually obvious in Activity Monitor’s CPU History window.

Although the Spotlight Search item opened from the menu bar may show indexing progress, that doesn’t cover all indexing, and isn’t a reliable indication of whether indexing or re-indexing is complete.

Indexing takes excessive time

This isn’t uncommon after mounting an external volume, or following a macOS upgrade. On Apple silicon Macs, it can result in prolonged and apparently intensive activity on the E cores. On all Macs, it may prevent a volume from being unmounted for many hours. Although some claim that forcing re-indexing can address this, that could instead take even longer to complete.

If an external disk needs to be disconnected, try the force-eject action offered by the Finder, then restart the Mac to ensure any further indexing activity is terminated. It may prove simpler to shut the Mac down, disconnect the external disk, and start it up again.

Failure to find

If a Spotlight search is unable subsequently to find known contents of a file, then one of the following has occurred:

  • The contents of that file weren’t correctly indexed, most commonly because that file is in a location excluded from indexing.
  • Although indexes for that volume contain data correct for the file, Spotlight is withholding information about it from search results.

When a file is in a folder that should be within the scope of a regular global Spotlight search, not just app-specific Core Spotlight, but can’t be found, try relocating the file to a path such as ~/Documents that should be in scope and repeat the search. This can be done using the test files provided by Mints, for example, although Mints’ Spotlight search only encompasses the active Data volume. To test other volumes including those on external disks, use the Finder’s Find.

When a file is moved within the same volume, Spotlight normally doesn’t re-index it, but records its new path. One useful technique is to place the files in a folder that is known to be indexed, then move them to another location. Spotlight in Sequoia (at least) withholds the results of searches in Library folders and their contents, except for those in /Library/Application Support.

iCloud Drive and network shares

The whole contents of iCloud Drive should be fully accessible to local Spotlight provided that files are downloaded locally at the time of the search. All files that have been evicted from local storage, and have been made dataless as a result, become inaccessible to Spotlight until they have been downloaded again. If you want to ensure that any file in iCloud Drive remains searchable, set it to be kept downloaded or pinned using the contextual menu’s Keep Downloaded command.

Shared volumes and folders using SMB are unlikely to be accessible to local Spotlight search. Although some have claimed to be able to see items on such shares, most can’t, and there seems to be no reliable way of enabling this in Sequoia.

Planning complex Time Machine backups for efficiency

By: hoakley
28 October 2024 at 15:30

Time Machine (TM) has evolved to be a good general-purpose backup utility that makes best use of APFS backup storage. However, it does have some quirks, and offers limited controls, that can make it tricky to use with more complex setups. Over the last few weeks I’ve had several questions from those trying to use TM in more demanding circumstances. This article explains how you can design volume layout and backup exclusions for the most efficient backups in such cases.

How TM backs up

To decide how to solve these problems, it’s essential to understand how TM makes an automatic backup. In other articles here I have provided full details, so here I’ll outline the major steps and how they link to efficiency.

At the start of each automatic backup, TM checks to see if it’s rotating backups across more than one backup store. This is an unusual but potentially invaluable feature that can be used when you make backups in multiple locations, or want added redundancy with two or more backup stores.

Having selected the backup destination, it removes any local snapshots from the volumes to be backed up that were made more than 24 hours ago. It then creates a fresh snapshot on each of those volumes. I’ll consider these later.

Current versions of TM normally don’t use those local snapshots to work out what needs to be backed up from each volume, but (after the initial full backup) should rely on that volume’s record of changes to its file system, FSEvents. These observe two lists of exclusions: those fixed by TM and macOS, including the hidden version database on each volume and recognised temporary files, and those set by the user in TM settings. Among the latter should be any very large bundles and folders containing huge numbers of small files, such as the Xcode app, as they will back up exceedingly slowly even to fast local backup storage, and can tie up a network backup for many hours. It’s faster to reinstall Xcode rather than restore it from a backup.

Current TM backups are highly efficient, as TM can copy just the blocks that have changed; older versions of TM backing up to HFS+ could only copy whole files. However, that can be impaired by apps that rewrite the whole of each large file when saving. Because the backup is being made to APFS, TM ensures that any sparse files are preserved, and handles clone files as efficiently as possible.

Once the backup has been written, TM then maintains old backups, to retain:

  • hourly backups for the last 24 hours, to accompany hourly local snapshots,
  • daily backups over the previous month,
  • weekly backups stretching back to the start of the current backup series.

These are summarised in the diagram below.

tmseqoutline1

Local snapshots

TM makes two types of snapshot: on each volume it’s set to back up, it makes a local snapshot immediately before each backup, then deletes that after 24 hours; on the backup storage, it turns each backup into a snapshot from which you can restore backed up files, and those are retained as stated above.

APFS snapshots, including TM local snapshots, include the whole of a volume, without any exceptions or exclusions, which can have surprising effects. For example, a TM exclusion list might block backing up of large virtual machine files resulting in typical backups only requiring 1-2 GB of backup storage, but because those VMs change a lot, each local snapshot could require 25 GB or more of space on the volume being backed up. One way to assess this is to check through each volume’s TM exclusion list and assess whether items being excluded are likely to change much. If they are, then they should be moved to a separate volume that isn’t backed up by TM, thus won’t have hourly snapshots.

Some workflows and apps generate very large working files that you may not want to clutter up either TM backups or local snapshots. Many apps designed to work with such large files provide options to relocate the folders used to store static libraries and working files. If necessary, create a new volume that’s excluded completely from TM backups to ensure those libraries and working files aren’t included in snapshots or backups.

TM can’t run multiple backup configurations with different sets of exclusions, though. If you need to do that, for instance to make a single nightly backup of working files, then do so using a third-party utility in addition to your hourly TM backups.

This can make a huge difference to free space on volumes being backed up, as the size of each snapshot can be multiplied by 24 as TM will try to retain each hourly snapshot for the last 24 hours.

Macs that aren’t able to make backups every hour can also accrue large snapshots, as they may retain older snapshots, that will only grow larger over time as that volume changes from the time that snapshot was made.

While snapshots are a useful feature of TM, the user has no control over them, and can’t shorten their period of retention or turn them off altogether. Third-party backup utilities like Carbon Copy Cloner can, and may be more suitable when local snapshots can’t be managed more efficiently.

iCloud Drive

Like all backup utilities, TM can only back up files that are in iCloud Drive when they’re downloaded to local storage. Although some third-party utilities can work through your iCloud Drive files downloading them automatically as needed, TM can’t do that, and will only back up files that are downloaded at the time that it makes a backup.

There are two ways to ensure files stored in iCloud Drive will be backed up: either turn Optimise Mac Storage off (in Sonoma and later), or download the files you want backed up and ‘pin’ them to ensure they can’t be removed from local storage (in Sequoia). You can pin individual files or whole folders and their entire contents by selecting the item, Control-click for the contextual menu, and selecting the Keep Downloaded menu command.

Key points

  • Rotate through 2 or more backup stores to handle different locations, or for redundancy.
  • Back up APFS volumes to APFS backup storage.
  • Exclude all non-essential files, and bundles containing large numbers of small files, such as Xcode.
  • Watch for apps that make whole-file changes, thus increasing snapshot and backup size.
  • Store large files on volumes not being backed up to minimise local snapshot size.
  • If you need multiple backup settings, use a third-party utility in addition to TM.
  • To ensure iCloud Drive files are backed up, either turn off Optimise Mac Storage (Sonoma and later), or pin essential files (Sequoia).

Further reading

Time Machine in Sonoma: strengths and weaknesses
Time Machine in Sonoma: how to work around its weaknesses
Understand and check Time Machine backups to APFS
Excluding folders and files from Time Machine, Spotlight, and iCloud Drive

Last Week on My Mac: the Finder is growing less consistent

By: hoakley
6 October 2024 at 15:00

Over the last forty years, one of the essential virtues of the Mac’s interface has been its consistency. From the outset, Apple has ensured the Finder behaves according to a small set of rules that are readily learned, often without conscious effort. These form a grammar grounded on those of languages like English, of subject-verb-object. Select a file, do something with it, and it changes state.

Drag a file from its current folder and drop it onto another, and it’s moved or copied from one to the other. The outcome depends on context, where performing that action within the same storage volume results in the file being moved, but between folders in different volumes it’s copied instead. Folders play two roles: as enclosing elements, and as objects in their own right. Drag a folder from one place to another, and you expect its entire contents to go with it, but apply a tag to a folder, and only that folder gains the tag.

iCloud Drive has never quite conformed to the same grammar. Change its mode by enabling the Optimise Mac Storage setting and it stops behaving so consistently. Files that appear to be present turn out to lack data content, and have to be marked as evicted exceptions. Folder commands to control download and eviction depart further from established norms. Select all, and if there are more than ten items selected, Remove Downloads is no longer available in the contextual menu. Can the Finder no longer cope with certain actions when they’re applied simultaneously to more items than we have fingers?

There is a workaround, though: select just the enclosing folder, and Remove Downloads on that. If you then want to download individual items within that evicted folder, you do so by selecting them and using Download Now, which oddly doesn’t seem constrained to the same limit of ten. You can thus have a hybrid folder, with some items downloaded, others evicted, and the folder itself continues to display the icon indicating its contents remain evicted.

Now consider Sequoia’s new pinning behaviour. Pin a folder and its contents are shown as being pinned, as you’d expect. Select one or two of those individual files, though, and there’s no option offered to unpin them. The only way to do that is to unpin the whole folder, then pin individual items within that folder. But like the Remove Downloads command, that too won’t work for more than ten items at once. Move a new item inside a pinned folder, and it automatically becomes pinned, and can’t be individually unpinned as long as its enclosing folder remains pinned.

If you have a folder of 100 files, and want to pin all of them bar one, this becomes laborious. If you pin the folder, you can’t unpin that one file, so you must leave the folder unpinned and instead pin the individual files inside it, no more than ten files at time.

Against the Finder’s consistent model, eviction and downloading are well-behaved, as you can

  1. select the folder, and apply the action to it,
  2. select the exception(s), then apply the inverse action to them,

requiring just two selections and two actions.

On the other hand, pinning behaves inconsistently, as you must:

  1. select a group of no more than ten files, apply the action to them,
  2. repeat that as many times as necessary, until only the files you want pinned have been pinned.

Consistency is often taken as a mark of reliability. iCloud Drive comes from a position of lower reliability, and inconsistencies in its human interface only serve to reduce the user’s trust.

Explaining this idiosyncratic behaviour is more important still. Instead of being designed for the human, its interface has been determined by its engineering, in a step back to computers that preceded the Mac. This reflects an API that is every bit as awkward as its human interface. In common with other properties of files in iCloud Drive, developers are prevented from discovering how iCloud Drive handles them. For example, Apple makes checking whether a file is evicted as difficult as possible, as there’s no property or attribute that records that.

Pinning is similarly convoluted, and currently undocumented. When pinned individually, files are given a distinctive extended attribute, but when pinned by virtue of any of their enclosing folders being pinned, that xattr is attached not to the files, but to the folder. To check whether any given file in iCloud Drive is pinned, an app therefore has first to check whether that file has the xattr, then to traverse its path upward and look for the xattr on every folder in that path.

Thus the human interface for pinning is determined not by the Finder’s consistent grammar, but by the implementation of this feature in iCloud Drive’s virtual file system. The sooner consistency is restored, the better for the Finder and its users.

Pinning iCloud Drive in Sequoia is bizarre, and an update to Cirrus

By: hoakley
2 October 2024 at 14:30

As I hinted on Monday, I’ve been updating my free iCloud Drive utility Cirrus to work with pinning files and folders in macOS Sequoia. It’s only when delving into the depths of iCloud Drive and pinning that you realise how bizarre it is, and how its interface can be intensely frustrating. This article starts with a brief practical demonstration of the oddities of this alien interface before trying to explain it, and ends with the new version of Cirrus.

Pinning frustration

To perform this brief demonstration, all you need is iCloud Drive with Optimise Mac Storage enabled, running in macOS Sequoia. Although it’s entirely non-destructive, you might like to try this using scrap files that you can trash in frustration at the end.

Create a new folder in iCloud Drive, give it a suitable name, and copy 12-18 files into it. Wait until they have all synced up to iCloud, then select just one of them and open the Finder’s contextual menu using Control-click, noting the new command Keep Downloaded that would pin that file to ensure it isn’t evicted from local storage.

Now press Command-A to select all those files in that folder, and open the contextual menu again. Notice that both the Keep Downloaded and Remove Download commands are now absent from the menu, and the only option offered is Download Now, even though although all those files are still downloaded.

Now select just a few of those files, ten or less, and the contextual menu offers Keep Downloaded and Remove Download commands again.

Rather than pinning these files piecemeal, pin their parent folder instead, using the contextual menu. A second or so after the folder shows the pinned icon, it’s shown against all the files inside that folder. So pinning a folder is simpler than trying to pin the files inside it, at least until you want to unpin any of them: select one of the files inside the pinned folder, open the contextual menu, and you’ll see there’s nothing there about Keep Downloaded, just another command to Show Downloaded Folder instead. Use that, and the Finder will then select the pinned folder for you (note, if you try this in Files in iOS, it doesn’t always work!).

The lesson is that, before you can unpin any of the files in that folder, you have to unpin the folder, which in turn unpins every file (and folder) within that folder, leaving you to manually pin the files you do want to pin, in batches of no more than ten at a time.

If you want further fun, with the parent folder still pinned, create a new folder inside that, and copy a file into that enclosed folder. You’ll see that, whatever you might want, everything that goes into that pinned folder is also pinned now, but can only be unpinned by unpinning everything inside the folder.

Pinning files and folders

The reason for this bizarre and annoying interface is the way that pinning is implemented.

When you pin files individually or in groups of up to ten, each file gains its own pinning extended attribute, of com.apple.fileprovider.pinned. But when you pin a folder, only that folder gains the extended attribute, none of the files or folders within it. The whole folder and the paths within it are designated as being pinned. And, as far as I can tell from the absence of any better information in Apple’s missing documentation, there’s no single method to determine whether a file in iCloud Drive is pinned.

Instead, you have to both

  • look for the extended attribute attached to the file, and
  • check all the folders in its path to determine if any of them has the extended attribute, which would then pin everything in their path.

Apple doesn’t document any file or URL attribute that can be used to determine whether a file or folder is pinned.

Cirrus 1.15

I had hoped that this new version of Cirrus, my free utility for exploring, working with and testing iCloud Drive, would be able to identify individual files and folders that are pinned in Sequoia. Because of the lack of any attribute to help this, I have temporarily abandoned my attempts.

What Cirrus now does, though, is performs deep scans of folders in iCloud Drive and reports which files and folders are pinned, giving individual file sizes, and a total file size and number of pinned files. This enables you to check where all your pinned files and folders are, and the pinning burden of iCloud Drive.

cirrus115

If you use pinning much, this is surely essential housekeeping.

Cirrus version 1.15 requires Big Sur 11.5, but its new feature requires macOS 15 Sequoia anyway, and is available from here: cirrus115
from Downloads above, from its Product Page, and through its auto-update mechanism. I hope it makes your pinning more painless.

❌
❌