Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Using and troubleshooting Spotlight in Sequoia: summary

By: hoakley
29 November 2024 at 15:30

Over the last couple of weeks, I have looked at several aspects of Spotlight in macOS Sequoia. This article draws those together in a summary that I hope will prove a useful reference.

Spotlight and apps like HoudahSpot that work with it aren’t the only way to search for files and their contents. There are third-party alternatives, some of which can use Spotlight for some searches, while most can also search independently. These range from the free EasyFind, through FindAnyFile, to high-end FoxTrot Professional Search. If you want to know more about those, please consult their documentation.

How Spotlight works

At its heart, Spotlight consists of several components:

  • A file indexing system reliant on mdimporter modules supplied with macOS and third-party products, that enable mdworker processes to extract data from each file to be added to the index.
  • A hidden folder of indexes in .Spotlight-V100 at the top level of each volume that it indexes. These are compiled using the data extracted by mdworker processes during indexing, and maintained by processes including mds and mds_stores.
  • A search interface, including the Spotlight item in the menu bar, and the Finder’s Find command for its windows. Interfaces are also built into many apps.

There was a time when there was just one Spotlight, and searching in the Finder could reach almost any file on each indexed volume. Confusingly, different flavours of Spotlight have access to different sections of its indexes. One prime example is the search feature in Mail, which is the only way that you can search messages in that app, other than the Spotlight item in the menu bar. This is known as Core Spotlight, and prevents message contents from being found in the Finder, or in third-party apps like HoudahSpot. This operates independently of TCC privacy controls, and those can’t be used to give access to Core Spotlight indexes.

Recent versions of macOS can extract additional information from certain types of files such as images and PDFs. This may include text recognised within images using Live Text, and object recognition and classification, performed in the background by other services such as photoanalysisd. Those are slower and take significantly more processing, are normally run after traditional mdimporter extraction, and in some cases may take many days to complete. As some of that search content is only accessible to Core Spotlight, this is almost impossible to troubleshoot.

This is summarised, with pointers to common failures, in the diagram below.

spotlightsteps1

Importing to the indexes

To confirm that a file is being indexed correctly, there are three useful command tools:

  • mdimport -L lists known Spotlight mdimporter plugins.
  • mdimport -t -d3 [filepath] reports the mdimporter used for a file, and data extracted from that file.
  • mddiagnose -f [path] runs full Spotlight diagnostics.

My free Mints can also be used to investigate most Spotlight problems.

Exclusions from indexing

There have been three methods of excluding folders from indexing and search by naming, although only two of them still work reliably:

  • appending the extension .noindex to the folder name (this previously worked using .no_index instead);
  • making the folder invisible to the Finder by prefixing a dot ‘.’ to its name;
  • putting an empty file named .metadata_never_index inside the folder; that no longer works in recent macOS.

System Settings offers Spotlight Privacy settings in two sections. Items listed under Search results won’t normally prevent their indexing, but will block them from appearing in search results. Spotlight’s indexing exclusion list is accessed from the Search Privacy… button, where listed items won’t be indexed at all.

Re-indexing

In general use, the only way to force Spotlight to update or correct its indexes is to force them to be rebuilt. The only good indication for rebuilding Spotlight indexes on a volume is when they are known to be damaged or corrupted, in which case rebuilding offers the best chance of restoring normal search function to that volume.

To force a volume to be re-indexed, open Spotlight (or Siri & Spotlight) in System Settings and click the Search Privacy… or Spotlight Privacy… button at the bottom. Click the + button at the foot, select the volume and add it to the list, then click Done. Pause thirty seconds or so, click the Search Privacy… button again, select that volume in the list, and click the – button to remove it from the list. You don’t normally need to close System Settings or restart between adding and removing the volume.

If you prefer, you can instead use the mdutil command in Terminal. The command
mdutil -E /
erases the indexes on the Data volume and forces them to be rebuilt, and you can use the same option on other volumes given the correct path. Provided that you use the -E option, there’s no need to turn off indexing, or to turn it back on again, nor is it necessary to delete that volume’s hidden .Spotlight-V100 folder.

As Spotlight indexes are maintained and stored on each volume for its contents, you’ll need to include each volume on which you want to be able to search files by their contents. Unless you have been able to identify which volume’s indexes merit rebuilding, you may have to rebuild indexes on every mounted volume, which can take many hours or days to complete.

The simplest way to check that re-indexing is taking place is to open Activity Monitor, and in its CPU view check that processes with names starting with md are taking plenty of CPU. These should include mds_stores, mdworker (often multiple copies) and mds itself. On Apple silicon, those processes run almost exclusively on E cores, and are usually obvious in Activity Monitor’s CPU History window.

Although the Spotlight Search item opened from the menu bar may show indexing progress, that doesn’t cover all indexing, and isn’t a reliable indication of whether indexing or re-indexing is complete.

Indexing takes excessive time

This isn’t uncommon after mounting an external volume, or following a macOS upgrade. On Apple silicon Macs, it can result in prolonged and apparently intensive activity on the E cores. On all Macs, it may prevent a volume from being unmounted for many hours. Although some claim that forcing re-indexing can address this, that could instead take even longer to complete.

If an external disk needs to be disconnected, try the force-eject action offered by the Finder, then restart the Mac to ensure any further indexing activity is terminated. It may prove simpler to shut the Mac down, disconnect the external disk, and start it up again.

Failure to find

If a Spotlight search is unable subsequently to find known contents of a file, then one of the following has occurred:

  • The contents of that file weren’t correctly indexed, most commonly because that file is in a location excluded from indexing.
  • Although indexes for that volume contain data correct for the file, Spotlight is withholding information about it from search results.

When a file is in a folder that should be within the scope of a regular global Spotlight search, not just app-specific Core Spotlight, but can’t be found, try relocating the file to a path such as ~/Documents that should be in scope and repeat the search. This can be done using the test files provided by Mints, for example, although Mints’ Spotlight search only encompasses the active Data volume. To test other volumes including those on external disks, use the Finder’s Find.

When a file is moved within the same volume, Spotlight normally doesn’t re-index it, but records its new path. One useful technique is to place the files in a folder that is known to be indexed, then move them to another location. Spotlight in Sequoia (at least) withholds the results of searches in Library folders and their contents, except for those in /Library/Application Support.

iCloud Drive and network shares

The whole contents of iCloud Drive should be fully accessible to local Spotlight provided that files are downloaded locally at the time of the search. All files that have been evicted from local storage, and have been made dataless as a result, become inaccessible to Spotlight until they have been downloaded again. If you want to ensure that any file in iCloud Drive remains searchable, set it to be kept downloaded or pinned using the contextual menu’s Keep Downloaded command.

Shared volumes and folders using SMB are unlikely to be accessible to local Spotlight search. Although some have claimed to be able to see items on such shares, most can’t, and there seems to be no reliable way of enabling this in Sequoia.

How to tell if Spotlight has indexed a file correctly

By: hoakley
22 November 2024 at 15:30

When you search but Spotlight doesn’t find what you’re looking for, you might wonder whether that’s because the file containing it hasn’t been indexed correctly. This article suggests some methods you can use to investigate this.

Spotlight relies on hidden indexes it maintains at the top level of each volume, in the .Spotlight-V100 folder. When an app or process creates a new file or changes the contents of an existing file, that’s recorded in the volume’s FSEvents database. If that file is in a location that isn’t excluded from Spotlight’s index, an mdworker process should then start adding its contents to the volume indexes. To do this, it first checks what type of file it is, by its UTI. Spotlight then looks up the correct mdimporter for that type, and uses that to generate data that Spotlight adds to that volume’s indexes as a set of attributes.

Recent versions of macOS can extract additional information from certain types of files such as images and PDFs. This includes text recognised within images using Live Text, and object recognition and classification, performed in the background by other services such as photoanalysisd. Those are slower and take significantly more processing, and are normally run after traditional mdimporter extraction.

mdimporter plugins

For many file types, these are provided as part of macOS and stored in the system, in /System/Library/Spotlight on the SSV. Importers for third-party apps may be in /Library/Spotlight or ~/Library/Spotlight, or in the /Library/Spotlight folder inside the app bundle. To discover all mdimporter plugins currently installed, use the command
mdimport -L
although that doesn’t inform you which file types they support.

To get more information, identify a file of the correct document type, normally with that type’s standard filename extension, and use mdimport to discover which mdimporter is used for that type, and what it can extract from that specific file, in a command of the form
mdimport -t -d3 [filepath]
where [filepath] is the path of that file. You can vary the digit used in the -d3 option from 1-3, with larger numbers delivering more detail. For -d3 you’ll first be given the file type and mdimporter used, following which all the data extracted is given according to its Spotlight attributes:
Imported '/Users/hoakley/Documents/SpotTestA.rtf' of type 'public.rtf' with plugIn /System/Library/Spotlight/RichText.mdimporter.
37 attributes returned

followed by a long list.

If you’re still in the dark, the nuclear option is to perform a diagnostic dump similar to sysdiagnose but concentrating on Spotlight, using the command
mddiagnose -f [path]
to write the diagnostic output to the specified path. Further details are given in man mddiagnose.

Mints

My free utility Mints has a whole section devoted to testing and observing Spotlight. This includes creation of a folder of nine test files in the user’s Home folder, carrying out a test search that should find those test files, custom log analysis, and cleaning up test files afterwards. This is all fully documented in the Spotlight check and test section of its Help book.

To start the test, click on the Create Test button at a known clock time, such as 12:30:00. Wait at least ten seconds, say until 12:30:10, then click the Check Search button. Give your Mac a good 20 seconds to complete that, then open the Log Window and view entries from 12:30:00 for a period of 30 seconds, to cover both the indexing of the test files and the search. Once you’re happy that all is well, you can delete the test files.

Test files include:

  • RTF rich text
  • PDF with embedded text
  • plain UTF-8 text
  • HTML
  • Microsoft Word DOCX
  • JPEG image with the search term in its EXIF metadata
  • Numbers spreadsheet
  • Pages document
  • plain UTF-8 text with the search term in a kMDItemKeywords extended attribute.

You can also customise the test files with other formats, provided that they contain the search text syzygy999 that Mints will look for.

Exclusions

Perhaps the most common reason for Spotlight not indexing a file’s contents is that it has been excluded from doing so. This is always worth checking before making the assumption the problem is in Spotlight.

There have been three general methods of excluding folders from indexing and search, although only two of them still work reliably:

  • appending the extension .noindex to the folder name (this previously worked using .no_index instead);
  • making the folder invisible to the Finder by prefixing a dot ‘.’ to its name;
  • putting an empty file named .metadata_never_index inside the folder; that no longer works in recent macOS.

System Settings offers Spotlight Privacy settings in two sections. Items listed under Search results won’t normally prevent indexing of those items, but will block them from appearing in search results. Spotlight’s indexing exclusion list is accessed from the Search Privacy… button, where listed items won’t be indexed at all.

Summary of useful commands

  • mdimport -L lists known Spotlight mdimporter plugins.
  • mdimport -t -d3 [filepath] reports the mdimporter used for a file, and data extracted from that file.
  • mddiagnose -f [path] runs full Spotlight diagnostics.

❌
❌