
Dropped From Spending Bill, Cancer Research and D.C. Stadium Measures Revived by Senate

By: Minho Kim
22 December 2024 at 07:08
Two bills on pediatric cancer research and a football stadium site had been left out of the main spending package, but passed early Saturday as separate legislation.


Using and troubleshooting Spotlight in Sequoia: summary

By: hoakley
29 November 2024 at 15:30

Over the last couple of weeks, I have looked at several aspects of Spotlight in macOS Sequoia. This article draws those together in a summary that I hope will prove a useful reference.

Spotlight and apps like HoudahSpot that work with it aren’t the only way to search for files and their contents. There are third-party alternatives, some of which can use Spotlight for some searches, while most can also search independently. These range from the free EasyFind, through FindAnyFile, to high-end FoxTrot Professional Search. If you want to know more about those, please consult their documentation.

How Spotlight works

At its heart, Spotlight consists of several components:

  • A file indexing system reliant on mdimporter modules supplied with macOS and third-party products, that enable mdworker processes to extract data from each file to be added to the index.
  • A hidden folder of indexes in .Spotlight-V100 at the top level of each volume that it indexes. These are compiled using the data extracted by mdworker processes during indexing, and maintained by processes including mds and mds_stores.
  • A search interface, including the Spotlight item in the menu bar, and the Finder’s Find command for its windows. Interfaces are also built into many apps.

There was a time when there was just one Spotlight, and searching in the Finder could reach almost any file on each indexed volume. Confusingly, different flavours of Spotlight now have access to different sections of its indexes. One prime example is the search feature in Mail, which is the only way that you can search messages in that app, other than the Spotlight item in the menu bar. This is known as Core Spotlight, and prevents message contents from being found in the Finder, or in third-party apps like HoudahSpot. Core Spotlight operates independently of TCC privacy controls, and those can’t be used to give access to its indexes.

Recent versions of macOS can extract additional information from certain types of files such as images and PDFs. This may include text recognised within images using Live Text, and object recognition and classification, performed in the background by other services such as photoanalysisd. Those are slower and take significantly more processing, are normally run after traditional mdimporter extraction, and in some cases may take many days to complete. As some of that search content is only accessible to Core Spotlight, this is almost impossible to troubleshoot.

This is summarised, with pointers to common failures, in the diagram below.

[Diagram: the steps in Spotlight indexing and search, with pointers to common failures]

Importing to the indexes

To confirm that a file is being indexed correctly, there are three useful command-line tools:

  • mdimport -L lists known Spotlight mdimporter plugins.
  • mdimport -t -d3 [filepath] reports the mdimporter used for a file, and data extracted from that file.
  • mddiagnose -f [path] runs full Spotlight diagnostics.
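As a rough illustration of how the second of those might be wrapped for repeated use, here is a minimal shell sketch. The function name check_import is hypothetical; only the mdimport options are taken from the list above.

```shell
#!/bin/sh
# Sketch: report the importer used for a file and the data extracted from it.
# check_import is a hypothetical helper name, not part of macOS.
check_import() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    # -t requests a test import; -d3 prints the summary and extracted attributes.
    mdimport -t -d3 "$file"
}
```

Checking that the file exists first avoids confusing a mistyped path with an importer failure.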

My free Mints can also be used to investigate most Spotlight problems.

Exclusions from indexing

There have been three methods of excluding folders from indexing and search by naming, although only two of them still work reliably:

  • appending the extension .noindex to the folder name (this previously worked using .no_index instead);
  • making the folder invisible to the Finder by prefixing a dot ‘.’ to its name;
  • putting an empty file named .metadata_never_index inside the folder; that no longer works in recent macOS.
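The two naming rules that still work can be checked against a folder path with a few lines of shell. This is only a sketch: excluded_by_name is a hypothetical helper that inspects the path string alone, and it knows nothing of Spotlight’s own exclusion list in System Settings.

```shell
#!/bin/sh
# Sketch: does any component of a folder path match the naming rules above?
# excluded_by_name is a hypothetical helper; it only examines the path text.
excluded_by_name() {
    path="$1"
    old_ifs="$IFS"
    IFS='/'
    set -f              # no filename expansion while splitting
    set -- $path        # split the path into its components on "/"
    set +f
    IFS="$old_ifs"
    for part in "$@"; do
        case "$part" in
            .|..)      continue ;;
            *.noindex) echo "excluded: $part ends in .noindex"; return 0 ;;
            .?*)       echo "excluded: $part is hidden by a leading dot"; return 0 ;;
        esac
    done
    echo "not excluded by name"
    return 1
}
```

For the hypothetical path /Users/me/Cache.noindex/sub, this reports the Cache.noindex component and returns success; for /Users/me/Documents it returns failure.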

System Settings offers Spotlight Privacy settings in two sections. Categories deselected under Search results are normally still indexed, but are blocked from appearing in search results. Spotlight’s indexing exclusion list is accessed from the Search Privacy… button, where listed items won’t be indexed at all.

Re-indexing

In general use, the only way to force Spotlight to update or correct its indexes is to force them to be rebuilt. The only good indication for rebuilding Spotlight indexes on a volume is when they are known to be damaged or corrupted, in which case rebuilding offers the best chance of restoring normal search function to that volume.

To force a volume to be re-indexed, open Spotlight (or Siri & Spotlight) in System Settings and click the Search Privacy… or Spotlight Privacy… button at the bottom. Click the + button at the foot, select the volume and add it to the list, then click Done. Pause thirty seconds or so, click the Search Privacy… button again, select that volume in the list, and click the – button to remove it from the list. You don’t normally need to close System Settings or restart between adding and removing the volume.

If you prefer, you can instead use the mdutil command in Terminal. The command
mdutil -E /
erases the indexes on the Data volume and forces them to be rebuilt, and you can use the same option on other volumes given the correct path. Provided that you use the -E option, there’s no need to turn off indexing, or to turn it back on again, nor is it necessary to delete that volume’s hidden .Spotlight-V100 folder.
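A minimal sketch of that for an arbitrary volume might look like the following. The function name rebuild_index is hypothetical, and showing the status first with mdutil -s is my addition rather than a required step; erasing indexes normally needs root privileges.

```shell
#!/bin/sh
# Sketch: show indexing status for a volume, then erase its indexes
# so that they are rebuilt. rebuild_index is a hypothetical helper name;
# run it with root privileges, and pass the volume's mount path.
rebuild_index() {
    volume="${1:-/}"
    if [ ! -d "$volume" ]; then
        echo "not a mounted volume path: $volume" >&2
        return 1
    fi
    mdutil -s "$volume"    # display the indexing status of the volume
    mdutil -E "$volume"    # erase the local store, forcing a rebuild
}
```

For an external disk mounted at the hypothetical path /Volumes/External, that amounts to running mdutil -s /Volumes/External followed by mdutil -E /Volumes/External.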

As Spotlight indexes are maintained and stored on each volume for its contents, you’ll need to include each volume on which you want to be able to search files by their contents. Unless you have been able to identify which volume’s indexes merit rebuilding, you may have to rebuild indexes on every mounted volume, which can take many hours or days to complete.

The simplest way to check that re-indexing is taking place is to open Activity Monitor, and in its CPU view check that processes with names starting with md are taking plenty of CPU. These should include mds_stores, mdworker (often multiple copies) and mds itself. On Apple silicon, those processes run almost exclusively on E cores, and are usually obvious in Activity Monitor’s CPU History window.
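If you’d rather check from Terminal, the same idea can be sketched by totalling the CPU share of processes whose names start with md. The helper name sum_md_cpu is hypothetical, and the ps invocation in the comment is an assumption about how you’d feed it on a Mac.

```shell
#!/bin/sh
# Sketch: total the CPU share of processes whose names start with "md".
# Reads lines of "<%cpu> <command>" on standard input.
# sum_md_cpu is a hypothetical helper name.
sum_md_cpu() {
    awk '$2 ~ /^md/ { total += $1; n++ }
         END { printf "%d md processes using %.1f%% CPU\n", n, total }'
}

# Intended use on a Mac (not run here): -c limits the command column
# to the executable name, so "mds_stores" rather than its full path.
# ps -A -c -o pcpu,comm | sum_md_cpu
```

A single sample is only a snapshot, so repeating it every few seconds gives a better picture of sustained indexing activity.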

Although the Spotlight Search item opened from the menu bar may show indexing progress, that doesn’t cover all indexing, and isn’t a reliable indication of whether indexing or re-indexing is complete.

Indexing takes excessive time

This isn’t uncommon after mounting an external volume, or following a macOS upgrade. On Apple silicon Macs, it can result in prolonged and apparently intensive activity on the E cores. On all Macs, it may prevent a volume from being unmounted for many hours. Although some claim that forcing re-indexing can address this, that could instead take even longer to complete.

If an external disk needs to be disconnected, try the force-eject action offered by the Finder, then restart the Mac to ensure any further indexing activity is terminated. It may prove simpler to shut the Mac down, disconnect the external disk, and start it up again.

Failure to find

If a Spotlight search is unable subsequently to find known contents of a file, then one of the following has occurred:

  • The contents of that file weren’t correctly indexed, most commonly because that file is in a location excluded from indexing.
  • Although indexes for that volume contain data correct for the file, Spotlight is withholding information about it from search results.

When a file is in a folder that should be within the scope of a regular global Spotlight search, not just app-specific Core Spotlight, but can’t be found, try relocating the file to a path such as ~/Documents that should be in scope and repeat the search. This can be done using the test files provided by Mints, for example, although Mints’ Spotlight search only encompasses the active Data volume. To test other volumes including those on external disks, use the Finder’s Find.

When a file is moved within the same volume, Spotlight normally doesn’t re-index it, but records its new path. One useful technique is therefore to place test files in a folder that is known to be indexed, allow time for them to be added to the indexes, then move them to the location in question. If they can no longer be found there, their results are being withheld rather than the files being unindexed. Spotlight in Sequoia (at least) withholds the results of searches in Library folders and their contents, except for those in /Library/Application Support.

iCloud Drive and network shares

The whole contents of iCloud Drive should be fully accessible to local Spotlight provided that files are downloaded locally at the time of the search. All files that have been evicted from local storage, and have been made dataless as a result, become inaccessible to Spotlight until they have been downloaded again. If you want to ensure that any file in iCloud Drive remains searchable, set it to be kept downloaded or pinned using the contextual menu’s Keep Downloaded command.
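Whether a file has been evicted can be judged from its file flags. The sketch below assumes that the dataless flag SF_DATALESS has the value 0x40000000, as defined in sys/stat.h on macOS, and that stat -f %Xf prints a file’s flags in hexadecimal; has_dataless_flag is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: decide from a file's flags whether it is dataless (evicted).
# Assumption: SF_DATALESS is 0x40000000, as in <sys/stat.h> on macOS.
# has_dataless_flag is a hypothetical helper taking the flags as hex digits.
has_dataless_flag() {
    flags_hex="$1"
    [ $(( 0x$flags_hex & 0x40000000 )) -ne 0 ]
}

# Intended use on a Mac (not run here); stat -f %Xf prints the flags in hex:
# if has_dataless_flag "$(stat -f %Xf "$file")"; then echo "evicted: $file"; fi
```

Reading the flags this way doesn’t open the file, so it shouldn’t itself trigger a download.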

Shared volumes and folders using SMB are unlikely to be accessible to local Spotlight search. Although some have claimed to be able to see items on such shares, most can’t, and there seems to be no reliable way of enabling this in Sequoia.

Why can’t Spotlight find files in Library folders?

By: hoakley
26 November 2024 at 15:30

You probably don’t use Spotlight to search files likely to be in one of your Mac’s Library folders, but if you do you may have noticed that it hardly ever finds anything there. I’m grateful to Scott for pointing out that some folders inside those Library folders, like Preferences, can’t be searched at all. This article attempts to discover what’s going on.

How Spotlight finds

When files are created or changed, provided they’re in a location that it indexes, Spotlight’s mdworker processes index data about those files, including metadata about each file itself, such as its name, and about its contents, enabling you to search for terms found in the file’s data. If a Spotlight search is unable subsequently to find known contents of a file, then one of the following has occurred:

  • The contents of that file weren’t correctly indexed, most commonly because that file is in a location excluded from indexing, as set in Search Privacy… in Spotlight settings.
  • Although indexes for that volume contain data correct for the file, Spotlight is withholding information about it from search results, most commonly because that category of file (or folder) shouldn’t provide search results, according to Search results settings in Spotlight settings.

I’ll refer to these two causes as unindexed and withheld respectively.

Does Spotlight reindex files that have been moved?

Before trying to establish which folders Spotlight won’t return search results for, I needed to be able to distinguish between those two causes. One way to achieve that might be to ensure that test files were first indexed at a location that is reliably indexed, for which no search results are withheld, then move the test files to a different location. That relies on Spotlight retaining indexed data when files are moved, rather than reindexing them.

I therefore assembled a folder of test files, based on those created and used by Mints, with the addition of a property list containing the search term syzygy999. These were first placed at the top of the user’s Home folder, time allowed for Spotlight to add them to the volume’s indexes, and that was confirmed by performing Mints’ test search, which found them all. That folder was then moved to ~/Documents, with Mints given Full Disk Access to ensure search wouldn’t be blocked by TCC.

Inspection of the log for 30 seconds following the change of location demonstrated that no reindexing occurred, although Spotlight was informed of the changed paths for the files. Thus, any search failures are expected to result from results being withheld, rather than the files being unindexed.
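The move-then-search technique can be sketched in shell using mdfind, the command-line interface to Spotlight search, in place of Mints. The function name move_and_search is hypothetical, and it assumes that the folder has already been indexed at its original location.

```shell
#!/bin/sh
# Sketch: move an already-indexed folder, then search only its new location.
# move_and_search is a hypothetical helper; the search term must already be
# in the indexes from the folder's original, reliably indexed location.
move_and_search() {
    src="$1"; dest_dir="$2"; term="$3"
    mv "$src" "$dest_dir"/ || return 1
    moved="$dest_dir/$(basename "$src")"
    # -onlyin restricts the Spotlight query to the given directory.
    hits=$(mdfind -onlyin "$moved" "$term")
    if [ -n "$hits" ]; then
        echo "found after move"
    else
        echo "nothing returned: results are probably withheld"
    fi
}
```

Spotlight may take a moment to record the new paths, so an empty result immediately after the move is worth repeating before drawing any conclusion.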

Which folders don’t return expected search results?

I then moved the test folder from ~/Documents to other folders and repeated Mints’ test search for each. Results were:

  • ~/Library and /Library – search only found the term in a plain text file, where it was attached as an extended attribute and not in the text data.
  • ~/Library/Trial – only found in plain text file with the search term in xattr.
  • ~/Library/Preferences – no searches returned successfully.
  • ~/Library/Application Support – all test files were found successfully, including the property list.

Is this specific to the active boot volume group?

All those tests were performed on the active Data volume. To test whether this extends to other (inactive) boot volume groups, I connected an external bootable SSD with Sonoma installed and repeated the tests. As Mints only searches the current Data volume, I used the Find feature in the Finder. Results were exactly the same for that Data volume’s single user.

Where is this behaviour controlled?

Within each volume’s Spotlight index folder is VolumeConfiguration.plist, whose name suggests it might contain Spotlight configuration for that volume. Comparison of that file from a regular APFS volume with that from the Data volume of a boot volume group failed to suggest any value that might be responsible for this behaviour, though.

That indicates it’s determined in another property list, or coded into Spotlight.

If you have purchased the Spotlight master tool HoudahSpot, which I strongly recommend, then this may seem familiar, as it’s mentioned in its Help file, suggesting that it’s not new to Sequoia.

Conclusions

  • Spotlight in Sequoia (at least) withholds the results of searches in Library folders and their contents, except for those in /Library/Application Support.
  • There’s no apparent way to change that behaviour.

Postscript on dealing with prolonged indexing

At the end of my testing, I tried to unmount the external SSD containing the bootable Sonoma installation, only to be told that it was still in use. This was the inevitable reindexing that has caused similar problems to many others. Rather than forcing ejection in the Finder, I left this to run, in the hope that it would complete reasonably swiftly on a Mac mini M4 Pro. I gave up after over three hours, forced the external Data volume to be unmounted, and disconnected the SSD.

Intensive reindexing activity continued to occupy all the E cores, as if the volume was still present. This was occurring in the /var directory rather than in any volume indexes, and was only terminated by restarting the Mac. At no time did Spotlight indicate that indexing was in progress, nor provide any indication of progress to completion. If this isn’t a bug in Sequoia, then it’s a serious flaw in Spotlight that needs to be addressed. I will see if I can repeat this, and then file a bug report if I can.

When and how to rebuild Spotlight indexes

By: hoakley
19 November 2024 at 15:30

Forcing Spotlight’s indexes to be rebuilt has become a panacea, popularly used when anything appears amiss with Spotlight. In many cases, it’s exactly the wrong response to what’s likely to be normal indexing activity. This article explains when it can sometimes be useful, and how to make more effective use of it.

Spotlight works by searching the indexes it maintains on each volume, stored in the hidden .Spotlight-V100 folder at the top level of each searchable volume. Within that folder is a property list containing the volume configuration details, and the Store-V2 folder containing a folder named using a UUID, within which are all the files composing the indexes. As those are opaque to the user they shouldn’t be tampered with.

Indexing

The process of indexing is simplest to understand when considered for a single newly created or changed file. That change is recorded in the volume’s FSEvents database, which in turn triggers an XPC call to process that file if it’s in a location within Spotlight’s scope.

Provided the changed file isn’t in an excluded location, an mdworker process should then start adding its contents to the volume indexes. To do this, it first checks what type of file it is, in terms of its UTI. If that’s incorrect, then the remainder of the steps won’t work properly. In most cases now, that means the file must have the correct extension for its type. If it doesn’t, then mdworker won’t be able to index it correctly.
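One way to check a file’s type is to ask mdls for its kMDItemContentType attribute and compare that with the UTI you expect. This is a minimal sketch: check_uti is a hypothetical helper name, and the expected UTI is supplied by you.

```shell
#!/bin/sh
# Sketch: compare the content type reported for a file with an expected UTI.
# check_uti is a hypothetical helper name.
check_uti() {
    file="$1"; expected="$2"
    # -raw prints just the value of the attribute named by -name.
    actual=$(mdls -name kMDItemContentType -raw "$file")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $actual"
    else
        echo "mismatch: expected $expected, got $actual"
        return 1
    fi
}
```

For a plain text file with the hypothetical name notes.txt, you would expect check_uti notes.txt public.plain-text to succeed; a mismatch suggests a missing or wrong extension.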

Spotlight then looks up the correct mdimporter for that type. For many file types, those are provided as part of macOS and stored in the system, in /System/Library/Spotlight on the SSV. Importers for third-party apps may be in /Library/Spotlight or ~/Library/Spotlight, or in the Contents/Library/Spotlight folder inside the app bundle itself. To check all mdimporter plugins currently installed, use the command
mdimport -L
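To see which importers sit in the standard folders, as opposed to those inside app bundles, a short sketch like this may help. The function name list_importers is hypothetical, and mdimport -L remains the fuller list.

```shell
#!/bin/sh
# Sketch: list .mdimporter bundles found directly inside the given folders.
# list_importers is a hypothetical helper; it doesn't look inside app bundles.
list_importers() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        for item in "$dir"/*.mdimporter; do
            # Skip the unexpanded pattern when a folder has no importers.
            [ -e "$item" ] && echo "$item"
        done
    done
    return 0
}

# Intended use on a Mac (not run here):
# list_importers /System/Library/Spotlight /Library/Spotlight "$HOME/Library/Spotlight"
```

Folders that don’t exist, such as ~/Library/Spotlight on many Macs, are skipped silently.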

Spotlight importers and mdworker itself can crash when there’s a bug, or the mdimporter encounters a malformed file. If that happens, the log normally records repeated crashes and restarts of that mdworker process. If you can identify and remove the file that’s causing those, that should allow indexing to return to normal.

Once the mdworker has extracted the data from the file, that’s added to the volume’s indexes, typically reflected in a log entry from mds_stores containing the message
compressing 5686 bytes to <private>
or similar, for each file that has had content extracted and added to the indexes. Other services involved include mdsync and mdwrite.
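Those entries can be counted to get a rough measure of how many files have had content added to the indexes over a period. In this sketch, count_compress_entries is a hypothetical helper that reads log text on standard input, and the log show predicate in the comment is an assumption about how to narrow the log to mds_stores.

```shell
#!/bin/sh
# Sketch: count log lines reporting content compressed into the indexes.
# Reads log text on standard input; count_compress_entries is a hypothetical name.
count_compress_entries() {
    # grep -c exits non-zero when the count is 0, so force a zero exit status.
    grep -c 'compressing [0-9][0-9]* bytes to' || true
}

# Intended use on a Mac (not run here):
# log show --last 5m --predicate 'process == "mds_stores"' | count_compress_entries
```

A count that keeps rising between runs suggests that indexing is still adding content, even when Spotlight itself shows no progress.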

Recent versions of macOS can extract additional information from certain types of files such as images and PDFs. This includes text recognised within images using Live Text, and object recognition and classification, performed in the background by other services such as photoanalysisd. Those are slower and take significantly more processing, and are normally run after mdimporter extraction.

When might it be useful to rebuild indexes?

The only indication for rebuilding Spotlight indexes on a volume is when they are known to be damaged or corrupted, in which case rebuilding offers the best chance of restoring normal search function to that volume.

One way to check the functional integrity of indexes is to perform searches for known targets, a feature available in my free Mints. I added this when the system mdimporter for Rich Text had a bug that effectively made searching the contents of any RTF file impossible, thankfully a rare situation, although third-party mdimporters may not be as reliable.

In the past rebuilding indexes was often used when mdworker processes repeatedly crashed when trying to extract index data from a file. However, that relied on the assumption that those crashes wouldn’t recur. If they did, then rebuilding wouldn’t solve the problem. One way to investigate this further is to discover from the log which file is causing mdworker processes to crash, and remove the cause. This isn’t as straightforward now, as log entries no longer identify the file(s) causing the problem unless privacy protection is removed from the log.

Recently, it has become popular to force indexes to be rebuilt whenever Spotlight’s indexing processes appear to be taking a long time maintaining current indexes, on the assumption that starting that from scratch is going to be quicker than leaving them to complete. This isn’t likely to help, as it’s likely to force rebuilding of indexes that are already fully up to date, and indexing provides no information as to its progress. So there’s no way of telling whether allowing current indexing activity to complete would take another few seconds or days. However, it’s most unlikely that forcing full reindexing would ever be faster than allowing the completion of indexing that’s already in progress.

Rebuilding the index

To force a volume to be re-indexed, open Spotlight (or Siri & Spotlight) in System Settings and click the Search Privacy… or Spotlight Privacy… button at the bottom. Click the + button at the foot, select the volume and add it to the list, then click Done. Pause thirty seconds or so, click the Search Privacy… button again, select that volume in the list, and click the – button to remove it from the list. You don’t normally need to close System Settings or restart between adding and removing the volume.

If you prefer, you can instead use the mdutil command in Terminal. The command
mdutil -E /
erases the indexes on the Data volume and forces them to be rebuilt, and you can use the same option on other volumes.

As Spotlight indexes are maintained and stored on each volume for its contents, you’ll need to include each volume on which you want to be able to search files by their contents. Unless you have been able to identify which volume’s indexes merit rebuilding, you may have to rebuild indexes on every mounted volume, which can take many hours or days to complete.

The simplest way to check that re-indexing is taking place is to open Activity Monitor, and in its CPU view check that processes with names starting with md are taking plenty of CPU. These should include mds_stores, mdworker (often multiple copies) and mds itself. On Apple silicon, those processes run almost exclusively on E cores, and are usually obvious in Activity Monitor’s CPU History window.

Key points

  • Each volume may have its own Spotlight indexes, used when searching that volume’s contents.
  • Modern macOS indexes more extensive data, including text recognised within images, and types of object found within them. More advanced metadata take longer to analyse and index.
  • Rebuilding indexes is indicated if they are known to be damaged or corrupted.
  • If you don’t know which volume’s indexes require to be rebuilt, rebuilding them on all volumes can take many hours or days.
  • If the problem is in a file or mdimporter then it’s likely to recur during rebuilding indexes unless the file is identified and removed from indexing.
  • There’s no way to determine progress in building indexes, and forcing a rebuild is likely to take longer than allowing current indexing to complete.
  • Rebuilding is best triggered by adding the volume to Spotlight exclusions, then removing it again.
  • Alternatively, use mdutil -E [volume].
  • Check rebuilding is taking place using Activity Monitor.

Postscript

Several have commented that the Spotlight Search item opened from the menu bar can show indexing progress. That’s correct, but it doesn’t actually show all Spotlight indexing by any means. For example, on my M4 Mac mini, that shows only the first minute or so, then pretends that indexing is complete despite a further 10 minutes of intensive activity filling the E cores in CPU History, almost all of it the result of continuing Spotlight indexing activity. So what that progress bar shows is the period during which Spotlight search is unavailable, not the period of indexing.

Regarding additional mdutil commands, a quick read through the man page is informative. That makes it clear that the -E option “will cause each local store for the volumes indicated to be erased. The stores will be rebuilt if appropriate.” There is no need to halt indexing before doing that.

However, if you do use -i off before performing other mdutil commands, then you will need to turn indexing back on again using -i on before Spotlight will either recreate a deleted index directory or rebuild the indexes within that. There is absolutely no need to remove the index directory using -X if all you want to do is force the indexes to be rebuilt: -E is perfectly sufficient to do that.
