
Ruth Paine, Who Gave Lodging to Marina Oswald, Dies at 92

7 September 2025 at 22:52
Her knowledge of Lee Harvey Oswald and his wife made her a noteworthy witness during the Warren Commission’s investigation into the assassination of President John F. Kennedy.

© Eric Risberg/Associated Press

Ruth Paine in 2013. She let Lee Harvey Oswald and his wife, Marina, stay at her home in 1963 and, according to the author Thomas Mallon, knew more about the Oswalds’ movements and moods in the months prior to the assassination of President John F. Kennedy than anyone else did.

A deeper dive into Spotlight indexing and local search

By: hoakley
4 August 2025 at 14:30

With additional information about Spotlight’s volume indexes, this article looks in more detail at how those indexes are updated when new files are created, and how local search is reflected in log entries. To do this I use the Spotlight test feature in my free utility Mints, which is designed to assess its performance.

Mints can create a test folder and populate it with 9 test files, copied from inside the app bundle using File Manager’s copyItem function. Provided the app and that newly created folder in ~/MintsSpotlightTest4syzFiles are in the same volume, this will almost certainly occur by creating APFS clone files. The files are:

  • SpotTestA.rtf, RTF, relying on /System/Library/Spotlight/RichText.mdimporter
  • SpotTestB.pdf, PDF, relying on /System/Library/Spotlight/PDF.mdimporter
  • SpotTestC.txt, plain text, relying on /System/Library/Spotlight/RichText.mdimporter
  • SpotTestD.html, HTML, relying on /System/Library/Spotlight/RichText.mdimporter
  • SpotTestE.docx, Microsoft Word docx, relying on /System/Library/Spotlight/Office.mdimporter
  • SpotTestEXIF.jpg, JPEG image containing the target string not in the image itself, but in the Make field of its EXIF IFD0 metadata, and relying on /System/Library/Spotlight/Image.mdimporter
  • SpotTestF.numbers, Numbers spreadsheet containing the target string in one of its cells, and relying on /System/Library/Spotlight/iWork.mdimporter
  • SpotTestG.pages, Pages document, relying on /System/Library/Spotlight/iWork.mdimporter
  • SpotTestMD.txt, plain text not containing the target string in its contents, but in an extended attribute of type com.apple.metadata:kMDItemKeywords attached when the file is copied.
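As a rough illustration of that setup step, the following Python sketch creates a folder and copies the nine named files into it, timing the whole operation. It is not how Mints does it: shutil stands in for File Manager's copyItem, makes no promise of producing APFS clones, and doesn't attach the keywords extended attribute. The function name and its parameters are hypothetical.

```python
import shutil
import time
from pathlib import Path

# File names taken from the list above.
TEST_FILES = (
    "SpotTestA.rtf", "SpotTestB.pdf", "SpotTestC.txt", "SpotTestD.html",
    "SpotTestE.docx", "SpotTestEXIF.jpg", "SpotTestF.numbers",
    "SpotTestG.pages", "SpotTestMD.txt",
)

def populate_test_folder(source_dir, dest_dir):
    """Copy the test files into dest_dir; return (copied_paths, seconds_taken)."""
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    started = time.perf_counter()
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in TEST_FILES:
        src, dst = source_dir / name, dest_dir / name
        if src.is_dir():
            # Some document types may be stored as packages (folders).
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            # Stand-in for copyItem: an ordinary copy, not a guaranteed clone.
            shutil.copy2(src, dst)
        copied.append(dst)
    return copied, time.perf_counter() - started
```

Timing the copy in this way gives a starting point against which later indexing events can be measured.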

The search target they each contain is a string starting with the characters syzygy999, so is most likely to be unique to the test files and Mints’ bundle.
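To make the keywords case more concrete, here is a minimal Python sketch of how a value for the com.apple.metadata:kMDItemKeywords extended attribute might be encoded. It assumes the value is a binary property list holding an array of strings, which is how attributes of this type are normally stored. The helper names are hypothetical, and the keyword used is just the stated prefix, not the full target string.

```python
import plistlib

# Attribute name as given in the list of test files.
ATTR_NAME = "com.apple.metadata:kMDItemKeywords"

def encode_keywords(keywords):
    """Serialise keywords as a binary property list (assumed xattr format)."""
    return plistlib.dumps(list(keywords), fmt=plistlib.FMT_BINARY)

def decode_keywords(raw):
    """Recover the keyword list from an encoded attribute value."""
    return plistlib.loads(raw)

# Placeholder keyword: only the prefix of the real target string is known.
payload = encode_keywords(["syzygy999"])
print(ATTR_NAME, len(payload), "bytes")
```

The sketch stops short of writing the attribute to a file, as that step is platform-specific.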

This example test was performed on a Mac mini M4 Pro running macOS 15.6 from its internal SSD, with the Home folder located in the boot Data volume in that storage.

Indexing

Folder creation and file copying should take place soon after clicking on the Create Test button, in this example test in just over 0.3 seconds. The first mdworker process responding to this is spawned about 0.9 seconds after file copying starts, and further mdworker processes are added to it, making a total of 11. These are uncorked by launchd in about 0.03-0.05 seconds and are active for a period of about 0.2 seconds before they appear to have finished creating new posting lists to be added to the local indexes.

Shortly after those are complete, spotlightknowledge is active briefly; then, 0.2 seconds after file copying, mediaanalysisd receives a text processing request and initialises CGPDFService. Just over 0.1 seconds later, CoreSceneUnderstanding is invoked from mediaanalysisd. For the following 5 seconds, a series of MADTextEmbedding (mediaanalysisd text embedding) tasks are performed. Finally, content obtained by mdworker and MADTextEmbedding is compressed by mds_stores, finishing about 6.7 seconds after the new files were created.

These steps are summarised in the diagram above, where those in blue update metadata indexes, and those in yellow and green update content indexes. It’s most likely that each file to be added to the indexes has its own individual mdworker process that works with a separate mdimporter, although in this case at least 11 mdworker processes were spawned for 9 files to be indexed.

Text extraction from files by mediaanalysisd involves a series of steps, and will be covered in more detail separately. In this test, local Spotlight indexes should have been updated within 7-8 seconds of file creation, so waiting at least 10 seconds after clicking the Create Test button should be sufficient to enable the test files to be found by search.
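Rather than waiting a fixed period, a test harness could poll until the expected number of files is found or the 10 seconds have elapsed. The Python sketch below shows one way to do that. It isn't what Mints does, and the search callable is an assumption: any function returning a list of hits would serve, for instance one wrapping a Spotlight query.

```python
import time

def wait_until_found(search, expected, timeout=10.0, interval=0.5,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll search() until it returns at least `expected` hits or timeout passes.

    Returns (hit_count, seconds_waited). `sleep` and `clock` can be replaced
    with stand-ins for testing.
    """
    started = clock()
    while True:
        hits = len(search())
        waited = clock() - started
        if hits >= expected or waited >= timeout:
            return hits, waited
        sleep(interval)
```

With the nine test files, expected would be 9, and the time returned gives an upper bound on how long indexing took, to the resolution of the polling interval.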

Searching

When the Check Search button in Mints is clicked, an NSMetadataQuery is performed over the search scope of NSMetadataQueryUserHomeScope, the Home folder. It uses the search predicate (kMDItemTextContent CONTAINS[cd] "syzygy999") OR (kMDItemKeywords CONTAINS[cd] "syzygy999") OR (kMDItemAcquisitionMake CONTAINS[cd] "syzygy999") to find all items matching any of its three terms:

  • kMDItemTextContent CONTAINS[cd] "syzygy999" looks for the target string in a text representation of the content of the document, as in most of the test files used.
  • kMDItemKeywords CONTAINS[cd] "syzygy999" looks for keywords in an extended attribute of type com.apple.metadata:kMDItemKeywords as attached to the SpotTestMD.txt file.
  • kMDItemAcquisitionMake CONTAINS[cd] "syzygy999" looks for the Make field of an EXIF IFD0 as embedded in the SpotTestEXIF.jpg file.

Each of those is both case- and diacritic-insensitive, although the test cases all use lower case without any diacritics.
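The structure of that predicate, and what the [cd] modifier means in practice, can be sketched in Python. The first function simply assembles the predicate string quoted above from its three attribute names. The second only approximates case- and diacritic-insensitive matching, by stripping combining marks and folding case, and shouldn't be taken as how Spotlight implements it. Both function names are hypothetical.

```python
import unicodedata

# The three attributes combined in the search predicate.
ATTRIBUTES = ("kMDItemTextContent", "kMDItemKeywords", "kMDItemAcquisitionMake")

def build_predicate(target):
    """Join one CONTAINS[cd] term per attribute with OR."""
    return " OR ".join(f'({attr} CONTAINS[cd] "{target}")' for attr in ATTRIBUTES)

def fold(text):
    """Approximate [cd]: remove combining marks, then fold case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def contains_cd(haystack, needle):
    """Approximate test of whether haystack CONTAINS[cd] needle."""
    return fold(needle) in fold(haystack)

print(build_predicate("syzygy999"))
```

Under this approximation, text containing SYZYGY999, or the same characters carrying accents, would still match the lower-case target.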

Remarkably little is recorded in the log for these test searches. com.apple.metadata starts the query within 0.001 second of clicking the Check Search button, and is interrupted by TCC approving the search to proceed. About 0.03 seconds after the start of the query, a series of entries from mds_stores reports query node counts of 4, and results are fetched and sent as replies between 0.09 and 19.5 seconds after the start of the query. However, Mints typically reports a considerably shorter search time of around 4 seconds.

Better testing

The test currently included in Mints appears useful, in that it assesses the whole of Spotlight local search from file creation to completion of a test search. However, in its current form it doesn’t test LiveText or other in-image text extraction, and only applies to the volume containing the Home folder. Log extracts obtained in its Log Window are basic and don’t capture much of the detail used for this article. There seems scope for a dedicated Spotlight test app that addresses these and more.

Summary

  • Spotlight local indexing from the creation of nine new files reliant on macOS bundled mdimporters is initiated promptly, and should be completed within 10 seconds of their creation. It appears to follow the scheme shown in the diagram above.
  • Text extraction from images by mediaanalysisd (‘MAD’) is significantly slower than extraction of metadata and other text content.
  • NSMetadataQuery search should also be relatively quick when the number of hits is small, but leaves little information in the log.
