Over the last few weeks, as I’ve been digging deeper into Spotlight local search, what seemed at first to be fairly straightforward has become far more complex. This article draws together some lessons that I have learned.
Having tracked down a batch of Apple’s patents relating to search technologies that are likely to have been used in Spotlight, I was surprised to see those have been abandoned. For example, US Patent 2011/0113052 A1, Query result iteration for multiple queries, filed by John Hörnkvist on 14 January 2011, was abandoned five years later. I therefore have no insights to offer based on Apple’s extant patents.
Spotlight indexes metadata separately from contents, and both types of index point to files, apparently through their paths and names rather than their inodes. You can demonstrate this using test file H in SpotTest. Once Spotlight has indexed objects in images discovered by mediaanalysisd, moving that file to a different folder breaks that association immediately, and the same applies to file I, whose text is recognised by Live Text.
Extracted text, that is, text recovered using optical character recognition (Live Text), and object labels obtained using image classification (Visual Look Up), are all treated as content rather than metadata. Thus you can search for content, but you can’t obtain a list of the objects that have been indexed from images, any more than you can obtain Spotlight’s lexicon of words extracted as text.
Spotlight’s indexes are multilingual, as demonstrated by one of Apple’s earliest patents for search technology. Extracted text can thus contain words in several languages, but isn’t translated. Object labels are likely to be in the primary language set at the time, for example using the German word weide instead of the English cattle, if German was set when mediaanalysisd extracted object types from that image. You can verify this in SpotTest using test file H and custom search terms.
If you change your Mac’s primary language frequently, this could make it very hard to search for objects recognised in images.
The Finder’s Find feature can be effective, but has a limited syntax lacking OR and NOT, unless you resort to using Raw Query predicates (available from the first popup menu).* This means it can’t be used to construct a search for text containing the word cattle OR cow. Because additional terms are combined with AND, this has a beneficial side-effect, in that each term used should reduce the number of hits, but it remains a significant constraint.
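As an illustration, a Raw Query along these lines should match either word; the attribute and terms here are just my example, but the syntax is Spotlight’s standard query language:
kMDItemTextContent == "cattle"cdw || kMDItemTextContent == "cow"cdw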
The Finder does support some search types not available in other methods such as mdfind. Of the image-related types, John reports that kMDItemPhotosSceneClassificationLabels can be used in the Finder’s Find and will return files with matching objects that have been identified, but that doesn’t work in mdfind, either in Terminal or when called by an app. Other promising candidates that have proved unsuccessful include:
One huge advantage of mdfind is that it can perform a general search for content using wildcards, in the form
(** == '[searchTerm]*'cdw)
Using NSMetadataQuery from compiled code is probably not worth the effort. Not only does it use predicates of a different form from mdfind, but it’s unable to make use of wildcards in the same way that mdfind can, a lesson again demonstrated in SpotTest. mdfind can also be significantly quicker.
For example, you might use the form
mdfind "(kMDItemKeywords == '[searchTerm]*'cdw)"
in Terminal, or from within a compiled app. The equivalent predicate for NSMetadataQuery would read
(kMDItemKeywords CONTAINS[cdw] \"cattle\")
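If you do want to use mdfind from a compiled app rather than NSMetadataQuery, one option is to run the command itself. Here’s a minimal sketch of that approach, assuming a search restricted to the Home folder; it isn’t SpotTest’s own code, and the function name is mine:

import Foundation

// Run mdfind with the given query string, restricted to the Home folder,
// and return the matching paths, one per line of output.
func runMDFind(_ query: String) throws -> [String] {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/mdfind")
    process.arguments = ["-onlyin", NSHomeDirectory(), query]
    let pipe = Pipe()
    process.standardOutput = pipe
    try process.run()
    process.waitUntilExit()
    let output = String(decoding: pipe.fileHandleForReading.readDataToEndOfFile(), as: UTF8.self)
    return output.split(separator: "\n").map(String.init)
}

// Example: the word-based, case- and diacritic-insensitive keyword search above.
// let hits = try runMDFind("kMDItemKeywords == '[searchTerm]*'cdw")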
Another caution when using NSMetadataQuery is that apps appear to have their own single NSMetadataQuery instance on their main thread. That can lead to new queries leaking into the results from previous queries.
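One way to guard against that, if a shared query can’t be avoided, is to stop and detach the previous query before starting the next. This sketch is my own illustration of the precaution, not SpotTest’s code:

import Foundation

final class MetadataSearcher {
    private var query: NSMetadataQuery?
    private var observer: NSObjectProtocol?

    func start(_ predicate: NSPredicate, completion: @escaping ([NSMetadataItem]) -> Void) {
        // Tear down any earlier query first, so its results can't leak into this one.
        query?.stop()
        if let observer { NotificationCenter.default.removeObserver(observer) }

        let q = NSMetadataQuery()
        q.predicate = predicate
        q.searchScopes = [NSMetadataQueryUserHomeScope]
        observer = NotificationCenter.default.addObserver(
            forName: .NSMetadataQueryDidFinishGathering, object: q, queue: .main) { _ in
            q.disableUpdates()
            completion(q.results.compactMap { $0 as? NSMetadataItem })
            q.stop()
        }
        query = q
        q.start()
    }
}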
To summarise: the Finder’s Find can’t use OR or NOT, although using Raw Queries can work around that; mdfind is the most powerful method, including wildcard search for content.
I’m very grateful to Jürgen for drawing my attention to the effects of language, and to John for reporting his discovery of kMDItemPhotosSceneClassificationLabels.
As promised, this new version of my Spotlight indexing and search utility SpotTest extends its reach beyond the user’s Home folder, and can now test and search any regular volume that’s connected to your Mac and mounted in /Volumes.
By default, its searches remain restricted to the user’s Home folder, where SpotTest’s folder of crafted test files is installed. That applies whether you opt to search using its NSMetadataQuery tool, or the much faster mdfind tool instead. If you want to search another mounted volume, click on the button for the app to check which volumes are available, then select one from its new Scope menu items. Volumes listed there exclude Time Machine backups and any hidden volumes whose names start with a dot; the latter are in any case excluded from Spotlight indexing because they’re hidden.
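For those interested in how such a list might be assembled, here’s a brief sketch of one approach using FileManager; it’s an assumption about the method rather than SpotTest’s actual code, and it omits the separate exclusion of Time Machine backups:

import Foundation

// List mounted volumes, letting FileManager skip hidden volumes, then keep
// only those mounted in /Volumes and print their names and paths.
let volumes = FileManager.default.mountedVolumeURLs(
    includingResourceValuesForKeys: [.volumeNameKey],
    options: [.skipHiddenVolumes]) ?? []
for url in volumes where url.path.hasPrefix("/Volumes/") {
    let name = (try? url.resourceValues(forKeys: [.volumeNameKey]))?.volumeName
    print(name ?? url.lastPathComponent, url.path)
}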
This new version also fixes a weird bug that you were unlikely to encounter in the previous version, but in rare circumstances could be infuriating. When searching using the NSMetadataQuery tool, if you had two windows open, both with results from that tool, both would be updated with the same search results, and the search times reported in them could become absurd. This occurred because both windows were being updated with the data returned from the most recent search, as the NSMetadataQuery is shared in the app’s MainActor. After some fraught debugging, windows in this version ignore any search result updates initiated by other windows. I hope!
Volumes set in the Scope menu only affect search scope. Test folders are created in and removed from the user’s Home folder, and mdimporters are checked there as well. If you want to investigate indexing and search performance on other volumes, then you should manually create your own test folders as necessary. One quick and simple approach is to create a standard test folder in the Home folder, and copy that onto the volume(s) you want to test. A little later this week I’ll illustrate this in an article explaining how to get the best out of SpotTest and how it can help diagnose Spotlight problems.
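If you’d rather script that copy than do it in the Finder, something like the following should work; both the folder name and the destination volume here are hypothetical examples:

import Foundation

// Copy an existing test folder from the Home folder onto another mounted volume.
let home = FileManager.default.homeDirectoryForCurrentUser
let source = home.appendingPathComponent("SpotTestFiles")                  // hypothetical folder name
let destination = URL(fileURLWithPath: "/Volumes/External/SpotTestFiles")  // hypothetical volume
do {
    try FileManager.default.copyItem(at: source, to: destination)
} catch {
    print("Copy failed: \(error)")
}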
I have taken the opportunity to improve SpotTest’s reporting of errors, such as trying to remove a test folder that doesn’t exist. I have also thoroughly revised the Help book, and added a page about search scopes.
SpotTest version 1.1 for macOS 14.6 and later, including Tahoe, is now available from here: spottest11, from Downloads above, and from its Product Page.
Enjoy!
There are some topics that invariably generate comments from those who have either abandoned a major feature in macOS, or are struggling with it. Some of the most persistent are problems with Spotlight, particularly with its local search of files on your Mac. To help grapple with those, four years ago I added some Spotlight tests to Mints that can be used to work out where those problems are occurring. I’m delighted now to offer an extension to those in a whole new app, perhaps predictably named SpotTest.
Spotlight is so substantial, almost silent in the log, and impenetrable that the best approach to diagnosing its problems is to test it out in a controlled way. Mints has been doing that by creating a folder of files containing an unusual word, then searching for that. Although that’s still useful for a quick test, we need something more focused and flexible, and that’s what SpotTest aims to deliver.
Following deep dives into how Spotlight indexes and searches metadata and contents of files, and how it can search text extracted from images and the results of image analysis, I’ve realised that different test files are required, together with alternative means of search. For example, the standard approach used in compiled apps, with NSMetadataQuery, is incapable of finding content tags obtained using Visual Look Up, which only appear when using the mdfind command. SpotTest takes these into account.
There are now 15 carefully crafted test files, of which one cannot currently be found, no matter what method of search you try.
A perfect 13/15 result from NSMetadataQuery is only possible after waiting a day or more for background mediaanalysisd processing to recognise and extract the text in file I, a PNG image. The other 12 here should all be found when running this test a few seconds after the test files have been created. They rely on a range of mdimporter modules bundled in macOS, apart from file L, an XML property list.
Another of SpotTest’s tools will list the mdimporters used for each of the test files.
Run the search using the mdfind command within SpotTest and, once mediaanalysisd has done its image recognition, you should get a perfect 14/15.
The only current limitation of SpotTest version 1.0 is that it can only run tests on the Data volume that your Mac started up from, using a folder at the top level of your Home folder. A future version will let you test other volumes as well. Its Help book runs to nine pages: please read them, as its tests might seem deceptively simple, but they provide a lot of useful information about how Spotlight local search is functioning. Coupled with log extracts using LogUI, it should shine light in the darkness.
SpotTest 1.0, which requires macOS 14.6 or later, is now available from here: spottest10, and from its new place in its Product Page.
I wish you successful searching.
With additional information about Spotlight’s volume indexes, this article looks in more detail at how those indexes are updated when new files are created, and how local search is reflected in log entries. To do this I use the Spotlight test feature in my free utility Mints, which is designed to assess its performance.
Mints can create a test folder and populate it with 9 test files, copied from inside the app bundle using FileManager’s copyItem() function. Provided the app and that newly created folder in ~/MintsSpotlightTest4syzFiles are on the same volume, this will almost certainly occur by creating APFS clone files. The files are:
The search target they each contain is a string starting with the characters syzygy999, so most likely to be unique to the test files and Mints’ bundle.
This example test was performed on a Mac mini M4 Pro running macOS 15.6 from its internal SSD, with the Home folder located in the boot Data volume in that storage.
Folder creation and file copying should take place soon after clicking on the Create Test button, in this example test in just over 0.3 seconds. The first mdworker process responding to this is spawned about 0.9 seconds after file copying starts, and further mdworker processes are added to it, making a total of 11. These are uncorked by launchd in about 0.03-0.05 seconds and are active for a period of about 0.2 seconds before they appear to have finished creating new posting lists to be added to the local indexes.
Shortly after those are complete, spotlightknowledge is active briefly, then 0.2 seconds after file copying mediaanalysisd receives a text processing request and initialises CGPDFService. Just over 0.1 second later, CoreSceneUnderstanding is invoked from mediaanalysisd. For the following 5 seconds, a series of MADTextEmbedding (mediaanalysisd text embedding) tasks are performed. Finally, content obtained by mdworker and MADTextEmbedding is compressed by mds_stores, finishing about 6.7 seconds after the new files were created.
These steps are summarised in the diagram above, where those in blue update metadata indexes, and those in yellow and green update content indexes. It’s most likely that each file to be added to the indexes has its own individual mdworker process that works with a separate mdimporter, although in this case at least 11 mdworker processes were spawned for 9 files to be indexed.
Text extraction from files by mediaanalysisd involves a series of steps, and will be covered in more detail separately. In this test, local Spotlight indexes should have been updated within 7-8 seconds of file creation, so waiting at least 10 seconds after clicking the Create Test button should be sufficient to enable the test files to be found by search.
When the Check Search button in Mints is clicked, an NSMetadataQuery is performed using a search predicate over the search scope of NSMetadataQueryUserHomeScope, the Home folder. This uses the search predicate
(kMDItemTextContent CONTAINS[cd] "syzygy999") OR (kMDItemKeywords CONTAINS[cd] "syzygy999") OR (kMDItemAcquisitionMake CONTAINS[cd] "syzygy999")
to find all items containing any of:
kMDItemTextContent CONTAINS[cd] "syzygy999" looks for the target string in a text representation of the content of the document, as in most of the test files used.
kMDItemKeywords CONTAINS[cd] "syzygy999" looks for keywords in an extended attribute of type com.apple.metadata:kMDItemKeywords, as attached to the SpotTestMD.txt file.
kMDItemAcquisitionMake CONTAINS[cd] "syzygy999" looks for the Make field of the EXIF IFD0 embedded in the SpotTestEXIF.jpg file.
Each of those is both case- and diacritic-insensitive, although the test cases all use lower case without any diacritics.
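Put together in Swift, a query equivalent to that might look like the following; this is my reconstruction for illustration, not Mints’ own source code:

import Foundation

// Build the documented predicate and run it over the Home folder scope.
let target = "syzygy999"
let format = "(kMDItemTextContent CONTAINS[cd] %@) OR " +
    "(kMDItemKeywords CONTAINS[cd] %@) OR " +
    "(kMDItemAcquisitionMake CONTAINS[cd] %@)"
let query = NSMetadataQuery()
query.predicate = NSPredicate(format: format, target, target, target)
query.searchScopes = [NSMetadataQueryUserHomeScope]

let observer = NotificationCenter.default.addObserver(
    forName: .NSMetadataQueryDidFinishGathering, object: query, queue: .main) { _ in
    query.disableUpdates()
    for case let item as NSMetadataItem in query.results {
        if let path = item.value(forAttribute: NSMetadataItemPathKey) as? String {
            print(path)
        }
    }
    query.stop()
}
if !query.start() {
    print("Failed to start the metadata query")
}
// Give the query time to gather results when run as a simple script.
RunLoop.main.run(until: Date(timeIntervalSinceNow: 10))
_ = observer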
Remarkably little is recorded in the log for these test searches. com.apple.metadata starts the query within 0.001 second of clicking the Check Search button, and is interrupted by TCC approving the search to proceed. About 0.03 seconds after the start of the query, a series of entries from mds_stores reports query node counts of 4, and results are fetched and sent as replies between 0.09 and 19.5 seconds after the start of the query. However, Mints typically reports a considerably shorter search time of around 4 seconds.
The test currently included in Mints appears useful, in that it assesses the whole of Spotlight local search from file creation to completion of a test search. However, in its current form it doesn’t test LiveText or other in-image text extraction, and only applies to the volume containing the Home folder. Log extracts obtained in its Log Window are basic and don’t capture much of the detail used for this article. There seems scope for a dedicated Spotlight test app that addresses these and more.
mediaanalysisd (‘MAD’) is significantly slower than extraction of metadata and other text content.
One thing we humans are good at is searching. It’s a task we engage in from a few moments after birth until the time we slip away in death; we search everything around us. Locating and identifying that bird of prey wheeling high, finding the house keys, and that book we mislaid some time last week, meeting the perfect partner, discovering the right job, choosing the best education, looking through a Where’s Wally? or Where’s Waldo? book, and so on. Searching has transformed some into explorers like Christopher Columbus, and was the purpose of the chivalric quest. It’s what researchers in every field do, and thanks to Douglas Adams can be answered by the number 42.
Last week my searching took two new turns.
The first was more of a meta-search, in trying to discover more about the internals of Spotlight. Following the example of Maynard Handley, who has used them so successfully in understanding how M-series CPUs work, I looked through patents that have been awarded to Apple for the work of its search engineers. Yesterday’s slightly fuller history of Spotlight search is one result, and there are more to come in the future as I digest those patents concerned with performing search.
There’s a tinge of irony here, as many of my searches have been conducted using Google Patents, which alongside Google Scholar is one of the remaining search engines that doesn’t yet use AI to try to provide its own answers.
The other marks a new phase in my quest to get more information from the Unified log. Looking back to my first comment here, I realise how wildly over-optimistic I was when I wrote that it “should make my life a lot easier”, and that “a new version of Console will provide improved features to help us wade through logs.” Nine years later, I look wistfully at what remains of Console and realise how wrong I was on both counts.
When RunningBoard arrived in macOS Catalina, I soon noticed how “its log entries are profuse, detailed, and largely uncensored for privacy.” Since then it has proved garrulous to the point where its apparently ceaseless log chatter is a distraction, and can overwhelm attempts to read other log entries. I suspect it has contributed significantly to those advanced Mac users who now refuse to even try to make sense of the log.
One answer might be to tweak log preferences to shut out this noise, but given the purpose of RunningBoard in monitoring the life cycle of apps, why not try to use the information it provides? To do that, it’s first necessary to understand RunningBoard’s idiosyncratic language of assertions, and its protocols under which they’re acquired. The only way to do that without documentation is by observation: catalogue over 30 of those assertions for an interesting example like Apple’s Developer app, and see what they reveal.
By far the most informative entries from RunningBoard are those announcing that it’s acquiring an assertion, such as
Acquiring assertion targeting [app<application.developer.apple.wwdc-Release.9312198.9312203(501)>:2946] from originator [osservice<com.apple.uikitsystemapp(501)>:748] with description <RBSAssertionDescriptor| "com.apple.frontboard.after-life.subordinate" ID:424-748-2228 target:2946 attributes:[
<RBSDomainAttribute| domain:"com.apple.frontboard" name:"AfterLife-Subordinate" sourceEnvironment:"(null)">
]>
In a log often censored to the point of being unintelligible, this contains frank and explicit detail. The app is identified clearly, with the user ID of 501 and process ID of 2946. The originator is similarly identified as com.apple.uikitsystemapp with its PID of 748, which is confirmed in the middle digits in the Assertion ID. This is explicitly related to FrontBoard and an attribute named AfterLife-Subordinate. There’s not a single <private> to blight this entry, although further knowledge is needed to decode it fully.
Normally to get such information from a running process would require its source code to be instrumented with calls to write log entries, many of which would be lost to <private>, yet RunningBoard seems happy, for the moment, to provide that information freely. You can see what I mean by applying the predicate
subsystem == "com.apple.runningboard" AND message CONTAINS "Acquiring assertion t"
in LogUI, to obtain a running commentary on active apps and processes. Once you’ve identified a relevant assertion, you can focus attention on other log entries immediately prior to that. I will be following this up in the coming week, with fuller instructions and some demonstrations.
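If you’d rather not use LogUI, a similar commentary can be obtained from the log command. Here’s a sketch that runs it from Swift; note that my predicate is an assumed equivalent, as the log command’s predicates use eventMessage where LogUI uses message:

import Foundation

// Run `log show` over the last 30 minutes with a predicate equivalent to the one above.
// Reading the full system log may require running from an admin account.
let predicate = #"subsystem == "com.apple.runningboard" AND eventMessage CONTAINS "Acquiring assertion t""#
let logTool = Process()
logTool.executableURL = URL(fileURLWithPath: "/usr/bin/log")
logTool.arguments = ["show", "--last", "30m", "--style", "compact", "--predicate", predicate]
let pipe = Pipe()
logTool.standardOutput = pipe
do {
    try logTool.run()
    logTool.waitUntilExit()
    print(String(decoding: pipe.fileHandleForReading.readDataToEndOfFile(), as: UTF8.self))
} catch {
    print("Failed to run log: \(error)")
}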
Although neither patents nor assertions have the significance of the number 42, in their own ways they show how the art and science of search aren’t dead yet, nor have they succumbed to AI.
Since writing A brief history of local search, I have come across numerous patents awarded to Apple and its engineers for the innovations that have led to Spotlight. This more detailed account of the origins and history of Spotlight uses those primary sources to reconstruct as much as I can at present.
1990
ON Technology, Inc. released On Location, the first local search utility for Macs, a Desk Accessory anticipating many of the features to come in Spotlight 15 years later. This indexed text found in the data fork of files, using format-specific importer modules to access those written by Microsoft Word, WordPerfect, MacWrite and other apps of the day. Those files and their indexed contents were then fully searchable. This required System Software 6.0 or later, and a Mac with a hard disk and at least 1 MB of RAM. It was developed by Roy Groth, Rob Tsuk, Nancy Benovich, Paul Moody and Bill Woods.
1991
Version 2 of On Location was released. ON Technology was later acquired by Network Corporation, then by Symantec in 2003.
1994
AppleSearch was released, and bundled in Workgroup Servers. This was based on a client-server system running over AppleShare networks. September’s release of System Software 7.5 introduced a local app, Find File, written by Bill Monk.
1998
Sherlock was released in Mac OS 8.5. This adopted a similar architecture to AppleSearch, using a local service that maintained indexes of file metadata and content, and a client app that passed queries to it. This included remote search of the web through plug-ins working with web search engines, as they became available.
Early patent applications were filed by Apple’s leading engineers who were working on Sherlock, including US Patent 6,466,901 B1 filed 30 November 1998 by Wayne Loofbourrow and David Cásseres, for a Multi-language document search and retrieval system.
1999
Sherlock 2 was released in Mac OS 9.0. This apparently inspired developers at Karelia Software to produce Watson, ‘envisioned as Sherlock’s “companion” application, focusing on Web “services” rather than being a “search” tool like Sherlock.’
2000
On 5 January, Yan Arrouye and Keith Mortensen filed what became Apple’s US Patent 6,847,959 B1 for a Universal Interface for Retrieval of Information in a Computer System. This describes the use of multiple plug-in modules for different kinds of search, in the way that was already being used in Sherlock. Drawings show that it was intended to be opened using an item on the right of the menu bar, there titled [GO-TO] rather than using the magnifying glass icon of Sherlock or Spotlight. This opened a search dialog resembling a prototype for Spotlight, and appears to have included ‘live’ search conducted as letters were typed in.
2001
Karelia Software released Watson.
2002
Mac OS X Jaguar brought Sherlock 3, which many considered had an uncanny resemblance to Watson. That resulted in acrimonious debate.
In preparation for the first Intel Macs, Mac OS X 10.4 Tiger, released in April 2005, introduced Spotlight as a replacement for Sherlock, which never ran on Intel Macs.
Initially, the Spotlight menu command dropped down a search panel as shown here, rather than opening a window as it does now.
On 4 August, John M Hörnkvist and others filed what became US Patent 7,783,589 B2 for Inverted Index Processing, for Apple. This was one of a series of related patents concerning Spotlight indexing. Just a week later, on 11 August, Matthew G Sachs and Jonathan A Sagotsky filed what became US Patent 7,698,328 B2 for User-Directed search refinement.
A Finder search window, precursor to the modern Find window, is shown in the lower left of this screenshot taken from Tiger in 2006.
2007
Spotlight was improved in Mac OS X 10.5 Leopard, in October. This extended its query language, and brought support for networked Macs that were using file sharing.
This shows a rather grander Finder search window from Mac OS X 10.5 Leopard in 2009.
Search attributes available for use in the search window are shown here in OS X 10.9 Mavericks, in 2014.
2014
In OS X 10.10 Yosemite, released in October, web and local search were merged into ‘global’ Spotlight, the search window that opens using the Spotlight icon at the right end of the menu bar, accompanied by Spotlight Suggestions.
John M Hörnkvist and Gaurav Kapoor filed what was to become US Patent 10,885,039 B2 for Machine learning based search improvement, which appears to have been the foundation for Spotlight Suggestions, in turn becoming Siri Suggestions in macOS Sierra. Those were accompanied by remote data collection designed to preserve the relative anonymity of the user.
This shows a search in Global Spotlight in macOS 10.12 Sierra, in 2017.
Apple acquired Laserlike, Inc, whose technology (and further patents) has most probably been used to enhance Siri Suggestions. Laserlike had already filed for patents on query pattern matching in 2018.
I’m sure there’s a great deal more detail to add to this outline, and welcome any additional information, please.
4 August 2025: I’m very grateful to Joel for providing me with info and links for On Location, which I have incorporated above.