With additional information about Spotlight’s volume indexes, this article looks in more detail at how those indexes are updated when new files are created, and how local search is reflected in log entries. To do this I use the Spotlight test feature in my free utility Mints, which is designed to assess its performance.
Mints can create a test folder and populate it with 9 test files, copied from inside the app bundle using File Manager’s copyItem function. Provided the app and that newly created folder in ~/MintsSpotlightTest4syzFiles are in the same volume, this will almost certainly occur by creating APFS clone files. The files are:
The search target they each contain is a string starting with the characters syzygy999, so is most likely to be unique to the test files and Mints’ bundle.
This example test was performed on a Mac mini M4 Pro running macOS 15.6 from its internal SSD, with the Home folder located in the boot Data volume in that storage.
Folder creation and file copying should take place soon after clicking on the Create Test button, in this example test in just over 0.3 seconds. The first mdworker process responding to this is spawned about 0.9 seconds after file copying starts, and further mdworker processes are added to it, making a total of 11. These are uncorked by launchd in about 0.03-0.05 seconds and are active for a period of about 0.2 seconds before they appear to have finished creating new posting lists to be added to the local indexes.
Shortly after those are complete, spotlightknowledge is active briefly, then 0.2 seconds after file copying mediaanalysisd receives a text processing request and initialises CGPDFService. Just over 0.1 second later CoreSceneUnderstanding is invoked from mediaanalysisd. For the following 5 seconds, a series of MADTextEmbedding (mediaanalysisd text embedding) tasks are performed. Finally, content obtained by mdworker and MADTextEmbedding is compressed by mds_stores, finishing about 6.7 seconds after the new files were created.
These steps are summarised in the diagram above, where those in blue update metadata indexes, and those in yellow and green update content indexes. It’s most likely that each file to be added to the indexes has its own individual mdworker process that works with a separate mdimporter, although in this case at least 11 mdworker processes were spawned for 9 files to be indexed.
Text extraction from files by mediaanalysisd involves a series of steps, and will be covered in more detail separately. In this test, local Spotlight indexes should have been updated within 7-8 seconds of file creation, so waiting at least 10 seconds after clicking the Create Test button should be sufficient to enable the test files to be found by search.
When the Check Search button in Mints is clicked, an NSMetadataQuery is performed using a search predicate over the search scope of NSMetadataQueryUserHomeScope, the Home folder. This uses the search predicate
(kMDItemTextContent CONTAINS[cd] "syzygy999") OR (kMDItemKeywords CONTAINS[cd] "syzygy999") OR (kMDItemAcquisitionMake CONTAINS[cd] "syzygy999")
to find all items matching any of its three terms:
kMDItemTextContent CONTAINS[cd] "syzygy999" looks for the target string in a text representation of the content of the document, as in most of the test files used.
kMDItemKeywords CONTAINS[cd] "syzygy999" looks for keywords in an extended attribute of type com.apple.metadata:kMDItemKeywords as attached to the SpotTestMD.txt file.
kMDItemAcquisitionMake CONTAINS[cd] "syzygy999" looks for the Make field of an EXIF IFD0 as embedded in the SpotTestEXIF.jpg file.
Each of those is both case- and diacritic-insensitive, although the test cases all use lower case without any diacritics.
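The way the three terms combine can be sketched in a few lines of Python. This is only an illustration of how the predicate string is composed, not the code used in Mints: the helper name build_search_predicate is hypothetical, and the resulting string would still need to be passed to NSMetadataQuery on a Mac to perform a search.

```python
# Hypothetical helper: composes the OR-ed predicate string described above.
ATTRIBUTES = ("kMDItemTextContent", "kMDItemKeywords", "kMDItemAcquisitionMake")


def build_search_predicate(target: str, attributes=ATTRIBUTES) -> str:
    """Return one CONTAINS[cd] clause per attribute, joined with OR.

    [cd] makes each comparison case- and diacritic-insensitive.
    The target is inserted as-is, so it must not contain double quotes.
    """
    clauses = [f'({attribute} CONTAINS[cd] "{target}")' for attribute in attributes]
    return " OR ".join(clauses)


print(build_search_predicate("syzygy999"))
```

Adding a fourth attribute to the tuple extends the search with another OR clause, without changing the other terms.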
Remarkably little is recorded in the log for these test searches. com.apple.metadata starts the query within 0.001 second of clicking the Check Search button, and is interrupted by TCC approving the search to proceed. About 0.03 seconds after the start of the query, a series of entries from mds_stores reports query node counts of 4, and results are fetched and sent as replies between 0.09 and 19.5 seconds after the start of the query. However, Mints typically reports a considerably shorter search time of around 4 seconds.
The test currently included in Mints appears useful, in that it assesses the whole of Spotlight local search from file creation to completion of a test search. However, in its current form it doesn’t test LiveText or other in-image text extraction, and only applies to the volume containing the Home folder. Log extracts obtained in its Log Window are basic and don’t capture much of the detail used for this article. There seems scope for a dedicated Spotlight test app that addresses these and more.
Text extraction by mediaanalysisd (‘MAD’) is significantly slower than extraction of metadata and other text content.
One thing we humans are good at is searching. It’s a task we engage in from a few moments after birth until the time we slip away in death: we search everything around us. Locating and identifying that bird of prey wheeling high, finding the house keys, and that book we mislaid some time last week, meeting the perfect partner, discovering the right job, choosing the best education, looking through a Where’s Wally? or Where’s Waldo? book, and so on. Searching has transformed some into explorers like Christopher Columbus, and was the purpose of the chivalric quest. It’s what researchers in every field do, and thanks to Douglas Adams can be answered by the number 42.
Last week my searching took two new turns.
The first was more of a meta-search, in trying to discover more about the internals of Spotlight. Following the example of Maynard Handley, who has used them so successfully in understanding how M-series CPUs work, I looked through patents that have been awarded to Apple for the work of its search engineers. Yesterday’s slightly fuller history of Spotlight search is one result, and there are more to come in the future as I digest those patents concerned with performing search.
There’s a tinge of irony here, as many of my searches have been conducted using Google Patents, alongside Google Scholar one of the remaining search engines that doesn’t yet use AI and attempt to provide its own answers.
The other marks a new phase in my quest to get more information from the Unified log. Looking back to my first comment here, I realise how wildly over-optimistic I was when I wrote that it “should make my life a lot easier”, and that “a new version of Console will provide improved features to help us wade through logs.” Nine years later, I look wistfully at what remains of Console and realise how wrong I was on both counts.
When RunningBoard arrived in macOS Catalina, I soon noticed how “its log entries are profuse, detailed, and largely uncensored for privacy.” Since then it has proved garrulous to the point where its apparently ceaseless log chatter is a distraction, and can overwhelm attempts to read other log entries. I suspect it has contributed significantly to those advanced Mac users who now refuse to even try to make sense of the log.
One answer might be to tweak log preferences to shut out this noise, but given the purpose of RunningBoard in monitoring the life cycle of apps, why not try to use the information it provides? To do that, it’s first necessary to understand RunningBoard’s idiosyncratic language of assertions, and its protocols under which they’re acquired. The only way to do that without documentation is by observation: catalogue over 30 of those assertions for an interesting example like Apple’s Developer app, and see what they reveal.
By far the most informative entries from RunningBoard are those announcing that it’s acquiring an assertion, such as
Acquiring assertion targeting [app<application.developer.apple.wwdc-Release.9312198.9312203(501)>:2946] from originator [osservice<com.apple.uikitsystemapp(501)>:748] with description <RBSAssertionDescriptor| "com.apple.frontboard.after-life.subordinate" ID:424-748-2228 target:2946 attributes:[
<RBSDomainAttribute| domain:"com.apple.frontboard" name:"AfterLife-Subordinate" sourceEnvironment:"(null)">
]>
In a log often censored to the point of being unintelligible, this contains frank and explicit detail. The app is identified clearly, with the user ID of 501 and process ID of 2946. The originator is similarly identified as com.apple.uikitsystemapp with its PID of 748, which is confirmed in the middle digits in the Assertion ID. This is explicitly related to FrontBoard and an attribute named AfterLife-Subordinate. There’s not a single <private> to blight this entry, although further knowledge is needed to decode it fully.
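Those fields can be pulled out of the entry mechanically. The following Python sketch is an illustration only: the pattern is derived from the single example entry above, so it assumes other entries follow the same layout, and the names parse_assertion and originator_matches_id are hypothetical.

```python
import re
from typing import Optional

# Pattern for entries shaped like the example above; RunningBoard entries
# laid out differently will simply fail to match.
ENTRY_PATTERN = re.compile(
    r"Acquiring assertion targeting "
    r"\[(?P<target_kind>\w+)<(?P<target_name>.+?)\((?P<target_uid>\d+)\)>:(?P<target_pid>\d+)\] "
    r"from originator "
    r"\[(?P<originator_kind>\w+)<(?P<originator_name>.+?)\((?P<originator_uid>\d+)\)>:(?P<originator_pid>\d+)\] "
    r'with description <RBSAssertionDescriptor\| "(?P<description>[^"]+)" '
    r"ID:(?P<assertion_id>\d+-\d+-\d+) target:(?P<descriptor_target>\d+)"
)


def parse_assertion(message: str) -> Optional[dict]:
    """Return the named fields of an 'Acquiring assertion' entry, or None if it doesn't match."""
    match = ENTRY_PATTERN.search(message)
    return match.groupdict() if match else None


def originator_matches_id(fields: dict) -> bool:
    """Check that the middle digits of the assertion ID are the originator's PID."""
    return fields["assertion_id"].split("-")[1] == fields["originator_pid"]


# First line of the example entry quoted above.
EXAMPLE = (
    "Acquiring assertion targeting "
    "[app<application.developer.apple.wwdc-Release.9312198.9312203(501)>:2946] "
    "from originator [osservice<com.apple.uikitsystemapp(501)>:748] "
    'with description <RBSAssertionDescriptor| "com.apple.frontboard.after-life.subordinate" '
    "ID:424-748-2228 target:2946 attributes:["
)

fields = parse_assertion(EXAMPLE)
print(fields["target_pid"], fields["originator_name"], originator_matches_id(fields))
```

Only the first line of the entry is parsed here; the attributes that follow on later lines would need their own handling.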
Normally to get such information from a running process would require its source code to be instrumented with calls to write log entries, many of which would be lost to <private>, yet RunningBoard seems happy, for the moment, to provide that information freely. You can see what I mean by applying the predicate
subsystem == "com.apple.runningboard" AND message CONTAINS "Acquiring assertion t"
in LogUI, to obtain a running commentary on active apps and processes. Once you’ve identified a relevant assertion, you can focus attention on other log entries immediately prior to that. I will be following this up in the coming week, with fuller instructions and some demonstrations.
Although neither patents nor assertions have the significance of the number 42, in their own ways they show how the art and science of search aren’t dead yet, nor have they succumbed to AI.
Since writing A brief history of local search, I have come across numerous patents awarded to Apple and its engineers for the innovations that have led to Spotlight. This more detailed account of the origins and history of Spotlight uses those primary sources to reconstruct as much as I can at present.
1990
ON Technology, Inc. released On Location, the first local search utility for Macs, a Desk Accessory anticipating many of the features to come in Spotlight 15 years later. This indexed text found in the data fork of files, using format-specific importer modules to access those written by Microsoft Word, WordPerfect, MacWrite and other apps of the day. Those files and their indexed contents were then fully searchable. This required System Software 6.0 or later, and a Mac with a hard disk and at least 1 MB of RAM. It was developed by Roy Groth, Rob Tsuk, Nancy Benovich, Paul Moody and Bill Woods.
1991
Version 2 of On Location was released. ON Technology was later acquired by Network Corporation, then by Symantec in 2003.
1994
AppleSearch was released, and bundled in Workgroup Servers. This was based on a client-server system running over AppleShare networks. September’s release of System Software 7.5 introduced a local app Find File, written by Bill Monk.
1998
Sherlock was released in Mac OS 8.5. This adopted a similar architecture to AppleSearch, using a local service that maintained indexes of file metadata and content, and a client app that passed queries to it. This included remote search of the web through plug-ins working with web search engines, as they became available.
Early patent applications were filed by Apple’s leading engineers who were working on Sherlock, including US Patent 6,466,901 B1 filed 30 November 1998 by Wayne Loofbourrow and David Cásseres, for a Multi-language document search and retrieval system.
1999
Sherlock 2 was released in Mac OS 9.0. This apparently inspired developers at Karelia Software to produce Watson, ‘envisioned as Sherlock’s “companion” application, focusing on Web “services” rather than being a “search” tool like Sherlock.’
2000
On 5 January, Yan Arrouye and Keith Mortensen filed what became Apple’s US Patent 6,847,959 B1 for a Universal Interface for Retrieval of Information in a Computer System. This describes the use of multiple plug-in modules for different kinds of search, in the way that was already being used in Sherlock. Drawings show that it was intended to be opened using an item on the right of the menu bar, there titled [GO-TO] rather than using the magnifying glass icon of Sherlock or Spotlight. This opened a search dialog resembling a prototype for Spotlight, and appears to have included ‘live’ search conducted as letters were typed in.
2001
Karelia Software released Watson.
2002
Mac OS X Jaguar brought Sherlock 3, which many considered had an uncanny resemblance to Watson. That resulted in acrimonious debate.
2005
In preparation for the first Intel Macs, Mac OS X 10.4 Tiger, released in April 2005, introduced Spotlight as a replacement for Sherlock, which never ran on Intel Macs.
Initially, the Spotlight menu command dropped down a search panel as shown here, rather than opening a window as it does now.
On 4 August, John M Hörnkvist and others filed what became US Patent 7,783,589 B2 for Inverted Index Processing, for Apple. This was one of a series of related patents concerning Spotlight indexing. Just a week later, on 11 August, Matthew G Sachs and Jonathan A Sagotsky filed what became US Patent 7,698,328 B2 for User-Directed search refinement.
A Finder search window, precursor to the modern Find window, is shown in the lower left of this screenshot taken from Tiger in 2006.
2007
Spotlight was improved in Mac OS X 10.5 Leopard, in October. This extended its query language, and brought support for networked Macs that were using file sharing.
This shows a rather grander Finder search window from Mac OS X 10.5 Leopard in 2009.
Search attributes available for use in the search window are shown here in OS X 10.9 Mavericks, in 2014.
2014
In OS X 10.10 Yosemite, released in October, web and local search were merged into ‘global’ Spotlight, the search window that opens using the Spotlight icon at the right end of the menu bar, accompanied by Spotlight Suggestions.
John M Hörnkvist and Gaurav Kapoor filed what was to become US Patent 10,885,039 B2 for Machine learning based search improvement, which appears to have been the foundation for Spotlight Suggestions, in turn becoming Siri Suggestions in macOS Sierra. Those were accompanied by remote data collection designed to preserve the relative anonymity of the user.
This shows a search in Global Spotlight in macOS 10.12 Sierra, in 2017.
Apple acquired Laserlike, Inc, whose technology (and further patents) has most probably been used to enhance Siri Suggestions. Laserlike had already filed for patents on query pattern matching in 2018.
I’m sure there’s a great deal more detail to add to this outline, and welcome any additional information, please.
4 August 2025: I’m very grateful to Joel for providing me with info and links for On Location, which I have incorporated above.
There are several well-known methods for excluding items from Spotlight search. This article details one that, as far as I can tell, has remained undocumented for the last 18 years since it was added in Mac OS X 10.5 Leopard, when Spotlight was only two years old, and can still catch you out by hiding files from local Spotlight search. This was discovered by Matt Godden, who has given his account here.
There have been three general methods of excluding folders from Spotlight’s indexing and search, although only two of them still work reliably:
adding .noindex to the folder name (this earlier worked with .no_index instead);
putting a file named .metadata_never_index inside the folder; that no longer works in recent macOS.
Additionally, System Settings offers Spotlight Privacy settings in two sections. The Search results section doesn’t normally prevent indexing of those items, but does block them from appearing in search results. Spotlight’s indexing exclusion list is accessed from the Spotlight Privacy… button, where you can add items that you don’t want indexed at all.
Matt Godden investigated repeated failure of Spotlight search to find some images in his large media library, and discovered that the extended attribute (xattr) named com.apple.metadata:kMDItemSupportFileType was responsible. Images that weren’t returned in a search of that library all had that xattr attached, and when that was removed, those images were found reliably.
According to Apple’s documentation, that xattr was available in Mac OS X 10.5 and has since been deprecated. No further information is given about its function or effect, nor does it appear in an older list of Spotlight metadata attribute keys.
Search of previous mentions of this xattr reveals that it has been found with either of two values, iPhotoPreservedOriginal as described for Matt’s images, and MDSystemFile used with several apps that have proved equally inaccessible to Spotlight search. Images that have this xattr attached appear to have originated in old iPhoto libraries, which may have been migrated to Photos libraries. Searches for files with this xattr suggest that even old collections of images seldom have the xattr present, in my case on only 9 files out of over 800,000 checked, and the MDSystemFile variant wasn’t found in over 100,000 application files.
The mere presence of this xattr is sufficient to exclude a file from Spotlight search: setting its value to the arbitrary word any, for example, was as effective as setting it to either iPhotoPreservedOriginal or MDSystemFile.
Strangely, the method used to search is important: files with the com.apple.metadata:kMDItemSupportFileType xattr can’t be found when using Local Spotlight search in a Find window, but can be found by Mints using a standard search predicate with NSMetadataQuery.
The simplest way to detect whether your Mac has files with the com.apple.metadata:kMDItemSupportFileType xattr is to use the Crawler tool in my free xattred, with Full Disk Access. Open its window using the Open Crawler… command in the Window menu, and paste the xattr name into the Xattr type box. Click on the Scan button and select the volume or folder to check. xattred then crawls the full directory tree within that and reports all files with that xattr.
The xattr can then be removed by dragging the file onto one of xattred’s main windows, selecting the xattr, and clicking on the Cut button. That change will be effective immediately, and the file made available through Spotlight search within a few seconds.
If you have more than a handful of files with the xattr, use xattred’s Stripper to remove them all. Paste the xattr name into the Xattr type box. Click on the Strip button and select the volume or folder to process.
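For those who prefer to script this, the same crawl-then-strip approach can be sketched in Python. This is not how xattred works internally, which isn't described here; it is a minimal sketch that assumes the xattr command tool supplied with macOS is available, using its -p option to print a named xattr and -d to delete one. The function names are hypothetical, and the check is passed in as a parameter so the crawl can be tried out with a stand-in before touching real files.

```python
import os
import subprocess
from typing import Callable, Iterator

XATTR_NAME = "com.apple.metadata:kMDItemSupportFileType"


def has_xattr(path: str, name: str = XATTR_NAME) -> bool:
    """Return True if the macOS xattr tool can print the named xattr for this file."""
    result = subprocess.run(["xattr", "-p", name, path], capture_output=True)
    return result.returncode == 0


def find_files_with_xattr(root: str, check: Callable[[str], bool] = has_xattr) -> Iterator[str]:
    """Walk the directory tree under root, yielding each file for which check() is true."""
    for folder, _subfolders, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(folder, filename)
            if check(path):
                yield path


def strip_xattr(path: str, name: str = XATTR_NAME) -> None:
    """Delete the named xattr from one file, raising an error if the tool fails."""
    subprocess.run(["xattr", "-d", name, path], check=True)
```

On a Mac, listing the results of find_files_with_xattr() for a folder before calling strip_xattr() on any of them gives the chance to review what will be changed, in the same way as scanning with the Crawler before using the Stripper.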
If any files have the com.apple.metadata:kMDItemSupportFileType xattr attached, search for and remove all such xattrs to ensure those files aren’t excluded from search.
I’m extremely grateful to Matt Godden for his painstaking research and keeping me informed.
The Swiss Army knife has fallen victim to unintended consequences. Once the dream of every schoolboy and pocketed by anyone who went out into the countryside, my small collection of Swiss Army knives and multi-tools now remains indoors and unused. This is the result of strict laws on the carriage of knives in the UK; although not deemed illegal, since 1988 carrying them in a public place has put you at risk of being stopped and searched, and one friend was subjected to that for carrying a mere paint-scraper.
Swiss Army knives have another more sinister danger, that they’re used in preference to dedicated tools. Over the last week or two as I’ve been digging deeper into Spotlight, I can’t help but think how it has turned into the Swiss Army knife of search tools, by compromising its powers for the sake of versatility.
At present, I know of four different Spotlights:
Of those, it’s Global Spotlight that I find most concerning, as it’s the frontline search tool for many if not most who use Macs, and the most flawed of the four. It’s not even the fault of Spotlight, whose 20th birthday we should have celebrated just over a month ago. No, this flaw goes right back to Sherlock, first released in Mac OS 8.5 in 1998.
At that time, few Macs had more than 5 GB of hard disk storage, and local search typically dealt with tens of thousands of files. That was also the first year that Google published its index, estimating that there were about 25 million web pages in all. Apple didn’t have its own web browser to offer, but made Microsoft’s Internet Explorer the default until Safari was released five years later. Merging local and web search into a single app seemed a good idea, and that’s the dangerous precedent set by Sherlock 27 years ago.
The result today only conflates and confuses.
In the days of Sherlock, web search was more a journey of discovery, where most search engines ranked pages naïvely according to the number of times the search term appeared on that page. That only changed with the arrival of Google’s patented PageRank algorithm at the end of the twentieth century, and placement of ads didn’t start in earnest until the start of the new millennium, by which time Safari was established as the standard browser in Mac OS X.
Local search was and remains a completely different discipline, with no concept of ranking. As local storage increased relentlessly in capacity, file metadata and contents became increasingly important to its success. Internally local searches have been specified by a logical language of predicates that are directly accessible to remarkably few users, and most of us have come to expect Spotlight’s indexing to handle metadata for us.
The end result challenges the user with negotiating web search engines and dodging their ads using one language, confounded by the behaviour of Siri Suggestions, and hazarding a wild guess as to what might come up in the metadata and content of files. More often than not, we end up with a potpourri that fails on all counts.
As an example, I entered the terms manet painting civil war into Spotlight’s Global Search box and was rewarded with a link to Manet’s painting of The Battle of the Kearsarge and the Alabama from 1864, as I’d hope. But entered into the search box of a Find window, those found anything but, from Plutarch’s Lives to a medical review on Type 2 diabetes. In MarsEdit’s Core Spotlight, though, they found every article I have written for this blog that featured the painting.
To get anything useful from local Spotlight, I had to know one of the ships was the USS Kearsarge, and that unusual word immediately found an image of the painting, but no useful content referring to it. Had I opted to search for the word Alabama instead, I would have been offered 94 hits, ranging from linguistics to the Mueller report into Russian interference in the 2016 US Presidential election. Adding the requirement that the file was an image narrowed the results down to the single image.
Conversely, entering Kearsarge into Global Spotlight offered a neighbourhood in North Conway, New Hampshire, in Maps, information about three different US warships from Siri Knowledge, Wikipedia’s comprehensive disambiguation page, a list of five US warships of that name, and three copies of the image of Manet’s painting without any further information about them.
Spotlight is also set to change with the inevitable addition of AI. Already suggestions are tailored using machine learning, but as far as I’m aware local Spotlight doesn’t yet use any form of AI-enhanced search. Words entered into search boxes and bars aren’t subject to autocorrection, and although Global Spotlight may suggest alternative searches using similar words, if you enter acotyle Spotlight doesn’t dismiss it as a mistake for acolyte. It remains to be seen whether and when local Spotlight switches from Boolean binaries to fuzziness and probability, but at least that will be more akin to the ranking of web pages, and we’ll no longer need to be bilingual.
For the time being, we’re left with a Swiss Army knife, ideal for finding where Apple has hidden Keychain Access, but disappointing when you don’t know exactly what you’re looking for.
Spotlight, the current search feature in macOS, does far more than find locally stored files, but in this brief history I focus on that function, and how it has evolved as Macs have come to keep increasingly large numbers of files.
Until early Macs had enough storage to make this worthwhile, there seemed little need. Although in 1994 there were precious few Macs with hard disks as large as 1 GB, networks could provide considerably more. That year Apple offered its first product in AppleSearch, based on a client-server system running over AppleShare networks, and in its Workgroup Servers in particular. This was a pioneering product that was soon accompanied by a local app, Find File, written by Bill Monk and introduced in System 7.5 that September.
The next step was to implement a similar architecture to AppleSearch on each Mac, with a service that maintained indexes of file metadata and contents, and a client that passed queries to it. This became Sherlock, first released in Mac OS 8.5 in 1998. As access to the web grew, this came to encompass remote search through plug-ins that worked with web search engines.
Those were expanded in Sherlock 2, part of Mac OS 9.0 from 1999 and shown above, and version 3 that came in Mac OS X 10.2 Jaguar in 2002. The latter brought one of the more unseemly conflicts in Apple’s history, when developers at Karelia claimed Sherlock 3 had plagiarised its own product, Watson, which in turn had been modelled on Sherlock. Apple denied that, but the phrase being Sherlocked has passed into the language as a result.
Sherlock remained popular with the introduction of Mac OS X, but was never ported to run native on Intel processors. Instead, Apple replaced it with Spotlight in Mac OS X 10.4 Tiger, in April 2005.
Initially, the Spotlight menu command dropped down a search panel as shown here, rather than opening a window as it does now.
A Finder search window, precursor to the modern Find window, is shown in the lower left of this screenshot taken from Tiger in 2006.
Spotlight was improved again in Mac OS X 10.5 Leopard, in 2007. This extended its query language, and brought support for networked Macs that were using file sharing.
This shows a rather grander Finder search window from Mac OS X 10.5 Leopard in 2009.
Search attributes available for use in the search window are shown here in OS X 10.9 Mavericks, in 2014.
Spotlight’s last major redesign came in OS X 10.10 Yosemite, in 2014, when web and local search were merged into Global Spotlight, the search window that opens using the Spotlight icon at the right end of the menu bar. With Global Spotlight came Spotlight (then Siri from macOS Sierra) Suggestions, and they have been accompanied by remote data collection designed to preserve the relative anonymity of the user.
This Finder window in OS X 10.10 Yosemite, in 2015, shows a more complex search in progress.
This shows a search in Global Spotlight in macOS 10.12 Sierra, in 2017.
Local Search in the Finder’s Find window can now use a wide variety of attributes, some of which are shown here, in macOS 10.13 High Sierra, in 2018. Below are search bars for several different classes of metadata.
Over the years, Spotlight’s features have become more divided, in part to safeguard privacy, and to deliver similar features from databases. Core Spotlight now provides search features within apps such as Mail and Notes, where local searches lack access.
Spotlight’s indexes are located at the root level of each indexed volume, in the hidden .Spotlight-V100 folder. Those are maintained by mdworker processes relying on mdimporter plugins to provide tailored access for different file types. If an mdimporter fails to provide content data on some or all of the file types it supports, those are missing from that volume’s indexes, and Spotlight search will be unsuccessful. This happened most probably in macOS Catalina 10.15.6, breaking the indexing of content from Rich Text files. That wasn’t fixed until macOS Big Sur 11.3 in April 2021.
Over the last few years, macOS has gained the ability to perform optical character recognition using Live Text, and to analyse and classify images. Text and metadata retrieved by the various services responsible are now included in Spotlight’s indexes. From macOS 13 Ventura in 2022, those services can take prolonged periods working through images and file types like PDF that include images they can process to generate additional content and metadata for indexing.
Those with large collections of eligible files have noticed sustained workloads as a result. Fortunately for those with Apple silicon Macs, those services, like Spotlight’s indexing, run almost exclusively on their Mac’s E cores, so have little or no effect on its ability to run apps. For those with Intel processors, though, this may continue to be troubling.
In less than 30 years, searching Macs has progressed from the basic Find File to Spotlight finding search terms in text recognised in photos, in almost complete silence. Even Spotlight’s 20th birthday passed just over a month ago, on 29 April, without so much as an acknowledgment of its impact.
Although the great majority use GUI search tools provided in the Global Spotlight menu, Finder Find windows, and in-app Core Spotlight, macOS also provides access using query languages. This article takes a brief tour of those available in macOS. As with previous coverage, this doesn’t include third-party utilities such as HoudahSpot that provide their own interface to Spotlight searching, nor alternative search methods.
Search boxes provided by both Global Spotlight and Local Spotlight can accept a simple form of query language, for example
name:"target"*cdw
which also works with filename:, and performs the equivalent of the Matches operator in a Find window. These use English terms for the attributes to be used, like name and filename, including some of those listed here for Core Spotlight. However, limited information is available and this doesn’t appear to be extensive enough to use at scale. Operators available are also limited within those listed for Core Spotlight.
Modifiers available in current macOS include
The asterisk * can be used as a wildcard to match substrings, and the backslash \ acts as an escape character, for example \" meaning a literal ". In theory, simple predicates can be combined using && as AND, and || as OR.
In practice, getting these to work is tricky, and rarely worth the effort of trying.
One of the least-used attributes available in search bars in the Find window enables the use of what are termed raw queries. Confusingly, these use different names for attributes, such as kMDItemDisplayName instead of name. Otherwise these are more reliable than those used in search boxes. For example, when searching for the string target,
kMDItemDisplayName = "*target*"
is the equivalent of Contains, and
kMDItemDisplayName = "target*"w
appears functionally identical to Matches.
These appear to be an option of last resort, and need documentation.
mdfind
This command tool provides the most complete access to the central Spotlight Query Language, which defies abbreviation to SQL. In addition, it also supports a direct form that searches for matching file names only, using
mdfind -name "target"
to find the word target in filenames.
Unfortunately, although Spotlight Query Strings are predicates, they aren’t the same as NSPredicates used elsewhere within macOS. One of the most obvious differences is that Spotlight’s modifiers are appended to the value, not the operator, as they are when using search predicates in the log command, for example.
Query strings take the general form
attribute operator value[modifiers]
where
attribute is a kMD… name defined as a metadata attribute key,
operator can take a formal version such as ==, or may be abbreviated to just =, which appear to be identical in effect,
value can be a string containing wildcards or escapes, or those detailed for special cases such as numbers and dates,
modifiers include those given above.
Simple examples are
mdfind "kMDItemDisplayName = '*target*'"
or
mdfind "kMDItemDisplayName == '*target*'"
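The general form above can be assembled mechanically. The following is a minimal sketch in Python, assuming the value contains no backslash escapes of its own; build_query and mdfind_command are hypothetical helper names, and the sketch only prints command lines rather than running mdfind.

```python
import shlex

def build_query(attribute, value, operator="==", modifiers=""):
    """Format one simple query as: attribute operator "value"modifiers."""
    # Escape embedded double quotes so the value remains a single quoted literal.
    escaped = value.replace('"', '\\"')
    return f'{attribute} {operator} "{escaped}"{modifiers}'

def mdfind_command(query):
    """Return a shell-quoted mdfind command line for the query, without running it."""
    return "mdfind " + shlex.quote(query)

# Wildcards on both sides of the value, with the formal == operator.
contains = build_query("kMDItemDisplayName", "*target*")
# Trailing wildcard only, the abbreviated = operator, and the w modifier appended to the value.
starts_word = build_query("kMDItemDisplayName", "target*", operator="=", modifiers="w")

print(mdfind_command(contains))
print(mdfind_command(starts_word))
```

Keeping the modifiers as a separate argument reflects the point that they are appended to the value rather than the operator.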
Apple’s current list of common metadata attribute keys is given here. Otherwise, documentation is old if not ancient, and there are obvious differences from current Spotlight and mdfind, such as the expanded list of modifiers.
File metadata queries are explained here, with an apparently duplicated version here. File metadata attributes are documented here, and a general account of predicates is here.
On the face of it, one way to become familiar with and to develop query strings for use in mdfind might be to set them up in a Find window, save that and use the query string within it for your own. Although this can be helpful, queries in saved search files often use attributes not accessible to mdfind. For example, entering the string target in the search box and setting a search bar to Kind is Image All is represented by the query
(((** = "target*"cdw)) && (_kMDItemGroupId = 13))
where the second attribute is an internal form of kMDItemKind.
However, this quickly runs into difficulties, as values of _kMDItemGroupId don’t appear to be documented, and substituting that with an alternative such as
kMDItemKind = "public.image"
fails silently.
The ability to save Spotlight searches is perhaps its most underused feature. This article explains how Saved Search and Smart Folders work, and how you can use them to your advantage.
Open a new Finder window and turn it into a Find window using the Find command at the foot of the File menu. Leave its search box blank, and set up one or more search bars with criteria that find some files of interest.
Then click on the Save button at the upper right, just below the search box. In the save dialog, set the saved location to an accessible folder such as ~/Documents, leave the Add To Sidebar checkbox empty, and click Save.
Close that Find window, and select your Saved Search ‘folder’ in the Finder. That will display the same files you just found in that search, as if it was a normal folder. Select any of those files, though, and you’ll see that they’re not really there, and their paths are for the original file, as if they were symbolic links, perhaps.
You can now move that Saved Search around, even copy it to another Mac, and wherever it goes it performs the same search and shows the results. Open Get Info on the Saved Search item and it’s described as a Saved Search Query, and the predicate used internally by Spotlight for that search is shown as its Query.
Now double-click on the Saved Search item to open it in its own window, and click on the Action tool (the circle containing an ellipsis …) and select the Show Search Criteria item in the menu. This restores your original Find window, complete with all its original settings, search bars, and their contents.
You can now change that search, and it will update to show the new results. To save your modified search, click on the Save button at the upper right, and the Saved Search will be duly updated.
You can do exactly the same starting from the Finder’s New Smart Folder command in its File menu, as that creates a new Find window with identical features. The end results are the same, a file of type com.apple.finder.smart-folder. Only, as you’ll have gathered by now, this isn’t a real folder at all, just a property list that the Finder handles in an unconventional way.
Open the .savedSearch file using a text editor (you can make that easier by changing its extension to .plist if you wish), and you’ll see that it doesn’t even list the files ‘contained’ by this ‘folder’, all it gives is the search predicate used by Spotlight to find the items that are shown as being inside it.
That predicate, also shown as the Query in the Get Info dialog, is saved in the property list against its RawQuery key. Much of the rest of the property list is devoted to details enabling the Finder to reconstruct the Find window should you decide to Show Search Criteria. But nowhere does it list any of the items found in the search.
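As a rough illustration of reading that key, here is a minimal Python sketch using the standard plistlib module. The function name read_raw_query is hypothetical, and the demonstration writes a stand-in property list containing only a RawQuery key rather than a real Saved Search, which holds much more.

```python
import plistlib
import tempfile
from pathlib import Path

def read_raw_query(saved_search_path):
    """Return the RawQuery string from a .savedSearch property list, or None if absent."""
    with open(saved_search_path, "rb") as f:
        plist = plistlib.load(f)  # handles both XML and binary property lists
    return plist.get("RawQuery")

# Stand-in file for demonstration only: a real Saved Search also stores the
# details the Finder needs to rebuild its Find window.
sample = {"RawQuery": '(((** = "target*"cdw)) && (_kMDItemGroupId = 13))'}
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "Example.savedSearch"
    path.write_bytes(plistlib.dumps(sample))
    query = read_raw_query(path)

print(query)
```

To try this on one of your own Saved Searches, pass its path to read_raw_query in place of the stand-in file.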
This ensures that Saved Searches and Smart Folders take almost no space, just a few KB for that property list, and unless they’re open and displaying the search results, they don’t take any memory either.
They also don’t behave like regular folders. You can’t add items to them except by changing the search criteria used. You can only copy items from them, and can’t remove anything either. If you duplicate a Saved Search then you simply get another copy of the property list, not the items found by it. Their contents are also dynamic: create another item that meets their search criteria, and that will automatically be added to their contents, hence the alternative name of Smart Folder.
They’re also versatile and can have several roles:
They’re also the only place that I’m aware of that provides a bridge between searches made using the GUI in the Finder, and the terms and format of predicates used internally and by mdfind. This is important, as Saved Searches are themselves of no use in the command line, but their Query string can be used directly, as I’ll demonstrate in a later article.
The most common problem reported in Spotlight search is failure to find target file(s). In this series of articles, I’ll examine what can go wrong, and how you can make local searches more successful using features in macOS. I’m well aware that there are other utilities for searching, some relying on Spotlight, others working independently, but here I’ll confine myself as much as possible to what’s provided in macOS.
Successful searches using Spotlight have four essential requirements:
The first of those can be checked using the mdimport command tool in Terminal, using the command
mdimport -t -d3 [filepath]
where [filepath] is the path of the file. You can vary the digit used in the -d3 option from 1 to 3, with larger numbers delivering more detail. For -d3 you’ll first be given the file type and mdimporter used, following which all the data extracted is given according to its Spotlight attributes:
Imported '/Users/hoakley/Documents/SpotTestA.rtf' of type 'public.rtf' with plugIn /System/Library/Spotlight/RichText.mdimporter.
37 attributes returned
followed by a long list.
If the file hasn’t been indexed, this article works through the steps you can take to rectify that. Note that recent experience is that using mdutil -E / to erase and force rebuild of indexes on the Data volume may not work as expected, and you should either perform this in System Settings, or use the command
mdutil -E /System/Volumes/Data
Global Spotlight is accessed through the magnifying glass icon in the right of the menu bar, or using the default key combination of Command-space. This includes website content, and isn’t ideal when you’re searching for local files. If you want to use this as an easy gateway to local search, enter the text you want to search for and scroll down to use the command Search in Finder, which opens a Finder Find window for the results of that search query. Alternatively, you can click on Show More in the Documents section of the search results.
Local Spotlight can also be opened by the Find command at the foot of the Finder’s File menu, and takes you straight to its search box, at the right of the toolbar in that Finder window.
This window offers a choice of two search scopes at the upper left. The first covers all the accessible contents of that Mac, and the second is the folder open in that window when it was converted to a Find window. To set the scope for a local Spotlight search, start from a normal Finder window with the target folder open, then use the Find command on that.
Typing text into the search box at the right of the toolbar then performs live or incremental search for both filenames and content at the same time, or you can select one of them in the menu.
Text entered into the search box can simply be the start of a word in the target, or can be a basic search query such as name:"target"*cdw. I will explain those in a later article about search queries.
Instead of, or in addition to, entering text in the search box, you can set further search criteria in search bars below that.
In this case, the file Name is required to contain the string entered in the text box. Add more of these search bars for additional criteria to narrow your search.
In search bars, the first popup sets the metadata attribute to use in the query. For example, both Name and Filename refer to the name of the file, although Name is given in its localised form. Many more options are available by selecting the Other… item at the foot of the attribute menu. You can either set those as a one-off, or add them to the menu of attributes if you’re likely to use them again. These roughly correspond to the metadata attributes as in formal Spotlight search queries used elsewhere, although their names are different.
The second popup sets the operator to be used. While they may appear self-explanatory, two merit fuller explanation as they may work differently from how you might expect:
These are crucial in search queries run from the search box in a Find window, and the matches operator used in a search bar below. Although the search box claims to use the contains operator, it actually behaves as the matches operator does in a search bar.
In many languages word roots and meaning appear at the start of words, with declensions and conjugations at the end. If you want to find words related to harvest, like harvester, harvesting and harvested, then you’re going to enter a search query using harvest rather than vest. Like other search engines designed for live or incremental search, Spotlight is fastest when searching for the start of words. It therefore divides compound words often used for filenames into component words. It does so using rules for word boundaries laid down in the International Components for Unicode.
In practice, word boundaries include a space, the underscore _, hyphen - and changes of case used in CamelCase. Spotlight treats each of the following examples as three words:
one target two
one_target_two
one-target-two
OneTargetTwo
Languages other than English may allow other word boundaries, but those are the most common.
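The splitting into words shown in those examples can be roughly approximated in Python. This is only a sketch: rough_words and starts_a_word are hypothetical helpers, the rules are a crude simplification of those in the International Components for Unicode, and it makes no attempt to reproduce how Spotlight itself handles hyphens or other awkward cases.

```python
import re

def rough_words(name):
    """Split a name at spaces, underscores, hyphens and lower-to-upper case changes."""
    # Insert a space wherever a lower-case letter is followed by an upper-case one.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return [word.lower() for word in re.split(r"[ _-]+", spaced) if word]

def starts_a_word(name, term):
    """True if any component word of the name begins with the search term."""
    return any(word.startswith(term.lower()) for word in rough_words(name))

for example in ("one target two", "one_target_two", "one-target-two", "OneTargetTwo"):
    print(example, rough_words(example))

print(starts_a_word("OneTargetTwo", "targ"))   # term begins a component word
print(starts_a_word("OneTargetTwo", "arget"))  # term only occurs inside a word
```

The second helper illustrates why a search that works from the start of words can miss a term that a plain substring test would find.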
The rules recognise that hyphens are difficult, and Spotlight makes them even trickier as it can ignore them altogether when searching for an arbitrary string without word boundaries, and will then happily find netargett in one-target-two! Spotlight also struggles with multiple hyphens mixed with underscores. For example, it may not be able to find danger in the file name a_a-b-c-e-danger_z.txt when using matches, but should work as expected when using contains instead.
In-app search or Core Spotlight relies on search features provided by the app, for example that in Mail. Although these use Spotlight’s indexes and its query language, their function is different from Global or Local Spotlight, and implemented through a distinct Core Spotlight API.
The primary command tool for performing Spotlight search is mdfind, which uses formal query strings and predicates. I’ll tackle those in a future article.
I’m very grateful to Aldous and to Thomas Tempelmann for their unstinting help in understanding word boundaries and their importance, and for solving the mystery of Cleta.
How everything grows over time. Twenty years ago a hard disk of 100 GB was often ample, now twenty times that can be insufficient, and some have even larger media libraries. Finding files from among tens of thousands used to be straightforward, but now we’re working with millions we’re often struggling. Last week’s discussions of Spotlight search and its alternatives highlighted how important search strategies have become.
Perhaps the most common strategy we use to search quickly and effectively is to apply a series of properties or attributes narrowing from the general to the specific: a dog, a small dog, a small grey-and-white dog, a small grey-and-white Havanese dog. In just a few adjectives we have narrowed the field to a description applying to a small number of domestic pets.
This strategy has two essential requirements: the target of your search must be included in the list of items being searched, and each of the attributes or criteria you apply in succession must include the search target. The first is obvious and critical to Spotlight’s success, and the second is the basis of how attributes are chosen. If the dog’s colour had been specified as red, then that search would have failed.
One of many skills in successful searching is judging how exclusive each criterion should be, and being more inclusive to ensure none of the criteria might inadvertently exclude the target.
Although you can combine attributes in this way when searching using the general Spotlight search window accessed through the menu bar, that’s a global search including websites and everything searchable from Wikipedia to Photos albums and Messages. When looking for a file, searching in the Finder immediately narrows the scope, and saves you wading through many irrelevant results. You can then add a search bar for each criterion, perhaps specifying that you’re looking for an image in your ~/Documents folder, each time reducing the number of hits until your choice becomes sufficiently limited.
Spotlight offers another technique that has become popular in search engines as their performance has improved, in what’s known as live or incremental search. As you type letters into one of its search boxes, it shows results as it gets them. This isn’t much use when entering common combinations of letters, but as they become more specific this can save time and accommodate any uncertainty you might have over spelling or the rest of the word. I use this frequently in MarsEdit when looking for old articles I have written: for example, typing wrestl will find wrestler, wrestlers, wrestling, wrestled, etc.
This works well with most languages including English, where roots and meanings are concentrated in the first parts of words, and declension and conjugation are usually found in their endings. Not all languages work like that, though, and this may not perform as well in Georgian or even German due to their morphology.
For those who prefer to use the command line, mdfind can use predicates to express combinations of attributes, but those aren’t readily used in the same incremental way to narrow results down interactively. Another situation where predicates often come into play is when searching log entries and using the log show command, and that brings me on to LogUI, my other concern last week.
Let’s say you want to discover all the information RunningBoard gathers about an app, something you know is written in a log entry by the com.apple.runningboard subsystem shortly after that app starts its launch sequence. While you could search for all entries for that subsystem in the minute or so around the time you launched the app, there are likely to be thousands of hits.
To narrow down that search you have several options, including:
a predicate such as subsystem == "com.apple.launchservices" OR subsystem == "com.apple.runningboard";
a search for com.apple.launchservices to identify the time that LaunchServices announces the app will be launched through RunningBoard;
a search for constructed job description, RunningBoard’s log entry giving the details you’re looking for.
Those are ordered in increasing specificity, reducing numbers of hits, and increasing requirement for prior knowledge. That’s a general association, in that the more prior knowledge you have, the more specific you can make your criteria, and the fewer irrelevant hits you will see. As with Spotlight search, the more of these criteria you apply, the greater your chance of success, provided they all match the entry you’re looking for.
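A predicate combining several subsystems with OR can be built from a list rather than typed by hand. This is a minimal Python sketch with a hypothetical helper name, subsystem_predicate. It prints a log show command line using the --predicate option without running it, and leaves the choice of time period to you.

```python
import shlex

def subsystem_predicate(subsystems):
    """Join subsystem names into a single predicate, combined with OR."""
    return " OR ".join(f'subsystem == "{name}"' for name in subsystems)

predicate = subsystem_predicate(
    ["com.apple.launchservices", "com.apple.runningboard"]
)

# Shell-quote the predicate so its double quotes survive in Terminal.
print("log show --predicate " + shlex.quote(predicate))
```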
To make LogUI more amenable to incremental search strategies, two additional features are needed. Instead of only exporting whole log extracts to Rich Text, the app needs to save and read formatted extracts. It also needs the ability to eliminate entries that don’t meet search criteria. Together those will enable use of a predicate to save an extract of reduced size, then application of search criteria, maybe saving an even smaller extract.
One way to combine multiple searches is to use multiple search bars, in a similar way to the Finder’s Find window. However, that tends to become overcomplicated, and I suspect is relatively little-used. If you do need a series of search criteria, then you also need different ways of combining them, including OR as well as AND, and that becomes a GUI predicate editor. I have yet to see any successful GUI predicate editor.
Next week, in the days prior to WWDC, I’m going to be focussing on search strategies for Spotlight, before turning to LogUI to implement these changes. This is an ideal time to let me know what you’d like to see, and how LogUI can support more successful search.
Searching for a file with a distinctive word in its name should be straightforward, but here I show some weird problems that could catch you out. I’m very grateful to Sam for drawing my attention to this, and welcome all and any rational explanations of what’s going on.
In some accounts of ancient Greek mythology, Cleta (Κλήτα) was one of the two Charites or Graces, alongside Phaenna. Her name apparently means renowned, and is still occasionally used as a first name today. It’s not the sort of word that should give Spotlight any cause for concern, and should prove easy to find.
To see the problems it can cause, create a folder somewhere accessible, in ~/Documents perhaps, and create half a dozen files with the names shown below.
Now open a new Finder window, and set it to Find mode using that command at the foot of the File menu. Then type into its search box the letters cleta
Only four of the files in that folder are found, excluding the first two, despite the fact that all their names clearly contain the search term.
Now clear the search box, and in the search criterion below, set it to find Name contains cleta, which you might have thought would be the same as the previous search.
Now all six files are found successfully.
You can try other variations of the file name to see which can be found using the search box, and which remain hidden. For example, 1995z_spectacletable_01.txt also appears susceptible to this problem, suggesting that other examples might have the form
[digits]_[chars]cleta[chars]_[digits].[extension]
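If you want to pick out names of that form from a longer list of candidates, a regular expression is one option. The Python sketch below is a loose reading of the suggested form, not a description of what Spotlight does: SUSPECT_FORM and has_suspect_form are hypothetical names, and the leading group is assumed to allow one trailing letter so that 1995z is accepted.

```python
import re

# Loose reading of [digits]_[chars]cleta[chars]_[digits].[extension].
# Assumption: the first group may end in a single letter, as in 1995z.
SUSPECT_FORM = re.compile(
    r"^\d+[a-z]?_[a-z]*cleta[a-z]*_\d+\.[a-z0-9]+$",
    re.IGNORECASE,
)

def has_suspect_form(filename):
    """True if the file name fits the suggested form."""
    return SUSPECT_FORM.match(filename) is not None

for name in ("1995z_spectacletable_01.txt", "cleta.txt"):
    print(name, has_suspect_form(name))
```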
There are some other oddities at work as well, that you can see in the four file names that haven’t yet played hide and seek. So far I’ve been using Spotlight to find file names that simply contain the characters cleta. Now extend that to cletapainting
While you would expect the second of those to appear, Spotlight has elided the hyphen embedded in the first, as if it wasn’t there. Although Spotlight doesn’t provide a simple way to search for discrete words in file names, that’s a feature readily accessible in several third-party search utilities, including Find Any File and HoudahSpot. If you use Spotlight much, both of those are essentials, and you may wish to add Alfred as well.
As expected, Find Any File has no problems in finding all six test files when looking for names containing cleta.
Set it to find names containing the word cleta, though, and it recognises spaces, hyphens and underscore _ characters as word separators, but doesn’t oblige with CamelCase, whether or not you capitalise its initial character.
cleta in file names, as they can confuse Spotlight.
My thanks again to Sam for providing me with the example of cleta that made this possible if apparently highly improbable.
Postscript
For those who think this all works as they expect, try the following file name:
1995z_star-post-office-cleta-hunt-portrait_01.txt