
Trump Administration Stopping Efforts to Collect Scientific Data

18 September 2025 at 17:04
A pattern of getting rid of statistics has emerged that echoes the president’s first term, when he suggested if the nation stopped testing for Covid, it would have few cases.



New Research Helps Explain Gas Craters in Siberia

16 September 2025 at 17:02
Spontaneous gas explosions appear to be increasing in northern Russia because of climate change and some specific local conditions.

A gas crater on the Yamal Peninsula in northern Russia in August 2014. © Vasily Bogoyavlensky/Agence France-Presse — Getty Images

What Exactly Are A.I. Companies Trying to Build? Here’s a Guide.

16 September 2025 at 17:00
Amazon, Microsoft, Google, Meta and OpenAI plan to spend at least $325 billion by the end of the year in pursuit of A.I. We explain why they’re doing it.


Quirks of Spotlight local search

By: hoakley
4 September 2025 at 14:30

Over the last few weeks, as I’ve been digging deeper into Spotlight local search, what seemed at first to be fairly straightforward has become far more complex. This article draws together some lessons that I have learned.

Apple’s search patents have been abandoned

Having tracked down a batch of Apple’s patents relating to search technologies that are likely to have been used in Spotlight, I was surprised to see that they have all been abandoned. For example, US Patent 2011/0113052 A1, Query result iteration for multiple queries, filed by John Hörnkvist on 14 January 2011, was abandoned five years later. I therefore have no insights to offer based on Apple’s extant patents.

Search is determined by index structure

Spotlight indexes metadata separately from contents, and both types of index point to files, apparently through their paths and names, rather than their inodes. You can demonstrate this using test file H in SpotTest. Once Spotlight has indexed objects in images discovered by mediaanalysisd, moving that file to a different folder breaks that association immediately, and the same applies to file I, whose text is recognised by Live Text.
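
You can demonstrate that path association roughly from Terminal. This is a minimal sketch, assuming a test image whose recognised text contains an unusual word; the paths and search term here are hypothetical:

# find the image by text recognised in it (content, not metadata)
mdfind -onlyin ~ "(** == 'syzygy*'cdw)"
# move it elsewhere on the same volume
mv ~/TestFolder/fileI.png ~/Documents/
# repeat the search: the moved image should no longer appear until reindexed
mdfind -onlyin ~ "(** == 'syzygy*'cdw)"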

Extracted text, that recovered using optical character recognition (Live Text), and object labels obtained using image classification (Visual Look Up) are all treated as content rather than metadata. Thus you can search for content, but you can’t obtain a list of objects that have been indexed from images, any more than you can obtain Spotlight’s lexicon of words extracted as text.

Language

Spotlight’s indexes are multilingual, as demonstrated by one of Apple’s earliest patents for search technology. Extracted text can thus contain words in several languages, but isn’t translated. Object labels are likely to be in the primary language set at the time, for example using the German word Weide instead of the English cattle, if German was set when mediaanalysisd extracted object types from that image. You can verify this in SpotTest using test file H and custom search terms.

If you change your Mac’s primary language frequently, this could make it very hard to search for objects recognised in images.
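
If you want to check which language was current when your images were indexed, a sketch like this may help (the terms and path are illustrative):

# hits if object labels were indexed with German as the primary language
mdfind -onlyin ~/Pictures "(** == 'Weide*'cdw)"
# hits if they were indexed with English as the primary language
mdfind -onlyin ~/Pictures "(** == 'cattle*'cdw)"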

Search method makes a difference

The Finder’s Find feature can be effective, but has a limited syntax lacking OR and NOT, unless you resort to using Raw Query predicates (available from the first popup menu).* This means it can’t be used to construct a search for text containing the word cattle OR cow. That has a beneficial side effect, in that each additional term should reduce the number of hits, but it remains a significant constraint.

The Finder does support some search types that aren’t available to other methods such as mdfind. Of the image-related types, John reports that kMDItemPhotosSceneClassificationLabels can be used in the Finder’s Find and will return files with matching objects that have been identified, but that doesn’t work in mdfind, either in Terminal or when called by an app, as sketched after the list below. Other promising candidates that have proved unsuccessful include:

  • kMDItemPhotosSceneClassificationIdentifiers
  • kMDItemPhotosSceneClassificationMediaTypes
  • kMDItemPhotosSceneClassificationSynonyms
  • kMDItemPhotosSceneClassificationTypes
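
Based on John’s report, a query like the following can be expected to return matches in the Finder’s Find but nothing from mdfind (the label is illustrative):

mdfind "kMDItemPhotosSceneClassificationLabels == 'cattle'"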

One huge advantage of mdfind is that it can perform a general search for content using wildcards, in the form
(** == '[searchTerm]*'cdw)
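
In that form, the c, d and w modifiers make the match case-insensitive, diacritic-insensitive and word-based respectively. For example, to run such a search over the Home folder with an illustrative term:

mdfind -onlyin ~ "(** == 'cow*'cdw)"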

Using NSMetadataQuery from compiled code is probably not worth the effort. Not only does it use predicates of a different form from mdfind’s, but it’s unable to make use of wildcards in the same way that mdfind can, a lesson again demonstrated in SpotTest. mdfind can also be significantly quicker.

For example, you might use the form
mdfind "(kMDItemKeywords == '[searchTerm]*'cdw)"
in Terminal, or from within a compiled app. The equivalent predicate for NSMetadataQuery, with cattle as the search term, would read
(kMDItemKeywords CONTAINS[cdw] "cattle")

Another caution when using NSMetadataQuery is that apps appear to share a single NSMetadataQuery instance on their main thread. That can lead to results from previous queries leaking into those of new ones.

Key points

  • Spotlight indexes metadata separately from contents.
  • Text recovered from images, and objects recognised in images, appear to be indexed as contents. As a result you can’t obtain a lexicon of object types.
  • Indexed data appear to be associated with the file’s path, and will be lost if a file is moved within the same volume.
  • Text contents aren’t translated for indexing, so need to be searched for in their original language.
  • Object types obtained from images appear to be indexed using terms from the primary language at the time they are indexed. If the primary language is changed, that will make it harder to search for images by contents.
  • The Finder’s Find is constrained in its logic, and doesn’t support OR or NOT, although using Raw Queries can work around that.
  • mdfind is most powerful, including wildcard search for content.
  • NSMetadataQuery called from code uses a different predicate format and has limitations.

I’m very grateful to Jürgen for drawing my attention to the effects of language, and to John for reporting his discovery of kMDItemPhotosSceneClassificationLabels.

* I’m grateful to Jozef (see his comment below) for reminding me that there is a way to construct OR and NOT operations in the Finder’s Find, although it’s not obvious, and even when you know about it you may well find it difficult to use. I will explain more about this in another article next week.

SpotTest 1.1 has search scopes for volumes

By: hoakley
25 August 2025 at 14:30

As promised, this new version of my Spotlight indexing and search utility SpotTest extends its reach beyond the user’s Home folder, and can now test and search any regular volume that’s connected to your Mac and mounted in /Volumes.

By default, its searches remain restricted to the user’s Home folder, where SpotTest’s folder of crafted test files is installed. That applies whether you opt to search using its NSMetadataQuery tool, or the much faster mdfind tool. If you want to search another mounted volume, click on the 🔄 button for the app to check which volumes are available, then select one from its new Scope menu items. Volumes listed there exclude Time Machine backups and any hidden volumes whose names start with a dot, as those are excluded from Spotlight indexing in any case.
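
Outside SpotTest, you can check whether Spotlight has indexing enabled on a mounted volume, and run a scoped search there, using a couple of standard commands (the volume name here is hypothetical):

mdutil -s /Volumes/External
mdfind -onlyin /Volumes/External "(** == 'syzygy999*'cdw)"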

This new version also fixes a weird bug that you’re unlikely to encounter in the previous version, but in rare circumstances could be infuriating. When searching using the NSMetadataQuery tool, if you had two windows open, both with results from that tool, both would be updated with the same search results, and the search times reported in them could grow absurdly long. This occurred because both windows were being updated with the data returned from the most recent search, as the NSMetadataQuery is shared in the app’s MainActor. After some fraught debugging, windows in this version ignore any search result updates initiated by other windows. I hope!

Volumes set in the Scope menu only affect search scope. Test folders are created in and removed from the user’s Home folder, and mdimporters are checked there as well. If you want to investigate indexing and search performance on other volumes, then you should manually create your own test folders as necessary. One quick and simple approach is to create a standard test folder in the Home folder, and copy that onto the volume(s) you want to test. A little later this week I’ll illustrate this in an article explaining how to get the best out of SpotTest and how it can help diagnose Spotlight problems.

I have taken the opportunity to improve SpotTest’s reporting of errors, such as trying to remove a test folder that doesn’t exist. I have also thoroughly revised the Help book, and added a page about search scopes.

SpotTest version 1.1 for macOS 14.6 and later, including Tahoe, is now available from here: spottest11, from Downloads above, and from its Product Page.

Enjoy!

SpotTest 1.0 will help you diagnose Spotlight problems

By: hoakley
18 August 2025 at 14:30

There are some topics that invariably generate comments from those who have either abandoned a major feature in macOS, or are struggling with it. Some of the most persistent are problems with Spotlight, particularly with its local search of files on your Mac. To help grapple with those, four years ago I added some Spotlight tests to Mints that can be used to work out where those problems are occurring. I’m delighted now to offer an extension to those in a whole new app, perhaps predictably named SpotTest.

Spotlight is so substantial, so nearly silent in the log, and so impenetrable that the best approach to diagnosing its problems is to test it out in a controlled way. Mints has been doing that by creating a folder of files containing an unusual word, then searching for that. Although that’s still useful for a quick test, we need something more focused and flexible, and that’s what SpotTest aims to deliver.

Following deep dives into how Spotlight indexes and searches metadata and contents of files, and how it can search text extracted from images and the results of image analysis, I’ve realised that different test files are required, together with alternative means of search. For example, the standard approach used in compiled apps, with NSMetadataQuery, is incapable of finding content tags obtained using Visual Look Up, which only appear when using the mdfind command. SpotTest takes these into account.

There are now 15 carefully crafted test files, of which one cannot currently be found, no matter what method of search you try.

A perfect 13/15 result from NSMetadataQuery is only possible after waiting a day or more for background mediaanalysisd processing to recognise and extract the text in file I, a PNG image; file H, whose objects are labelled by image classification, can only be found using mdfind. The other 12 should all be found when running this test a few seconds after the test files have been created. Apart from file L, an XML property list, they rely on a range of mdimporter modules bundled in macOS.

Another of SpotTest’s tools will list the mdimporters used for each of the test files.
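
If you prefer Terminal, macOS’s bundled mdimport command can list all installed importers, although without SpotTest’s per-file mapping:

mdimport -L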

Run the search using the mdfind command within SpotTest and, once mediaanalysisd has done its image recognition, you should get a perfect 14/15.

The only current limitation of SpotTest version 1.0 is that it can only run tests on the Data volume that your Mac started up from, using a folder at the top level of your Home folder. A future version will let you test other volumes as well. Its Help book runs to nine pages: please read them, as its tests might seem deceptively simple, but they provide a lot of useful information about how Spotlight local search is functioning. Coupled with log extracts obtained using LogUI, it should shine light in the darkness.

SpotTest 1.0, which requires macOS 14.6 or later, is now available from here: spottest10, and from its new place on its Product Page.

I wish you successful searching.

A deeper dive into Spotlight indexing and local search

By: hoakley
4 August 2025 at 14:30

With additional information about Spotlight’s volume indexes, this article looks in more detail at how those indexes are updated when new files are created, and how local search is reflected in log entries. To do this I use the Spotlight test feature in my free utility Mints, which is designed to assess Spotlight’s performance.

Mints can create a test folder and populate it with 9 test files, copied from inside the app bundle using FileManager’s copyItem function. Provided the app and that newly created folder, ~/MintsSpotlightTest4syzFiles, are on the same volume, this will almost certainly occur by creating APFS clone files. The files are:

  • SpotTestA.rtf, RTF, relying on /System/Library/Spotlight/RichText.mdimporter
  • SpotTestB.pdf, PDF, relying on /System/Library/Spotlight/PDF.mdimporter
  • SpotTestC.txt, plain text, relying on /System/Library/Spotlight/RichText.mdimporter
  • SpotTestD.html, HTML, relying on /System/Library/Spotlight/RichText.mdimporter
  • SpotTestE.docx, Microsoft Word docx, relying on /System/Library/Spotlight/Office.mdimporter
  • SpotTestEXIF.jpg, JPEG image not containing the target string in the image, but in the Make field of its EXIF IFD0 metadata, and relying on /System/Library/Spotlight/Image.mdimporter
  • SpotTestF.numbers, Numbers spreadsheet containing the target string in one of its cells, and relying on /System/Library/Spotlight/iWork.mdimporter
  • SpotTestG.pages, Pages document, relying on /System/Library/Spotlight/iWork.mdimporter
  • SpotTestMD.txt, plain text not containing the target string in its contents, but in an extended attribute of type com.apple.metadata:kMDItemKeywords attached when the file is copied.

The search target each of them contains is a string starting with the characters syzygy999, so is most likely to be unique to the test files and Mints’ bundle.
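
Once the test folder has been created, you can inspect how some of those files should be indexed using standard Terminal commands (these are mine, not part of Mints; mdimport’s -t test option is present in recent macOS):

# read the keywords stored in the extended attribute
mdls -name kMDItemKeywords ~/MintsSpotlightTest4syzFiles/SpotTestMD.txt
# run a test import of one file to see what its mdimporter extracts
mdimport -t -d2 ~/MintsSpotlightTest4syzFiles/SpotTestA.rtf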

This example test was performed on a Mac mini M4 Pro running macOS 15.6 from its internal SSD, with the Home folder located in the boot Data volume in that storage.

Indexing

Folder creation and file copying should take place soon after clicking on the Create Test button; in this example test they took just over 0.3 seconds. The first mdworker process responding to this is spawned about 0.9 seconds after file copying starts, and further mdworker processes follow, making a total of 11. These are uncorked by launchd in about 0.03-0.05 seconds, and are active for about 0.2 seconds before they appear to have finished creating new posting lists to be added to the local indexes.

Shortly after those are complete, spotlightknowledge is briefly active; then, 0.2 seconds after file copying, mediaanalysisd receives a text processing request and initialises CGPDFService. Just over 0.1 second later, CoreSceneUnderstanding is invoked from mediaanalysisd. For the following 5 seconds, a series of MADTextEmbedding (mediaanalysisd text embedding) tasks is performed. Finally, content obtained by mdworker and MADTextEmbedding is compressed by mds_stores, finishing about 6.7 seconds after the new files were created.

These steps are summarised in the diagram above, where those in blue update metadata indexes, and those in yellow and green update content indexes. It’s most likely that each file to be added to the indexes has its own individual mdworker process that works with a separate mdimporter, although in this case at least 11 mdworker processes were spawned for 9 files to be indexed.

Text extraction from files by mediaanalysisd involves a series of steps, and will be covered in more detail separately. In this test, local Spotlight indexes should have been updated within 7-8 seconds of file creation, so waiting at least 10 seconds after clicking the Create Test button should be sufficient to enable the test files to be found by search.
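
You can confirm that timing crudely with a sketch like the following; the file name is arbitrary, and the wait can be adjusted:

echo "syzygy999 test" > ~/spottiming.txt
sleep 10    # allow mdworker and mds_stores time to complete
mdfind -onlyin ~ "kMDItemTextContent == '*syzygy999*'cd"
rm ~/spottiming.txt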

Searching

When the Check Search button in Mints is clicked, an NSMetadataQuery is performed using a search predicate over the search scope of NSMetadataQueryUserHomeScope, the Home folder. This uses the search predicate (kMDItemTextContent CONTAINS[cd] "syzygy999") OR (kMDItemKeywords CONTAINS[cd] "syzygy999") OR (kMDItemAcquisitionMake CONTAINS[cd] "syzygy999") to find all items matching any of the following:

  • kMDItemTextContent CONTAINS[cd] "syzygy999" looks for the target string in a text representation of the content of the document, as in most of the test files used.
  • kMDItemKeywords CONTAINS[cd] "syzygy999" looks for keywords in an extended attribute of type com.apple.metadata:kMDItemKeywords as attached to the SpotTestMD.txt file.
  • kMDItemAcquisitionMake CONTAINS[cd] "syzygy999" looks for the target string in the Make field of the EXIF IFD0 metadata embedded in the SpotTestEXIF.jpg file.

Each of those is both case- and diacritic-insensitive, although the test cases all use lower case without any diacritics.
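
An approximately equivalent search can be run from Terminal using mdfind, although that takes Spotlight’s raw query syntax rather than an NSPredicate (this is my construction, not Mints’ own code):

mdfind -onlyin ~ "(kMDItemTextContent == '*syzygy999*'cd) || (kMDItemKeywords == '*syzygy999*'cd) || (kMDItemAcquisitionMake == '*syzygy999*'cd)"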

Remarkably little is recorded in the log for these test searches. com.apple.metadata starts the query within 0.001 second of clicking the Check Search button, and is interrupted by TCC approving the search to proceed. About 0.03 seconds after the start of the query, a series of entries from mds_stores reports query node counts of 4, and results are fetched and sent as replies between 0.09 and 19.5 seconds after the start of the query. However, Mints typically reports a considerably shorter search time of around 4 seconds.

Better testing

The test currently included in Mints appears useful, in that it assesses the whole of Spotlight local search from file creation to completion of a test search. However, in its current form it doesn’t test Live Text or other in-image text extraction, and only applies to the volume containing the Home folder. Log extracts obtained in its Log Window are basic, and don’t capture much of the detail used for this article. There seems to be scope for a dedicated Spotlight test app that addresses these and more.

Summary

  • Spotlight local indexing from the creation of nine new files reliant on macOS bundled mdimporters is initiated promptly, and should be completed within 10 seconds of their creation. It appears to follow the scheme shown in the diagram above.
  • Text extraction from images by mediaanalysisd (‘MAD’) is significantly slower than extraction of metadata and other text content.
  • NSMetadataQuery search should also be relatively quick when the number of hits is small, but leaves little information in the log.

Last Week on My Mac: Search and you’ll find

By: hoakley
3 August 2025 at 15:00

One thing we humans are good at is searching. It’s a task we engage in from a few moments after birth until we slip away in death: we search everything around us. Locating and identifying that bird of prey wheeling high, finding the house keys and that book we mislaid some time last week, meeting the perfect partner, discovering the right job, choosing the best education, looking through a Where’s Wally? or Where’s Waldo? book, and so on. Searching has transformed some into explorers like Christopher Columbus, and was the purpose of the chivalric quest. It’s what researchers in every field do, and thanks to Douglas Adams it can be answered by the number 42.

Last week my searching took two new turns.

Spotlight

The first was more of a meta-search, in trying to discover more about the internals of Spotlight. Following the example of Maynard Handley, who has used them so successfully in understanding how M-series CPUs work, I looked through patents that have been awarded to Apple for the work of its search engineers. Yesterday’s slightly fuller history of Spotlight search is one result, and there are more to come in the future as I digest those patents concerned with performing search.

There’s a tinge of irony here, as many of my searches have been conducted using Google Patents, which, alongside Google Scholar, is one of the few remaining search engines that doesn’t yet use AI to attempt to provide its own answers.

Logs

The other marks a new phase in my quest to get more information from the Unified log. Looking back to my first comment here, I realise how wildly over-optimistic I was when I wrote that it “should make my life a lot easier”, and that “a new version of Console will provide improved features to help us wade through logs.” Nine years later, I look wistfully at what remains of Console and realise how wrong I was on both counts.

When RunningBoard arrived in macOS Catalina, I soon noticed how “its log entries are profuse, detailed, and largely uncensored for privacy.” Since then it has proved garrulous to the point where its apparently ceaseless log chatter is a distraction, and can overwhelm attempts to read other log entries. I suspect it has contributed significantly to those advanced Mac users who now refuse to even try to make sense of the log.

One answer might be to tweak log preferences to shut out this noise, but given the purpose of RunningBoard in monitoring the life cycle of apps, why not try to use the information it provides? To do that, it’s first necessary to understand RunningBoard’s idiosyncratic language of assertions, and the protocols under which they’re acquired. The only way to do that without documentation is by observation: catalogue over 30 of those assertions for an interesting example like Apple’s Developer app, and see what they reveal.

By far the most informative entries from RunningBoard are those announcing that it’s acquiring an assertion, such as
Acquiring assertion targeting [app<application.developer.apple.wwdc-Release.9312198.9312203(501)>:2946] from originator [osservice<com.apple.uikitsystemapp(501)>:748] with description <RBSAssertionDescriptor| "com.apple.frontboard.after-life.subordinate" ID:424-748-2228 target:2946 attributes:[
<RBSDomainAttribute| domain:"com.apple.frontboard" name:"AfterLife-Subordinate" sourceEnvironment:"(null)">
]>

In a log often censored to the point of being unintelligible, this contains frank and explicit detail. The app is identified clearly, with the user ID of 501 and process ID of 2946. The originator is similarly identified as com.apple.uikitsystemapp with its PID of 748, which is confirmed by the middle digits in the Assertion ID. This is explicitly related to FrontBoard and an attribute named AfterLife-Subordinate. There’s not a single <private> to blight this entry, although further knowledge is needed to decode it fully.

Normally to get such information from a running process would require its source code to be instrumented with calls to write log entries, many of which would be lost to <private>, yet RunningBoard seems happy, for the moment, to provide that information freely. You can see what I mean by applying the predicate
subsystem == "com.apple.runningboard" AND message CONTAINS "Acquiring assertion t"
in LogUI, to obtain a running commentary on active apps and processes. Once you’ve identified a relevant assertion, you can focus attention on other log entries immediately prior to that. I will be following this up in the coming week, with fuller instructions and some demonstrations.
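
The same predicate can also be used with the log command in Terminal, if you’d rather not use LogUI:

log stream --predicate 'subsystem == "com.apple.runningboard" AND message CONTAINS "Acquiring assertion t"'
# or retrospectively, over the last 30 minutes
log show --last 30m --predicate 'subsystem == "com.apple.runningboard" AND message CONTAINS "Acquiring assertion t"'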

Although neither patents nor assertions have the significance of the number 42, in their own ways they show how the art and science of search aren’t dead yet, nor have they succumbed to AI.

A more detailed history of Spotlight

By: hoakley
2 August 2025 at 15:00

Since writing A brief history of local search, I have come across numerous patents awarded to Apple and its engineers for the innovations that have led to Spotlight. This more detailed account of the origins and history of Spotlight uses those primary sources to reconstruct as much as I can at present.

1990

ON Technology, Inc. released On Location, the first local search utility for Macs, a Desk Accessory anticipating many of the features to come in Spotlight 15 years later. This indexed text found in the data fork of files, using format-specific importer modules to access those written by Microsoft Word, WordPerfect, MacWrite and other apps of the day. Those files and their indexed contents were then fully searchable. This required System Software 6.0 or later, and a Mac with a hard disk and at least 1 MB of RAM. It was developed by Roy Groth, Rob Tsuk, Nancy Benovich, Paul Moody and Bill Woods.

1991

Version 2 of On Location was released. ON Technology was later acquired by Network Corporation, then by Symantec in 2003.

1994

AppleSearch was released, and bundled in Workgroup Servers. This was based on a client-server system running over AppleShare networks. September’s release of System Software 7.5 introduced a local app, Find File, written by Bill Monk.

1998

Sherlock was released in Mac OS 8.5. This adopted a similar architecture to AppleSearch, using a local service that maintained indexes of file metadata and content, and a client app that passed queries to it. This included remote search of the web through plug-ins working with web search engines, as they became available.

Early patent applications were filed by Apple’s leading engineers who were working on Sherlock, including US Patent 6,466,901 B1 filed 30 November 1998 by Wayne Loofbourrow and David Cásseres, for a Multi-language document search and retrieval system.

1999

Sherlock 2 was released in Mac OS 9.0. This apparently inspired developers at Karelia Software to produce Watson, ‘envisioned as Sherlock’s “companion” application, focusing on Web “services” rather than being a “search” tool like Sherlock.’

2000

On 5 January, Yan Arrouye and Keith Mortensen filed what became Apple’s US Patent 6,847,959 B1 for a Universal Interface for Retrieval of Information in a Computer System. This describes the use of multiple plug-in modules for different kinds of search, in the way that was already being used in Sherlock. Drawings show that it was intended to be opened using an item on the right of the menu bar, there titled [GO-TO] rather than using the magnifying glass icon of Sherlock or Spotlight. This opened a search dialog resembling a prototype for Spotlight, and appears to have included ‘live’ search conducted as letters were typed in.

2001

Karelia Software released Watson.

2002

Mac OS X Jaguar brought Sherlock 3, which many considered had an uncanny resemblance to Watson. That resulted in acrimonious debate.

2005

In preparation for the first Intel Macs, Mac OS X 10.4 Tiger, released in April 2005, introduced Spotlight as a replacement for Sherlock, which never ran on Intel Macs.

Initially, the Spotlight menu command dropped down a search panel as shown here, rather than opening a window as it does now.

2006

On 4 August, John M Hörnkvist and others filed what became US Patent 7,783,589 B2 for Inverted Index Processing, for Apple. This was one of a series of related patents concerning Spotlight indexing. Just a week later, on 11 August, Matthew G Sachs and Jonathan A Sagotsky filed what became US Patent 7,698,328 B2 for User-Directed search refinement.

A Finder search window, precursor to the modern Find window, is shown in the lower left of this screenshot taken from Tiger in 2006.

2007

Spotlight was improved in Mac OS 10.5 Leopard, in October. This extended its query language, and brought support for networked Macs that were using file sharing.

This shows a rather grander Finder search window from Mac OS X 10.5 Leopard in 2009.

2014

Search attributes available for use in the search window are shown here in OS X 10.9 Mavericks, in 2014.

In OS X 10.10 Yosemite, released in October, web and local search were merged into ‘global’ Spotlight, the search window that opens using the Spotlight icon at the right end of the menu bar, accompanied by Spotlight Suggestions.

2015

John M Hörnkvist and Gaurav Kapoor filed what was to become US Patent 10,885,039 B2 for Machine learning based search improvement, which appears to have been the foundation for Spotlight Suggestions, in turn becoming Siri Suggestions in macOS Sierra. Those were accompanied by remote data collection designed to preserve the relative anonymity of the user.


This shows a search in Global Spotlight in macOS 10.12 Sierra, in 2017.

c. 2019

Apple acquired Laserlike, Inc, whose technology (and further patents) has most probably been used to enhance Siri Suggestions. Laserlike had already filed for patents on query pattern matching in 2018.

I’m sure there’s a great deal more detail to add to this outline, and I welcome any additional information.

4 August 2025: I’m very grateful to Joel for providing me with info and links for On Location, which I have incorporated above.
