How to search Spotlight for Live Text and objects in images

By: hoakley
13 August 2025 at 14:30

Spotlight has been able to find text extracted from images using Live Text, and the names of objects recognised using Visual Look Up, for some years now. This article considers how you can and cannot search for those. Although this might seem obvious, it’s more confusing than it appears and could mislead you into thinking that Spotlight indexing or search isn’t working.

As detailed in yesterday’s account of Live Text, text recognition in images uses a lexicon to match whole words rather than proceeding character by character. Terms assigned to recognised objects are also words. Thus, when searching for either type you should use words as much as possible to increase the chances of success.

Global Spotlight 🔍

Type a word like cattle into Spotlight’s search window and you can expect to see a full selection of documents and images containing the term or cattle objects. Those include images containing the word, and images containing objects identified as cattle. They don’t include images in PDF files, as those aren’t analysed by mediaanalysisd, so don’t undergo character or object recognition in the way that regular images like JPEGs do.

The Finder’s Find

Open a new window in the Finder and turn it into a Spotlight search using the Finder’s File menu Find command. In its search box at the top right, type in a word like cattle and in the popup menu select the lower option, Content contains. Press Return and the search box will now display ANY cattle. Then set the Kind to Image in the toolbar above search results, to narrow results down to image files. You should then see a full listing of image files that either contain the word cattle, or contain objects identified by image analysis as being cattle. Note how many of those appear.

Reconfigure the search so the search box is empty, and there are two rows of search settings: the first can remain the same as Kind is Image, but set the second to Contents contains cattle. Those images containing objects identified as cattle will now vanish, leaving images containing the word cattle still listed.

To understand the difference between these, you can save those two Find windows and read the underlying terms used for each search. The search that returned text obtained by both Live Text and Visual Look Up used
(((** = "cattle*"cdw)) && (_kMDItemGroupId = 13))
while the one excluding Visual Look Up used
(((kMDItemTextContent = "cattle*"cdw)) && (_kMDItemGroupId = 13))
instead. We can transfer those to the mdfind command tool to explore further.

mdfind

To use those as search predicates with mdfind we’ll translate them into more general form,
mdfind "(** == 'cattle*'cdw) && (kMDItemContentTypeTree == 'public.image'cd)"
should return both Live Text and Visual Look Up, while
mdfind "(kMDItemTextContent == 'cattle*'cdw) && (kMDItemContentTypeTree == 'public.image'cd)"
only returns Live Text results.

The term (** == 'cattle*'cdw) has a special meaning because of its wildcard **, and will return any match found in the metadata and contents of files. kMDItemTextContent is similar, but confined to text content, which doesn’t include the names of objects recognised in an image. Apple doesn’t reveal whether there’s an equivalent attribute that does.
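To see the difference between the two predicates on your own Mac, the results of each can be compared. The following is a minimal sketch, not a definitive tool: the function names image_query and objects_only are hypothetical, and because ** matches any metadata, the files it lists are only candidates for Visual Look Up matches, as they could also match on other metadata such as keywords.

```shell
#!/bin/sh
# Build an image-only Spotlight query.
# $1 = attribute (** or kMDItemTextContent), $2 = word to search for.
# Assumes the word contains no quote characters.
image_query() {
    printf "(%s == '%s*'cdw) && (kMDItemContentTypeTree == 'public.image'cd)" "$1" "$2"
}

# List images returned by the ** query but not by kMDItemTextContent.
objects_only() {
    all=$(mktemp) && text=$(mktemp) || return 1
    mdfind "$(image_query '**' "$1")" | sort > "$all"
    mdfind "$(image_query 'kMDItemTextContent' "$1")" | sort > "$text"
    comm -23 "$all" "$text"      # lines unique to the first list
    rm -f "$all" "$text"
}

if command -v mdfind >/dev/null 2>&1; then
    objects_only cattle
else
    # Not on macOS: just show the queries that would be run.
    image_query '**' cattle; echo
    image_query 'kMDItemTextContent' cattle; echo
fi
```

If the list is empty, that may simply mean mediaanalysisd hasn’t yet analysed any matching images.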

Code search

Although apps can call mdfind to perform searches, they normally use NSMetadataQuery with an NSPredicate instead. That isn’t allowed to use a predicate like
** ==[cdw] "cattle"
so can’t search for objects identified using Visual Look Up. When it uses the substitute of
kMDItemTextContent ==[cdw] "cattle"
it also fails to find text obtained using Live Text. So the only way to search for recovered text in a compiled app is to call mdfind.

Timing

Searching for text obtained using Live Text, or object labels obtained using Visual Look Up, depends entirely on those being added to the Spotlight indexes on the volume. Observations of the log demonstrate just how quickly normal indexing by mdworker processes takes place. Here’s an example for a screenshot:
06.148292 com.apple.screencapture Write screenshot to temporary location
06.151242 [0x6000009bc4b0] activating connection: mach=true listener=false peer=false name=com.apple.metadata.mds
06.162302 com.apple.screencapture Moving screenshot to final location
06.169565 user/501/com.apple.mdworker.shared.1E000000-0600-0000-0000-000000000000 internal event: WILL_SPAWN, code = 0
06.198868 com.apple.DiskArbitration.diskarbitrationd mdworker_shared [7266]:118071 -> diskarbitrationd [380]
06.226997 kernel Sandbox apply: mdworker_shared[7266]

In less than 0.1 second an mdworker process has been launched and granted the sandbox it uses to generate metadata and content for Spotlight’s indexes. Unfortunately, that doesn’t include any Live Text or Visual Look Up content, which is generated separately by mediaanalysisd later. It’s hard to estimate how much later, although you shouldn’t expect to be able to find such recovered text for several hours or days, depending on the opportunities mediaanalysisd has to perform background image analysis.
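The interval can be checked from the first and last timestamps in that log excerpt. This small sketch assumes both entries fall within the same minute, so only the seconds need to be subtracted; the variable names are illustrative.

```shell
#!/bin/sh
# Timestamps (seconds) copied from the log excerpt.
start=06.148292   # screenshot written to temporary location
end=06.226997     # sandbox applied to mdworker_shared

# Subtract using awk, as plain sh has no floating-point arithmetic.
elapsed=$(awk -v s="$start" -v e="$end" 'BEGIN { printf "%.6f", e - s }')
echo "$elapsed seconds"
```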

Summary

  • Words recovered from images (not those in PDF files, though) and objects recognised in them can be found by Spotlight search. For best results, choose words rather than letter fragments for search terms.
  • Global Spotlight gives full access to both types.
  • The Finder’s Find gives full access to both only when the search term is entered in the search box as Content contains, otherwise it may exclude objects recognised.
  • mdfind can give full access to both types only when using a wildcard term such as (** == 'cattle*'cdw).
  • NSMetadataQuery currently appears unable to access either type. Call mdfind instead.
  • The delay before either type is added to the volume Spotlight indexes can be hours or days.

How to search successfully in Spotlight: Query languages

By: hoakley
6 June 2025 at 14:30

Although the great majority of users rely on GUI search tools provided in the Global Spotlight menu, Finder Find windows, and in-app Core Spotlight, macOS also provides access using query languages. This article takes a brief tour of those available in macOS. As with previous coverage, this doesn’t include third-party utilities such as HoudahSpot that provide their own interface to Spotlight searching, nor alternative search methods.

Search boxes

Search boxes provided by both Global Spotlight and Local Spotlight can accept a simple form of query language, for example
name:"target"*cdw
which also works with filename:, and performs the equivalent of the Matches operator in a Find window. These use English terms for the attributes to be used, like name and filename, including some of those listed here for Core Spotlight. However, limited information is available and this doesn’t appear to be extensive enough to use at scale. Operators available are also limited within those listed for Core Spotlight.

Modifiers available in current macOS include

  • c for case-insensitivity,
  • d to ignore diacritics such as accents,
  • w to match on word boundaries, as marked by space, underscore _, hyphen - and changes of case used in CamelCase.

The asterisk * can be used as a wildcard to match substrings, and the backslash \ acts as an escape character, for example \" meaning a " literal. In theory, simple predicates can be combined using && as AND, and || as OR.

In practice, getting these to work is tricky, and rarely worth the effort of trying.

Raw queries

One of the least-used attributes available in search bars in the Find window enables the use of what are termed raw queries. Confusingly, these use different names for attributes, such as kMDItemDisplayName instead of name. Otherwise these are more reliable than those used in search boxes. For example, when searching for the string target,
kMDItemDisplayName = "*target*"
is the equivalent of Contains, and
kMDItemDisplayName = "target*"w
appears functionally identical to Matches.

These appear to be an option of last resort, and need documentation.

mdfind

This command tool provides the most complete access to the central Spotlight Query Language, which defies abbreviation to SQL. It also supports a direct form that searches for matching file names only, using
mdfind -name "target"
to find the word target in filenames.

Unfortunately, although Spotlight Query Strings are predicates, they aren’t the same as NSPredicates used elsewhere within macOS. One of the most obvious differences is that Spotlight’s modifiers are appended to the value, not the operator, as they are when using search predicates in the log command, for example.

Query strings take the general form
attribute operator value[modifiers]
where

  • attribute is a kMD… name defined as a metadata attribute key,
  • operator can take a formal version such as ==, or may be abbreviated to just =, which appear to be identical in effect,
  • value can be a string containing wildcards or escapes, or those detailed for special cases such as numbers and dates,
  • modifiers include those given above.

Simple examples are
mdfind "kMDItemDisplayName = '*target*'"
or
mdfind "kMDItemDisplayName == '*target*'"
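The general form can be assembled from its parts before being passed to mdfind. This is only a sketch under stated assumptions: spotlight_query is a hypothetical helper, the value is wrapped in single quotes so the whole query can sit inside a double-quoted shell argument, and values containing a single quote aren’t handled.

```shell
#!/bin/sh
# Compose a query string of the form: attribute operator 'value'modifiers
# $1 = attribute, $2 = operator, $3 = value, $4 = modifiers (optional)
spotlight_query() {
    printf "%s %s '%s'%s" "$1" "$2" "$3" "${4:-}"
}

# Case- and diacritic-insensitive match on the display name.
query=$(spotlight_query kMDItemDisplayName == '*target*' cd)
echo "$query"

# Only run the search where mdfind exists (macOS).
if command -v mdfind >/dev/null 2>&1; then
    mdfind "$query"
fi
```

Because the modifiers follow the closing quote of the value, the helper keeps them as a separate argument rather than folding them into the operator as an NSPredicate would.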

Apple’s current list of common metadata attribute keys is given here. Otherwise, documentation is old if not ancient, and there are obvious differences from current Spotlight and mdfind, such as the expanded list of modifiers.

File metadata queries are explained here, with an apparently duplicated version here. File metadata attributes are documented here, and a general account of predicates is here.

Saved Search Queries

On the face of it, one way to become familiar with and to develop query strings for use in mdfind might be to set them up in a Find window, save that, and use the query string within it for your own. Although this can be helpful, queries in saved search files often use attributes not accessible to mdfind. For example, entering the string target in the search box and setting a search bar to Kind is Image, All is represented by the query
(((** = "target*"cdw)) && (_kMDItemGroupId = 13))
where the second attribute is an internal form of kMDItemKind.

However, this quickly runs into difficulties, as values of _kMDItemGroupId don’t appear to be documented, and substituting that with an alternative such as
kMDItemKind = "public.image"
fails silently.
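One possible workaround for image searches is to replace the internal group term with a content type test before handing the query to mdfind. The sketch below assumes that kMDItemContentTypeTree == 'public.image'cd is an acceptable stand-in for Kind is Image (group 13); it makes no attempt to translate other values of _kMDItemGroupId, and saved_to_mdfind is a hypothetical name.

```shell
#!/bin/sh
# Rewrite a saved-search query for use with mdfind:
#  - double quotes become single quotes, so the query fits inside "..."
#  - the undocumented image group term becomes a content type test
saved_to_mdfind() {
    printf '%s\n' "$1" |
        sed -e "s/\"/'/g" \
            -e "s/_kMDItemGroupId = 13/kMDItemContentTypeTree == 'public.image'cd/"
}

saved='(((** = "target*"cdw)) && (_kMDItemGroupId = 13))'
translated=$(saved_to_mdfind "$saved")
echo "$translated"

# Only run the search where mdfind exists (macOS).
if command -v mdfind >/dev/null 2>&1; then
    mdfind "$translated"
fi
```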

Conclusions

  • Spotlight query strings take several forms, none of them well-documented.
  • Queries provided in Saved Search are of limited use, and are only likely to confuse.
  • For occasional use, they are usually frustrating.
  • For frequent use, third-party alternatives are more consistent and much better documented.
