Amazon’s Big Spending Reignites an A.I. Stock Rally

Investors cheered the tech giant’s latest results showing that its huge investments in artificial intelligence are beginning to show returns.

© Brendan Mcdermid/Reuters

Andy Jassy, the C.E.O. of Amazon, sees big growth from the company’s cloud business.

Last Week on My Mac: Why Spotlight can’t find some files

By: hoakley
26 October 2025 at 16:00

For the last seven years or so there have been many folk complaining that Spotlight local search hasn’t been finding the files they know are there. Many have resorted to repeatedly rebuilding its indexes, usually without success. Last week, thanks to Jürgen, Drew, aldous and others who have contributed, we discovered one cause: a bug that appears to have been introduced in macOS Mojave, is still present in Tahoe 26.0.1, and prevents Spotlight from indexing any of the contents of plain text files that start with certain characters.

Jürgen stumbled across the first example, with files starting with the two capital letters LG. At the time, that seemed extremely unusual and unlikely to affect many files. Then Drew added HPA and Draw to the list of forbidden characters. What looked like a rare event was becoming increasingly commonplace, and that list can only grow. How many indexing failures it could account for is impossible to guess.

Piecing together the evidence, it looks like this bug is inside the standard macOS RichText.mdimporter, now embedded in the Signed System Volume in /System/Library/Spotlight and at version 6.9 (350), as it has been since Sonoma (Ventura 13.7.8 has build 345.60.106, although that also suffers this bug). What happens is that saving a text file starting with forbidden characters correctly triggers Spotlight’s indexing service. That identifies the file as having the UTI public.plain-text and hands it over for its contents to be indexed. But the indexer inspects those first few characters, decides it’s a different type of file altogether, and promptly returns an error 4864 for an NSCoderReadCorruptError without going any further.

Apart from the text content not being added to Spotlight’s indexes, and a few lines buried in the Unified log, there are no indications of anything going wrong. If you test the importer using
mdimport -t -d3 filename
the file appears to import correctly, but that command doesn’t give any insight into the import of its contents, only standard attributes such as the filename that are indexed separately.

It was Drew who first suggested a plausible reason for this failure, confirmed by aldous: prior to attempting to index the text contents, Spotlight’s service was using a completely different method to check the type of the contents, the ‘magic’ database used by the file(1) command.

file(1) is an old Unix utility dating back to 1973 or earlier, operating independently of UTIs that were adopted in Mac OS X 10.4 Tiger 20 years ago. Rather than relying on a type assigned to a file, it ‘sniffs’ the contents, particularly the first few bytes of data, and uses a sprawling set of ad hoc rules to guess the file type. It turns out that files starting with the characters Draw were characteristic of a binary vector graphics format used by the !Draw app for RISC OS 2 in 1989. Rather than believing the file’s UTI for one of the most common types of files in macOS, Spotlight’s indexer therefore decided that it was trying to import file data that must now be as rare as hens’ teeth, and wouldn’t go any further.

If you’re sceptical about this coincidence, open the acorn magic data in /usr/share/file/magic in a text editor, and you’ll see the file opening string of Draw identified as RISC OS Draw file data. There are 332 other magic data files containing similar rules for identifying file types. I leave it as an exercise to the Unix wizard to build a list of all those that could cause similar problems with Spotlight indexing.
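To see this sniffing for yourself, a minimal sketch along the following lines writes a scratch file opening with a chosen string, then asks file(1) what it makes of the contents. The helper name sniff_prefix and the control string Hello are my own illustrations, and the exact descriptions printed will depend on the version of the magic database installed.

```shell
#!/bin/sh
# Write a scratch file that opens with the given string, then report the
# type that file(1) guesses from its contents alone.
# sniff_prefix is a hypothetical helper name, not part of macOS.
sniff_prefix() {
    scratch=$(mktemp) || return 1
    printf '%s syzygy\n' "$1" > "$scratch"
    # -b (brief) leaves the filename out of the output
    file -b "$scratch"
    rm -f "$scratch"
}

# Compare the three reported strings with an ordinary control string
for prefix in LG HPA Draw Hello; do
    printf '%-6s -> %s\n' "$prefix" "$(sniff_prefix "$prefix")"
done
```

Any line that reports something other than plain text marks an opening string that file(1) treats as a different type. Whether Spotlight's importer applies exactly the same rules is an assumption.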

When this bug hunt started and it affected just LG and HPA, it was fairly esoteric and faintly amusing, at least as long as you didn’t write about your LG TV, high pressure air or Horizontal Pod Autoscaling. When Draw was added, and all those 333 magic files piled in, I realised how extensive this could be, and how little testing can be performed on Spotlight indexing and search.
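If you want a rough idea of how many of your own files could be affected, the sketch below lists files whose contents open with one of the three strings reported so far. It assumes that plain text files carry the .txt extension, and it knows nothing about the other magic rules, so an empty result doesn't prove that a folder is safe. The function names are hypothetical.

```shell
#!/bin/sh
# List .txt files under a folder whose contents open with LG, HPA or Draw,
# the only three strings reported so far. Other magic rules may match too.
# starts_with_known_prefix and scan_folder are hypothetical helper names.
starts_with_known_prefix() {
    # Four bytes are enough to cover the longest of the three strings
    case "$(head -c 4 "$1" 2>/dev/null)" in
        LG*|HPA*|Draw*) return 0 ;;
        *) return 1 ;;
    esac
}

scan_folder() {
    find "$1" -type f -name '*.txt' | while IFS= read -r path; do
        if starts_with_known_prefix "$path"; then
            printf '%s\n' "$path"
        fi
    done
}

# Scan the folder given as the first argument, or the current folder
scan_folder "${1:-.}"
```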

Given that about eight years ago an Apple engineer wrote code for the RichText.mdimporter in macOS that introduced testing against some or all of the magic database, wouldn’t you have thought they’d test and debug that against test cases, such as text files starting with characters (mis)recognised by magic rules? And maybe occasionally over subsequent years and new versions of macOS, wouldn’t revised versions of the importer be tested again?

Apple likes Spotlight to be opaque to the user, for it to ‘just work’. There’s almost no documentation even for developers, and tools provided are strictly limited in what they can do, as demonstrated here in the case of mdimport. That’s all very well until Spotlight doesn’t work and no one outside Apple can do anything about it. Third parties can’t even write custom mdimporters to do the job properly, as those bundled in macOS take priority.

If this was the first time that Spotlight indexing had let us down, I might feel more charitable. But between macOS Catalina 10.15.6 in July 2020 and Big Sur 11.3 in April 2021 macOS was incapable of indexing the content of any Rich Text files. There are still many documents that haven’t been indexed as a result. Those whose contents haven’t been indexed as a result of this bug will similarly be excluded from search until they too are reindexed by a fixed mdimporter. For Intel Macs that won’t be supported by macOS 27, that could well be forever.

A Spotlight bug affecting all recent macOS: the LG error

By: hoakley
23 October 2025 at 14:30

There’s a bug in Spotlight that can prevent it from indexing any of the contents of susceptible text files. This has been present since macOS 13 Ventura if not before, and is still present in Tahoe 26.0.1. I didn’t discover this myself, though: it was reported to me by Jürgen, to whom full credit is due. It’s also one of the strangest bugs I’ve come across, and all depends on two letters.

Demonstration

To demonstrate this bug, all you need is a single UTF-8 plain text file, created by TextEdit or any other app capable of saving plain text. Start the text with the two characters L and G, both in capitals. Then add one or more distinctive words, such as
LG syzygy

Save that file to a folder that you know is indexed and searched by Spotlight, then a few seconds later try searching for the word syzygy in its contents. Extend this as much as you want, maybe appending the whole of one of Charles Dickens’ novels, but no matter how you search for its contents, that file will never be found. If you want to get more serious, use that text file in my Spotlight test app SpotTest, and it will also be unable to find that file.

This only works with plain text files, not Rich Text, PDF or HTML. It’s also sensitive to those two letters. Set one of them in lowercase, preface them with a space, or substitute a different letter, and the contents of that file will then be indexed correctly and searchable as normal.
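The test file and its controls can also be created from Terminal. This is only a sketch: the function names, filenames and the extra rare words are my own choices, and the folder you pass must be one that Spotlight indexes.

```shell
#!/bin/sh
# Create the failing test file and three controls in a folder. Each file
# gets its own rare word so that search hits can be told apart.
# make_test_files and search_test_files are hypothetical helper names.
make_test_files() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf 'LG syzygy\n'    > "$dir/fail-LG.txt"         # expected not to be found
    printf 'Lg zyzzyva\n'   > "$dir/control-case.txt"    # one letter in lowercase
    printf ' LG quokka\n'   > "$dir/control-space.txt"   # prefaced with a space
    printf 'LH xylophage\n' > "$dir/control-letter.txt"  # a different letter
}

# Count the files found for each word; mdfind is only available in macOS.
search_test_files() {
    for word in syzygy zyzzyva quokka xylophage; do
        printf '%s: ' "$word"
        mdfind -onlyin "$1" "$word" | wc -l
    done
}

# Usage, with a folder of your choice:
#   make_test_files "$HOME/Documents/SpotlightTest"
#   sleep 10
#   search_test_files "$HOME/Documents/SpotlightTest"
```

If the bug is present, I'd expect a count of zero for syzygy and one for each of the other three words, once indexing has had a few seconds to catch up.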

Affected macOS

I have tested this in virtual machines going back as far as macOS 13 Ventura, and it’s present in them all. If you have access to an earlier version of macOS, I’d be interested to know whether it affects that as well.

Cause

The two UTF-8 characters concerned, 4c 47, don’t appear to be anything special that could be misinterpreted.
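To confirm that a test file really does open with those two bytes, and that nothing such as a byte order mark precedes them, you can dump its first bytes as hex. The helper name first_bytes_hex is hypothetical.

```shell
#!/bin/sh
# Print the first N bytes of a file as lowercase hex with no separators.
# first_bytes_hex is a hypothetical helper name.
first_bytes_hex() {
    head -c "$2" "$1" | od -An -tx1 | tr -d ' \t\n'
    echo
}

scratch=$(mktemp)
printf 'LG syzygy\n' > "$scratch"
first_bytes_hex "$scratch" 2    # prints 4c47
rm -f "$scratch"
```

A file saved with a UTF-8 byte order mark would instead start with ef bb bf.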

Although it’s not easy to distinguish failure to index from search errors, saving a test file does result in repeated reports of an error that could cause Spotlight to fail when trying to index the file, for example the log entries
30.946740 mdwrite Decoding error: Error Domain=NSCocoaErrorDomain Code=4864 UserInfo={NSDebugDescription=[private]} for [private]
30.951004 mds Decoding error: Error Domain=NSCocoaErrorDomain Code=4864 UserInfo={NSDebugDescription=[private]} for [private]

Error code 4864 is NSCoderReadCorruptError, implying that the presence of those two characters at the start of a text file may be triggering a bug in RichText.mdimporter, the importer module shipped in macOS that’s responsible for indexing plain text files.
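If you'd like to look for those entries yourself, one way is to filter the Unified log in Terminal soon after saving a test file. This sketch assumes that the messages are written at a level that log show displays by default, and that matching on the text Decoding error from the entries quoted above is sufficient. The function names are hypothetical.

```shell
#!/bin/sh
# Build a predicate matching the decoding errors from the two processes
# named in the log entries quoted above.
# spotlight_error_predicate and show_spotlight_errors are hypothetical names.
spotlight_error_predicate() {
    printf '%s' '(process == "mds" OR process == "mdwrite") AND eventMessage CONTAINS "Decoding error"'
}

# Show matching entries from a recent period, five minutes by default.
# The log command is only available in macOS.
show_spotlight_errors() {
    log show --last "${1:-5m}" --predicate "$(spotlight_error_predicate)"
}

# Usage, shortly after saving a test file:
#   show_spotlight_errors 2m
```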

My current hypothesis is therefore that text files starting with the characters LG are failing to have their contents indexed correctly because of a bug in RichText.mdimporter.

History

This isn’t the first bug in the RichText.mdimporter. In macOS Catalina 10.15.6, the same mdimporter (then build 319.60.100) introduced a bug that broke indexing of Rich Text (RTF) files. That was perpetuated through early releases of Big Sur until it was finally fixed in RichText.mdimporter build 326.11 in Big Sur 11.3.

Because text files starting with the characters LG are exceedingly unusual, this bug appears to have been left in RichText.mdimporter for a great deal longer.

I will be reporting this to Apple in Feedback later this month. Please feel free to file your own Feedback if you can spare the time.

Summary

  • In macOS 13 to 26, plain text files starting with the characters LG cannot be searched for their contents.
  • This appears to be the result of a longstanding bug in RichText.mdimporter in macOS.
  • If those characters are altered, or prefixed by a space, indexing and search behave normally.

I’m very grateful to Jürgen for drawing this to my attention.
