Spotlight search can be blocked by extended attributes

By: hoakley
16 June 2025 at 14:30

There are several well-known methods for excluding items from Spotlight search. This article details one that, as far as I can tell, has remained undocumented for the last 18 years since it was added in Mac OS X 10.5 Leopard, when Spotlight was only two years old, and can still catch you out by hiding files from local Spotlight search. This was discovered by Matt Godden, who has given his account here.

Well-known exclusions

There have been three general methods of excluding folders from Spotlight’s indexing and search, although only two of them still work reliably:

  • making the folder invisible to the Finder by prefixing a dot ‘.’ to its name;
  • appending the extension .noindex to the folder name (this earlier worked with .no_index instead);
  • putting an empty file named .metadata_never_index inside the folder; that no longer works in recent macOS.

Additionally, System Settings offers Spotlight Privacy settings in two sections. The Search results section won’t normally prevent indexing of those items, but does block them from appearing in search results. Spotlight’s indexing exclusion list is accessed from the Spotlight Privacy… button, where you can add items that you don’t want indexed at all.

Extended attribute

Matt Godden investigated repeated failure of Spotlight search to find some images in his large media library, and discovered that the extended attribute (xattr) named com.apple.metadata:kMDItemSupportFileType was responsible. Images that weren’t returned in a search of that library all had that xattr attached, and when that was removed, those images were found reliably.

According to Apple’s documentation, that xattr was available in Mac OS X 10.5 and has since been deprecated. No further information is given about its function or effect, nor does it appear in an older list of Spotlight metadata attribute keys.

Searching for previous mentions of this xattr reveals that it has been found with either of two values: iPhotoPreservedOriginal, as described for Matt’s images, and MDSystemFile, used with several apps that have proved equally inaccessible to Spotlight search. Images that have this xattr attached appear to have originated in old iPhoto libraries, which may have been migrated to Photos libraries. Searches for files with this xattr suggest that even old collections of images seldom have the xattr present, in my case on only 9 files out of over 800,000 checked, and the MDSystemFile variant wasn’t found in over 100,000 application files.

The mere presence of this xattr is sufficient to exclude a file from Spotlight search: setting its value to the arbitrary word any, for example, was as effective as setting it to either iPhotoPreservedOriginal or MDSystemFile.
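Because presence alone is what matters, a check doesn’t need to read the value at all. The following is a minimal sketch in C, not the method used by xattred or by Spotlight: the function name has_support_file_type_xattr is hypothetical, and it assumes the standard getxattr call from sys/xattr.h, which takes two extra arguments on macOS compared with Linux.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* The xattr name given in the article. */
#define SUPPORT_FILE_TYPE_XATTR "com.apple.metadata:kMDItemSupportFileType"

/*
 * Hypothetical helper: returns 1 if the item at path carries the xattr,
 * whatever its value, and 0 if it doesn't or the item can't be examined.
 * Passing a NULL buffer of size 0 asks only for the size of the value,
 * so the value itself is never read.
 */
int has_support_file_type_xattr(const char *path) {
#ifdef __APPLE__
    /* macOS form: extra position and options arguments. */
    ssize_t size = getxattr(path, SUPPORT_FILE_TYPE_XATTR, NULL, 0, 0,
                            XATTR_NOFOLLOW);
#else
    /* Linux form, included only so the sketch compiles elsewhere. */
    ssize_t size = lgetxattr(path, SUPPORT_FILE_TYPE_XATTR, NULL, 0);
#endif
    return size >= 0 ? 1 : 0;
}
```

Called on an affected image, this should return 1 whether the value is iPhotoPreservedOriginal, MDSystemFile or anything else.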

Strangely, the method used to search is important: files with the com.apple.metadata:kMDItemSupportFileType xattr can’t be found when using Local Spotlight search in a Find window, but can be found by Mints using a standard search predicate with NSMetadataQuery.

Detection and removal

The simplest way to detect whether your Mac has files with the com.apple.metadata:kMDItemSupportFileType xattr is to use the Crawler tool in my free xattred, with Full Disk Access. Open its window using the Open Crawler… command in the Window menu, and paste the xattr name into the Xattr type box. Click on the Scan button and select the volume or folder to check. xattred then crawls the full directory tree within that and reports all files with that xattr.

The xattr can then be removed by dragging the file onto one of xattred’s main windows, selecting the xattr, and clicking on the Cut button. That change will be effective immediately, and the file made available through Spotlight search within a few seconds.

If you have more than a handful of files with the xattr, use xattred’s Stripper to remove them all. Paste the xattr name into the Xattr type box. Click on the Strip button and select the volume or folder to process.
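For those who prefer to see what removal amounts to in code, here is a minimal sketch in C for a single file. It isn’t how xattred’s Stripper is implemented; the function name remove_support_file_type_xattr is hypothetical, and it assumes the standard removexattr call from sys/xattr.h, which takes an extra options argument on macOS.

```c
#include <sys/types.h>
#include <sys/xattr.h>

/* The xattr name given in the article. */
#define SUPPORT_FILE_TYPE_XATTR "com.apple.metadata:kMDItemSupportFileType"

/*
 * Hypothetical helper: removes the xattr from the item at path.
 * Returns 0 if it was removed, or -1 on failure, which includes the
 * case where the xattr wasn't there to begin with; errno gives the reason.
 */
int remove_support_file_type_xattr(const char *path) {
#ifdef __APPLE__
    /* macOS form: the third argument holds options. */
    return removexattr(path, SUPPORT_FILE_TYPE_XATTR, XATTR_NOFOLLOW);
#else
    /* Linux form, included only so the sketch compiles elsewhere. */
    return lremovexattr(path, SUPPORT_FILE_TYPE_XATTR);
#endif
}
```

Only this one xattr is touched; any others on the file are left as they were.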

Recommendations

  • If your Mac is likely to search old images that might have the com.apple.metadata:kMDItemSupportFileType xattr attached, search for and remove all such xattrs to ensure those files aren’t excluded from search.
  • Whether this behaviour is intentional or not, it’s clearly an undesirable legacy, has been deprecated for many years, and should be removed from Spotlight search.

I’m extremely grateful to Matt Godden for his painstaking research and keeping me informed.

A potential memory leak in the Emscripten Fetch API

7 May 2025 at 18:04

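I recently came across a particularly tricky problem that can cause memory leaks in WASM programs compiled with Emscripten. The Emscripten toolchain provides a Fetch module, which lets us call the browser’s fetch interface for network access.

A simple example using the fetch API is: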

#include <stdio.h>
#include <string.h>
#include <emscripten/fetch.h>

void downloadSucceeded(emscripten_fetch_t *fetch) {
    printf("Finished downloading %llu bytes from URL %s.\n", fetch->numBytes, fetch->url);
    // The data is now available at fetch->data[0] through fetch->data[fetch->numBytes-1];
    emscripten_fetch_close(fetch); // Free data associated with the fetch.
}

void downloadFailed(emscripten_fetch_t *fetch) {
    printf("Downloading %s failed, HTTP failure status code: %d.\n", fetch->url, fetch->status);
    emscripten_fetch_close(fetch); // Also free data on failure.
}

int main() {
    emscripten_fetch_attr_t attr;
    emscripten_fetch_attr_init(&attr);
    strcpy(attr.requestMethod, "GET");
    attr.attributes = EMSCRIPTEN_FETCH_LOAD_TO_MEMORY;
    attr.onsuccess = downloadSucceeded;
    attr.onerror = downloadFailed;
    emscripten_fetch(&attr, "myfile.dat");
}

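The Fetch API offers some higher-level features. One of the more important is that it can cache downloaded content in IndexedDB. This caching gets around the size limits of the browser’s own cache (the browser’s automatic caching generally refuses to cache files larger than 50 MB). However, this caching mechanism can lead to a memory leak.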

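1 How the leak arises

In the example at the start, we need to call emscripten_fetch_close in the onerror and onsuccess callbacks to close the request represented by the fetch pointer. During closing, the data buffer used by the fetch is reclaimed. The process is as follows: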

EMSCRIPTEN_RESULT emscripten_fetch_close(emscripten_fetch_t* fetch) {
    if (!fetch) {
        return EMSCRIPTEN_RESULT_SUCCESS; // Closing null pointer is ok, same as with free().
    }

    // This function frees the fetch pointer so that it is invalid to access it anymore.
    // Use a few key fields as an integrity check that we are being passed a good pointer to a valid
    // fetch structure, which has not been yet closed. (double close is an error)
    if (fetch->id == 0 || fetch->readyState > STATE_MAX) {
        return EMSCRIPTEN_RESULT_INVALID_PARAM;
    }

    // This fetch is aborted. Call the error handler if the fetch was still in progress and was
    // canceled in flight.
    if (fetch->readyState != STATE_DONE && fetch->__attributes.onerror) {
        fetch->status = (unsigned short)-1;
        strcpy(fetch->statusText, "aborted with emscripten_fetch_close()");
        fetch->__attributes.onerror(fetch);
    }

    fetch_free(fetch);
    return EMSCRIPTEN_RESULT_SUCCESS;
}

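As you can see, reclamation doesn’t always happen. emscripten_fetch_close checks part of the fetch’s state, and if that check fails it returns the error code EMSCRIPTEN_RESULT_INVALID_PARAM and skips the cleanup that follows (fetch_free). Of the two fields checked, fetch->id is the one we need to look at. fetch->id is the unique identifier of a fetch, used to map the request object on the C++ side to the request object on the JS side. Its value is assigned on the JS side. Looking at Fetch.js in the source: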

function fetchXHR(fetch, onsuccess, onerror, onprogress, onreadystatechange) {
  // ...

  var id = Fetch.xhrs.allocate(xhr);
#if FETCH_DEBUG
  dbg(`fetch: id=${id}`);
#endif
  {{{ makeSetValue('fetch', C_STRUCTS.emscripten_fetch_t.id, 'id', 'u32') }}};

  // ...

}

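This is the only place where id is assigned. The code sits in the fetchXHR function, which means an id is only assigned when an XHR request is actually issued. So what happens if a cached copy exists? In that case fetchXHR isn’t called (fetchLoadCachedData is called instead), so fetch->id stays at 0. When we then call emscripten_fetch_close in a callback to close the request and reclaim its resources, the integrity check fails and the cleanup never runs. That is the memory leak.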
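The effect of that check can be reproduced in a small self-contained model. This is only a sketch of the control flow described above, not Emscripten’s code: every name prefixed with model_ or MODEL_ is hypothetical, the struct keeps only the two fields the check reads, and a counter stands in for the memory that fetch_free would release.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the real result codes and states. */
enum { MODEL_RESULT_SUCCESS = 0, MODEL_RESULT_INVALID_PARAM = -1 };
enum { MODEL_STATE_DONE = 4, MODEL_STATE_MAX = 4 };

/* Reduced model of a fetch: just the fields the integrity check reads. */
typedef struct {
    unsigned int id;   /* 0 models "no id assigned", as on a cache hit */
    int ready_state;
} model_fetch_t;

/* Number of model fetches that have been created but not yet freed. */
int model_live_fetches = 0;

/* Creates a completed fetch; pass id 0 to model a response from cache. */
model_fetch_t *model_fetch_create(unsigned int id) {
    model_fetch_t *fetch = calloc(1, sizeof *fetch);
    if (!fetch) {
        return NULL;
    }
    fetch->id = id;
    fetch->ready_state = MODEL_STATE_DONE;
    model_live_fetches++;
    return fetch;
}

/* Mirrors the order of checks in the close function quoted above. */
int model_fetch_close(model_fetch_t *fetch) {
    if (!fetch) {
        return MODEL_RESULT_SUCCESS;
    }
    if (fetch->id == 0 || fetch->ready_state > MODEL_STATE_MAX) {
        return MODEL_RESULT_INVALID_PARAM;  /* early return: nothing is freed */
    }
    free(fetch);
    model_live_fetches--;
    return MODEL_RESULT_SUCCESS;
}
```

Closing a model fetch created with id 0 returns MODEL_RESULT_INVALID_PARAM and leaves model_live_fetches unchanged, which is the leak in miniature; one created with a non-zero id is freed normally.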

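2 How to fix it

To fix this, we only need to make sure the fetch->id == 0 check no longer triggers: before calling emscripten_fetch_close, we can force fetch->id to a non-zero value. Which value is suitable? If we pick one that matches the id of an existing request, emscripten_fetch_close might close that request instead. So let’s look at how ids are allocated (that is, the implementation of Fetch.xhrs.allocate):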

// libcore.js
allocate(handle) {
  var id = this.freelist.pop() || this.allocated.length;
  this.allocated[id] = handle;
  return id;
}

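As you can see, ids are allocated sequentially, and ids that have been used are recycled through freelist. So we can set a fairly large value: as long as the maximum number of concurrent requests at any one time stays below it, it’s safe. I usually use 0xffff. The correct way to close a request is therefore: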
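To see why the ids stay small, the allocation scheme can be modelled in a few lines of C. This is a sketch of the behaviour described above rather than a port of libcore.js: all names are hypothetical, the freelist has a fixed capacity for simplicity, and it assumes slot 0 is reserved so that the first id handed out is 1, which is consistent with 0 meaning "no id".

```c
#define MODEL_MAX_FREE 64

/* Hypothetical model of the handle allocator's bookkeeping. */
typedef struct {
    unsigned int next_unused;               /* plays the role of allocated.length */
    unsigned int freelist[MODEL_MAX_FREE];  /* released ids, reused last-in first-out */
    unsigned int free_count;
} model_allocator_t;

/* Assumption: slot 0 is reserved, so fresh ids start at 1. */
void model_allocator_init(model_allocator_t *a) {
    a->next_unused = 1;
    a->free_count = 0;
}

/* Reuse the most recently released id if there is one, else take a new one. */
unsigned int model_allocate(model_allocator_t *a) {
    if (a->free_count > 0) {
        return a->freelist[--a->free_count];
    }
    return a->next_unused++;
}

/* Return an id to the freelist; silently dropped if the model's list is full. */
void model_release(model_allocator_t *a, unsigned int id) {
    if (a->free_count < MODEL_MAX_FREE) {
        a->freelist[a->free_count++] = id;
    }
}
```

In this model the largest id ever handed out equals the peak number of requests that were open at the same time, so a sentinel such as 0xffff only collides if that many requests are in flight at once.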

if (fetch->id == 0) {
    fetch->id = 0xffff;
}
emscripten_fetch_close(fetch);

The perils of virtualisation on M4 Macs

By: hoakley
21 April 2025 at 14:30

Until last November, lightweight virtualisation of macOS on Apple silicon Macs had behaved uniformly across M-series families. Although I have heard of one report of problems moving VMs between Macs, those were built with custom kernels. In ordinary experience, VMs running on M1, M2 and M3 chips seemed not to care about the host’s hardware, and most of the time just worked, and updated correctly. There was one unfortunate glitch with shared folders that were lost in macOS 14.2 and 14.2.1, but otherwise VMs largely worked as expected.

Then last November disaster struck those of us who had just started using our new M4 Macs: they couldn’t virtualise any version of macOS before Ventura 13.4. Running a macOS VM for any version before that on an M4 Mac resulted in a black screen, and the VM failed to boot. That was fixed swiftly in macOS 15.2, and we no longer had to keep an older Apple silicon Mac around to be able to run those older versions of macOS in VMs.

Like many who virtualise macOS on Apple silicon, I keep a library of VMs with different versions so I can readily run tests on my apps and other issues. This is one of the great advantages of virtualisation, provided that you don’t rely on being able to run most apps from the App Store. When Apple releases new versions of macOS, once I’ve updated my Mac hosts, I turn to updating VMs. I’m normally cautious when doing this, to avoid trashing the original version. I duplicate the most recent, open it and run Software Update. When I’m happy that has worked correctly, I trash the original and rename the updated VM with its new version number.

That worked fine with Ventura 13.7.4 updating to 13.7.5, and Sonoma going to 14.7.5, but Sequoia 15.3.2 failed with a kernel panic, as I’ve detailed. When several of you kindly pointed out that M1, M2 and M3 Macs had no such problem, I confirmed on my M3 Pro that this is confined to hosts with an M4 family chip.

I have since tried updating my 15.3.2 VM to 15.4.1 on the M4 Pro, a surprisingly large update of over 6 GB, and that continues to result in a kernel panic and failure. I have also tried updating from 15.1 to 15.4.1 with an extraordinarily large download of more than 15 GB, only to see a repeat of the same kernel panic, with an almost identical panic log.

The macOS 15.4 update was particularly large, and some Apple silicon Macs were unable to install it successfully, most commonly on external bootable disks. From your reports, the 15.4.1 update seems to have fixed those problems with real rather than virtualised macOS. However, it hasn’t done anything to solve problems with VMs.

If you have an existing VM running any version of Sequoia prior to 15.4, then you’re unlikely to be successful updating that to 15.4 or later using an M4 host.

In contrast, upgrading a VM currently running Sonoma 14.7.5 completed briskly and without error. To my great surprise, that only requires a download of 8.7 GB, a little over half the size of the update from 15.1 to 15.4.1, which seems to be the wrong way round. The snag with upgrading from a previous major version of macOS to 15.x is that the VM will never be able to use one of the most attractive features of Sequoia, iCloud Drive. If you want support for that, you’ll have to build a fresh VM using a Sequoia IPSW image file.

So for the time being, M4 hosts have a barrier between 15.3.2 and 15.4 that they can’t cross with an update. If you want a VM running 15.4 or later, then you’ll have to build a new one, or update one that’s already running 15.4 or later.

I don’t know and probably wouldn’t understand what changed in the 15.4 update, but it has certainly upset a lot of apple carts and VMs. And if you’d like a little homework, can you please explain:

  • Update 15.1 to 15.4.1, download 15 GB, failure.
  • Upgrade 14.7.5 to 15.4.1, download 8.7 GB, success.
