Last Week on My Mac: A strategy for data integrity

By: hoakley
20 July 2025 at 15:00

File data integrity is one of those topics that never goes away. Maybe that’s because we’ve all suffered in the past, and can’t face a repeat of that awful feeling when an important document can’t be opened because it’s damaged, or crucial data have gone missing. Before considering what we could do to prevent that from happening, we must be clear about how it could occur.

We have an important file, and immediately after it was last changed and saved, a SHA256 digest was made of it and saved to that file as an extended attribute, as you can using Dintch, Fintch or cintch. A few days or weeks later we open the file and discover its contents have changed.
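As a sketch of what that tagging involves, here’s how such a digest can be computed; `file_digest` and the chunk size are illustrative, not Dintch’s actual implementation, which goes on to store the result in an extended attribute.

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Compute the SHA256 digest of a file's data, reading in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()
```

The resulting 64-character hex string is what any later check is compared against.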

Reasons

What could account for that?

One obvious reason is that the file was intentionally changed and saved without updating its digest. Provided there are good backups, we should be able to step back through them to identify when that change occurred, and decide whether it’s plausible that it was performed intentionally. Although the file’s Modified datestamp should coincide with the change seen in its backups, there’s no way of confirming that change was intentional, or even which app was used to write the changed file (with some exceptions, such as PDF).

Exactly the same observations would also be consistent with the file being unintentionally changed, perhaps as a result of a bug in another app or process that resulted in it writing to the wrong file or storage location. The changed digest can only detect the change in file content, and can’t indicate what was responsible. This is a problem common to file systems that automatically update their own records of file digests, as they are unable to tell whether the change is intentional, simply that there has been a change. This also applies to changes resulting from malicious activity.

The one circumstance in which a change in contents, hence in digest, wouldn’t necessarily be accompanied by a change in the file’s Modified datestamp is when an error occurs in the storage medium. However, with modern storage media this is also the least likely to be encountered without the error being reported.

Errors occurring during transfer to and from storage are detected by CRC or similar checks made as part of the transfer protocol. This is one of the reasons why a transfer bandwidth of 40 Gb/s cannot realise a data transfer rate of 5 GB/s, because part of that bandwidth is used by the error-checking overhead. Once written to a hard disk or SSD, error-correcting codes are used to verify the integrity of the data and to detect bad storage blocks.
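The principle can be illustrated with a toy CRC check (real transfer protocols compute these in hardware over framed packets; this just shows that a single flipped bit is enough to be caught):

```python
import zlib

payload = b"a frame of data crossing the bus"
checksum = zlib.crc32(payload)        # sender computes and appends this

corrupted = bytearray(payload)
corrupted[0] ^= 0x01                  # a single bit flipped in transit

# The receiver recomputes the CRC, finds a mismatch, and the frame
# is rejected and retransmitted rather than silently stored.
assert zlib.crc32(bytes(corrupted)) != checksum
```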

Out of interest, I’ve been conducting a long-term experiment with 97 image files totalling 60.8 MB stored in my iCloud Drive since 11 April 2020, over five years ago. At least once a year I download them all and check them using Dintch, and so far I haven’t had a single error.

Datestamps

There are dangers inherent in putting trust in file datestamps as markers of change.

In APFS, each file has four different datestamps stored in its attributes:

  • create_time, time of creation of that file,
  • mod_time, time that file was last modified,
  • change_time, time that the file’s attributes including extended attributes were last modified,
  • access_time, time that file was last read.
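Three of these have POSIX equivalents that can be read from any language; this sketch shows the approximate mapping (st_birthtime, corresponding to create_time, is additionally available on macOS):

```python
import os
import tempfile
import time

with tempfile.NamedTemporaryFile("w", delete=False) as f:
    f.write("contents")
    path = f.name

st = os.stat(path)
print("mod_time:   ", time.ctime(st.st_mtime))  # last change to file data
print("change_time:", time.ctime(st.st_ctime))  # last change to attributes
print("access_time:", time.ctime(st.st_atime))  # last read
os.unlink(path)
```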

For example, a file with the following datestamps

  • create_time 2025-04-18 19:58:48.707+0100
  • mod_time 2025-04-18 20:00:56.134+0100
  • change_time 2025-07-19 06:59:10.542+0100
  • access_time 2025-07-19 06:52:17.504+0100

was created on 18 April this year, last modified a couple of minutes later, last had its attributes changed on 19 July, but was last read 7 minutes before that modification to its attributes.

These can be read using Precize, or in Terminal, but there’s a catch with access_time. APFS has an optional feature, set by volume, determining whether access_time is changed strictly. If that option is set, then every time a file is accessed, whether it’s modified or not, its access_time is updated. However, this defaults to only updating access_time if its current value is earlier than mod_time. I’m not aware of any current method to determine whether the strict access_time is enabled for any APFS volume, and it isn’t shown in Disk Utility.

mod_time can be changed when there has been no change in the file’s data, for example using the Terminal command touch. Any of the times can be altered directly, although that should be very unusual even in malware.
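The effect of touch is easy to reproduce: the datestamps change while the data, and hence the digest, don’t (a minimal sketch using Python’s stdlib):

```python
import hashlib
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"unchanged contents")
    path = f.name

digest_before = hashlib.sha256(open(path, "rb").read()).hexdigest()

os.utime(path, (0, 0))  # like touch: rewrite access_time and mod_time only

digest_after = hashlib.sha256(open(path, "rb").read()).hexdigest()
assert digest_after == digest_before      # the data are untouched
assert os.stat(path).st_mtime == 0        # but mod_time now claims 1970
os.unlink(path)
```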

Although attaching a digest to a file as an extended attribute will update its change_time, there are many other reasons for that being changed, including macOS adding or changing quarantine xattrs, the file’s ‘last used date’, and others.

Proposed strategy

  1. Tag folders and files whose data integrity you wish to manage.
  2. Back them up using a method that preserves those tags, such as Time Machine, or by copying them to iCloud Drive.
  3. Periodically Check their tags to verify their integrity.
  4. As soon as possible after any have been intentionally modified and saved, Retag them to ensure their tags are maintained.
  5. In the event that any are found to have changed, and no longer match their tag, trace that change back in their backups.

Unlike automatic integrity-checking built into a file system, this will detect all unexpected changes, regardless of whether they are made through the file system, are intentional or unintentional, are malicious, or result from errors in storage media or transmission. Because only intentionally changed files are retagged, this also minimises the size of backups.
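The tag-check-retag cycle can be sketched in a few lines; here a plain dict stands in for the extended attribute in which the real tags travel with their files:

```python
import hashlib

tags = {}  # stand-in for the per-file extended attribute

def tag(name, data):
    """Steps 1 and 4: record or refresh the file's digest."""
    tags[name] = hashlib.sha256(data).hexdigest()

def check(name, data):
    """Step 3: does the current content still match its tag?"""
    return tags.get(name) == hashlib.sha256(data).hexdigest()

tag("report.pdf", b"original contents")
assert check("report.pdf", b"original contents")      # integrity verified
assert not check("report.pdf", b"silent corruption")  # step 5: trace in backups
tag("report.pdf", b"intentional new version")         # retag after editing
```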

Save your M-series Mac’s energy and battery

By: hoakley
18 July 2025 at 14:30

For the last nine years, Apple silicon chips in iPhones and iPads, and most recently Macs since 2020, have had Efficiency cores, designed to eke out their use of power to extend battery time and stay cool. Although you have no control over what runs on the Efficiency cores in a device, there are options for Macs. This article explains how you can reduce energy use in an Apple silicon Mac, and why that’s a good idea.

M-series chips can have between 2 and 8 E cores, each with slightly less than half the processing power of their Performance core equivalent. If you’re unsure how many CPU cores of each type your Mac has, and how they’re used, open the CPU History window in Activity Monitor and watch.

Particularly in the first few minutes after starting a Mac up, the E cores will be busy catching up with their housework, routine tasks such as updating Spotlight’s indexes, checking for various updates, and making an initial backup. Once those are out of the way, you’ll see other bursts of activity, such as XProtect Remediator scans, and a steady trickle of background tasks. Because most user processes are run on the P cores, even hectic E core activity has no effect on what you’re doing.

When threads run on the E cores, they run more slowly and take longer to complete, yet use substantially less energy and power. That’s because those cores are designed to be more energy-efficient and are run at lower frequencies, or clock speeds. For almost any given task, running its threads on E cores will thus use less total energy, and

  • run a laptop Mac’s battery down less,
  • generate less heat, so keep the Mac cooler,
  • leave the P cores free for more pressing work.

In-app controls

Although the code run by apps can’t be directly allocated to P or E cores, macOS can get a strong hint from a setting known as the Quality of Service (QoS). The whole of a user app will normally be run at high QoS, and macOS will try to do that on P cores, so long as they’re available.

Some apps are starting to give the user control over this in a ‘speed control’. Among those is the compression utility Keka.

In Keka’s settings, you can give its tasks a maximum number of threads, and even run them at a custom Quality of Service (QoS) if you want them to run in the background on E cores without interrupting your work on the P cores. You’re unlikely to do this when compressing or decompressing a few small files, but when a task is likely to take several minutes and you can afford to wait a little longer, run it on the E cores.

Two of my apps, Dintch and Cormorant, have even simpler controls.

Dintch has a three-position slider, offering

  • a red racing car 🏎 for top priority on P cores when possible;
  • a blue truck 🚙 for medium priority on P cores when possible;
  • a green tortoise 🐢 for the E cores.

The first two of those took 6.2 seconds to check a 16.8 GB file, and the third took almost 25 seconds. The difference between the first two is in their priority, if there are several threads competing for the same CPU cores. If you’re waiting for files to be checked in Dintch, set the speed control at the racing car, but if you can leave the Mac to get on with checking in the background and want the efficiency of E cores, set the control to the tortoise instead.
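From those timings, the implied throughput of each setting is straightforward arithmetic (size and times as measured above):

```python
size_gb = 16.8  # size of the file checked, in GB

print(f"P cores: {size_gb / 6.2:.1f} GB/s")   # racing car or truck
print(f"E cores: {size_gb / 25.0:.2f} GB/s")  # tortoise
```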

Cormorant, a much simpler compression utility aimed at AirDrop transfers, has a fourth slider position with a pointing hand 👉 that allows you to set a custom number of threads and QoS level; I also use that to test and compare P and E cores.

Activity Monitor

You can observe the effects of these controls in Activity Monitor’s CPU History window.

Here’s the window after Dintch has checked that 16.8 GB file with the setting for the racing car. All the P cores show four bursts of activity as they run the code to compute the SHA256 digest.

When set to the tortoise, there’s almost no activity on the P cores, but the four E cores are busy computing for nearly 25 seconds.

One word of caution about over-interpreting what you see in the CPU History window. Both P and E cores can run at a wide range of frequencies, but Activity Monitor takes no account of that. Taken at face value, you might think these E cores were working far harder than the P cores, but in fact they were only running at low frequency, little more than idle. In contrast, those small bursts of P core activity were at higher frequency, so were using far greater energy and power, albeit more briefly.

Other apps

I’m sure other apps offer the user control over whether their longer tasks are run on the E cores, and I’ll be pleased to learn of them. There are also two ways that you can control this yourself, using St. Clair Software’s excellent utility App Tamer, or the command tool taskpolicy.

App Tamer works best with apps, and makes it simple to demote their threads so they’re run on E cores in the background. If you want to demote the threads of a running process from Terminal, use a command like
taskpolicy -b -p 567
to confine all threads of the process with PID 567 to the E cluster. You can then reverse that using the -B option for threads with higher QoS, but can’t alter those set to a low QoS by the process. The ground rule here is that high QoS threads can be demoted to the E cores, but low QoS threads can’t be promoted to run on the P cores.

What to avoid

Running a virtual machine on an Apple silicon Mac always uses P cores first, even for threads that are only running in the background inside the VM. So the worst thing you can do in terms of energy efficiency and core use is to run code inside a VM.

Summary

  • Run task threads on E cores for better battery endurance, lower heat production, and to keep other apps more responsive.
  • Where an app provides a control, use it to run long tasks in the background, on the E cores.
  • For apps that don’t, use App Tamer. For processes use the taskpolicy command tool.
  • Avoid running a VM if you want high efficiency.

Updates for file integrity (Dintch/Fintch), compression (Cormorant) and LogUI build 70

By: hoakley
14 July 2025 at 14:30

This is the last batch of ‘simple’ updates to my free apps to bring them up to the expectations of macOS 26 Tahoe. With them comes a minor update to my log browser LogUI, which is recommended for all using Tahoe, as it fixes an annoying if fundamentally cosmetic bug.

Preparing these updates for release was a little troublesome, as I attempted this using developer beta 3 of Tahoe and Xcode 26 beta 3. Little did I realise when I got all four rebuilt, tested and notarized, that this combination had stripped their shiny new Tahoe-compliant app icons. That made these new versions unusable in Sequoia and earlier, as they each displayed there with the generic app icon, despite working fine in Tahoe.

Eventually I discovered that I could build fully functional versions using Xcode 26 beta 2 in Sequoia 15.5, so that’s how they have been produced.

File integrity

Five years ago I built a suite of two apps and a command tool to enable checking the integrity of file data wherever it might be stored. This uses SHA256 hashes stored with each checked file as an extended attribute. At that time, the only alternative saved hashes to a file in the enclosing folder, which I considered suboptimal, as it required additional maintenance whenever files were moved or copied to another location. It made more sense to ensure that the hash travels with the file whose data integrity it verifies.

The three are Fintch, intended for use with single files and small collections, Dintch, for larger directories or whole volumes, and cintch, a command tool ideal for calling from your own scripts. As the latter has no interface beyond its options, it continues to work fine in macOS 26.

Since then other products have recognised the benefits of saving hashes as extended attributes, although some may now use SHA512 rather than SHA256 hashes. What may not be apparent is the disadvantage of that choice.

Checking the integrity of thousands of files and hundreds of GB of data is computationally intensive and takes a lot of time, even on fast M4 chips. It’s therefore essential to make that as efficient as possible. Although checksums would be much quicker than SHA256 hashes, they aren’t reliable enough to detect some changes in data. SHA algorithms have the valuable property of amplifying even small differences in data: changing a single bit in a 10 GB file results in a huge change in its SHA256 hash.

At the same time, the chances of ‘collisions’, in which two different files share the same hash, are extremely low. For SHA256, the probability that two arbitrary byte sequences share the same hash is one in 2^256, roughly one in 1.2 x 10^77. Using SHA512 changes that to one in 2^512, which is even more remote.
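Both properties are easy to demonstrate (a small message stands in for a 10 GB file; the avalanche behaviour is the same):

```python
import hashlib

data = bytearray(b"stand-in for a much larger file")
h1 = hashlib.sha256(bytes(data)).hexdigest()

data[0] ^= 0x01  # flip a single bit
h2 = hashlib.sha256(bytes(data)).hexdigest()

# Avalanche: well over half of the 64 hex digits change for a 1-bit edit.
differing = sum(a != b for a, b in zip(h1, h2))
assert differing > 32

# Collision odds: 2^256 is indeed roughly 1.2 x 10^77.
assert 1.1e77 < 2**256 < 1.2e77
```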

However, there ain’t no such thing as a free lunch, as going from SHA256 to SHA512 brings a substantial increase in the computational burden. When run on a Mac mini M4 Pro, using its internal SSD, SHA256 hashes are computed from files on disk at a speed of around 3 GB/s, but that falls to 1.8 GB/s when using SHA512 hashes instead.
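The difference in digest size is easy to see in any language with a hashing library; the speed figures above are machine measurements, so only the lengths are shown here:

```python
import hashlib

data = b"sample data"
d256 = hashlib.sha256(data).digest()
d512 = hashlib.sha512(data).digest()

assert len(d256) == 32  # 256 bits
assert len(d512) == 64  # 512 bits, twice the digest length
```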

Dintch provides two controls to optimise its performance: you can tune the size of its buffer to cope best with the combination of CPU and storage, and you can set it to run at one of three different QoS values. At its highest QoS, it will run preferentially on Apple silicon P cores for a best speed of 3 GB/s, while run at its lowest QoS it will be confined to the E cores for best energy economy, and a speed of around 0.6 GB/s for running long jobs unobtrusively in the background.

The two apps and cintch are mutually compatible, and compatible with their earlier versions going back to macOS El Capitan. In more recent versions of macOS they use Apple’s CryptoKit for optimum performance.

Dintch version 1.8 is now available from here: dintch18
Fintch version 1.4 is now available from here: fintch14
and from their Product Page, from where you can also download cintch. Although they do use the auto-update mechanism, I fear that changes in WordPress locations may not allow this to work with earlier versions.

Compression/decompression

Although I recommend Keka as a general compression and decompression utility, I also have a simple little app that I use with folders and files I transfer using FileVault. This only uses AppleArchive LZFSE, and strips any quarantine extended attributes when decompressing. It’s one of my testbeds for examining core allocation in Apple silicon Macs, so has extensive controls over QoS and performance, and offers manual settings as well as three presets.

Cormorant version 1.6 is now available from here: cormorant16
and from its Product Page. Although it does use the auto-update mechanism, I fear that changes in WordPress locations may not allow this to work with version 1.5 and earlier.

LogUI

Those using this new lightweight log browser in Tahoe will have discovered that, despite SwiftUI automatically laying out its controls, their changed sizes in Tahoe make a mess of the seconds setting for times. This new version corrects that, and should be easier to use.

LogUI version 1 build 70 is now available from here: logui170

There will now be a pause in updates for macOS Tahoe until Apple has restored backward compatibility of app icons, hopefully in the next beta releases.
