If you’ve read any of my articles here about the inner workings of CPU cores in Apple silicon chips, you’ll know I’m no stranger to using the command tool powermetrics to discover what they’re up to. Last week I attempted something more adventurous, trying to estimate how much power and energy are used in a single Visual Look Up (VLU).
My previous tests have been far simpler: start powermetrics collecting sample periods using Terminal, then run a set number of core-intensive threads in my app AsmAttic, knowing those would complete before sampling stopped. Analysing dozens of sets of measurements of core active residency, frequency and power use is pedestrian, but there’s no doubt as to when the tests were running, nor which cores they were using.
VLU was more intricate: once powermetrics had started sampling, I had to double-click an image to open it in Preview, wait until its Info tool showed stars to indicate that stage was complete, open the Info window, spot the buttons that appeared on recognised objects, then select one and click on it to open the Look Up window. All steps had to be completed within the 10 seconds of sample collection, leaving me with the task of matching nearly 11,000 log entries for that interval against powermetrics’ hundred sampling periods.
The first problem is syncing time between the log, which timestamps each entry to the microsecond, and the sampling periods. Although the latter are supposed to be 100 ms in duration, in practice powermetrics runs slower than that, and most periods ranged between about 116 and 129 ms. As the start time of each period is only given to the nearest second, it’s impossible to know exactly when each sample was obtained.
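The drift this causes accumulates over the run, and can be sketched in Python; the fixed 120 ms duration below is a hypothetical stand-in for the observed 116-129 ms range, not measured data:

```python
from itertools import accumulate

# Hypothetical per-sample durations (ms), standing in for the observed range
durations_ms = [120] * 100

# Start of each sampling period relative to time zero, by cumulative sum
starts_ms = [0] + list(accumulate(durations_ms))[:-1]

# Assuming exactly 100 ms per sample, sample 50 would start at 5.0 s;
# with ~120 ms periods it actually starts a full second later
assumed_s = 50 * 100 / 1000
actual_s = starts_ms[50] / 1000
```

By mid-run the assumed and actual sample positions have diverged by a whole second, which is why log entries can’t simply be binned by nominal sample number.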
Correlating log entries with events apparent in the time-course of power use is also tricky. Some are obvious: the start of sampling was perhaps the easiest giveaway, as powermetrics has to be run using sudo to obtain elevated privileges, which leaves unmistakeable evidence in the log. Clicks made on Preview’s tools are readily missed, though, even when you have a good estimate of the time they occurred.
Thus, the sequence of events is known with confidence, and it’s not hard to establish when VLU was occurring. As a result, estimating overall power and energy use for the whole VLU also has good confidence, although establishing finer detail is more challenging.
The final caution applies to all power measurements made using powermetrics: they are approximate and uncalibrated. What may be reported as 40 mW could be more like 10 or 100 mW.
In the midst of this abundance of caution, one fact stands clear: VLU hardly stresses any part of an Apple silicon chip. Power used during the peak of CPU core, GPU and neural engine (ANE) activity was a small fraction of the values measured during my previous core-intensive testing. At no time did the ten P cores in my M4 Pro come close to the power used when running more than one thread of intensive floating-point arithmetic, and the GPU and ANE spent much of the time twiddling their thumbs.
Yet when Apple released VLU in macOS Monterey, it hadn’t expected to be able to implement it at all on Intel chips because of its computational demand. What still looks like magic can now be accomplished with ease even on a base M1 model. And when we care to leave our Macs running, mediaanalysisd will plod steadily through recently saved images, performing object recognition and classification to add them to Spotlight’s indexes, enabling us to search images by labels describing their contents. Further digging in Apple’s documentation reveals that VLU and indexing of discovered object types is currently limited by language to English, French, German, Italian, Spanish and Japanese.
Some time in the next week or three, when Apple releases macOS Tahoe, we’ll start seeing Apple silicon Macs stretch their wings with the first apps to use its Foundation Models. These are based on the same Large Language Models (LLMs) already used in Writing Tools, and run entirely on-device, unlike ChatGPT. This has unfortunately been eclipsed by Tahoe’s controversial redesign, but as more developers get to grips with these new AI capabilities, you should start to see increasingly novel features appearing.
What developers will do with them is currently less certain. These LLMs are capable of working with text including dialogue, so they are likely to appear early in games, and should provide specialist variants of more generic Writing Tools. They can also return numbers rather than text, and suggest and execute commands and actions that could be used in predictive automation. Unlike previous support for AI techniques such as neural networks, Foundation Models present a simple, high-level interface that can require just a few lines of code.
If you’ve got an Apple silicon Mac, there’s a lot of potential coming in Tahoe, once you’ve jiggled its settings to accommodate its new style.
Look in the log, and Visual Look Up (VLU) on an Apple silicon Mac apparently involves a great deal of work, in both CPU cores and the neural engine (ANE). This article reports a first attempt to estimate power and energy use for a single VLU on an image.
To estimate this, I measured CPU, GPU and ANE power in sampling periods of 100 ms using powermetrics, and correlated events seen there with those recorded in a log extract over the same period, obtained using LogUI. The test was performed on a Mac mini M4 Pro running macOS 15.6.1, using Preview to perform the VLU on a single image showing a small group of cattle in an upland field. Power measurements were collected from a moment immediately before opening the image, and ceased several seconds after VLU was complete.
When used like this, powermetrics imposes remarkably little overhead on the CPU cores, but its sampling periods are neither exact nor identical. This makes it difficult to correlate log entries and their precise timestamps with sampling periods. While powermetrics gives power use in mW, those measurements aren’t calibrated, and making assumptions about their accuracy is hazardous. Nevertheless, they remain the best estimates available.
The first step in log analysis was to identify the starting time of powermetrics sampling periods. Although execution of that command left no trace in its own entries, as it has to be run with elevated privileges using sudo, its approval was obvious in entries concluding with
30.677182 com.apple.opendirectoryd ODRecordVerifyPassword completed
A subsequent entry at 30.688828 seconds was thus chosen as the start time for sampling periods, and all times below are given in seconds after that time zero.
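Once that time zero is fixed, rebasing log timestamps onto it is straightforward; in this Python sketch the second entry’s timestamp is a hypothetical example, not taken from the actual log extract:

```python
# Rebase log timestamps onto the chosen time zero (30.688828 s)
TIME_ZERO = 30.688828

# (timestamp in s, message); the second entry is a hypothetical example
entries = [
    (30.677182, "com.apple.opendirectoryd ODRecordVerifyPassword completed"),
    (31.688828, "com.apple.VisionKit Signpost Begin"),
]

# negative elapsed times fall before sampling started
elapsed = [(round(t - TIME_ZERO, 6), msg) for t, msg in entries]
```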
The following relevant events were identified in the log extract:
com.apple.VisionKit Signpost Begin: “VisionKit MAD Parse Request”
com.apple.mediaanalysis Running task VCPMADServiceImageProcessingTask
Thus, the ANE ran almost continuously from 1.4-2.2 seconds after the start of sampling, but was otherwise little used over the total period of about 9 seconds. Over that period of activity, an initial model used to detect objects was succeeded by a later model to identify objects in a ‘nature world’.
From the log record, it was deduced that the VLU started in powermetrics sample 10 (1.0 seconds elapsed), and was essentially complete by sample 75 (7.5 seconds elapsed), a period of approximately 6.5 seconds, following which power use was low until the end of the sampling periods. All subsequent calculations refer to that series of samples and period of time.
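In Python terms, picking out that VLU window from the hundred samples looks like this; the power values are placeholders rather than the real measurements, and the nominal 100 ms per sample is assumed for elapsed times:

```python
# Hypothetical per-sample CPU power (mW); the real series had 100 samples
cpu_mw = [50.0] * 100

VLU_START, VLU_END = 10, 75          # sample numbers deduced from the log
window = cpu_mw[VLU_START:VLU_END]   # samples 10..74, covering the VLU

duration_s = len(window) * 0.1       # about 6.5 seconds
```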
Sums, averages and maxima of power measurements for that period of 6.5 seconds are:
Thus for the whole VLU, 93% of power was used by the CPU, 4.6% by the GPU, and only 2.2% by the ANE.
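As a check on those proportions, the percentage split can be recomputed from per-subsystem sums; the figures below are hypothetical placeholders chosen only to reproduce the reported shares, not the measured values:

```python
# Hypothetical summed power (mW) over the VLU window, for illustration only
sums_mw = {"CPU": 930.0, "GPU": 46.0, "ANE": 22.0}

total = sum(sums_mw.values())
shares = {k: round(100 * v / total, 1) for k, v in sums_mw.items()}
# shares: CPU about 93%, GPU 4.6%, ANE 2.2%
```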
For comparison, in the M4 Pro chip running maximal in-core loads, each P core can use 1.3 W running floating point code, and 3 W running NEON code. The chip’s 20-core GPU was previously measured as using a steady maximum power of 20 W, with peaks at 25 W.
As each power sample covers 0.1 seconds, the energy used during each sampling period is the power multiplied by 0.1 s; thus the total energy used over the 6.5 second period of VLU is:
Those are small compared to the test threads used previously, which cost 3-8 J for each P core used.
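The conversion from powermetrics’ milliwatt samples to joules is simple; in this sketch the three power values are arbitrary examples, not measurements:

```python
# Hypothetical power samples (mW), each covering 0.1 s
power_mw = [40.0, 120.0, 80.0]

# E (J) = P (W) x t (s), converting mW to W first
energy_j = sum(p / 1000 * 0.1 for p in power_mw)
```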
Power used in each 100 ms sampling period varied considerably over the whole 10 seconds. The chart below shows total power for the CPU.
Highest power was recorded between samples 10-25, corresponding to 1.0-2.5 seconds elapsed since the start of measurements, and to most of the events identified in the log. Later bursts of power use occurred at about 4.2 seconds, and between 6.6-7.1 seconds; these most probably corresponded to opening the Info window and performing the selected look-up.
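Such bursts can be found mechanically by scanning the sample series for runs above a threshold; the data and the 50 mW threshold here are hypothetical, for illustration only:

```python
# Hypothetical CPU power series (mW) and threshold, for illustration only
power_mw = [5, 5, 60, 80, 70, 5, 5, 90, 5]
THRESHOLD = 50.0

bursts, start = [], None
for i, p in enumerate(power_mw):
    if p > THRESHOLD and start is None:
        start = i                       # burst begins
    elif p <= THRESHOLD and start is not None:
        bursts.append((start, i - 1))   # burst ends at previous sample
        start = None
if start is not None:                   # burst still open at end of series
    bursts.append((start, len(power_mw) - 1))

# multiply sample indices by 0.1 to convert to elapsed seconds
```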
Almost all power use by the neural engine occurred between 1.5-2.1 seconds, correlating well with the period in which substantial models were being run.
Peak GPU power use occurred around 1.0-1.5 seconds when the image was first displayed, at 3.1-3.2 seconds, and between 6.5-7.4 seconds. It’s not known whether any of those were the result of image processing for VLU as GPU-related log entries are unusual.
Composite total power use demonstrates how small and infrequent ANE and GPU use was in comparison to that of the CPU.
Given the limitations of this single set of measurements, I suggest that, on Apple silicon Macs