Which cores does Visual Look Up use?
A couple of weeks ago I estimated how much power and energy were used when performing Visual Look Up (VLU) on an Apple silicon Mac, and was surprised to discover how little that was, concluding that “it’s not actually that demanding on the capability of the hardware”. This article returns to those measurements and looks in more detail at what the CPU cores and GPU were doing.
That previous article gives full details of what I did. In brief, this was performed on a Mac mini M4 Pro running macOS Sequoia 15.6.1, using an image of cattle in a field, opened in Preview. powermetrics collected samples in periods of 100 ms throughout, and a full log extract was obtained to relate time to logged events.
Power use by CPU cores, GPU and neural engine (ANE) is shown in this chart from that article. This tallies with log records for the main work in VLU being performed in samples 10-24, representing a time interval of approximately 1.0-2.4 seconds after the start. There were also briefer periods of activity around 3.2 seconds on the GPU, 4.2 seconds on the CPU, and 6.6-7.1 seconds on the CPU. The last of these correlated with online access to Apple’s SMOOT service to populate and display the VLU window.
To gain further detail, powermetrics measurements of CPU core cluster frequencies, active residencies of each core, and GPU frequency and active residency, were analysed for the first 80 collection periods.
Frequency and active residency
Cluster frequencies in MHz are shown in the chart above for the one E and two P clusters, and the GPU. These show:
- The E cores (black) ran at a baseline of 1200-1300 MHz for much of the time, reaching their maximum frequency of 2592 MHz during the main VLU period at 1.0-2.4 seconds.
- The first P cluster (blue), P0, was active in short bursts over the first 1.5 seconds, and again between 6.3-7.0 seconds. For the remainder of the period the cluster was shut down.
- The second P cluster (red), P1, was most active during the three periods of high power use, although it didn’t quite reach its maximum frequency of 4512 MHz. When there was little core activity, it was left to idle at 1260 MHz but wasn’t shut down.
- The GPU (yellow) ran at 338 MHz or was shut down for almost all the time, with one brief peak at 927 MHz.
This chart shows the total active residencies for each of the three CPU clusters, obtained by adding their % measurements. Thus the maximum for the E cluster is 400%, and 500% for each of the two P clusters, and 1,400% in all. These are broadly equivalent to the CPU % shown in Activity Monitor, and take no account of frequency. These show:
- The E cores (pale blue) had the highest active residency throughout, ranging from as little as 30% when almost idle around 5 seconds, to just over 300% during the main VLU phase at 1.4 seconds.
- The first P cluster (purple) remained almost inactive throughout.
- The second P cluster (red) was only active during the periods of highest work, particularly between 1.0-2.4 seconds and again at 6.4-7.1 seconds. For much of the rest of the test it had close to zero active residency.
Taken together, these show that a substantial proportion of the processing undertaken in VLU was performed by the E cores, with shorter peaks of activity in some of the cores in the second P cluster. For much of the time, though, all ten P cores were either idle or shut down.
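The total active residency used in that chart is simply the sum of the per-core percentages within each cluster. As a minimal sketch of that step, the Python below adds up per-core values for a single sample. The per-core numbers and the names `sample`, `total_active_residency` and `max_residency` are made up for illustration, and aren't taken from my measurements; only the core counts (four E cores, and five P cores in each of two clusters) come from the text above.

```python
# Hypothetical per-core active residencies (%) for one 100 ms sample.
# These values are invented for illustration, not measured.
sample = {
    "E": [80.0, 75.0, 60.0, 55.0],        # 4 E cores
    "P0": [0.0, 0.0, 1.0, 0.0, 0.0],      # 5 P cores in the first P cluster
    "P1": [50.0, 30.0, 20.0, 10.0, 5.0],  # 5 P cores in the second P cluster
}


def total_active_residency(per_core):
    """Sum per-core active residency (%) to give the cluster total."""
    return sum(per_core)


def max_residency(per_core):
    """Maximum possible total for a cluster: 100% for each core."""
    return 100.0 * len(per_core)


totals = {name: total_active_residency(cores) for name, cores in sample.items()}
maxima = {name: max_residency(cores) for name, cores in sample.items()}

for name in sample:
    print(f"{name}: {totals[name]:.0f}% of a possible {maxima[name]:.0f}%")
print(f"All clusters: maximum {sum(maxima.values()):.0f}%")
```

Like the CPU % in Activity Monitor, these totals take no account of the frequency at which each cluster was running.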
Load
Combining frequency and active residency into a single value is difficult for the two types of CPU core. To provide a rough metric, I have calculated ‘cluster load’ as
total cluster active residency x (cluster frequency / maximum core frequency)
where the maximum frequency of these E cores is taken as 2592 MHz, and the P cores as 4512 MHz. For example, in the sample period at 2.2 seconds, the P1 cluster frequency was 4449 MHz, and the total active residency for the five cores was 122%. Thus the P1 cluster load was 122 x (4449/4512) = 120.3%. Maximum load for that cluster would have been 500%.
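That calculation can be sketched in a few lines of Python, using the worked example for the P1 cluster at 2.2 seconds. The function and constant names here are my own for illustration; the figures are those given above.

```python
# Maximum core frequencies in MHz, as given in the text.
E_MAX_MHZ = 2592
P_MAX_MHZ = 4512


def cluster_load(total_residency_pct, cluster_mhz, max_mhz):
    """Scale total cluster active residency (%) by cluster frequency
    as a fraction of the maximum core frequency."""
    return total_residency_pct * (cluster_mhz / max_mhz)


# P1 cluster in the sample period at 2.2 seconds:
# 122% total active residency at 4449 MHz.
p1_load = cluster_load(122, 4449, P_MAX_MHZ)
print(f"P1 load: {p1_load:.1f}%")  # P1 load: 120.3%

# Five P cores fully active at maximum frequency give the ceiling.
p_cluster_max = cluster_load(500, P_MAX_MHZ, P_MAX_MHZ)
print(f"P cluster maximum: {p_cluster_max:.0f}%")  # P cluster maximum: 500%
```

The same function applies to the E cluster by passing `E_MAX_MHZ` as the maximum, with a ceiling of 400% for its four cores.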
The chart above shows load values for:
- The E cluster (black) rising to 150-260% during the peak of VLU activity, from a baseline of 20-30%.
- The P0 cluster (blue) which never reached 10% after the initial sample at 0 seconds.
- The P1 cluster (red) spiking at 90-150% during the three most active phases, otherwise remaining below 10%.
Caution is required when comparing E with P cores on this measurement, as not only is E core maximum frequency only 57% that of P cores, but it’s generally assumed that their maximum processing capacity is roughly half that of P cores. Even with that reservation, it’s clear that a substantial proportion of the processing performed in this VLU was on the E cores, with just one cluster of P cores active in short spikes.
Finally, it’s possible to examine the correlation between total P cluster load and total CPU power.
This chart shows calculated total P load and reported total CPU power use. The linear regression shown is
CPU power = 4.1 + (42.2 x total load)
giving a power use of about 4,200 mW for a load of 100%, equivalent to a single P core running at maximum frequency.
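To show how that regression is applied, here is a short Python sketch that evaluates the fitted line for a given total P load. It assumes power is in mW and load in %, which is what the figure for 100% load implies; the function and constant names are mine, and the coefficients are those of the regression line above.

```python
# Coefficients of the fitted line: CPU power = 4.1 + (42.2 x total load).
# Units are assumed to be mW for power and % for load.
INTERCEPT_MW = 4.1
SLOPE_MW_PER_PCT = 42.2


def predicted_cpu_power_mw(total_p_load_pct):
    """Estimate total CPU power (mW) from total P cluster load (%)."""
    return INTERCEPT_MW + SLOPE_MW_PER_PCT * total_p_load_pct


# A load of 100% corresponds to one P core at maximum frequency.
power_100 = predicted_cpu_power_mw(100)
print(f"Predicted power at 100% load: {power_100:.0f} mW")
```

This is only as good as the fit over the range of loads seen in this test, so extrapolating to much higher loads would be unwise.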
Conclusions
- Cluster frequencies and active residencies measured in CPU cores followed the same phases as seen in CPU power, with most of the processing load of VLU in the early stage, between 1.0-2.4 seconds, a shorter peak at 6.6-7.1 seconds correlating with online lookup, and a small peak at about 4.2 seconds.
- A substantial proportion of the processing performed for VLU was run on E rather than P cores, with P cores only being used for brief periods.
- Visual Look Up used remarkably little of the capability of an M4 Pro chip.