Comparing in-core performance of Intel, M3 and M4 CPU cores

By: hoakley
16 May 2025 at 14:30

It has been a long time since I last compared performance between CPU cores in Intel and Apple silicon Macs. This article compares six in-core measures of CPU performance across four different models, two with Intel processors, an M3 Pro, and an M4 Pro.

If you’re interested in comparing performance across mixed code that models what common apps do, then look no further than Geekbench. The purpose of my tests isn’t to replicate those, but to gain insight into the CPU cores themselves when running tight number-crunching loops that largely use their registers and access memory as little as possible. This set of tests lays emphasis on those run at low Quality of Service (QoS), thus on the E cores of Apple silicon chips. Although E cores run relatively little user code, they are responsible for much of the background processing performed by macOS. They can also run threads at high QoS when there are no free P cores available, although they do that at higher frequencies to deliver better performance.

Methods

Testing was performed on four Macs:

  • iMac Pro 2017, 3.2 GHz 8-core Intel Xeon W, 32 GB memory, Sequoia 15.3.2;
  • MacBook Pro 16-inch 2019, 2.3 GHz 8-core Intel Core i9, 16 GB memory, Sequoia 15.5;
  • MacBook Pro 16-inch 2023, M3 Pro, 36 GB memory, Sequoia 15.5;
  • Mac mini 2024, M4 Pro, 48 GB memory, Sequoia 15.5.

Six test subroutines were used in a GUI harness, as described in many of my previous articles. Normally, those include tests I have coded in Arm Assembly language, but for cross-platform comparisons I rely on the following coded in Swift:

  • float mmul, direct calculation of 16 x 16 matrix multiplication using nested for loops on Floats;
  • integer dot product, direct calculation of vector dot product on vectors of 4 Ints;
  • simd_float4, calculation of the dot-product using simd_dot in the Accelerate library;
  • vDSP_mmul, a function from the vDSP sub-library in Accelerate, multiplies two 16 x 16 32-bit floating-point matrices, which in M1 and M3 chips appears to use the AMX co-processor;
  • SparseMultiply, a function from Accelerate’s Sparse Solvers, multiplies a sparse and a dense matrix, and may use the AMX co-processor in M1 and M3 chips;
  • BNNSMatMul, matrix multiplication of 32-bit floating-point numbers, here in the Accelerate library, and since deprecated.

Source code for the last four is given in the appendix to this article.
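For the first two tests, the calculations themselves are simple enough to sketch. What follows is an illustration in Python, not the Swift used for testing, and the function names float_mmul and int_dot are invented here. It shows only what each loop computes, and says nothing about how the Swift compiler turns the equivalent code into instructions.

```python
def float_mmul(a, b):
    """Multiply two square matrices (lists of rows) using three nested loops."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def int_dot(u, v):
    """Dot product of two equal-length integer vectors."""
    total = 0
    for x, y in zip(u, v):
        total += x * y
    return total


# 16 x 16 inputs, matching the matrix size used in the article's test
identity = [[1.0 if i == j else 0.0 for j in range(16)] for i in range(16)]
m = [[float(i * 16 + j) for j in range(16)] for i in range(16)]
product = float_mmul(m, identity)   # multiplying by the identity returns m
dot = int_dot([1, 2, 3, 4], [5, 6, 7, 8])
```

In a throughput test, calls like these would be repeated many times inside a timed loop.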

Each test was run first in a single thread, then in four threads simultaneously. Loop throughput per second was calculated from the average time taken for each of the four threads to complete, and compared against the single thread to ensure it was representative. Results are expressed as percentages compared to test throughput at high QoS on the iMac Pro set at 100%. Thus a test result reported here as 200% indicates the cores being tested completed calculations in loops at twice the rate of those in the cores of the iMac Pro, so are ‘twice the speed’.
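The arithmetic behind those percentages can be sketched as follows. This is a minimal illustration in Python, assuming each thread runs a fixed number of loops. The function names and timings are hypothetical, not measurements from these tests.

```python
from statistics import mean


def loop_throughput(loops_per_thread, thread_times):
    """Loops per second, from the average time (s) the threads took to finish."""
    return loops_per_thread / mean(thread_times)


def relative_percent(throughput, baseline):
    """Throughput as a percentage of the baseline (the iMac Pro at high QoS)."""
    return 100.0 * throughput / baseline


# Hypothetical timings for four threads, each running one million loops
baseline = loop_throughput(1_000_000, [2.0, 2.0, 2.0, 2.0])
candidate = loop_throughput(1_000_000, [1.0, 1.0, 1.0, 1.0])
print(relative_percent(candidate, baseline))  # 200.0
```

Here the candidate cores finish the same work in half the time, so they are reported as 200%.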

High QoS

User threads are normally run at high QoS, so getting the best performance available from the CPU cores. In Apple silicon chips, those threads are run preferentially on P cores at high frequency, although that may not be at the core’s maximum. Results are charted below.

Each cluster of bars here shows loop throughput for one test relative to the iMac Pro’s 3.2 GHz 8-core Xeon processor at 100%. Pale blue and red bars are for the two Intel Macs, the M3 Pro is dark blue, and the M4 Pro green. The first three tests demonstrate what was expected, with an increase in performance in the M3 Pro, and even more in the M4 Pro to reach about 200%.

Results from vDSP matrix multiplication are different, with less of an increase in the M3 Pro, and a reduction in the M4 Pro. This may reflect issues in the code used in the Accelerate library. That contrasts with the huge increases in performance seen in the last two tests, rising to a peak of over 400% in BNNS matrix multiplication.

With that single exception, P cores in recent Apple silicon chips are out-performing Intel CPU cores by wider margins than can be accounted for in terms of frequency alone.
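As a rough check on that claim, relative throughput can be divided by the ratio of core frequencies. The Python sketch below assumes the iMac Pro's nominal 3.2 GHz and the 4.5 GHz maximum quoted for M4 P cores, together with the result of about 200% in the first three tests. The function name is invented, and the frequencies at which the cores actually ran during testing are not known.

```python
def per_clock_ratio(relative_throughput_percent, freq_ghz, baseline_freq_ghz):
    """Throughput ratio left over after dividing out the frequency ratio."""
    freq_ratio = freq_ghz / baseline_freq_ghz
    return (relative_throughput_percent / 100.0) / freq_ratio


# Assumed nominal frequencies; real running frequencies may have differed
freq_ratio = 4.5 / 3.2                    # about 1.41
ratio = per_clock_ratio(200.0, 4.5, 3.2)  # about 1.42
```

On those assumptions, frequency accounts for a factor of about 1.41, leaving a further factor of about 1.42 to be explained by other differences in the cores. If the Xeon ran above its nominal frequency, or the M4 Pro below its maximum, the frequency ratio would be smaller and that remaining factor larger.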

Low QoS

When expressed relative to loop throughput at high QoS, no clear trend emerges in Apple silicon chips. This reflects the differences in handling of threads run at low QoS: as the Intel CPUs used in Macs only have a single core type, they can only run low QoS threads at lower priority on the same cores. In Apple silicon chips, low QoS threads are run exclusively on E cores running at frequencies substantially lower than their maximum, for energy efficiency. This is reflected in the chart below.

In the Intel Xeon W of the iMac Pro, low QoS threads are run at a fairly uniform throughput of about 45% that of high QoS threads, and in the Intel Core i9 that percentage is even lower, at around 35%. Throughput in Apple silicon E cores is more variable, and in the case of the last test, the E cores in the M4 Pro reach 66% of the throughput of the Intel Xeon at high QoS. Thus, Apple appears to have chosen the frequencies used to run low QoS threads in the E cores to deliver the required economy rather than a set level of performance.

Conclusions

  • CPU P core performance in M3 and M4 chips is generally far superior to CPUs in late Intel Macs.
  • Performance in M3 P cores is typically 160% that of a Xeon or i9 core, rising to 330%.
  • Performance in M4 P cores is typically 190% that of a Xeon or i9 core, rising to 400%.
  • Performance in E cores when running low QoS threads is more variable, and typically around 30% that of a Xeon or i9 core at high QoS, to achieve superior economy in energy use.
  • On Intel processors running macOS Sequoia, low QoS threads are run significantly slower than high QoS threads, at about 45% (Xeon) or 30-35% (i9).

My apologies for omitting legends from the first version of the two charts, and thanks to @holabotaz for drawing my attention to that error, now corrected.

Choosing an Apple silicon Mac

By: hoakley
7 May 2025 at 14:30

This coming autumn it’ll be five years since Apple started shipping its first Apple silicon Macs, and it’s already four years since the first M1 iMac. As prices of used Intel Macs are tumbling, more Apple silicon models are coming onto the used market. With a total of 15 basic M-series chips now available, this article tries to help you decide which new or used Apple silicon model to buy.

CPU cores

With such a wide choice, this is perhaps the most complex feature to understand, and it’s likely to make the biggest difference to what your Mac will do. M-series chips have anything from 2-8 Efficiency (E) cores, and 4-24 Performance (P) cores across four different families.

Although folk are usually more concerned with the number of P cores, E cores are responsible for doing much of the routine work, and shouldn’t be ignored. They run most of the background tasks in macOS, from Time Machine backups to indexing all your images and documents for Spotlight. P cores are largely responsible for running the code in your apps, so determine how fast it feels in use.

Most M-series chips have at least 4 E cores, but two, the M1 Pro and M1 Max, have only 2. They compensate for that by running those E cores at higher frequencies when working on heavy background tasks, but subsequent designs have set the comfortable minimum at 4, and the latest base M4 comes with 6. Of the two core types, E cores are the more versatile, as they can run all types of task, background or user, and when running at their maximum frequency can deliver a high proportion of the processing power of a P core. As an E core’s energy use is much lower than that of a P core, they’re a better option when running a laptop Mac on its battery.

The four E cores in this M4 Pro are kept fully occupied in the minutes after starting up, leaving the P cores free for running apps smoothly.

P and E core performance has increased with each new family. This is illustrated in different types of computation, when running one thread on a single core.

[Chart: M4M3multiTests]

The Y axis here gives loop throughput per second for my four basic in-core performance tests: a tight assembly code integer math loop, another tight assembly code loop of floating point math, NEON vector processor assembly code, and a tight loop calling an Accelerate routine run in the NEON unit. Pale blue bars are results for the M1, purple for the M3, and red for the M4; the higher the bar, the faster.

Maximum core frequencies have increased from 3.2 GHz in the M1’s P cores to 4.5 GHz in the M4. One crude comparative measurement of overall computing capacity is to total the maximum frequencies for each of the CPU cores in each chip. Those are shown as Σfn in the table below, and the chart that follows it.

These are also complicated by sub-variants and binned versions, where one or two cores have been disabled by Apple, to produce a cheaper chip.
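The Σfn figure is a straightforward sum, sketched here in Python. The P core frequencies are those quoted above, and the core counts are for the base M1 and the full base M4. The E core frequencies are round-number placeholders assumed purely for illustration, and the function name sigma_fn is invented.

```python
def sigma_fn(core_groups):
    """Sum of maximum frequency (GHz) over every CPU core in a chip.

    core_groups is a list of (core_count, max_frequency_ghz) pairs.
    """
    return sum(count * freq for count, freq in core_groups)


# P core frequencies are from the article; E core frequencies (2.0 and 2.5 GHz)
# are placeholders, NOT measured or published values
m1 = sigma_fn([(4, 3.2), (4, 2.0)])   # 4 P + 4 E
m4 = sigma_fn([(4, 4.5), (6, 2.5)])   # 4 P + 6 E
```

A binned chip with one core disabled would simply reduce the count for that core type.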

If you’re looking for CPU performance, the M3 Max, and M4 Pro and Max stand out and approach the performance of Ultra chips. But those figures assume that the software running is able to make full use of all the cores available. There’s no point in paying for the 32 cores in an M3 Ultra if the app you run most can’t use many of them.

Another detail that’s easily overlooked is the instruction set (ISA) supported, notably that of the M4, which includes new features for accelerated matrix and other computation. In this respect, the M2 family has been underrated, as I’ve explained here.

GPU

For most, the choice of CPU cores determines the GPU provided, and for general use they’re well matched. The exceptions are when high GPU performance is essential, or when you need to support external displays. In either case you’ll need to check carefully with Apple’s specifications or Mactracker to ensure support. That’s particularly important when driving multiple high-resolution displays.

Memory

Memory options are determined by the chip, with some starting at only 8 GB, which is insufficient even for the lightest use. There was a myth that the use of Unified memory would result in substantial economy in memory use, but in practice that doesn’t work out, and demand for memory has increased with the introduction of new features such as AI.

The danger with too little memory is that heavy use of swap storage is deceptively fast because of the high speed of the internal SSD. As models with 8 GB memory often have small SSDs as well, this is likely to lead to rapid ‘wear’ of the SSD, and some early adopters saw worryingly rapid changes in wear indicators. Fortunately, Apple has recognised this problem, and all M4 models now come with a minimum of 16 GB.

If you’re interested in buying an older model with only 8 GB, at least check its SSD size and wear indicators before parting with your money. Further information about memory requirements is here.

SSD

While it’s possible to enjoy using an Apple silicon Mac with only 256 GB internal SSD, unless you’re frugal in its use you’ll find yourself buying a more substantial external SSD to supplement that. You can start up an Apple silicon Mac from macOS on an external SSD (or even a hard drive, if you must), but that’s more fussy than with an Intel Mac. If you want to consider relying on external storage, this article explains how best to do that.

For most users, an internal SSD of at least 512 GB is needed for comfort and a long life.

Buy to upgrade

Until recently, all Apple silicon Macs have been stuck with the CPU cores, GPU, memory and internal SSD that they came with. That may be changing, with some third parties now offering SSD upgrades for the M4 Mac mini. However, those are likely to invalidate your warranty, and aren’t likely to be available for other popular models, apart from the Mac Pro.

Recommendations

  • Prefer a later Pro or Max chip over an M1 Pro or Max, to get at least 4 E cores.
  • E cores are more versatile than P cores, and an advantage when a laptop is powered by its battery.
  • If you need to use external displays, check the model’s support for their number and resolution.
  • Look for a minimum of 16 GB of memory.
  • When buying a model with 8 GB of memory, check the wear on its SSD.
  • Prefer a minimum of 512 GB SSD to avoid relying on external storage.
  • Don’t rely on upgrading any Apple silicon Mac’s internal hardware.

Enjoy your new Mac!

The perils of virtualisation on M4 Macs

By: hoakley
21 April 2025 at 14:30

Until last November, lightweight virtualisation of macOS on Apple silicon Macs had behaved uniformly across M-series families. Although I have heard of one report of problems moving VMs between Macs, those were built with custom kernels. In ordinary experience, VMs running on M1, M2 and M3 chips seemed not to care about the host’s hardware, and most of the time just worked, and updated correctly. There was one unfortunate glitch with shared folders that were lost in macOS 14.2 and 14.2.1, but otherwise VMs largely worked as expected.

Then last November disaster struck those of us who had just started using our new M4 Macs: they couldn’t virtualise any version of macOS before Ventura 13.4. Running a macOS VM for any version before that on an M4 Mac resulted in a black screen, and the VM failed to boot. That was fixed swiftly in macOS 15.2, and we no longer had to keep an older Apple silicon Mac around to be able to run those older versions of macOS in VMs.

Like many who virtualise macOS on Apple silicon, I keep a library of VMs with different versions so I can readily run tests on my apps and other issues. This is one of the great advantages of virtualisation, provided that you don’t rely on being able to run most apps from the App Store. When Apple releases new versions of macOS, once I’ve updated my Mac hosts, I turn to updating VMs. I’m normally cautious when doing this, to avoid trashing the original version. I duplicate the most recent, open it and run Software Update. When I’m happy that has worked correctly, I trash the original and rename the updated VM with its new version number.
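That cautious routine could be scripted along the following lines. This is only a sketch in Python, assuming each VM is stored as a folder-like bundle with a .vm extension. The function names and file names are hypothetical, and the update itself still has to be run and checked by hand inside the VM.

```python
import shutil
from pathlib import Path


def duplicate_vm(original: Path) -> Path:
    """Copy the VM bundle so the update is run on a duplicate, not the original."""
    working = original.with_name(original.stem + " copy" + original.suffix)
    shutil.copytree(original, working)
    return working


def commit_update(original: Path, working: Path, new_name: str) -> Path:
    """Once the updated duplicate is known to work, remove the original and
    rename the duplicate with its new version number.

    Note: rmtree deletes permanently, unlike moving to the Trash.
    """
    shutil.rmtree(original)
    final = working.with_name(new_name + working.suffix)
    working.rename(final)
    return final
```

A typical use would be to call duplicate_vm() on a bundle such as "Sonoma 14.7.4.vm", run Software Update in the copy, then call commit_update() with the name "Sonoma 14.7.5". This sketch makes no attempt to preserve sparse files or to use APFS cloning, so the copy may take more time and disk space than duplicating in the Finder.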

That worked fine with Ventura 13.7.4 updating to 13.7.5, and Sonoma going to 14.7.5, but updating Sequoia 15.3.2 to 15.4 failed with a kernel panic, as I’ve detailed. When several of you kindly pointed out that M1, M2 and M3 Macs had no such problem, I confirmed on my M3 Pro that this is confined to hosts with an M4 family chip.

I have since tried updating my 15.3.2 VM to 15.4.1 on the M4 Pro, a surprisingly large update of over 6 GB, and that continues to result in a kernel panic and failure. I have also tried updating from 15.1 to 15.4.1 with an extraordinarily large download of more than 15 GB, only to see a repeat of the same kernel panic, with an almost identical panic log.

The macOS 15.4 update was particularly large, and some Apple silicon Macs were unable to install it successfully, most commonly on external bootable disks. From your reports, the 15.4.1 update seems to have fixed those problems with real rather than virtualised macOS. However, it hasn’t done anything to solve problems with VMs.

If you have an existing VM running any version of Sequoia prior to 15.4, then you’re unlikely to be successful updating that to 15.4 or later using an M4 host.

In contrast, upgrading a VM currently running Sonoma 14.7.5 completed briskly and without error. To my great surprise, that only requires a download of 8.7 GB, a little over half the size of the update from 15.1 to 15.4.1, which seems to be the wrong way round. The snag with upgrading from a previous major version of macOS to 15.x is that the VM will never be able to use one of the most attractive features of Sequoia, iCloud Drive. If you want support for that, you’ll have to build a fresh VM using a Sequoia IPSW image file.

So for the time being, M4 hosts have a barrier between 15.3.2 and 15.4 that they can’t cross with an update. If you want a VM running 15.4 or later, then you’ll have to build a new one, or update one that’s already running 15.4 or later.
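That barrier can be expressed as a simple rule of thumb, sketched here in Python. It encodes only the behaviour reported in this article for M4 hosts, not any documented rule from Apple, and the function names are invented for illustration.

```python
def parse_version(text):
    """Turn a version string such as '15.3.2' into a comparable tuple (15, 3, 2)."""
    return tuple(int(part) for part in text.split("."))


def update_expected_to_fail_on_m4(current, target):
    """True when a VM on an M4 host would have to cross the 15.4 barrier:
    it runs a Sequoia version before 15.4 and is being updated to 15.4 or later.
    """
    cur, tgt = parse_version(current), parse_version(target)
    barrier = (15, 4)
    return cur[0] == 15 and cur < barrier and tgt >= barrier
```

By this rule, updating from 15.3.2 or 15.1 to 15.4.1 is expected to fail, while upgrading from Sonoma 14.7.5 to 15.4.1 is not, matching the results described above.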

I don’t know and probably wouldn’t understand what changed in the 15.4 update, but it has certainly upset a lot of apple carts and VMs. And if you’d like a little homework, can you please explain:

  • Update 15.1 to 15.4.1, download 15 GB, failure.
  • Upgrade 14.7.5 to 15.4.1, download 8.7 GB, success.
