A brief history of Mac numeric processing
It might seem extraordinary today, but the first Macs with Motorola 68000 processors couldn’t add two floating point numbers directly in their CPU. As with other early processors, they handled integers, not floating point, and that’s one reason why many features such as display coordinates started off using integers rather than floating point as they do today.
Of course the Mac and other computers could perform floating point calculations, but that required the use of maths routines in software libraries. Others had been implementing floating point support in hardware: for example, Intel started developing its own maths coprocessor, the 8087, in 1977, and that became available to accompany its 8086 processor. At that time Motorola didn’t have any equivalent.
68K Macs
Apple hired a young mathematician to define and implement what became known as the Standard Apple Numerics Environment or SANE, for the Apple II, III, Lisa and Mac product lines, and it was SANE that formed the basis for Motorola’s 68881 maths coprocessor for its 68020 CPU in 1984. At the same time, the IEEE was standardising floating point maths for computing, and in 1985 published its first version of IEEE 754.
SANE was built into the first Mac 64K ROM, and when Macs started to come with 68020 CPUs and 68881 coprocessors, in the Macintosh II of 1987, they ran their floating point routines on the 68881 or its successor the 68882. In 1991, the first Quadra came with a Motorola 68040 and its integrated floating point unit, although as late as 1995 Apple was still releasing new Macs that lacked any hardware support for floating point maths. The most complete description of SANE is in a printed account published by Addison-Wesley, the second edition dating from 1988.
Power Macs
The change to PowerPC processors in 1994 brought an end to SANE, replacing it with PowerPC Numerics, which differed in many of its details. Floating point support in PowerPC CPUs included additional instructions to support Apple’s new standard. The period of transition was covered by providing SANE for backward compatibility with apps that had been built for Motorola 68K processors, and encouraging developers to rebuild their apps to use the PowerPC’s new Numerics. Tom Pittman and John Neil produced and marketed PowerFPU, a control panel for Power Macs, that ran 68K floating point code in emulation.
From 1996, the AIM Alliance of Apple, IBM and Motorola developed extensions to the PowerPC instruction set to support vector processing. Known variously as AltiVec, VMX for Vector Multimedia Extension, and Apple’s Velocity Engine, it was used to accelerate QuickTime and, once Mac OS X had been introduced, Quartz. Those extensions handle both integer and floating point values in registers that are 128 bits wide, so that each operation can work on several packed values at once. Velocity Engine was supported by Power Macs with G4 and G5 processors, from 1999 onwards.
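To make the idea of packing concrete, here is a minimal sketch in Python that treats 16 bytes as a stand-in for one 128-bit register. The names `pack_lanes` and `add_float_lanes` are hypothetical helpers invented for this illustration, not part of AltiVec or any Apple API, and a real vector unit performs the lane-wise addition in a single instruction rather than a loop.

```python
import struct

REGISTER_BYTES = 16  # one 128-bit vector register

def pack_lanes(fmt, values):
    """Pack values into 16 big-endian bytes, e.g. fmt '4f' for four 32-bit floats."""
    data = struct.pack(">" + fmt, *values)
    if len(data) != REGISTER_BYTES:
        raise ValueError("values do not fill exactly 128 bits")
    return data

def add_float_lanes(reg_a, reg_b):
    """Add two registers lane by lane, treating each as four 32-bit floats."""
    a = struct.unpack(">4f", reg_a)
    b = struct.unpack(">4f", reg_b)
    return struct.pack(">4f", *(x + y for x, y in zip(a, b)))

# The same 128 bits can hold 16, 8 or 4 values depending on their width.
bytes_reg = pack_lanes("16b", range(16))        # sixteen 8-bit integers
shorts_reg = pack_lanes("8h", range(8))         # eight 16-bit integers
floats_reg = pack_lanes("4f", [1.0, 2.0, 3.0, 4.0])  # four 32-bit floats

total = add_float_lanes(floats_reg, pack_lanes("4f", [0.5, 0.5, 0.5, 0.5]))
lanes = struct.unpack(">4f", total)
```

Here `lanes` holds the four sums, one for each pair of packed floats.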
Intel Macs
When Macs changed architecture to use Intel CPUs, those had integral floating point support, including x87 maths co-processor emulation and Intel’s Streaming SIMD Extensions, SSE, providing a replacement for features in the PowerPC. Apple had already collected its more advanced numerical and vector support into the Accelerate framework in 2003, in Mac OS X 10.3 Panther, covering signal processing, image processing, linear algebra with BLAS/LAPACK, vector maths and more.
The transition to Intel wasn’t as seamless as might have appeared though, because of differences that might at first look subtle. For instance, PowerPC floating point support included a single, fused operation to multiply and add, but Intel CPUs performed the operations separately, which could accumulate additional rounding error. Apple warned “that in cases involving catastrophic cancellation, this may give results that are vastly different after the addition or subtraction has completed.”
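The difference between fused and separate operations can be demonstrated with a small sketch in Python. This isn’t how either processor implements it: the hypothetical function `fused_multiply_add` emulates a fused operation by computing the exact result with `fractions.Fraction` and rounding only once, while `separate_multiply_add` rounds after the multiplication and again after the addition. The input values are chosen for illustration so that the product lies just below 1.0 and the addition cancels almost everything.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate fused a*b + c: exact arithmetic, then a single rounding to double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

def separate_multiply_add(a, b, c):
    """Multiply and add as two operations, each rounded to double."""
    product = a * b      # first rounding
    return product + c   # second rounding

# a * b is exactly 1 - 2**-54, which can't be represented as a double
# and rounds to 1.0 when the multiplication is performed on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

separate = separate_multiply_add(a, b, c)  # 0.0
fused = fused_multiply_add(a, b, c)        # -2**-54, about -5.55e-17
```

Both results are tiny, but one is exactly zero and the other isn’t, so any later division by, or comparison with, the result behaves quite differently.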
As Intel Macs developed, they acquired increasingly capable GPUs that offered an alternative for some floating point calculations. OpenCL was introduced in 2009 to facilitate this, and in 2015, with OS X 10.11 El Capitan, Apple added support for its own GPU programming using the Metal API. That has evolved since, with Metal 2 introduced in macOS 10.13 High Sierra, and subsequent enhancements.
Apple silicon
From the first M1 chip, Apple silicon has put floating point performance to the fore. All the old variables that had originally been coded as integers have now become floating point, requiring fast and accurate scalar, vector and matrix support. CPU cores, even Efficiency cores, include extensive scalar instructions, with Arm’s NEON vector processing. GPUs support Metal 3, and matrix operations are catered for in a dedicated neural engine and an undocumented matrix coprocessor, the AMX. The latest M4 chip adds support for Arm’s SME matrix extensions in its ARMv9.2-A instruction set, although those are thought to be executed by the AMX.
While most of those are supported directly, access to the neural engine and (prior to the M4) the AMX coprocessor has been limited. It’s believed that appropriate functions in the Accelerate and related frameworks use whatever hardware is most appropriate.
Floating point calculations, often using very large matrices, are a key part of modern neural networks and both Machine Learning and Apple/Artificial Intelligence. Apple added support for a new floating point format, bfloat16, to Metal in 2023, and in its CPU core instruction set with the M2.
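For those unfamiliar with bfloat16, it has the same layout as the upper 16 bits of an IEEE 754 32-bit float: one sign bit, eight exponent bits and seven fraction bits. The following Python sketch converts between the two by simply discarding or restoring the lower 16 bits. The function names are hypothetical, and truncation is an assumption made here for brevity, as real hardware normally rounds to nearest instead.

```python
import struct

def float_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x, truncating the float32 value."""
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits32 >> 16  # keep sign, 8 exponent bits and top 7 fraction bits

def bfloat16_bits_to_float(bits16):
    """Expand a bfloat16 bit pattern back to a float by zero-filling the low bits."""
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]

pi_bits = float_to_bfloat16_bits(3.141592653589793)  # 0x4049
pi_back = bfloat16_bits_to_float(pi_bits)            # 3.140625
```

The range of a 32-bit float is preserved because the exponent is unchanged, but only two to three significant decimal digits survive.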
In the 40 years since the 128K Mac, crunching numbers has come a long way, thanks to Apple’s dedicated teams of mathematician-engineers.
Floating point formats
One of the eternal problems when working with floating point numbers in hex is their encoding. Converting IEEE 754 hex format into decimal expressed in engineering notation is fairly arcane. My free utility Mints includes a floating point explorer, to convert between 32- and 64-bit floating point and decimal engineering/scientific formats.
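As a rough illustration of what such a conversion involves, here is a minimal Python sketch that splits a 32-bit IEEE 754 value given in hex into its sign, exponent and fraction fields, and rebuilds the decimal value from them. This isn’t how Mints does it, and the function names are invented for this example. A 64-bit version follows the same pattern with 11 exponent bits, 52 fraction bits and a bias of 1023.

```python
import struct

def split_binary32(hex_string):
    """Split 32-bit IEEE 754 hex into (sign, biased exponent, fraction) fields."""
    bits = int(hex_string, 16)
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def value_from_fields(sign, exponent, fraction):
    """Rebuild the value from its fields, following the IEEE 754 encoding rules."""
    if exponent == 0xFF:  # all ones: infinity or NaN
        magnitude = float("inf") if fraction == 0 else float("nan")
    elif exponent == 0:   # zero or subnormal: no implicit leading 1
        magnitude = (fraction / 2 ** 23) * 2.0 ** -126
    else:                 # normal: implicit leading 1, exponent bias of 127
        magnitude = (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)
    return -magnitude if sign else magnitude

def hex_to_float32(hex_string):
    """Convert directly using struct, as a cross-check on the manual decoding."""
    return struct.unpack(">f", bytes.fromhex(hex_string))[0]

fields = split_binary32("40490FDB")   # the 32-bit float closest to pi
decoded = value_from_fields(*fields)
scientific = f"{decoded:.6e}"         # scientific notation string
```

For `40490FDB` the fields are sign 0, biased exponent 128 and fraction `0x490FDB`, giving a value of about 3.1415927.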
References
PowerFPU, a brief account by Tom Pittman
IEEE 754 at Wikipedia
Inside Macintosh: PowerPC Numerics on the Internet Archive
Velocity Engine on the Internet Archive
SSE Performance Programming and the early Accelerate framework, on the Internet Archive
Accelerate framework (current)
Metal calculations on a GPU (current)