What is Quality of Service, and how does it matter?
In computing, the term Quality of Service is widely used to refer to communication and network performance. For Macs it has another, more significant meaning: it’s the property that determines the performance of each thread run on your Mac, most importantly in Apple silicon chips.
Processes and threads
Each process running on your Mac consists of at least one thread. Threads are single flows of code execution run on one CPU core at a time, sharing virtual memory allocated to that process, but with their own stack. In addition to the process’s main thread, it can create additional threads as it requires, which can then be scheduled to run in parallel on different cores. As all recent Macs have more than one core, processes with more than one thread can make good use of more than one core, and so run faster.
Take the example of a file compressor. If it’s coded so that it can perform its compression in four threads that can be run simultaneously, then it will compress files in roughly a quarter of the time when it runs on four CPU cores, compared with running on a single core (ignoring input and output to disk).
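To make that example concrete, here is a minimal Python sketch of a compressor that splits its input into four chunks and compresses each in its own thread. It isn't how any particular compression app works: the function names are hypothetical, zlib stands in for whatever algorithm a real app would use, and it doesn't set any QoS. It relies on CPython's zlib releasing the interpreter lock while compressing, so the four threads can run on separate cores.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data: bytes, parts: int) -> list[bytes]:
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(data: bytes, threads: int = 4) -> list[bytes]:
    """Compress each chunk in its own worker thread, keeping chunk order."""
    chunks = split_into_chunks(data, threads)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(blobs: list[bytes]) -> bytes:
    """Reverse compress_parallel by decompressing and joining the chunks."""
    return b"".join(zlib.decompress(blob) for blob in blobs)


payload = b"Quality of Service " * 50_000
blobs = compress_parallel(payload, threads=4)
print(len(blobs), "chunks,", sum(map(len, blobs)), "compressed bytes")
```

Compressing chunks independently costs a little in compression ratio, which is the usual trade-off for being able to use several cores at once.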
That only works when those four cores are all free. If your Mac is also trying to build its Spotlight indexes at the same time, the threads doing that will compete with those of your compression app. That’s where the thread’s Quality of Service (QoS) settings come in, as they assign priority. On Apple silicon Macs, a thread’s QoS will also help determine whether it’s run on its Performance or Efficiency cores.
Standard QoS settings
QoS is set by the process, and is normally chosen from the standard list:
- QoS 9 (binary 001001), named background and intended for threads performing maintenance, which don’t need to be run with any higher priority.
- QoS 17 (binary 010001), utility, for tasks the user doesn’t track actively.
- QoS 25 (binary 011001), userInitiated, for tasks that the user needs to complete to be able to use the app.
- QoS 33 (binary 100001), userInteractive, for user-interactive tasks, such as handling events and the app’s interface.
There’s also a ‘default’ value of QoS, 21, which falls between utility (17) and userInitiated (25), an unspecified value, and in some circumstances you might come across others used by macOS.
These are the QoS values exposed to the programmer. Internally, macOS uses a more complex scheme with different values.
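The numbers and binary patterns in the list above can be checked with a short Python sketch. The enum and function names here are illustrative only and aren't Apple's API; the ‘default’ value of 21 is included as the value lying between utility and userInitiated.

```python
from enum import IntEnum


class QoS(IntEnum):
    """Standard QoS values as listed above, plus the 'default' of 21."""
    BACKGROUND = 9
    UTILITY = 17
    DEFAULT = 21
    USER_INITIATED = 25
    USER_INTERACTIVE = 33


def as_binary(qos: QoS) -> str:
    """Format a QoS value as the six-bit binary string used in the list."""
    return format(int(qos), "06b")


# Print each level with its decimal and binary values.
for level in QoS:
    print(f"{level.name:<16} {int(level):>2}  {as_binary(level)}")
```

Seen this way, the four standard levels are spaced 8 apart, and each has its lowest bit set.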
CPU core type
When running apps on Intel Macs, because all their CPU cores are identical, QoS has more limited effect, and is largely used to determine priority when there are threads queued for execution on a limited number of cores.
Apple silicon Macs are completely different, as they have two types of CPU core, Efficiency (E) cores designed to use less energy and normally run at lower frequencies, and Performance (P) cores that can run at higher frequencies and deliver maximum performance, but using more energy.
QoS is therefore used to determine which type of core a thread should be run on. Threads with a QoS of 9 (background) are run on E cores, and can’t be promoted to run on P cores, even when there are inactive P cores and the E cores are heavily loaded. Threads with a QoS of 17 and above will be preferentially run on P cores when they’re available, but when they’re all fully occupied, macOS will run them on E cores instead. In that case, the E cores will be run at higher frequencies for better performance with less economy.
If your Apple silicon Mac has a base variant chip with 4 E and 4 P cores, this results in the following:
- apps with a total of up to 4 threads at high QoS will be scheduled and run at full speed on the P cores;
- when those P cores are all busy with high QoS threads, running another thread will then result in that being run on the E cores, and slightly slower than it would on a P core;
- a total of 8 high QoS threads can thus be run on P and E cores together;
- when running low QoS background threads on E cores, a maximum of 4 can be run at any time when the E cores are available, but those threads can’t spill over and run on the P cores, even if those are idle.
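Those rules can be expressed as a toy model in Python. This is only a sketch of the behaviour described above, not the real macOS scheduler: the names are hypothetical, it ignores frequencies and time-slicing, and it assumes that high QoS threads overflowing from the P cores take E cores ahead of background threads.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    high_on_p: int  # high QoS threads running on P cores
    high_on_e: int  # high QoS threads spilled over to E cores
    low_on_e: int   # background threads running on E cores
    waiting: int    # threads queued because no eligible core is free


def place_threads(high: int, low: int, p_cores: int = 4, e_cores: int = 4) -> Placement:
    """Toy model of the rules above; not the real macOS scheduler."""
    high_on_p = min(high, p_cores)              # high QoS fills P cores first
    high_on_e = min(high - high_on_p, e_cores)  # then overflows onto E cores
    # Assumption: high QoS threads take E cores ahead of background threads.
    low_on_e = min(low, e_cores - high_on_e)    # background never uses P cores
    waiting = (high - high_on_p - high_on_e) + (low - low_on_e)
    return Placement(high_on_p, high_on_e, low_on_e, waiting)


print(place_threads(high=4, low=0))  # all four on P cores
print(place_threads(high=5, low=0))  # fifth thread spills to an E core
print(place_threads(high=8, low=0))  # P and E cores all in use
print(place_threads(high=0, low=6))  # only four can run; P cores stay idle
```

The last case shows the asymmetry: six background threads leave two waiting even though all four P cores are idle.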
Controls
As QoS is normally set either by the process for its threads, or for services in their LaunchDaemon or LaunchAgent property list, the user has little direct control. A few apps now provide settings to adjust the QoS of their worker threads. Among those is the compression utility Keka, together with a couple of my own utilities such as the Dintch integrity checker.
In Keka’s settings, you can give its tasks a maximum number of threads, and even run them at custom Quality of Service (QoS) if you want them to be run in the background on E cores, and not interrupt your work on P cores.
Dintch has a simple slider, with the green tortoise to run it on E cores alone, and the red racing car at full speed on the P cores.
App Tamer and taskpolicy
The great majority of threads run at low QoS on the E cores are those of macOS and its services like Spotlight indexing. When a thread has already been assigned a low QoS, there’s currently no utility or tool that can promote it so it’s run at a higher QoS. In practice this means that you can’t accelerate those tasks.
What you can do, though, is demote threads with higher QoS to run at low QoS, more slowly and in the background. The best way to do this is using St. Clair Software’s excellent utility App Tamer. If you prefer, you can use the taskpolicy command tool instead. For instance, the command
taskpolicy -b -p 567
will confine all threads of the process with PID 567 to the E cluster, and can be reversed using the -B option for threads with higher QoS (but not those set to low QoS by the process).
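If you want to script this, the following Python sketch wraps the taskpolicy command. The function names are hypothetical, and it uses only the -b, -B and -p options mentioned above. Building the arguments works anywhere, but actually running the command requires macOS.

```python
import subprocess


def taskpolicy_args(pid: int, demote: bool = True) -> list[str]:
    """Build the taskpolicy command: -b demotes, -B reverses that."""
    if pid <= 0:
        raise ValueError("pid must be a positive process ID")
    flag = "-b" if demote else "-B"
    return ["taskpolicy", flag, "-p", str(pid)]


def set_background(pid: int, demote: bool = True) -> None:
    """Run taskpolicy on macOS; raises CalledProcessError if it fails."""
    subprocess.run(taskpolicy_args(pid, demote), check=True)


print(" ".join(taskpolicy_args(567)))         # taskpolicy -b -p 567
print(" ".join(taskpolicy_args(567, False)))  # taskpolicy -B -p 567
```

Keeping the argument builder separate from the call that runs it lets you check the command before applying it to a live process.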
That can be seen in this CPU History window from Activity Monitor. An app has run four threads, two at low QoS and two at high QoS. On the left side of each core trace they are run on their respective cores, as set by their QoS. The app’s process was then changed using taskpolicy -b and the threads run again, as seen on the right. The two threads with high QoS are then run together with the two with low QoS on the four E cores alone.
Virtualisation
Although Game Mode does alter the effects of QoS and core allocation, its impact is limited. The one significant exception to the way that QoS works is in virtualisation.
macOS Virtual Machines running on Apple silicon chips are automatically assigned a high QoS, and run preferentially on P cores. Thus, even when a VM runs threads at low QoS, those are run within host threads on the P cores. This remains the only known method of electively running low QoS threads on P cores.
Key points
- Threads are single flows of code execution run on one CPU core at a time, sharing virtual memory allocated to that process, but with their own stack.
- Apps and processes set the Quality of Service (QoS) for each of the threads they run.
- On Apple silicon chips, a low QoS of background results in that thread being run on E cores alone.
- Higher QoS threads are preferentially allocated to P cores, but when those aren’t available, the threads will be run on E cores at high frequency.
- Some apps now provide controls over the QoS of their worker threads.
- App Tamer and taskpolicy let you demote high QoS threads to be run with low QoS on the E cores, but can’t promote low QoS threads to run faster on P cores.
- Virtual machines run all threads at high QoS as far as the host Mac is concerned.
Further reading
Apple’s Energy Efficiency Guide for Mac Apps, last revised 13 September 2016, so without any mention of Apple silicon.
Apple silicon: 1 Cores, clusters and performance
Apple silicon: 2 Power and thermal glory
Apple silicon: 3 But does it save energy?