Explainer: Yara rules

By: hoakley
20 August 2025 at 14:30

Security utilities that detect known malicious software do so using sets of detection rules. Since their introduction by Victor Alvarez of VirusTotal 12 years ago, the most common method of expressing these rules has been a text file with the extension .yara. Apparently YARA stands for either YARA: Another Recursive Acronym, or Yet Another Ridiculous Acronym.

Yara rules are used extensively in macOS by both the original XProtect and its sibling XProtect Remediator. Those of XProtect are found in the XProtect.yara file in XProtect.bundle, in /Library/Apple/System/Library/CoreServices and, in Sequoia and later, in its additional location in /private/var/protected/xprotect. Further Yara rules are also encrypted and embedded in XProtect Remediator’s scanning modules, as detailed by Koh M. Nakagawa in the FFRI Security GitHub.

As used by Apple, each rule can consist of up to three sections:

  • meta, containing the rule’s metadata including a description and in more recent cases a UUID for the rule.
  • strings, specifying some of the content of the file, typically in the form of hexadecimal strings such as { A0 6B }.
  • condition, a logical expression that, when satisfied by a file, meets the requirements of that rule, so identifying the file as malicious.

I’ll exemplify these using Yara rules used in XProtect version 5310.

Private rules

At the start of the Yara file are any private rules. These are a bit like macros, in that they define properties that are then used in multiple rules later. Laid out in compact form, an example reads:

private rule Shebang
{ meta:
description = "private rule to match shell scripts by shebang (!#)"
condition:
uint16(0) == 0x2123
}

This starts with description metadata for this private rule, then states the condition for satisfying it: the first 16-bit unsigned integer in the file must equal 0x2123, made up of the bytes 0x21 or ! and 0x23 or #. In the order #! those characters are known as the shebang, and they open many shell scripts. In this rule they appear the other way around because uint16 reads the integer in little-endian byte order, so the first byte in the file, 0x23, becomes the low-order byte of the integer.

This defines what’s required of a Shebang, and can be included in the conditions of other Yara rules. Instead of having to redefine the same feature in every rule, they can simply include that Shebang rule in their condition.
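The byte order in that condition can be checked with a few lines of Python. This is only an illustrative sketch: it assumes that uint16() reads a little-endian value, as described in the YARA documentation, and the function name looks_like_shebang is made up for this example rather than taken from any Apple or YARA code.

```python
import struct


def looks_like_shebang(data: bytes) -> bool:
    """Mimic the condition uint16(0) == 0x2123 on a file's contents."""
    if len(data) < 2:
        return False  # too short to hold a 16-bit integer
    # "<H" unpacks an unsigned 16-bit integer in little-endian order,
    # so the byte at offset 0 becomes the low-order byte.
    (value,) = struct.unpack_from("<H", data, 0)
    return value == 0x2123


print(looks_like_shebang(b"#!/bin/sh\necho hello\n"))  # True
print(looks_like_shebang(b"!#/bin/sh\n"))              # False
```

The bytes on disk are 0x23 then 0x21, but read as a little-endian integer they become 0x2123, which is why the rule appears to have the characters reversed.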

Regular rules

After 3-5 private rules, the XProtect Yara file goes on to enumerate 372 normal rules, including this one:

rule XProtect_MACOS_SOMA_JLEN
{meta:
description = "MACOS.SOMA.JLEN"
uuid = "4215C9D4-57D5-4D30-82E1-96477493E8D5"
strings:
$a0 = { 4c 8d 3d ?? 74 0b 00 4c 8d 25 ?? ?? 0b 00 4c 8d 2d ?? ?? 0b 00 48 8d 5d c0 }
$a1 = { 73 62 06 91 d4 03 00 f0 94 42 21 91 }
condition:
Macho and ( ( $a0 ) or ( $a1 ) ) and filesize < 5MB
}

This rule has its internal code name as its description, and has been assigned the UUID shown. It then defines two binary strings, a0 and a1, the former containing wildcards, where each ?? stands for any single byte. The condition for satisfying the rule is that the file must:

  • be a Mach-O binary, and
  • contain either a0 or a1, and
  • its size must be less than 5 MB.
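To make that condition concrete, here is a rough Python sketch of how it could be evaluated against a file's bytes. It isn't how XProtect or YARA work internally. The helper names are invented for this example, and because Apple's private Macho rule isn't quoted here, the check on magic numbers is my own stand-in for it. The test data are synthetic, not taken from any real sample.

```python
import re

MB = 1024 * 1024  # YARA's MB suffix multiplies by 2**20

# Assumption: a stand-in for Apple's private Macho rule, which only checks
# the first four bytes against common Mach-O and fat binary magic numbers.
MACHO_MAGICS = {
    bytes.fromhex("feedface"), bytes.fromhex("cefaedfe"),
    bytes.fromhex("feedfacf"), bytes.fromhex("cffaedfe"),
    bytes.fromhex("cafebabe"), bytes.fromhex("bebafeca"),
}


def hex_pattern(spec: str) -> re.Pattern:
    """Turn a YARA-style hex string such as '4c 8d ?? 74' into a bytes regex."""
    parts = []
    for token in spec.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL lets the wildcard match newline bytes too
    return re.compile(b"".join(parts), re.DOTALL)


A0 = hex_pattern("4c 8d 3d ?? 74 0b 00 4c 8d 25 ?? ?? 0b 00 "
                 "4c 8d 2d ?? ?? 0b 00 48 8d 5d c0")
A1 = hex_pattern("73 62 06 91 d4 03 00 f0 94 42 21 91")


def matches_soma_jlen(data: bytes) -> bool:
    """Evaluate: Macho and ( $a0 or $a1 ) and filesize < 5MB."""
    is_macho = data[:4] in MACHO_MAGICS
    has_string = bool(A0.search(data) or A1.search(data))
    return is_macho and has_string and len(data) < 5 * MB


# Synthetic data: a 64-bit Mach-O magic number, padding, then the bytes of $a1.
fake = (bytes.fromhex("cffaedfe") + b"\x00" * 16
        + bytes.fromhex("73620691d40300f094422191"))
print(matches_soma_jlen(fake))      # True
print(matches_soma_jlen(fake[4:]))  # False: no Mach-O magic at the start
```

All three parts of the condition are joined by and, so a file containing one of the strings is still ignored if it isn't a Mach-O binary or is 5 MB or larger.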

Further details about Yara rules are given here.

Implementation

There are standard implementations that can check files against custom sets of Yara rules. Rules are normally compiled into binary form from their text originals before use. Full details are given on VirusTotal.
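As a minimal sketch of what that looks like in practice, the following uses the third-party yara-python bindings, which I'm assuming here and which may not be installed, so the function simply returns None without them. Demo_ShellScript is a hypothetical rule written for this example and doesn't come from XProtect. The sketch compiles the rules from text, saves and reloads their binary form, then scans some data in memory.

```python
import os
import tempfile

try:
    import yara  # third-party yara-python bindings; may not be installed
except ImportError:
    yara = None

SOURCE = r'''
private rule Shebang
{
    condition:
        uint16(0) == 0x2123
}

rule Demo_ShellScript
{
    meta:
        description = "hypothetical demo rule, not from XProtect"
    condition:
        Shebang and filesize < 1KB
}
'''


def scan(data: bytes):
    """Compile SOURCE, round-trip it through its binary form, and scan data.

    Returns the names of matching rules, or None if yara-python is missing.
    """
    if yara is None:
        return None
    rules = yara.compile(source=SOURCE)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.compiled")
        rules.save(path)         # write the compiled rules to disk
        rules = yara.load(path)  # load them back without recompiling
    return [match.rule for match in rules.match(data=data)]


print(scan(b"#!/bin/sh\necho hello\n"))
```

With yara-python available, I'd expect this to report only Demo_ShellScript, as private rules such as Shebang aren't listed among the matches themselves.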

Apple’s use of Yara files is mysterious, as for some years all the descriptions used arbitrary code names as obfuscation. When the source of all the rules is given in plain text, it’s hard to see what purpose that served, and it meant that users were told that MACOS.0e32a32 had been detected in an XProtect scan, for instance. Thankfully, Apple has more recently replaced most of those with more meaningful names.

I’m grateful to Duncan for asking me to explain this, and hope I have been successful. I’m also grateful to isometry and an anonymous commenter for straightening out the confusion over the Shebang.

How XProtect’s detection rules have changed 2019-25

By: hoakley
15 August 2025 at 14:30

XProtect is the front-line tool in macOS for detecting known malware. When a downloaded app is run for the first time and put through Gatekeeper checks, those rely on detection rules defined in the XProtect.yara file inside the XProtect bundle in /System/Library/CoreServices. Those are updated periodically to extend their coverage as new malware is detected and analysed by Apple’s security engineers. This article looks at how they have changed over the last six years.

My starting point is XProtect version 2103 released on 2 May 2019, in the heyday of macOS 10.14.4 Mojave. That contains a total of 92 rules in a text file of 42,903 bytes, for an average rule size of 456 bytes. Among those are many old chestnuts such as Bundlore.

My end point is version 5310 released this week, on 12 August 2025, for macOS 15.6 Sequoia and earlier. That contains a total of 372 rules in a text file of 969,662 bytes, giving an average rule size of 2,572 bytes. Still among those are the same old chestnuts including Bundlore.

Thus the number of rules is now 4 times what it was six years ago, and they take over 22 times as much space.
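Those two ratios follow directly from the figures quoted for versions 2103 and 5310, as this short calculation shows. The variable names are mine.

```python
rules_2019, bytes_2019 = 92, 42_903     # XProtect 2103, May 2019
rules_2025, bytes_2025 = 372, 969_662   # XProtect 5310, August 2025

rule_growth = rules_2025 / rules_2019
size_growth = bytes_2025 / bytes_2019

print(f"rules: x{rule_growth:.2f}")  # rules: x4.04
print(f"size:  x{size_growth:.2f}")  # size:  x22.60
```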

For the period up to the end of 2023, I have analysed XProtect’s Yara file in updates every 6 months, in May and November, or the closest update available. From the start of 2024 updates became more frequent, and I have therefore analysed the last update in each month. In late 2024, XProtect in macOS Sequoia started using iCloud to deliver its XProtect data updates. For this analysis I have excluded version 5273, which was only released via iCloud and wasn’t provided through the regular softwareupdate route used by all previous versions.

The number of Yara rules increased steadily until updates became more frequent in 2024, following which there was a very steep rise early that year. Since then they have continued to rise more steeply than before 2024, but now appear more linear, as seen in the red line of regression. Over this period, hardly any Yara rules have been removed.

Total size of the Yara file has followed a similar pattern, with little change until the start of 2024. It then peaked briefly before reducing slightly, pausing a little, then undergoing a step increase from 288 KB to 877 KB. Growth has been steadier for the last year, although it appears to be on track to reach 1 MB in 2026.

Average size of Yara rules changed little between 2021-2023, but increased greatly with the addition of some very large rules in June-July 2024. It has since declined slowly, as more recent rules have been far smaller.

This prodigious growth in the number of Yara rules and their size has inevitably had its effect on the time taken to complete Gatekeeper checks that include XProtect scans. Apple has promised that macOS Tahoe will limit that by not scanning notarized apps with XProtect, so improving app launch times.

Given that remarkably few old Yara rules have been removed over the last six years, this growth has been inevitable. However, unless old malware is incapable of being run on Macs still supported by XProtect updates, it’s hard to see how it could be safe to remove old rules. When support for running x86 code (except that for “older unmaintained gaming titles”) is dropped from macOS 28, many older Yara rules could be dropped from XProtect updates without putting Apple silicon Macs at risk, but even that isn’t an easy decision. In the meantime, at least our faster Macs should be able to complete XProtect scans more quickly.
