Lesson 3 of 6 · 12 min read · Includes quiz

Conditions & Logic

Boolean operators, file size

What You'll Learn

  • Construct YARA conditions using Boolean operators (and, or, not) to combine string matches
  • Use string counting operators (#, any of, all of, N of) for flexible detection logic
  • Apply file property checks (filesize, uint16, uint32) to restrict matches to specific file types
  • Use positional operators ($string at N, $string in range) for precise byte-offset matching
  • Compare weak rules with strong rules and identify techniques that reduce false positives
  • Connect condition-building skills to the webshell detection challenge in Lab 7.3

The Condition Section: Where Precision Lives

You have built a toolkit for writing YARA strings — text with modifiers, hex with wildcards and jumps, and regex for variable patterns. But strings alone do not make a good rule. The condition section determines when your rule fires, and the difference between a useful rule and a noisy one is almost always in the condition.

A rule with great strings and a weak condition (any of them) will match thousands of legitimate files. A rule with the same strings and a precise condition will match only the target. The condition is where you control the signal-to-noise ratio.

Figure: YARA condition logic (Boolean operators, counting, file properties, string positions, and a complete example)

Boolean Operators

YARA conditions use three Boolean operators: and, or, and not.

and — Both Must Be True

condition:
    $download and $webclient

The rule fires only if both $download and $webclient are found in the file. Adding more and clauses makes the rule more restrictive (fewer matches, fewer false positives).

or — Either Can Be True

condition:
    $eval or $system or $exec

The rule fires if any one of the three strings is found. Using or makes the rule more permissive (more matches, potentially more false positives). Use or when different strings indicate the same behavior — a web shell might use eval, system, or exec to execute commands, but they all mean "command execution."

not — Must NOT Be Present

condition:
    $suspicious_string and not $known_good_string

The not operator excludes files that contain a specific string. This is powerful for eliminating known false positives. For example, if your web shell rule keeps matching a legitimate PHP framework file that happens to contain eval(, you can add a not clause for a string unique to that framework:

strings:
    $eval = "eval(" nocase
    $laravel = "Illuminate\\Foundation\\Application"

condition:
    $eval and not $laravel

Operator Precedence and Grouping

YARA follows standard operator precedence: not binds tightest, then and, then or. Use parentheses to make complex conditions readable:

condition:
    ($eval and $b64) or ($system and $post) or ($exec and $get)

Without parentheses, $eval and $b64 or $system would be parsed as ($eval and $b64) or $system — which fires if $system alone is present. Always use parentheses when mixing and and or.
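Python happens to use the same precedence for not, and, and or, so the grouping described above can be checked with a quick sketch. The Boolean variables are hypothetical stand-ins for whether each YARA string matched; this illustrates the parsing only and is not YARA itself.

```python
# Hypothetical match results: only $system was found in the file.
eval_found = False
b64_found = False
system_found = True

# No parentheses: "and" binds tighter than "or", so this groups as
# (eval_found and b64_found) or system_found.
implicit = eval_found and b64_found or system_found

# The two possible readings, written out explicitly.
and_first = (eval_found and b64_found) or system_found
or_first = eval_found and (b64_found or system_found)

print(implicit, and_first, or_first)  # True True False
```

The unparenthesized form agrees with the and-first reading, which is why a file containing only $system would still trigger the rule.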

String Counting Operators

Counting operators are the most powerful tools for flexible detection. Instead of specifying exact Boolean combinations of named strings, you can count how many strings matched.

any of them / all of them

condition:
    // Two separate examples; a real rule uses one of them
    any of them       // At least 1 string matches
    all of them       // Every defined string matches

any of them is the loosest possible condition (highest recall, lowest precision). all of them is the tightest (lowest recall, highest precision — but fails if the target is missing even one string).

N of them

condition:
    3 of them         // At least 3 of all defined strings

This is the sweet spot for most rules. If you define 6 strings that characterize a malware family, requiring 3 matches means the rule catches variants where some strings have been changed while keeping precision.

N of ($pattern*)

strings:
    $web_eval = "eval(" nocase
    $web_system = "system(" nocase
    $web_exec = "exec(" nocase
    $web_passthru = "passthru(" nocase
    $input_post = "$_POST"
    $input_get = "$_GET"
    $input_request = "$_REQUEST"

condition:
    2 of ($web_*) and 1 of ($input_*)

The $web_* wildcard matches all string names starting with $web_. This condition requires at least 2 execution functions and at least 1 user input source — a pattern that strongly indicates a web shell.
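To make the counting logic concrete, here is a rough Python model of the condition above. It assumes the scan has already produced a set of matched string names; the helper names count_matching and web_shell_condition are hypothetical and exist only for this illustration.

```python
from fnmatch import fnmatchcase

def count_matching(matched_names, pattern):
    """Count matched string names that fit a wildcard such as '$web_*'."""
    return sum(1 for name in matched_names if fnmatchcase(name, pattern))

def web_shell_condition(matched_names):
    """Rough model of: 2 of ($web_*) and 1 of ($input_*)."""
    return (count_matching(matched_names, "$web_*") >= 2
            and count_matching(matched_names, "$input_*") >= 1)

# Two execution functions plus one input source: the condition holds.
print(web_shell_condition({"$web_eval", "$web_system", "$input_post"}))  # True
# Only one execution function: the condition fails.
print(web_shell_condition({"$web_eval", "$input_post", "$input_get"}))   # False
```

Note that "N of" means "at least N", so the comparison is >= rather than ==.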

String Occurrence Count (#)

The # operator counts how many times a string appears in the file. Write the string's identifier with # in place of $, so #eval is the number of matches of $eval:

condition:
    #eval > 3          // $eval matches more than 3 times

Multiple occurrences of suspicious function calls are more indicative of malicious intent than a single occurrence. A legitimate PHP file might use eval once; a web shell often uses it repeatedly.
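As a sketch of what an occurrence count measures, the Python below counts a case-insensitive pattern in a toy PHP sample. The function name occurrence_count and the sample bytes are made up for illustration, and the count is a simplification of YARA's own match counting.

```python
def occurrence_count(data: bytes, needle: bytes) -> int:
    """Case-insensitive count of needle in data (non-overlapping matches)."""
    return data.lower().count(needle.lower())

# Toy sample, invented for this example.
sample = b"<?php eval($a); EVAL($b); eval($c); Eval($d); ?>"

count = occurrence_count(sample, b"eval(")
print(count, count > 3)  # 4 True
```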

Avoid any of them in production rules. It is useful for rapid triage and testing, but a production rule with any of them will almost certainly generate false positives. Every string you define could appear in legitimate software. The power of YARA comes from requiring combinations of indicators — 3 of them, 2 of ($exec_*) and $post, or explicit Boolean logic. Single-string matching is antivirus; multi-indicator matching is threat hunting.

File Property Checks

File properties let you restrict your rule to specific file types and sizes without relying solely on string matching.

filesize

condition:
    // Three separate examples; KB multiplies by 1024 and MB by 1024*1024
    filesize < 50KB                    // Less than 50 kilobytes
    filesize > 100 and filesize < 2MB  // Between 100 bytes and 2 megabytes
    filesize < 10MB                    // Less than 10 megabytes

The filesize check is arguably the single most effective false-positive reducer in YARA. Common ranges by target type:

Target                      Typical Size       filesize Check
Web shell                   50 bytes - 50KB    filesize < 50KB
Malware dropper/stager      10KB - 500KB       filesize < 500KB
RAT / backdoor              50KB - 5MB         filesize < 5MB
Ransomware                  100KB - 2MB        filesize < 2MB
Legitimate enterprise app   10MB - 500MB       (excluded by above ranges)

uint16 and uint32 — Magic Byte Checks

The uint16(offset) and uint32(offset) functions read 2 or 4 bytes at a specific file offset and return them as an integer. This is how you check file format magic bytes:

condition:
    // Four separate examples, one per file format
    uint16(0) == 0x5A4D          // PE executable (MZ header)
    uint32(0) == 0x464C457F      // ELF binary (\x7FELF)
    uint32(0) == 0x04034B50      // ZIP archive (PK header)
    uint16(0) == 0x8B1F          // GZIP compressed data

The uint16(0) == 0x5A4D check is the standard first filter for PE (Windows executable) files. Strictly, it only confirms the MZ (DOS) header, but for most rules that is enough. Combined with filesize and string checks, it produces highly precise rules:

condition:
    uint16(0) == 0x5A4D and
    filesize < 1MB and
    3 of them

This means: the file must be a PE executable, under 1MB, with at least 3 matching strings.

Note the byte order. YARA reads uint16 and uint32 in little-endian format (least significant byte first), which matches how x86 processors store integers. The MZ header bytes are 4D 5A in the file, but as a uint16 value they are 0x5A4D (bytes reversed). The ELF header bytes are 7F 45 4C 46 in the file, but as a uint32 they are 0x464C457F. This catches many beginners off guard.
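The byte reversal is easy to verify outside YARA. The sketch below uses Python's struct module to read the same magic bytes as little-endian integers; the helper names uint16_le and uint32_le are hypothetical and only mimic what uint16 and uint32 return.

```python
import struct

def uint16_le(data: bytes, offset: int) -> int:
    """Read 2 bytes at offset as an unsigned little-endian integer."""
    return struct.unpack_from("<H", data, offset)[0]

def uint32_le(data: bytes, offset: int) -> int:
    """Read 4 bytes at offset as an unsigned little-endian integer."""
    return struct.unpack_from("<I", data, offset)[0]

# Leading bytes as they appear on disk.
mz_bytes = bytes.fromhex("4D5A9000")
elf_bytes = bytes.fromhex("7F454C46")
zip_bytes = bytes.fromhex("504B0304")
gzip_bytes = bytes.fromhex("1F8B0800")

print(hex(uint16_le(mz_bytes, 0)))    # 0x5a4d
print(hex(uint32_le(elf_bytes, 0)))   # 0x464c457f
print(hex(uint32_le(zip_bytes, 0)))   # 0x4034b50
print(hex(uint16_le(gzip_bytes, 0)))  # 0x8b1f
```

If a format is easier to reason about in file order, YARA also provides uint16be and uint32be, which read the same bytes as big-endian values.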

Positional Operators

Sometimes you need a string to appear at a specific location in the file, not just anywhere.

at — Exact Offset

condition:
    // Two separate examples
    $mz_header at 0              // MZ must be at the very start
    $pe_sig at 128               // PE signature at offset 128 (common, but not fixed)

in — Offset Range

condition:
    $mz_header at 0 and
    $pe_sig in (60..1024)        // PE signature within first 1KB

The in (start..end) operator restricts the string to a specific byte range. This is useful for file structure validation — you know the PE signature must be within a certain range of the MZ header.
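The difference between at and in can be modeled in a few lines of Python. This is a sketch under the assumption that a string "matches at" every offset where its bytes occur; find_offsets, matches_at, matches_in, and the toy file contents are all hypothetical.

```python
def find_offsets(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets

def matches_at(data: bytes, needle: bytes, offset: int) -> bool:
    """Model of: $string at offset."""
    return offset in find_offsets(data, needle)

def matches_in(data: bytes, needle: bytes, low: int, high: int) -> bool:
    """Model of: $string in (low..high), with both ends inclusive."""
    return any(low <= off <= high for off in find_offsets(data, needle))

# Toy file: "MZ" at offset 0 and the PE signature at offset 128.
toy = b"MZ" + b"\x00" * 126 + b"PE\x00\x00" + b"\x00" * 64

print(matches_at(toy, b"MZ", 0))                  # True
print(matches_at(toy, b"PE\x00\x00", 128))        # True
print(matches_in(toy, b"PE\x00\x00", 60, 1024))   # True
print(matches_in(toy, b"PE\x00\x00", 0, 59))      # False
```

The range form tolerates files whose PE signature is not at one fixed offset, which is why it is usually the safer choice for structural checks.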

entrypoint — PE/ELF Entry Point

condition:
    $shellcode at entrypoint     // Shellcode starts at the entry point

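The entrypoint variable holds the file offset of the PE or ELF entry point. A shellcode pattern sitting exactly at the entry point is a strong indicator of malice, because legitimate programs rarely start with raw shellcode. Note that entrypoint is deprecated in current YARA releases; the documented replacement for PE files is pe.entry_point, which requires import "pe" at the top of the rule file.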

Weak Rules vs. Strong Rules

Figure: Reducing false positives by comparing a weak rule (high FP risk) with a strong rule (precise detection)

The difference between a noisy rule and a production-quality rule is almost always in the condition. Here is a concrete comparison:

Weak Rule

rule Weak_WebShell
{
    strings:
        $a = "eval("

    condition:
        $a
}

Problems: Matches ANY file containing eval( — including legitimate PHP frameworks (Laravel, WordPress, Drupal), JavaScript build tools, Python scripts, and configuration generators. This rule would fire thousands of times on a typical web server with zero malicious files.

Strong Rule

rule Strong_WebShell
{
    strings:
        $eval = "eval(" nocase
        $system = "system(" nocase
        $exec = "exec(" nocase
        $passthru = "passthru(" nocase
        $post = "$_POST" nocase
        $get = "$_GET" nocase
        $request = "$_REQUEST" nocase
        $php = "<?php"

        $safe_wordpress = "WordPress"
        $safe_laravel = "Illuminate"
        $safe_drupal = "Drupal"

    condition:
        filesize < 50KB and
        $php and
        ($eval or $system or $exec or $passthru) and
        ($post or $get or $request) and
        not ($safe_wordpress or $safe_laravel or $safe_drupal)
}

Why this works:

  1. filesize < 50KB — web shells are small; legitimate frameworks are large
  2. $php — only match PHP files (not JavaScript or Python)
  3. Execution function required — the file must contain at least one command execution function
  4. User input required — the file must read from user-supplied HTTP parameters
  5. Framework exclusions — explicitly exclude files belonging to known legitimate projects

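This rule requires the confluence of PHP code, execution capability, user input handling, and small size, plus the absence of known framework markers. Few legitimate files meet all of these criteria at once. One caveat: short exclusion strings such as "WordPress" are easy for an attacker to plant in a web shell, so prefer markers that are specific to the framework file you are excluding.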

The FP Prevention Checklist

Before deploying any rule to production:

  1. File format check: add a magic-byte check such as uint16(0) == 0x5A4D, or a tag string such as $php, to restrict the rule to the target file type
  2. File size limit: add filesize < N based on the expected size range
  3. Multiple indicators: require 2+ strings with different indicator types
  4. Flexible counting: use N of them instead of requiring all or any
  5. Exclusion strings: add not clauses for known false-positive sources
  6. Clean corpus test: run against a set of known-good files before deployment

Build conditions incrementally. Start with a loose condition (any of them) to verify your strings match the target. Then add filesize. Then add a format check. Then increase the required count. After each change, re-test against both your malware corpus and your clean corpus. Stop when you have 100% detection of targets and 0% false positives on clean files.
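The re-test step can be sketched as a small harness. The Python below models the Weak_WebShell and Strong_WebShell conditions as plain functions and measures detection and false-positive rates on a toy corpus. Everything here is an assumption for illustration: the function names, the simplified matching, and the sample files are invented, and a real test would run YARA itself against real corpora.

```python
EXEC_FUNCS = (b"eval(", b"system(", b"exec(", b"passthru(")
INPUTS = (b"$_post", b"$_get", b"$_request")
SAFE_MARKERS = (b"WordPress", b"Illuminate", b"Drupal")

def weak_rule(data: bytes) -> bool:
    """Model of Weak_WebShell: fires on any literal 'eval('."""
    return b"eval(" in data

def strong_rule(data: bytes) -> bool:
    """Simplified model of the Strong_WebShell condition."""
    lowered = data.lower()  # emulates the nocase modifier
    return (
        len(data) < 50 * 1024
        and b"<?php" in data
        and any(f in lowered for f in EXEC_FUNCS)
        and any(v in lowered for v in INPUTS)
        and not any(m in data for m in SAFE_MARKERS)
    )

def rates(rule, malicious, clean):
    """Return (detection rate, false-positive rate) for a rule."""
    detection = sum(rule(s) for s in malicious) / len(malicious)
    false_positive = sum(rule(s) for s in clean) / len(clean)
    return detection, false_positive

# Toy corpora, invented for this example.
malicious = [
    b"<?php eval($_POST['c']); ?>",
    b"<?php system($_GET['cmd']); ?>",
]
clean = [
    b"<?php echo 'hello'; ?>",
    b"var x = eval('1+1');",              # JavaScript, not PHP
    b"<?php $v = eval('return 1;'); ?>",  # PHP eval with no user input
]

print(rates(weak_rule, malicious, clean))
print(rates(strong_rule, malicious, clean))  # (1.0, 0.0)
```

On this toy data the weak model misses the system() shell and flags two clean files, while the strong model reaches the stopping point described above.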

Combining Everything: A Complete Detection Rule

Here is a production-style rule that combines most of the condition techniques from this lesson:

rule Ransomware_LockBit3_Indicator
{
    meta:
        author = "CyberBlue Academy"
        description = "Detects LockBit 3.0 ransomware indicators"
        date = "2026-02-17"
        severity = "critical"
        mitre_att_ck = "T1486"
        tlp = "TLP:GREEN"

    strings:
        // Ransom note strings
        $note1 = "your data are stolen and encrypted" nocase wide ascii
        $note2 = ".onion" nocase
        $note3 = "restore-my-files" nocase wide ascii

        // Technical indicators
        $mutex = "Global\\lockbit" nocase wide ascii
        $ext = ".lockbit" nocase
        $shadow = "vssadmin delete shadows" nocase
        $bcdedit = "bcdedit /set {default} recoveryenabled no" nocase
        $wmic = "wmic shadowcopy delete" nocase

        // Hex pattern: ASCII "LockBit 3.0"
        $lockbit_header = { 4C 6F 63 6B 42 69 74 20 33 2E 30 }

    condition:
        uint16(0) == 0x5A4D and
        filesize < 2MB and
        (
            (2 of ($note*)) or
            ($mutex and 1 of ($shadow, $bcdedit, $wmic)) or
            ($lockbit_header and $ext) or
            (3 of ($note*, $mutex, $ext, $shadow, $bcdedit, $wmic))
        )
}

This rule uses:

  • uint16(0) == 0x5A4D — only PE executables
  • filesize < 2MB — ransomware is compact
  • Multiple detection paths connected by or — catches different variants where some strings may be missing
  • Wildcard counting (2 of ($note*)) — flexible matching within string groups
  • Named string combinations — specific pairs that together are highly indicative

Key Takeaways

  • Boolean operators (and, or, not) combine string matches — use parentheses when mixing operators to ensure correct evaluation
  • String counting (N of them, N of ($pattern*), #string > N) provides flexible detection that catches malware variants
  • filesize checks are arguably the single most effective false-positive reducer, so include one based on the expected target size range
  • uint16(0) and uint32(0) magic byte checks restrict rules to specific file formats (PE, ELF, ZIP, etc.)
  • Positional operators (at, in, entrypoint) match strings at specific file offsets for structural validation
  • Strong rules combine file format checks, size limits, multiple string types, flexible counting, and exclusion strings
  • Weak rules rely on a single string or on any of them, so they can match thousands of legitimate files
  • In Lab 7.3, you will write 3 YARA rules to detect 5 webshells hidden among 500 files with zero false positives — condition precision is the key

What's Next

You now have the full YARA rule-writing toolkit: strings with modifiers (Lesson 7.2) and precise conditions (this lesson). In Lesson 7.4, you will learn how to deploy your rules at scale — scanning files, directories, disk images, and memory dumps efficiently. You will also learn performance optimization techniques that matter when scanning thousands of files with hundreds of rules.

Knowledge Check: Conditions & Logic

10 questions · 70% to pass

  1. What is the difference between 'any of them' and '3 of them' in a YARA condition?
  2. Why is 'uint16(0) == 0x5A4D' one of the most commonly used YARA condition checks?
  3. Which condition correctly requires at least 2 execution functions AND at least 1 user input source using wildcard counting?
  4. What does the 'not' operator do in a YARA condition like '$eval and not $laravel'?
  5. What is the single most effective technique for reducing false positives in YARA rules?
  6. What does the # operator do in a YARA condition like '#eval > 3'?
  7. In the condition '$shellcode at entrypoint', what does the 'entrypoint' variable represent?
  8. In Lab 7.3, you must find 5 webshells hidden among 500 files. A rule that uses 'any of them' matches 5 webshells but also 47 clean PHP files. How should you fix this?
  9. When YARA reads uint16(0) as 0x5A4D, the actual bytes in the file at offset 0 are 4D 5A. Why is the value reversed?
  10. Which YARA condition technique allows you to explicitly exclude known-good files from matching?
