CyberBlue Academy — Blue Team & SOC Training

What You'll Learn

Apply playbook design principles — modularity, testability, documentation, and graceful error handling — to create production-quality automated workflows
Navigate the Shuffle workflow builder to create multi-step playbooks using triggers, conditions, loops, and app integrations
Build a complete phishing response playbook that receives an alert, extracts IOCs, enriches them via VirusTotal and MISP, creates a TheHive case, and notifies an analyst
Build an IOC enrichment playbook that accepts any indicator type and returns aggregated context from multiple sources
Implement error handling strategies including retries, fallbacks, and failure notifications
Test and validate playbooks with sample data before deploying them to production (practiced in Lab 14.2)

Playbook Design Principles

A playbook is not just a workflow — it is a piece of production infrastructure that runs 24/7, handles real threats, and takes actions that affect your environment. Treat playbook development with the same rigor you would apply to any production code.

Principle 1: Modularity

Build small, reusable playbooks that do one thing well, rather than monolithic workflows that handle everything.

Bad design:

One massive playbook:
  Receive alert → Extract IOCs → Enrich in MISP → Enrich in VT →
  Check reputation → Create case → Assign analyst → Send Slack →
  Send email → Block IP → Isolate endpoint → Generate report

Good design:

Playbook 1: IOC Enrichment
  Input: IOC (any type) → Enrich in MISP + VT → Output: enrichment results

Playbook 2: Case Creation
  Input: alert + enrichment → Create TheHive case → Output: case ID

Playbook 3: Notification
  Input: case details → Send Slack + email → Output: confirmation

Playbook 4: Phishing Response (orchestrator)
  Receive alert → Call Playbook 1 → Call Playbook 2 → Call Playbook 3

Modular playbooks are easier to test, debug, and maintain. When MISP's API changes, you update one enrichment playbook instead of every playbook that uses MISP.
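The orchestrator model can be sketched in plain Python to show how data passes between the sub-playbooks. This is only an illustration outside Shuffle: every function name and stub return value below is hypothetical, and in Shuffle each function would be its own sub-workflow.

```python
from typing import Any


def enrich_ioc(ioc: dict[str, str]) -> dict[str, Any]:
    """Playbook 1 (stub): a real version would query MISP and VirusTotal."""
    return {"ioc": ioc["value"], "misp_match": False, "vt_detections": 0}


def create_case(alert: dict[str, Any], enrichment: list[dict[str, Any]]) -> str:
    """Playbook 2 (stub): a real version would call TheHive and return its case ID."""
    return f"case-for-{alert['id']}"


def notify(case_id: str) -> bool:
    """Playbook 3 (stub): a real version would post to Slack and send email."""
    return True


def phishing_response(alert: dict[str, Any]) -> dict[str, Any]:
    """Playbook 4: the orchestrator only sequences the sub-playbooks."""
    enrichment = [enrich_ioc(ioc) for ioc in alert["iocs"]]
    case_id = create_case(alert, enrichment)
    notified = notify(case_id)
    return {"case_id": case_id, "notified": notified, "enrichment": enrichment}
```

Because the orchestrator contains no tool-specific logic, swapping the MISP integration means changing `enrich_ioc` only.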

Principle 2: Testability

Every playbook must be testable with sample data before handling real alerts:

Testing Stage	What You Test	How
Unit test	Each app action in isolation	Run individual steps with known inputs
Integration test	Full workflow with test data	Send a sample alert through the webhook
Validation	Output correctness	Verify TheHive case was created with correct fields
Edge cases	Missing fields, API timeouts, invalid data	Send malformed payloads, disconnect a tool temporarily

Principle 3: Documentation

Document every playbook with:

Purpose: What this playbook does and when it runs
Trigger: What starts it (webhook from Wazuh, schedule, manual)
Inputs: What data it expects (alert JSON schema, IOC format)
Outputs: What it produces (TheHive case, Slack message, block action)
Dependencies: Which tools/APIs it requires
Author and last updated: Who built it, when it was last modified
Known limitations: What it does not handle
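One way to keep this documentation from drifting is to store it as structured data next to the workflow export and check it for gaps. The sketch below is an assumption about how a team might do that; the `PlaybookDoc` class and `missing_fields` helper are hypothetical and not part of Shuffle.

```python
from dataclasses import dataclass, field, fields


@dataclass
class PlaybookDoc:
    """Mirrors the documentation items listed above (hypothetical structure)."""
    purpose: str = ""
    trigger: str = ""
    inputs: str = ""
    outputs: str = ""
    dependencies: list[str] = field(default_factory=list)
    author: str = ""
    last_updated: str = ""
    known_limitations: str = ""


def missing_fields(doc: PlaybookDoc) -> list[str]:
    """Return the names of documentation fields that are still empty."""
    return [f.name for f in fields(doc) if not getattr(doc, f.name)]
```

A review step could refuse to deploy a playbook while `missing_fields` returns anything.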

Principle 4: Graceful Error Handling

Automation that fails silently is worse than no automation — because the team assumes the work is being done. Every playbook must handle failures explicitly.

🚨

Silent failures are the #1 automation risk. If your enrichment playbook fails to reach MISP and silently continues without enrichment, the resulting case will be missing critical context. Always implement failure detection and notification.

The Shuffle Workflow Builder

Creating a Workflow

In Shuffle, workflows are built visually by placing nodes on a canvas and connecting them:

Add a trigger — Drag a trigger node (webhook, schedule, or manual) to the canvas
Add app nodes — Drag app nodes for each tool you want to interact with
Connect nodes — Draw lines from output to input to define the execution flow
Configure each node — Set the app action, parameters, and authentication
Add conditions — Insert condition nodes that branch the flow based on data
Add loops — Use loop nodes to iterate over arrays (e.g., process each IOC in a list)

Working with Conditions

Conditions evaluate data and branch the workflow:

Condition: $misp_search.response.found == true
  ├── True branch:  Create HIGH severity case, notify L2
  └── False branch: Create LOW severity case, standard queue

Common condition patterns:

Pattern	Example	True Branch	False Branch
Intel match	MISP returns results	Elevate severity	Standard triage
Severity threshold	Alert level >= 10	Immediate notification	Queue for review
Known false positive	Hash in allowlist	Close with FP tag	Continue investigation
Rate limit	Same IOC enriched <1hr ago	Return cached result	Perform fresh lookup
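Three of the patterns in the table can be combined into a single routing decision. The following is a minimal sketch, not Shuffle's condition syntax: the function name, the returned branch labels, and the order in which the checks run are all assumptions chosen for illustration.

```python
def route_alert(alert_level: int, misp_found: bool, file_hash: str,
                allowlist: set[str]) -> str:
    """Pick a branch for an alert; the first matching condition wins."""
    # Known false positive: stop early so no further work is done.
    if file_hash in allowlist:
        return "close_false_positive"
    # Intel match: MISP returned results, so elevate severity.
    if misp_found:
        return "high_severity_case_notify_l2"
    # Severity threshold: alert level of 10 or above needs immediate attention.
    if alert_level >= 10:
        return "immediate_notification"
    return "standard_queue"
```

Checking the allowlist first is a design choice: it avoids opening a high-severity case for a hash the team has already judged benign.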

Working with Loops

Loops iterate over arrays — essential when an alert contains multiple IOCs:

Alert contains: [IP: 185.220.101.42, Domain: evil.com, Hash: abc123...]

Loop: For each IOC in alert.iocs:
  → Enrich IOC in MISP
  → Enrich IOC in VirusTotal
  → Append result to enrichment_results[]

After loop: enrichment_results contains context for all three IOCs
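The same loop can be written as ordinary Python to make the data flow explicit. This is a sketch under assumptions: the lookups are passed in as plain callables (hypothetical stand-ins for the MISP and VirusTotal app nodes), and each IOC is a dictionary with `type` and `value` keys.

```python
from typing import Any, Callable

# A lookup takes an IOC value and returns whatever context the source provides.
Lookup = Callable[[str], dict[str, Any]]


def enrich_all(iocs: list[dict[str, str]],
               lookups: dict[str, Lookup]) -> list[dict[str, Any]]:
    """Run every lookup against every IOC and collect one result per IOC."""
    enrichment_results: list[dict[str, Any]] = []
    for ioc in iocs:
        entry: dict[str, Any] = {"type": ioc["type"], "value": ioc["value"]}
        for source, lookup in lookups.items():
            entry[source] = lookup(ioc["value"])
        enrichment_results.append(entry)
    return enrichment_results
```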

Figure: Playbook design pattern showing the modular orchestrator model: a master playbook calls sub-playbooks for enrichment, case creation, and notification, with error handling at each stage.

Building a Phishing Response Playbook

The phishing response playbook is the most common SOAR use case and the one you will build in Lab 14.2. Here is the complete design:

Trigger

Wazuh webhook — Wazuh sends a POST request to Shuffle when a phishing-related rule fires (e.g., Rule 87101: "Suspicious email attachment detected").

Step 1: Parse Alert

Extract structured data from the Wazuh alert JSON:

{
  "rule_id": "87101",
  "rule_description": "Suspicious email attachment detected",
  "agent_name": "mail-gw-01",
  "data": {
    "sender": "invoice@company-billing[.]com",
    "recipient": "j.martinez@company.com",
    "subject": "Urgent: Invoice #INV-2026-4417",
    "attachment_name": "invoice_feb2026.xlsm",
    "attachment_hash": "a1b2c3d4e5f6...",
    "urls_in_body": ["hxxps://company-billing[.]com/download/payload"]
  }
}

Extracted IOCs:

Sender domain: company-billing[.]com
Attachment hash: a1b2c3d4e5f6...
URL: hxxps://company-billing[.]com/download/payload
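The extraction step could look roughly like the sketch below. It assumes the alert has exactly the shape shown above and that defanged values should be converted back to their real form before lookup; the `refang` and `extract_iocs` names are hypothetical, and missing fields are skipped rather than treated as errors.

```python
from typing import Any


def refang(value: str) -> str:
    """Undo the defanging used in the sample alert ([.] and hxxp)."""
    return value.replace("[.]", ".").replace("hxxp", "http", 1)


def extract_iocs(alert: dict[str, Any]) -> list[dict[str, str]]:
    """Pull sender domain, attachment hash, and URLs out of the alert data."""
    data = alert.get("data", {})
    iocs: list[dict[str, str]] = []

    sender = data.get("sender", "")
    if "@" in sender:
        # Everything after the last "@" is the sender domain.
        iocs.append({"type": "domain", "value": refang(sender.rsplit("@", 1)[1])})

    if data.get("attachment_hash"):
        iocs.append({"type": "hash", "value": data["attachment_hash"]})

    for url in data.get("urls_in_body", []):
        iocs.append({"type": "url", "value": refang(url)})

    return iocs
```

A production parser would also need to decide what to do when nothing can be extracted, for example by routing the raw alert to the failure path.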

Step 2: Enrich IOCs

For each extracted IOC, query multiple intelligence sources:

VirusTotal enrichment:

Hash: a1b2c3d4e5f6...
  → VT detection: 38/72 engines flagged as malicious
  → Classification: Trojan.GenericKD.46789
  → First seen: 2026-02-20

Domain: company-billing[.]com
  → Registered: 2026-02-19 (2 days ago)
  → Hosting: 185.220.101.42 (known malicious infrastructure)
  → VT community score: -15 (malicious)

MISP enrichment:

Hash: a1b2c3d4e5f6...
  → Match: Event #4521 "Invoice Phishing Campaign Feb 2026"
  → TLP: AMBER
  → Tags: phishing, emotet-variant, T1566.001

Domain: company-billing[.]com
  → Match: Same event #4521
  → Linked to 12 other phishing domains

Step 3: Determine Verdict

Apply automated triage logic based on enrichment results:

IF (vt_detections > 5 AND misp_match == true):
    verdict = "MALICIOUS"
    severity = "HIGH"
ELIF (vt_detections > 5 OR misp_match == true):
    verdict = "SUSPICIOUS"
    severity = "MEDIUM"
ELIF (vt_detections > 0):
    verdict = "LOW_CONFIDENCE"
    severity = "LOW"
ELSE:
    verdict = "UNKNOWN"
    severity = "LOW"
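The triage logic above translates directly into a small Python function, which also makes it easy to unit test before it is wired into a workflow. The function name and the tuple return type are illustrative choices; the thresholds are the ones given in the pseudocode.

```python
def determine_verdict(vt_detections: int, misp_match: bool) -> tuple[str, str]:
    """Return (verdict, severity) from VirusTotal detections and MISP match."""
    if vt_detections > 5 and misp_match:
        return "MALICIOUS", "HIGH"
    if vt_detections > 5 or misp_match:
        return "SUSPICIOUS", "MEDIUM"
    if vt_detections > 0:
        return "LOW_CONFIDENCE", "LOW"
    return "UNKNOWN", "LOW"
```

Note the boundary: exactly 5 detections with no MISP match falls into LOW_CONFIDENCE, because the comparison is strictly greater than 5.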

Step 4: Create TheHive Case

Auto-create a case with all enrichment context pre-populated:

{
  "title": "[PHISHING] Invoice Phishing - j.martinez - 2026-02-21",
  "description": "Automated phishing triage. Wazuh Rule 87101.\nVerdict: MALICIOUS (VT: 38/72, MISP: Event #4521)",
  "severity": 3,
  "tags": ["phishing", "automated-triage", "emotet-variant"],
  "observables": [
    { "dataType": "mail", "data": "invoice@company-billing.com" },
    { "dataType": "domain", "data": "company-billing.com" },
    { "dataType": "hash", "data": "a1b2c3d4e5f6...", "tags": ["vt-38/72"] },
    { "dataType": "url", "data": "hxxps://company-billing.com/download/payload" }
  ]
}
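A payload like the one above is usually assembled from the alert and the verdict rather than written by hand. The sketch below only builds the dictionary and does not call TheHive; the `build_case` function and its parameters are hypothetical, and the LOW and MEDIUM entries in the severity mapping are assumptions to verify against your TheHive version (the source only confirms 3 for a HIGH case).

```python
from typing import Any

# Integer severities expected in the case payload; 3 matches the HIGH case above.
SEVERITY_TO_THEHIVE = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def build_case(campaign: str, recipient: str, date: str, rule_id: str,
               verdict: str, severity: str, evidence: str,
               observables: list[dict[str, Any]],
               tags: list[str]) -> dict[str, Any]:
    """Assemble a case payload in the same shape as the example above."""
    user = recipient.split("@", 1)[0]
    return {
        "title": f"[PHISHING] {campaign} - {user} - {date}",
        "description": (
            f"Automated phishing triage. Wazuh Rule {rule_id}.\n"
            f"Verdict: {verdict} ({evidence})"
        ),
        "severity": SEVERITY_TO_THEHIVE[severity],
        "tags": ["phishing", "automated-triage", *tags],
        "observables": observables,
    }
```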

Step 5: Notify Analyst

Send a structured notification to the SOC Slack channel:

🚨 PHISHING ALERT — HIGH SEVERITY

Recipient: j.martinez@company.com
Subject: "Urgent: Invoice #INV-2026-4417"
Sender: invoice@company-billing[.]com
Attachment: invoice_feb2026.xlsm (VT: 38/72 — MALICIOUS)

Intel Match: MISP Event #4521 — Invoice Phishing Campaign
Verdict: MALICIOUS (automated)

TheHive Case: #4521 — Assigned to L2 Queue
Action Required: Confirm containment, check if attachment was opened.

Figure: Complete phishing response playbook flow showing five stages: Trigger (Wazuh webhook) → Parse & Extract → Enrich (VT + MISP loop) → Verdict Logic → Case Creation + Notification, with error handling branches at each stage.

Building an IOC Enrichment Playbook

The IOC enrichment playbook is a reusable sub-playbook called by other workflows. It accepts any IOC type and returns aggregated context:

Input Schema

{
  "ioc_type": "ip|domain|hash|url|email",
  "ioc_value": "185.220.101.42",
  "sources": ["virustotal", "misp", "abuseipdb"]
}

Enrichment Logic

For each source in request.sources:
  IF source == "virustotal":
    result = VT.lookup(ioc_type, ioc_value)
    enrichment.vt_score = result.detections
    enrichment.vt_classification = result.classification

  IF source == "misp":
    result = MISP.search(ioc_value)
    enrichment.misp_match = result.found
    enrichment.misp_events = result.events
    enrichment.misp_tags = result.tags

  IF source == "abuseipdb":
    result = AbuseIPDB.check(ioc_value)
    enrichment.abuse_score = result.confidence_score
    enrichment.abuse_reports = result.total_reports
    enrichment.abuse_country = result.country

Return aggregated enrichment object

Output Schema

{
  "ioc_type": "ip",
  "ioc_value": "185.220.101.42",
  "enrichment": {
    "vt_score": "12/89",
    "vt_classification": "malicious",
    "misp_match": true,
    "misp_events": ["Event #4521 — Invoice Phishing Campaign"],
    "misp_tags": ["phishing", "emotet-variant"],
    "abuse_score": 95,
    "abuse_reports": 847,
    "abuse_country": "RU"
  },
  "verdict": "MALICIOUS",
  "confidence": "HIGH",
  "enriched_at": "2026-02-21T14:32:00Z"
}
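The input schema, enrichment logic, and output schema can be tied together with a small dispatcher. This is a sketch under assumptions: each source is represented by a hypothetical handler function supplied by the caller (no real VirusTotal, MISP, or AbuseIPDB client is used), and the `skipped_sources` field is an addition that is not part of the output schema above.

```python
from datetime import datetime, timezone
from typing import Any, Callable

# A handler takes (ioc_type, ioc_value) and returns fields to merge into the result.
Handler = Callable[[str, str], dict[str, Any]]


def enrich(request: dict[str, Any], handlers: dict[str, Handler]) -> dict[str, Any]:
    """Call the handler for each requested source and aggregate the results."""
    enrichment: dict[str, Any] = {}
    skipped: list[str] = []
    for source in request["sources"]:
        handler = handlers.get(source)
        if handler is None:
            # Record unknown sources instead of dropping them silently.
            skipped.append(source)
            continue
        enrichment.update(handler(request["ioc_type"], request["ioc_value"]))
    return {
        "ioc_type": request["ioc_type"],
        "ioc_value": request["ioc_value"],
        "enrichment": enrichment,
        "skipped_sources": skipped,
        "enriched_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Using a handler table instead of a chain of IF statements means a new source can be added without editing the loop.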

Error Handling in Automation

Failure Modes

Failure Type	Example	Impact
API timeout	VirusTotal does not respond within 30 seconds	Enrichment incomplete
Authentication failure	MISP API key expired	All MISP lookups fail
Rate limiting	VT free tier: 4 requests/minute exceeded	Lookups queued or dropped
Invalid data	Alert JSON missing expected fields	Parse error, workflow crashes
Tool unavailable	TheHive container is restarting	Case creation fails

Error Handling Strategies

Retry with backoff:

attempt = 1
max_retries = 3
while attempt <= max_retries:
    result = call_api(params)
    if result.success:
        return result
    wait(2 ^ attempt seconds)  # 2s, 4s, 8s
    attempt += 1
return error("API unreachable after 3 attempts")
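A runnable version of the retry pattern might look like the sketch below. It differs from the pseudocode in two ways that are assumptions of this example: failures are signalled by exceptions rather than a `result.success` flag, and there is no wait after the final attempt, so three attempts produce delays of 2s and 4s. The `sleep` parameter exists so tests can run without actually waiting.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(call: Callable[[], T], max_retries: int = 3,
                       base_delay: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `call` up to max_retries times, waiting base_delay**attempt between tries."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            return call()
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < max_retries:
                sleep(base_delay ** attempt)
    raise RuntimeError(f"API unreachable after {max_retries} attempts") from last_error
```

In a real playbook you would likely retry only on transient errors such as timeouts, and fail immediately on authentication errors, since retrying an expired API key cannot succeed.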

Fallback to alternative source:

result = VirusTotal.lookup(hash)
if result.failed:
    result = MalwareBazaar.lookup(hash)  # fall back to a second source
if result.failed:
    result = { status: "ENRICHMENT_UNAVAILABLE", source: "none" }
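The fallback chain generalizes to any ordered list of sources. In the sketch below the sources are hypothetical callables passed in by the caller, failures are assumed to surface as exceptions, and the `errors` field is an addition so the failure path can report why each source was skipped.

```python
from typing import Any, Callable

Source = tuple[str, Callable[[str], dict[str, Any]]]


def lookup_with_fallback(file_hash: str, sources: list[Source]) -> dict[str, Any]:
    """Try each source in order and return the first successful result."""
    errors: dict[str, str] = {}
    for name, lookup in sources:
        try:
            result = lookup(file_hash)
        except Exception as exc:  # any failure moves on to the next source
            errors[name] = str(exc)
            continue
        return {"status": "OK", "source": name, "result": result, "errors": errors}
    # Every source failed: return an explicit marker instead of an empty result.
    return {"status": "ENRICHMENT_UNAVAILABLE", "source": "none", "errors": errors}
```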

Failure notification:

if any_step.failed:
    Slack.post(channel="#soc-automation-alerts",
        message="⚠️ Playbook failure: {workflow_name} — Step {step_name} failed.
                 Error: {error_message}. Manual triage required for alert {alert_id}.")

💡

Every playbook should have a "failure path" that is as well-designed as the "success path." When automation fails, the analyst must be immediately notified with enough context to manually complete the workflow. A failure notification that says "playbook failed" is useless. A failure notification that says "MISP enrichment failed for alert 92101 on WIN-SERVER-01 — manual MISP lookup required for IP 185.220.101.42" is actionable.
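One way to enforce actionable notifications is to build the message from required fields and refuse to send it when any are empty. The function name, parameters, and message wording below are hypothetical; posting the text to Slack is left out.

```python
def build_failure_message(workflow_name: str, step_name: str, error_message: str,
                          alert_id: str, agent_name: str, manual_action: str) -> str:
    """Build a failure notification, rejecting messages that lack context."""
    context = {
        "workflow_name": workflow_name,
        "step_name": step_name,
        "error_message": error_message,
        "alert_id": alert_id,
        "agent_name": agent_name,
        "manual_action": manual_action,
    }
    missing = [key for key, value in context.items() if not value.strip()]
    if missing:
        raise ValueError(f"failure notification is missing context: {', '.join(missing)}")
    return (
        f"Playbook failure: {workflow_name}, step {step_name} failed "
        f"for alert {alert_id} on {agent_name}. "
        f"Error: {error_message}. Action: {manual_action}"
    )
```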

Testing and Validating Playbooks

Test with Known-Good Data

Before connecting a playbook to live alerts, test with sample payloads:

{
  "test_name": "Phishing — known malicious",
  "payload": {
    "rule_id": "87101",
    "data": {
      "sender": "test@known-phishing-domain.com",
      "attachment_hash": "known_malicious_hash_from_misp",
      "urls_in_body": ["hxxps://known-malicious-url.com/payload"]
    }
  },
  "expected_result": {
    "verdict": "MALICIOUS",
    "case_created": true,
    "case_severity": "HIGH",
    "notification_sent": true
  }
}

Test Matrix

Test Case	Input	Expected Verdict	Expected Actions
Known malicious	Hash with VT 38/72 + MISP match	MALICIOUS / HIGH	Case (HIGH), Slack notification
Low confidence	VT 2/72, no MISP match	LOW_CONFIDENCE / LOW	Case (LOW), standard queue
Clean	VT 0/72, no MISP match	UNKNOWN / LOW	Case (LOW), standard queue
API failure	VT timeout, MISP unreachable	ENRICHMENT_FAILED	Failure notification, manual triage
Missing fields	Alert JSON missing attachment_hash	PARSE_ERROR	Failure notification with raw alert
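Test cases written in the `test_name` / `payload` / `expected_result` format shown earlier can be run by a small harness. This is a sketch, assuming the playbook under test can be invoked as a Python callable that returns a dictionary; in practice that callable might send the payload to the Shuffle webhook and collect the outcome. The `run_test_cases` name is hypothetical.

```python
from typing import Any, Callable

Playbook = Callable[[dict[str, Any]], dict[str, Any]]


def run_test_cases(playbook: Playbook, cases: list[dict[str, Any]]) -> list[str]:
    """Run each case and return one message per field that did not match."""
    failures: list[str] = []
    for case in cases:
        actual = playbook(case["payload"])
        for key, expected in case["expected_result"].items():
            if actual.get(key) != expected:
                failures.append(
                    f"{case['test_name']}: {key} expected {expected!r}, "
                    f"got {actual.get(key)!r}"
                )
    return failures
```

An empty list means every expected field matched; anything else should block deployment.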

Version Control for Playbooks

Treat playbooks like code:

Export workflow JSON from Shuffle after each change
Store in Git with meaningful commit messages
Tag releases (v1.0, v1.1, v2.0) for major changes
Review changes before deploying to production
Maintain a changelog documenting what changed and why

playbooks/
├── phishing-response/
│   ├── v1.0-workflow.json
│   ├── v1.1-workflow.json    # Added AbuseIPDB enrichment
│   ├── v2.0-workflow.json    # Added auto-containment for HIGH confidence
│   ├── test-payloads.json
│   └── CHANGELOG.md
├── ioc-enrichment/
│   ├── v1.0-workflow.json
│   └── test-payloads.json
└── README.md

Key Takeaways

Modular playbooks (small, single-purpose workflows called by an orchestrator) are easier to test, debug, and maintain than monolithic mega-workflows
Every playbook must be testable with sample data before handling live alerts — unit test individual steps, integration test the full flow, and test edge cases including API failures
The phishing response playbook follows five stages: parse alert → enrich IOCs → determine verdict → create case → notify analyst — this pattern applies to most alert-driven automation
Error handling is not optional. Silent failures are the #1 automation risk. Every failure must trigger an actionable notification that enables manual completion of the workflow
IOC enrichment playbooks should be reusable sub-playbooks that accept any IOC type and aggregate results from multiple sources (VirusTotal, MISP, AbuseIPDB)
Version control playbooks in Git with semantic versioning, test payloads, and changelogs — treat them with the same rigor as production code
Automated verdict logic (VT detections + MISP match → severity scoring) accelerates triage but should be validated against manual analysis before it is trusted in production

What's Next

You now understand how automated playbooks are designed and structured. In Lesson 14.3, Integration & Orchestration, you will connect Shuffle to the full CyberBlueSOC tool stack (Wazuh, TheHive, MISP, Velociraptor, and notification channels) and build multi-tool orchestration workflows that automate enrichment, containment, and notification across your entire environment.

Knowledge Check: Building Playbooks

10 questions · 70% to pass

What is the main advantage of building modular playbooks (small, single-purpose workflows) instead of monolithic ones?

In the phishing response playbook, what happens when both VirusTotal detections exceed 5 AND a MISP match is found?

Why is silent failure considered the #1 risk in SOC automation?

In Lab 14.2, you build a phishing response playbook. What is the correct order of the five stages?

What error handling strategy should be used when VirusTotal API times out during IOC enrichment?

An alert contains three IOCs (an IP, a domain, and a hash). How should the enrichment playbook handle multiple indicators?

What should a failure notification include to be actionable for the analyst?

In Lab 14.2, you test the phishing playbook with sample data before connecting it to live alerts. Which test case validates error handling?

Why should playbook workflow JSON files be stored in version control (Git)?

Which documentation elements should every playbook include?


Building Automated Playbooks