What You'll Learn
- Apply playbook design principles — modularity, testability, documentation, and graceful error handling — to create production-quality automated workflows
- Navigate the Shuffle workflow builder to create multi-step playbooks using triggers, conditions, loops, and app integrations
- Build a complete phishing response playbook that receives an alert, extracts IOCs, enriches them via VirusTotal and MISP, creates a TheHive case, and notifies an analyst
- Build an IOC enrichment playbook that accepts any indicator type and returns aggregated context from multiple sources
- Implement error handling strategies including retries, fallbacks, and failure notifications
- In Lab 14.2, test and validate playbooks using sample data before deploying them to production
Playbook Design Principles
A playbook is not just a workflow — it is a piece of production infrastructure that runs 24/7, handles real threats, and takes actions that affect your environment. Treat playbook development with the same rigor you would apply to any production code.
Principle 1: Modularity
Build small, reusable playbooks that do one thing well, rather than monolithic workflows that handle everything.
Bad design:
One massive playbook:
Receive alert → Extract IOCs → Enrich in MISP → Enrich in VT →
Check reputation → Create case → Assign analyst → Send Slack →
Send email → Block IP → Isolate endpoint → Generate report
Good design:
Playbook 1: IOC Enrichment
    Input: IOC (any type) → Enrich in MISP + VT → Output: enrichment results
Playbook 2: Case Creation
    Input: alert + enrichment → Create TheHive case → Output: case ID
Playbook 3: Notification
    Input: case details → Send Slack + email → Output: confirmation
Playbook 4: Phishing Response (orchestrator)
    Receive alert → Call Playbook 1 → Call Playbook 2 → Call Playbook 3
Modular playbooks are easier to test, debug, and maintain. When MISP's API changes, you update one enrichment playbook instead of every playbook that uses MISP.
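The same decomposition can be sketched in Python. This is only an illustration, assuming each sub-playbook can be modeled as a plain function; the function names, stubbed return values, and the `iocs` key are hypothetical and are not Shuffle APIs.

```python
def enrich_ioc(ioc):
    """Playbook 1 stand-in: return enrichment for one IOC (stubbed here)."""
    return {"ioc": ioc, "misp_match": False, "vt_detections": 0}


def create_case(alert, enrichment):
    """Playbook 2 stand-in: return a case ID (stubbed here)."""
    return "case-for-" + alert["rule_id"]


def notify(case_id):
    """Playbook 3 stand-in: return True once the notification is sent."""
    return True


def phishing_response(alert):
    """Playbook 4: the orchestrator only sequences the other three."""
    enrichment = [enrich_ioc(ioc) for ioc in alert.get("iocs", [])]
    case_id = create_case(alert, enrichment)
    return {"case_id": case_id, "notified": notify(case_id)}
```

Because the orchestrator contains no tool-specific logic, replacing the body of `enrich_ioc` would not require touching the other functions.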
Principle 2: Testability
Every playbook must be testable with sample data before handling real alerts:
| Testing Stage | What You Test | How |
|---|---|---|
| Unit test | Each app action in isolation | Run individual steps with known inputs |
| Integration test | Full workflow with test data | Send a sample alert through the webhook |
| Validation | Output correctness | Verify TheHive case was created with correct fields |
| Edge cases | Missing fields, API timeouts, invalid data | Send malformed payloads, disconnect a tool temporarily |
Principle 3: Documentation
Document every playbook with:
- Purpose: What this playbook does and when it runs
- Trigger: What starts it (webhook from Wazuh, schedule, manual)
- Inputs: What data it expects (alert JSON schema, IOC format)
- Outputs: What it produces (TheHive case, Slack message, block action)
- Dependencies: Which tools/APIs it requires
- Author and last updated: Who built it, when it was last modified
- Known limitations: What it does not handle
Principle 4: Graceful Error Handling
Automation that fails silently is worse than no automation — because the team assumes the work is being done. Every playbook must handle failures explicitly.
Silent failures are the #1 automation risk. If your enrichment playbook fails to reach MISP and silently continues without enrichment, the resulting case will be missing critical context. Always implement failure detection and notification.
The Shuffle Workflow Builder
Creating a Workflow
In Shuffle, workflows are built visually by placing nodes on a canvas and connecting them:
- Add a trigger — Drag a trigger node (webhook, schedule, or manual) to the canvas
- Add app nodes — Drag app nodes for each tool you want to interact with
- Connect nodes — Draw lines from output to input to define the execution flow
- Configure each node — Set the app action, parameters, and authentication
- Add conditions — Insert condition nodes that branch the flow based on data
- Add loops — Use loop nodes to iterate over arrays (e.g., process each IOC in a list)
Working with Conditions
Conditions evaluate data and branch the workflow:
Condition: $misp_search.response.found == true
├── True branch: Create HIGH severity case, notify L2
└── False branch: Create LOW severity case, standard queue
Common condition patterns:
| Pattern | Example | True Branch | False Branch |
|---|---|---|---|
| Intel match | MISP returns results | Elevate severity | Standard triage |
| Severity threshold | Alert level >= 10 | Immediate notification | Queue for review |
| Known false positive | Hash in allowlist | Close with FP tag | Continue investigation |
| Rate limit | Same IOC enriched <1hr ago | Return cached result | Perform fresh lookup |
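The rate-limit row can be illustrated with a small time-based cache. This is a sketch under stated assumptions: the in-memory dictionary, the `lookup_with_cache` name, and the injectable `now` parameter are hypothetical, and a real workflow would likely keep the cache in a datastore that persists across executions.

```python
import time

CACHE_TTL_SECONDS = 3600  # "enriched <1hr ago" from the table above

_cache = {}  # ioc_value -> (timestamp, result)


def lookup_with_cache(ioc_value, fresh_lookup, now=None):
    """Return a cached result if it is younger than the TTL, else look up again."""
    now = time.time() if now is None else now
    cached = _cache.get(ioc_value)
    if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]  # True branch: return cached result
    result = fresh_lookup(ioc_value)  # False branch: perform fresh lookup
    _cache[ioc_value] = (now, result)
    return result
```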
Working with Loops
Loops iterate over arrays — essential when an alert contains multiple IOCs:
Alert contains: [IP: 185.220.101.42, Domain: evil.com, Hash: abc123...]
Loop: For each IOC in alert.iocs:
    → Enrich IOC in MISP
    → Enrich IOC in VirusTotal
    → Append result to enrichment_results[]
After loop: enrichment_results contains context for all three IOCs
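Expressed as code, the loop might look like the following sketch. The `enrichers` mapping of source name to callable is an assumption made for illustration; in Shuffle the equivalent would be app nodes executed once per list item.

```python
def enrich_all(iocs, enrichers):
    """Run every enricher against every IOC and collect one result per IOC."""
    enrichment_results = []
    for ioc in iocs:
        result = {"ioc": ioc}
        for name, enricher in enrichers.items():
            result[name] = enricher(ioc)  # e.g. result["misp"] = {...}
        enrichment_results.append(result)
    return enrichment_results
```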
Building a Phishing Response Playbook
The phishing response playbook is one of the most common SOAR use cases and the one you will build in Lab 14.2. Here is the complete design:
Trigger
Wazuh webhook — Wazuh sends a POST request to Shuffle when a phishing-related rule fires (e.g., Rule 87101: "Suspicious email attachment detected").
Step 1: Parse Alert
Extract structured data from the Wazuh alert JSON:
{
  "rule_id": "87101",
  "rule_description": "Suspicious email attachment detected",
  "agent_name": "mail-gw-01",
  "data": {
    "sender": "invoice@company-billing[.]com",
    "recipient": "j.martinez@company.com",
    "subject": "Urgent: Invoice #INV-2026-4417",
    "attachment_name": "invoice_feb2026.xlsm",
    "attachment_hash": "a1b2c3d4e5f6...",
    "urls_in_body": ["hxxps://company-billing[.]com/download/payload"]
  }
}
Extracted IOCs:
- Sender domain: company-billing[.]com
- Attachment hash: a1b2c3d4e5f6...
- URL: hxxps://company-billing[.]com/download/payload
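A minimal sketch of this extraction step in Python, assuming the alert has the field names shown in the sample JSON. The `refang` and `extract_iocs` helpers are hypothetical names, and the refanging rules cover only the `[.]` and `hxxp` conventions used in this lesson.

```python
def refang(value):
    """Undo common defanging so indicators can be sent to lookup APIs."""
    return value.replace("[.]", ".").replace("hxxp", "http", 1)


def extract_iocs(alert):
    """Pull the sender domain, attachment hash, and body URLs from an alert."""
    data = alert.get("data", {})
    iocs = []
    sender = data.get("sender")
    if sender and "@" in sender:
        domain = sender.rsplit("@", 1)[1]
        iocs.append({"type": "domain", "value": refang(domain)})
    if data.get("attachment_hash"):
        iocs.append({"type": "hash", "value": data["attachment_hash"]})
    for url in data.get("urls_in_body", []):
        iocs.append({"type": "url", "value": refang(url)})
    return iocs
```

Using `.get()` with defaults means an alert that lacks a field yields fewer IOCs instead of raising an exception; whether a missing field should instead be treated as a parse error is a design decision for the playbook.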
Step 2: Enrich IOCs
For each extracted IOC, query multiple intelligence sources:
VirusTotal enrichment:
Hash: a1b2c3d4e5f6...
→ VT detection: 38/72 engines flagged as malicious
→ Classification: Trojan.GenericKD.46789
→ First seen: 2026-02-20
Domain: company-billing[.]com
→ Registered: 2026-02-19 (2 days ago)
→ Hosting: 185.220.101.42 (known malicious infrastructure)
→ VT community score: -15 (malicious)
MISP enrichment:
Hash: a1b2c3d4e5f6...
→ Match: Event #4521 "Invoice Phishing Campaign Feb 2026"
→ TLP: AMBER
→ Tags: phishing, emotet-variant, T1566.001
Domain: company-billing[.]com
→ Match: Same event #4521
→ Linked to 12 other phishing domains
Step 3: Determine Verdict
Apply automated triage logic based on enrichment results:
IF (vt_detections > 5 AND misp_match == true):
    verdict = "MALICIOUS"
    severity = "HIGH"
ELIF (vt_detections > 5 OR misp_match == true):
    verdict = "SUSPICIOUS"
    severity = "MEDIUM"
ELIF (vt_detections > 0):
    verdict = "LOW_CONFIDENCE"
    severity = "LOW"
ELSE:
    verdict = "UNKNOWN"
    severity = "LOW"
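The triage logic translates directly into a small pure function, which makes it easy to test outside the workflow builder. The function name and the tuple return shape are assumptions for illustration; the thresholds are the ones given above.

```python
def determine_verdict(vt_detections, misp_match):
    """Map enrichment results to a (verdict, severity) pair."""
    if vt_detections > 5 and misp_match:
        return ("MALICIOUS", "HIGH")
    if vt_detections > 5 or misp_match:
        return ("SUSPICIOUS", "MEDIUM")
    if vt_detections > 0:
        return ("LOW_CONFIDENCE", "LOW")
    return ("UNKNOWN", "LOW")
```

Note that the boundary is exclusive: exactly 5 detections without a MISP match falls through to `LOW_CONFIDENCE`.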
Step 4: Create TheHive Case
Auto-create a case with all enrichment context pre-populated:
{
  "title": "[PHISHING] Invoice Phishing - j.martinez - 2026-02-21",
  "description": "Automated phishing triage. Wazuh Rule 87101.\nVerdict: MALICIOUS (VT: 38/72, MISP: Event #4521)",
  "severity": 3,
  "tags": ["phishing", "automated-triage", "emotet-variant"],
  "observables": [
    { "dataType": "mail", "data": "invoice@company-billing.com" },
    { "dataType": "domain", "data": "company-billing.com" },
    { "dataType": "hash", "data": "a1b2c3d4e5f6...", "tags": ["vt-38/72"] },
    { "dataType": "url", "data": "hxxps://company-billing.com/download/payload" }
  ]
}
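A payload like this is usually assembled from the parsed alert and the verdict. The following sketch shows one way to do that; `build_case` and its parameters are hypothetical, and the severity mapping assumes the numeric scale implied by the example (3 for HIGH), with 1 and 2 for LOW and MEDIUM taken as assumptions. It only builds the dictionary and does not call TheHive.

```python
SEVERITY_TO_INT = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}  # 3 = HIGH as in the example


def build_case(summary, recipient, date, rule_id, verdict, severity,
               observables, extra_tags=()):
    """Build a case payload shaped like the example above."""
    user = recipient.split("@", 1)[0]  # "j.martinez" from the mailbox address
    return {
        "title": "[PHISHING] {} - {} - {}".format(summary, user, date),
        "description": "Automated phishing triage. Wazuh Rule {}.\nVerdict: {}".format(
            rule_id, verdict
        ),
        "severity": SEVERITY_TO_INT[severity],
        "tags": ["phishing", "automated-triage"] + list(extra_tags),
        "observables": list(observables),
    }
```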
Step 5: Notify Analyst
Send a structured notification to the SOC Slack channel:
🚨 PHISHING ALERT — HIGH SEVERITY
Recipient: j.martinez@company.com
Subject: "Urgent: Invoice #INV-2026-4417"
Sender: invoice@company-billing[.]com
Attachment: invoice_feb2026.xlsm (VT: 38/72 — MALICIOUS)
Intel Match: MISP Event #4521 — Invoice Phishing Campaign
Verdict: MALICIOUS (automated)
TheHive Case: #4521 — Assigned to L2 Queue
Action Required: Confirm containment, check if attachment was opened.
Building an IOC Enrichment Playbook
The IOC enrichment playbook is a reusable sub-playbook called by other workflows. It accepts any IOC type and returns aggregated context:
Input Schema
{
  "ioc_type": "ip|domain|hash|url|email",
  "ioc_value": "185.220.101.42",
  "sources": ["virustotal", "misp", "abuseipdb"]
}
Enrichment Logic
For each source in request.sources:
    IF source == "virustotal":
        result = VT.lookup(ioc_type, ioc_value)
        enrichment.vt_score = result.detections
        enrichment.vt_classification = result.classification
    IF source == "misp":
        result = MISP.search(ioc_value)
        enrichment.misp_match = result.found
        enrichment.misp_events = result.events
        enrichment.misp_tags = result.tags
    IF source == "abuseipdb":
        result = AbuseIPDB.check(ioc_value)
        enrichment.abuse_score = result.confidence_score
        enrichment.abuse_reports = result.total_reports
        enrichment.abuse_country = result.country
Return aggregated enrichment object
Output Schema
{
  "ioc_type": "ip",
  "ioc_value": "185.220.101.42",
  "enrichment": {
    "vt_score": "12/89",
    "vt_classification": "malicious",
    "misp_match": true,
    "misp_events": ["Event #4521 — Invoice Phishing Campaign"],
    "misp_tags": ["phishing", "emotet-variant"],
    "abuse_score": 95,
    "abuse_reports": 847,
    "abuse_country": "RU"
  },
  "verdict": "MALICIOUS",
  "confidence": "HIGH",
  "enriched_at": "2026-02-21T14:32:00Z"
}
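The aggregation step can be sketched as a dispatcher over pluggable source clients. Everything here is illustrative: `source_clients` maps a source name to a hypothetical callable that returns a dictionary of enrichment fields, and the `errors` key is an addition not present in the output schema above, included so that a failed source is recorded and not silently dropped. Verdict and confidence scoring are left out.

```python
from datetime import datetime, timezone


def enrich(request, source_clients):
    """Query each requested source and merge the results into one object."""
    enrichment = {}
    errors = {}
    for source in request["sources"]:
        client = source_clients.get(source)
        if client is None:
            errors[source] = "unknown source"
            continue
        try:
            enrichment.update(client(request["ioc_type"], request["ioc_value"]))
        except Exception as exc:  # record the failure instead of hiding it
            errors[source] = str(exc)
    return {
        "ioc_type": request["ioc_type"],
        "ioc_value": request["ioc_value"],
        "enrichment": enrichment,
        "errors": errors,
        "enriched_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```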
Error Handling in Automation
Failure Modes
| Failure Type | Example | Impact |
|---|---|---|
| API timeout | VirusTotal does not respond within 30 seconds | Enrichment incomplete |
| Authentication failure | MISP API key expired | All MISP lookups fail |
| Rate limiting | VT free tier: 4 requests/minute exceeded | Lookups queued or dropped |
| Invalid data | Alert JSON missing expected fields | Parse error, workflow crashes |
| Tool unavailable | TheHive container is restarting | Case creation fails |
Error Handling Strategies
Retry with backoff:
attempt = 1
max_retries = 3
while attempt <= max_retries:
    result = call_api(params)
    if result.success:
        return result
    if attempt < max_retries:
        wait(2 ^ attempt seconds)  # exponential: 2s, then 4s; no wait after the last attempt
    attempt += 1
return error("API unreachable after 3 attempts")
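A runnable version of retry with exponential backoff might look like this. It is a sketch that assumes the API call signals failure by raising an exception; `call_with_retry` and its parameters are hypothetical names, and the `sleep` argument is injectable so the function can be tested without real delays. It sleeps only between attempts, so three attempts with the defaults wait 2 and then 4 seconds.

```python
import time


def call_with_retry(call_api, max_retries=3, base_delay=2.0, sleep=time.sleep):
    """Call call_api() up to max_retries times with exponential backoff."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return call_api()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                sleep(base_delay ** attempt)  # 2s, then 4s with the defaults
    raise RuntimeError(
        "API unreachable after {} attempts".format(max_retries)
    ) from last_error
```

In production code it is usually better to catch only the exception types that indicate a transient failure (timeouts, connection errors), because retrying an authentication failure only delays the failure notification.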
Fallback to alternative source:
result = try: VirusTotal.lookup(hash)
if result.failed:
    result = try: MalwareBazaar.lookup(hash)
    if result.failed:
        result = { status: "ENRICHMENT_UNAVAILABLE", source: "none" }
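The fallback chain generalizes to an ordered list of sources. In this sketch, `sources` is a list of `(name, lookup)` pairs where each `lookup` is a hypothetical callable that raises on failure; the wrapper fields `status`, `source`, and `result` are assumptions modeled on the pseudocode above.

```python
def lookup_with_fallback(ioc_value, sources):
    """Try each (name, lookup) pair in order and return the first success."""
    for name, lookup in sources:
        try:
            return {"status": "OK", "source": name, "result": lookup(ioc_value)}
        except Exception:
            continue  # fall through to the next source
    return {"status": "ENRICHMENT_UNAVAILABLE", "source": "none"}
```

Recording which source answered lets the analyst see when a verdict was based on the fallback and not on the primary source.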
Failure notification:
if any_step.failed:
    Slack.post(channel="#soc-automation-alerts",
               message="⚠️ Playbook failure: {workflow_name} - Step {step_name} failed.
               Error: {error_message}. Manual triage required for alert {alert_id}.")
Every playbook should have a "failure path" that is as well-designed as the "success path." When automation fails, the analyst must be immediately notified with enough context to manually complete the workflow. A failure notification that says "playbook failed" is useless. A failure notification that says "MISP enrichment failed for alert 92101 on WIN-SERVER-01 — manual MISP lookup required for IP 185.220.101.42" is actionable.
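To make the difference concrete, here is a small formatter that refuses to produce a message without the context an analyst needs. The function and parameter names are hypothetical; it only builds the text, and posting it to Slack is left to the workflow.

```python
def failure_message(workflow, step, error, alert_id, host, pending_action):
    """Build an actionable failure notification; every field is required."""
    fields = {
        "workflow": workflow, "step": step, "error": error,
        "alert_id": alert_id, "host": host, "pending_action": pending_action,
    }
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("missing notification context: " + ", ".join(missing))
    return (
        "Playbook failure: {workflow}, step '{step}' failed for alert "
        "{alert_id} on {host}. Error: {error}. "
        "Manual action required: {pending_action}."
    ).format(**fields)
```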
Testing and Validating Playbooks
Test with Known-Good Data
Before connecting a playbook to live alerts, test with sample payloads:
{
  "test_name": "Phishing — known malicious",
  "payload": {
    "rule_id": "87101",
    "data": {
      "sender": "test@known-phishing-domain.com",
      "attachment_hash": "known_malicious_hash_from_misp",
      "urls_in_body": ["hxxps://known-malicious-url.com/payload"]
    }
  },
  "expected_result": {
    "verdict": "MALICIOUS",
    "case_created": true,
    "case_severity": "HIGH",
    "notification_sent": true
  }
}
Test Matrix
| Test Case | Input | Expected Verdict | Expected Actions |
|---|---|---|---|
| Known malicious | Hash with VT 38/72 + MISP match | MALICIOUS / HIGH | Case (HIGH), Slack notification |
| Low confidence | VT 2/72, no MISP match | LOW_CONFIDENCE / LOW | Case (LOW), standard queue |
| Clean | VT 0/72, no MISP match | UNKNOWN / LOW | Case (LOW), standard queue |
| API failure | VT timeout, MISP unreachable | ENRICHMENT_UNAVAILABLE | Failure notification, manual triage |
| Missing fields | Alert JSON missing attachment_hash | PARSE_ERROR | Failure notification with raw alert |
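The verdict rows of the matrix can be automated as a table-driven check. This sketch repeats a hypothetical `determine_verdict` function implementing the Step 3 logic so that it is self-contained, and it covers only the first three rows; the API failure and missing-field rows depend on workflow behavior that a pure function cannot exercise.

```python
def determine_verdict(vt_detections, misp_match):
    """Step 3 triage logic, repeated here so the example is self-contained."""
    if vt_detections > 5 and misp_match:
        return ("MALICIOUS", "HIGH")
    if vt_detections > 5 or misp_match:
        return ("SUSPICIOUS", "MEDIUM")
    if vt_detections > 0:
        return ("LOW_CONFIDENCE", "LOW")
    return ("UNKNOWN", "LOW")


# (test case, VT detections, MISP match, expected verdict/severity)
TEST_MATRIX = [
    ("Known malicious", 38, True, ("MALICIOUS", "HIGH")),
    ("Few detections", 2, False, ("LOW_CONFIDENCE", "LOW")),
    ("Clean", 0, False, ("UNKNOWN", "LOW")),
]


def run_matrix(matrix, verdict_fn):
    """Return a list of (name, expected, actual) for every failing case."""
    failures = []
    for name, vt_detections, misp_match, expected in matrix:
        actual = verdict_fn(vt_detections, misp_match)
        if actual != expected:
            failures.append((name, expected, actual))
    return failures
```

An empty list from `run_matrix(TEST_MATRIX, determine_verdict)` means every row passed; a non-empty list names the rows to investigate.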
Version Control for Playbooks
Treat playbooks like code:
- Export workflow JSON from Shuffle after each change
- Store in Git with meaningful commit messages
- Tag releases (v1.0, v1.1, v2.0) for major changes
- Review changes before deploying to production
- Maintain a changelog documenting what changed and why
playbooks/
├── phishing-response/
│ ├── v1.0-workflow.json
│ ├── v1.1-workflow.json # Added AbuseIPDB enrichment
│ ├── v2.0-workflow.json # Added auto-containment for HIGH confidence
│ ├── test-payloads.json
│ └── CHANGELOG.md
├── ioc-enrichment/
│ ├── v1.0-workflow.json
│ └── test-payloads.json
└── README.md
Key Takeaways
- Modular playbooks (small, single-purpose workflows called by an orchestrator) are easier to test, debug, and maintain than monolithic mega-workflows
- Every playbook must be testable with sample data before handling live alerts — unit test individual steps, integration test the full flow, and test edge cases including API failures
- The phishing response playbook follows five stages: parse alert → enrich IOCs → determine verdict → create case → notify analyst — this pattern applies to most alert-driven automation
- Error handling is not optional. Silent failures are the #1 automation risk. Every failure must trigger an actionable notification that enables manual completion of the workflow
- IOC enrichment playbooks should be reusable sub-playbooks that accept any IOC type and aggregate results from multiple sources (VirusTotal, MISP, AbuseIPDB)
- Version control playbooks in Git with tagged releases, test payloads, and changelogs, and treat them with the same rigor as production code
- Automated verdict logic (VT detections + MISP match → severity scoring) accelerates triage but should be validated against manual analysis before trusting in production
What's Next
You now understand how automated playbooks are designed and structured. In Lesson 14.3, Integration & Orchestration, you will connect Shuffle to the full CyberBlueSOC tool stack (Wazuh, TheHive, MISP, Velociraptor, and notification channels) and build multi-tool orchestration workflows that automate enrichment, containment, and notification across your entire environment.
Knowledge Check: Building Playbooks
10 questions · 70% to pass
What is the main advantage of building modular playbooks (small, single-purpose workflows) instead of monolithic ones?
In the phishing response playbook, what happens when both VirusTotal detections exceed 5 AND a MISP match is found?
Why is silent failure considered the #1 risk in SOC automation?
In Lab 14.2, you build a phishing response playbook. What is the correct order of the five stages?
What error handling strategy should be used when VirusTotal API times out during IOC enrichment?
An alert contains three IOCs (an IP, a domain, and a hash). How should the enrichment playbook handle multiple indicators?
What should a failure notification include to be actionable for the analyst?
In Lab 14.2, you test the phishing playbook with sample data before connecting it to live alerts. Which test case validates error handling?
Why should playbook workflow JSON files be stored in version control (Git)?
Which documentation elements should every playbook include?