What You'll Learn
- Identify the 8 log source categories that feed a production SIEM
- Understand what each category tells you and which attack phases it covers
- Know the critical Windows Event IDs every SOC analyst must recognize on sight
- Explain what Linux log files reveal about system and user activity
- Map log sources to MITRE ATT&CK tactics to understand where your visibility starts and stops
- Recognize log source gaps and why no single source is enough
Why Log Sources Are Everything
In Module 1, you learned that a SIEM is the central nervous system of the SOC. But a SIEM is only as good as the data flowing into it. If a log source isn't connected, the SIEM can't see it — and neither can you. Every blind spot in your log coverage is an opportunity for an attacker to operate undetected.
In Lab 1.3, you explored 10 log sources inside Wazuh and built a Log Source Reference Sheet. Now we're going to expand that foundation. A real enterprise SOC doesn't monitor 10 sources — it monitors hundreds, organized into 8 major categories. Understanding these categories is what separates a junior analyst who reacts to alerts from a senior analyst who understands the full picture.
The #1 question every SOC analyst should ask on day one: "What log sources do we have — and what are we missing?"
Category 1: Windows Event Logs
Windows Event Logs are the single most important log source in most enterprise SOCs because the vast majority of corporate endpoints and servers run Windows. The Windows Security channel alone generates the events you'll spend 60-70% of your time investigating.
The Must-Know Event IDs
| Event ID | Channel | What It Logs | ATT&CK Relevance |
|---|---|---|---|
| 4624 | Security | Successful logon | Lateral movement (T1021), valid accounts (T1078) |
| 4625 | Security | Failed logon | Brute force (T1110), password spraying |
| 4688 | Security | New process created | Execution (T1059), command-line logging |
| 4697 | Security | Service installed | Persistence (T1543.003) |
| 4720 | Security | User account created | Create account (T1136) |
| 7045 | System | New service registered | Persistence (T1543.003) — duplicate coverage with 4697 |
| 1102 | Security | Audit log cleared | Defense evasion (T1070.001) — always critical |
| 4648 | Security | Explicit credential logon | Credential use — runas, scheduled tasks |
| 4672 | Security | Special privileges assigned | Admin logon — tracks who got elevated access |
Logon Types — The Context That Changes Everything
When you see Event ID 4624 or 4625, the logon type field tells you how the authentication happened. This single field can mean the difference between a routine event and an active intrusion:
| Type | Name | What It Means | Suspicious When... |
|---|---|---|---|
| 2 | Interactive | Keyboard login at the console | Happens outside business hours on a server |
| 3 | Network | SMB, RPC, WMI — remote resource access | Source IP is external or from an unexpected subnet |
| 5 | Service | Windows service starting | Normal — unless it's a service you didn't install |
| 7 | Unlock | Workstation unlocked | Rarely suspicious on its own |
| 10 | RemoteInteractive | RDP session | Source IP is external, or user doesn't normally RDP |
Type 3 (Network) Is Where Lateral Movement Lives. When an attacker uses stolen credentials to access file shares (SMB), run commands via WMI, or execute PsExec, it generates a Type 3 logon. If you see Type 3 from an unusual source IP — especially hopping between servers — that's your lateral movement indicator.
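The logon-type reasoning above can be sketched as a small triage helper. This is a minimal illustration, not a production rule: the event dictionary keys (`logon_type`, `source_ip`) and the `EXPECTED_SUBNETS` list are hypothetical names chosen for this example, and a real SIEM would expose these fields under its own schema.

```python
import ipaddress

# Assumption: subnets from which remote logons are considered normal.
EXPECTED_SUBNETS = [ipaddress.ip_network("10.0.10.0/24")]

# Logon types covered in the table above.
LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    5: "Service",
    7: "Unlock",
    10: "RemoteInteractive",
}

def triage_logon(event):
    """Return a short triage note for a parsed 4624/4625 event."""
    logon_type = event["logon_type"]
    name = LOGON_TYPES.get(logon_type, "Other")
    try:
        src = ipaddress.ip_address(event.get("source_ip", ""))
    except ValueError:
        # Local logons often carry "-" or an empty source address.
        return f"{name}: no usable source IP"
    # Only remote logon types (Network, RDP) are checked against subnets.
    if logon_type in (3, 10):
        if not src.is_private:
            return f"{name}: external source {src}, investigate"
        if not any(src in net for net in EXPECTED_SUBNETS):
            return f"{name}: unexpected internal subnet {src}, review"
    return f"{name}: no flag"
```

The subnet check is deliberately simple; in practice you would also baseline which accounts normally authenticate from which hosts.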
What Windows Event Logs Don't Tell You
Windows Security logs are powerful, but they have blind spots:
- No command-line arguments in 4688 unless you enable "Include command line in process creation events" via Group Policy
- No DLL loading, registry changes, or network connections per process — that's what Sysmon adds (Lesson 2.6)
- No file content analysis — you know a file was created but not what's in it (that's YARA, Module 7)
Category 2: Linux / Syslog
Linux systems generate logs through the syslog facility and dedicated log files. In a SOC, you'll encounter Linux logs from web servers, DNS servers, database servers, network appliances, and cloud instances.
The Critical Log Files
| Log File | What It Contains | SOC Relevance |
|---|---|---|
| /var/log/auth.log (Debian/Ubuntu) or /var/log/secure (RHEL/CentOS) | SSH logins, sudo commands, PAM authentication | Brute force (T1110), privilege escalation (T1548), valid accounts (T1078) |
| /var/log/syslog or /var/log/messages | System events, service starts/stops, kernel messages | Persistence (cron, services), system manipulation |
| /var/log/audit/audit.log | Detailed audit trail (if auditd is enabled) | Process execution, file access, syscalls — Linux's equivalent of deep telemetry |
| /var/log/cron (RHEL/CentOS; Debian/Ubuntu typically log cron activity to /var/log/syslog) | Cron job execution | Persistence via scheduled tasks (T1053.003) |
| /var/log/kern.log | Kernel-level messages | Firewall drops (iptables), hardware issues, kernel exploits |
Reading auth.log — What Matters
Entries from an SSH brute force attempt in auth.log look like this:
Feb 15 06:15:02 linux-web-01 sshd[5102]: Invalid user admin from 185.220.101.42 port 44891
Feb 15 06:15:14 linux-web-01 sshd[5104]: Failed password for root from 185.220.101.42 port 44903
The fields that matter for triage:
- Timestamp — when did it happen?
- Hostname — which server was targeted?
- Program — sshd, sudo, cron, etc.
- Source IP — internal (expected) or external (investigate)?
- Username — valid user or random guess?
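The triage fields listed above can be pulled out of the two sample lines with a pair of regular expressions. This is a sketch that assumes the traditional syslog line layout shown in the example; the pattern names and the `parse_auth_line` helper are illustrative, and other sshd message variants would need additional patterns.

```python
import re

# Generic syslog prefix: timestamp, hostname, program[pid]: message
AUTH_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<program>[\w\-/]+)(?:\[\d+\])?: "
    r"(?P<message>.*)$"
)

# Two common sshd failure messages (not an exhaustive list).
SSH_FAIL = re.compile(
    r"(?:Invalid user|Failed password for(?: invalid user)?) "
    r"(?P<username>\S+) from (?P<source_ip>\S+) port \d+"
)

def parse_auth_line(line):
    """Return a dict of triage fields, or None if the line does not parse."""
    match = AUTH_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    detail = SSH_FAIL.search(fields["message"])
    if detail:
        # Add username and source_ip when the message is an SSH failure.
        fields.update(detail.groupdict())
    return fields
```

Calling `parse_auth_line` on the first sample line yields the hostname `linux-web-01`, the program `sshd`, the username `admin`, and the source IP `185.220.101.42`, which are the values you would pivot on during triage.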
Linux Audit Framework (auditd)
When auditd is enabled, it provides the deepest Linux visibility — comparable to what Sysmon gives on Windows. It can log:
- Every process execution with full command lines
- File access and permission changes
- Network socket creation
- System calls
Most production Linux servers in security-conscious organizations run auditd. If you see /var/log/audit/audit.log events in your SIEM, you have premium Linux visibility.
Career Note: Many SOC job postings list "experience with Linux log analysis" as a requirement. What they really mean is: can you read auth.log, understand sudo events, and spot anomalies in syslog? Lab 1.3 already gave you hands-on practice with exactly this.
Category 3: Firewall & Network Device Logs
Firewalls sit at network boundaries and log the connections they allow or block, depending on how logging is configured. They are your perimeter visibility: the first place inbound traffic is seen and the last place outbound traffic is seen.
What Firewall Logs Contain
| Field | What It Tells You |
|---|---|
| Source IP | Who initiated the connection |
| Destination IP | What system the connection targeted (internal for inbound traffic, external for outbound) |
| Destination Port | What service they tried to reach (22=SSH, 443=HTTPS, 3389=RDP) |
| Action | Allow or deny/drop |
| Protocol | TCP, UDP, ICMP |
| Bytes transferred | Volume of data (relevant for exfiltration) |
Common Firewall Platforms You'll Encounter
| Platform | Log Format | How It Reaches the SIEM |
|---|---|---|
| iptables/nftables (Linux) | Kernel syslog messages | Syslog forwarding |
| Palo Alto Networks | Structured CSV or CEF | Syslog or API integration |
| Fortinet FortiGate | Key-value pairs | Syslog forwarding |
| Cisco ASA | Syslog with message codes | Syslog forwarding |
| pfSense | BSD syslog format | Syslog forwarding |
| AWS Security Groups / NACLs | VPC Flow Logs (space-delimited text by default) | CloudWatch Logs or S3 → SIEM connector |
What to Look For
- Drops across many different ports from one IP: port scanning (reconnaissance)
- Repeated drops to the same port from one IP (for example, 22 or 3389): brute force or targeted probing of a single service
- Drops on port 4444, 5555, or other non-standard ports: possible reverse shell or C2 attempts
- Allowed connections to known-bad IPs: C2 communication that got through
- Large outbound data transfers: potential exfiltration
- Internal-to-internal drops: misconfiguration or lateral movement attempts
Firewall Logs Show You What Was Blocked — And What Got Through. A "deny" event means the attack was stopped at the perimeter. An "allow" to a suspicious destination means it wasn't. Both are equally important: denies tell you who's knocking, allows tell you who got in.
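One way to surface scanning in firewall logs is to count how many distinct destination ports each source IP was dropped on. The sketch below assumes the firewall events have already been parsed into dictionaries; the keys `src_ip`, `dst_port`, and `action`, and the `min_ports` threshold, are hypothetical choices for this example.

```python
from collections import defaultdict

def find_port_scanners(events, min_ports=10):
    """Return {source_ip: distinct_dropped_ports} for likely scanners."""
    ports_by_source = defaultdict(set)
    for event in events:
        # Only blocked connections count toward the scan heuristic.
        if event["action"] == "drop":
            ports_by_source[event["src_ip"]].add(event["dst_port"])
    return {
        ip: len(ports)
        for ip, ports in ports_by_source.items()
        if len(ports) >= min_ports
    }
```

A source that is dropped many times on a single port does not trip this check; that pattern is better handled by a separate count of drops per source and port.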
Category 4: DNS Query Logs
Most network connections start with a DNS query. Before malware can call home to evil-c2.example.com, it has to resolve that domain to an IP address. DNS logs capture these queries, provided clients use resolvers you monitor.
Why DNS Logs Are a SOC Goldmine
- C2 Detection: Malware must resolve its C2 domain. DNS logs record it even if the actual C2 traffic is encrypted.
- DNS Tunneling: Attackers encode data inside DNS queries to exfiltrate information. These show up as unusually long subdomains or high query volumes to a single domain.
- DGA Detection: Domain Generation Algorithms produce random-looking domains (xk7gf2p9.net). A spike in queries to newly-registered or algorithmically-generated domains is a strong malware indicator.
- Shadow IT Discovery: DNS logs reveal what cloud services employees are using (Dropbox, personal email, unauthorized SaaS).
What DNS Log Fields Matter
| Field | What It Tells You |
|---|---|
| Client IP | Which internal host made the query |
| Queried domain | What they tried to resolve |
| Query type | A (IPv4), AAAA (IPv6), MX (mail), TXT (often used for tunneling) |
| Response code | NOERROR (found), NXDOMAIN (doesn't exist), SERVFAIL |
| Timestamp | When the query happened |
Suspicious DNS Patterns
| Pattern | What It May Indicate |
|---|---|
| Queries to domains with random characters | DGA malware (T1568.002) |
| Very long subdomain strings (50+ chars) | DNS tunneling / exfiltration (T1071.004) |
| High volume of NXDOMAIN responses | DGA probing or misconfigured malware |
| Queries to recently registered domains (< 30 days) | Newly staged C2 infrastructure |
| TXT record queries to unusual domains | DNS-based data exfiltration |
DNS Never Lies. An attacker can encrypt their C2 traffic, use legitimate cloud services for hosting, and blend into normal HTTPS traffic. But they can rarely avoid DNS resolution: hardcoding IPs limits flexibility, and encrypted DNS to an outside resolver can be blocked or flagged at the firewall or proxy. This makes DNS one of the most reliable detection sources across the entire kill chain.
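Two of the suspicious patterns in the table, very long subdomains and random-looking domain labels, can be approximated with simple string checks. The sketch below is a rough heuristic under stated assumptions: it treats the last two labels as the registered domain (which is wrong for suffixes such as co.uk), and the length and entropy thresholds are illustrative values, not tuned detections. Entropy on short labels is a weak signal and will produce false positives on its own.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def dns_flags(domain, max_subdomain_len=50, entropy_threshold=3.0):
    """Return a list of heuristic flags for one queried domain."""
    labels = domain.rstrip(".").lower().split(".")
    # Assumption: the registered domain is the last two labels.
    subdomain = ".".join(labels[:-2])
    registered = labels[-2] if len(labels) >= 2 else labels[0]
    flags = []
    if len(subdomain) >= max_subdomain_len:
        flags.append("long_subdomain")  # possible DNS tunneling
    if len(registered) >= 8 and shannon_entropy(registered) >= entropy_threshold:
        flags.append("high_entropy_label")  # possible DGA domain
    return flags
```

In practice these flags are combined with the other table rows, such as NXDOMAIN volume and domain age, before anything is escalated.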
Category 5: Web Proxy / HTTP Logs
A web proxy (also called a Secure Web Gateway) sits between users and the internet. It inspects, logs, and optionally blocks web traffic. In organizations that route all web traffic through a proxy, these logs are a treasure trove.
What Proxy Logs Capture
| Field | What It Tells You |
|---|---|
| User / Client IP | Who made the request |
| URL | Full URL including path and parameters |
| HTTP Method | GET, POST, PUT, DELETE |
| Response Code | 200 (OK), 403 (blocked), 404 (not found), 500 (server error) |
| Content Type | Was it HTML, JavaScript, an executable, a ZIP file? |
| User-Agent | What browser or tool made the request |
| Bytes transferred | How much data was uploaded or downloaded |
| Category | Proxy's classification (business, social media, malware, uncategorized) |
SOC Use Cases
- Malware delivery detection: User visited a compromised website, proxy logged the URL and the .exe download
- C2 callback identification: Infected host makes periodic HTTPS connections to cdn-static-assets.xyz every 60 seconds
- Data exfiltration: Large POST requests to an uncategorized domain at 3 AM
- Policy violation: Employee accessing unauthorized file-sharing or personal cloud storage
- Phishing follow-through: After clicking a link in email, what did they actually browse to?
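The C2 callback use case above depends on spotting requests that recur at a near-constant interval. A minimal sketch of that idea follows, assuming you have already collected the request times (as epoch seconds) for one client and one destination from the proxy logs; the function name and the `max_jitter_ratio` threshold are illustrative assumptions.

```python
from statistics import mean, pstdev

def looks_like_beacon(timestamps, min_events=5, max_jitter_ratio=0.1):
    """Return True when request intervals are suspiciously regular."""
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    # Gaps between consecutive requests, in seconds.
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(intervals)
    if average == 0:
        return False
    # Low spread relative to the average interval suggests automation.
    return pstdev(intervals) / average <= max_jitter_ratio
```

Legitimate software such as update checkers also polls on a timer, so a regular interval is a lead to investigate alongside the proxy category and destination reputation, not a verdict.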
Common Proxy Platforms
| Platform | Deployment | Notes |
|---|---|---|
| Zscaler | Cloud-based | Very common in modern enterprises |
| Symantec/BlueCoat ProxySG | On-prem appliance | Legacy but still widespread |
| Squid | Open-source, on-prem | Often used in smaller orgs |
| McAfee Web Gateway | On-prem or cloud | Enterprise-grade |
| Microsoft Defender for Cloud Apps | Cloud (M365) | CASB functionality |
Proxy + DNS = Network Visibility Duo. DNS tells you what domains were resolved. Proxy tells you what content was actually accessed. Together, they give you near-complete visibility into outbound network activity, even when the traffic is encrypted, provided the proxy is configured to terminate and inspect TLS.
Category 6: Email Gateway Logs
Phishing remains the #1 initial access vector in real-world attacks. In Lab 1.2, the APT29 scenario started with spearphishing emails (T1566.001). Email gateway logs are where you detect and investigate these attacks.
What Email Gateway Logs Capture
| Field | What It Tells You |
|---|---|
| Sender address | Who sent the email (and whether it's spoofed) |
| Recipient | Who was targeted |
| Subject line | Social engineering context |
| Attachment name / type | invoice.docm, urgent-review.pdf.exe |
| URLs in body | Phishing links, credential harvester URLs |
| Verdict | Delivered, quarantined, blocked |
| SPF/DKIM/DMARC results | Email authentication — did it pass? |
| Threat classification | Phishing, malware, spam, BEC |
Why Email Logs Are Critical for Investigation
When a phishing campaign hits your organization, the SIEM alert might only show one user who clicked. Email gateway logs answer the harder questions:
- How many people received the same email? (scope of the campaign)
- Did anyone else click the link or open the attachment? (other potential victims)
- Was the email quarantined or delivered? (do you need to pull it from mailboxes?)
- What was the sender domain and IP? (IOCs for threat intel — Module 5)
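The scoping questions above reduce to grouping gateway records for one campaign by verdict. The sketch below assumes gateway events parsed into dictionaries with the hypothetical keys `sender`, `subject`, `recipient`, and `verdict`; real platforms name these fields differently, and real campaigns often vary the subject line, so exact matching is a simplification.

```python
from collections import defaultdict

def scope_campaign(email_events, sender, subject):
    """Group recipients of one campaign by gateway verdict."""
    recipients_by_verdict = defaultdict(list)
    for event in email_events:
        same_sender = event["sender"].lower() == sender.lower()
        if same_sender and event["subject"] == subject:
            recipients_by_verdict[event["verdict"]].append(event["recipient"])
    return dict(recipients_by_verdict)
```

The recipients listed under a delivered verdict are the mailboxes you would need to pull the message from and the users you would check for clicks.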
Common Email Security Platforms
| Platform | Type | Notes |
|---|---|---|
| Microsoft Defender for Office 365 | Cloud (M365) | Built into most enterprise email |
| Proofpoint | Cloud gateway | Market leader in email security |
| Mimecast | Cloud gateway | Strong attachment sandboxing |
| Barracuda | Cloud/on-prem | Mid-market |
| Google Workspace Security | Cloud (Gmail) | Built into Google Workspace |
The Gap Between "Blocked" and "Delivered." An email gateway might block 99% of phishing attempts. That sounds great until you realize that in a 10,000-employee organization receiving 1,000 phishing emails per day, "99% blocked" means 10 phishing emails land in inboxes every single day. Those 10 are why SOC analysts exist.
Category 7: Cloud Audit Trails
Modern organizations run hybrid environments — on-premises servers plus cloud infrastructure (AWS, Azure, GCP) plus SaaS applications (M365, Salesforce, Slack). Each of these generates audit logs that track who did what.
The Major Cloud Audit Sources
| Source | Platform | What It Logs |
|---|---|---|
| AWS CloudTrail | AWS | API calls: EC2 launches, IAM changes, console logins, and S3 access (object-level S3 activity is logged only when data events are enabled) |
| Azure AD / Entra ID Sign-in Logs | Microsoft | User sign-ins, MFA challenges, conditional access results, risky sign-ins |
| Microsoft 365 Unified Audit Log | M365 | Email access, SharePoint file operations, Teams activity, admin changes |
| Google Workspace Audit | Google Workspace | Gmail access, Drive sharing, admin console changes |
| GCP Cloud Audit Logs | GCP | Admin activity, data access, system events |
Why Cloud Logs Matter More Every Year
- Identity is the new perimeter. In cloud environments, there's no firewall between the attacker and your data — just an identity (username + password + MFA). Cloud audit logs track every authentication and authorization decision.
- Attackers target cloud directly. Credential stuffing against Azure AD, phishing for M365 tokens, compromising AWS access keys — these attacks skip your on-prem defenses entirely.
- Data lives in the cloud. If an attacker exfiltrates data from SharePoint or S3, the cloud audit trail is often the only log that records it.
Key Cloud Events to Monitor
| Event | Platform | Why It Matters |
|---|---|---|
| Console login from unusual location | AWS CloudTrail / Azure AD | Compromised credentials (T1078.004) |
| MFA bypass or disabled | Azure AD / M365 | Attacker removing security controls |
| IAM policy changed | AWS CloudTrail | Privilege escalation in cloud (T1098) |
| S3 bucket made public | AWS CloudTrail | Data exposure — accidental or malicious |
| Mail forwarding rule created | M365 | Business Email Compromise (BEC) persistence |
| Mass file download from SharePoint | M365 | Data exfiltration (T1530) |
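Several of the AWS rows in this table correspond to specific CloudTrail event names, so a first-pass filter can be a simple lookup. The sketch below uses the standard CloudTrail record fields `eventName`, `eventTime`, and `sourceIPAddress`; the `WATCHLIST` mapping is a small illustrative selection, not a complete list of the API calls behind each row.

```python
# Illustrative subset of CloudTrail event names worth reviewing.
WATCHLIST = {
    "DeactivateMFADevice": "MFA disabled",
    "AttachUserPolicy": "IAM policy changed",
    "PutUserPolicy": "IAM policy changed",
    "PutBucketAcl": "S3 bucket permissions changed",
    "PutBucketPolicy": "S3 bucket permissions changed",
}

def flag_cloudtrail(records):
    """Return (time, event name, reason, source IP) for watched events."""
    findings = []
    for record in records:
        reason = WATCHLIST.get(record.get("eventName"))
        if reason:
            findings.append((
                record.get("eventTime"),
                record.get("eventName"),
                reason,
                record.get("sourceIPAddress"),
            ))
    return findings
```

A bucket permission change is not necessarily a bucket made public; confirming that requires inspecting the request parameters of the flagged record.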
Cloud Logs Are the Fastest Growing Category. Five years ago, most SOCs monitored only on-prem logs. Today, cloud audit trails often generate more events than traditional sources. In some cloud-native organizations, CloudTrail and Azure AD logs are the primary data sources in the SIEM.
Category 8: Application & Database Logs
Every application generates logs — web servers, databases, custom business applications, middleware. These logs are often overlooked in SOCs but contain evidence that no other source captures.
Web Server Logs
| Source | Log File | What It Records |
|---|---|---|
| Apache | access.log, error.log | Every HTTP request: IP, URL, method, status code, user-agent |
| Nginx | access.log, error.log | Same as Apache with different format |
| IIS | W3SVC logs | Windows web server access logs |
Web server logs detect:
- SQL injection attempts: /api/users?id=1' OR '1'='1 (T1190)
- Web shell access: Repeated requests to /uploads/shell.php from a single IP
- Directory traversal: /../../etc/passwd in URL paths
- Vulnerability scanning: Rapid requests to known vulnerable paths (/wp-login.php, /phpmyadmin, /.env)
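The request patterns listed above can be matched against the path field of an access log entry. This sketch is illustrative only: the three regular expressions are simplified assumptions built from the examples in the list, and real attacks use many encodings and variants that these patterns will miss.

```python
import re
from urllib.parse import unquote

# Simplified patterns based on the examples above (not exhaustive).
PATTERNS = {
    "sql_injection": re.compile(r"'\s*(?:or|and)\s+'?\w+'?\s*=", re.IGNORECASE),
    "directory_traversal": re.compile(r"\.\./"),
    "scanner_path": re.compile(
        r"^/(?:wp-login\.php|phpmyadmin|\.env)(?:$|[/?])", re.IGNORECASE
    ),
}

def classify_request_path(raw_path):
    """Return the names of all patterns that match a request path."""
    # Decode percent-encoding first so %27 is seen as a single quote.
    path = unquote(raw_path)
    return [name for name, pattern in PATTERNS.items() if pattern.search(path)]
```

Decoding the path before matching matters because attackers and scanners frequently percent-encode quotes and spaces.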
Database Audit Logs
| Database | Audit Feature | What It Records |
|---|---|---|
| MySQL / MariaDB | General query log, audit plugin | Every SQL query executed |
| PostgreSQL | pgaudit extension | Query logging with parameters |
| MSSQL | SQL Server Audit | Login events, schema changes, query execution |
| Oracle | Unified Auditing | Comprehensive query and access logging |
Database audit logs detect:
- Unauthorized data access — SELECT queries on sensitive tables from unusual users
- Data manipulation — UPDATE/DELETE on critical records
- Schema changes — DROP TABLE, ALTER TABLE from non-admin accounts
- Privilege escalation — GRANT commands giving excessive permissions
Custom Application Logs
Many organizations build custom applications that generate their own logs. These often contain business-context that no other log source provides:
- Authentication events specific to the application
- Business logic violations (e.g., transferring more than a threshold amount)
- API access patterns that indicate scraping or abuse
The Overlooked Goldmine. Web server logs are available on virtually every organization's web-facing systems but are often not forwarded to the SIEM. If you join a SOC and discover that Apache/Nginx access logs aren't being collected, flag it immediately — you're blind to web application attacks.
Connecting Log Sources to ATT&CK
In Lab 1.2, you mapped 15 APT29 techniques to the ATT&CK framework and color-coded them by detection capability. Now let's see which log sources cover which tactics:
| ATT&CK Tactic | Primary Log Sources | Why |
|---|---|---|
| Initial Access | Email gateway, web proxy, firewall | Phishing emails, drive-by downloads, and exploit attempts arrive through these channels |
| Execution | Windows 4688, Linux audit.log, application logs | Process creation captures what ran; app logs capture web-based execution |
| Persistence | Windows 7045/4697, Linux syslog/cron, cloud audit | New services, scheduled tasks, and IAM changes establish long-term access |
| Defense Evasion | Windows events, cloud audit, file integrity monitoring (FIM) | Log clearing (1102), policy changes, and file modifications reveal evasion attempts |
| Credential Access | Windows 4625, Linux auth.log, cloud sign-in logs | Failed authentication across all platforms tracks brute force and credential theft |
| Discovery | DNS logs, Windows events, cloud audit | Network reconnaissance generates DNS queries and enumeration events |
| Lateral Movement | Windows 4624 (Type 3), firewall logs | Network logons between internal systems and allowed internal traffic patterns |
| Exfiltration | Web proxy, DNS logs, firewall logs | Outbound data transfers, DNS tunneling, and large file uploads to external services |
No Single Log Source Covers the Full Kill Chain. This is the most important takeaway of this lesson. Windows Event Logs alone miss initial access (email), network C2 (DNS/proxy), and cloud attacks entirely. A SOC that only monitors Windows events has massive blind spots. Defense in depth requires log sources in depth.
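The tactic-to-source table can be turned into a quick coverage check: given the sources you actually collect, which primary sources are missing for each tactic? The sketch below encodes the table as a dictionary; the shortened source labels (for example `"windows events"` standing in for the specific Event IDs) are a simplification made for this example.

```python
# Primary log sources per tactic, condensed from the table above.
TACTIC_SOURCES = {
    "Initial Access": {"email gateway", "web proxy", "firewall"},
    "Execution": {"windows events", "linux audit", "application logs"},
    "Persistence": {"windows events", "linux syslog", "cloud audit"},
    "Defense Evasion": {"windows events", "cloud audit", "fim"},
    "Credential Access": {"windows events", "linux auth", "cloud sign-in"},
    "Discovery": {"dns", "windows events", "cloud audit"},
    "Lateral Movement": {"windows events", "firewall"},
    "Exfiltration": {"web proxy", "dns", "firewall"},
}

def coverage_report(collected):
    """Return {tactic: sorted list of primary sources not collected}."""
    collected = set(collected)
    return {
        tactic: sorted(sources - collected)
        for tactic, sources in TACTIC_SOURCES.items()
    }
```

Running `coverage_report({"windows events"})` shows every primary source for Initial Access and Exfiltration as missing, which is the blind spot described in the paragraph above.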
The Log Source Priority Matrix
Not every log source is equally important. If you're building a SOC from scratch or evaluating coverage, here's how to prioritize:
| Priority | Log Sources | Why First |
|---|---|---|
| Tier 1 — Must Have | Windows Security Events, Linux auth.log, Firewall logs, DNS logs | These cover authentication, process execution, network boundaries, and name resolution — the four pillars of visibility |
| Tier 2 — High Value | Email gateway, Web proxy, Cloud audit trails (Azure AD / CloudTrail) | These cover the top initial access vector (email), outbound traffic inspection, and cloud identity — where modern attacks happen |
| Tier 3 — Deep Visibility | Sysmon (Lesson 2.6), Application logs, Database audit, Endpoint telemetry (EDR) | These provide the deep technical detail needed for advanced investigation and threat hunting |
What Happens When a Log Source Is Missing
| Missing Source | What You Can't See |
|---|---|
| No email gateway logs | Phishing campaigns, BEC, malicious attachments — you only find out after the user clicks |
| No web proxy logs | C2 callbacks, malware downloads, data exfiltration over HTTPS |
| No DNS logs | DNS tunneling, DGA activity, C2 domain resolution |
| No cloud audit logs | Compromised cloud accounts, unauthorized data access, shadow IT |
| No process creation (4688) | What programs ran on compromised systems — you see the login but not what happened after |
Practical Application: Your Lab Environment
In Lab 1.3, you worked with events from 4 agents covering 10 log sources. Here's how they map to the 8 categories you just learned:
| Category | Covered in Lab? | Agent | What You Saw |
|---|---|---|---|
| Windows Event Logs | Yes | WIN-SERVER-01 | 4624, 4625, 4688, 7045 |
| Linux / Syslog | Yes | linux-web-01 | SSH auth, sudo, audit |
| Firewall & Network | Yes | fw-edge-01 | iptables drop events |
| DNS Query Logs | Yes | dns-server-01 | Named query logs |
| Web Proxy / HTTP | Partial | linux-web-01 | Apache access logs (web server, not proxy) |
| Email Gateway | No | — | Not in lab environment |
| Cloud Audit Trails | No | — | Not in lab environment |
| Application & DB | Partial | linux-web-01 | Apache access/error logs |
The gaps are intentional — email gateway and cloud audit logs require enterprise infrastructure that doesn't fit in a lab container. But now you know they exist and what they provide, so when you encounter them in a real SOC, you'll understand their value immediately.
Key Takeaways
- A SIEM monitors 8 categories of log sources: Windows Events, Linux/Syslog, Firewall, DNS, Web Proxy, Email Gateway, Cloud Audit Trails, and Application/Database logs
- Windows Event IDs 4624, 4625, 4688, 7045, and 1102 are the five you must recognize on sight — they cover authentication, process execution, persistence, and anti-forensics
- Logon Type in Windows 4624/4625 events tells you how the authentication happened — Type 3 (Network) is where lateral movement lives
- DNS logs are one of the most reliable detection sources because attackers can rarely avoid DNS resolution
- Email gateway logs are critical because phishing remains the #1 initial access vector
- Cloud audit trails are the fastest-growing log category as organizations move to hybrid and cloud-first architectures
- No single log source covers the full ATT&CK kill chain — defense in depth requires log sources in depth
- Your first question on day one of a new SOC job: "What log sources are we collecting — and what gaps do we have?"
What's Next
You now understand what data flows into a SIEM. In Lesson 2.2 — Anatomy of a SIEM Alert, you'll learn what happens after the data arrives: how Wazuh rules turn raw log events into the alerts you investigate, and how to read every field in an alert quickly and accurately.
Knowledge Check: Log Sources That Matter
10 questions · 70% to pass
How many log source categories does a production SIEM typically monitor?
A SOC analyst sees Windows Event ID 4625 with Logon Type 3 from an external IP address. What does this indicate?
Why are DNS query logs considered one of the most reliable detection sources for C2 communication?
What critical limitation of Windows Security Event Logs requires Sysmon to address?
An analyst notices a spike in DNS TXT record queries from a single host to an unusual domain with very long subdomain strings. What does this most likely indicate?
Why is 'no single log source covers the full kill chain' the most important takeaway about log sources?
Which log source would reveal that an attacker created a mail forwarding rule to maintain access to a compromised email account (BEC persistence)?
In Lab 1.3, you explored events from 4 Wazuh agents covering 10 log sources. Which of the following agents was responsible for SSH authentication and sudo events in the lab environment?
In Lab 1.3, you identified that email gateway and cloud audit trail logs were NOT present in the lab environment. Why is this gap significant for a production SOC?
In Lab 1.3, the fw-edge-01 agent generated iptables drop events. When you saw repeated drops to port 22 from a single external IP, which attack pattern were you observing?