Phishing Email Investigation & Attribution
A practical guide to parsing raw email headers, validating authentication protocols, and identifying indicators of spoofed domains.
1. Parsing Raw RFC 5322 Headers
Email headers contain a wealth of metadata tracing the message's journey from the sender's mail client to the recipient's inbox. When investigating a suspicious email, the raw message headers are the absolute starting point.
Key header fields to review include:
- Return-Path: The address where bounce messages are sent. Often mismatched with the visual "From" address in spoofing campaigns.
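- Received: Added at the top of the header block by each Mail Transfer Agent (MTA) that handles the message. Read from bottom to top to trace the origin, keeping in mind that entries below the first server you trust can be forged by the sender.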
- X-Distribution / X-Mailer: Information about the software used to compile or bulk-send the message.
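The header checks above can be sketched in Python. This is a minimal illustration, not a complete analysis tool: the sample message, its hostnames, and the helper `domain_of` are all made up for the example. It lists the Received hops in origin-first order and compares the Return-Path domain with the visible From domain.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; hosts and addresses are invented for illustration.
RAW = """\
Return-Path: <bounce@mailer.example.net>
Received: from mx.recipient.example by inbox.recipient.example; Tue, 02 Jan 2024 10:00:05 +0000
Received: from relay.example.net by mx.recipient.example; Tue, 02 Jan 2024 10:00:03 +0000
Received: from unknown-host.example.org by relay.example.net; Tue, 02 Jan 2024 10:00:01 +0000
From: "Support Team" <support@example.com>
Subject: Action required

Body text.
"""

def domain_of(header_value):
    """Return the lowercased domain of the address in a header value, or ''."""
    _, addr = parseaddr(header_value or "")
    _, sep, domain = addr.rpartition("@")
    return domain.lower() if sep else ""

msg = message_from_string(RAW)

# get_all() returns Received headers in file order (newest first),
# so reverse the list to walk from the claimed origin to the inbox.
hops = list(reversed(msg.get_all("Received", [])))

return_path_domain = domain_of(msg["Return-Path"])
from_domain = domain_of(msg["From"])
mismatch = return_path_domain != from_domain

for number, hop in enumerate(hops, 1):
    print(f"hop {number}: {hop}")
print("Return-Path domain:", return_path_domain)
print("From domain:", from_domain)
print("Domains differ:", mismatch)
```

A mismatch is only a signal, not proof of spoofing: legitimate bulk mail services often use their own bounce domain.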
2. Verifying SPF, DKIM, & DMARC
Modern email systems use three primary authentication protocols to combat spoofing and verify authenticity:
SPF (Sender Policy Framework)
A DNS TXT record listing the IP addresses and hosts authorized to send mail on behalf of a domain. The receiving server checks the connecting IP address against the SPF record published by the Return-Path (envelope sender) domain.
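To make the SPF check concrete, here is a deliberately simplified sketch. It assumes the SPF record has already been retrieved from DNS and only evaluates `ip4:` and `ip6:` mechanisms; `include`, `a`, `mx`, `redirect`, and non-default qualifiers are ignored, so it is not a full SPF evaluator. The function name `spf_ip_match` and the sample record are hypothetical.

```python
import ipaddress

def spf_ip_match(spf_record, client_ip):
    """Return True if client_ip falls inside an ip4:/ip6: mechanism.

    Simplified on purpose: include, a, mx, redirect and explicit
    qualifiers (+, -, ~, ?) are not evaluated.
    """
    ip = ipaddress.ip_address(client_ip)
    for term in spf_record.split()[1:]:  # skip the "v=spf1" version tag
        name, _, value = term.partition(":")
        if name.lower() in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return True
    return False

# Hypothetical record built from documentation-only address ranges.
record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.17 ip6:2001:db8::/32 -all"

for candidate in ("192.0.2.55", "203.0.113.9", "2001:db8::1"):
    print(candidate, "authorized:", spf_ip_match(record, candidate))
```

In a real investigation the IP to test is the one that connected to your own mail server, taken from the Received header your server wrote.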
DKIM (DomainKeys Identified Mail)
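A cryptographic signature added to the message as a DKIM-Signature header. A valid signature shows that the signed headers and body were not altered after signing and identifies the domain that signed the message; it does not by itself prove that the signing domain matches the visible "From" address.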
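As a rough illustration of what an analyst reads out of a DKIM-Signature header, the sketch below splits the header into its tags and derives the DNS name where the public key would be published. It does not verify the signature. The helper `parse_dkim_tags` and the sample header (with placeholder hash and signature values) are assumptions made for this example.

```python
def parse_dkim_tags(header_value):
    """Split a DKIM-Signature value into a {tag: value} dict.

    Whitespace inside values is removed, since folded headers may
    contain line breaks and spaces within long values.
    """
    tags = {}
    for part in header_value.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = "".join(value.split())
    return tags

# Hypothetical header value; bh= and b= are placeholders, not real data.
DKIM_SIGNATURE = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com; s=selector1; "
    "h=from:to:subject:date; bh=PLACEHOLDERBODYHASH=; b=PLACEHOLDERSIGNATURE="
)

tags = parse_dkim_tags(DKIM_SIGNATURE)
signing_domain = tags["d"]
signed_headers = tags["h"].split(":")
# The public key is published as a TXT record at <selector>._domainkey.<domain>.
key_dns_name = f"{tags['s']}._domainkey.{signing_domain}"

print("Signing domain:", signing_domain)
print("Signed headers:", signed_headers)
print("Public key DNS name:", key_dns_name)
```

Comparing `signing_domain` with the domain in the visible "From" address is a quick way to spot mail that is validly signed, but by an unrelated domain.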
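DMARC (Domain-based Message Authentication, Reporting, and Conformance)
Lets a domain owner publish a policy for messages that fail authentication (none, quarantine, or reject). A message passes DMARC when SPF or DKIM passes and the authenticated domain aligns with the domain in the visible "From" address.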
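The policy and alignment ideas can be sketched as follows. The record below is a hypothetical example of what a domain might publish at `_dmarc.<domain>`, and both helper functions are illustrative. Note the loud assumption in `is_aligned`: true relaxed alignment compares organizational domains using the Public Suffix List, whereas this sketch approximates it with a plain subdomain test.

```python
def parse_dmarc_record(record):
    """Split a DMARC TXT record into a {tag: value} dict."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    return tags

def is_aligned(from_domain, auth_domain, mode="r"):
    """Approximate DMARC alignment between two domains.

    mode "s" (strict) requires an exact match. mode "r" (relaxed) is
    APPROXIMATED here as "one domain is a subdomain of the other";
    a real check would compare organizational domains.
    """
    a = from_domain.lower().rstrip(".")
    b = auth_domain.lower().rstrip(".")
    if a == b:
        return True
    if mode == "s":
        return False
    return a.endswith("." + b) or b.endswith("." + a)

# Hypothetical record for illustration only.
policy = parse_dmarc_record(
    "v=DMARC1; p=quarantine; adkim=s; rua=mailto:dmarc-reports@example.com"
)

print("Policy:", policy["p"])
print("DKIM alignment mode:", policy.get("adkim", "r"))  # relaxed is the default
print("Relaxed:", is_aligned("example.com", "mail.example.com"))
print("Strict:", is_aligned("example.com", "mail.example.com", mode="s"))
```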
3. Spoofed Domains & Reconnaissance
Attackers often register lookalike domains (e.g., using Unicode characters in IDN homograph attacks or inserting small typos like micros0ft.com).
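A first-pass lookalike check can be automated along these lines. This is a heuristic sketch, not a reliable detector: the watchlist `KNOWN_BRANDS`, the function `lookalike_flags`, and the 0.85 similarity threshold are all assumptions chosen for illustration, and results should be reviewed by a person.

```python
import difflib

# Hypothetical watchlist of domains the organization cares about.
KNOWN_BRANDS = ["microsoft.com", "paypal.com", "google.com"]

def lookalike_flags(domain, threshold=0.85):
    """Return a list of reasons a domain looks like a spoof candidate."""
    domain = domain.lower().rstrip(".")
    flags = []
    # Non-ASCII characters may indicate an IDN homograph.
    if not domain.isascii():
        flags.append("non-ascii characters")
    # "xn--" marks a Punycode-encoded internationalized label.
    if any(label.startswith("xn--") for label in domain.split(".")):
        flags.append("punycode label")
    # Near-matches to a known brand suggest typosquatting.
    for brand in KNOWN_BRANDS:
        ratio = difflib.SequenceMatcher(None, domain, brand).ratio()
        if domain != brand and ratio >= threshold:
            flags.append(f"resembles {brand}")
    return flags

# "\u043e" is the Cyrillic letter that looks like a Latin "o".
for candidate in ("micros0ft.com", "micros\u043eft.com", "microsoft.com"):
    print(ascii(candidate), lookalike_flags(candidate))
```

The threshold trades false positives against misses; short domains in particular can score high against unrelated names.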
Use domain lookup tools to check domain age, registrar information, and name server history:
# Retrieve WHOIS records to check domain creation date
whois target-suspicious-domain.com
# Query MX (Mail Exchanger) records for the domain
dig mx target-suspicious-domain.com +short
4. Extracting Headers Programmatically
Using Python's native email package, security analysts can automate the parsing of raw email files:
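from email import message_from_binary_file, policy

# Open in binary mode so messages with non-UTF-8 bytes still parse
with open('phish_sample.eml', 'rb') as f:
    msg = message_from_binary_file(f, policy=policy.default)

# Print the headers most relevant to a spoofing investigation
print("Subject:", msg['Subject'])
print("From:", msg['From'])
print("Return-Path:", msg['Return-Path'])
print("Authentication-Results:", msg['Authentication-Results'])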
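Going one step further, the Authentication-Results header can be reduced to a per-protocol verdict. The following is a minimal sketch under stated assumptions: the sample message and the helper `auth_results` are hypothetical, and the regular expression only captures the top-level `spf=`, `dkim=`, and `dmarc=` results rather than parsing the full header grammar.

```python
import re
from email import message_from_string

# Hypothetical sample message; hosts and results are invented for illustration.
RAW = """\
Authentication-Results: mx.recipient.example;
 spf=fail smtp.mailfrom=mailer.example.net;
 dkim=none;
 dmarc=fail header.from=example.com
From: support@example.com
Subject: Action required

Body text.
"""

def auth_results(msg):
    """Return {'spf': ..., 'dkim': ..., 'dmarc': ...} for the results found.

    When several Authentication-Results headers exist, the first
    (topmost) result for each method is kept.
    """
    results = {}
    for header in msg.get_all("Authentication-Results", []):
        pairs = re.findall(r"\b(spf|dkim|dmarc)=([a-z]+)", str(header), re.I)
        for method, result in pairs:
            results.setdefault(method.lower(), result.lower())
    return results

results = auth_results(message_from_string(RAW))
for method in ("spf", "dkim", "dmarc"):
    print(method, "->", results.get(method, "not reported"))
```

Only trust an Authentication-Results header written by your own receiving server (identified by the hostname at the start of the header), since a sender can insert a forged copy further down.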
