Final Year Project - BSc Cybercrime - SETU Carlow

Automated Threat Intelligence Pipeline

An open-source SOC automation lab integrating MISP, Cortex, TheHive and Maltego into a single threat intelligence workflow, demonstrating that enterprise-grade security automation is achievable without enterprise-grade costs.

• 78%: average investigation time reduction
• 39s: average automated enrichment time
• 4: incident scenarios tested
• 0: data lost across 5 service interruptions

Automated Pipeline Flow

MISP → Cortex → TheHive → Maltego → Dashboard

🎯 Research Question

Does automating SOC enrichment and triage workflows meaningfully reduce analyst investigation time and by how much?

βœ… Finding

Yes by an average of 78% across four incident scenario types, consistent across two weeks of structured testing.

Background & Motivation

Kacper Szkutnik
Student ID: C00285636
Course: BSc (Hons) Cybercrime, SETU Carlow
Supervisor: Martin Tolan
Year: 2025–2026

💡 Why This Project?

The motivation came from direct experience using MISP and TheHive during an Erasmus exchange in Finland, where these tools were deployed in a real SOC environment. They worked well individually but were almost always used in isolation. This project connects all four tools into a single automated workflow, something that is rarely done in open-source environments.

🎯 The Problem

Enterprise-grade threat intelligence platforms like IBM QRadar and Splunk Phantom offer integrated automation pipelines, but at a cost that puts them out of reach for small organisations, regional CERTs and academic institutions. This project demonstrates that the same core capabilities are achievable using entirely free, open-source tools.

Tools Used

• MISP: Threat intelligence collection and IOC sharing
• Cortex: Automated IOC enrichment via VirusTotal, AbuseIPDB, IPInfo
• TheHive: Case management, dashboards and investigation workflows
• Maltego: Visual threat correlation and infrastructure mapping
• Docker: Container orchestration for TheHive and Cortex
• VMware: Virtualised lab environment across two VMs

Development Journey

October 2025
Project scoping and specification
Initial supervisor meetings to define project purpose, audience and deliverables.
January 2026
Installation attempts and challenges
The Docker-based TheHive installation failed repeatedly over two weeks because of SSL certificate issues. Security Onion was also tested and abandoned.
February 2026
StrangeBee breakthrough
StrangeBee was contacted directly and provided a pre-configured image. Full pipeline integration was achieved within 48 hours.
March–April 2026
Testing and evaluation
Two weeks of structured performance testing across four incident scenarios, plus stress testing and failure testing.

How It Works

Phase 1 MISP - Collection

Threat indicators are created or imported into MISP as attributes within an event. Events are published and automatically picked up by TheHive as alerts.
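
As a rough illustration of what "attributes within an event" means, the sketch below builds a MISP-style event as plain JSON. The field names (`info`, `distribution`, `threat_level_id`, `analysis`, `Attribute`) follow MISP's documented event format; the helper name `build_misp_event`, the event title and the indicator values are assumptions made up for this example and are not taken from the project. Sending and publishing the event (for instance through PyMISP or the MISP REST API) is left out.

```python
import json


def build_misp_event(info, indicators):
    """Build a MISP-style event dict from (type, category, value) tuples.

    Hypothetical helper for illustration; it only assembles JSON and does
    not talk to a MISP server.
    """
    return {
        "Event": {
            "info": info,
            "distribution": 0,     # 0 = your organisation only
            "threat_level_id": 2,  # 2 = medium
            "analysis": 0,         # 0 = initial
            "published": False,    # publishing is a separate step
            "Attribute": [
                {"type": t, "category": c, "value": v, "to_ids": True}
                for (t, c, v) in indicators
            ],
        }
    }


# Placeholder indicators (documentation IP range and example domains).
demo_event = build_misp_event(
    "Demo phishing campaign",
    [
        ("ip-src", "Network activity", "203.0.113.10"),
        ("domain", "Network activity", "login.example.com"),
        ("url", "Network activity", "https://login.example.com/verify"),
    ],
)
print(json.dumps(demo_event, indent=2))
```

Keeping the event as a plain dictionary makes it easy to inspect before it is submitted by whichever client the lab uses.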

Phase 2 TheHive - Case Management

Cases are created from MISP alerts. Observables such as IP addresses, domains and file hashes are added and tracked through the investigation workflow.
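
One way to picture the step from MISP attributes to TheHive observables is a simple type mapping. The sketch below is an assumption about how such a mapping could look, not the project's actual configuration: the table `MISP_TO_THEHIVE` and the helper `to_observables` are hypothetical, and the target names are TheHive's commonly used observable data types (`ip`, `domain`, `url`, `hash` and so on).

```python
# Hypothetical mapping from MISP attribute types to TheHive observable types.
MISP_TO_THEHIVE = {
    "ip-src": "ip",
    "ip-dst": "ip",
    "domain": "domain",
    "hostname": "hostname",
    "url": "url",
    "md5": "hash",
    "sha1": "hash",
    "sha256": "hash",
    "regkey": "registry",
    "user-agent": "user-agent",
    "email-src": "mail",
}


def to_observables(attributes):
    """Convert MISP attribute dicts into TheHive-style observable dicts.

    Returns (observables, skipped) so unmapped types are visible rather
    than silently dropped.
    """
    observables, skipped = [], []
    for attr in attributes:
        data_type = MISP_TO_THEHIVE.get(attr["type"])
        if data_type is None:
            skipped.append(attr)
            continue
        observables.append(
            {
                "dataType": data_type,
                "data": attr["value"],
                "ioc": bool(attr.get("to_ids", False)),
            }
        )
    return observables, skipped


attrs = [
    {"type": "ip-dst", "value": "198.51.100.7", "to_ids": True},
    {"type": "sha256", "value": "0" * 64, "to_ids": True},
    {"type": "comment", "value": "analyst note"},
]
mapped, unmapped = to_observables(attrs)
print(mapped)
print(unmapped)
```

Returning the skipped attributes separately lets an analyst see which indicator types would need manual handling.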

Phase 3 Cortex - Enrichment

Cortex analysers query VirusTotal, AbuseIPDB and IPInfo automatically. Results return to TheHive as structured reports within seconds.
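
In the lab this step is triggered from TheHive, but the underlying request is simple enough to sketch. The example below only builds the HTTP request for running one analyser on one observable and does not send it. The endpoint path `/api/analyzer/<id>/run`, the JSON fields `data`, `dataType` and `tlp`, and the bearer-token header reflect the Cortex REST API as commonly documented and should be checked against the deployed version; the base URL, API key and analyser ID are placeholders.

```python
import json
import urllib.request


def build_cortex_job_request(base_url, api_key, analyzer_id, data_type, data, tlp=2):
    """Build (but do not send) a request that asks Cortex to run an analyser."""
    body = json.dumps({"data": data, "dataType": data_type, "tlp": tlp}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/analyzer/{analyzer_id}/run",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# All values below are placeholders, not the project's real settings.
request = build_cortex_job_request(
    "http://cortex.example.local:9001",
    "REPLACE_WITH_API_KEY",
    "hypothetical-analyzer-id",
    "ip",
    "203.0.113.10",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it would be: urllib.request.urlopen(request)
```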

Phase 4 Feedback Loop

Enriched intelligence is fed back into MISP, completing the closed-loop cycle. New context from one investigation strengthens detection of future similar events.
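
A minimal sketch of how enrichment results could be turned into something MISP can store is shown below. It assumes a Cortex-style report whose `summary.taxonomies` entries carry `level`, `namespace`, `predicate` and `value`, and converts the noteworthy ones into machine tags of the form `namespace:predicate="value"`. The helper name `taxonomies_to_tags`, the choice of levels and the sample report are assumptions for illustration; the project's actual feedback mechanism is not described in this detail.

```python
def taxonomies_to_tags(report, levels=("suspicious", "malicious")):
    """Turn Cortex-style summary taxonomies into MISP-style machine tags.

    Only taxonomies whose level is in `levels` are kept, so routine
    informational results do not clutter the event.
    """
    tags = []
    for tax in report.get("summary", {}).get("taxonomies", []):
        if tax.get("level") in levels:
            tags.append(f'{tax["namespace"]}:{tax["predicate"]}="{tax["value"]}"')
    return tags


# Hypothetical report content, not real analyser output.
sample_report = {
    "summary": {
        "taxonomies": [
            {"level": "malicious", "namespace": "VT", "predicate": "GetReport", "value": "25/70"},
            {"level": "info", "namespace": "IPinfo", "predicate": "Country", "value": "NL"},
        ]
    }
}
print(taxonomies_to_tags(sample_report))
```

Attaching such tags to the original attribute is one way the context from a finished investigation could be made searchable for the next one.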

Four Incident Types Tested

🎣 Scenario A β€” Phishing Campaign

Simulated phishing email targeting a finance department user. IOCs: spoofed sender IP, typosquatted domain, credential harvesting URL, EICAR test hash.

🦠 Scenario B β€” Malware Infection

Simulated AgentTesla infostealer infection on an endpoint. IOCs: C2 server IPs, malware dropper domain, malware file hash, persistence registry key.

🌐 Scenario C β€” Suspicious Network Activity

Simulated data exfiltration alert from a SIEM. IOCs: internal source IP, suspicious destination IPs, exfiltration URL via paste site.

πŸ”‘ Scenario D β€” Account Compromise

Simulated password spray attack on a privileged account. IOCs: attacker IPs, compromised account, off-hours login timestamp, scripted user-agent.

Lab Setup

VM 1 - MISP Server
• IP address: 192.168.x.x
• Ubuntu 22.04
• MISP 2.4.x

VM 2 - TheHive + Cortex
• IP address: 192.168.x.x
• TheHive 5.5.13
• Cortex 4.0.0

⚑ Key Optimisations Made During Testing

Several pipeline improvements were discovered and documented during testing:

• Cassandra RAM cap reduced from 1,280 MB to 512 MB: freed 768 MB and eliminated most crashes
• Elasticsearch heap reduced from 512 MB to 256 MB: freed an additional 256 MB
• Switched from Microsoft Edge to Google Chrome incognito: resolved a persistent blank-screen bug
• Fixed restart order (Elasticsearch → Cassandra → TheHive): reliable recovery in under 3 minutes
• Four browser tabs instead of four windows: allowed 4 concurrent cases without a crash
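
The fixed restart order in the list above can be captured in a small script so that recovery does not depend on memory. The sketch below is an assumption-heavy illustration: it presumes all three services run as Docker containers named `elasticsearch`, `cassandra` and `thehive` (hypothetical names), uses the standard `docker restart` command, and waits a fixed pause between services instead of a proper health check.

```python
import subprocess
import time

# Order taken from the optimisation notes; container names are hypothetical.
RESTART_ORDER = ("elasticsearch", "cassandra", "thehive")


def restart_in_order(containers=RESTART_ORDER, pause_seconds=30,
                     run=subprocess.run, sleep=time.sleep):
    """Restart containers one at a time, pausing between them.

    `run` and `sleep` are injectable so the sequence can be dry-run
    without Docker installed.
    """
    for index, name in enumerate(containers):
        run(["docker", "restart", name], check=True)
        if index < len(containers) - 1:
            sleep(pause_seconds)  # crude stand-in for a readiness check


# Dry run: record the commands instead of executing them.
planned = []
restart_in_order(run=lambda cmd, check: planned.append(cmd), sleep=lambda s: None)
print(planned)
# A real run would simply be: restart_in_order()
```

Replacing the fixed pause with a check that each service answers on its port would make the script more robust, at the cost of a little more code.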

Testing Findings

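Performance testing (times in m:ss; W1 and W2 are the two test weeks):

| Scenario | Auto W1 | Auto W2 | Manual W1 | Manual W2 | % Reduction |
| --- | --- | --- | --- | --- | --- |
| A: Phishing | 0:35 | 0:36 | 3:05 | 3:01 | 81% |
| B: Malware | 0:45 | 0:40 | 2:52 | 2:50 | 74% |
| C: Network Activity | 0:42 | 0:39 | 3:10 | 3:07 | 78% |
| D: Account Compromise | 0:35 | 0:32 | 2:56 | 3:02 | 80% |
| Average | 0:39 | 0:37 | 3:01 | 3:00 | ~78% |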
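
The percentage reduction figures can be reproduced with a few lines of arithmetic. The sketch below assumes the reduction is computed as 1 minus automated time divided by manual time, and that the W1 columns are the ones used; under those assumptions the rounded results match the reported per-scenario values. The function names are illustrative only.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string such as '3:05' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def reduction_percent(automated, manual):
    """Percentage of investigation time saved by automation."""
    return 100 * (1 - to_seconds(automated) / to_seconds(manual))


# (automated, manual) timings from the W1 columns of the results table.
WEEK1 = {
    "A": ("0:35", "3:05"),
    "B": ("0:45", "2:52"),
    "C": ("0:42", "3:10"),
    "D": ("0:35", "2:56"),
}

rounded = {name: round(reduction_percent(a, m)) for name, (a, m) in WEEK1.items()}
print(rounded)
print(sum(rounded.values()) / len(rounded))
```

Averaging the four rounded values gives a figure just above 78, in line with the reported average of roughly 78%.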
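
Stress testing:

| Test | Time | Errors | Crashed | RAM Impact |
| --- | --- | --- | --- | --- |
| Single case (baseline) | 0:39 | 0 | No | Baseline |
| Two concurrent cases | 1:33 | 0 | No | Negligible |
| Four cases (4 windows) | N/A | - | Yes | - |
| Four cases (4 tabs) | 2:26 | 0 | No | +200MB |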

Failure testing:

| Failure Scenario | Impact | Recovery | Data Lost |
| --- | --- | --- | --- |
| API rate limit (12 jobs) | None; Cortex queued automatically | N/A | None |
| Cortex restarted mid-job | Job terminated, TheHive unaffected | 0:40 | None |
| MISP disconnected | No impact on TheHive or Cortex | 2:14 | None |

Summary of key results:

• 78%: average time reduction
• 39s: average automated time
• 3:01: average manual time
• 100%: accuracy parity
• 0: data lost in 5 interruptions
• 12: concurrent VT jobs with no failures