Final Year Project — BSc Cybercrime — SETU Carlow

Automated Threat Intelligence Pipeline

An open-source SOC automation lab integrating MISP, Cortex, TheHive and Maltego into a single threat intelligence workflow, demonstrating that enterprise-grade security automation is achievable without enterprise-grade costs.

78%

Average investigation time reduction

39s

Average automated enrichment time

4

Incident scenarios tested

0

Data lost across 5 service interruptions

Automated Pipeline Flow

🔍 MISP → ⚙️ Cortex → 🐝 TheHive → 🕸️ Maltego → 📊 Dashboard

🎯 Research Question

Does automating SOC enrichment and triage workflows meaningfully reduce analyst investigation time, and if so, by how much?

✅ Finding

Yes, by an average of 78% across four incident scenario types, consistent across two weeks of structured testing.

About the Project

Background & Motivation

👤

Kacper Szkutnik

Student ID: C00285636

Course: BSc (Hons) Cybercrime — SETU Carlow

Supervisor: Martin Tolan

Year: 2025–2026

💡 Why This Project?

The motivation came from direct experience using MISP and TheHive during an Erasmus exchange in Finland, where these tools were deployed in a real SOC environment. They worked well individually but were almost always used in isolation. This project connects all four tools into a single automated workflow, something that is rarely done in open-source environments.

🎯 The Problem

Enterprise-grade threat intelligence platforms like IBM QRadar and Splunk Phantom offer integrated automation pipelines, but at a cost that puts them out of reach for small organisations, regional CERTs, and academic institutions. This project demonstrates that the same core capabilities are achievable using entirely free, open-source tools.

Technology Stack

Tools Used

🔍

MISP

Threat intelligence collection and IOC sharing

⚙️

Cortex

Automated IOC enrichment via VirusTotal, AbuseIPDB, IPInfo

🐝

TheHive

Case management, dashboards and investigation workflows

🕸️

Maltego

Visual threat correlation and infrastructure mapping

🐳

Docker

Container orchestration for TheHive and Cortex

🖥️

VMware

Virtualised lab environment across two VMs

Project Timeline

Development Journey

October 2025

Project scoping and specification

Initial supervisor meetings to define project purpose, audience and deliverables.

January 2026

Installation attempts and challenges

The Docker-based TheHive installation failed repeatedly over two weeks due to SSL certificate issues. Security Onion was also tested and abandoned.

February 2026

StrangeBee breakthrough

StrangeBee was contacted directly and provided a pre-configured image. Full pipeline integration was achieved within 48 hours.

March–April 2026

Testing and evaluation

Two weeks of structured performance testing across four incident scenarios, plus stress testing and failure testing.

Pipeline Architecture

How It Works

Phase 1 MISP — Collection

Threat indicators are created or imported into MISP as attributes within an event. Events are published and automatically picked up by TheHive as alerts.
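As a rough illustration of the collection phase, the sketch below builds the JSON body for a MISP event carrying a few attributes, in the shape MISP's REST API accepts for event creation. The helper name `build_misp_event`, the sample indicator values, and the distribution and threat-level settings are assumptions for illustration, not the project's actual configuration.

```python
import json

# Hypothetical helper: builds an event body in MISP's JSON event format.
def build_misp_event(info, indicators):
    """indicators is a list of (misp_type, category, value) tuples."""
    return {
        "Event": {
            "info": info,
            "distribution": 0,     # 0 = your organisation only (assumed choice)
            "threat_level_id": 2,  # 2 = medium (assumed choice)
            "analysis": 0,         # 0 = initial
            "Attribute": [
                {"type": t, "category": c, "value": v, "to_ids": True}
                for t, c, v in indicators
            ],
        }
    }

# Sample values only: documentation-range IP, reserved .example domain, EICAR MD5.
event = build_misp_event(
    "Phishing campaign (lab test)",
    [
        ("ip-src", "Network activity", "203.0.113.10"),
        ("domain", "Network activity", "examp1e-bank.example"),
        ("md5", "Payload delivery", "44d88612fea8a8f36de82e1278abb02f"),
    ],
)
print(json.dumps(event, indent=2))
```

A body like this would typically be sent to the MISP instance with an API key; the event only reaches TheHive as an alert once it has been published.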

Phase 2 TheHive — Case Management

Cases are created from MISP alerts. Observables such as IP addresses, domains and file hashes are added and tracked through the investigation workflow.
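When observables are added to a case, each one needs a data type so that the right analysers can run against it. The following is a minimal sketch of how a raw indicator string could be mapped to one of TheHive's common observable types (`ip`, `domain`, `url`, `hash`, `other`). The function name `guess_data_type` and the matching rules are assumptions made for illustration; they are not part of TheHive itself.

```python
import ipaddress
import re

HASH_LENGTHS = {32, 40, 64}  # hex lengths of MD5, SHA-1 and SHA-256 digests
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9-]{1,63}\.)+[a-z]{2,63}$", re.IGNORECASE
)

def guess_data_type(value):
    """Return a likely observable data type for a raw indicator string."""
    value = value.strip()
    try:
        ipaddress.ip_address(value)  # accepts both IPv4 and IPv6
        return "ip"
    except ValueError:
        pass
    if value.lower().startswith(("http://", "https://")):
        return "url"
    if len(value) in HASH_LENGTHS and re.fullmatch(r"[0-9a-fA-F]+", value):
        return "hash"
    if DOMAIN_RE.match(value):
        return "domain"
    return "other"

# Sample indicators (illustrative values only).
for ioc in ["203.0.113.10", "https://paste.example/abc",
            "44d88612fea8a8f36de82e1278abb02f", "login.example.com"]:
    print(ioc, "->", guess_data_type(ioc))
```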

Phase 3 Cortex — Enrichment

Cortex analysers query VirusTotal, AbuseIPDB and IPInfo automatically. Results return to TheHive as structured reports within seconds.
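Each Cortex analyser report includes a short summary made of "taxonomies", each with a level of `info`, `safe`, `suspicious` or `malicious`. As a sketch of how several reports could be reduced to one triage verdict, the function below keeps the most severe level seen. The function name `overall_verdict` and the sample report values are invented for illustration and do not come from the project's test data.

```python
# Severity ranking of the taxonomy levels used in Cortex report summaries.
SEVERITY = {"info": 0, "safe": 1, "suspicious": 2, "malicious": 3}

def overall_verdict(reports):
    """Return the most severe taxonomy level found across analyser reports."""
    worst = "info"
    for report in reports:
        for taxonomy in report.get("summary", {}).get("taxonomies", []):
            level = taxonomy.get("level", "info")
            if SEVERITY.get(level, 0) > SEVERITY[worst]:
                worst = level
    return worst

# Invented sample summaries for one IP observable.
sample_reports = [
    {"summary": {"taxonomies": [
        {"namespace": "VT", "predicate": "Score", "value": "9/90", "level": "malicious"}]}},
    {"summary": {"taxonomies": [
        {"namespace": "AbuseIPDB", "predicate": "Records", "value": "3", "level": "suspicious"}]}},
    {"summary": {"taxonomies": [
        {"namespace": "IPinfo", "predicate": "Country", "value": "NL", "level": "info"}]}},
]
print(overall_verdict(sample_reports))
```

Taking the worst level is a deliberately conservative choice: a single malicious result is enough to flag the observable for analyst attention.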

Phase 4 Feedback Loop

Enriched intelligence is fed back into MISP, completing the closed-loop cycle. New context from one investigation strengthens detection of future similar events.
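One way to picture the feedback step is as a de-duplication problem: only indicators that MISP does not already hold should be written back. The sketch below assumes enrichment findings arrive as simple dictionaries with `type`, `value` and `source` keys; that structure and the function name `attributes_to_feed_back` are hypothetical and are not the project's actual mechanism.

```python
def attributes_to_feed_back(existing_attributes, enrichment_findings):
    """Return new MISP-style attributes for findings not already in the event."""
    known = {(a["type"], a["value"].lower()) for a in existing_attributes}
    new_attributes = []
    for finding in enrichment_findings:
        key = (finding["type"], finding["value"].lower())
        if key in known:
            continue  # already present in the event, nothing to add
        known.add(key)
        new_attributes.append({
            "type": finding["type"],
            "value": finding["value"],
            "comment": f"Added from enrichment: {finding['source']}",
            "to_ids": False,  # assumed default: review before using for detection
        })
    return new_attributes

# Illustrative values only.
existing = [{"type": "ip-dst", "value": "203.0.113.10"}]
findings = [
    {"type": "ip-dst", "value": "203.0.113.10", "source": "AbuseIPDB"},
    {"type": "domain", "value": "c2.example", "source": "VirusTotal"},
]
print(attributes_to_feed_back(existing, findings))
```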

Test Scenarios

Four Incident Types Tested

🎣 Scenario A — Phishing Campaign

Simulated phishing email targeting a finance department user. IOCs: spoofed sender IP, typosquatted domain, credential harvesting URL, EICAR test hash.

🦠 Scenario B — Malware Infection

Simulated AgentTesla infostealer infection on an endpoint. IOCs: C2 server IPs, malware dropper domain, malware file hash, persistence registry key.

🌐 Scenario C — Suspicious Network Activity

Simulated data exfiltration alert from a SIEM. IOCs: internal source IP, suspicious destination IPs, exfiltration URL via paste site.

🔑 Scenario D — Account Compromise

Simulated password spray attack on a privileged account. IOCs: attacker IPs, compromised account, off-hours login timestamp, scripted user-agent.

Infrastructure

Lab Setup

VM 1 — MISP Server

192.168.x.x
Ubuntu 22.04
MISP 2.4.x

VM 2 — TheHive + Cortex

192.168.x.x
TheHive 5.5.13
Cortex 4.0.0

⚡ Key Optimisations Made During Testing

Several pipeline improvements were discovered and documented during testing:

• Cassandra RAM capped from 1,280MB → 512MB — freed 768MB, eliminated most crashes
• Elasticsearch heap reduced 512MB → 256MB — freed additional 256MB
• Microsoft Edge → Google Chrome incognito — resolved persistent blank screen bug
• Fixed restart order: Elasticsearch → Cassandra → TheHive — reliable recovery under 3 minutes
• Four browser tabs instead of four windows — allowed 4 concurrent cases without crash
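The fixed restart order above can be captured in a small script so that recovery is repeatable. This is a sketch only: the container names, the use of `docker restart`, and the fixed pause between services are assumptions, since the exact commands used in the lab are not documented here.

```python
import subprocess
import time

# Restart order from the optimisation list; container names are assumed.
RESTART_ORDER = ["elasticsearch", "cassandra", "thehive"]

def restart_stack(run=subprocess.run, wait_seconds=30, sleep=time.sleep):
    """Restart each service in order, pausing so it can start before the next."""
    issued = []
    for name in RESTART_ORDER:
        command = ["docker", "restart", name]
        run(command, check=True)  # raises CalledProcessError if the restart fails
        issued.append(command)
        sleep(wait_seconds)
    return issued

# To use on the TheHive VM, call: restart_stack()
```

The `run` and `sleep` parameters exist so the order can be checked without touching real containers. A more robust version might poll each service's health endpoint instead of waiting a fixed time.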

Evaluation Results

Testing Findings

Performance — Manual vs Automated

Scenario	Auto W1 (m:ss)	Auto W2 (m:ss)	Manual W1 (m:ss)	Manual W2 (m:ss)	% Reduction
A — Phishing	0:35	0:36	3:05	3:01	81%
B — Malware	0:45	0:40	2:52	2:50	74%
C — Network Activity	0:42	0:39	3:10	3:07	78%
D — Account Compromise	0:35	0:32	2:56	3:02	80%
Average	0:39	0:37	3:01	3:00	~78%
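The percentage reduction for each scenario is (manual time − automated time) / manual time. The sketch below applies that formula to the Week 1 timings from the table, which reproduce the published per-scenario figures when rounded. Treating Week 1 as the basis for the percentages is an inference from the numbers, not something the table states.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def reduction_pct(auto, manual):
    """Percentage of manual investigation time saved by automation."""
    a, m = to_seconds(auto), to_seconds(manual)
    return (m - a) / m * 100

# Week 1 timings from the table: scenario -> (automated, manual).
WEEK1 = {
    "A": ("0:35", "3:05"),
    "B": ("0:45", "2:52"),
    "C": ("0:42", "3:10"),
    "D": ("0:35", "2:56"),
}

reductions = {name: reduction_pct(a, m) for name, (a, m) in WEEK1.items()}
average = sum(reductions.values()) / len(reductions)

for name, pct in reductions.items():
    print(f"Scenario {name}: {pct:.0f}%")
print(f"Average: {average:.0f}%")
```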

Stress Testing

Test	Time	Errors	Crashed	RAM Impact
Single case (baseline)	0:39	0	No	Baseline
Two concurrent cases	1:33	0	No	Negligible
Four cases (4 windows)	N/A	—	Yes	—
Four cases (4 tabs)	2:26	0	No	+200MB

Failure Testing

Failure Scenario	Impact	Recovery	Data Lost
API rate limit (12 jobs)	None — Cortex queued automatically	N/A	None
Cortex restarted mid-job	Job terminated, TheHive unaffected	0:40	None
MISP disconnected	No impact on TheHive or Cortex	2:14	None
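To see why queueing, rather than failing, is the expected behaviour in the rate-limit test, the sketch below models a simple fixed-window limiter. It assumes a quota of 4 requests per minute, which is the limit on VirusTotal's public API; the key tier used in the lab is not stated, and this is not a description of how Cortex schedules jobs internally.

```python
def schedule_jobs(job_count, per_window=4, window_seconds=60):
    """Earliest start offset in seconds for each job under a fixed-window quota."""
    return [(i // per_window) * window_seconds for i in range(job_count)]

# The 12 jobs from the failure test, under the assumed quota of 4 per minute.
offsets = schedule_jobs(12)
print(offsets)
print("Last batch starts after", offsets[-1], "seconds")
```

Under these assumptions the 12 jobs run in three batches, with the last batch starting two minutes after the first, and none are dropped.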

Key Statistics

78%

Avg time reduction

39s

Avg automated time

3:01

Avg manual time

100%

Accuracy parity

0

Data lost in 5 interruptions

12

Concurrent VT jobs — no failures