PrivacyTotal/Research brief · Vol. 01 · 2025–26

A Final Year Project · BSc (Hons) Cybercrime & IT Security · South East Technological University

Every app asks.
Few policies answer.

PrivacyTotal is an end-to-end auditing pipeline that downloads an Android APK, extracts its declared permissions, retrieves the developer's privacy policy, and uses a fine-tuned 2.7 billion-parameter language model to decide — permission by permission — whether the policy discloses what the app collects. Every application is then scored on a 0–100 Privacy Health Score, and every verdict cites the specific policy excerpt it was drawn from.

0.930 · F1, covered class · precision 0.892, recall 0.971
92.5% · LLM–oracle agreement · 2,931 of 3,167 pairs
117 · applications analysed · 3,483 permission pairs
8 GB · peak VRAM at inference · RTX 2060 Super, 4-bit NF4

The approach

A four-stage pipeline that runs unattended from a single package ID.

PrivacyTotal acquires an APK through a five-source fallback chain (APKMirror → APKPure → APKCombo → APKMonk → Uptodown), decodes its AndroidManifest.xml with aapt2, and retrieves the privacy policy from the Play Store listing. A QLoRA-tuned MobileLLaMA 2.7B then receives one structured prompt per permission, containing the permission name, its human-readable description, and the top three TF-IDF-ranked policy excerpts.
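The excerpt-selection step can be illustrated with a minimal sketch, assuming a plain TF-IDF cosine ranking over policy sentences. The function names (`tokenize`, `top_k_excerpts`), the smoothed IDF formula, and the sample policy text are illustrative assumptions, not PrivacyTotal's actual implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_excerpts(query, excerpts, k=3):
    """Rank excerpts against the query by TF-IDF cosine similarity.

    Returns up to k (excerpt, score) pairs, best first. Ties keep
    the original excerpt order.
    """
    docs = [Counter(tokenize(e)) for e in excerpts]
    n = len(docs)
    # Document frequency: number of excerpts that contain each term.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumption): common terms keep a small positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(counts):
        # Terms unseen in the policy carry no weight.
        return {t: tf * idf[t] for t, tf in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = vectorize(Counter(tokenize(query)))
    scored = [(cosine(q, vectorize(d)), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(excerpts[i], score) for score, i in scored[:k]]


# Hypothetical policy sentences and permission description.
policy = [
    "We collect your precise location to show nearby stores.",
    "We use cookies to remember your preferences.",
    "Bluetooth is used to connect to paired accessories.",
    "You may contact our data protection officer by email.",
]
query = "ACCESS_FINE_LOCATION: access precise location"

for text, score in top_k_excerpts(query, policy, k=3):
    print(f"{score:.3f}  {text}")
```

In this toy input only the first sentence shares terms with the query, so it ranks first and the remaining slots are filled with zero-score excerpts. A real pipeline might drop those, or leave it to the model to report that no relevant disclosure was found.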

Verdicts are aggregated into a weighted Privacy Health Score on a 0–100 scale, persisted to SQLite for longitudinal tracking, and surfaced through a Flask web application. The full stack runs on consumer hardware with no cloud inference.
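One way such a weighted score could be computed is sketched below. The verdict labels, the credit given to each verdict, the per-permission weights, and the name `privacy_health_score` are all hypothetical; the brief does not specify the actual weighting scheme.

```python
# Hypothetical credit per verdict: full, partial, or no credit.
VERDICT_CREDIT = {"covered": 1.0, "vague": 0.5, "undisclosed": 0.0}

# Hypothetical sensitivity weights; unlisted permissions get DEFAULT_WEIGHT.
DEFAULT_WEIGHT = 1.0
SENSITIVITY_WEIGHT = {
    "android.permission.ACCESS_FINE_LOCATION": 3.0,
    "android.permission.CAMERA": 3.0,
    "android.permission.BLUETOOTH_CONNECT": 2.0,
}


def privacy_health_score(verdicts):
    """Return a 0-100 score: the weighted share of disclosure credit earned.

    `verdicts` maps a permission name to one of the VERDICT_CREDIT keys.
    An app with no declared permissions scores 100 (an assumption).
    """
    if not verdicts:
        return 100.0
    total = 0.0
    earned = 0.0
    for permission, verdict in verdicts.items():
        weight = SENSITIVITY_WEIGHT.get(permission, DEFAULT_WEIGHT)
        total += weight
        earned += weight * VERDICT_CREDIT[verdict]
    return round(100.0 * earned / total, 1)


example = {
    "android.permission.ACCESS_FINE_LOCATION": "covered",
    "android.permission.BLUETOOTH_CONNECT": "vague",
    "android.permission.NFC": "undisclosed",
}
print(privacy_health_score(example))  # 66.7
```

Here the app earns 3.0 + 1.0 + 0.0 of a possible 6.0 weighted points. Normalising by total weight keeps scores comparable between apps that declare different numbers of permissions.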

Headline findings

The gap is structured, not random — and concentrated on feature-level permissions.

Across 3,483 permission–policy pairs, roughly 35% of declared Android permissions are either undisclosed or only vaguely disclosed. The gap falls hardest on feature-level permissions — not on core-sensitive ones:

  • NFC: only 13.6% of 22 declarations are disclosed in policy.
  • Bluetooth: only 36.8% of 201 declarations are disclosed, despite Bluetooth's role in proximity tracking.
  • Notifications group: 39.1% disclosed, with policies lagging behind the platform.
  • 69.2% of the 117 sampled apps have at least one undisclosed permission.
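Figures of this kind can be derived from stored verdicts with two aggregate queries. The sketch below uses an in-memory SQLite database with a hypothetical `verdicts` table and toy rows; it also assumes that "disclosed" means a `covered` verdict, which the brief does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (app, permission) verdict.
conn.execute(
    "CREATE TABLE verdicts (package TEXT, permission_group TEXT, verdict TEXT)"
)
rows = [
    ("com.example.a", "NFC", "undisclosed"),
    ("com.example.a", "Bluetooth", "covered"),
    ("com.example.b", "Bluetooth", "vague"),
    ("com.example.b", "Location", "covered"),
    ("com.example.c", "Location", "covered"),
]
conn.executemany("INSERT INTO verdicts VALUES (?, ?, ?)", rows)

# Disclosure rate per permission group, least disclosed first.
disclosure = conn.execute(
    """
    SELECT permission_group,
           COUNT(*) AS declared,
           ROUND(100.0 * SUM(verdict = 'covered') / COUNT(*), 1) AS pct
    FROM verdicts
    GROUP BY permission_group
    ORDER BY pct ASC, permission_group
    """
).fetchall()

# Share of apps with at least one undisclosed permission.
affected, total = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE WHEN verdict = 'undisclosed' THEN package END),
           COUNT(DISTINCT package)
    FROM verdicts
    """
).fetchone()
share = round(100.0 * affected / total, 1)

for group, declared, pct in disclosure:
    print(f"{group}: {pct}% of {declared} declarations disclosed")
print(f"{share}% of {total} apps have at least one undisclosed permission")
```

With the toy rows, NFC comes out least disclosed and one of three apps has an undisclosed permission. The same queries would apply unchanged to a full verdict table.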

Why this matters

In 2025, RTÉ Prime Time revealed that granular location data from tens of thousands of Irish smartphones was openly brokered on data markets, tracking journeys to prisons, military bases, and private addresses. GDPR Articles 12 and 13 require clear, plain-language disclosure of exactly this kind of collection. Manual auditing does not scale to a Play Store of millions of apps whose policies routinely run past 15,000 words.

PrivacyTotal automates the audit end-to-end: the same check can be repeated across thousands of apps, re-run when policies change, and reproduced from a single package ID on consumer hardware. No cloud inference, no proprietary dependencies — an open-source pipeline from APK acquisition through to Privacy Health Score.

Built with

Model · MobileLLaMA 2.7B · QLoRA / PEFT · 4-bit NF4
Training · PyTorch · transformers · bitsandbytes
Acquisition · Playwright · TLS-fingerprint masking
Extraction · aapt2 · androguard · BeautifulSoup
App · Flask · Jinja2 · SQLite · Chart.js
Platform · Python 3.11 · Windows + WSL · RTX 2060 Super