A Final Year Project · BSc (Hons) Cybercrime & IT Security · South East Technological University
Every app asks.
Few policies answer.
PrivacyTotal is an end-to-end auditing pipeline that downloads an Android APK, extracts its declared permissions, retrieves the developer's privacy policy, and uses a fine-tuned 2.7-billion-parameter language model to decide, permission by permission, whether the policy discloses what the app collects. Every application then receives a Privacy Health Score from 0 to 100, and every verdict cites the specific policy excerpt it was drawn from.
precision 0.892, recall 0.971
2,931 of 3,167 pairs
3,483 permission pairs
RTX 2060 Super, 4-bit NF4
The approach
A four-stage pipeline, unattended after one package ID.
PrivacyTotal acquires an APK through a five-source fallback chain (APKMirror → APKPure → APKCombo → APKMonk → Uptodown), decodes its AndroidManifest.xml with aapt2, and retrieves the privacy policy from the Play Store listing. A QLoRA-tuned MobileLLaMA 2.7B then receives a structured prompt with each permission, its human-readable description, and the top-3 TF-IDF-ranked policy excerpts.
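The excerpt-ranking and prompt-building step can be illustrated with a minimal sketch. It assumes the policy has already been split into excerpts and that the query is the permission name plus its description. The function names `top_k_excerpts` and `build_prompt`, the smoothed IDF formula, and the prompt wording are all hypothetical stand-ins, not PrivacyTotal's actual code or template.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (so A_B_C becomes a, b, c)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_excerpts(query, excerpts, k=3):
    """Return the k excerpts most similar to the query by TF-IDF cosine."""
    docs = [Counter(tokenize(e)) for e in excerpts]
    n = len(docs)
    # Document frequency: in how many excerpts each term appears.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumed variant; other formulas would also work).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(counts):
        # Terms never seen in any excerpt carry no weight.
        return {t: tf * idf[t] for t, tf in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = vectorize(Counter(tokenize(query)))
    scored = [(cosine(q, vectorize(doc)), i) for i, doc in enumerate(docs)]
    # Highest score first; ties keep original policy order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [excerpts[i] for _, i in scored[:k]]


def build_prompt(permission, description, excerpts):
    """Assemble a structured prompt (hypothetical template)."""
    lines = [
        f"Permission: {permission}",
        f"Description: {description}",
        "Policy excerpts:",
    ]
    lines += [f"[{i}] {text}" for i, text in enumerate(excerpts, start=1)]
    lines.append("Question: Does the policy disclose this permission's data use?")
    return "\n".join(lines)
```

Numbering the excerpts in the prompt is one way to let a verdict point back to the passage it relied on, which matches the stated goal that every verdict cites a specific policy excerpt.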
Verdicts are aggregated into a weighted Privacy Health Score on a 0–100 scale, persisted to SQLite for longitudinal tracking, and surfaced through a Flask web application. The full stack runs on consumer hardware with no cloud inference.
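As a rough illustration of the aggregation step, the sketch below turns per-permission verdicts into a weighted 0–100 score and stores it in SQLite. The verdict labels, the credit given to each label, the per-permission weights, and the `scores` table schema are assumptions made for this example; the project's real weighting scheme is not specified here.

```python
import sqlite3

# Assumed credit per verdict: full, half, and no credit.
VERDICT_CREDIT = {"disclosed": 1.0, "vague": 0.5, "undisclosed": 0.0}


def health_score(verdicts, weights, default_weight=1.0):
    """Weighted share of disclosure credit earned, scaled to 0-100.

    verdicts: {permission: verdict label}
    weights:  {permission: sensitivity weight} (hypothetical values)
    """
    total = sum(weights.get(p, default_weight) for p in verdicts)
    if total == 0:
        # Assumed convention: nothing declared means nothing to disclose.
        return 100.0
    earned = sum(
        weights.get(p, default_weight) * VERDICT_CREDIT[v]
        for p, v in verdicts.items()
    )
    return round(100.0 * earned / total, 1)


def save_score(conn, package_id, score):
    """Append one scored scan so repeated runs build a history per app."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scores ("
        "package_id TEXT NOT NULL, "
        "score REAL NOT NULL, "
        "scanned_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.execute(
        "INSERT INTO scores (package_id, score) VALUES (?, ?)",
        (package_id, score),
    )
    conn.commit()


def score_history(conn, package_id):
    """Return all stored scores for one package, oldest first."""
    rows = conn.execute(
        "SELECT score FROM scores WHERE package_id = ? ORDER BY rowid",
        (package_id,),
    )
    return [row[0] for row in rows]
```

Inserting a new row per scan, rather than overwriting, is what makes longitudinal tracking possible: a change in an app's score after a policy update shows up as a new entry in its history.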
Headline findings
The gap is structured, not random — and concentrated on feature-level permissions.
Across 3,483 permission–policy pairs, roughly 35% of declared Android permissions are either undisclosed or only vaguely disclosed. The gap falls hardest on feature-level permissions — not on core-sensitive ones:
- NFC: only 13.6% of 22 declarations are disclosed in policy.
- Bluetooth: 36.8% of 201 declarations are disclosed, despite Bluetooth's role in proximity tracking.
- Notifications group: only 39.1% disclosed; policies lag behind the platform.
- 69.2% of the 117 sampled apps have at least one undisclosed permission.
Why this matters
In 2025, RTÉ Prime Time revealed that granular location data from tens of thousands of Irish smartphones was openly brokered on data markets — tracking journeys to prisons, military bases, and private addresses, despite GDPR Article 13 requiring plain-language disclosure of exactly this kind of collection. Manual auditing does not scale to a Play Store of millions of apps whose policies routinely run past 15,000 words.
PrivacyTotal automates the audit end-to-end: the same check can be repeated across thousands of apps, re-run when policies change, and reproduced from a single package ID on consumer hardware. No cloud inference, no proprietary dependencies — an open-source pipeline from APK acquisition through to Privacy Health Score.
Built with
Model · MobileLLaMA 2.7B · QLoRA / PEFT · 4-bit NF4
Training · PyTorch · transformers · bitsandbytes
Acquisition · Playwright · TLS-fingerprint masking
Extraction · aapt2 · androguard · BeautifulSoup
App · Flask · Jinja2 · SQLite · Chart.js
Platform · Python 3.11 · Windows + WSL · RTX 2060 Super