A research brief on the gap between Android permissions and the policies that are supposed to explain them.
PrivacyTotal is an end-to-end auditing pipeline that downloads an APK, extracts its declared permissions, retrieves the developer's privacy policy, and runs a fine-tuned 2.7 billion-parameter language model to decide, permission by permission, whether the policy actually discloses what the app collects. Every application is then scored on a 0–100 Privacy Health Score, and every verdict cites the specific policy excerpt it was drawn from.
Headline numbers
117 Android applications analysed. 3,483 permission–policy pairs drawn across 175 analysis runs. F1 of 0.930 on the covered class versus a deterministic keyword oracle labelled over 2,562 pairs.
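For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score for one class can be computed against reference labels. The label strings and the toy data are illustrative assumptions, not project data or the project's evaluation code.

```python
def f1_for_class(predicted, reference, positive="covered"):
    """Precision, recall and F1 for one class, treating `reference` as ground truth."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == positive and r == positive)
    fp = sum(1 for p, r in pairs if p == positive and r != positive)
    fn = sum(1 for p, r in pairs if p != positive and r == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels for illustration only (hypothetical, not drawn from the corpus).
model_labels = ["covered", "covered", "not_covered", "covered", "not_covered"]
oracle_labels = ["covered", "not_covered", "not_covered", "covered", "covered"]
precision, recall, f1 = f1_for_class(model_labels, oracle_labels)
```

Here the oracle plays the role of ground truth, so the score measures agreement with the keyword heuristic rather than with human judgement.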
Why this project
RTÉ Prime Time's 2025 investigation revealed that granular location data from tens of thousands of Irish smartphones was openly brokered on data markets, tracking journeys to prisons, military bases, and private addresses, despite GDPR Article 13 requiring plain-language disclosure of exactly this kind of collection. Manual auditing does not scale to a Play Store of millions of apps whose policies routinely run past 15,000 words. PrivacyTotal automates the audit end-to-end.
Every Android app declares its required device permissions in AndroidManifest.xml.
Every app handling user data is supposed to link a privacy policy disclosing what it collects
and why. In practice, the two documents diverge — silently, and across every category of
app.
Given only a Google Play package ID, the tool acquires the APK through a five-source
fallback chain, decodes its manifest with aapt2, retrieves the Play Store
privacy policy, and asks a fine-tuned MobileLLaMA whether each declared permission is
disclosed in the policy text. Results are compiled into a gap report, stored in a
versioned database, and surfaced in a Flask web app.
Architecture
The full pipeline (acquisition, extraction, classification, scoring) runs unattended from the Flask web UI or the batch CLI. No human intervention is needed after the package ID is entered.
APK downloaded through a five-source fallback chain (APKMirror → APKPure → APKCombo → APKMonk → Uptodown), with TLS-fingerprint masking and package-ID verification on every candidate.
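The fallback logic can be sketched roughly as follows. This is an illustration under stated assumptions: the fetcher callables, their return shape, and the function name `acquire_apk` are hypothetical, and TLS-fingerprint masking is left out entirely.

```python
from typing import Callable, Iterable, Optional, Tuple

# A fetcher takes a package ID and returns (apk_bytes, reported_package_id),
# or raises on failure. Real fetchers for each mirror are not shown here.
Fetcher = Callable[[str], Tuple[bytes, str]]


def acquire_apk(package_id: str,
                sources: Iterable[Tuple[str, Fetcher]]) -> Optional[Tuple[str, bytes]]:
    """Try each source in order; return the first candidate whose package ID matches."""
    for name, fetch in sources:
        try:
            apk_bytes, reported_id = fetch(package_id)
        except Exception:
            continue  # source unavailable or blocked: fall through to the next one
        if apk_bytes and reported_id == package_id:
            return name, apk_bytes  # verified candidate
    return None  # every source failed or returned the wrong package


# Stand-in fetchers that simulate a blocked source, a mismatched APK, and a hit.
def blocked_source(package_id):
    raise ConnectionError("request blocked")

def mismatched_source(package_id):
    return b"apk", "com.other.app"

def working_source(package_id):
    return b"apk-bytes", package_id

chain = [("APKMirror", blocked_source),
         ("APKPure", mismatched_source),
         ("APKCombo", working_source)]
result = acquire_apk("com.example.app", chain)
```

Verifying the package ID on every candidate matters because third-party mirrors can return a different app under a similar name.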
Android manifest decoded with aapt2; permissions grouped into 14 semantic categories. Privacy-policy URL pulled from the Play Store listing, HTML stripped to clean text.
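A minimal sketch of the permission-extraction step, assuming the text output of `aapt2 dump permissions` lists each permission as `uses-permission: name='...'`. The keyword-to-category table below is a hypothetical stand-in for the project's 14 categories, which are not enumerated here.

```python
import re
import subprocess
from collections import defaultdict

PERMISSION_LINE = re.compile(r"uses-permission(?:-sdk-\d+)?: name='([^']+)'")

# Hypothetical subset of categories, matched by keyword in the permission name.
CATEGORY_KEYWORDS = [
    ("LOCATION", "location"),
    ("BLUETOOTH", "bluetooth"),
    ("NFC", "nfc"),
    ("CAMERA", "camera"),
]


def dump_permissions(apk_path):
    """Run aapt2 on an APK and return its text output (requires aapt2 on PATH)."""
    completed = subprocess.run(["aapt2", "dump", "permissions", apk_path],
                               capture_output=True, text=True, check=True)
    return completed.stdout


def parse_permissions(dump_text):
    """Extract the unique permission names from the dump, sorted."""
    return sorted(set(PERMISSION_LINE.findall(dump_text)))


def group_permissions(permissions):
    """Group permissions by category, using the short name after the last dot."""
    groups = defaultdict(list)
    for permission in permissions:
        short = permission.rsplit(".", 1)[-1]
        category = next((cat for key, cat in CATEGORY_KEYWORDS if key in short), "other")
        groups[category].append(short)
    return dict(groups)


sample_dump = """package: com.example.app
uses-permission: name='android.permission.ACCESS_FINE_LOCATION'
uses-permission: name='android.permission.BLUETOOTH_SCAN'
uses-permission: name='android.permission.NFC'
uses-permission: name='android.permission.VIBRATE'
"""
grouped = group_permissions(parse_permissions(sample_dump))
```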
Fine-tuned MobileLLaMA 2.7B receives a structured prompt with the permission, its human-readable description, and the top-3 TF-IDF-ranked policy excerpts. Returns “Mentioned:” or “Not mentioned:” with rationale.
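The retrieval-and-prompt step might look something like the sketch below. It uses a hand-rolled TF-IDF score built on the standard library so that it stays self-contained; the project may well use a library vectoriser instead, and the prompt wording, tokeniser, and function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens of three or more letters (a crude stop-word filter)."""
    return re.findall(r"[a-z]{3,}", text.lower())


def top_excerpts(query, excerpts, k=3):
    """Rank excerpts by summed TF-IDF weight of the query terms they contain."""
    docs = [Counter(tokenize(excerpt)) for excerpt in excerpts]
    doc_freq = Counter(term for doc in docs for term in doc)
    n_docs = len(docs)

    def idf(term):
        return math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0  # smoothed IDF

    query_terms = set(tokenize(query))
    scored = []
    for index, doc in enumerate(docs):
        length = sum(doc.values()) or 1
        score = sum((doc[term] / length) * idf(term) for term in query_terms if term in doc)
        scored.append((score, index))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [excerpts[index] for score, index in scored[:k] if score > 0]


def build_prompt(permission, description, excerpts):
    """Assemble a structured prompt; the exact wording here is hypothetical."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    return (f"Permission: {permission}\n"
            f"Description: {description}\n"
            f"Policy excerpts:\n{numbered}\n"
            "Answer with 'Mentioned:' or 'Not mentioned:' followed by a short rationale.")


policy = [
    "We collect your precise location to show nearby stores.",
    "We use cookies to remember your preferences.",
    "You may contact us by email at any time.",
    "Location data may be shared with advertising partners.",
]
description = "Allows access to precise location"
ranked = top_excerpts("ACCESS_FINE_LOCATION " + description, policy)
prompt = build_prompt("ACCESS_FINE_LOCATION", description, ranked)
```

Passing only the top-ranked excerpts keeps the prompt short even when the full policy runs to thousands of words, and it gives each verdict a concrete excerpt to cite.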
Results aggregated into a Privacy Health Score on a 0–100 scale, penalising undisclosed high-risk permissions and vague language. Full report persisted to SQLite for longitudinal tracking.
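One way to persist reports for longitudinal tracking is sketched below with Python's built-in `sqlite3` module. The table and column names are hypothetical; the project's actual schema is not described here.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical two-table schema: one row per run, one row per permission verdict.
SCHEMA = """
CREATE TABLE IF NOT EXISTS analysis_runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    package_id TEXT NOT NULL,
    analysed_at TEXT NOT NULL,
    health_score REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS verdicts (
    run_id INTEGER NOT NULL REFERENCES analysis_runs(id),
    permission TEXT NOT NULL,
    verdict TEXT NOT NULL,
    excerpt TEXT
);
"""


def save_run(conn, package_id, health_score, verdicts):
    """Store one analysis run and its (permission, verdict, excerpt) rows."""
    cursor = conn.execute(
        "INSERT INTO analysis_runs (package_id, analysed_at, health_score) VALUES (?, ?, ?)",
        (package_id, datetime.now(timezone.utc).isoformat(), health_score),
    )
    run_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO verdicts (run_id, permission, verdict, excerpt) VALUES (?, ?, ?, ?)",
        [(run_id, permission, verdict, excerpt) for permission, verdict, excerpt in verdicts],
    )
    conn.commit()
    return run_id


def score_history(conn, package_id):
    """Return the app's health scores in the order the runs were recorded."""
    rows = conn.execute(
        "SELECT health_score FROM analysis_runs WHERE package_id = ? ORDER BY id",
        (package_id,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.executescript(SCHEMA)
save_run(conn, "com.example.app", 62.5,
         [("NFC", "Not mentioned", None),
          ("CAMERA", "Mentioned", "We access your camera to scan codes.")])
save_run(conn, "com.example.app", 80.0, [("NFC", "Mentioned", "We use NFC for payments.")])
history = score_history(conn, "com.example.app")
```

Keeping each run as its own row is what makes it possible to see whether an app's disclosure improves or degrades between versions.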
The model
Full fine-tuning at this scale needs more than 40 GB of VRAM. QLoRA with 4-bit NF4 quantisation brings it inside 8 GB, without giving up classification quality.
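A back-of-envelope calculation shows why the gap is so large. It assumes 32-bit weights, 32-bit gradients, and two 32-bit Adam optimiser states per parameter for full fine-tuning, and it ignores activations, adapter weights, and quantisation constants. These are common rules of thumb, not figures taken from the project.

```python
PARAMS = 2.7e9  # parameter count of the base model


def gigabytes(num_bytes):
    """Convert bytes to decimal gigabytes."""
    return num_bytes / 1e9


# Assumed full fine-tuning cost: 4 B weights + 4 B gradients + 8 B Adam states per parameter.
full_finetune_gb = gigabytes(PARAMS * (4 + 4 + 8))

# Assumed 4-bit storage for the frozen base weights: half a byte per parameter.
nf4_weights_gb = gigabytes(PARAMS * 0.5)
```

Under these assumptions full fine-tuning needs roughly 43 GB before activations are counted, while the 4-bit base weights take about 1.35 GB, leaving the rest of an 8 GB card for the LoRA adapters, their optimiser state, and activations.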
| Parameter | Value |
| --- | --- |
| Base model | MobileLLaMA 2.7B-Chat |
| Technique | QLoRA, 4-bit NF4 |
| LoRA rank / alpha | r = 8, α = 16 |
| Learning rate | 5 × 10⁻⁵ |
| Epochs | 3 (≈ 2.5 h) |
| Dataset | 5,154 examples |
| Loss | 2.91 → 0.34 |
| GPU | RTX 2060 Super, 8 GB |
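As a rough illustration, the settings in the table could be expressed with the Hugging Face `transformers` and `peft` libraries as follows. Only the rank, alpha, quantisation type, learning rate, and epoch count come from the table. The target modules, dropout, compute dtype, batch settings, and output directory are assumptions, and the project's actual training script may differ.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantisation for the frozen base weights.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute
)

# LoRA adapter with the rank and alpha from the table.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,                     # assumption, not stated in the table
    target_modules=["q_proj", "v_proj"],   # assumption: attention projections
    task_type="CAUSAL_LM",
)

# Learning rate and epoch count from the table; everything else is a placeholder.
training_args = TrainingArguments(
    output_dir="privacytotal-qlora",       # hypothetical path
    learning_rate=5e-5,
    num_train_epochs=3,
    per_device_train_batch_size=1,         # assumption, chosen to fit 8 GB
    gradient_accumulation_steps=8,         # assumption
)
```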
Several small-footprint LLMs were evaluated before training began: TinyLLaMA, Mistral 7B, and Phi-2. MobileLLaMA 2.7B won on three criteria: it runs inside 8 GB, it handles legal and technical prose without trailing into hallucination, and it has an open Chat variant suitable for instruction tuning.
Training data was assembled from the PrivacyQA corpus, OPP-115 annotations, and hand-authored permission-style capability templates. Earlier multi-stage adapters over-fit and produced boilerplate; the current adapter was trained afresh, directly on the base checkpoint, against a single classification task.
Privacy Health Score
The Privacy Health Score compresses the gap analysis into a 0–100 metric, weighted by the risk category of each undisclosed permission and the vagueness of the policy text.
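The exact formula is not given here, so the following is only one plausible shape for such a score: each permission carries a risk weight, each verdict a penalty fraction, and the score is the share of the worst-case penalty that was avoided. All weights and names are hypothetical.

```python
# Hypothetical weights: higher-risk permissions cost more when undisclosed.
RISK_WEIGHT = {"high": 3.0, "medium": 2.0, "low": 1.0}

# Hypothetical penalties: vague language costs half as much as no disclosure.
VERDICT_PENALTY = {"disclosed": 0.0, "vague": 0.5, "undisclosed": 1.0}


def privacy_health_score(findings):
    """Score 0-100 from (risk_level, verdict) pairs; 100 means fully disclosed."""
    findings = list(findings)
    worst_case = sum(RISK_WEIGHT[risk] for risk, _ in findings)
    if worst_case == 0:
        return 100.0  # no permissions declared, nothing to penalise
    penalty = sum(RISK_WEIGHT[risk] * VERDICT_PENALTY[verdict]
                  for risk, verdict in findings)
    return round(100.0 * (1 - penalty / worst_case), 1)


example_findings = [
    ("high", "disclosed"),
    ("high", "undisclosed"),
    ("medium", "vague"),
    ("low", "disclosed"),
]
score = privacy_health_score(example_findings)  # penalty 4.0 of a possible 9.0
```

In this example one undisclosed high-risk permission and one vaguely disclosed medium-risk permission bring the score down to 55.6.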
Empirical findings
Across 3,483 permission–policy pairs, roughly 35% of declared Android permissions are either undisclosed or only vaguely disclosed. The pattern is not random: it falls hardest on permissions introduced by specific app features (Bluetooth, notifications, NFC) rather than on core-sensitive ones.
Only 3 of 22 declared NFC instances were disclosed; 81.8% were not mentioned at all. NFC is the least-disclosed permission group in the corpus.
Only 74 of 201 BLUETOOTH-family declarations are addressed in the policy, despite Bluetooth's role in proximity tracking and beaconing.
POST_NOTIFICATIONS (Android 13), VIBRATE, and WAKE_LOCK together form the single largest source of undisclosed declarations. Policy lags platform.
81 of 117 analysed applications had at least one undisclosed permission. Clean disclosure is the exception, not the norm.
The classifier agrees with the keyword oracle on 2,931 of 3,167 adjudicated pairs (92.5%), which makes it safe to deploy as a triage layer ahead of human review.
55% of apps score Low Risk, 18% Critical. Best and worst practices coexist: blanket template reform will not reach the worst offenders.
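The raw counts quoted in these findings convert to percentages as follows. The helper name is illustrative; the counts are the ones stated above.

```python
def share(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100.0 * part / whole, 1)


nfc_disclosed = share(3, 22)            # NFC declarations disclosed
nfc_not_mentioned = share(18, 22)       # matches the 81.8% quoted above
bluetooth_addressed = share(74, 201)    # BLUETOOTH-family declarations addressed
apps_with_gap = share(81, 117)          # apps with at least one undisclosed permission
oracle_agreement = share(2931, 3167)    # classifier agreement with the keyword oracle
```

These work out to 13.6%, 81.8%, 36.8%, 69.2%, and 92.5% respectively.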
Breakdown
Same 3,483-pair corpus, split two ways. Left: what share of each permission category ever makes it into the policy. Right: where the 100 scored apps land on the Privacy Health Score.
Built with
No cloud APIs, no proprietary dependencies. The full stack is reproducible on a single consumer GPU with off-the-shelf Python libraries.
The papers
The four formal write-ups behind the project. Click through for the full text rendered in the site's reader, or grab the original PDFs.
Background research carried out before the project began — a survey of the legal, technical, and policy landscape that motivated PrivacyTotal. Separate from the final Research Report; this is the groundwork that shaped the project's scope.
Read preliminary research
The opening proposal: problem framing, motivation from the 2025 RTÉ Prime Time disclosure, literature review, proposed system architecture, deliverables, technical constraints, and sources.
Read specification
End-to-end engineering report: system design, APK acquisition pipeline, model training methodology, inference fixes, web application, database design, and a full testing and evaluation chapter.
Read report
Empirical study of 117 Play Store apps: corpus construction, permission-level coverage statistics, Privacy Health Score distribution, and case studies. Published in full on site.
Read research
Correspondence
PrivacyTotal is a Final Year Project, so the work is ongoing and feedback is welcome. Reach out by email or LinkedIn.