Right now, Android users can choose from over 2.7 million apps in the Google Play Store. That’s a daunting number for any privacy researcher who wants to investigate whether apps comply with privacy laws.
Privacy researchers, fear not. There’s a new tool in town, and it’s revealed some pretty eye-opening findings about the state of privacy for Android apps.
A team of researchers from Carnegie Mellon University and Fordham University recently created the Mobile App Privacy System (MAPS), a tool that uses natural language processing, machine learning, and code analysis to identify potential privacy compliance issues by inspecting apps’ privacy policies and code.
The researchers tested the tool on over a million Android apps, and presented their findings at last month’s Privacy Enhancing Technologies Symposium in Stockholm, Sweden.
“The sheer number of apps in app stores, combined with their complexity and all of the different third-party interfaces they may use, make it impossible for regulators to systematically look for privacy compliance issues,” says CyLab’s Norman Sadeh, a professor in the Institute for Software Research and the principal investigator on the study. “This tool provides a system for systematically identifying potential privacy issues at scale, and can be customized to help app store operators or regulators focus on issues relevant to specific privacy regulations.”
The tool also allows users to filter privacy results to focus on, for example, apps with more than a threshold number of downloads, specific categories of apps, or particular types of potential compliance issues.
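To make that kind of filtering concrete, here is a minimal Python sketch. It is not MAPS's actual interface: the `AppResult` record, its field names, the `filter_results` function, and the sample data are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AppResult:
    # Hypothetical record; these fields are illustrative, not MAPS's schema.
    name: str
    category: str
    downloads: int
    issues: list = field(default_factory=list)


def filter_results(results, min_downloads=0, categories=None, issue_types=None):
    """Keep only results that satisfy every criterion that was supplied."""
    selected = []
    for r in results:
        if r.downloads < min_downloads:
            continue  # below the download threshold
        if categories is not None and r.category not in categories:
            continue  # not in a category of interest
        if issue_types is not None and not set(r.issues) & set(issue_types):
            continue  # none of the requested issue types flagged
        selected.append(r)
    return selected


# Made-up sample data, for illustration only.
sample = [
    AppResult("app_a", "games", 5_000_000, ["location_sharing_undisclosed"]),
    AppResult("app_b", "health", 20_000, ["location_sharing_undisclosed"]),
    AppResult("app_c", "games", 2_000_000, ["no_privacy_policy"]),
    AppResult("app_d", "finance", 9_000_000, []),
]

popular_location_issues = filter_results(
    sample,
    min_downloads=1_000_000,
    issue_types={"location_sharing_undisclosed"},
)
print([r.name for r in popular_location_issues])
```

Leaving a criterion at its default skips that check, so the same function can narrow results by downloads alone, by category alone, or by any combination.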
To assess the state of privacy for Android apps, the team used MAPS to analyze 1,039,003 apps downloaded from the Google Play Store. The tool took about three weeks to complete the analysis, working at a blazing average rate of 2,023 apps per hour.
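As a quick sanity check, the reported app count and hourly rate can be converted into a duration. The short Python sketch below uses only the two figures quoted above and assumes the tool ran around the clock; the variable names are illustrative.

```python
# Figures quoted in the article.
apps_analyzed = 1_039_003
apps_per_hour = 2_023

# Assumes continuous, round-the-clock operation.
hours = apps_analyzed / apps_per_hour
days = hours / 24
weeks = days / 7

print(f"{hours:.0f} hours, {days:.1f} days, {weeks:.1f} weeks")
# prints "514 hours, 21.4 days, 3.1 weeks"
```

The result, roughly 21 days, is consistent with the "about three weeks" figure.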
“When policies do exist, many seem to inaccurately portray the practices performed by the app,” says Sadeh. “For example, 12 percent of apps’ policies did not seem to accurately describe how the app is handling your location data.”
Sadeh cautioned that all these results require further manual vetting, because not all potential compliance issues are necessarily actual violations. For instance, code that may seem to be sharing sensitive information with third parties may not actually be executed.
“For practical reasons, we were only able to fully vet a tiny fraction of our results, but many of those results that were checked proved to correspond to actual compliance issues,” says Sadeh. “In particular, the tool was used as part of a project with a large European electronics manufacturer to check several of their mobile apps for compliance with the European General Data Protection Regulation (GDPR).”
On average, the researchers found about 3 potential privacy compliance issues per app. They also found that while newer apps were more likely to have privacy policies, they also had more potential issues than older apps.
“Overall, we found that Google’s efforts to push developers to post privacy policies may not be enough,” says Sadeh. “Developers may not be able or willing to adequately describe their apps’ behaviors without proper tools and incentives.”
Sadeh further notes that this particular study was conducted just before the GDPR took effect. Under GDPR, companies are subject to more stringent disclosure requirements and are also facing steeper penalties for not complying.
“One can hope that with GDPR, the number of compliance issues will diminish over time,” Sadeh says. “At the same time, our research as well as that of others suggests that many app developers simply lack the sophistication and the resources necessary to be fully compliant. This is an area where, in my view, app store operators should be more proactive and provide additional support to app developers.”
Other researchers on the study included former CMU computer science postdoctoral associate Sebastian Zimmeck; graduate students Peter Story, Daniel Smullen, Abhilasha Ravichander, and Ziqi Wang; and Fordham University law faculty members Joel Reidenberg and N. Cameron Russell.