
CyLab News

How CyLab Researchers are Protecting Consumers This Shopping Season: Distilling Lengthy Privacy Policies

posted by Daniel Tkacik
December 17, 2015

Consumers spent nearly $3 billion on Cyber Monday this year, and the holiday online shopping frenzy will continue through the end of the year. With these purchases, financial data and other personal information are shared with thousands of websites, each with its own privacy policy explaining how that data is used. But do consumers actually understand these privacy policies? Recent studies suggest the answer is no, but it’s not necessarily the fault of the consumer.

A few years ago, a study led by CyLab’s Lorrie Cranor concluded that the average website privacy policy is around 2,500 words long and would take an average user 10 minutes to read. The authors of the study went on to say that reading every privacy policy a single person encounters over the course of one year would take over 600 hours, or 76 eight-hour workdays.

For years, Cranor has been working on ways to distill lengthy privacy policies down to their essentials so users can quickly and easily understand what they are being asked to agree to when visiting a website or signing up for an online account. One recent result of this work is a tool under development called “PrivacyPal,” a browser extension that tells users in real time which parts of a website’s privacy policy are relevant to them.

“If a user enters their name or their email address on a website registration page, PrivacyPal will pop up and say whether or not the website will sell that information, or whether or not the user may opt out from letting the website do that,” says Jonathan Liao, a graduate student in the Institute for Software Research (ISR) in the School of Computer Science, and one of five students who worked on PrivacyPal for a class project.

“The major problem that a lot of us have faced is that privacy policies are too long, too convoluted, and very tough to understand,” says ISR graduate student Arnab Kumar, another member of the student team. “Because of that, it’s very important that when a person is entering personal information online that they understand how it is being used.”

Currently, PrivacyPal’s process of “distilling” relevant information from websites’ privacy policies is manual: the researchers have to read each full policy and pull out the information they know is relevant to that particular website. In the future, the team hopes to rely on machine learning to fully automate the process, so the tool can be used with any website that has a privacy policy.
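To make the manual approach concrete, here is a minimal sketch of how hand-curated policy facts might be stored and looked up when a user types into a form field. It is not PrivacyPal’s actual code: the names PolicyFact, CURATED, and notice_for, the site shop.example, and the policy values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyFact:
    """One hand-extracted fact about how a site treats a type of data."""
    sells_data: bool   # the policy says this data may be sold
    can_opt_out: bool  # the policy offers an opt-out for that sale


# Hypothetical, manually curated entries keyed by (site, form field).
# A researcher would fill these in after reading each full policy.
CURATED = {
    ("shop.example", "email"): PolicyFact(sells_data=True, can_opt_out=True),
    ("shop.example", "name"): PolicyFact(sells_data=False, can_opt_out=False),
}


def notice_for(site: str, field: str) -> Optional[str]:
    """Return a short notice for the field, or None if the site is not curated."""
    fact = CURATED.get((site, field))
    if fact is None:
        return None  # no manual summary exists for this site and field
    if not fact.sells_data:
        return f"This site does not sell your {field}."
    opt_out = "You can opt out." if fact.can_opt_out else "No opt-out is offered."
    return f"This site may sell your {field}. {opt_out}"


if __name__ == "__main__":
    print(notice_for("shop.example", "email"))
    print(notice_for("unknown.example", "email"))
```

The lookup returns nothing for sites no one has read yet, which is the limitation the team hopes machine learning will remove.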

“We want to make everyone privacy-aware one day,” says Kumar. “That’s our main goal.”

Other members of the PrivacyPal team include ISR graduate students Zheng Zong, Harishma Dayanidhi, and Vijay Kumar Kalanji Sakharam.
