CyLab researchers set to present their work at PETS 2024
Michael Cunningham
Jul 5, 2024
Carnegie Mellon University faculty members and students will present their research at the 2024 Privacy Enhancing Technologies Symposium (PETS).
Held annually, the 24th PETS will be a hybrid event taking place July 15-20, with a physical gathering held in Bristol, UK, and a concurrent virtual event.
PETS brings together privacy experts from around the world to discuss recent advances and new perspectives on research in privacy technologies. PETS addresses the design and realization of privacy services for the Internet and other digital systems and communication networks.
Here, we’ve compiled a list of papers co-authored by CyLab researchers that will be presented at the event.
MAPLE: MArkov Process Leakage attacks on Encrypted Search
Authors: Seny Kamara, MongoDB & Brown University; Abdelkarim Kati, Mohammed VI Polytechnic University; Tarik Moataz, MongoDB; Jamie DeMaria, Brown University; Andrew Park, Carnegie Mellon University; and Amos Treiber, Rohde & Schwarz Cybersecurity GmbH
Abstract: Encrypted search algorithms (ESAs) enable private search on encrypted data and can be constructed from a variety of cryptographic primitives. All known sub-linear ESAs leak information and, therefore, the design of leakage attacks is an important way to ascertain whether a given leakage profile is exploitable in practice. Recently, Oya and Kerschbaum (USENIX ’22) presented an attack called IHOP that targets the query equality pattern—which reveals if and when two queries are for the same keyword—of a sequence of dependent queries.
In this work, we continue the study of query equality leakage on dependent queries and present two new attacks in this setting which can work either as known-distribution or known-sample attacks. They model query distributions as Markov processes and leverage insights and techniques from stochastic processes and machine learning. We implement our attacks and evaluate them on real-world query logs. Our experiments show that they outperform the state-of-the-art in most settings but also have limitations in practical settings.
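To make the leakage concrete, here is a minimal Python sketch (not the authors' implementation) of a Markov-process query model and of the query equality pattern such leakage reveals; all keywords and transition probabilities below are illustrative.

```python
import random

# Toy Markov process over keywords; transition probabilities are
# purely illustrative, not drawn from any real query log.
TRANSITIONS = {
    "alice": {"alice": 0.5, "bob": 0.3, "carol": 0.2},
    "bob":   {"alice": 0.2, "bob": 0.6, "carol": 0.2},
    "carol": {"alice": 0.4, "bob": 0.1, "carol": 0.5},
}

def sample_queries(n, start="alice", seed=0):
    """Sample a sequence of dependent queries from the Markov chain."""
    rng = random.Random(seed)
    seq, cur = [], start
    for _ in range(n):
        seq.append(cur)
        keywords, weights = zip(*TRANSITIONS[cur].items())
        cur = rng.choices(keywords, weights=weights)[0]
    return seq

def query_equality_pattern(queries):
    """The query equality pattern: an observer learns which positions
    repeat the same (hidden) keyword, but not the keyword itself."""
    ids, pattern = {}, []
    for q in queries:
        ids.setdefault(q, len(ids))
        pattern.append(ids[q])
    return pattern

print(query_equality_pattern(sample_queries(10)))
# e.g. [0, 0, 1, 0, 2, ...] -- repeated IDs mark equal queries
```

An attacker who knows (or can estimate) the transition structure can try to match the observed equality pattern back to likely keywords, which is the setting the paper's attacks formalize.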
Exploring the Privacy Experiences of Closeted Users of Online Dating Services in the US
Authors: Elijah Bouma-Sims, Sanjnah Ananda Kumar, and Lorrie Cranor, Carnegie Mellon University
Abstract: Online dating services present significant privacy risks, especially for LGBTQ+ people who are "in the closet" and have not shared their LGBTQ+ identity with others. We conducted a survey (n = 114) and nine follow-up interviews with US-based, closeted users of online dating services, focused on their privacy experiences. We found that participants in the study were strongly concerned about the risk of being seen by social relations and about institutional data sharing practices like targeted advertising. Participants experienced a range of privacy and safety harms, including inadvertent outing, unauthorized saving and sharing of photos, extortion, and harassment. To protect their privacy, participants typically limited the amount of information and the photos they included in their profile. In order to improve their privacy experience, participants requested better profile visibility controls, limits on the ability of others to download or screenshot their photos, better user verification, and free access to premium privacy features.
Data Safety vs. App Privacy: Comparing the Usability of Android and iOS Privacy Labels
Authors: Yanzi Lin, Wellesley College; Jaideep Juneja, Carnegie Mellon University; Eleanor Birrell, Pomona College; and Lorrie Cranor, Carnegie Mellon University
Abstract: Privacy labels—standardized, compact representations of data collection and data use practices—are often presented as a solution to the shortcomings of privacy policies. Apple introduced mandatory privacy labels for apps in its App Store in December 2020; Google introduced mandatory labels for Android apps in July 2022. iOS app privacy labels have been evaluated and critiqued in prior work. In this work, we evaluated Android Data Safety Labels and explored how differences between the two label designs impact user comprehension and label utility. We conducted a between-subjects, semi-structured interview study with 12 Android users and 12 iOS users. While some users found Android Data Safety Labels informative and helpful, other users found them too vague. Compared to iOS App Privacy Labels, Android users found the distinction between data collection groups more intuitive and found explicit inclusion of omitted data collection groups more salient. However, some users expressed skepticism regarding elided information about collected data type categories. Most users missed critical information due to not expanding the accordion interface, and they were surprised by collection practices excluded from Android’s definitions. Our findings also revealed that Android users generally appreciated information about security practices included in the labels, and iOS users wanted that information added.
NOTRY: Deniable messaging with retroactive avowal
Authors: Faxing Wang, University of Melbourne; Shaanan Cohney, University of Melbourne; Riad Wahby, Carnegie Mellon University; and Joseph Bonneau, a16z crypto research
Abstract: Modern secure messaging protocols typically aim to provide deniability. Achieving this requires that convincing cryptographic transcripts can be forged without the involvement of genuine users. In this work, we observe that parties may wish to revoke deniability and avow a conversation after it has taken place. We propose a new protocol called Not-on-the-Record-Yet (NOTRY) which enables users to prove a prior conversation transcript is genuine. As a key building block we propose avowable designated verifier proofs, which may be of independent interest. Our implementation incurs roughly 8× communication and computation overhead over the standard Signal protocol during regular operation. We find it is nonetheless deployable in a realistic setting, as key exchanges (the source of the overhead) still complete in just over 1 ms on a modern computer. The avowal protocol imposes only constant computation and communication costs on the communicating parties, while the verifier's cost scales linearly in the number of messages avowed—in the tens of milliseconds per avowal.
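For intuition about the deniability that NOTRY preserves, recall why MAC-authenticated messaging (the approach used in Signal-style protocols) is deniable: both parties hold the shared authentication key, so either one could have forged the transcript. The minimal sketch below illustrates only that baseline property, not NOTRY's avowal mechanism or its designated verifier proofs; the key and message values are illustrative.

```python
import hmac
import hashlib

# Both parties derive the same shared session key (value illustrative),
# so either of them could have produced any tag in the transcript.
key = b"shared session key (illustrative)"
msg = b"meet at noon"
tag = hmac.new(key, msg, hashlib.sha256).digest()

# The recipient can verify the tag, but cannot convince a third party
# who authored the message: the recipient could have forged it too.
assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())
```

NOTRY's contribution is letting the parties later opt out of this ambiguity and prove a transcript genuine, without weakening deniability for conversations that are never avowed.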
Crumbling Cookie Categories: Deconstructing Common Cookie Categories to Create Categories that People Understand
Authors: Soha Jiwani, Rachna Sasheendran, Adhishree Abhyankar, Elijah Bouma-Sims, and Lorrie Cranor, Carnegie Mellon University
Abstract: Users of online services often encounter cookie banners that ask them to consent to different categories of cookies. Frequently, these categories are labeled using the four categories defined by the 2012 Cookie Guide from the UK’s International Chamber of Commerce (ICC). However, prior research suggests that users have difficulty understanding what these category labels actually mean. We conducted a four-part study to identify labels that more intuitively convey the four cookie categories. First, we crowdsourced new category labels. We then evaluated users’ comprehension and sentiment towards the labels in a series of surveys focused on definitions and hypothetical scenarios. Finally, we selected a new slate of category labels based on the results of the prior surveys, and conducted a between-subjects, online behavioral experiment to compare the new slate with the original labels. We ultimately recommend that the industry adopt the category label “anonymous analytics cookies” in lieu of “performance cookies,” and “extra functionality cookies” in lieu of “functional cookies.” Adopting our recommended terms would improve the usability of both current cookie consent interfaces and any future privacy consent mechanisms that use the same categorization. We also recommend revisiting the categories themselves, as the distinctions between them do not seem to be well understood and may not reflect useful distinctions for privacy decision making.
MicroSecAgg: Streamlined Single-Server Secure Aggregation
Authors: Yue Guo, J.P. Morgan AI Research; Antigoni Polychroniadou, J.P. Morgan AI Research; Elaine Shi, Carnegie Mellon University; David Byrd, Bowdoin College; and Tucker Balch, J.P. Morgan AI Research
Abstract: This work introduces MicroSecAgg, a framework that addresses the intricacies of secure aggregation in the single-server landscape, specifically tailored to situations where distributed trust among multiple non-colluding servers presents challenges. Our protocols are purpose-built to handle situations featuring multiple successive aggregation phases among a dynamic pool of clients who can drop out during the aggregation. Our different protocols thrive in three distinct cases: firstly, secure aggregation within a small input domain; secondly, secure aggregation within a large input domain; and finally, facilitating federated learning for the cases where moderately sized models are considered. Compared to the prior works of Bonawitz et al. (CCS 2017), Bell et al. (CCS 2020), and the recent work of Ma et al. (S&P 2023), our approach significantly reduces the overheads. In particular, MicroSecAgg halves the round complexity to just 3 rounds, thereby offering substantial improvements in communication cost efficiency. Notably, it outperforms Ma et al. by a factor of n on the user side, where n represents the number of users. Furthermore, in MicroSecAgg the computation complexity of each aggregation per user exhibits a logarithmic growth with respect to n, contrasting with the linearithmic or quadratic growth observed in Ma et al. and Bonawitz et al., respectively. We also require linear (in n) computation work from the server as opposed to quadratic in Bonawitz et al., or linearithmic in Ma et al. and Bell et al. In the realm of federated learning, a delicate tradeoff comes into play: our protocols shine brighter as the number of participating parties increases, yet they exhibit diminishing computational efficiency as the sheer volume of weights/parameters increases significantly.
We report an implementation of our system and compare the performance against prior works, demonstrating that MicroSecAgg significantly reduces the computational burden and the message size.
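As background, single-server secure aggregation protocols in this line of work (e.g., Bonawitz et al.) build on pairwise additive masking: the server sees only masked inputs, and the masks cancel in the sum. The Python sketch below illustrates that general idea only; it is not MicroSecAgg's protocol, and it omits the key agreement, dropout recovery, and the optimizations that distinguish MicroSecAgg.

```python
import random

MOD = 2**32  # aggregate in a finite group so masks wrap around

def pairwise_masks(user_ids, dim, seed=0):
    """Each pair (i, j) with i < j shares a random mask vector; i adds
    it and j subtracts it, so all masks cancel in the global sum."""
    rng = random.Random(seed)
    return {
        (a, b): [rng.randrange(MOD) for _ in range(dim)]
        for a in user_ids for b in user_ids if a < b
    }

def mask_input(uid, x, masks):
    """Mask one user's input vector with all masks it participates in."""
    y = list(x)
    for (a, b), m in masks.items():
        if uid == a:
            y = [(yi + mi) % MOD for yi, mi in zip(y, m)]
        elif uid == b:
            y = [(yi - mi) % MOD for yi, mi in zip(y, m)]
    return y

users = [1, 2, 3]
inputs = {1: [5, 1], 2: [2, 2], 3: [7, 0]}
masks = pairwise_masks(users, dim=2)
masked = {u: mask_input(u, inputs[u], masks) for u in users}

# The server sees only masked vectors; their sum equals the true sum.
total = [sum(col) % MOD for col in zip(*masked.values())]
print(total)  # [14, 3]
```

In deployed protocols the pairwise masks are derived from key exchange rather than shared directly, and handling users who drop out mid-protocol is the hard part; reducing those costs is where MicroSecAgg claims its improvements.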
What Do Privacy Advertisements Communicate to Consumers?
Authors: Xiaoxin Shen, Carnegie Mellon University; Eman Alashwali, King Abdullah University of Science and Technology (KAUST), King Abdulaziz University (KAU), and Carnegie Mellon University; and Lorrie Cranor, Carnegie Mellon University
Abstract: When companies release marketing materials aimed at promoting their privacy practices or highlighting specific privacy features, what do they actually communicate to consumers? In this paper, we explore the impact of privacy marketing materials on: (1) consumers' attitude towards the organizations providing the campaigns, (2) overall privacy awareness, and (3) the actionability of suggested privacy advice. To this end, we investigated the impact of four privacy advertising videos and one privacy game published by five different technology companies. We conducted 24 semi-structured interviews with participants randomly assigned to view one or two of the videos or play the game. Our findings suggest that awareness of privacy features can contribute to positive perceptions of a company or its products. The ads we tested were more successful in communicating the advertised privacy features than the game we tested. We observed that advertising a single privacy feature using a single metaphor in a short ad increased awareness of the advertised feature. The game failed to communicate privacy features or motivate study participants to use the features. Our results also suggest that privacy campaigns can be useful for raising awareness about privacy features and improving brand image, but may not be the most effective way to teach viewers how to use privacy features.
Automatic generation of web censorship probe lists
Authors: Jenny Tang, Carnegie Mellon University; Leo Alvarez, EPFL and Carnegie Mellon University; Arjun Brar, Carnegie Mellon University; Nguyen Phong Hoang, University of Chicago; and Nicolas Christin, Carnegie Mellon University
Abstract: Domain probe lists—used to determine which URLs to probe for Web censorship—play a critical role in Internet censorship measurement studies. Indeed, the size and accuracy of the domain probe list limit the set of censored pages that can be detected; inaccurate lists can lead to an incomplete view of the censorship landscape or biased results. Previous efforts to generate domain probe lists have been mostly manual or crowdsourced. This approach is time-consuming, prone to errors, and does not scale well to the ever-changing censorship landscape.
In this paper, we explore methods for automatically generating probe lists that are both comprehensive and up-to-date for Web censorship measurement. We start from an initial set of 139,957 unique URLs from various existing test lists consisting of pages from a variety of languages to generate new candidate pages. By analyzing content from these URLs (i.e., performing topic and keyword extraction), expanding these topics, and using them as a feed to search engines, our method produces 119,255 new URLs across 35,147 domains. We then test the new candidate pages by attempting to access each URL from servers in eleven different global locations over a span of four months to check for their connectivity and potential signs of censorship. Our measurements reveal that our method discovered over 1,400 domains—not present in the original dataset—that we suspect to be blocked. In short, automatically updating probe lists is possible, and can help further automate censorship measurements at scale.
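A rough Python sketch of such a pipeline is below. It is only an illustration of the extract-expand-probe loop the abstract describes: the search step is a caller-supplied function (a hypothetical interface, since the paper's search-engine integration is not shown here), and the keyword extraction is a naive stand-in for the proper topic and keyword extraction the authors use.

```python
import re
from collections import Counter

import requests  # assumed available

def extract_keywords(html, top_k=5):
    """Naive keyword extraction: counts frequent long words as a
    stand-in for the paper's topic and keyword extraction step."""
    words = re.findall(r"[a-zA-Z]{6,}", html.lower())
    return [w for w, _ in Counter(words).most_common(top_k)]

def expand(keywords, search):
    """Feed keywords to a search engine to collect candidate URLs.
    `search` is a caller-supplied function (hypothetical interface)."""
    candidates = set()
    for kw in keywords:
        candidates.update(search(kw))
    return candidates

def probe(url, timeout=10):
    """Fetch one candidate URL from this vantage point; a real study
    compares results across vantage points to flag likely blocking."""
    try:
        resp = requests.get(url, timeout=timeout)
        return {"url": url, "status": resp.status_code}
    except requests.RequestException as e:
        return {"url": url, "error": type(e).__name__}

# Usage (hypothetical search backend `my_search`):
#   seed_html = requests.get(seed_url).text
#   candidates = expand(extract_keywords(seed_html), search=my_search)
#   results = [probe(u) for u in candidates]
```

Running probes like these from many vantage points over time, and diffing the outcomes, is what lets the method flag candidate domains as potentially censored.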