
Statistical Methodology and Disclosure Limitation

Researcher: Stephen Fienberg



Confidentiality and privacy are widely regarded as ethical matters, and they bear on every facet of the transmission and storage of information on computer networks and systems. While many believe that encryption and other physical safeguards offer the prospect of privacy protection, we believe that the issue must also be addressed in other ways, largely because of the unanticipated development of possible "intrusion engines" and data mining approaches to the breaking of codes and identification.

The CMU researchers working on disclosure limitation are part of a team assembled by the National Institute of Statistical Sciences (NISS) to develop a Web-based query system. The system applies disclosure limitation methods sequentially in response to a series of statistical queries, on the understanding that public knowledge of the releases is cumulative.

As part of the center's research program, we propose to extend the approaches to simulation and bounds developed in the literature so that they apply to databases of the diverse sorts stored on and transmitted over computer networks. We plan to investigate further the technologies for reconstructing statistical databases from a set of partial data releases that may involve marginal distributions from the full data set, or possibly conditional distributions. The idea is to provide usable information while at the same time preserving privacy and confidentiality. Developing statistical methods that scale to large databases and that run in real time is essential to the long-term success of computer network security efforts.
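
To make the idea of bounds concrete, here is a minimal sketch of the simplest case: the classical Fréchet bounds on the cells of a two-way table of counts when only its row and column totals have been released. It is an illustration under stated assumptions, not the project's own method. The function names `frechet_bounds` and `risky_cells` and the threshold `min_width` are hypothetical, chosen only for this example.

```python
from itertools import product


def frechet_bounds(row_totals, col_totals):
    """Return {(i, j): (lower, upper)} for each cell of a two-way count table.

    Given only the released margins, cell (i, j) must satisfy
    max(0, r_i + c_j - n) <= count <= min(r_i, c_j),
    where n is the grand total.
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    bounds = {}
    for i, j in product(range(len(row_totals)), range(len(col_totals))):
        r, c = row_totals[i], col_totals[j]
        bounds[(i, j)] = (max(0, r + c - n), min(r, c))
    return bounds


def risky_cells(bounds, min_width=3):
    """List cells whose bounds are narrower than min_width.

    min_width is a hypothetical policy threshold, not a published standard.
    """
    return [cell for cell, (lo, hi) in bounds.items() if hi - lo < min_width]


if __name__ == "__main__":
    # Hypothetical released margins for a 2 x 2 table of 20 records.
    released = frechet_bounds([3, 17], [2, 18])
    for cell, (lo, hi) in released.items():
        print(cell, lo, hi)
    print("cells pinned down too tightly:", risky_cells(released))
```

With these hypothetical margins every cell is confined to an interval of width 2, so an intruder who sees only the two sets of totals already knows each count to within two records. A query system of the kind described above would need to recompute such bounds over everything released so far before answering the next query. Tables with more dimensions and overlapping margins require more elaborate bounds than this two-way formula.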

NSF funded Digital Government