Misspecification and Uncertainty Quantification in Differential Privacy
Differential privacy (DP) offers privacy guarantees ensuring that even arbitrarily powerful adversaries gain relatively little knowledge from sanitized statistical outputs, and DP's methodological transparency makes it possible to understand how privacy-preserving errors affect downstream inferences. Both properties are laudable, but both are also embedded in a deeper set of statistical assumptions about models for disclosure risk and for privacy-preserving inference. In much of DP research, these modeling choices are intentionally abstracted away to focus on mechanism design; in practice, however, we need to be more attentive to these assumptions and to the consequences of misspecifying them. Thankfully, neither of these problems is methodologically new: the literatures on statistical disclosure limitation, semiparametric and missing-data theory, and measurement error have confronted similar issues in numerous contexts. This "big picture" talk articulates connections between these literatures and modern research in DP. In doing so, we outline research questions and methodological approaches for productive work at the intersection of statistics and computer science.

[Note to organizers: I'm not sure what the appropriate scope is for the longer talks, so I'm happy to treat one of these problems in more technical detail or to keep the big-picture approach. I'm also happy to narrow the scope to focus only on our more recent results.]