An Introduction to Disclosure Risk Assessment
Household surveys, needs assessments and other forms of microdata make up an increasingly significant share of the data in the humanitarian sector. This type of data is critical to determining the needs and perspectives of people affected by crises, but it also presents unique risks. Understanding how to assess and manage the sensitivity of this data is essential to ensuring its safe, ethical, and effective use in different response contexts.
The learning path can be completed in under one hour and includes:
1. An overview of Statistical Disclosure Control;
2. A series of explanatory videos on the steps of the risk assessment process; and
3. An optional technical tutorial demonstrating how to perform the assessment using sdcMicro.
Why It's Important
Humanitarian actors acknowledge the sensitivity of personal data.
Within the humanitarian sector, personal data such as names, biometric data, or ID numbers are understood to be sensitive. This data should be anonymized, as a matter of standard practice, before being shared. However, even after removing the direct identifiers, it may still be possible to re-identify respondents.
After removing direct identifiers, re-identification may still be possible.
By combining different data points, it may be possible to re-identify individuals or disclose confidential information. Humanitarians can apply Statistical Disclosure Control to microdata to help detect and reduce this type of risk.
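To make the idea of combining data points concrete, the sketch below counts how many records share each combination of indirectly identifying variables and flags the records whose combination occurs fewer than k times. It is a minimal illustration in plain Python and does not use the sdcMicro API. The records, the column names (`district`, `age_group`, `sex`) and the threshold `k=2` are all assumptions invented for this example. The terms "key variables" and "k" come from general disclosure control practice, not from the text above.

```python
from collections import Counter

# Toy records invented for illustration; the column names are hypothetical.
records = [
    {"district": "North", "age_group": "18-29", "sex": "F"},
    {"district": "North", "age_group": "18-29", "sex": "F"},
    {"district": "North", "age_group": "60+", "sex": "M"},
    {"district": "South", "age_group": "30-59", "sex": "F"},
    {"district": "South", "age_group": "30-59", "sex": "F"},
    {"district": "South", "age_group": "30-59", "sex": "M"},
]

# Variables that are not direct identifiers but may identify someone in combination.
KEY_VARIABLES = ("district", "age_group", "sex")


def key_of(record, key_variables=KEY_VARIABLES):
    """Return the combination of key-variable values for one record."""
    return tuple(record[name] for name in key_variables)


def combination_counts(rows, key_variables=KEY_VARIABLES):
    """Count how many records share each combination of key-variable values."""
    return Counter(key_of(row, key_variables) for row in rows)


def records_below_k(rows, k, key_variables=KEY_VARIABLES):
    """Return the records whose combination is shared by fewer than k records."""
    counts = combination_counts(rows, key_variables)
    return [row for row in rows if counts[key_of(row, key_variables)] < k]


at_risk = records_below_k(records, k=2)
print(f"{len(at_risk)} of {len(records)} records fall below k=2")
# prints "2 of 6 records fall below k=2"
```

In this toy dataset, the only man aged 60+ in the North and the only man aged 30-59 in the South are unique on the chosen variables, so anyone who knows those three facts about them could single out their records even though no names or ID numbers are present.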
Finding ways to quickly and safely share microdata is essential.
During emergencies, microdata needs to be shared with partners as quickly and safely as possible. Having processes and tools in place to consistently assess and reduce the disclosure risk of this data enables organizations to share data in a safe, ethical and effective way.
The Stages of Statistical Disclosure Control
Limiting the risk of disclosure using statistical disclosure control techniques involves three distinct stages: assessing the disclosure risk of your data, taking steps to limit that risk, and quantifying the resulting information loss. Because applying disclosure control techniques will result in information loss, the final stage is what allows you to strike a balance between utility and risk in your data.
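The three stages can be sketched end to end on a toy example. The code below is an illustration under stated assumptions, not the procedure used by sdcMicro or by any particular organization. The data are invented and the age bands are arbitrary. The risk measure (share of records that are unique) and the information-loss measure (share of distinct age values lost) are deliberately crude stand-ins chosen for readability.

```python
from collections import Counter

# Toy data invented for illustration: (age in years, district) per respondent.
original = [
    (19, "North"), (23, "North"), (24, "North"),
    (41, "South"), (44, "South"), (67, "South"),
]


def share_unique(rows):
    """Share of records whose combination of values appears exactly once."""
    counts = Counter(rows)
    return sum(1 for row in rows if counts[row] == 1) / len(rows)


def recode_age(age):
    """Replace an exact age with a broad band (hypothetical cut-offs)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60+"


# Stage 1: assess the disclosure risk of the data as collected.
risk_before = share_unique(original)

# Stage 2: apply a disclosure control technique, here recoding age into bands.
protected = [(recode_age(age), district) for age, district in original]
risk_after = share_unique(protected)

# Stage 3: quantify the information lost, here as the share of distinct
# age values that no longer exist after recoding.
distinct_before = len({age for age, _ in original})
distinct_after = len({age for age, _ in protected})
information_loss = 1 - distinct_after / distinct_before

print(f"share of unique records: {risk_before:.2f} -> {risk_after:.2f}")
print(f"distinct age values: {distinct_before} -> {distinct_after}")
print(f"information loss (crude proxy): {information_loss:.2f}")
```

Recoding reduces the share of unique records from all six to one in six, but it also halves the number of distinct age values. That trade-off between lower risk and reduced detail is what the final stage is meant to make visible.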
General Questions
Microdata provides information on a set of variables for each respondent in a dataset, where respondents can be individuals, households or establishments. In humanitarian settings, this type of data is gathered through exercises such as a Multi-Sector Needs Assessment (MSNA), community feedback & perception surveys, and other forms of needs assessments, household surveys, or monitoring activities.
SDC techniques are intended to prevent identity and attribute disclosure. Identity disclosure occurs when it is possible to associate a known individual with a released data record. Attribute disclosure occurs when it is possible to determine some new characteristics of an individual based on the information available in the released data.
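Attribute disclosure can happen even when no single record can be tied to a person. The sketch below, in plain Python with invented data and hypothetical column names (`district`, `age_group`, `needs_medical_aid`), looks for groups of records that share the same indirectly identifying values and all report the same sensitive value. It illustrates the concept only and is not a substitute for a full risk assessment.

```python
from collections import defaultdict

# Toy released data invented for illustration; column names are hypothetical.
released = [
    {"district": "North", "age_group": "18-29", "needs_medical_aid": "yes"},
    {"district": "North", "age_group": "18-29", "needs_medical_aid": "yes"},
    {"district": "North", "age_group": "18-29", "needs_medical_aid": "yes"},
    {"district": "South", "age_group": "30-59", "needs_medical_aid": "yes"},
    {"district": "South", "age_group": "30-59", "needs_medical_aid": "no"},
]


def homogeneous_groups(rows, key_variables, sensitive):
    """Map each key combination to its sensitive value when every record
    in that group reports the same value."""
    values_by_key = defaultdict(set)
    for row in rows:
        key = tuple(row[name] for name in key_variables)
        values_by_key[key].add(row[sensitive])
    return {
        key: next(iter(values))
        for key, values in values_by_key.items()
        if len(values) == 1
    }


exposed = homogeneous_groups(
    released, ("district", "age_group"), "needs_medical_aid"
)
for key, value in exposed.items():
    print(f"Everyone matching {key} has needs_medical_aid = {value}")
```

Here three records share the combination North and 18-29, so none of them is unique. Still, anyone known to be a respondent aged 18-29 from the North can be inferred to need medical aid, because every record in that group says so.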
SDC techniques are not intended to prevent a third type of disclosure – inferential disclosure. Inferential disclosure occurs when it is possible to determine the value of some characteristic of an individual more accurately with the released data than would otherwise have been possible.
Statistical Disclosure Control is a technique used in statistics to assess and limit the risk of re-identifying respondents or disclosing confidential information about them. The first step in this process is a disclosure risk assessment. This tutorial covers the steps required to conduct that risk assessment.