IRB Corner: Understanding Privacy, Confidentiality, and Anonymity

Here in the IRB Office we regularly see descriptions of research projects where researchers appear to confuse the terms privacy, confidentiality, and anonymity. This month in the IRB Corner we will define these important terms and provide background information to help researchers understand why the distinctions between privacy, confidentiality, and anonymity are critical to the protection of human subjects.

What is Privacy?

Privacy is about people and the information they share about themselves. Privacy involves the ability to decide what information people share, how much information is shared, when the information is shared, with whom the information is shared, and the conditions around sharing that personal information. Individuals have the right to control and protect their private information, and individuals decide what information is deemed private.

What is Confidentiality?

Confidentiality is about data and the practice of managing information to protect an individual’s privacy. Confidentiality involves an agreement about how identifiable information will be maintained and who will have access to the information. One benefit of confidentiality is that it helps establish trust between the researcher and research participant. Research participants are more likely to volunteer to participate in research if they trust that the researcher will maintain their private information.

What is Anonymity?

Anonymous data are data that have been collected without any personally identifying information or identifiers. Anonymous data are never linked to an individual.

What is the Difference between Confidential Data and Anonymous Data?

Confidential data are data that have (or have had) identifiers, but the researcher has established a process for protecting an individual’s privacy by managing this information so as to not share the identifying information with anyone. Whereas anonymous data never have identifiers, confidential data have identifiers but are managed in such a way as to protect the privacy of research participants. One way researchers establish confidentiality is to aggregate data. Another way researchers establish confidentiality is to use a coding process to separate identifiers from the data. Any time a researcher has a list of participants with a method of making contact, such as an individual’s name, email address, phone number, or code/link, the researcher has identifiable data and is responsible for establishing a plan to keep the data confidential.

Data cannot be both anonymous and confidential. Data collected without identifiers are considered anonymous. Data that have, or have had, identifiers are confidential and must be maintained in a way to protect the privacy of participants.
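
The coding process mentioned above can be illustrated with a short Python sketch. This is only one possible approach, not a procedure prescribed by the IRB: the field names (name, email, phone) and the helper code_records are hypothetical, and a real study would also need to store the linking key separately under restricted access. Note that the coded data remain confidential, not anonymous, for as long as the linking key exists.

```python
import secrets

# Hypothetical identifier fields; adjust to the study's own instrument.
IDENTIFIER_FIELDS = ("name", "email", "phone")


def code_records(records, identifier_fields=IDENTIFIER_FIELDS):
    """Split each record into a linking-key entry and a coded data row."""
    linking_key = {}  # study code -> identifiers; keep apart from the data
    coded_data = []   # study code + responses only; used for analysis
    for record in records:
        code = secrets.token_hex(4)       # random code, not derived from the person
        while code in linking_key:        # regenerate on the unlikely collision
            code = secrets.token_hex(4)
        linking_key[code] = {
            field: record[field] for field in identifier_fields if field in record
        }
        row = {k: v for k, v in record.items() if k not in identifier_fields}
        row["study_code"] = code
        coded_data.append(row)
    return linking_key, coded_data


if __name__ == "__main__":
    sample = [
        {"name": "A. Example", "email": "a@example.org", "score": 4},
        {"name": "B. Example", "email": "b@example.org", "score": 2},
    ]
    key, data = code_records(sample)
    print(data)  # rows contain only a study code and the responses
```

Destroying the linking key once it is no longer needed is what moves the data toward de-identification; until then, both files fall under the confidentiality plan.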

Maintaining Confidentiality and Protecting Privacy

A critical element of the informed consent process is to describe to the research participant how materials with identifying information will be managed and who will have access to these materials. In other words, the informed consent process stipulates how confidentiality will be maintained, which allows subjects to determine whether the necessary precautions are in place to protect their privacy. Researchers should keep in mind that when identifiers are removed from data, or when a study involves anonymous data, the researcher cannot promise to remove a subject’s data from the study if the subject chooses to withdraw, because the data can no longer be traced back to that individual. So, although participants should be given the opportunity to withdraw from research at any time (i.e., participation should be voluntary), the researcher should clearly communicate how data from the subject will be maintained.

Part of the IRB review is to evaluate whether the researcher’s proposed data management strategy is sufficient to protect the privacy of research participants. The IRB review also assesses whether the stated plans will maintain the confidentiality of the data throughout the proposed research process.

One mechanism for maintaining confidentiality is to collect data anonymously, or without any personal identifiers. In projects where it is not possible to collect data anonymously, data can be de-identified through a process of removing and destroying identifying information. Removing identifiers as soon as possible in the research process is a best practice for reducing potential breaches of confidentiality. The longer a study requires that identifiers be maintained, the greater the potential risk to human subjects and the more protections are required as part of the data security plans. Even if you remove obvious identifiers, such as a code that links data to a recruitment list of names with contact information, you may not have de-identified the data. For example, if you have a small population and you are collecting demographic information, such as gender, race, age, and/or position, the combination of demographic items associated with data may identify a participant. As a researcher develops a data management plan, it is important to remember that there is not one right way to ensure confidentiality. It is important for researchers to consider strategies that best protect participants’ privacy.
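
The small-population concern described above can be checked with a simple count of how many participants share each combination of demographic answers. The Python sketch below is illustrative only: the helper small_cells, the field names, and the default threshold of 5 are assumptions made for this example, not an IRB requirement. Combinations held by very few participants are the ones most likely to identify someone.

```python
from collections import Counter


def small_cells(rows, demographic_fields, min_count=5):
    """Return demographic combinations shared by fewer than min_count rows.

    min_count=5 is an assumed threshold for illustration only.
    """
    counts = Counter(
        tuple(row[field] for field in demographic_fields) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < min_count}


if __name__ == "__main__":
    # Hypothetical survey rows with identifiers already removed.
    rows = [
        {"gender": "F", "position": "Dean"},
        {"gender": "M", "position": "Faculty"},
        {"gender": "M", "position": "Faculty"},
    ]
    # With a threshold of 2, only the single ("F", "Dean") row is flagged.
    print(small_cells(rows, ["gender", "position"], min_count=2))
```

When a combination is flagged, options include collecting fewer demographic items, reporting broader categories (for example, age ranges rather than exact ages), or reporting results only in aggregate.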

Additional Questions about Protecting Privacy and Confidentiality

For more information on confidentiality and data management, please see the Recommendations on Confidentiality and Research Data Protections by the National Human Research Protections Advisory Committee. For more on the University of Phoenix IRB, please see the UOPX Research Hub. The UOPX IRB also maintains a library of guidance materials within IRBNet, under the Forms and Templates tab from the left menu. Additional questions about the role of confidentiality in research can be directed to the IRB Office.