Written by HITRUST Independent Security Journalist Sean Martin.
It’s a noble goal: When healthcare data about patients is used and shared appropriately, there are enormous benefits for both the healthcare industry and society overall, by expanding crucial research, improving health outcomes for patients, preventing the spread of infectious diseases, and lowering the cost of health care. The key challenge is how to achieve these admirable goals while still protecting patient privacy.
One of the most effective means of achieving these broad societal goals while still protecting privacy involves “de-identifying” the data, which requires the masking or removal of all personal identifiers that could be used to identify the individual. To be effective, a de-identification (or “de-ID”) program should protect the patient’s identity, retaining at most a very small risk that an individual could be re-identified. If the de-identification approach isn’t sufficient, the resulting information could be used to re-identify the individual, and could put the individual at risk for various potential harms, including identity theft or reputational damage. At the same time, if too many identifiers are removed, there’s no significant increase in privacy protection, but the data is no longer useful for these societal purposes.
Because of the enormous variety and volume of personal data that can be useful for society, it’s often not easy to de-ID health information in a way that creates this “very small” risk of re-identification. Some claim (although often without meaningful data) that it’s easy to re-ID health data, especially when the data is associated with other information that may be available publicly, such as in voter registries or social networks. Despite these claims, there’s little data to support the idea that properly de-identified data sets realistically can be re-identified. Even so, the concern has given rise to a new field of academic research focused on understanding re-identification attacks on de-identified data.
For the healthcare industry, the HIPAA Privacy Rule defines two methods of de-identifying patient data, says Kimberly Gray, Chief Privacy Officer of IMS Health. The first, called the “Safe Harbor” method, requires the removal of a specific set of direct identifiers and quasi-identifiers. If the data is properly scrubbed to the Safe Harbor requirements, the “healthcare organization has met the requirements of the HIPAA rules – but often the data that remains isn’t very useful for research or other beneficial public purposes.”
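As a rough illustration of what Safe Harbor-style scrubbing can look like in practice, the Python sketch below drops a few direct identifiers and generalizes two quasi-identifiers (birth date to year, ZIP code to its first three digits). The field names, the `scrub_record` function, and the `low_population_zip3` parameter are hypothetical, and the sketch covers only a small subset of the identifier categories listed in the HIPAA rule, so it should not be read as a compliant implementation.

```python
from datetime import date

# Hypothetical field names; a real record schema will differ, and the
# HIPAA Safe Harbor list contains many more identifier categories.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone", "medical_record_number"}


def scrub_record(record, low_population_zip3=frozenset()):
    """Return a copy of `record` with direct identifiers removed and
    selected quasi-identifiers generalized. The input is not modified."""
    # Drop the direct identifiers outright.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Keep only the year of a date field.
    if "birth_date" in out:
        out["birth_year"] = out.pop("birth_date").year

    # Keep only the first three ZIP digits; prefixes flagged by the caller
    # as covering a small population are replaced with "000".
    if "zip" in out:
        zip3 = str(out.pop("zip"))[:3]
        out["zip3"] = "000" if zip3 in low_population_zip3 else zip3

    return out


patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "birth_date": date(1980, 5, 17),
    "zip": "90210",
    "diagnosis": "J45",
}
scrubbed = scrub_record(patient)
```

The scrubbed record keeps the clinical field but only coarse demographic detail, which shows the trade-off described above: the output is safer to share, but a researcher who needs exact dates or locations can no longer use it.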
The second HIPAA approach, Ms. Gray explains, is called the “Expert Determination method.” This approach — also defined by the HIPAA rules — uses scientific, mathematical and statistical methods as evaluated by an appropriate “expert,” to eliminate identifiers in a way that shows that the risk is “very small” that the remaining data can be used to re-identify any individual. Because an “expert” must be involved, the steps needed to meet these Expert Determination requirements can be complicated, and involve both sophisticated statistical analysis and thorough documentation. Read more about the HIPAA approaches to de-identification in “Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule,” from the U.S. Department of Health and Human Services.
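The article does not say which statistical methods an expert would apply. One commonly used measure, shown here purely as an assumption, is the size of the smallest group of records that share the same quasi-identifier values (the idea behind k-anonymity): a record that is unique on those values is the easiest to re-identify. The function names and field names in this Python sketch are hypothetical, and a real Expert Determination would involve much more than this single number.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def max_reid_risk(records, quasi_identifiers):
    """Worst-case risk estimate: 1 divided by the size of the smallest group."""
    if not records:
        raise ValueError("records must not be empty")
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return 1 / min(sizes.values())


# Hypothetical data: three records share a group, one record is unique.
records = [
    {"zip3": "902", "birth_year": 1980},
    {"zip3": "902", "birth_year": 1980},
    {"zip3": "902", "birth_year": 1980},
    {"zip3": "100", "birth_year": 1975},
]
risk_before = max_reid_risk(records, ["zip3", "birth_year"])

# Using fewer quasi-identifiers (or generalizing them) enlarges the groups.
uniform = [{"zip3": "902", "birth_year": 1980} for _ in range(4)]
risk_after = max_reid_risk(uniform, ["zip3", "birth_year"])
```

What counts as a “very small” risk is left to the expert's judgment and documentation; the sketch only shows how such a number might be computed and compared before and after a change to the data.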
Since the passage of the HIPAA rules, there’s been a lot of debate about the de-identification methods. Some people think the Safe Harbor method, although considered easier to implement than the Expert Determination approach, isn’t very useful. Observes Peter Dumont, Senior Director of Privacy at Optum, “Safe Harbor was put in place when the HIPAA Privacy Rule went into effect, around 2003, and the industry now realizes there can be privacy risk in utilizing this less-savvy approach… You could conceptualize sufficient indirect identifiers to create a risk of re-ID. With today’s Big Data analysis techniques, those situations are more likely to occur.” At the same time, he notes, many Safe Harbor data sets aren’t useful for most research purposes, as too much information has been removed to allow for meaningful analysis.
The Expert Determination method generally leads to better data and effective privacy protection, notes IMS Health’s Ms. Gray. At the same time, many in the healthcare industry find the Expert Determination techniques quite complex. Many healthcare organizations do not know how to apply them properly, or how to be sure that their de-ID work will meet the approval of regulators like the U.S. Department of Health and Human Services, the U.S. Federal Trade Commission, or equivalent organizations in other countries.
The answer lies in the HITRUST De-Identification Framework, assembled to help healthcare organizations and third parties in the health industry with the de-ID challenge. The Framework provides a consistent, managed methodology for the de-identification of personal data and the sharing of compliance and risk information amongst entities and their key stakeholders. It’s designed to meet the HIPAA requirements, and also can be used in other situations where the HIPAA rules don’t apply, to provide an overall safe and effective means of utilizing data while still protecting privacy.
The working group behind the HITRUST Framework identified 12 criteria for a successful de-ID program and methodology that can be scaled for use with any organization. Those criteria are divided into two groups:
- The administrative controls that an organization should have in place to govern de-identification.
- How the organization can actually arrive at a de-identified data set, either on an ad hoc basis or by instituting a process that will deliver de-identified data sets.
Healthcare organizations can download the De-Identification Framework free of charge.
The best advice from privacy experts: “Use the HITRUST Framework! De-ID is risky if you don’t understand the ins and outs of it,” says Optum’s Dumont. “There’s 20+ years of re-ID attacks that are obvious minefields. If you repeat those errors, you are putting yourself and your patients at risk.”
IMS Health’s Gray adds that healthcare regulators are also embracing the HITRUST De-Identification Framework, and may use it to help evaluate the efficacy of de-ID programs used by healthcare organizations and third parties. “If I’m a regulator, I can compare against the HITRUST De-ID Framework.” If you’re following the Framework, and implementing it correctly, the regulators should be pleased with your work. “We have discussed this with many U.S. regulators. They are very supportive. That’s good for the organizations doing de-ID on a day-to-day basis: Those who will be judging them are favorable to this.”
HITRUST offers the Framework and certifications to help organizations and individuals develop the expertise to effectively create and manage de-ID programs. Learn more on the HITRUST website.