A behavioral health clinic in Connecticut received a $125,000 penalty from OCR after disclosing a spreadsheet of patient data that staff believed was "de-identified" — but still contained dates of birth, ZIP codes, and medical record numbers. The clinic's compliance officer assumed removing patient names was sufficient. It wasn't. Understanding how many patient identifiers are required to be removed under HIPAA is not optional — it's a foundational compliance obligation that protects your organization from exactly this kind of enforcement action.
How Many Patient Identifiers Are Required to Be Removed Under HIPAA?
The HIPAA Privacy Rule at 45 CFR §164.514(b)(2) lists 18 categories of identifiers that must be removed from protected health information (PHI) for it to qualify as de-identified under the Safe Harbor method. The list applies to identifiers of the individual and of the individual's relatives, employers, and household members. Not 5, not 10: all 18. There is no partial credit.
If even one of these identifiers remains in a dataset, the dataset fails Safe Harbor and, absent an expert determination, is still PHI subject to every Privacy Rule protection, disclosure restriction, and breach notification requirement. Healthcare organizations consistently underestimate this threshold, assuming that removing a name or Social Security number is enough.
The Complete List of 18 HIPAA Patient Identifiers
- Names
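- Geographic data smaller than a state (street address, city, county, precinct, ZIP code, equivalent geocodes); the first three digits of a ZIP code may be kept only when that three-digit area contains more than 20,000 people
- All elements of dates (except year) directly related to an individual (birth date, admission date, discharge date, date of death), and all ages over 89, which may be aggregated into a single category of age 90 or older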
- Telephone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate/license numbers
- Vehicle identifiers and serial numbers (including license plates)
- Device identifiers and serial numbers
- Web URLs
- IP addresses
- Biometric identifiers (fingerprints, voiceprints)
- Full-face photographs and comparable images
- Any other unique identifying number, characteristic, or code
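That 18th identifier, "any other unique identifying number, characteristic, or code," is the catch-all that trips up many covered entities. Internal tracking codes, research subject IDs that could be cross-referenced, and custom database keys all fall under this category. The one exception is a re-identification code assigned under 45 CFR §164.514(c): it may stay in the dataset only if it is not derived from information about the individual and the covered entity does not disclose how the code maps back to a person.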
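A first-pass check on a spreadsheet is to scan its column headers for names that suggest one of the identifier categories. The sketch below is a minimal illustration, not a compliance tool: the `IDENTIFIER_PATTERNS` mapping and the `flag_columns` helper are hypothetical, the patterns cover only some of the 18 categories, and a header scan cannot detect identifiers hidden in free-text fields or in columns with uninformative names.

```python
import re

# Hypothetical mapping from some Safe Harbor categories to column-name patterns.
# Illustrative assumptions only; not an official or complete list.
IDENTIFIER_PATTERNS = {
    "names": r"name",
    "geography": r"address|street|city|zip|postal|county",
    "dates": r"dob|birth|admit|admission|discharge|death|date",
    "phone/fax": r"phone|fax|mobile",
    "email": r"e-?mail",
    "ssn": r"ssn|social",
    "medical record number": r"mrn|medical_record",
    "health plan number": r"member_id|beneficiary|policy",
    "account number": r"account|acct",
    "url/ip": r"url|ip_addr",
}


def flag_columns(columns):
    """Return {column: [categories]} for headers that look like identifiers."""
    flagged = {}
    for col in columns:
        normalized = col.strip().lower().replace(" ", "_")
        hits = [
            category
            for category, pattern in IDENTIFIER_PATTERNS.items()
            if re.search(pattern, normalized)
        ]
        if hits:
            flagged[col] = hits
    return flagged


# Example header row from a made-up export.
header = ["Patient Name", "DOB", "ZIP", "MRN", "Diagnosis Code", "Visit Count"]
for column, categories in flag_columns(header).items():
    print(f"{column}: review as {', '.join(categories)}")
```

A flagged column is a prompt for human review, not a verdict. An unflagged column still needs a look, because the catch-all category covers any unique code regardless of what the column is called.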
Safe Harbor vs. Expert Determination: Two Paths to De-Identification
The Privacy Rule gives your organization two methods to de-identify PHI. The Safe Harbor method has two conditions: all 18 identifiers must be removed, and the organization must have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify an individual. Most organizations default to Safe Harbor because it provides a concrete, auditable checklist.
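Safe Harbor does not require deleting every date, ZIP code, or age outright. The rule permits keeping the year of a date, the first three digits of a ZIP code when that area contains more than 20,000 people (otherwise the prefix becomes 000), and ages up to 89. The following sketch shows those three generalizations. The function names are hypothetical, and `LOW_POPULATION_ZIP3` holds placeholder values; a real list has to be built from current Census data.

```python
from datetime import date

# Placeholder three-digit ZIP prefixes assumed to cover 20,000 or fewer people.
# Assumption for illustration: derive the real set from current Census data.
LOW_POPULATION_ZIP3 = {"036", "059", "102"}


def generalize_zip(zip_code):
    """Keep the first three digits, or '000' for low-population prefixes."""
    prefix = zip_code.strip()[:3]
    return "000" if prefix in LOW_POPULATION_ZIP3 else prefix


def generalize_date(d):
    """Keep only the year of a date."""
    return d.year


def generalize_age(age):
    """Aggregate ages over 89 into a single '90+' category."""
    return "90+" if age > 89 else str(age)


print(generalize_zip("06511"))
print(generalize_date(date(2021, 3, 14)))
print(generalize_age(93))
```

The sketch does not handle one edge case: for a patient over 89, a birth year would itself reveal the age and would also need to be suppressed.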
The Expert Determination method under 45 CFR §164.514(b)(1) allows a qualified statistical expert to certify that the risk of re-identification is "very small." This approach is more flexible but requires documented methodology and formal expert opinion — resources most small and mid-size covered entities don't have readily available.
In either case, the burden of proof is on your organization. OCR will not accept a good-faith assumption that data has been de-identified. You need documentation.
The Workforce Training Gap That Creates Identifier Violations
In my work with covered entities, the most common gap isn't a policy failure; it's a training failure. Staff who handle PHI daily often cannot name more than four or five of the 18 identifiers. They don't realize that a ZIP code paired with a diagnosis date can be enough to re-identify a patient in a small community.
The Privacy Rule at 45 CFR §164.530(b) requires that your workforce be trained on policies and procedures relevant to their job functions. For anyone who touches patient data — from analysts creating reports to billing staff exporting claims data — understanding patient identifiers is directly relevant.
This is exactly why structured HIPAA training and certification programs include dedicated modules on de-identification standards. Generic annual training that glosses over identifiers in a single slide is insufficient when your staff is making de-identification decisions in spreadsheets every week.
Why "Just Remove the Name" Is a Dangerous Compliance Myth
OCR enforcement actions consistently demonstrate that partial de-identification creates a false sense of security. A 2023 HHS guidance update reinforced that re-identification risk increases dramatically when even two or three identifiers remain in a dataset — particularly geographic data combined with dates.
Latanya Sweeney's widely cited analysis of 1990 U.S. Census data estimated that 87% of the U.S. population could be uniquely identified using only five-digit ZIP code, date of birth, and sex. Your organization cannot afford to treat de-identification as a casual task delegated to whoever builds the next report.
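You can run the same kind of uniqueness check on your own extract before it leaves the building. The sketch below counts the share of records whose combination of quasi-identifiers appears exactly once. The `unique_fraction` helper, the field names, and the sample records are made up for illustration.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination is unique."""
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys) if keys else 0.0


# Made-up records; no real patient data.
records = [
    {"zip": "06511", "dob": "1980-02-11", "sex": "F"},
    {"zip": "06511", "dob": "1980-02-11", "sex": "F"},
    {"zip": "06511", "dob": "1975-07-30", "sex": "M"},
    {"zip": "06103", "dob": "1992-12-01", "sex": "F"},
]

print(unique_fraction(records, ["zip", "dob", "sex"]))  # 0.5
```

A high fraction is a warning sign that the remaining fields single people out. A low fraction does not establish Safe Harbor compliance; that still depends on removing all 18 identifiers.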
The minimum necessary standard under the Privacy Rule adds another layer: even when sharing fully identifiable PHI for permitted purposes, your covered entity must limit disclosures to the minimum necessary information. Patient identifiers should never travel with data unless a specific use or disclosure justifies their inclusion.
Practical Steps to Audit Your Identifier Removal Process
Start with a data inventory. Identify every workflow in your organization where PHI leaves its primary system — reports, research datasets, business associate transmissions, quality improvement extracts, and patient communications.
For each workflow, document which of the 18 identifiers are present and whether removal is performed before sharing. Map who performs the removal, what tools they use, and how completion is verified. This directly supports the risk analysis requirement under the Security Rule at 45 CFR §164.308(a)(1).
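One way to keep that documentation consistent is a small structured record per workflow. This is a minimal sketch under assumed field names; the `Workflow` class and its `gaps` method are hypothetical and would need to be adapted to your own inventory.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """One place where PHI leaves its primary system (hypothetical schema)."""
    name: str
    identifiers_present: set
    identifiers_removed: set = field(default_factory=set)
    removed_by: str = ""
    verified_by: str = ""

    def gaps(self):
        """List documentation or removal gaps for this workflow."""
        problems = []
        remaining = self.identifiers_present - self.identifiers_removed
        if remaining:
            problems.append(
                "identifiers not removed: " + ", ".join(sorted(remaining))
            )
        if not self.removed_by:
            problems.append("no owner recorded for removal")
        if not self.verified_by:
            problems.append("no verification step recorded")
        return problems


# Made-up inventory entry.
extract = Workflow(
    name="Quality improvement extract",
    identifiers_present={"dates", "zip"},
    identifiers_removed={"dates"},
    removed_by="reporting analyst",
)
for problem in extract.gaps():
    print(f"{extract.name}: {problem}")
```

An empty list from `gaps()` means only that the record is complete, not that the removal was actually performed correctly. That still needs independent verification.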
Then close the loop with training. Every workforce member involved in these workflows should demonstrate competency on identifier removal — not just awareness. Investing in comprehensive HIPAA workforce compliance turns your staff from a liability into your strongest safeguard against improper disclosure.
What Happens When Identifiers Are Not Properly Removed
If your organization shares data it believes is de-identified but actually contains one or more of the 18 identifiers, that disclosure is a PHI disclosure. If it was unauthorized, it may constitute a HIPAA violation subject to OCR investigation.
Under the Breach Notification Rule at 45 CFR Part 164, Subpart D, improperly de-identified data that is accessed or disclosed without authorization triggers breach analysis. If the data includes identifiers, the presumption is that a breach has occurred unless your organization can demonstrate a low probability of compromise through the four-factor risk assessment.
Under the HITECH Act's tiered structure, penalties are adjusted for inflation each year. Using the 2023 figures, they range from $137 per violation for unknowing infractions up to a calendar-year cap of $2,067,813 for identical violations involving willful neglect. Even at the lowest tier, a dataset with thousands of patient records can be counted as thousands of individual violations.
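To see how quickly the numbers grow, here is an illustrative calculation using only the two figures quoted above. It assumes each record counts as one violation at the lowest per-violation amount and that the quoted annual cap applies. How OCR actually counts violations, which cap applies to which tier, and what amount it ultimately assesses are outside this sketch, and the function name is hypothetical.

```python
# Figures taken from the paragraph above; treat them as inputs, not constants.
MIN_PER_VIOLATION = 137
ANNUAL_CAP = 2_067_813


def illustrative_exposure(record_count,
                          per_violation=MIN_PER_VIOLATION,
                          cap=ANNUAL_CAP):
    """Records times per-violation amount, limited by the annual cap.

    Assumes one violation per record; this is an illustration, not a
    prediction of any actual penalty.
    """
    return min(record_count * per_violation, cap)


print(illustrative_exposure(3_000))   # 411000
print(illustrative_exposure(20_000))  # 2067813 (cap reached)
```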
The Bottom Line for Your Organization
The answer to how many patient identifiers are required to be removed is unambiguous: all 18, every time, with documentation to prove it. Half-measures create regulatory exposure, erode patient trust, and put your organization one OCR complaint away from an investigation. Build the training, build the process, and verify both continuously.