In 2023, OCR settled with a covered entity for $1.3 million after an investigation revealed the organization had misclassified certain data as non-PHI and failed to apply Privacy Rule protections. The root cause wasn't malicious intent. It was a fundamental misunderstanding of what is not considered protected health information under HIPAA. This confusion is more common than most compliance officers want to admit, and it creates risk that ripples across every department in a healthcare organization.

What Is Not Considered Protected Health Information Under HIPAA

The Privacy Rule at 45 CFR §160.103 defines protected health information as individually identifiable health information that is created, received, maintained, or transmitted by a covered entity or business associate. PHI includes demographic data, medical histories, test results, insurance information, and any other data that can identify a patient and relates to their health condition, treatment, or payment.

So what falls outside that definition? Understanding the boundaries is critical for your workforce. Here are the primary categories of information that are not considered PHI:

  • De-identified data: Information that has been stripped of all 18 identifiers specified in 45 CFR §164.514(b)(2) is no longer PHI. Those identifiers include names, all elements of dates other than the year, geographic subdivisions smaller than a state, Social Security numbers, and more. Once data is properly de-identified, through either expert determination or the safe harbor method, HIPAA protections no longer apply.
  • Employment records: Records maintained by a covered entity in its role as an employer are explicitly excluded from PHI. Your employee attendance records, workers' compensation files, and HR documentation are not PHI, even if they contain health-related details — provided they are held in the employment context.
  • Education records: Records covered by the Family Educational Rights and Privacy Act (FERPA) are excluded from HIPAA's definition of PHI. Student health records maintained by a school that receives Department of Education funding fall under FERPA, not HIPAA.
  • Data about deceased individuals (after 50 years): PHI protections apply to deceased persons for 50 years following death. After that window, the information is no longer considered protected health information.
  • Aggregate or statistical data with no individual identifiers: Summary data that cannot be traced to any individual — such as population-level health statistics — does not qualify as PHI.
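
The categories above can be read as a sequence of checks. The following is a minimal sketch of that reasoning, not a compliance tool: the DataItem fields and the classify function are hypothetical names introduced here, and the sketch assumes the item is health information to begin with and that someone has already made the underlying determinations (for example, whether data is truly de-identified).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DataItem:
    # All field names are hypothetical and exist only for illustration.
    held_by_covered_entity_or_ba: bool
    individually_identifiable: bool  # False for de-identified or aggregate data
    held_as_employment_record: bool = False
    covered_by_ferpa: bool = False
    date_of_death: Optional[date] = None


def deceased_over_50_years(death: date, today: date) -> bool:
    # True only when strictly more than 50 years have passed since death.
    return (today.year - 50, today.month, today.day) > (
        death.year, death.month, death.day)


def classify(item: DataItem, today: date) -> str:
    """Walk the exclusion categories in order and return a label."""
    if not item.held_by_covered_entity_or_ba:
        return "non-PHI: outside HIPAA's scope"
    if not item.individually_identifiable:
        return "non-PHI: de-identified or aggregate"
    if item.held_as_employment_record:
        return "non-PHI: employment record"
    if item.covered_by_ferpa:
        return "non-PHI: FERPA education record"
    if item.date_of_death and deceased_over_50_years(item.date_of_death, today):
        return "non-PHI: deceased more than 50 years"
    return "PHI"
```

Returning a label with its reason, rather than a bare true or false, keeps the justification attached to each classification decision.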

The De-Identification Threshold Your Organization Must Meet

In my work with covered entities, the most dangerous assumption I encounter is that simply removing a patient's name makes data "de-identified." It does not. The safe harbor method under 45 CFR §164.514(b)(2) requires the removal or generalization of all 18 specified identifiers, and the covered entity must have no actual knowledge that the remaining information could re-identify an individual.

The alternative — expert determination under §164.514(b)(1) — requires a qualified statistical or scientific expert to confirm that the risk of re-identification is very small. Most organizations lack the resources for this approach, which is why the safe harbor method dominates in practice.
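
To show why removing a name is not enough, here is a rough sketch of what safe harbor processing of a single structured record might look like. It is illustrative only and rests on several assumptions: the field names are hypothetical, the low-population ZIP prefixes are placeholder values rather than a current list, free-text fields are not scanned, and the "no actual knowledge" condition cannot be checked in code at all.

```python
from datetime import date

# Hypothetical field names standing in for directly identifying fields.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_id", "account_number",
    "license_number", "vehicle_id", "device_serial", "url", "ip_address",
    "biometric_id", "photo",
}

# Placeholder values: three-digit ZIP prefixes covering 20,000 or fewer
# people must be reported as "000". Load the current list in practice.
LOW_POPULATION_ZIP3 = {"036", "059"}


def safe_harbor_sketch(record: dict) -> dict:
    """Drop or generalize identifier fields in one record (sketch only)."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # remove the field entirely
        if isinstance(value, date):
            # Keep only the year. Years that reveal an age over 89 need
            # further aggregation, which this sketch does not attempt.
            out[field + "_year"] = value.year
        elif field == "zip":
            zip3 = str(value)[:3]
            out["zip3"] = "000" if zip3 in LOW_POPULATION_ZIP3 else zip3
        elif field == "age":
            out["age"] = "90+" if value > 89 else value
        else:
            out[field] = value
    return out
```

Even in this simplified form, a name is only one of many fields that have to be removed or generalized before a record can be treated as de-identified.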

If your organization shares data sets for research, analytics, or operational improvement, every member of your workforce needs to understand where the PHI boundary starts and ends. Proper HIPAA training and certification should cover de-identification requirements in detail — not as an afterthought.

Why Employment Records Create Confusion for Covered Entities

Healthcare organizations are unique because they are simultaneously covered entities and employers. A hospital's HR department may hold employee drug screening results, disability accommodation requests, and health insurance enrollment forms. None of these are PHI when held in the employment context — even though identical information in a clinical setting would be fully protected.

The confusion intensifies when an employee is also a patient at the same facility. OCR has made clear that the role in which data is collected and maintained determines its classification. If an employee's medical record exists in the clinical system, it is PHI. If the same employee's absence note sits in an HR file, it is an employment record. Your policies must draw this line explicitly, and your workforce must be trained to respect it.

Common Misconceptions That Lead to HIPAA Violations

Healthcare organizations consistently struggle with several gray areas. Misclassifying any of these can lead to a HIPAA violation — or to under-protecting data that actually qualifies as PHI:

  • Health information shared verbally: Some staff assume that PHI only exists in written or electronic form. Verbal disclosures of individually identifiable health information are covered by the Privacy Rule and subject to the minimum necessary standard.
  • Data held by non-covered entities: Health data collected by fitness apps, wellness programs not sponsored by a covered entity, or consumer health devices generally falls outside HIPAA's scope. However, the moment a business associate or covered entity receives that data alongside individual identifiers, it becomes PHI.
  • Billing codes without patient names: A diagnosis code alone may seem harmless, but if it is linked to any of the 18 identifiers — even a medical record number — it is PHI. The "individually identifiable" element is broader than most staff realize.
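
The billing-code case can be made concrete with a small check. This is a sketch under stated assumptions: the column names are hypothetical and cover only a handful of the 18 identifiers, so an empty result here does not prove a row is de-identified.

```python
# Hypothetical column names standing in for a few of the 18 identifiers.
IDENTIFIER_COLUMNS = (
    "patient_name", "medical_record_number", "ssn",
    "email", "phone", "account_number",
)


def linked_identifiers(row: dict) -> list[str]:
    """Return the identifier columns that hold a non-empty value in this row."""
    return [c for c in IDENTIFIER_COLUMNS if row.get(c) not in (None, "")]


# A billing row with no name still carries a medical record number.
billing_row = {
    "diagnosis_code": "E11.9",
    "patient_name": "",
    "medical_record_number": "MRN-0042",
}
found = linked_identifiers(billing_row)
```

In this example the name column is blank, yet the row is still linked to an identifier and would need to be handled as PHI.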

Every one of these scenarios should be addressed in your organization's Notice of Privacy Practices and reinforced through regular workforce training. If your team cannot confidently say what is and is not protected health information, your risk analysis has a gap.

Build PHI Classification Into Your Risk Analysis Process

The Security Rule requires covered entities and business associates to conduct an accurate and thorough risk analysis under 45 CFR §164.308(a)(1)(ii)(A). That analysis should explicitly address how your organization classifies, labels, and handles data at the PHI boundary. OCR enforcement actions consistently cite insufficient risk analysis as a contributing factor in breaches involving misclassified data.

Practical steps your organization should take immediately:

  • Audit all data repositories — clinical, administrative, research, and HR — and classify each as PHI or non-PHI with documented justification.
  • Review your de-identification workflows to confirm full compliance with the 18-identifier safe harbor standard.
  • Train every workforce member — not just clinical staff — on what qualifies as PHI and what does not.
  • Update your policies and your Notice of Privacy Practices to reflect clear data classification guidelines.
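
The first step, an audit with documented justification, could be tracked in something as simple as the inventory below. This is one possible sketch, not a prescribed format: the RepositoryEntry fields, the audit_gaps function, and the sample repository names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RepositoryEntry:
    # Hypothetical inventory fields for one data repository.
    name: str
    category: str        # "clinical", "administrative", "research", or "HR"
    classification: str  # expected to be "PHI" or "non-PHI"
    justification: str   # documented reasoning for the classification


def audit_gaps(entries):
    """Return (repository name, problem) pairs for incomplete entries."""
    gaps = []
    for entry in entries:
        if entry.classification not in ("PHI", "non-PHI"):
            gaps.append((entry.name, "unclassified"))
        elif not entry.justification.strip():
            gaps.append((entry.name, "missing justification"))
    return gaps


inventory = [
    RepositoryEntry("ehr_database", "clinical", "PHI",
                    "Identifiable treatment records"),
    RepositoryEntry("hr_absence_notes", "HR", "non-PHI", ""),
    RepositoryEntry("research_extracts", "research", "", ""),
]
gaps = audit_gaps(inventory)
```

Here the HR entry is flagged because its non-PHI classification has no written justification, and the research entry is flagged because it has not been classified at all.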

If your organization has not conducted comprehensive workforce HIPAA compliance training in the past 12 months, this is where to start. OCR expects ongoing education — not a one-time checkbox exercise.

The Compliance Cost of Getting PHI Classification Wrong

Misclassifying data creates exposure in both directions. Treating non-PHI as PHI wastes resources and slows operations. Treating PHI as non-PHI opens the door to Breach Notification Rule obligations, OCR investigations, and civil monetary penalties that, under the inflation-adjusted tiers, range from $141 per violation up to a calendar-year cap of $2,134,831 for identical violations.

Neither outcome serves your organization. The fix starts with precision: knowing exactly what is not considered protected health information, documenting that classification, and training your workforce to apply it consistently across every system and workflow your covered entity touches.