A hospital receptionist emails a patient's lab results to the wrong address. A dental office posts a before-and-after photo on Instagram — with the patient's full name visible in the background on a chart. A billing clerk leaves a spreadsheet of patient names and diagnosis codes open on a shared computer. Every one of these scenarios involves protected health information, and every one of them can trigger a federal investigation. Understanding what is considered PHI under HIPAA isn't academic — it's the difference between a normal Tuesday and a six-figure penalty.
I've spent years helping covered entities — hospitals, clinics, health plans, clearinghouses — untangle this exact question. And the answer is more expansive than most people expect.
What Is Considered PHI Under HIPAA? The Core Definition
Protected health information (PHI) is any individually identifiable health information that a covered entity or its business associate creates, receives, maintains, or transmits. That's the legal definition from HHS's Privacy Rule guidance. But let me break it into plain language.
PHI has two components. First, there must be health information — anything related to a person's past, present, or future physical or mental health condition, the provision of healthcare, or payment for healthcare. Second, that information must be linked to an identifier that can tie it to a specific individual.
Strip away the identifier, and it's no longer PHI. Keep the identifier but remove the health component, and it isn't PHI either. Both pieces have to exist together.
The 18 Identifiers That Make Health Data PHI
The Privacy Rule spells out exactly 18 types of identifiers. If health information is paired with any one of these, you're dealing with PHI:
- Names
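- Geographic data smaller than a state (street address, city, county, ZIP code; Safe Harbor lets you keep only the first three ZIP digits, and only when that three-digit area holds more than 20,000 people)
- All elements of dates, other than the year, that relate directly to an individual (birth date, admission date, discharge date, date of death), plus all ages over 89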
- Phone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate or license numbers
- Vehicle identifiers and serial numbers (including license plates)
- Device identifiers and serial numbers
- Web URLs
- IP addresses
- Biometric identifiers (fingerprints, voiceprints)
- Full-face photographs and comparable images
- Any other unique identifying number, characteristic, or code
That last one is the catch-all, and it's intentionally broad. I've seen organizations assume they were safe because they used an internal patient ID instead of a name. That internal ID is still an identifier under category 18.
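To make the catch-all concrete, here is a minimal Python sketch that flags which fields in a record fall under one of the 18 categories. The field names (internal_patient_id, zip_code, and so on) are hypothetical; they are my assumptions for illustration, and a real schema would need its own mapping.

```python
# Hypothetical field names mapped to the identifier category they fall under.
# This mapping is an illustration, not an exhaustive or official list.
IDENTIFIER_FIELDS = {
    "name": "Names",
    "street_address": "Geographic data smaller than a state",
    "zip_code": "Geographic data smaller than a state",
    "birth_date": "Dates directly related to an individual",
    "admission_date": "Dates directly related to an individual",
    "phone": "Phone numbers",
    "email": "Email addresses",
    "ssn": "Social Security numbers",
    "medical_record_number": "Medical record numbers",
    "ip_address": "IP addresses",
    "internal_patient_id": "Any other unique identifying number, characteristic, or code",
}


def find_identifiers(record):
    """Return the identifier fields that are present and non-empty, sorted by name."""
    return sorted(
        field
        for field, value in record.items()
        if field in IDENTIFIER_FIELDS and value not in (None, "")
    )


record = {
    "internal_patient_id": "P-00417",
    "diagnosis_code": "E11.9",
    "zip_code": "04652",
}
print(find_identifiers(record))  # ['internal_patient_id', 'zip_code']
```

The record has no name on it, yet two identifier fields remain next to a diagnosis code, so it would still be PHI.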
Why ZIP Codes and Dates Trip People Up
Here's where I see the most confusion. A researcher strips names from a dataset but leaves ZIP codes and dates of service. That dataset is still PHI. HHS has been clear: geographic subdivisions smaller than a state qualify as identifiers. And dates — not just birthdays, but admission dates, procedure dates, and discharge dates — all count.
The HHS de-identification guidance outlines two methods for removing identifiers: the Safe Harbor method (strip all 18, and have no actual knowledge that what remains could identify someone) and the Expert Determination method (a qualified statistical expert determines, and documents, that the risk of re-identification is very small). If you're not using one of these formally, your data likely still qualifies as PHI.
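For the ZIP code and date problem specifically, Safe Harbor allows a few generalizations instead of outright deletion: keep only the first three ZIP digits when that area is populous enough, keep only the year of a date, and lump ages of 90 and over into one bucket. A rough Python sketch of those three rules follows. The LOW_POPULATION_ZIP3 set holds illustrative placeholder values that I'm assuming for the example; a real list has to be built from current Census data.

```python
from datetime import date

# Assumption: placeholder three-digit prefixes whose combined area holds
# 20,000 or fewer people. Derive the real list from current Census data.
LOW_POPULATION_ZIP3 = {"036", "059"}


def generalize_zip(zip_code):
    """Keep the first three digits, or return '000' for low-population areas."""
    prefix = zip_code[:3]
    return "000" if prefix in LOW_POPULATION_ZIP3 else prefix


def generalize_date(d):
    """Keep only the year of a date."""
    return d.year


def generalize_age(age):
    """Report ages of 90 and over as a single '90+' category."""
    return "90+" if age >= 90 else str(age)


print(generalize_zip("04652"))              # 046
print(generalize_date(date(2023, 5, 17)))   # 2023
print(generalize_age(92))                   # 90+
```

This covers only three of the 18 categories and ignores edge cases such as a birth year that itself reveals an age over 89, so treat it as a starting point and not a compliant pipeline.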
PHI Isn't Just in Medical Records
This mistake shows up in enforcement actions again and again. Organizations think of PHI as something locked inside an EHR. In reality, PHI lives everywhere:
- Billing records — A patient name paired with a procedure code is PHI.
- Appointment schedules — A name and date on a whiteboard in a hallway is PHI.
- Voicemails — A message confirming a prescription refill with a patient's name is PHI.
- Emails and text messages — Any electronic communication containing identifiable health data is electronic PHI (ePHI).
- Paper sign-in sheets — If patients can see other patients' names and reasons for visit, that's a PHI exposure.
- Conversations — Verbal disclosures of identifiable health information in public areas count.
I once worked with a specialty clinic that had a breach not from a hacker, but from a fax machine. Lab results for 312 patients were faxed to a decommissioned number now owned by an auto parts store. That's a reportable breach of PHI — full stop.
The $5.55 Million Wake-Up Call From OCR
If you need proof that PHI definitions matter, look at enforcement history. In 2016, the HHS Office for Civil Rights (OCR) settled with Advocate Health Care Network for $5.55 million after multiple breaches involving ePHI, including the theft of an unencrypted laptop, that together affected roughly 4 million individuals. The data included names, addresses, dates of birth, and clinical information. Names, addresses, and dates of birth all sit on the list of 18 identifiers, and the clinical information supplied the health component.
More recently, OCR's enforcement actions have continued to underscore that even small organizations face consequences. Korunda Medical, a Florida provider, paid $85,000 in 2019 after failing to provide a patient access to their own PHI. The underlying issue wasn't a hack. It was a fundamental misunderstanding of what PHI the patient had a right to receive.
These aren't cautionary tales from a textbook. They're actual settlements documented on OCR's enforcement page.
Does PHI Include Information About Deceased Individuals?
Yes. Under the HIPAA Privacy Rule, PHI protections apply to deceased individuals for 50 years after the date of death. I've seen health systems assume that once a patient dies, their records fall outside HIPAA. That's wrong, and it can lead to unauthorized disclosures that trigger breach notification obligations.
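The 50-year rule is simple date arithmetic, and a small Python sketch shows how a records system might check it. The function names are hypothetical, and I'm assuming the conservative reading that protection lasts through the 50th anniversary itself; confirm the boundary with counsel before relying on it.

```python
from datetime import date


def protection_cutoff(date_of_death):
    """Return the 50th anniversary of the date of death."""
    target_year = date_of_death.year + 50
    try:
        return date_of_death.replace(year=target_year)
    except ValueError:
        # Feb 29 has no match in a non-leap target year: fall forward to Mar 1.
        return date(target_year, 3, 1)


def still_protected(date_of_death, today):
    """True while no more than 50 years have passed since death."""
    return today <= protection_cutoff(date_of_death)


print(still_protected(date(1980, 6, 1), date(2025, 1, 1)))  # True
print(still_protected(date(1970, 6, 1), date(2025, 1, 1)))  # False
```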
What About De-Identified Data?
If you properly strip all 18 identifiers using the Safe Harbor method — or get an expert determination — the resulting dataset is no longer considered PHI. You can use it, share it, and analyze it without HIPAA restrictions.
But "properly" is doing a lot of work in that sentence. I've reviewed de-identification efforts where organizations removed names and Social Security numbers but left medical record numbers and ZIP codes intact. That's not de-identified. That's still PHI, and disclosing it without authorization is a violation.
Re-Identification: The Hidden Risk
Even when data appears de-identified, small datasets with rare conditions in small geographic areas can sometimes be re-identified. A dataset showing "Female, age 87, diagnosed with a rare neurological condition, residing in ZIP code 04652" might identify only one person in a rural Maine town. The Expert Determination method exists specifically to address this statistical risk.
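One quick way to spot this kind of exposure is to count how many rows share each combination of indirect identifiers; any combination that appears only once points at a single person. The Python sketch below does that with made-up rows and hypothetical column names. It's a rough screening heuristic of my own, not a substitute for Expert Determination.

```python
from collections import Counter


def unique_combinations(rows, quasi_identifiers):
    """Return each combination of quasi-identifier values that occurs exactly once."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Made-up example rows; column names are assumptions for illustration.
rows = [
    {"sex": "F", "age": 87, "zip_code": "04652", "condition": "rare neurological condition"},
    {"sex": "M", "age": 45, "zip_code": "04101", "condition": "hypertension"},
    {"sex": "M", "age": 45, "zip_code": "04101", "condition": "hypertension"},
]

print(unique_combinations(rows, ["sex", "age", "zip_code"]))
# [('F', 87, '04652')]
```

The two matching rows hide each other; the 87-year-old in ZIP 04652 stands alone, which is exactly the re-identification risk described above.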
Your Staff Needs to Know This — Not Just Your Privacy Officer
Here's what I tell every organization I work with: your privacy officer can write the best policies in the world, but if your front desk staff, your nurses, your billing team, and your IT department don't understand what is considered PHI under HIPAA, those policies are just expensive paperwork.
Workforce training isn't optional. The Privacy Rule at 45 CFR § 164.530(b) requires covered entities to train all members of their workforce on PHI policies and procedures. OCR has cited training failures in numerous enforcement actions.
If your physicians need targeted compliance education, the HIPAA training course for physicians and clinical environments covers PHI identification, minimum necessary standards, and breach prevention in clinical workflows. For broader organizational needs, the full HIPAA training catalog addresses roles from administrative staff to IT.
A Quick PHI Checklist for Your Organization
Use this as a practical gut-check when you're unsure whether something qualifies as PHI:
- Does the information relate to health, healthcare, or payment for healthcare? If no, it's not PHI.
- Can the information be linked to a specific individual through any of the 18 identifiers? If no, it's likely de-identified.
- Is your organization a covered entity or business associate? HIPAA applies to covered entities (health plans, healthcare clearinghouses, healthcare providers who transmit electronically) and their business associates. If neither applies, HIPAA doesn't govern the data — though state laws might.
- Is the information in any format — electronic, paper, or verbal? PHI isn't limited to digital records. A whispered diagnosis in an elevator is still a disclosure.
If you answered yes to all four, you're handling PHI. Treat it accordingly.
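The checklist can be written down as a tiny decision function. This is a sketch for training or triage purposes, with a hypothetical DataItem type and field names that I'm assuming; it's not a legal determination. Note that format is recorded but never used as a condition, because electronic, paper, and verbal information all count.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """Hypothetical description of a piece of information being triaged."""
    relates_to_health: bool
    has_identifier: bool
    held_by_covered_entity_or_ba: bool
    fmt: str = "electronic"  # "electronic", "paper", or "verbal"; does not affect the result


def is_phi(item):
    """Apply the checklist: health-related, identifiable, and held by a covered entity or BA."""
    return (
        item.relates_to_health
        and item.has_identifier
        and item.held_by_covered_entity_or_ba
    )


hallway_whiteboard = DataItem(True, True, True, fmt="paper")
fitness_app_log = DataItem(True, True, False)  # not held by a covered entity or BA

print(is_phi(hallway_whiteboard))  # True
print(is_phi(fitness_app_log))     # False
```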
The Bottom Line on PHI
PHI is broader than most healthcare professionals think. It's not just what's in the EHR. It's the appointment reminder text, the billing spreadsheet, the conversation in the break room, and the fax that went to the wrong number. It's any health information tied to any of 18 identifiers, in any format, held by a covered entity or business associate.
Get your team trained. Audit where PHI actually lives in your workflows, not just where you think it lives. And when in doubt, treat it as PHI. The cost of over-protecting data is modest. The cost of under-protecting it can be millions.