
The Five-Factor Test That Could Save Your Health Advertising Program — Or Expose It
There is a question sitting at the center of every health-related digital advertising campaign right now, and the honest answer for most organizations is that they do not know with any confidence: is the personal information we are using to target this campaign sensitive health data under applicable law?
It sounds like a question with an obvious answer. Health data is health data. A cancer diagnosis is sensitive. A grocery shopping list is not. But the reality of how modern digital advertising actually works — where behavioral signals, purchase data, article consumption patterns, demographic characteristics, and third-party data segments combine in real time to create targeting decisions — has produced a situation where the line between ordinary commercial data and regulated sensitive health data is genuinely, technically, legally unclear. And the consequence of getting that classification wrong runs from regulatory investigation and enforcement action to FTC consent orders and state attorney general settlements that have already cost companies millions.
It is into this compliance gap that the Network Advertising Initiative published its Factor Analysis for Health-Related Sensitive Personal Information in February 2026 — a detailed, practically oriented framework designed to help organizations navigate one of the most complex and consequential classification problems in privacy law. For privacy professionals and compliance teams working at the intersection of health advertising and data protection, this document is essential reading. This article walks through what it says, why it matters, and what it means for how organizations should be approaching health data classification in 2026.
The Problem the NAI Is Trying to Solve
The Network Advertising Initiative has promoted strong privacy practices among member companies engaged in data-driven digital advertising for decades, and the health data context has always been one of its primary areas of focus. The new Factor Analysis arrives at a moment when the regulatory complexity around health-related data has reached a level that the industry’s previous frameworks were not designed to handle.
The core problem is definitional fragmentation. In the United States, there is no single authoritative definition of what makes personal information “sensitive” from a health perspective. HIPAA defines Protected Health Information based on the entity that holds it — health information held by a covered entity or its business associate in connection with health care functions is PHI; the same information held by a retailer or an ad tech company is not. State comprehensive privacy laws use different relationship tests and different threshold tests, creating a patchwork where the same data processing activity might be regulated in Washington but not in Texas, might require opt-in consent in Colorado but only enhanced disclosure in California, and might be governed by the FTC Act’s evolving unfairness authority in ways that the current Commission is actively contesting.
The consequence of this fragmentation is that organizations have been responding in three problematic ways. First, some over-classify: treating any information with any connection to health or the human body as sensitive, applying heightened requirements uniformly, and in doing so limiting the benefits that responsible health advertising can deliver to consumers — including genuinely valuable public health communications, awareness of treatment options, and access to health resources. Second, some under-classify: failing to recognize that ordinary commercial data, when used in specific ways, can cross the legal threshold for sensitive health information under one or more applicable frameworks. Third, some simply withdraw: pulling back from health-adjacent advertising categories in jurisdictions where they perceive too much uncertainty, rather than investing in the analysis that would let them operate responsibly.
The NAI’s Factor Analysis is designed to give organizations a structured analytical tool that prevents all three failure modes — enabling responsible health advertising while building in the protections that consumers legitimately expect.
The Five Factors: A Framework for Nuanced Classification
The document identifies five factors that, taken together, help organizations determine whether personal information they are processing for health-related advertising may qualify as health-related sensitive personal information — which the NAI terms HSPI — under applicable law. Critically, no single factor is dispositive. The framework is designed for honest, holistic analysis rather than the kind of mechanistic checklist-checking that produces technically defensible conclusions while missing the substance of what the law is trying to protect.
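To make the holistic character of the framework concrete, the five factors can be modeled as a review record whose output is an escalation flag rather than a binary sensitive/not-sensitive verdict. The sketch below is illustrative only: the class and field names are this article's shorthand, not anything defined in the NAI document.

```python
from dataclasses import dataclass

# Illustrative only: models the five NAI factors as a review record.
# The output is a review disposition, not a legal classification.

@dataclass
class FactorAssessment:
    source_health_context: bool          # Factor 1: health-care / health-app origin?
    contents_health_related: bool        # Factor 2: treatment, condition, vitals?
    use_infers_individual_status: bool   # Factor 3: individual-level health inference?
    heightened_expectations: bool        # Factor 4: reproductive, mental, serious-condition context?
    elevated_harm_risk: bool             # Factor 5: discrimination, economic, stigma risk?
    notes: str = ""

    def implicated_factors(self) -> list[str]:
        """Return the names of implicated factors for the review record."""
        labels = {
            "source": self.source_health_context,
            "contents": self.contents_health_related,
            "intended_use": self.use_infers_individual_status,
            "expectations": self.heightened_expectations,
            "harm_risk": self.elevated_harm_risk,
        }
        return [name for name, hit in labels.items() if hit]

    def recommendation(self) -> str:
        """Conservative default: any implicated factor routes to counsel."""
        hits = self.implicated_factors()
        if not hits:
            return "no factors implicated: proceed under standard review"
        return f"escalate to privacy counsel: {', '.join(hits)}"
```

The deliberate design choice is that the record never outputs "sensitive" or "not sensitive" on its own; any implicated factor routes the processing activity to human legal review, a conservative default consistent with the framework's emphasis on holistic analysis over mechanistic checklist-checking.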
Factor 1: The Source of the Personal Information
The first factor asks where the data came from. Data from health care providers, medical practitioners, insurance claims systems, or health-specific mobile applications like menstrual cycle trackers or pregnancy apps is more likely to implicate this factor than the same type of data collected from a general retail context or a marketing survey.
The source factor reflects the HIPAA tradition of context-based sensitivity — the idea that information becomes more sensitive when it travels from a health care relationship because the nature of that relationship creates a reasonable expectation that the data will remain within the health care system. An individual who shared information with their doctor has a fundamentally different expectation about how that information will be used than someone who mentioned the same information in a brand survey. The NAI’s analysis makes clear that the same type of data can reach different conclusions under this factor depending entirely on where it originated.
This has immediate practical implications for organizations that license or purchase data from third-party data providers. The question is not just what the data says — it is where it came from. A data partner claiming their data is de-identified general consumer data may be sourcing it from contexts that would implicate this factor if that sourcing were examined.
Factor 2: The Contents of the Personal Information
The second factor examines the substance of the data itself, independent of its source. Three specific content categories are identified as most clearly implicating this factor.
First, data relating to a treatment for a specific health condition — prescriptions, medications tied to identified conditions, condition-specific therapeutics — more strongly implicates this factor than general over-the-counter medication purchases that may relate to common symptoms without revealing a specific health condition. The distinction between a prescription for a chemotherapy drug and a purchase of ibuprofen is stark and legally significant.
Second, data that indicates a specific health condition or diagnosis — from diagnostic codes in health records at the highest sensitivity end, down to article consumption patterns that may weakly or indirectly suggest a condition. Here the document engages carefully with the 2025 California Healthline settlement, in which the California AG’s office scrutinized the processing of data indicating a consumer had viewed articles with titles like “Newly Diagnosed with Ulcerative Colitis? Here’s What to Know.” The settlement is instructive about how regulators can treat indirect signals — reading an article does not directly reveal a health condition, since the reader may have no personal connection to it, but article titles phrased as addressing people who have already been diagnosed with a specific condition create a different analytical situation than general health information articles.
Third, measurements of vital signs or bodily functions — heart rate, blood pressure, menstrual cycle data, weight — directly implicate this factor even without disclosing a specific diagnosis. General fitness or dietary information is treated as more peripheral.
The contents factor creates a genuine operational challenge at scale: it requires organizations to assess the health-content implications of data at a granular level that most automated data pipelines are not designed to perform. The Factor Analysis acknowledges this directly, noting that distinguishing between article titles referring to those newly diagnosed with a condition and general trend articles presents significant operational challenges, and pointing to intended use as the next analytical step when content is ambiguous.
Factor 3: The Intended Use of the Personal Information
The third factor is in some ways the most significant, because it captures the data processing scenarios that have generated the most enforcement attention: cases where ordinary, non-sensitive commercial data is used to make health-related inferences about individual consumers.
The Factor Analysis draws a sharp line between two conceptually distinct uses of data. On one side: demographic targeting — using population-level correlations between demographic characteristics and health status to reach audiences for whom a health-related product or service is more likely to be relevant, without ascribing any health condition to specific individuals. On the other side: individualized health profiling — using behavioral, purchase, or other personal data to generate scores or classifications that assign a probable health status to specific individuals.
The classic example the document uses is the pregnancy prediction score — a scenario closely associated with a well-known incident in which a major retailer used purchase data to infer pregnancy likelihood and target advertising accordingly. Assigning an individual consumer a pregnancy prediction score based on their purchase behavior uses non-sensitive data — groceries, toiletries — in a way that creates sensitive health inferences at the individual level. That use implicates Factor 3 even though the underlying data, in isolation, would not.
By contrast, a pharmaceutical company running an ad campaign for a diabetes medication that uses demographic characteristics — age, income — to reach an audience statistically more likely to include diabetics, without making any individual-level inference about specific consumers’ diabetes status, does not implicate Factor 3. The distinction is not about the product being advertised or the health condition being implicated. It is about whether the processing assigns a health status to a specific individual.
This distinction has significant practical implications for organizations using behavioral interest segments. The NAI’s prior guidance on refraining from placing users, without opt-in consent, in segments that identify them with sensitive health conditions like cancer, a mental health condition, or an STD remains directly applicable here. Building a behavioral interest segment labeled “Heart Disease” and adding individual users to it based on their browsing behavior is categorically different from using population-level demographics to reach audiences for a heart medication. The first implicates Factor 3. The second, if done correctly, does not.
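The Factor 3 line described above lends itself to a simple automated pre-screen: does an audience definition label individual users with a health condition, or does it operate at the population level? Here is a minimal sketch under hypothetical field names; nothing in it comes from the NAI document or any real ad platform schema.

```python
from dataclasses import dataclass

# Hypothetical pre-screen for the Factor 3 distinction: individual-level
# health labeling vs. population-level demographic targeting.

@dataclass
class AudienceDefinition:
    name: str
    assignment_level: str              # "individual", "population", or "unknown"
    labels_users_with_condition: bool  # does the definition attach a condition to users?

def implicates_factor_3(aud: AudienceDefinition) -> bool:
    """Flag audiences that impute a health status to specific individuals.

    Unknown assignment levels are treated conservatively: if a provider
    cannot say how a segment was built, it is flagged for review.
    """
    if not aud.labels_users_with_condition:
        return False
    return aud.assignment_level in ("individual", "unknown")

# A behavioral segment that tags users with a condition is flagged;
# a health-neutral demographic audience for a heart medication is not.
heart_segment = AudienceDefinition("Heart Disease (behavioral)", "individual", True)
demo_audience = AudienceDefinition("Age/income audience, heart medication campaign", "population", False)
vendor_segment = AudienceDefinition("Third-party 'Diabetes Interest'", "unknown", True)
```

Treating an unknown assignment level as implicating Factor 3 reflects the due-diligence posture the article returns to below: when a data provider cannot say whether a segment assigns health status to individuals, the prudent answer is further analysis.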
Factor 4: Heightened Consumer Expectations of Privacy
The fourth factor asks a contextual and somewhat subjective but legally essential question: would a reasonable consumer in this context expect their health-related information to be treated with heightened privacy protections?
The answer is most clearly yes for reproductive or sexual health, mental health, and particularly serious health conditions. The BetterHelp enforcement action — in which consumers voiced explicit anger that their mental health information had been shared with Facebook — is cited as direct evidence of the consumer expectations that regulators treat as legally significant. One user quoted in the enforcement record wrote: “I have not given ANY consent to share my information with ANYONE. ESPECIALLY ads targeting my mental health weakness.” The emotional force of that statement illustrates why this factor exists: there are categories of health information that people treat as fundamentally different from ordinary commercial data, and privacy law reflects that distinction.
Disclosure practices affect this analysis but do not resolve it. A privacy policy that mentions data sharing does not necessarily eliminate heightened expectations — the Healthline case is instructive here, with the California AG arguing that sharing data of an intimate nature with third parties, even when briefly mentioned in a privacy policy, may be unlawful when consumers would not expect it. Heightened privacy expectations cannot be fully extinguished by fine-print disclosure. This is a lesson that should inform how health-adjacent data processing is designed, not just disclosed.
Factor 5: The Risk of Consumer Harm
The final factor examines whether processing the data in question increases the likelihood of consumer harm and how severe that harm could be. The document distinguishes between objective and severe harms — unlawful discrimination, economic harm, denial of benefits like health insurance or employment — and more subjective harms like embarrassment or emotional distress.
The discrimination angle is particularly important for health data: pregnancy is a protected class under federal and state employment law. A mental health diagnosis is relevant to disability protection frameworks. A diabetes diagnosis could affect insurance eligibility. Health data that ends up informing eligibility determinations — whether through the deliberate design of the system or through inadvertent use by downstream data recipients — creates discrimination exposure that goes well beyond privacy law and into employment, insurance, and civil rights frameworks.
Critically, Factor 5 also asks organizations to weigh the benefits of processing against the risks. Health-related advertising is not inherently harmful. Advertising that informs a consumer about a treatment for a condition they have, or about a clinical trial they qualify for, or about a preventive care resource they were unaware of, delivers genuine value. The factor analysis is not designed to prohibit health advertising — it is designed to ensure that the data processing that enables it is calibrated to the sensitivity of the information involved and the risks it creates.
The Three Hypotheticals: Where the Framework Does Its Most Useful Work
The Factor Analysis concludes with three hypothetical scenarios that illustrate the application of the five factors to realistic advertising situations. These hypotheticals are where the framework becomes most operationally useful, because they demonstrate not just the logic but the conclusions, and the contrasting results among the three scenarios mark exactly where the compliance line sits.
The running shoe retargeting campaign — where a general sports retailer uses add-to-cart events to retarget interested consumers — implicates none of the five factors. The data source is a general retailer. The content is a shopping intent signal with no health condition connection. The intended use is neutral content matching, not health inference. Consumer privacy expectations in a general retail context are not elevated. And the risk of harm from showing someone running shoe ads based on their browsing behavior is functionally zero. This is the baseline: ordinary commercial advertising data used for ordinary commercial advertising purposes.
The pregnancy prediction score scenario reaches the opposite conclusion, and the reasoning is instructive. The data source — a general retailer’s loyalty program — does not implicate Factor 1. The original contents — grocery and toiletry purchases — do not themselves implicate Factor 2. But the intended use — generating a pregnancy prediction score that assigns a probable health status to individual consumers — implicates both Factor 2 (the resulting inference relates to a specific bodily condition) and Factor 3 (the data is used to impute a health status to individual consumers). The heightened privacy expectations associated with reproductive health implicate Factor 4. And the potential for harm if a pregnancy prediction score were inadvertently released or misused — discrimination, stigma, legal exposure — implicates Factor 5. The conclusion is not a flat prohibition; it is a flag for further scrutiny and legal counsel. But the multiple factors in tension make clear that this is precisely the use case that requires careful legal analysis before deployment.
The diabetes medication demographic targeting scenario — the most nuanced of the three — reaches a conclusion of no HSPI implication, but the analysis is careful and conditional. The de-identified insurance claims data used to model the demographic profile of existing customers is not personal information at all, so its HIPAA-adjacent source does not implicate Factor 1. The demographic characteristics used to build the targetable audience — age and income — are health-neutral in content. And the intended use is specifically structured to avoid individual-level health inference: the pharmaceutical company is using a population-level correlation (this demographic is statistically more likely to include diabetics) to improve ad relevance without telling itself that any specific individual in the audience has diabetes. The analysis notes explicitly that if the company were to use the 40% diabetes prevalence rate to infer that individual members of the segment likely have diabetes, Factor 3 would be implicated — the conclusion changes with the inference step.
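The contrast among the three hypotheticals can be captured in a compact lookup recording which factors each scenario implicates, per the analysis above. This is an illustrative encoding of the document's own conclusions; the factor shorthand and the escalation rule are assumptions of this sketch, not NAI terminology.

```python
# Illustrative encoding of the three hypotheticals and the factors each
# implicates, following the article's walkthrough of the NAI analysis.

SCENARIOS = {
    "running_shoe_retargeting": set(),  # baseline: no factors implicated
    "pregnancy_prediction_score": {
        "contents",       # the resulting inference relates to a bodily condition
        "intended_use",   # a health status is imputed to individual consumers
        "expectations",   # reproductive health carries heightened expectations
        "harm_risk",      # discrimination and stigma if released or misused
    },
    # Conditional: holds only while no individual-level inference is made.
    "diabetes_demographic_targeting": set(),
}

def review_outcome(scenario: str) -> str:
    """Translate implicated factors into a review disposition."""
    factors = SCENARIOS[scenario]
    if not factors:
        return "standard review"
    return "escalate to counsel (" + ", ".join(sorted(factors)) + ")"
```

Note that the diabetes scenario's empty set is conditional: as the analysis states, the conclusion flips the moment the prevalence rate is used to infer that individual segment members likely have diabetes.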
What This Means for Privacy and Compliance Programs
The NAI’s Factor Analysis arrives at a moment when the enforcement environment around health data is both active and directionally uncertain. The FTC under Chair Ferguson has expressed skepticism about the expansive application of unfairness authority to health data inferences, arguing that drawing conclusions from lawfully obtained data does not inherently violate Section 5. At the same time, state attorneys general — particularly in California and Washington — have been actively enforcing exactly the theories the current FTC leadership is skeptical of. For organizations operating across multiple jurisdictions, both enforcement environments are simultaneously real.
The practical guidance for privacy and compliance professionals is as follows. Health data classification needs to be treated as a workflow step, not a one-time policy decision. Organizations that have not implemented a review process for identifying and classifying HSPI should build one now, using the five-factor framework as the analytical structure. The framework is not a compliance checklist — it is a tool for making the relevant considerations visible and explicit in a way that demonstrates analytical rigor and good faith.
For organizations using behavioral interest segments that touch health topics, the Factor 3 analysis is the most immediately demanding. The line between a demographic audience for a health product and an interest segment that identifies individual users with a health condition is a compliance line with enforcement consequences, and it needs to be drawn deliberately and documented. Organizations relying on third-party data segments should be asking their data providers directly: does this segment assign a health status to individual users? If the answer is yes or uncertain, further analysis is required.
For organizations operating in California and Washington specifically, the Factor 4 analysis — heightened consumer expectations — needs to be taken seriously as an independent basis for additional scrutiny even where the other factors might not clearly be implicated. The Healthline case demonstrates that California will pursue enforcement based on contextual expectations of privacy rather than waiting for clear-cut HSPI classification. Reproductive health, mental health, and serious condition data carry heightened expectations regardless of how they entered the data pipeline.
And for all organizations processing health-adjacent data at any scale, the fundamental lesson of the five-factor framework is the same lesson that runs through every serious privacy analysis: classification is a function of source, content, use, context, and risk — not just category. The same data point can be sensitive or not sensitive depending on how it is combined with other data, what inferences it is used to generate, and what decisions those inferences feed into. A compliance program that treats data classification as a static labeling exercise will consistently misidentify exactly the processing activities that create the most significant regulatory and reputational exposure.
The NAI’s Factor Analysis will not resolve every ambiguous case. It says so explicitly: regulators and courts may reach different conclusions than companies do when applying these factors to similar fact patterns. What it does is provide a structured, legally grounded way to reason through the relevant considerations openly — and in an enforcement environment that has consistently rewarded documented analytical rigor and consistently penalized the absence of it, that is not a small thing.