Stanford’s Institute for Human-Centered Artificial Intelligence has released a timely issue brief on one of the most unsettled questions in AI governance: whether modern foundation models can be developed and deployed without eroding core privacy rights. The paper, Data Privacy and Foundation Models: Can We Have Both?, argues that these systems introduce privacy risks that are not only broader than those found in traditional AI systems, but harder to detect, harder to govern, and harder to unwind once the data has already entered the model pipeline.

That framing matters. A great deal of AI policy discussion still treats privacy as a downstream problem: what happens when a chatbot gives a bad answer, leaks a sensitive detail, or stores a conversation too long. Stanford’s brief pushes the analysis much earlier. The privacy problem starts at ingestion, expands during training, resurfaces during inference, deepens through chatbot use, and is amplified by adversarial attacks that can bypass safeguards. In other words, the brief treats privacy not as a side constraint on model deployment, but as a full life-cycle governance issue.
Why Stanford says foundation models are different
The paper’s central argument is that foundation models should not be viewed as a scaled-up version of older machine learning systems. Their appetite for data, their opacity, their capacity to make inferences, and their role as general-purpose building blocks all make the privacy analysis more complicated. Stanford says these models present “unprecedented and largely unaddressed privacy risks,” beginning with mass scraping of personal information during training, continuing through memorization and regurgitation of sensitive data, and extending to the intimate details users reveal through chatbot interfaces.
That is a sharper critique than the usual “AI needs better privacy controls” line. The Stanford brief is effectively saying the architecture of the current foundation model ecosystem clashes with the assumptions underlying data protection law. Existing privacy regimes were built around collection, notice, use, retention, and deletion in relatively legible systems. Foundation models complicate all of those concepts because developers often do not fully disclose training sources, cannot easily trace the role of any one data point once a model is trained, and may still produce personal information through inference even if a specific record is later removed.
The privacy risk begins before the model is ever released
One of the strongest parts of the Stanford paper is its focus on data collection and curation. The authors note that foundation models rely on enormous and varied training datasets, including open datasets, scraped web content, proprietary platform data, licensed third-party data, and ongoing user interactions with AI systems. Because developers are often reluctant to disclose the full sources of their training data, the public has limited visibility into what personal information is included and how it got there.
Stanford argues that at this scale, personal data becomes inevitable unless developers actively remove it or exclude it from the pipeline. The brief notes that researchers have already found personally identifiable information and sensitive data in training corpora, including Social Security numbers and data from breached datasets. It also stresses that even when information was technically public, large-scale repurposing still creates serious privacy harms by dragging information out of obscurity and into high-powered generative systems.
That “loss of obscurity” point deserves more attention in the privacy world. Data protection law has long struggled with the false binary between public and private information. Stanford’s paper makes clear that something can be public in a narrow, contextual sense and still be privacy-sensitive when aggregated, reprocessed, and made usable in a completely different context. A forgotten post, an old image, a niche forum comment, or a pseudonymous account may not feel dangerous in isolation. But once those fragments are absorbed into model training, the result can be reidentification, inference, and commercial reuse at scale.
The paper illustrates this with the example of the MegaFace database, which assembled millions of facial images for training facial recognition systems using photos pulled in part from Flickr without explicit consent from the people who posted them. Stanford uses that history to show how data repurposing can violate contextual expectations and strip people of any meaningful ability to object, correct, or delete their information after it has already been absorbed into the AI development process.
Training and inference create a second layer of exposure
Stanford then moves to a harder problem: what happens once the data is inside the model. The brief explains that foundation models can reveal personal information in at least two ways. First, they can make powerful inferences about individuals using patterns drawn from many sources. Second, they can sometimes memorize training data and reproduce it verbatim. Both are privacy risks, but they are not the same risk. One stems from probabilistic synthesis. The other looks more like extraction.
The paper notes that models may infer identities from pseudonymous activity or infer sensitive traits that go beyond descriptive facts, including sexual orientation or political views. It also warns that high-entropy data, meaning uncommon data, may be more likely to be memorized, and that recency effects can make later-stage fine-tuning especially sensitive from a privacy standpoint. That matters because fine-tuning frequently involves proprietary or context-specific datasets supplied later in the model life cycle, which can increase the odds that sensitive information shows up in outputs unless developers impose robust guardrails.
This is one reason the Stanford paper feels more operational than rhetorical. It does not just say models are risky because they are large. It explains why certain training practices and deployment choices can create persistent exposure. Once sensitive information has shaped model weights or downstream behavior, traditional rights like deletion or correction become much harder to operationalize. That is a direct challenge to the way many privacy statutes assume accountability should work.
The harms are not limited to disclosure
A useful contribution of the brief is that it refuses to treat privacy as nothing more than secrecy. Stanford argues that privacy harms from foundation models can also appear as decisional, economic, social, and dignitary harms. The paper cites examples where models used in résumé screening favored résumés with white-associated names, where administrative healthcare uses reproduced gender bias, and where models evaluating mortgage applications recommended worse outcomes for Black applicants than for otherwise similar white applicants. It also notes that major U.S. foundation models still associate African American English with harmful stereotypes.
That broader frame is important because foundation models do not just expose data; they operationalize it. They classify, rank, infer, and recommend. So even when a model never prints out a Social Security number or copies a paragraph from training data, it can still convert personal information into discriminatory outcomes. Stanford’s point is that privacy governance for foundation models has to address not just leakage, but the ways personal data and modeled inferences can be used to shape opportunities, prices, autonomy, and social standing.
Chatbot interfaces may be creating a new intake channel for sensitive data
The Stanford brief also spends significant time on the privacy risks created during ordinary model use. The paper warns that AI chatbots are designed to feel conversational, agreeable, and responsive in ways that encourage disclosure. Users may share health details, family issues, financial records, work documents, or identifying facts without fully appreciating how that information will be stored, reused, or fed back into future model development. Stanford notes that this risk is especially acute for people seeking emotional or health-related support, but it can also affect ordinary users asking for coding help, business advice, or general assistance.
The paper then adds a critical governance point: developers still face relatively little oversight regarding how they handle increasingly personal data submitted through chatbot interfaces. Stanford cites research finding that six major U.S.-based LLM developers defaulted to using customer chat data for retraining, while some systems offered no opt-out at all. The brief also notes that retraining may extend to uploaded files, including documents, photos, and voice recordings, and that many developers appear to retain this information indefinitely.
That observation goes beyond a disclosure problem. It suggests the industry may be building a self-reinforcing privacy cycle in which users reveal sensitive information to a chatbot, that information is then retained or repurposed, and the expanded data pool increases future memorization and inference risks. As AI agents evolve to handle more personal tasks across more contexts, Stanford warns that the incentive to gather and repurpose user data will only intensify.
Attackers can turn model weaknesses into privacy breaches
Another reason Stanford sees foundation models as a distinct privacy challenge is that privacy harms do not depend only on routine use. Attackers can actively force these systems to expose information. The brief highlights prompt injection as one of the most consequential attack vectors, explaining that malicious instructions can override system prompts, manipulate model behavior, and induce models to reveal sensitive information. In retrieval-augmented systems or models connected to external tools and APIs, indirect prompt injection can even trick a model into exfiltrating data from connected systems or private documents.
The paper also discusses model inversion, membership inference, and extractable memorization, all of which can be used to infer whether specific data was in a training set or reconstruct sensitive elements from model behavior. Stanford is explicit that these techniques can expose health, biometric, financial, and other personal information and may undermine attempts to anonymize training data. It further warns that data poisoning, model poisoning, and the weakening or removal of safety guardrails can turn models into better tools for reidentification and personal data extraction.
This part of the brief is especially useful for privacy professionals because it underscores that privacy cannot be reduced to policy text or front-end controls. Even when developers adopt alignment measures or suppress certain outputs, attackers may still find ways to exploit model architecture, connected systems, or fine-tuning pathways. Privacy, in this environment, becomes inseparable from security engineering.
Stanford’s policy message: current frameworks are not enough
The Stanford paper does not stop at diagnosis. It argues that existing privacy frameworks, including the GDPR, are fundamentally mismatched with how foundation models are currently developed, and that neither Europe nor the United States has yet adopted comprehensive rules likely to meaningfully change developer behavior. Without stronger guardrails, the public remains dependent on voluntary action by model developers.
The governance section lays out a more practical path. First, Stanford argues policymakers should reduce or remove personal data from the training pipeline upstream rather than rely on downstream fixes. The brief calls for limitations on the kinds of data developers can collect and use for model training, and says a meaningful U.S. federal privacy law would be an important foundation, particularly if it restricted broad consumer data collection and the activities of data brokers.
Second, the paper calls for stronger transparency. But Stanford is careful here: it does not celebrate transparency in the abstract. It specifically argues that transparency mandates must be specific, informative, and enforceable. The authors point to California’s AB 2013 as an example of a law that may have produced technically compliant but vague reporting that is not especially actionable. That is a useful warning for privacy regulators who assume disclosure alone will discipline the market.
Third, Stanford endorses privacy and security by design. The brief argues that model developers should build system architectures and interfaces that reflect how users actually expect their data to be treated, rather than pushing design choices that favor engagement or product growth at the expense of privacy expectations. It points to recent incidents involving publicly indexed chatbot chats and unintentionally shared AI conversations as examples of what happens when companies misread user expectations. The paper also discusses privacy-enhancing methods such as federated learning and differential privacy, but stresses that these should be treated as additional safeguards rather than complete solutions because models can remain vulnerable to privacy attacks and performance tradeoffs persist.
Fourth, Stanford supports output suppression as a supplemental defense. Guardrails that block the sharing of phone numbers, benefit IDs, or other personal information can reduce predictable harms, but the authors warn that output suppression is not a panacea. It cannot account for every possible context in which privacy may be implicated, and in some cases developers may need to retain the very categories of information they are trying to prevent from appearing in outputs.
What this means for privacy teams, policymakers, and AI developers
The Stanford brief lands at a moment when many organizations are still treating AI privacy as a procurement checklist issue. The paper suggests that is too narrow. The real question is not whether a chatbot has a privacy notice or whether a vendor will sign a DPA. It is whether the entire foundation model life cycle is compatible with long-standing principles such as data minimization, purpose limitation, deletion, contextual integrity, and meaningful user control.
For privacy leaders, the takeaway is that foundation models require a more aggressive risk lens than many other software tools. Review has to extend upstream into data provenance, downstream into output controls, sideways into user-interface design, and deeply into security architecture. For lawmakers, Stanford’s message is that generic transparency promises and after-the-fact enforcement will not be enough if the business model still rewards indiscriminate data collection. And for developers, the paper is a reminder that scale does not excuse opacity. The bigger the system, the stronger the obligation to explain what data went in, how risks are mitigated, and what rights remain available once a model is in the wild.
Why the Stanford brief matters
What makes this Stanford HAI paper worth reading is not just that it catalogs familiar concerns such as scraping, memorization, or data retention. It ties those concerns together into a single claim: foundation models are pushing privacy law into terrain it was not built to govern. The brief closes by asking whether the principles of data protection can still hold in an era of general-purpose models, or whether society will have to rethink both individual rights and developer responsibilities as AI systems become capable of enabling both individual and population-level surveillance. :contentReference[oaicite:26]{index=26}
That is the real significance of the report. Stanford is not arguing that privacy and foundation models are necessarily incompatible. It is arguing that compatibility will not happen on autopilot. Without firmer rules on data intake, stronger transparency, more serious privacy-by-design obligations, and tighter controls on model outputs and connected systems, the industry will keep asking the public to absorb risks it never knowingly accepted.