Policy & Regulation · Bearish

Merriam-Webster and Britannica Sue OpenAI for 'Cannibalizing' Dictionary Data

Merriam-Webster and its parent company Britannica have filed a lawsuit against OpenAI, alleging that ChatGPT was trained on their proprietary reference material without permission. The plaintiffs argue that the AI's ability to provide instant definitions has decimated their web traffic and threatens the economic viability of traditional lexicography.

Mar 17, 2026 · 4 min read · By AI Intelligence Brief Editorial


Mentioned

Merriam-Webster (company), Britannica (company), OpenAI (company), ChatGPT (product), LLM (technology)

Key Intelligence

Key Facts

1. Lawsuit filed on March 17, 2026, by Merriam-Webster and Britannica against OpenAI.
2. Plaintiffs allege ChatGPT was trained on proprietary definitions without a licensing agreement.
3. The complaint highlights 'cannibalization' of web traffic as a primary economic harm.
4. Merriam-Webster argues that its unique arrangement and expression of facts are copyright-protected.
5. The suit follows similar high-profile legal actions from the New York Times and the Authors Guild.

Who's Affected

OpenAI (company): Negative
Merriam-Webster (company): Positive
AI Developers (technology): Negative

Analysis

The legal battleground for generative AI has expanded from the realms of creative literature and visual arts into the foundational world of reference data. Merriam-Webster and its parent company, Britannica, filed a lawsuit on March 17, 2026, against OpenAI, claiming that the AI giant’s flagship product, ChatGPT, was built using vast quantities of their copyrighted dictionary and encyclopedia entries without authorization. This case represents a significant escalation in the ongoing conflict between content owners and AI developers, shifting the focus to the highly structured, factual data that gives Large Language Models (LLMs) their linguistic precision and semantic depth.

At the heart of the complaint is the allegation that OpenAI’s training process involved scraping millions of definitions, etymologies, and usage examples that Merriam-Webster and Britannica have spent decades, and in some cases centuries, curating. While facts themselves are not copyrightable under U.S. law, the original expression, selection, and arrangement of those facts in a dictionary can be protected. The plaintiffs argue that ChatGPT does not merely provide information but replicates the unique voice and structural expertise of their reference works, effectively creating a derivative product that competes directly with the original sources.


The publishers' central economic argument is the cannibalization of web traffic. For decades, the business model for digital dictionaries has relied on search engine traffic leading users to their websites, where ad revenue and premium subscriptions sustain the costly work of lexicography. By providing instant, high-quality definitions within its own interface, ChatGPT removes the need for users to visit external reference sites. Merriam-Webster and Britannica contend that this is not a case of transformative fair use, but rather a parasitic relationship in which the AI uses the publishers' own data to render their primary distribution channels obsolete.

This lawsuit follows a pattern of litigation from other content-heavy industries, including suits brought by the New York Times and the Authors Guild. However, the Merriam-Webster case is unique because it targets the very building blocks of language that AI models require to function. If the courts side with the publishers, it could force a radical shift in how AI companies source their training data. We may see the emergence of a mandatory licensing regime for reference data, similar to how music streaming services pay royalties to labels and artists. For OpenAI, which has struck licensing deals with news organizations such as Axel Springer and the Associated Press, this lawsuit suggests that the fair use defense is becoming increasingly difficult to maintain as a blanket strategy.

What to Watch

Industry analysts suggest that the outcome of this case will hinge on whether the court views ChatGPT’s output as a substitute for a dictionary. If a user asks for a definition and receives a response that is substantially similar to a Merriam-Webster entry, the claim of market harm becomes much stronger. Conversely, OpenAI is expected to argue that its models learn the patterns of language rather than storing specific entries, and that the resulting definitions are synthesized on the fly. This technical distinction will be a central point of contention as the discovery process begins.

Looking ahead, the resolution of this dispute will likely dictate the future of the Open Web. If reference giants like Britannica can successfully gate their content behind paywalls or licensing fees, the era of free, high-quality information being easily accessible to AI scrapers may be coming to an end. This could lead to a bifurcated AI landscape where only the wealthiest tech companies can afford to train models on verified, high-authority data, while smaller players are left with lower-quality, unverified web-scraped content. The case is a stark reminder that in the AI economy, data is not just the new oil—it is the sovereign territory of the institutions that created it.

Timeline

Nov 30, 2022: ChatGPT Launch
Dec 27, 2023: NYT Lawsuit
Mar 17, 2026: Dictionary Suit Filed

"Merriam-Webster and Britannica Sue OpenAI for 'Cannibalizing' Dictionary Data." AI Intelligence Brief, March 17, 2026. https://getaibrief.com/story/merriam-webster-britannica-openai-lawsuit


How we covered this story

Every story in our AI coverage is assembled from multiple primary sources, cross-referenced for factual consistency, and scored along three independent dimensions: sentiment, operational impact, and source-cluster confidence. Single-source rumors and unverifiable claims do not pass our editorial gate. When a story shows "Verified by N sources" with N≥2, the development is independently corroborated; when N=1, we mark it explicitly so readers can weigh the signal accordingly.

Impact scoring uses a 1-10 scale weighted toward regulatory, financial, and operational consequence rather than coverage volume. A topic that runs in every outlet but moves no real decisions ranks lower than a niche regulatory filing that reshapes how operators in the AI space have to behave. Read our full methodology for the scoring rubric, our glossary for term definitions, and our trends index for the longitudinal view across the beat.

Sources are linked to a story only once they clear our classification pipeline at a minimum 35 percent relevance threshold. This methodology, last reviewed in July 2026, follows multi-source corroboration standards recommended by journalism research bodies such as the Reuters Institute for the Study of Journalism.


Signal on this page	What it tells you
Verified by N sources	Independent corroboration count. N≥2 is our confidence floor; N=1 is marked explicitly.
Impact score (1-10)	Regulatory + financial + operational weight. 8+ signals an experienced-operator action item.
Sentiment	Five-tier classification trained on labeled AI-specific corpora.
Timeline	Where applicable, the related-events sequence that contextualizes today's development.
