Quebec's national library is building a database of cultural and government content specifically designed to train AI systems on French and Indigenous language data. The initiative directly addresses a well-documented problem: major generative AI platforms consistently underperform when answering questions about Quebec society, culture, and economy.

Bibliothèque et Archives nationales du Québec, known as BAnQ, launched the experimental phase of its proposed government and cultural databank in French and Indigenous languages after completing a feasibility study earlier in 2026. The project aims to address concerns that major generative AI systems often struggle to provide reliable information about Quebec because of the limited amount of Quebec-related data available to train them. (Lethbridge Herald)

Why AI Gets Quebec Wrong

The problem has been documented at the policy level for over a year. A 2024 report by Quebec's innovation council attributed the issue in part to the "very small quantity of data on Quebec" available in AI training datasets. (Chronicle Journal)

Destiny Tchéhouali, co-holder of a Quebec-based research chair focused on French-language AI and digital technologies, said Quebec culture remains underrepresented in current AI systems. "And when we also talk about Indigenous peoples, we run an even greater risk of all these biases," said Tchéhouali, a professor at Université du Québec à Montréal. (Chronicle Journal)

This is not a niche academic concern. Businesses in Quebec using generative AI tools for customer communications, legal work, or content production are working with systems that have significant blind spots around local regulations, cultural context, and language nuance.

How the Databank Would Work

BAnQ plans to begin with its own collections before considering data from other sources. Tchéhouali described the proposed database as "strategic infrastructure" that could help establish guidelines for how local content is identified, catalogued, and tracked within AI systems. (Chronicle Journal)

Valérie D'Amour, who led the feasibility study, said the project is still in early stages. "All scenarios are a little bit on the table right now. We have a lot of ideas and we want to validate the possibilities with cultural stakeholders, as well as with data owners and providers, who will be involved in the discussions."

The feasibility study envisions the platform becoming operational by 2029, with a five-year budget of nearly $10.5 million through 2030, including operating and capital costs. D'Amour said the timeline will be reassessed following the experimental phase. (Lethbridge Herald)

The Creator Compensation Problem

The project raises a tension that goes beyond Quebec. Artists and content creators whose work would feed the database remain skeptical, even if they are paid for it.

"The main criticism we hear in the field is that, even if artists earn income from it, they are still feeding the beast that will eventually be used to replace contracts they may lose because of AI," said Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research. (Lethbridge Herald)

This mirrors debates happening across the creative industries globally. The question of whether licensing content to AI training programs creates long-term risk for the creators involved remains unresolved. For businesses thinking about AI for content creation, this tension is worth watching: the legal and ethical frameworks around AI training data are still being written.

What This Means for Businesses Using AI

Quebec's move is part of a broader global pattern. Governments are starting to treat AI training data as strategic infrastructure, not just a technical problem for tech companies to solve.

For business leaders, the practical implication is straightforward. If your company operates in French, serves Quebec customers, or works with Indigenous communities, the AI tools you're using today are likely missing important context. That gap affects the quality of AI outputs in customer service, legal analysis, marketing, and research.

The databank won't fix this overnight: a 2029 operational target means current tools will remain limited for several more years. But it signals that sovereign AI data infrastructure is becoming a policy priority at the regional level, not just nationally. Expect more governments to follow this model.

Cut Through the Noise

What is Quebec's AI cultural databank? Quebec's national library, BAnQ, is building a database of French and Indigenous language cultural and government content designed to train AI systems. The project launched its experimental phase in May 2026 after BAnQ completed a feasibility study. It responds to concerns that major generative AI platforms struggle with Quebec-related topics because of limited local training data.

How much is Quebec spending on the AI databank? The feasibility study estimates a five-year budget of nearly $10.5 million through 2030, covering operating and capital costs. The platform is targeted to become operational by 2029, though that timeline will be reviewed after the experimental phase concludes.

Why do AI systems struggle with French and Quebec content? Quebec's innovation council identified the problem in a 2024 report, attributing it in part to the very small quantity of Quebec-specific data in AI training datasets. UQAM professor Destiny Tchéhouali notes that the risk of bias is even greater for content about Indigenous peoples.

Will artists be compensated for contributing to the AI databank? Compensation for creators is part of the discussion, but artists have raised concerns that, even if they earn income from licensing their work, that work may ultimately help train tools that replace their future contracts. The creator compensation question remains unresolved as the project moves into its experimental phase.


