Hello,

Imagine for just a moment that we live in a world where AI is trustworthy, and tech companies provide transparency about the products they build – including whether they’re using our personal data to build them. That world is possible… but only if all of us in the Mozilla community come together to make it happen.

The reality today is that we simply don’t know what data the latest generative AI models from leading AI companies were trained on – because their developers aren’t telling anyone. But we do know that they rely on massive archives of web crawl data containing harmful content, which AI companies would need to filter out with care before training their models. Mozilla’s recent research looked into Common Crawl, a very popular source of such data among AI builders, and found that such care is often lacking.1 And we also know that companies like Google and Microsoft have massive amounts of people’s personal data at their disposal, including deeply personal things like saved chats, financial spreadsheets, and family photos and videos.

That’s why Mozilla is calling on OpenAI, Microsoft, and Google to be transparent about the sources of the data used to build their AI models. Understanding where that data came from, and how these companies prepared it before training, is the first step toward addressing the real-world harms that generative AI is causing. Yet leading AI companies refuse this basic level of transparency for “competitive reasons.”2, 3, 4 In other words: transparency is not in their commercial interest.

Taking on powerful tech companies will not be easy. It is going to take a massive amount of advocacy and campaigning by the Mozilla community to successfully pressure these companies to be more transparent about the training data they use to build AI.
Mozilla's campaign work is made possible by tens of thousands of grassroots donors who have contributed to reclaim the internet. We're hoping you'll join them by adding your donation today. Contribute €23 to Mozilla so that together we can demand transparency from irresponsible tech companies and ensure AI is trustworthy.

Mozilla has been leading efforts for many years to fund, build, and advocate for trustworthy AI.5 That work is now more important than ever, as today's AI ecosystem contains serious flaws that have allowed the real-world harms produced by AI to go unchecked. This latest campaign to demand transparency from irresponsible tech companies about their AI models is one way the Mozilla community will use our collective power to ensure the next wave of AI serves the public good. But we are going to be taking on some of the most powerful companies in the world, and it will take all of us, working together, to turn this vision into reality.

If you agree it's time for OpenAI, Google, and Microsoft to be held accountable for how they build AI models, we hope you'll consider making a contribution today. Contribute €23 to support Mozilla's campaign efforts to take on irresponsible tech companies and create a new status quo of trustworthy AI.

Thank you for all you do for the internet.

Michael Whitney
Director, Digital Engagement
Mozilla
More information:

1. Mozilla Foundation: Training Data for the Price of a Sandwich: Common Crawl’s Impact on Generative AI. Written by Stefan Baack and Mozilla Insights. Published 6 February 2024.
2. Meta: Llama 2 FAQ. Accessed 19 March 2024.
3. The Verge: OpenAI co-founder on company’s past approach to openly sharing research: ‘We were wrong’. Published 15 March 2024.
4. TechCrunch: Google’s Gemini isn’t the generative AI model we expected. Published 6 December 2023.
5. Mozilla.org: Mozilla’s Biggest AI Moments. Published 31 January 2024.