stephane.bio

Research | Cohere For AI

https://cohere.com/research

Cohere For AI is a non-profit research lab that seeks to solve complex machine learning problems. We support fundamental research that explores the unknown, and are focused on creating more points of entry into machine learning research.

Fundamental research lab

We work at the frontier of AI progress with the goal of solving cutting-edge scientific problems. We see contributions to traditional conferences and publications in journals as an important part of our work, but we also support efforts that go “beyond the research paper” and encourage scientific communication through different mediums. We drive the creation of new research spaces and breakthroughs that change where, how, and by whom research is done. We believe that technology is powerful, and that empowering different perspectives ensures responsible innovation.

Open Science Initiative

We’re not just another research group. We are a hybrid lab with both a dedicated research staff and support for open science initiatives. We collaborate openly with independent researchers all over the world to conduct top-tier ML research.

Our open science research community is a space where researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We come together from over 100 countries around the world and support both large- and small-scale research collaborations.

Our programs

Advancing the NLP space through our programs.

Making models multilingual

Aya research project

About

Aya is a global project that aims to build a multilingual language model via instruction tuning, harnessing the collective wisdom and contributions of people from all over the world. It is the largest open science initiative in AI to date, involving 100+ independent researchers worldwide. Aya is open to anyone who is passionate about advancing the field of natural language processing and is committed to promoting open science. Learn more about the project in this blog post.

Join the Aya Discord server and start contributing in your language today.

Academic support

Research grant

Benefits

Cohere For AI research grants are designed to support academic partners who are conducting research with the goal of releasing a peer-reviewed scientific artifact. Our program provides academic partners, developers, researchers, and other members of our community with subsidized access to the Cohere API.

Exploring the unknown together

Scholars program

About

Our Scholars Program provides the opportunity to work alongside some of the best research and engineering experts in the world. We have created an open and supportive environment that provides an alternative point of entry into machine learning research.

Spotlight papers

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

Authors: Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Deb Roy, Sara Hooker

A cross-institutional effort involving experts across 13 institutions to shine a spotlight on data transparency and attribution in AI. dataprovenance.org

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

Authors: Meriem Boubdir, Edward Kim, Beyza Ermis, Marzieh Fadaee, Sara Hooker

Human evaluation of LLMs is critical, but comes at a high cost. This research proposes prompt ranking methods to make pairwise human evaluation more efficient.

Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models

Authors: Luiza Pozzobon, Beyza Ermis, Patrick Lewis, Sara Hooker

Toxicity definitions evolve and change over time. Why don't mitigation techniques account for this? With Goodtriever, we take the first steps toward continual toxicity mitigation.

The Grand Illusion: The Myth of Software Portability and Implications for ML Progress

Authors: Fraser Mince, Dzung Dinh, Jonas Kgomo, Neil Thompson, Sara Hooker

How portable are popular ML software frameworks? This work reveals how costly straying from a narrow set of hardware-software combinations can be.

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

Authors: Ted Zadouri, Ahmet Üstün, Arash Ahmadian, Beyza Ermiş, Acyr Locatelli, Sara Hooker

Can we use Mixture of Experts (MoE) for instruction tuning under extreme parameter constraints? Here, we push MoEs to the limit with ultra-lightweight experts, enabling parameter-efficient MoEs with high performance.

When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale

Authors: Max Marion, Ahmet Üstün, Luiza Pozzobon, Alex Wang, Marzieh Fadaee, Sara Hooker

Leveraging data pruning to examine what makes “good data.” We explore several metrics for measuring LLM pretraining data and find that we can remove up to 70% of pretraining data while achieving better test set performance.

Evaluating the Social Impact of Generative AI Systems in Systems and Society

Authors: Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Hal Daumé III, Jesse Dodge, Ellie Evans, Sara Hooker, Yacine Jernite, Alexandra Sasha Luccioni, Alberto Lusoli, Margaret Mitchell, Jessica Newman, Marie-Therese Png, Andrew Strait, Apostol Vassilev

How do we benchmark the social impact of generative systems? This cross-institutional research collaboration provides a guide to evaluating the social impact of Generative AI Systems.

Intriguing Properties of Quantization at Scale

Authors: Arash Ahmadian, Saurabh Dash, Hongyu Chen, Bharat Venkitesh, Stephen Gou, Phil Blunsom, Ahmet Üstün, Sara Hooker

On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research

Authors: Luiza Pozzobon, Beyza Ermis, Patrick Lewis, Sara Hooker

How changes to the Perspective API, which is widely used for toxicity evaluation, affect research reproducibility and rankings of model risk.

Past events and videos

Research is inherently a human endeavor, and our event series provides insights from beginning to breakthrough.

Fireside Chat: Colin Raffel

8-bit Methods for Efficient Deep Learning

Mechanistic Interpretability: Getting Started

Fireside Chat: Pablo Samuel Castro

Career creation for non-standard candidates

Fireside Chat: Samy Bengio

Meet our research team

Our staff brings together machine learning experts to advance the field through fundamental research. We are committed to open collaboration and to creating more points of entry into machine learning research through our Scholars Program.

Sara Hooker

Head, Cohere For AI

Marzieh Fadaee

Senior Research Scientist

Julia Kreutzer

Senior Research Scientist

Ahmet Üstün

Research Scientist

Beyza Ermis

Research Scientist

Madeline Smith

Operations and Community Lead

Brittawnya Prince

Operations Associate

Arash Ahmadian Dehkordi

Research Scholar

Luiza Pozzobon

Research Scholar

Viraat Aryabumi

Research Intern

Frequently Asked Questions

  • What’s C4AI’s origin story?
    • In 2017, a team of friends, classmates, and engineers started a distributed research collaboration, with a focus on creating a medium for early-career AI enthusiasts to engage with experienced researchers – they called it “for.ai.” Two of those co-founding members, Aidan Gomez and Ivan Zhang, later went on to co-found Cohere, and many of the original members went on to do exciting things (pursuing PhDs, working at industry and academic labs).
    • At the time, For AI was one of the first community-driven research groups to support independent researchers around the world. Today, Cohere is proud to reintroduce For AI as Cohere For AI, a dedicated research lab and community for exploring the unknown, together. Watch the C4AI history video here.

  • Do you charge for your educational programs or community membership?
    • We do not charge for participating in any of our programs, and are committed to supporting educational outreach programs, which include compute resources and infrastructure needed to participate in machine learning research.
  • Are you hiring for research positions or interns?
    • Our full list of open positions is available here.
  • How can I stay in touch?
    • To stay up to date on upcoming talks, sign up for our mailing list.
    • You can also apply to join our open science community or follow us on LinkedIn and Twitter.

Join our open science community

Collaborate with researchers, engineers, linguists, social scientists, and lifelong learners from 100+ countries on top-tier ML research.

Made with Notion, Published on Super - 2026 © Stephane Boghossian
