Cohere For AI is a non-profit research lab that seeks to solve complex machine learning problems. We support fundamental research that explores the unknown, and are focused on creating more points of entry into machine learning research.
Fundamental research lab
We work at the frontier of AI progress with the goal of solving cutting-edge scientific problems. We see contributions to traditional conferences and publications in journals as an important part of our work, but also support efforts that go “beyond the research paper” and encourage scientific communication through different mediums. We drive the creation of new research spaces and breakthroughs that change where, how and by whom research is done. We believe that technology is powerful, and empowering different perspectives ensures responsible innovation.
Open Science Initiative
We’re not just another research group. We are a hybrid lab with both a dedicated research staff and support for open science initiatives. We collaborate openly with independent researchers all over the world to conduct top-tier ML research.
Our open science research community is a space where researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We come together from over 100 countries around the world and support both large- and small-scale research collaborations.
Our programs
Advancing the NLP space through our programs.
Making models multilingual
Aya research project
About
Aya is a global project that aims to build a multilingual language model via instruction tuning that harnesses the collective wisdom and contributions of people from all over the world. It is the largest open science initiative in AI to date, involving 100+ independent researchers from all over the world. Aya is open to anyone who is passionate about advancing the field of natural language processing and is committed to promoting open science. Learn more about the project in this blog post.
Join the Aya Discord server and start contributing in your language today.
Academic support
Research grant
Benefits
Cohere For AI research grants are designed to support academic partners who are conducting research with the goal of releasing a peer-reviewed scientific artifact. Our program provides academic partners, developers, researchers, and other members of our community with subsidized access to the Cohere API.
Exploring the unknown together
Scholars program
About
Our Scholars Program provides the opportunity to work alongside some of the best research and engineering experts in the world. We have created an open and supportive environment that provides an alternative point of entry into machine learning research.
Spotlight papers
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI
Authors: Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Deb Roy, Sara Hooker
A cross-institutional effort involving experts across 13 institutions to shine a spotlight on data transparency and attribution in AI. dataprovenance.org
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
Authors: Meriem Boubdir, Edward Kim, Beyza Ermis, Marzieh Fadaee, Sara Hooker
Human evaluation of LLMs is critical, but comes at a high cost. This research proposes prompt ranking methods to make pairwise human evaluation more efficient.
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models
Authors: Luiza Pozzobon, Beyza Ermis, Patrick Lewis, Sara Hooker
Toxicity definitions evolve and change over time. Why don't mitigation techniques account for this? With Goodtriever, we take the first steps toward continual toxicity mitigation!
The Grand Illusion: The Myth of Software Portability and Implications for ML Progress
Authors: Fraser Mince, Dzung Dinh, Jonas Kgomo, Neil Thompson, Sara Hooker
How portable are popular ML software frameworks? This work reveals how costly straying from a narrow set of hardware-software combinations can be.
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
Authors: Ted Zadouri, Ahmet Üstün, Arash Ahmadian, Beyza Ermiş, Acyr Locatelli, Sara Hooker
Can we use Mixture of Experts (MoE) for instruction tuning under extreme parameter constraints? Here, we push MoEs to the limit with ultra-lightweight experts, enabling parameter-efficient MoEs with high performance.
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
Authors: Max Marion, Ahmet Üstün, Luiza Pozzobon, Alex Wang, Marzieh Fadaee, Sara Hooker
We leverage data pruning to examine what makes “good data.” We explore several metrics for measuring LLM pretraining data and find that we can remove up to 70% of pretraining data while achieving better test set performance.
Evaluating the Social Impact of Generative AI Systems in Systems and Society
Authors: Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Hal Daumé III, Jesse Dodge, Ellie Evans, Sara Hooker, Yacine Jernite, Alexandra Sasha Luccioni, Alberto Lusoli, Margaret Mitchell, Jessica Newman, Marie-Therese Png, Andrew Strait, Apostol Vassilev
How do we benchmark the social impact of generative systems? This cross-institutional research collaboration provides a guide to evaluating the social impact of Generative AI Systems.
Intriguing Properties of Quantization at Scale
Authors: Arash Ahmadian, Saurabh Dash, Hongyu Chen, Bharat Venkitesh, Stephen Gou, Phil Blunsom, Ahmet Üstün, Sara Hooker
On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research
Authors: Luiza Pozzobon, Beyza Ermis, Patrick Lewis, Sara Hooker
How changes to the Perspective API, used widely for toxicity evaluation, impact research reproducibility and rankings of model risk.
Past events and videos
Research is inherently a human endeavor, and our event series provides insights from beginning to breakthrough.
Video
Fireside Chat: Colin Raffel
Video
8-bit Methods for Efficient Deep Learning
Video
Mechanistic Interpretability: Getting Started
Video
Fireside Chat: Pablo Samuel Castro
Video
Career creation for non-standard candidates
Video
Fireside Chat: Samy Bengio
Meet our research team
Our staff brings together machine learning experts to contribute to progress in machine learning through fundamental research. We are committed to open collaboration and to creating more points of entry into machine learning research through our Scholars Program.
Sara Hooker
Head, Cohere For AI
Marzieh Fadaee
Senior Research Scientist
Julia Kreutzer
Senior Research Scientist
Ahmet Üstün
Research Scientist
Beyza Ermis
Research Scientist
Madeline Smith
Operations and Community Lead
Brittawnya Prince
Operations Associate
Arash Ahmadian Dehkordi
Research Scholar
Luiza Pozzobon
Research Scholar
Viraat Aryabumi
Research Intern
Frequently Asked Questions
- What’s C4AI’s origin story?
- In 2017, a team of friends, classmates, and engineers started a distributed research collaboration, with a focus on creating a medium for early-career AI enthusiasts to engage with experienced researchers. They called it “for.ai.” Two of those co-founding members, Aidan Gomez and Ivan Zhang, later went on to co-found Cohere, and many of the original members went on to do exciting things (pursuing PhDs, working at industry and academic labs). At the time, For AI was one of the first community-driven research groups to support independent researchers around the world. Today, Cohere is proud to reintroduce For AI as Cohere For AI, a dedicated research lab and community for exploring the unknown, together. Watch the C4AI history video here.
- Do you charge for your educational programs or community membership?
- We do not charge for participating in any of our programs, and we are committed to supporting educational outreach programs, including the compute resources and infrastructure needed to participate in machine learning research.
- Are you hiring for research positions or interns?
- Our full list of open positions is available here.
- How can I stay in touch?
- To stay up to date on upcoming talks, sign up for our mailing list. You can also apply to join our open science community or follow us on LinkedIn and Twitter.
Join our open science community
Collaborate with researchers, engineers, linguists, social scientists, and lifelong learners from 100+ countries on top-tier ML research.