Three ways to get involved: research collaboration, open source contributions, or sending us data. We are selective but genuinely open — the right collaborator on the right problem makes a real difference to what we can learn.
Each path has different expectations and different benefits. Read them carefully before reaching out — the clearer you are about which path you are interested in, the faster we can respond.
If you are working on problems that overlap with our research streams — agent reliability, evaluation methodology, AI system failure modes, or developer tooling — we occasionally partner on experiments with external researchers.
Each open-source repository has its own CONTRIBUTING.md with specific guidelines. Generally: we welcome PRs, issue reports, documentation improvements, and test coverage additions.
Anonymised real-world data from your AI systems helps make our benchmarks more representative. We credit data contributors in the methodology notes and share relevant findings back.
These are not formal rules — they are the norms that make collaborations go well. We hold ourselves to them and expect the same from contributors.
Vague offers to "help" are hard to act on. Tell us what you have: a dataset, a specific skill, a research context, time to review methodology.
Research timelines depend on contributors following through. If circumstances change, tell us early. An early "I cannot continue" is far better than a silent drop-off.
We want people who will challenge our methodology and point out where we are wrong. We do not want people who are rude about it. The distinction matters.
We credit contributions accurately. If you reviewed methodology, you are credited as a reviewer. If you co-designed the experiment, you are a co-author. We do not inflate or deflate credit.
If a collaboration produces a result that contradicts our prior findings, we publish that. Contributing to a result that overturns a previous conclusion is still a valid contribution.
Research takes longer than planned. We communicate timeline changes proactively and expect the same from collaborators. Build slack into your own commitments.
These are open needs across our four research streams as of March 2026. We update this list quarterly. Reaching out about a specific need here gets a faster response than a general inquiry.
We respond to every genuine inquiry, even if the answer is “not right now.” The more context you give us about your background and what you are working on, the more useful our response can be.