Reasoning Models

Advanced research into reinforcement learning algorithms that enable more sophisticated reasoning

Our research in reasoning models focuses on creating AI training algorithms that enable models to reason in a deeper and more generalised way.

Current RL training for LLMs relies largely on verifiable rewards, which limits models' ability to reason in non-verifiable domains that are fuzzy, open-ended, and linguistic.
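
To make the contrast concrete, here is a generic sketch of a verifiable reward, not code from this project: the reward is a deterministic check against ground truth, such as exact-match on a maths answer. The helper names are hypothetical.

```python
import re


def extract_final_answer(completion: str) -> str | None:
    """Pull the model's final answer from a completion like '... Answer: 42'."""
    match = re.search(r"Answer:\s*(\S+)", completion)
    return match.group(1) if match else None


def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 iff the extracted answer matches the ground truth.

    This only works when a ground truth exists and matching is well defined.
    For fuzzy, open-ended, linguistic tasks (essay quality, legal argument),
    no such check is available, which is exactly the limitation above.
    """
    answer = extract_final_answer(completion)
    return 1.0 if answer == ground_truth else 0.0


# A checkable maths answer gets a clean reward signal...
assert verifiable_reward("Reasoning steps... Answer: 42", "42") == 1.0
# ...whereas an open-ended completion has no ground truth to match against.
```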

Enabling deeper reasoning in these areas is crucial for real-world applications in linguistic logic, law, and science.

Our reward-assignment strategy extends reward attribution to every textual domain, including non-verifiable ones.
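
The strategy itself is not detailed here, but a minimal sketch of one standard way to attribute rewards in non-verifiable domains, a learned preference model scoring free-form text, gives the flavour. The checkpoint named below is just a public example, not part of this work.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Example public reward model; any learned preference model would do here.
MODEL_NAME = "OpenAssistant/reward-model-deberta-v3-large-v2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
reward_model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)


def model_based_reward(prompt: str, completion: str) -> float:
    """Scalar reward for a (prompt, completion) pair from a learned model.

    Unlike an exact-match verifier, this signal is defined for any textual
    domain, because it comes from learned judgments of quality rather than
    a deterministic check against ground truth.
    """
    inputs = tokenizer(prompt, completion, return_tensors="pt", truncation=True)
    with torch.no_grad():
        score = reward_model(**inputs).logits[0].item()
    return score
```

Because the score comes from learned judgments rather than an exact-match check, the same interface covers open-ended legal or scientific writing where no verifier exists.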

Deeper Reasoning

Creating AI training algorithms that enable models to reason in a deeper and more generalised way

Beyond Verifiable Rewards

Moving beyond current RL training limitations to handle fuzzy, open-ended, and linguistic domains

Real-World Applications

Enabling deeper reasoning for linguistic logic, law, and science domains

Universal Rewards

Extending reward attribution to every textual domain, including non-verifiable ones