Cohere’s smallest, fastest R-series model excels at RAG, reasoning in 23 languages

Cohere, an AI startup intent on supporting a wide range of enterprise use cases, including those that do not require expensive, resource-intensive large language models (LLMs), has released Command R7B, the smallest and fastest model in its R series.

Command R7B is built to support rapid prototyping and iteration, and uses retrieval-augmented generation (RAG) to improve accuracy. The model has a 128K context length and supports 23 languages. Cohere says it outperforms other open-weight models in its class (Google’s Gemma, Meta’s Llama, Mistral’s Ministral) on tasks such as math and coding.
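To make the RAG workflow concrete, here is a minimal sketch of grounded generation using Cohere’s Python SDK. The client class, chat method signature, document format, and the model identifier "command-r7b-12-2024" are assumptions based on Cohere’s published SDK conventions, not details from the announcement, so check the official docs before relying on them.

# Minimal RAG-style sketch (assumed Cohere Python SDK v2 interface; verify names against Cohere's docs).
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

# Passages the model should ground its answer in -- the core idea behind RAG.
documents = [
    "Command R7B supports a 128K context window and 23 languages.",
    "Cohere's R-series models are tuned for enterprise RAG and tool use.",
]

response = co.chat(
    model="command-r7b-12-2024",   # assumed model identifier
    messages=[{"role": "user", "content": "What context length does Command R7B support?"}],
    documents=documents,           # grounding passages passed alongside the query
)

print(response.message.content[0].text)  # answer grounded in the supplied documents

Passing the documents alongside the query is what lets the model cite and stay anchored to external data instead of relying solely on its weights.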

“This model is designed for developers and enterprises who need to optimize speed, cost performance, and computing resources for their use cases,” Aidan Gomez, co-founder and CEO of Cohere, wrote in a blog post announcing the new model.

Outperforming competitors in math, coding, and RAG

Cohere has maintained a strategic focus on enterprises and their unique use cases. The company introduced Command R in March and the more powerful Command R+ in April, and has rolled out upgrades throughout the year to support speed and efficiency. It is billing Command R7B as the “final” model in its R series, and said it will release the model weights to the AI research community.

Cohere said its key areas of focus when developing Command R7B were improving performance in math, reasoning, code, and translation. The company appears to have succeeded in those areas, with the new small model coming out on top of the HuggingFace Open LLM Leaderboard against similarly sized open-weight models such as Gemma 2 9B, Ministral 8B, and Llama 3.1 8B.

Additionally, the smallest R-series model outperforms competing models in areas such as AI agents, tool use, and RAG, which helps improve accuracy by grounding model outputs in external data. Cohere said Command R7B excels at conversational tasks such as tech workplace and enterprise risk management (ERM) assistance; technical facts; media workplace and customer service support; HR FAQs; and summarization. Cohere also noted that the model is “very good” at retrieving and manipulating numerical information in financial settings.

All told, Command R7B ranked first, on average, across key benchmarks including instruction-following evaluation (IFEval), Big Bench Hard (BBH), graduate-level Google-proof Q&A (GPQA), multistep soft reasoning (MuSR), and massive multitask language understanding (MMLU).

Eliminating unnecessary call functions

Command R7B can be augmented with tools such as search engines, APIs, and vector databases. Cohere reports that the model’s tool use outperforms competitors on the Berkeley Function-Calling Leaderboard, which evaluates a model’s accuracy at function calling (connecting to external data and systems).
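As a rough illustration of the function-calling pattern that leaderboard measures, the sketch below declares a single tool and lets the model decide whether to call it. The tool name get_stock_price is hypothetical, and the schema, method names, and model identifier are assumptions drawn from Cohere’s SDK conventions rather than from the announcement.

# Function-calling sketch (assumed Cohere Python SDK v2 interface; names may differ).
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")

# One hypothetical tool the model may decide to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_stock_price",  # hypothetical helper, not a real Cohere tool
        "description": "Return the latest price for a ticker symbol.",
        "parameters": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"],
        },
    },
}]

response = co.chat(
    model="command-r7b-12-2024",  # assumed model identifier
    messages=[{"role": "user", "content": "What is ACME trading at right now?"}],
    tools=tools,
)

# If the model chose to call the tool, the reply carries a structured call
# (function name plus JSON arguments) instead of a plain text answer.
for call in (response.message.tool_calls or []):
    print(call.function.name, call.function.arguments)

The application would then execute the named function itself and send the result back to the model for a final answer; the model only emits the structured call.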

Gomez noted that this proves the model’s effectiveness in “real-world, diverse and dynamic environments” and eliminates the need for unnecessary call functions. This makes it a good option for building “fast and capable” AI agents. For example, Cohere points out that when acting as an internet-augmented search agent, Command R7B can break complex questions down into sub-goals, and it also performs well at advanced reasoning and information retrieval.

Command R7B’s small size allows it to be deployed on low-end and consumer GPUs, CPUs, and MacBooks, enabling on-device inference. The model is available now on the Cohere platform and HuggingFace. Pricing is $0.0375 per 1 million input tokens and $0.15 per 1 million output tokens.
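For a back-of-the-envelope sense of what those rates mean in practice, the short calculation below prices a hypothetical monthly workload; the token volumes are invented for illustration, only the per-million rates come from the announcement.

# Cost estimate at the listed per-million-token rates (workload numbers are hypothetical).
INPUT_RATE = 0.0375 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.15 / 1_000_000    # dollars per output token

input_tokens = 10_000_000         # hypothetical monthly input volume
output_tokens = 2_000_000         # hypothetical monthly output volume

cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"${cost:.3f}")             # 10M in + 2M out -> $0.375 + $0.30 = $0.675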

“It is an ideal choice for enterprises looking for a cost-effective model grounded in their internal documents and data,” Gomez wrote.
