Breaking Chains: Cohere’s Free LLM APIs Shake OpenAI’s Foundation

7 min read · Nov 24, 2023

In this article, we test Cohere’s free LLM APIs and assess their potential impact on the open-source community.


Cohere’s platform gives developers and businesses access to natural language processing (NLP) powered by large language models. Cohere’s AI is designed to be easy to use, accessible, and private.

Cohere’s recent breakthroughs in LLM development are remarkable steps toward creating unbiased and responsible AI solutions for consumers. Despite these advancements, LLMs are still in the development phase, and Cohere strives to address potential biases and ensure responsible deployment.

You can try Cohere’s Coral model on Cohere’s website.

Cohere APIs

Cohere provides a variety of powerful APIs that enable developers to integrate cutting-edge language models into their projects. Here are some of the key APIs offered by Cohere:

  • Chat
  • Generation
  • Embeddings
  • Reranking
  • Classification
  • Language Detection
  • Summarization
  • Tokenize
  • Detokenize

Chat Endpoint

The chat endpoint allows users to have conversations with a Large Language Model (LLM) from Cohere. Users can send messages as part of a persisted conversation using the conversation_id parameter, or they can pass in their own conversation history using the chat_history parameter.
The endpoint also accepts parameters such as connectors and documents, which enrich the conversation with external knowledge. Cohere calls this "Retrieval Augmented Generation", or "RAG".

!pip -q install cohere
import cohere
co = cohere.Client('<<apiKey>>')

You can find your API key on the API Reference page.

response = co.chat(
    chat_history=[
        {"role": "USER", "message": "Who is current CEO of Openai"},
        {"role": "CHATBOT", "message": "There has been a lot of turmoil surrounding the leadership of OpenAI in recent months. Former CEO, Sam Altman, was briefly ousted in November 2023, with the board appointing Chief Technology Officer, Mira Murati, as interim CEO. However, just days later, a deal was struck and Altman returned as CEO of the company. There has been speculation about Altman's dealings with the board, with claims that he had been dishonest in his communication with them, and a desire from Altman to install an entirely new slate of directors."},
    ],
    message="What happened in openai in lastweek, explain clearly",
    # perform web search before answering the question. You can also use your own custom connector.
    connectors=[{"id": "web-search"}],
)
print(response.text)

It was quite a turbulent week for OpenAI, with the board initially announcing that Sam Altman would be departing as CEO, and technology chief Mira Murati would be taking on the role of interim CEO. This was met with a large proportion of employees calling for Altman's return, including Murati herself. After negotiations, it was then announced that Altman would be returning to the company as CEO just 5 days after his initial departure, with a new initial board of directors.

Here are the new members of the board:
- Bret Taylor (Chair)
- Larry Summers
- Adam D'Angelo

Is there anything else you'd like to know about the events at OpenAI last week?
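
If you manage the conversation yourself through the chat_history parameter, you need to append each exchange before the next call. The sketch below shows one way to do that with plain Python lists, using the same role/message dictionaries as the example above. The helper name append_turn is hypothetical and not part of the Cohere SDK, and the bot reply here is a stand-in for what response.text would return.

```python
def append_turn(chat_history, user_message, bot_reply):
    """Return a new history list with one USER/CHATBOT exchange appended."""
    return chat_history + [
        {"role": "USER", "message": user_message},
        {"role": "CHATBOT", "message": bot_reply},
    ]


# Start with an empty history and record one exchange.
history = []
history = append_turn(
    history,
    "Who is current CEO of Openai",
    "Sam Altman returned as CEO in November 2023.",  # stand-in for response.text
)

# The resulting list is in the shape shown for chat_history above.
for turn in history:
    print(turn["role"], "->", turn["message"])
```

Returning a new list instead of mutating the old one keeps earlier snapshots of the conversation intact, which can help when you want to retry a turn.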

Generate Endpoint

This endpoint generates realistic text conditioned on a given input.

response = co.generate(
    prompt="Write an introductory paragraph for a blog post about language models.",
)
print(response.generations[0].text)


Language models have emerged as powerful tools for natural language processing and artificial intelligence. These models are designed to analyze, generate, and understand human language, and have revolutionized various industries and applications.
The popularity and usage of language models have skyrocketed, showcasing their effectiveness and potential in numerous domains. As researchers and developers continue to innovate, the future for language models remains bright, enabling us to tackle complex language-based challenges and unlock new opportunities for human-computer interaction.

Embedding Endpoint

This endpoint returns text embeddings. An embedding is a list of floating point numbers that captures semantic information about the text that it represents.

Embeddings can be used to build text classifiers and to power semantic search, among other uses.

response = co.embed(
    texts=['what is large language models'],
)
print(response.embeddings[0])

[-0.04324341, -0.04257202, -0.029647827, -0.05340576, -0.06439209, 0.021331787, -0.09307861,......]
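
To make the semantic search use case concrete, here is a minimal sketch that ranks documents by cosine similarity to a query vector. The three-dimensional vectors are made-up toy values, not real Cohere embeddings (which are much longer), and the function names are illustrative. It assumes no vector is all zeros.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings returned by an embed call.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.1, 0.9],  # points in nearly the same direction
    [0.5, 0.5, 0.5],  # partially aligned
]

ranking = rank_by_similarity(query_vec, doc_vecs)
for index, score in ranking:
    print(index, round(score, 4))
```

In a real pipeline you would embed the query and the documents with the same model and feed those vectors into the same ranking step.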

Reranking Endpoint

This endpoint takes in a query and a list of texts and produces an ordered array with each text assigned a relevance score.

docs = ['Carson City is the capital city of the American state of Nevada.',
'The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.',
'Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.',
'Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.']

response = co.rerank(
    model='rerank-english-v2.0',
    query='What is the capital of the United States?',
    documents=docs,
    top_n=3,
)
print(response.results)

[RerankResult<document['text']: Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.,index: 2, relevance_score: 0.98005307>, RerankResult<document['text']: Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states., index: 3, relevance_score: 0.27904198>, RerankResult<document['text']: Carson City is the capital city of the American state of Nevada., index: 0, relevance_score: 0.10194652>]
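
Note how sharply the relevance score drops after the first result. A common follow-up step is to keep only results above a cutoff. The sketch below does this on plain dictionaries that copy the index and score values from the output above. The dictionaries stand in for the SDK's result objects, and the 0.5 threshold is an arbitrary assumption that you would tune for your own data.

```python
# Index and score values copied from the rerank output above.
results = [
    {"index": 2, "relevance_score": 0.98005307},
    {"index": 3, "relevance_score": 0.27904198},
    {"index": 0, "relevance_score": 0.10194652},
]


def filter_relevant(results, min_score):
    """Return the document indices whose relevance score meets the threshold."""
    return [r["index"] for r in results if r["relevance_score"] >= min_score]


kept = filter_relevant(results, min_score=0.5)  # 0.5 is an arbitrary cutoff
print(kept)
```

With this cutoff, only the Washington, D.C. document (index 2) survives, which matches the intent of the query.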

Classification Endpoint

This endpoint makes a prediction about which label fits the specified text inputs best. To make a prediction, Classify uses the provided examples of text + label pairs as a reference.

from cohere.responses.classify import Example

# Labeled examples the classifier uses as a reference.
examples = [
    Example("Dermatologists don't like her!", "Spam"),
    Example("'Hello, open to this?'", "Spam"),
    Example("I need help please wire me $1000 right now", "Spam"),
    Example("Nice to know you ;)", "Spam"),
    Example("Please help me?", "Spam"),
    Example("Your parcel will be delivered today", "Not spam"),
    Example("Review changes to our Terms and Conditions", "Not spam"),
    Example("Weekly sync notes", "Not spam"),
    Example("'Re: Follow up from today's meeting'", "Not spam"),
    Example("Pre-read for tomorrow", "Not spam"),
]
# Texts to classify.
inputs = [
    "Confirm your email address",
    "hey i need u to send some $",
]
response = co.classify(
    inputs=inputs,
    examples=examples,
)
print(response.classifications)

[Classification<prediction: "Not spam", confidence: 0.5661598, labels: {'Not spam': LabelPrediction(confidence=0.5661598), 'Spam': LabelPrediction(confidence=0.43384025)}>, Classification<prediction: "Spam", confidence: 0.9909811, labels: {'Not spam': LabelPrediction(confidence=0.009018883), 'Spam': LabelPrediction(confidence=0.9909811)}>]
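
The first prediction above has a confidence of only about 0.57, so you may not want to act on it automatically. Here is a minimal sketch of thresholding on the confidence values, using plain dictionaries that copy the numbers from the output. The function name decide, the 0.8 cutoff, and the "Needs review" fallback label are all assumptions for illustration.

```python
def decide(label_confidences, min_confidence=0.8, fallback="Needs review"):
    """Pick the most confident label, or the fallback if it is not confident enough."""
    label, confidence = max(label_confidences.items(), key=lambda kv: kv[1])
    return label if confidence >= min_confidence else fallback


# Confidence values copied from the classification output above.
first = {"Not spam": 0.5661598, "Spam": 0.43384025}
second = {"Not spam": 0.009018883, "Spam": 0.9909811}

print(decide(first))
print(decide(second))
```

Under these assumptions, "Confirm your email address" is routed to manual review, while "hey i need u to send some $" is confidently labeled as spam.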

Language Detection Endpoint

This endpoint identifies which language each of the provided texts is written in.

response = co.detect_language(
    texts=['தமிழ், உலகில் உள்ள முதன்மையான மொழிகளில் ஒன்றும் செம்மொழியும் ஆகும்.', 'Hello world']
)
print(response.results)

[Language<language_code: "ta", language_name: "Tamil">, Language<language_code: "en", language_name: "English">]

Summarize Endpoint

This endpoint generates a summary in English for a given text.

text = (
    "Ice cream is a sweetened frozen food typically eaten as a snack or dessert. "
    "It may be made from milk or cream and is flavoured with a sweetener, "
    "either sugar or an alternative, and a spice, such as cocoa or vanilla, "
    "or with fruit such as strawberries or peaches. "
    "It can also be made by whisking a flavored cream base and liquid nitrogen together. "
    "Food coloring is sometimes added, in addition to stabilizers. "
    "The mixture is cooled below the freezing point of water and stirred to incorporate air spaces "
    "and to prevent detectable ice crystals from forming. The result is a smooth, "
    "semi-solid foam that is solid at very low temperatures (below 2 °C or 35 °F). "
    "It becomes more malleable as its temperature increases.\n\n"
    "The meaning of the name \"ice cream\" varies from one country to another. "
    "In some countries, such as the United States, \"ice cream\" applies only to a specific variety, "
    "and most governments regulate the commercial use of the various terms according to the "
    "relative quantities of the main ingredients, notably the amount of cream. "
    "Products that do not meet the criteria to be called ice cream are sometimes labelled "
    "\"frozen dairy dessert\" instead. In other countries, such as Italy and Argentina, "
    "one word is used for all variants. Analogues made from dairy alternatives, "
    "such as goat's or sheep's milk, or milk substitutes "
    "(e.g., soy, cashew, coconut, almond milk or tofu), are available for those who are "
    "lactose intolerant, allergic to dairy protein or vegan."
)

response = co.summarize(
    text=text,
)
print(response.summary)

Ice cream is a popular frozen dessert made from dairy products or alternatives, sweeteners, and spices. It is a foam made by cooling and stirring a liquid mixture to incorporate air spaces. While it is solid at very low temperatures, it becomes more malleable as its temperature increases.
The name "ice cream" has different definitions depending on the country, with some countries having stricter regulations on commercial use. Ice cream made from dairy alternatives is available for those who are lactose intolerant or vegan.

Tokenize Endpoint

This endpoint splits input text into smaller units called tokens using byte-pair encoding (BPE). To learn more about tokenization and byte pair encoding, see the tokens page.
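
To give a feel for how byte-pair encoding builds its vocabulary, here is a toy sketch of the merge-learning loop: start from single characters and repeatedly merge the most frequent adjacent pair. This is a generic textbook-style illustration with made-up function names, not Cohere's actual tokenizer, which is trained on far more data and handles bytes, whitespace, and special tokens.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words and return the most common one."""
    pairs = Counter()
    for symbols in words:
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += 1
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(symbols, pair):
    """Replace every occurrence of the adjacent pair with a single merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to num_merges merge rules from a list of words."""
    words = [list(word) for word in corpus]  # start from single characters
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = [merge_pair(symbols, pair) for symbols in words]
    return merges, words


merges, segmented = learn_bpe(["low", "lower", "lowest"], num_merges=2)
print(merges)
print(segmented)
```

After two merges the shared prefix "low" has become a single symbol, which is the same effect that lets a real tokenizer represent common fragments such as "token" with one ID.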

response = co.tokenize(
    text='tokenize me! :D',
)
print(response)

cohere.Tokens {
    tokens: [10002, 2261, 2012, 8, 2792, 43]
    token_strings: ['token', 'ize', ' me', '!', ' :', 'D']
    meta: {'api_version': {'version': '1'}}
}
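
The two lists in this output line up one to one: each token ID corresponds to one string fragment, and concatenating the fragments gives back the original input. The short sketch below checks this with the values copied from the output above, using plain Python lists and no API call.

```python
# Values copied from the tokenize output above.
tokens = [10002, 2261, 2012, 8, 2792, 43]
token_strings = ['token', 'ize', ' me', '!', ' :', 'D']

# Pair each token ID with the text fragment it represents.
token_table = list(zip(tokens, token_strings))
for token_id, fragment in token_table:
    print(token_id, repr(fragment))

# Joining the fragments reconstructs the input text.
reconstructed = "".join(token_strings)
print(reconstructed)
```

Note that the leading spaces belong to the fragments themselves (' me', ' :'), which is why a plain join with no separator is enough.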

Detokenize Endpoint

This endpoint takes byte-pair-encoded tokens and returns their text representation. To learn more about tokenization and byte pair encoding, see the tokens page.

response = co.detokenize(
    tokens=[10104, 12221, 1315, 34, 1420, 69],
)
print(response.text)

Anton Mun🟣;🥭^


In our tests, the models performed well overall: responses were fast and accurate. Cohere appears to have focused on reducing hallucination, and it shows. This stands out against OpenAI’s APIs, which were experiencing delays at the time of writing. In our view, Cohere offers not only a more affordable option but also a superior model compared to OpenAI.

🤖 Exploring Generative AI & LLM. Join the Gathnex community for cutting-edge discussions and updates!