C4AI Command-R - Hugging Face

C4AI Command-R is a research release of a 35-billion-parameter generative language model. It has freely accessible weights and is designed for applications such as reasoning, summarization, and question answering. Command-R supports multilingual generation in 10 languages and has strong Retrieval Augmented Generation (RAG) capabilities.

Creators: Cohere and Cohere For AI
Contact Information: Cohere For AI at cohere.for.ai
Licensing: Covered under the CC-BY-NC license, with an obligation to comply with the C4AI Acceptable Use Policy.
Model Identifier: c4ai-command-r-v01
Model Specifications: 35 billion parameters
Maximum Context Size: 128K

How to Use

# pip install transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Format message with the command-r chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids, 
    max_new_tokens=100, 
    do_sample=True, 
    temperature=0.3,
    )

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)

C4AI Command-R Model Overview

The model accepts text input only and generates text output only.

Model Structure

Command-R is an auto-regressive language model that uses an optimized transformer architecture. After pre-training, the model undergoes supervised fine-tuning (SFT) and preference training to align its behavior with human preferences for helpfulness and safety.

Language Compatibility

Command-R is finely tuned for high performance across multiple languages, including English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. It also incorporates pre-training data from an additional 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian, enhancing its multilingual capabilities.

The model supports a context length of 128K tokens.

Tool Interaction

Command-R has been trained for conversational tool use through a blend of supervised fine-tuning and preference fine-tuning, using a specific prompt format. Sticking to this format is advised for optimal performance, but experimenting with other prompt styles is encouraged.

For tool use, the model takes a conversation as input (optionally beginning with a user-system preamble) together with a list of available tools, and generates a JSON-formatted list of actions to execute on a subset of those tools. The action list may contain more than one action.

A unique feature is the ‘directly_answer’ tool, indicating the model’s choice to bypass other tools. Including this tool is recommended, but experimentation is welcome.
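The action-list idea above can be sketched in plain Python. This is a minimal illustration, not the model's official schema: the field names `tool_name` and `parameters`, the `internet_search` tool, and the `run_actions` helper are all assumptions made for the example. Only `directly_answer` is named in this section.

```python
import json

# Hypothetical tools; names and signatures are illustrative only.
def internet_search(query: str) -> str:
    return f"(search results for {query!r})"

def directly_answer() -> str:
    # Signals that the model chose to answer without calling another tool.
    return "(answer directly)"

TOOLS = {"internet_search": internet_search, "directly_answer": directly_answer}

def run_actions(action_json: str) -> list:
    """Parse a JSON action list and call each named tool in order."""
    results = []
    for action in json.loads(action_json):
        name = action["tool_name"]             # assumed field name
        params = action.get("parameters", {})  # assumed field name
        if name not in TOOLS:
            raise ValueError(f"unknown tool: {name}")
        results.append(TOOLS[name](**params))
    return results

example = '[{"tool_name": "internet_search", "parameters": {"query": "tallest penguin"}}]'
print(run_actions(example))
```

Rejecting unknown tool names, rather than ignoring them, makes it obvious when the model emits an action that was not in the supplied toolset.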

Detailed documentation and prompt strategy guides for tool interaction are forthcoming.

Retrieval Augmented Generation (RAG) Features

Command-R has been trained with grounded generation capabilities: it can generate responses based on a list of supplied document snippets and include grounding spans (citations) in the response that indicate the source of each piece of information. This is useful for tasks such as grounded summarization and the final step of Retrieval Augmented Generation (RAG). The behavior was trained into the model through a combination of supervised fine-tuning and preference fine-tuning, using a specific prompt structure. Following this structure is recommended for optimal performance, but experimenting with variations is encouraged.

In grounded generation tasks, Command-R takes a dialogue input (optionally starting with a preamble provided by the user) and a collection of retrieved document snippets. These snippets should be concise (about 100-400 words each) and formatted as key-value pairs, where keys are concise descriptors, and values can be textual or semi-structured data.
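One way to prepare such snippets is to split a longer text into word-bounded chunks and wrap each chunk as a key-value record. The sketch below is illustrative only: the keys `title` and `text` and the helper `make_snippets` are assumptions, and the 400-word cap simply mirrors the upper end of the range suggested above.

```python
def make_snippets(title: str, text: str, max_words: int = 400) -> list:
    """Split text into chunks of at most max_words words, one dict per chunk."""
    words = text.split()
    return [
        # "title" and "text" are assumed key names, chosen for illustration.
        {"title": title, "text": " ".join(words[i:i + max_words])}
        for i in range(0, len(words), max_words)
    ]

# Placeholder text of 900 words stands in for a real retrieved document.
long_text = " ".join(f"w{i}" for i in range(900))
snippets = make_snippets("Example document", long_text)
print([len(s["text"].split()) for s in snippets])
```

The final chunk can fall below 100 words. A real pipeline might merge it into the previous chunk or split on sentence boundaries instead.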

By design, C4AI Command-R engages in grounded response generation by identifying relevant documents, selecting those to cite, crafting a response based on this information, and incorporating grounding spans to cite sources directly within the response. This method is known as accurate grounded generation.

Additionally, C4AI Command-R supports various response modes, including a fast citation mode integrated within the tokenizer. This mode allows for direct generation of answers with grounding spans, bypassing the step of drafting a full answer first, thus trading some grounding precision for token generation efficiency.
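Once a grounded answer has been generated, the grounding spans need to be separated from the plain text. This section does not specify the span markup, so the sketch below assumes tags of the form `<co: 0,1>cited text</co: 0,1>`, where the numbers index the supplied documents. Treat both the tag format and the `extract_citations` helper as hypothetical and adapt them to the actual output.

```python
import re

# Assumed markup: <co: 0,1>cited text</co: 0,1>; the numbers are document indices.
SPAN = re.compile(r"<co: ([\d,\s]+)>(.*?)</co: \1>", re.DOTALL)

def extract_citations(answer: str):
    """Return (plain_text, [(cited_text, [doc_ids]), ...]) for a tagged answer."""
    citations = [
        (m.group(2), [int(d) for d in m.group(1).split(",")])
        for m in SPAN.finditer(answer)
    ]
    # Strip the tags but keep the cited text in place.
    plain = SPAN.sub(lambda m: m.group(2), answer)
    return plain, citations

answer = "The <co: 0>Emperor penguin</co: 0> lives <co: 0,1>only in Antarctica</co: 0,1>."
plain, cites = extract_citations(answer)
print(plain)
print(cites)
```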

A simple example code snippet illustrates how to format prompts, generate responses, and interpret completions. Detailed documentation and instructional material on leveraging grounded generation will be available subsequently.

Download: CohereForAI/c4ai-command-r-v01
