Stable LM 2 12B Chat is a language model with 12 billion parameters, specifically tuned for interpreting instructions. It was trained using a combination of publicly accessible and synthetic datasets, and employs Direct Preference Optimization (DPO) in its development.
Stable LM 2 12B is trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch, featuring a base and instruction-tuned model.
This release includes an update to Stable LM 2 1.6B, which improves its conversational skills in all of the seven aforementioned languages and incorporates tool usage and function calling.
Model Details
Developed by Stability AI, the StableLM 2 12B Chat model is an auto-regressive language model built on the transformer decoder architecture. It operates primarily in English. For an in-depth look, refer to the “Stable LM 2 Chat Technical Report” and the “Alignment Handbook” library.
The model was finetuned from an undisclosed base model and is distributed under the StabilityAI Non-Commercial Research Community License. Commercial use requires contact with Stability AI; please direct inquiries to lm@stability.ai.
Training Dataset
The training dataset consists of a blend of publicly available large-scale datasets and proprietary safety datasets. Some of the key sources include:
SFT Datasets
- HuggingFaceH4/ultrachat_200k
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- Open-Orca/SlimOrca
- openchat/openchat_sharegpt4_dataset
- LDJnr/Capybara
- hkust-nlp/deita-10k-v0
- teknium/OpenHermes-2.5
- glaiveai/glaive-function-calling-v2
Safety Datasets
- Anthropic/hh-rlhf
- Internal Safety Dataset
Preference Datasets:
- argilla/dpo-mix-7k
These datasets ensure a robust and secure modeling framework, integrating both safety and preference tuning.
Intended Use
This model is designed for integration into chat-like applications. Developers should conduct thorough safety evaluations to assess its performance in their specific context. Additional information on safety practices and limitations can be found in the sections below.
Limitations and Bias
To mitigate potential risks, it is highly recommended that this model be used in conjunction with input and output classifiers to prevent harmful responses. The model requires robust safeguards for both inputs and outputs to ensure that the responses generated do not contain fabrications. Given the unique nature of each application, conducting your own comprehensive testing is essential to confirm the model’s efficacy. Avoid using the model if it is not suitable for your application, or in scenarios that could potentially cause harm, whether intentional or accidental.
Download
https://huggingface.co/stabilityai/stablelm-2-12b-chat/tree/main
Read related articles:

