Phi-3 on HuggingFace

According to Microsoft, Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up on benchmarks covering language, reasoning, coding, and mathematics. The models are trained on high-quality data.

The introduction of Phi-3 models provides Microsoft Azure customers with a broader range of top-tier models, enhancing their ability to develop and implement generative AI applications.

Phi-3 Family Models

The Phi-3 family comprises four models. Each is instruction-tuned and developed in accordance with Microsoft’s standards for responsible AI, safety, and security, so that it is ready to use off the shelf.

Phi-3-vision, a multimodal model with 4.2B parameters, combines language and visual processing capabilities.
Phi-3-mini is a language model with 3.8B parameters, available in context lengths of 128K and 4K.
Phi-3-small is a 7B parameter model also available in two context lengths: 128K and 8K.
Phi-3-medium offers 14B parameters and is available in context lengths of 128K and 4K.
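
As a rough guide to what these sizes mean in practice, the parameter counts above can be turned into an estimate of the memory needed just to hold the weights. The sketch below is a back-of-the-envelope calculation, not a figure published by Microsoft. The function name and the bytes-per-parameter table are illustrative assumptions, 1 GB is taken as 10^9 bytes, and activations, the KV cache, and framework overhead are ignored.

```python
# Parameter counts (in billions) as listed above.
PHI3_PARAMS_BILLIONS = {
    "Phi-3-vision": 4.2,
    "Phi-3-mini": 3.8,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
}

# Assumed storage cost per parameter for common precisions (illustrative).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(model: str, precision: str = "fp16") -> float:
    """Estimate weight-only memory in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter = gigabytes
    return PHI3_PARAMS_BILLIONS[model] * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for name in PHI3_PARAMS_BILLIONS:
        print(f"{name}: ~{weight_memory_gb(name):.1f} GB of weights in fp16")
```

Under these assumptions, Phi-3-mini needs roughly 7.6 GB for its fp16 weights and Phi-3-medium roughly 28 GB. Real memory use will be higher once inference state is included.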

Phi-3 models are already being put to use. ITC, a major Indian conglomerate, has developed a copilot to help farmers communicate in their local languages. Khan Academy uses the Azure OpenAI Service and is experimenting with Phi-3 to improve its math tutoring in a cost-effective, scalable, and flexible way.

Download Phi-3 on HuggingFace

Model Link
microsoft/Phi-3-small-128k-instruct-onnx-cuda
microsoft/Phi-3-small-8k-instruct-onnx-cuda
microsoft/Phi-3-vision-128k-instruct
microsoft/Phi-3-medium-4k-instruct
microsoft/Phi-3-medium-128k-instruct
microsoft/Phi-3-mini-128k-instruct
microsoft/Phi-3-mini-4k-instruct
microsoft/Phi-3-small-8k-instruct
microsoft/Phi-3-small-128k-instruct
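
The text-only repositories listed above follow a regular naming pattern, so one way to load them is through the Hugging Face transformers library. The sketch below is a minimal example, not an official recipe: the helper names phi3_repo_id and load_phi3 are hypothetical, the ONNX and vision repositories are deliberately left out, and some checkpoints may need extra arguments (for example trust_remote_code=True), so the individual model cards remain the authority.

```python
# Context lengths offered per text-only variant, taken from the list above.
VARIANT_CONTEXTS = {
    "mini": ("4k", "128k"),
    "small": ("8k", "128k"),
    "medium": ("4k", "128k"),
}


def phi3_repo_id(variant: str, context: str) -> str:
    """Build the Hub repository ID for a text-only Phi-3 instruct model."""
    if variant not in VARIANT_CONTEXTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if context not in VARIANT_CONTEXTS[variant]:
        raise ValueError(f"{variant!r} is not listed with context {context!r}")
    return f"microsoft/Phi-3-{variant}-{context}-instruct"


def load_phi3(variant: str = "mini", context: str = "4k"):
    """Download and load tokenizer and model (several GB of weights)."""
    # Imported lazily so phi3_repo_id works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = phi3_repo_id(variant, context)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto")
    return tokenizer, model


# Example (not run here because it triggers a large download):
# tokenizer, model = load_phi3("mini", "4k")
```

Validating the variant and context length up front gives an immediate error for combinations that are not in the list, such as a 4K Phi-3-small, rather than a failed download later.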

Usage Advice

Architecturally, Phi-3 closely resembles Llama. The main difference is its use of Phi3SuScaledRotaryEmbedding and Phi3YarnScaledRotaryEmbedding, which extend the context length of the rotary embeddings. In addition, the query, key, and value projections are fused into a single layer, as are the MLP’s up and gate projection layers. The tokenizer is identical to LlamaTokenizer apart from some additional tokens.
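
To make the idea of fused projections concrete, the sketch below shows how the output of a single fused layer can be sliced back into its parts. It is a plain-Python illustration on lists, not the library's implementation. The function names are hypothetical, and the layout (query first, then key, then value; gate first, then up) and the SiLU activation are assumptions made for the example.

```python
import math


def split_fused_qkv(qkv, num_heads, num_kv_heads, head_dim):
    """Slice one fused projection output into query, key and value parts."""
    q_size = num_heads * head_dim
    kv_size = num_kv_heads * head_dim
    if len(qkv) != q_size + 2 * kv_size:
        raise ValueError("fused vector has an unexpected length")
    query = qkv[:q_size]
    key = qkv[q_size:q_size + kv_size]
    value = qkv[q_size + kv_size:]
    return query, key, value


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def gated_mlp_hidden(gate_up):
    """Split a fused gate/up output in half and apply the gate to the up half."""
    if len(gate_up) % 2 != 0:
        raise ValueError("fused gate/up vector must have even length")
    half = len(gate_up) // 2
    gate, up = gate_up[:half], gate_up[half:]
    return [u * silu(g) for g, u in zip(gate, up)]


if __name__ == "__main__":
    # Toy sizes: 2 query heads, 1 key/value head, head dimension 2.
    q, k, v = split_fused_qkv(list(range(8)), num_heads=2, num_kv_heads=1, head_dim=2)
    print(q, k, v)
```

Fusing the projections means one larger matrix multiplication replaces several smaller ones. The slicing afterwards only selects ranges of the result, so it adds very little work.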

May 22, 2024
