How to Use Phi-4 GGUF: A Quick Guide

Sebastian Petrus
4 min read · Dec 14, 2024


Microsoft’s Phi-4 is an advanced language model that has recently been made available in GGUF format, allowing for local deployment and use. This guide will walk you through the process of setting up and using Phi-4 GGUF on your own machine, enabling you to harness its capabilities for various natural language processing tasks.

If you are seeking an all-in-one AI platform that manages all your AI subscriptions in one place, Anakin AI supports:

  • Virtually any LLM, such as Claude 3.5 Sonnet, Google Gemini, GPT-4o, OpenAI o1, the Qwen models, and other open-source models
  • Uncensored Dolphin Mistral and Llama models
  • Leading AI image generation models such as FLUX, Stable Diffusion 3.5, and Recraft
  • AI video generation models such as Minimax, Runway Gen-3, and Luma AI

Phi-4: Small But Mighty

Phi-4 is the latest iteration in Microsoft’s Phi series of language models. It represents a significant advancement in AI technology, designed to handle a wide range of language tasks with improved efficiency and accuracy. The GGUF (GPT-Generated Unified Format) is a file format optimized for efficient loading and inference of large language models on consumer-grade hardware.
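To make the format concrete, the sketch below reads a GGUF file's fixed header, which begins with the magic bytes GGUF, a format version, a tensor count, and a metadata key/value count. The file path is a placeholder for whichever quantization you download.

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header: magic bytes, version, tensor count, KV count."""
    with open(path, "rb") as f:
        magic = f.read(4)                             # b"GGUF" for valid files
        version, = struct.unpack("<I", f.read(4))     # format version (uint32)
        n_tensors, = struct.unpack("<Q", f.read(8))   # number of tensors (uint64)
        n_kv, = struct.unpack("<Q", f.read(8))        # number of metadata pairs (uint64)
    return magic, version, n_tensors, n_kv

# Example (path is a placeholder for your downloaded file):
# print(read_gguf_header("models/phi-4-q8_0.gguf"))
```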

Microsoft Phi-4 Benchmarks

Key Features of Phi-4:

  • Advanced natural language understanding
  • Improved context retention
  • Enhanced performance on various NLP tasks

Benefits of GGUF Format:

  • Reduced memory footprint
  • Faster loading times
  • Optimized for consumer hardware

Comparing Phi-4 performance with other popular models on AMC 10/12 tests

Download Phi-4 GGUF

To begin using Phi-4 GGUF, you first need to download the model files. As of now, an unofficial release is available through a community member’s Hugging Face repository.

Steps to Download:

  1. Visit the Hugging Face repository: https://huggingface.co/matteogeniaccio/phi-4/tree/main
  2. Choose the quantization option that suits your needs (Q8_0, Q6_K, or f16)
  3. Download the selected model file

Note: The official release from Microsoft is expected in the near future, which may offer additional features or optimizations.
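If you prefer scripted downloads, files in a Hugging Face repository can be fetched through its resolve endpoint. The helper below only assembles that URL; the example filename is an assumption, so check the repository's file list for the exact names.

```python
def hf_resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for a file in a Hugging Face repository."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Example (the filename here is an assumed placeholder):
# url = hf_resolve_url("matteogeniaccio/phi-4", "phi-4-Q8_0.gguf")
# Then download with your tool of choice, e.g. curl -L -O <url>
```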

Setting Up Your Environment

Before running Phi-4 GGUF, you need to set up your environment with the necessary tools and dependencies.

Required Software:

  • Python 3.7 or higher
  • Git (for cloning repositories)
  • A compatible inference engine (e.g., llama.cpp or Ollama)

Installation Steps:

  1. Install Python from the official website if not already installed
  2. Install Git from git-scm.com if not present on your system
  3. Choose and install an inference engine (detailed in the next sections)
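A quick sanity check for steps 1 and 2 can be done from Python itself; this sketch only verifies the interpreter version and that git is on your PATH.

```python
import shutil
import sys

def check_environment():
    """Return (python_ok, git_ok): Python >= 3.7 and git available on PATH."""
    python_ok = sys.version_info >= (3, 7)
    git_ok = shutil.which("git") is not None
    return python_ok, git_ok

# Example:
# py_ok, git_ok = check_environment()
# print(f"Python OK: {py_ok}, Git OK: {git_ok}")
```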

Using Phi-4 GGUF with llama.cpp

llama.cpp is a popular inference engine for running large language models locally. Here’s how to set it up for use with Phi-4 GGUF:

Setting Up llama.cpp:

  • Clone the llama.cpp repository:
git clone https://github.com/ggerganov/llama.cpp.git
  • Navigate to the cloned directory:
cd llama.cpp
  • Build the project:
make

Running Phi-4 with llama.cpp:

  1. Place your downloaded Phi-4 GGUF file in the models directory
  2. Run the model using the following command:
./llama-cli -m models/phi-4-q8_0.gguf -n 1024 --repeat_penalty 1.1 --temp 0.1 -p "Your prompt here"

Note: in older llama.cpp builds the binary is named main, so use ./main there instead.

Adjust the parameters as needed for your specific use case: -n caps the number of tokens to generate, --temp controls sampling randomness, and --repeat_penalty discourages repetition.
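If you are scripting runs, the invocation above can be assembled and launched from Python. The binary name below is a parameter because it varies by llama.cpp build.

```python
import subprocess

def llama_cpp_cmd(model_path, prompt, binary="./main",
                  n_predict=1024, temp=0.1, repeat_penalty=1.1):
    """Build the llama.cpp CLI invocation as an argument list.
    The binary is named `main` in older builds and `llama-cli` in newer ones."""
    return [binary, "-m", model_path, "-n", str(n_predict),
            "--repeat_penalty", str(repeat_penalty),
            "--temp", str(temp), "-p", prompt]

# To actually run it (assumes the binary and model file exist locally):
# subprocess.run(llama_cpp_cmd("models/phi-4-q8_0.gguf", "Hello"), check=True)
```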

For more details, see the related pull request in the llama.cpp repository.

Deploying Phi-4 GGUF with Ollama

Ollama is another excellent tool for running language models locally, offering a more user-friendly interface.

Installing Ollama:

  1. Visit https://ollama.ai/ and download the appropriate version for your operating system
  2. Follow the installation instructions provided on the website

Running the Phi-4 Model in Ollama:

  1. Create a file named Modelfile that points at your downloaded GGUF file, for example:

FROM ./phi-4-q8_0.gguf

  2. Build a local model from it, then run it:

ollama create phi-4 -f Modelfile
ollama run phi-4

Alternatively, you can run a community-published build directly:

ollama run vanilj/Phi-4
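Beyond the interactive CLI, a running Ollama server also exposes a local HTTP API (port 11434 by default). A minimal sketch of a non-streaming request body for its /api/generate endpoint:

```python
import json
from urllib import request

def ollama_payload(prompt: str, model: str = "vanilj/Phi-4") -> str:
    """JSON body for Ollama's POST /api/generate endpoint (non-streaming)."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

# Against a running Ollama server (assumes the default host and port):
# req = request.Request("http://localhost:11434/api/generate",
#                       data=ollama_payload("Why is the sky blue?").encode(),
#                       headers={"Content-Type": "application/json"})
# print(json.loads(request.urlopen(req).read())["response"])
```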


Conclusion

Phi-4 GGUF represents a significant step forward in making advanced language models accessible for local deployment. By following this guide, you should now be equipped to download, set up, and use Phi-4 GGUF for various natural language processing tasks. As you explore its capabilities, remember to stay updated with the latest developments and best practices in the rapidly evolving field of AI and language models.
