Running DeepSeek R1 via Ollama on Windows with an Unsupported AMD GPU (RX 6600 XT)


Are you trying to run DeepSeek R1 using Ollama on Windows with an officially unsupported GPU, such as the Radeon RX 6600 XT? While AMD GPUs are gaining more support in AI applications, some models like the 6600 XT are not listed as officially supported (see official AMD-supported GPUs). Fortunately, there is a workaround using a community-maintained fork of Ollama. This guide will walk you through the steps to get DeepSeek R1 running on your system.

Prerequisites

Before proceeding, ensure that you:

  • Have administrative access to your Windows machine.
  • Uninstall any existing official Ollama installation.
  • Download the community fork of Ollama for AMD GPUs.

VRAM Requirements and Model Variations

The RX 6600 XT comes with 8GB of VRAM, which is sufficient to run several DeepSeek R1 variants. In my testing, both the deepseek-r1:1.5b and deepseek-r1:7b models ran successfully. For the best fit, consider trying the various distilled and quantized versions available in the Ollama model library - these compressed variants trade a little output quality for a smaller VRAM footprint and often faster generation.
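
As a rough rule of thumb (an estimate, not an exact figure), the VRAM needed for the weights is the parameter count times the bytes per weight, plus overhead for the KV cache and runtime. Assuming the default Ollama tags are roughly 4-bit quantized (about 0.5 bytes per weight):

    weights(7b)   ≈ 7B   × 0.5 bytes ≈ 3.5 GB
    weights(1.5b) ≈ 1.5B × 0.5 bytes ≈ 0.75 GB

Both leave comfortable headroom within the 6600 XT's 8GB, which matches the successful runs above.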

For a deeper understanding of VRAM requirements and quantization techniques for LLMs, check out this excellent video: How Much VRAM My LLM Model Needs?

Step 1: Download and Install the AMD-Supported Ollama Fork

The official Ollama does not support all AMD GPUs, but a modified version exists:

  1. Uninstall the official version of Ollama if you have it installed.
  2. Visit the Ollama for AMD GitHub releases page.
  3. Download the latest OllamaSetup.exe file (at the time of writing, the latest version was v0.5.4).
  4. Run the installer and follow the on-screen instructions.
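
Once the installer finishes, you can sanity-check that the fork is installed and on your PATH by querying its version from a terminal:

    ollama --version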

Step 2: Identify Your GPU Architecture

To properly configure the ROCm libraries, you need to find your GPU’s LLVM target (GPU Arches):

  1. Check the AMD GPU Arches list or the ROCm documentation (see the second tab labeled “AMD Radeon”).
  2. Find your GPU model (in this case, the Radeon RX 6600 XT) and note its LLVM target. For the 6600 XT, it is gfx1032.
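
If you are unsure which GPU Windows actually sees, this standard PowerShell one-liner (unrelated to Ollama itself) lists the installed video controllers:

    Get-CimInstance Win32_VideoController | Select-Object Name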

Step 3: Download and Install ROCm Libraries

ROCm is AMD’s open-source platform that enables GPU computing. We need pre-built ROCm libraries to optimize Ollama’s performance:

  1. Visit the pre-built ROCm libraries repository.
  2. Look for a release that supports your GPU’s LLVM target (gfx1032 for the 6600 XT).
  3. Download the latest compatible version (at the time of writing, v0.6.2.4 was used, but v0.6.1.2 is also compatible).
  4. Save the ZIP file and extract it (see the PowerShell sketch after this list if you prefer the terminal).
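
If you prefer the terminal over Explorer, PowerShell’s built-in Expand-Archive handles the extraction; the file and folder names below are placeholders, so substitute the actual name of the release you downloaded:

    Expand-Archive -Path .\rocm-gfx1032.zip -DestinationPath .\rocm-gfx1032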

Step 4: Replace Files in the Ollama Installation Directory

To enable ROCm support for your GPU:

  1. Navigate to your Ollama installation directory, typically found at:
    C:\Users\[YourUsername]\AppData\Local\Programs\Ollama\lib\ollama
  2. Backup the existing rocblas.dll file and replace it:
    • Rename rocblas.dll to rocblas.dll.backup.
    • Copy the new rocblas.dll from the extracted ROCm files into this directory.
  3. Navigate to the rocblas subdirectory within Ollama:
    C:\Users\[YourUsername]\AppData\Local\Programs\Ollama\lib\ollama\rocblas
  4. Backup the existing library folder and replace it:
    • Rename the library folder to library_backup.
    • Copy the new library folder from the extracted ROCm files into this directory (the whole sequence is mirrored in the PowerShell sketch after this list).
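
For reference, here is the same backup-and-replace sequence as a PowerShell sketch. It assumes the default install path and that the extracted ZIP contains rocblas.dll and a library folder under .\rocm-gfx1032 (the placeholder folder name from the extraction step); adjust the paths to match your system:

    # Ollama's library directory (default install location)
    $ollama = "$env:LOCALAPPDATA\Programs\Ollama\lib\ollama"

    # Back up the existing rocblas.dll, then drop in the new one
    Rename-Item "$ollama\rocblas.dll" rocblas.dll.backup
    Copy-Item .\rocm-gfx1032\rocblas.dll $ollama

    # Back up the existing library folder, then copy in the new one
    Rename-Item "$ollama\rocblas\library" library_backup
    Copy-Item .\rocm-gfx1032\library "$ollama\rocblas" -Recurse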

Step 5: Install and Run DeepSeek R1

With the setup complete, install DeepSeek R1 using Ollama:

  1. Visit the Ollama model library and open the deepseek-r1 page.
  2. Copy the run command for the variant you want (shown at the top right of the page), open a terminal, and paste it:
    ollama run deepseek-r1:1.5b
  3. This will download and install the model. Once complete, the interactive chat prompt will start.

DeepSeek R1 running in Windows Terminal

  4. To exit the chat, type:
    /bye
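
If you want to download a model without immediately starting a chat session, ollama pull does exactly that:

    ollama pull deepseek-r1:7b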

Step 6: Verify GPU Utilization

To ensure the model is running on your GPU and not your CPU:

  1. Run the following command:
    ollama ps
  2. If the setup is working, the PROCESSOR column should show 100% GPU (see the illustrative output below).
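
The output should look roughly like the following (illustrative only - the ID, size, and timing on your machine will differ):

    NAME               ID              SIZE      PROCESSOR    UNTIL
    deepseek-r1:1.5b   a42b25d8c10a    1.1 GB    100% GPU     4 minutes from now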

Additional useful commands:

  • List installed models:
    ollama list
  • Stop a running model:
    ollama stop deepseek-r1:1.5b
  • Run with verbose mode to see timing statistics such as token generation speed:
    ollama run deepseek-r1:1.5b --verbose

Important Update Information

If you are using a demo release, DO NOT click the “Update” button if Ollama prompts you - updating through the official channel can overwrite the patched libraries. Instead, manually download updates from the Ollama for AMD releases page.

Alternatively, you can update using the Ollama-For-AMD-Installer created by ByronLeeeee. This tool allows for a simple one-click update and automatic library replacement.

If you’re interested in running Ollama with AMD GPUs on different operating systems or want to explore additional perspectives, check out these valuable resources:

  1. Running ollama with an AMD Radeon 6600 XT by Major Hayden

  2. Running ollama on RX 6600 by Adham Omran

Conclusion

With this setup, you can run DeepSeek R1 on a Windows machine using an unsupported AMD GPU like the Radeon RX 6600 XT. Thanks to the community-driven ollama-for-amd fork and ROCm libraries, you can bypass official limitations and leverage your GPU for AI workloads.

If you encounter issues, refer to the Ollama for AMD Wiki for troubleshooting steps. Happy experimenting!
