Harnessing Docker Model Runner: A Practical Guide for Local AI Development
Looking to run AI models locally within Docker? The Docker Model Runner offers a streamlined way to manage and interact with your models directly from your Docker environment. This guide provides a practical overview of how to use the Docker Model Runner, complete with examples you can implement right away.
What is Docker Model Runner?
Docker Model Runner is a backend library that makes running and managing AI models within Docker easier. It integrates seamlessly with Docker Desktop, allowing you to build, deploy, and experiment with models locally. While still under active development, it provides a powerful set of tools for local AI development.
Why Use Docker Model Runner?
- Simplified Model Management: Streamline the process of creating, listing, and deleting models.
- Local Development: Test and iterate on your AI models in a controlled environment before deploying them.
- Docker Integration: Leverage the power of Docker for consistent and reproducible model deployments.
Getting Started with the Makefile
The project includes a Makefile that simplifies common tasks. Ensure you have a recent Docker Desktop installed (Model Runner requires version 4.40 or later). Here's how to use it:
- make build: Compiles the Go application.
- make run: Starts the application locally.
- make clean: Removes build artifacts.
- make help: Displays available commands.
To get started, run make build in your terminal.
Interacting with the Model Runner API
The Docker Model Runner exposes a REST API over a Unix socket (model-runner.sock). You can interact with the API using curl; don't forget to provide the path to the socket via the --unix-socket flag.
Example API Interactions:
These examples assume your Docker Model Runner is running and accessible via the Unix socket.
List Available Models
See what models are currently available:
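A minimal sketch of such a request, run from the directory containing the socket file. The /models path follows the project's API conventions at the time of writing; verify it against your version:

```shell
# Path to the Model Runner's Unix socket (adjust if yours lives elsewhere).
SOCKET="model-runner.sock"

# GET /models returns JSON describing each available model.
curl --unix-socket "$SOCKET" localhost/models
```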
Create a New Model
Create a model using a specified image:
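A sketch of the create request. The /models/create path and the {"from": ...} payload shape follow the project's API conventions and may differ between versions:

```shell
# Path to the Model Runner's Unix socket (adjust if yours lives elsewhere).
SOCKET="model-runner.sock"

# POST /models/create pulls the given image and registers it as a model.
curl --unix-socket "$SOCKET" localhost/models/create \
  -X POST \
  -d '{"from": "ai/smollm2"}'
```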
Here, ai/smollm2 is the name of the Docker image containing the model; remember to use a valid model image.
Get Model Information
Retrieve detailed information about a specific model:
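A sketch of the lookup, assuming the /models/{name} path used elsewhere in the project's API:

```shell
# Path to the Model Runner's Unix socket (adjust if yours lives elsewhere).
SOCKET="model-runner.sock"

# GET /models/{name} returns metadata for a single model.
curl --unix-socket "$SOCKET" localhost/models/ai/smollm2
```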
Replace "ai/smollm2" with the actual model name.
Chat with a Model
Send a request to the /chat/completions endpoint to get an AI model to respond to user input.
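A sketch of a chat request. The engine-scoped path (localhost/engines/llama.cpp/v1/chat/completions) is an assumption based on the project's OpenAI-compatible API layout; the request body follows the standard chat-completions format:

```shell
# Path to the Model Runner's Unix socket (adjust if yours lives elsewhere).
SOCKET="model-runner.sock"

# POST an OpenAI-style chat completion request to the llama.cpp engine.
curl --unix-socket "$SOCKET" \
  localhost/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/smollm2",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Say hello in one sentence."}
        ]
      }'
```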
In this example, the model is ai/smollm2 and it leverages the llama.cpp engine. Adjust the model and engine as needed. This is especially useful for experimenting with local AI model deployment.
Expected Response:
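The exact payload varies by engine and version, but an OpenAI-style response has roughly this shape (values below are illustrative, not actual output):

```json
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35
  }
}
```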
The response contains the model's generated message, usage statistics, and other metadata.
Delete a Model
Remove a model from the server:
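A sketch of the delete request, assuming the same /models/{name} path with the DELETE method:

```shell
# Path to the Model Runner's Unix socket (adjust if yours lives elsewhere).
SOCKET="model-runner.sock"

# DELETE /models/{name} removes the model from the server.
curl --unix-socket "$SOCKET" -X DELETE localhost/models/ai/smollm2
```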
Key Takeaways
Docker Model Runner lets developers work with AI models inside self-contained environments: each model runs in its own isolated space, which prevents conflicts and compatibility issues. The API endpoints cover the full life cycle of a model, from initial creation to deletion, and Docker's consistent environments promote collaboration and make results reproducible across machines and teams. By reducing the complexity of deploying and experimenting with AI models locally, Docker Model Runner frees developers to focus on building intelligent applications.