Docker for Machine Learning: A Beginner-Friendly Guide

As ML practitioners, we often struggle with messy dependencies, incompatible environments, and results that differ from one machine to the next. Docker offers an elegant solution: containerization, which packages your code and its environment into a consistent, portable unit.

In this post, we’ll walk through the basics of Docker in the ML context, with easy-to-follow examples.


What is Docker and Why Should You Care?

Docker is a tool that lets you package your code along with everything it needs to run (libraries, Python version, system dependencies) into a self-contained unit called a container.

Why it matters for ML:

  • Reproducibility: Run the same code with the same results, regardless of the machine.
  • Deployment: Move your training or inference pipelines easily to servers or cloud.
  • Experiment Isolation: Try different versions of code or packages without conflict.

Step-by-Step: Running a Simple ML Script in Docker

1. Create Your ML Script

# ml_script.py
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: y = 2x, so the fitted coefficient should be 2.0
X = np.array([[1], [2], [3]])
y = np.array([2, 4, 6])

model = LinearRegression().fit(X, y)
print("Coefficient:", model.coef_)

2. Write a Dockerfile

# Base image with Python 3.10 preinstalled
FROM python:3.10
WORKDIR /app
COPY ml_script.py ./
RUN pip install numpy scikit-learn
CMD ["python", "ml_script.py"]

3. Build the Image

docker build -t ml-demo .

4. Run the Container

docker run --rm ml-demo

Running Deep Learning Workloads with NVIDIA Docker

When training deep learning models on GPUs, a standard Docker container is not enough: it needs access to the host's GPU driver and CUDA libraries. The NVIDIA Container Toolkit (the successor to nvidia-docker) lets containers use the host GPU directly.

Make sure the NVIDIA Container Toolkit is installed. On the host, verify that the GPU driver is working with:

nvidia-smi
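
To confirm that containers can actually see the GPU, run nvidia-smi inside one. A quick check; the CUDA image tag below is just an example and may differ on your system:

docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi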

Dockerfile for GPU-based Deep Learning

# PyTorch base image with CUDA and cuDNN already installed
FROM pytorch/pytorch:2.2.0-cuda11.8-cudnn8-runtime
WORKDIR /workspace
COPY train.py .
CMD ["python", "train.py"]

Sample Training Script (train.py)

import torch

# Fail fast with a clear message if the container can't see a GPU
assert torch.cuda.is_available(), "CUDA not available - did you pass --gpus all?"

x = torch.randn(3, 3).cuda()
print("Tensor on GPU:", x)

Build and Run

docker build -t dl-gpu-demo .
docker run --rm --gpus all dl-gpu-demo
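
You can also override the image's command for a one-off check, without editing train.py (a sketch using the image built above):

docker run --rm --gpus all dl-gpu-demo python -c "import torch; print(torch.cuda.is_available())"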

Managing Data and Models with Volumes

Containers are ephemeral: anything written inside them disappears when they exit. Bind mounts map a host directory into the container, so datasets, checkpoints, and outputs survive:

docker run -v $(pwd)/output:/app/output ml-demo
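
For input data the container should never modify, mount it read-only with the :ro flag (a sketch; the data/ directory is hypothetical):

docker run --rm -v $(pwd)/data:/app/data:ro -v $(pwd)/output:/app/output ml-demo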

Speed Up Your Workflow with Docker Aliases and Functions

Once you start working with Docker regularly, you’ll notice many commands are repetitive and verbose. To simplify daily usage, it’s helpful to define custom aliases and functions.

Common Commands

  • docker ps: Lists running containers.
  • docker ps -a: Lists all containers (including stopped ones).
  • docker images: Shows all locally stored Docker images.
  • docker exec: Runs a command inside a running container.
  • docker run: Starts a new container from an image.

Add shortcuts for the ones you type most often to your ~/.bash_aliases:

alias dps='docker ps'
alias dpa='docker ps -a'
alias dimg='docker images'

Function: Run Container with Mounts and GPU Access

rdock() {
  # Usage: rdock <container_name> <image>
  docker run -it \
    --workdir="/home/$USER" \
    -v "$HOME/.bash_aliases:$HOME/.bash_aliases" \
    -v "$HOME/.bashrc:$HOME/.bashrc" \
    -v "$HOME/your_project_folder:/workspace/" \
    -v "/etc/group:/etc/group:ro" \
    -v "/etc/passwd:/etc/passwd:ro" \
    --gpus all \
    -u "$(id -u):$(id -g)" \
    --name "$1" \
    --ipc=host \
    --network host "$2"
}
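
For example, to start a named PyTorch container (the image tag is just an illustration):

rdock my-experiment pytorch/pytorch:2.2.0-cuda11.8-cudnn8-runtime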

Function: Enter Running Container as Root

rootdock() {
  # Usage: rootdock <container_name>
  docker exec -u root -it "$1" /bin/bash
}
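
Handy for one-off root tasks inside a running container, e.g. installing a system package (assuming the image is Debian/Ubuntu-based, as the PyTorch images are):

rootdock my-experiment
# now inside the container, as root:
apt-get update && apt-get install -y htop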

Reload your aliases after editing:

source ~/.bash_aliases

Safety Tips

  • Avoid running containers as root in production.
  • Use .dockerignore to exclude files like __pycache__, datasets, etc. from the build context (see the example after this list).
  • Regularly prune unused images and stopped containers with docker system prune.
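
A minimal .dockerignore, assuming a typical ML project layout (adjust the paths to your repo):

__pycache__/
*.pyc
.git/
data/
*.ckpt
*.pt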

What’s Next

  • Docker Compose for ML pipelines
  • Serving ML models with Flask or FastAPI
  • GPU acceleration in multi-container setups
  • Kubernetes for scalable training

If you are an ML researcher or engineer tired of environment issues and deployment pain, Docker is worth learning.