Step-by-Step Guide to Deploying Machine Learning Models with FastAPI and Docker

Source: MachineLearningMastery.com

You’ve trained your machine learning model, and it’s performing great on test data. But here’s the truth: a model sitting in a Jupyter notebook isn’t helping anyone. It’s only when you deploy it to production that real users can benefit from your work.

In this article we’re building a diabetes progression predictor on a sample dataset from scikit-learn. We’ll take it from raw data all the way to a containerized API that’s ready for the cloud.

By coding along with this tutorial, you’ll have:

  • A trained Random Forest model that predicts diabetes progression scores
  • A REST API, built using FastAPI, that accepts patient data and returns predictions
  • A fully containerized application ready for deployment

Let’s get started.

🔗 Link to the code on GitHub.

Setting Up Your Development Environment

Before we start coding, let’s get your dev environment ready. You’ll need:

  • Python 3.11+ (though 3.9+ works fine, too)
  • Docker installed and running
  • Basic familiarity with Python and APIs (I’ll explain the non-trivial parts)

Project Structure

Here’s how we’ll organize everything in the project directory:

diabetes-predictor/
├── app/
│   ├── __init__.py
│   └── main.py              # FastAPI application
├── models/
│   └── diabetes_model.pkl   # Trained model (we'll create this)
├── train_model.py           # Model training script
├── requirements.txt         # Python dependencies
└── Dockerfile               # Container configuration

Installing Dependencies

Let’s create a clean virtual environment:

$ python -m venv diabetes-env
$ source diabetes-env/bin/activate  # Windows: diabetes-env\Scripts\activate

Now install the required libraries:

$ pip install scikit-learn pandas fastapi uvicorn

Building a Machine Learning Model for Predicting Diabetes Progression

Let’s start by creating our machine learning model. Create train_model.py:

# train_model.py
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
import pickle
import os

We’ve chosen Random Forest because it’s robust, handles different feature scales well, and gives us feature importance insights.

Let’s load and explore our diabetes dataset:

# Load the diabetes dataset
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

print(f"Dataset shape: {X.shape}")
print(f"Features: {diabetes.feature_names}")
print(f"Target range: {y.min():.1f} to {y.max():.1f}")

The diabetes dataset is a collection of 442 patient records with 10 physiological features. The target is a quantitative measure of disease progression one year after baseline: higher numbers indicate more advanced progression.

Output:

Dataset shape: (442, 10)
Features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
Target range: 25.0 to 346.0
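
Since pandas is already among our dependencies, here’s an optional way to eyeball the dataset as a DataFrame (this snippet isn’t part of train_model.py):

# Optional exploration: load the same data as a pandas DataFrame
from sklearn.datasets import load_diabetes

df = load_diabetes(as_frame=True).frame  # 10 feature columns plus 'target'
print(df.describe())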

Now let’s prepare our data:

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(f"Training samples: {X_train.shape[0]}")
print(f"Test samples: {X_test.shape[0]}")

The 80/20 split gives us enough training data while reserving a solid test set. Using random_state=42 ensures reproducible results.

Output:

Training samples: 353
Test samples: 89

Time to train our model:

# Train Random Forest model
model = RandomForestRegressor(
    n_estimators=100,
    random_state=42,
    max_depth=10
)
model.fit(X_train, y_train)

We’ve set max_depth=10 to prevent overfitting on this relatively small dataset. With 100 trees, we get good performance without excessive computation time.
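
Since one reason we chose Random Forest is the feature-importance insight it provides, here’s a small optional check (not in the original script) you could run right after fitting:

# Optional: see which features the forest leans on most
for name, importance in sorted(
    zip(diabetes.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
):
    print(f"{name}: {importance:.3f}")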

Let’s evaluate our model:

# Make predictions and evaluate
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse:.2f}")
print(f"R² Score: {r2:.3f}")

The R² score tells us what proportion of the variance in disease progression our model explains. Anything above 0.4 is pretty good for this dataset!

Output:

Mean Squared Error: 2974.20
R² Score: 0.439
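
If you want to see exactly what R² measures, you can compute it by hand from its definition; this optional snippet (not part of the original script) should reproduce the r2_score value:

# Optional sanity check: R² = 1 - SS_residual / SS_total
import numpy as np

ss_res = np.sum((y_test - y_pred) ** 2)         # error left after our predictions
ss_tot = np.sum((y_test - y_test.mean()) ** 2)  # spread around the mean baseline
print(f"Manual R²: {1 - ss_res / ss_tot:.3f}")  # ~0.439, matching r2_score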

Finally, let’s save our trained model:

# Create models directory and save model
os.makedirs('models', exist_ok=True)

with open('models/diabetes_model.pkl', 'wb') as f:
    pickle.dump(model, f)

print("Model trained and saved successfully!")

Run this script to train your model:

$ python train_model.py

You should see output showing your model’s performance and confirmation that it’s been saved.
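
As a quick sanity check that the saved pickle loads correctly, you can reload it in a fresh Python session and predict for a synthetic patient; this is an optional snippet, not part of the tutorial scripts:

# Optional: reload the saved model and make a throwaway prediction
import pickle
import numpy as np

with open("models/diabetes_model.pkl", "rb") as f:
    loaded_model = pickle.load(f)

# The scikit-learn diabetes features are mean-centered, so an all-zero row
# represents an "average" patient.
print(loaded_model.predict(np.zeros((1, 10))))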

Creating the FastAPI Application

Now for the exciting part: turning our model into a web API.

If you haven’t already, create the app directory and an empty __init__.py file:

$ mkdir app

$ touch app/__init__.py

Now create app/main.py with our API code:

# app/main.py
from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np
import os

FastAPI uses Pydantic for request validation, meaning it automatically validates incoming data and returns clear error messages if something’s wrong.

Let’s define our input data structure:

# Define input data schema
class PatientData(BaseModel):
    age: float
    sex: float
    bmi: float
    bp: float   # blood pressure
    s1: float   # serum measurement 1
    s2: float   # serum measurement 2
    s3: float   # serum measurement 3
    s4: float   # serum measurement 4
    s5: float   # serum measurement 5
    s6: float   # serum measurement 6

    class Config:
        json_schema_extra = {
            "example": {
                "age": 0.05,
                "sex": 0.05,
                "bmi": 0.06,
                "bp": 0.02,
                "s1": 0.04,
                "s2": 0.04,
                "s3": 0.02,
                "s4": 0.01,
                "s5": 0.01,
                "s6": 0.02
            }
        }

The example values help API users understand the expected input format. Note that the diabetes dataset features are already normalized.
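
To see the validation Pydantic gives us for free, you can instantiate the schema directly; this optional snippet (assuming PatientData is defined as above) shows the kind of error FastAPI turns into an HTTP 422 response:

# Optional demo of Pydantic validation (assumes PatientData from above is in scope)
from pydantic import ValidationError

try:
    PatientData(age=0.05, sex=0.05, bmi=0.06)  # bp and s1-s6 are missing
except ValidationError as err:
    print(err)  # FastAPI returns errors like this as an HTTP 422 response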

Next, we initialize the FastAPI app and load the trained model:

# Initialize FastAPI app
app = FastAPI(
    title="Diabetes Progression Predictor",
    description="Predicts diabetes progression score from physiological features",
    version="1.0.0"
)

# Load the trained model
model_path = os.path.join("models", "diabetes_model.pkl")
with open(model_path, 'rb') as f:
    model = pickle.load(f)
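
If you’d like the API to fail with a clearer message when the model file hasn’t been generated yet, one optional addition (not in the original code) is an existence check placed just before the open() call:

# Optional guard (goes before the open() call above): fail fast with a helpful hint
if not os.path.exists(model_path):
    raise RuntimeError(
        f"Model file not found at {model_path}. Run `python train_model.py` first."
    )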

Finally, let’s create our prediction endpoint:

@app.post("/predict")
def predict_progression(patient: PatientData):
    """
    Predict diabetes progression score
    """
    # Convert input to numpy array
    features = np.array([[
        patient.age, patient.sex, patient.bmi, patient.bp,
        patient.s1, patient.s2, patient.s3, patient.s4,
        patient.s5, patient.s6
    ]])

    # Make prediction
    prediction = model.predict(features)[0]

    # Return result with additional context
    return {
        "predicted_progression_score": round(prediction, 2),
        "interpretation": get_interpretation(prediction)
    }


def get_interpretation(score):
    """Provide human-readable interpretation of the score"""
    if score < 100:
        return "Below average progression"
    elif score < 150:
        return "Average progression"
    else:
        return "Above average progression"

The interpretation function helps make our API more user-friendly by providing context for the numerical predictions.

Let’s also add a health check endpoint:

@app.get("/")
def health_check():
    return {"status": "healthy", "model": "diabetes_progression_v1"}

Testing the API Locally

Before containerizing, let’s test our API locally. Run the following command from your project’s root directory:

$ uvicorn app.main:app --reload --port 8000

Open your browser to http://localhost:8000/ to confirm the API is running, then visit the interactive docs at http://localhost:8000/docs and try making a prediction with the example data.

You can also test with curl:

$ curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "age": 0.05,
    "sex": 0.05,
    "bmi": 0.06,
    "bp": 0.02,
    "s1": -0.04,
    "s2": -0.04,
    "s3": -0.02,
    "s4": -0.01,
    "s5": 0.01,
    "s6": 0.02
  }'

This should give you the following result:

{"predicted_progression_score": 213.34, "interpretation": "Above average progression"}
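
If you prefer testing from Python instead of curl, FastAPI ships a TestClient (backed by httpx, so you may need to pip install httpx); here’s a minimal sketch, assuming a file such as test_api.py run from the project root:

# test_api.py -- a minimal sketch for pytest or plain Python
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

payload = {
    "age": 0.05, "sex": 0.05, "bmi": 0.06, "bp": 0.02,
    "s1": -0.04, "s2": -0.04, "s3": -0.02, "s4": -0.01,
    "s5": 0.01, "s6": 0.02,
}

response = client.post("/predict", json=payload)
assert response.status_code == 200
print(response.json())  # expect a score around 213 with its interpretation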

Containerizing with Docker

Now let’s package everything into a Docker container. First, create requirements.txt:

fastapi==0.115.12
uvicorn==0.34.2
scikit-learn==1.6.1
pandas==2.2.3
numpy==2.2.6

We’ve pinned specific versions to ensure consistency across environments.

Now create the Dockerfile:

# Use Python 3.11 slim image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies (if needed)
RUN apt-get update && apt-get install -y \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY app/ ./app/
COPY models/ ./models/

# Expose port
EXPOSE 8000

# Run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

The slim image keeps our container small, and --no-cache-dir prevents pip from storing cached packages, further reducing size.

Build your Docker image:

$ docker build -t diabetes-predictor .

Run the container:

$ docker run -d -p 8000:8000 diabetes-predictor

Your API is now running in a container! Test it the same way as before.
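
Because docker run -d returns immediately, the container can take a moment before it starts accepting requests; a small optional Python helper (a sketch using the requests library, not part of the tutorial) can poll the health endpoint until it responds:

# wait_for_api.py -- optional sketch: poll the health check until the container is ready
import time
import requests

for _ in range(10):
    try:
        resp = requests.get("http://localhost:8000/", timeout=2)
        if resp.status_code == 200:
            print("API is up:", resp.json())
            break
    except requests.ConnectionError:
        pass
    time.sleep(1)
else:
    raise SystemExit("API did not become ready in time")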

Publishing to Docker Hub

Now that your containerized API is working locally, let’s share it with the world through Docker Hub. This step is necessary for cloud deployment. Most cloud platforms can pull directly from Docker Hub, making deployment seamless.

Setting Up Docker Hub

First, you’ll need a Docker Hub account if you don’t have one:

  1. Go to hub.docker.com and sign up
  2. Choose a username you’re happy with. It’ll be part of your image URLs

Logging Into Docker Hub

From your terminal, log into Docker Hub:

$ docker login

You’ll be prompted for your Docker Hub username and password. Enter them carefully. This creates an authentication token that lets you push images.

Tagging Your Image

Before pushing, we need to tag our image with your Docker Hub username. Docker uses a specific naming convention:

$ docker tag diabetes-predictor your-username/diabetes-predictor:v1.0

Replace your-username with your actual Docker Hub username. The v1.0 is a version tag. It’s good practice to version your images so you can track changes and roll back if needed.

Let’s also create a latest tag, which many deployment platforms use by default:

$ docker tag diabetes-predictor your-username/diabetes-predictor:latest

Check your tagged images:

$ docker images | grep diabetes-predictor

You should see three entries: your original image and the two newly tagged versions.

Pushing to Docker Hub

Now let’s push your image to Docker Hub:

$ docker push your-username/diabetes-predictor:v1.0
$ docker push your-username/diabetes-predictor:latest

The first push might take a few minutes as Docker uploads all the layers. Subsequent pushes should be substantially faster.

You can verify everything works by pulling and running your published image:

# Stop your local container first
$ docker stop $(docker ps -q --filter ancestor=diabetes-predictor)

# Pull and run from Docker Hub
$ docker run -d -p 8000:8000 your-username/diabetes-predictor:latest

Test the API again to make sure everything still works. If it does, your model is now publicly available and ready for cloud deployment.

Wrapping Up

Congratulations! You’ve just built a complete machine learning deployment pipeline:

  • Trained a robust Random Forest model on medical data
  • Created a working REST API with FastAPI
  • Containerized the application with Docker

Your model is now ready for cloud deployment! You could deploy this to AWS ECS, Fargate, Google Cloud, or Azure.

Want to take it further? You can consider adding the following:

  • Authentication and rate limiting
  • Model monitoring and logging
  • Batch prediction endpoints (a quick sketch follows below)
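
For example, a batch endpoint can be a small extension of the existing /predict route. Here’s a hedged sketch of how it might look in app/main.py (an illustration, not code from the article):

# Hypothetical addition to app/main.py: score many patients in one request
from typing import List

@app.post("/predict/batch")
def predict_batch(patients: List[PatientData]):
    features = np.array([
        [p.age, p.sex, p.bmi, p.bp, p.s1, p.s2, p.s3, p.s4, p.s5, p.s6]
        for p in patients
    ])
    scores = model.predict(features)
    return {
        "predictions": [
            {
                "predicted_progression_score": round(float(s), 2),
                "interpretation": get_interpretation(s),
            }
            for s in scores
        ]
    }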

You now have all the basics to deploy any machine learning model to production. Happy coding!
