The most surprising thing about W&B self-hosted deployments is that they often end up being more complex to manage than cloud-based solutions, precisely because you’re responsible for everything.

Let’s see it in action. Imagine you’re training a PyTorch model and want to track your experiments.

import wandb
import torch

# Initialize a run, pointing to your self-hosted server
# Replace 'http://your-wandb-host:8080' with your actual server address
wandb.init(
    project="my-pytorch-project",
    entity="your-team-name",  # Your team/user name on the server
    mode="online",  # Ensure it's trying to connect to the server
    settings=wandb.Settings(
        base_url="http://your-wandb-host:8080",
        # If using API key authentication, uncomment and set:
        # api_key="YOUR_API_KEY"
    )
)

# Define your model and optimizer
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = torch.nn.MSELoss()

# Simulate training loop
for epoch in range(10):
    x = torch.randn(32, 10)
    y = torch.randn(32, 2)

    outputs = model(x)
    loss = criterion(outputs, y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Log metrics to your self-hosted W&B server
    wandb.log({"epoch": epoch, "loss": loss.item()})

wandb.finish()

In this snippet, wandb.init is the crucial part. The base_url parameter explicitly tells the wandb client where to find your self-hosted server. If you’re using authentication (which you absolutely should for a self-hosted deployment), you’d also provide your api_key here. The mode="online" ensures it attempts to send data to the server, rather than falling back to local logging.

A self-hosted W&B server typically consists of several core components:

  1. The W&B Service: This is the main application server, handling API requests, data processing, and serving the UI. It’s usually run as a Docker container.
  2. Database: W&B uses a MySQL database to store metadata about projects, runs, users, and artifacts.
  3. Object Storage: For storing large files like model checkpoints, W&B uses an object storage solution. This can be MinIO (a self-hosted S3-compatible object store) or an external S3-compatible service.
  4. Redis: Used for caching and potentially for background job queues.
  5. Web Server/Proxy (Optional but Recommended): Often Nginx or Traefik, to handle SSL termination, load balancing, and routing to the W&B service.
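One way to picture how these pieces fit together is a Docker Compose layout. The sketch below is purely illustrative, not an official manifest: the image tags, ports, and environment variables (`MYSQL`, `BUCKET`, the `minio` hostname, and all credentials) are assumptions you would replace with values from the W&B Server deployment documentation.

```yaml
# Hypothetical wiring of a self-hosted W&B stack -- adjust images,
# versions, and variables to match the official deployment docs.
services:
  wandb:
    image: wandb/local:latest        # main W&B service (API, data processing, UI)
    ports:
      - "8080:8080"
    environment:
      MYSQL: "mysql://wandb:secret@mysql:3306/wandb"  # metadata store
      BUCKET: "s3://wandb-artifacts"                  # object storage bucket (assumed name)
    depends_on: [mysql, minio, redis]
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_DATABASE: wandb
      MYSQL_USER: wandb
      MYSQL_PASSWORD: secret          # placeholder -- use real secrets management
      MYSQL_ROOT_PASSWORD: secret
    volumes:
      - db-data:/var/lib/mysql
  minio:
    image: minio/minio                # self-hosted S3-compatible object store
    command: server /data
    volumes:
      - object-data:/data
  redis:
    image: redis:7                    # caching / background job queues
volumes:
  db-data:
  object-data:
```

A reverse proxy such as Nginx or Traefik would sit in front of the `wandb` service to terminate SSL; it is omitted here for brevity.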

The problem this solves is data sovereignty and control. When you run W&B on-premise, all your experiment data, including model weights, logs, and system metrics, stays within your infrastructure. This is critical for organizations with strict data privacy requirements, regulatory compliance needs, or those operating in air-gapped environments.

Internally, when wandb.log is called, the client serializes the data and sends it via HTTP POST requests to the base_url you configured. The W&B service receives this, validates it, and then writes metadata to MySQL and the actual data blobs (like large files) to your configured object storage. The UI then queries both the database and object storage to present the experiment dashboard.
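Conceptually, the client-side half of that flow looks like the sketch below. This is a simplified illustration, not W&B's actual wire protocol: the `/api/runs/log` path and the payload shape are assumptions, and the real client batches, retries, and compresses records rather than issuing one request per call.

```python
import json
import time
import urllib.request

def build_log_payload(run_id, step, metrics):
    """Serialize one logical wandb.log() call into a JSON body.
    The field names here are illustrative, not W&B's real wire format."""
    record = {
        "run_id": run_id,
        "step": step,
        "timestamp": time.time(),
        "metrics": metrics,
    }
    return json.dumps(record)

def post_metrics(base_url, payload):
    """POST the serialized record to the server.
    '/api/runs/log' is a hypothetical endpoint for illustration."""
    req = urllib.request.Request(
        f"{base_url}/api/runs/log",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)  # raises URLError if the server is unreachable

payload = build_log_payload("run-abc123", step=0, metrics={"loss": 0.42})
print(json.loads(payload)["metrics"]["loss"])
```

The point of the sketch is the division of labor: the client only serializes and ships JSON over HTTP; all database and object-storage writes happen server-side.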

The exact levers you control are primarily in the settings object during wandb.init and, more importantly, in the configuration of the deployed W&B Docker containers and their associated infrastructure (database, object storage, networking). You manage the versions of W&B, the underlying database, the object storage system, and the host machines. This includes setting up backups, monitoring, and scaling.

Many people overlook the implications of object storage configuration. When you point the server at a bucket endpoint and its credentials, it’s not just about providing those values. The W&B service itself needs to be able to reach the object store. If you’re using MinIO, this means ensuring the W&B container’s network can resolve the MinIO hostname and has the correct IAM policies or access keys configured to write objects. A common mistake is to have the object store accessible from your local machine but not from the W&B service container itself, leading to silent failures when uploading large files.
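A quick way to catch that mistake is to test reachability from inside the W&B service container (e.g. via `docker exec`) rather than from your laptop. The sketch below is a generic DNS-and-TCP connectivity check, not a W&B utility, and the `minio:9000` endpoint is a placeholder for your own object store; note it does not validate credentials or bucket permissions, only network reachability.

```python
import socket
from urllib.parse import urlparse

def object_store_reachable(endpoint_url, timeout=5.0):
    """Verify DNS resolution and TCP connectivity to an object-store endpoint.
    Run this from inside the W&B service container, not your workstation."""
    parsed = urlparse(endpoint_url)
    host = parsed.hostname
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        socket.getaddrinfo(host, port)  # can this container resolve the hostname?
    except socket.gaierror:
        return False, f"DNS resolution failed for {host!r}"
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, f"{host}:{port} is reachable"
    except OSError as exc:
        return False, f"TCP connect to {host}:{port} failed: {exc}"

# 'minio' is a placeholder hostname for your object store
ok, detail = object_store_reachable("http://minio:9000")
print(ok, detail)
```

If this fails inside the container but succeeds on your machine, the problem is Docker networking or DNS, not your credentials.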

The next concept you’ll likely grapple with is setting up robust backup and disaster recovery for your self-hosted W&B instance, covering both the database and object storage.
