Weights & Biases (W&B) can automatically log a surprising amount of information from your PyTorch training runs, including system metrics, model topology, and gradient and parameter histograms, without you writing an explicit wandb.log() call for each of them.

Let’s see it in action. Imagine you have a standard PyTorch training loop like this:

import torch
import torch.nn as nn
import torch.optim as optim
import wandb

# Initialize W&B (replace with your project and entity)
wandb.init(project="pytorch-auto-log-demo", entity="your-entity")

# Dummy Model, Data, and Optimizer
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Tell W&B to track gradients and parameters via hooks
wandb.watch(model, log="all", log_freq=1)

# Dummy Data
inputs = torch.randn(32, 10)
labels = torch.randn(32, 1)

# Training Loop
num_epochs = 5
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    # W&B's hooks captured gradients during loss.backward();
    # system metrics are polled in the background automatically.
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {loss.item():.4f}")

wandb.finish()

When you run this code, W&B automatically captures:

  • Gradients: Histograms of the gradients of every parameter tensor, recorded by the wandb.watch() hooks.
  • Parameters: Histograms of the weight values themselves (when wandb.watch() is called with log="all").
  • Model Topology: A graph of your model's modules, saved after the first forward and backward pass.
  • System Metrics: CPU utilization, GPU utilization and memory (if available), RAM, disk, and network usage.

Note what is missing from this list: the training loss and the learning rate are not captured automatically. Those still require an explicit wandb.log() call (or a higher-level integration, such as PyTorch Lightning's WandbLogger, that makes the call for you).

You can see these metrics populate in your W&B dashboard as the run progresses.

The machinery behind this comes in two parts:

  1. wandb.init(): Starts the run and launches a background process that polls system metrics. It does not inspect your training loop, intercept your loss tensor, or watch your optimizer.
  2. wandb.watch(model): Attaches hooks to the model so that gradient (and optionally parameter) histograms are captured as training proceeds.

The wandb.watch() function is the key player for model-level tracking, and it must be called explicitly; wandb.init() does not call it for you. wandb.watch(model) tells W&B to track gradients (and, with log="all", parameter values) for the model. It does this by registering hooks: small functions that PyTorch executes automatically during the forward and backward passes. The forward pass lets W&B record the model's topology, while backward hooks capture gradients as they are computed.
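To see the hook mechanism itself, here is a small standalone PyTorch sketch (no W&B involved) that registers a backward hook on a parameter and captures its gradient, the same primitive that wandb.watch() builds on:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
captured = {}

# Tensor-level hook: fires when the gradient for this parameter is computed
model.weight.register_hook(
    lambda grad: captured.update(weight_grad=grad.detach().clone())
)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()
loss.backward()  # the hook fires here

print(captured["weight_grad"].shape)  # same shape as model.weight: (1, 4)
```

A library like W&B simply does this for every parameter and periodically serializes the captured tensors as histograms.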

The log_freq argument belongs to wandb.watch(), not wandb.init(), and it controls how often (in training steps) the gradient and parameter histograms are recorded; the default is every 1000 steps. For example, wandb.watch(model, log_freq=10) logs every 10 batches. Epoch-level metrics such as validation loss are still logged manually with wandb.log() at the end of each epoch.

When W&B logs parameters and gradients through wandb.watch(), it records histograms of each parameter tensor and its corresponding gradient tensor, so you see the full distribution of values rather than a single number. This is incredibly useful for debugging exploding or vanishing gradients, as well as for understanding weight drift during training. If you prefer one scalar per layer, a common choice is the L2 (Euclidean) norm of the tensor.
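Scalar norms are easy to compute and log yourself; a sketch using plain PyTorch (the metric names here are my own choice):

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 2)
loss = model(torch.randn(5, 3)).sum()
loss.backward()

# One L2 norm per parameter tensor, for weights and their gradients
norms = {}
for name, p in model.named_parameters():
    norms[f"weights/{name}_norm"] = p.detach().norm(2).item()
    if p.grad is not None:
        norms[f"grads/{name}_norm"] = p.grad.norm(2).item()

print(norms)
# These scalars could then be passed to wandb.log(norms)
```

A spike in a grads/ value across steps is the classic signature of an exploding gradient; a collapse toward zero suggests vanishing gradients.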

The system metrics are collected using libraries like psutil for CPU/RAM and pynvml for NVIDIA GPUs. W&B polls them at a regular, configurable interval in a background process and logs them to provide a holistic view of your training environment's performance. This helps identify bottlenecks, such as I/O or CPU saturation, that might not be apparent from model-centric metrics alone.
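The same primitives are available to you directly; a sketch polling one sample with psutil (GPU polling via pynvml is analogous but requires an NVIDIA driver):

```python
import psutil

cpu = psutil.cpu_percent(interval=0.1)  # % CPU used over a 0.1 s window
ram = psutil.virtual_memory().percent   # % of system RAM in use
print(f"cpu={cpu}%, ram={ram}%")
```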

A related piece of information is the number of trainable parameters in your model. When wandb.watch() is active, W&B saves the model's graph, and the total trainable-parameter count is trivial to compute yourself: iterate over model.named_parameters(), sum param.numel() for every parameter where param.requires_grad is True, and log the total. This is a crucial number for understanding model complexity and for comparing runs.
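Counting trainable parameters is a one-liner, and storing it in the run summary makes model sizes easy to compare across runs; a sketch (the summary key is my own choice):

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # (10*32 + 32) + (32*1 + 1) = 385
# With an active run: wandb.run.summary["trainable_params"] = trainable
```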

The next step is often to integrate custom logging for evaluation metrics or to control the exact frequency and type of data being logged.

Want structured learning?

Take the full Wandb course →