Parallel coordinates plots are actually a poor tool for comparing individual runs within a sweep, but they excel at revealing emergent properties of the sweep’s search strategy.
Let’s see it in action. Imagine you’ve run a hyperparameter sweep for a simple regression model, tuning learning_rate and batch_size, and tracking val_loss.
import wandb

# Connect to the W&B public API and fetch the runs from a finished sweep
api = wandb.Api()
sweep_id = "my-regression-sweep-123"  # Replace with your actual sweep ID
sweep = api.sweep(f"your-entity/your-project/{sweep_id}")

# Collect the hyperparameters and metric for each run
data = []
for run in sweep.runs:
    if run.summary:
        data.append({
            "learning_rate": run.config.get("learning_rate"),
            "batch_size": run.config.get("batch_size"),
            "val_loss": run.summary.get("val_loss"),
            "run_id": run.id,
        })
# To plot this in W&B UI, you'd navigate to your sweep's page,
# click "Compare Runs", and select "Parallel Coordinates".
# The axes would be your chosen hyperparameters and metrics.
# Example of what the plot would visually represent:
# Each line is a run.
# Vertical axes are 'learning_rate', 'batch_size', 'val_loss'.
# A line connects the specific values of learning_rate, batch_size, and val_loss for a single run.
The goal of a hyperparameter sweep is to efficiently explore a search space to find the combination of hyperparameters that yields the best performance (lowest validation loss, in this case). Parallel coordinates plots, when applied to sweeps, don’t just show you which run was best. They show you how the search space was explored and where the promising regions lie.
Here’s how the plot is constructed: each vertical axis represents a hyperparameter or a metric. A single run in your sweep is drawn as a line that traverses these axes, connecting that run’s value on each one. With dozens or hundreds of runs in a sweep, these lines can become a dense tangle.
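To make the construction concrete, here is a minimal hand-rolled sketch of the same idea with matplotlib, using made-up values in the same shape as the `data` list built above. Because learning rate, batch size, and loss live on wildly different scales, each axis is min-max normalized to [0, 1] independently before every run is drawn as one polyline across the axes:

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Hypothetical sweep results in the same shape as `data` above
data = [
    {"learning_rate": 1e-4, "batch_size": 32, "val_loss": 0.41},
    {"learning_rate": 1e-3, "batch_size": 64, "val_loss": 0.22},
    {"learning_rate": 1e-2, "batch_size": 32, "val_loss": 0.78},
]
axes_names = ["learning_rate", "batch_size", "val_loss"]

def normalize(values):
    """Min-max normalize one axis so all axes share a [0, 1] scale."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on constant axes
    return [(v - lo) / span for v in values]

columns = {name: normalize([d[name] for d in data]) for name in axes_names}

fig, ax = plt.subplots()
for i in range(len(data)):
    # One polyline per run, connecting its value on each vertical axis
    ax.plot(range(len(axes_names)), [columns[name][i] for name in axes_names])
ax.set_xticks(range(len(axes_names)))
ax.set_xticklabels(axes_names)
fig.savefig("parallel_coords.png")
plt.close(fig)
```

This is only a sketch of the mechanics; the W&B panel does the normalization, rendering, and interactivity for you.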
The real power emerges when you look beyond individual lines. Instead of focusing on a single "best" run, you observe the patterns formed by many lines. If all the low-loss runs tend to cluster in a specific region of the learning_rate and batch_size axes, the plot reveals that this region is promising. You can see how the sweep algorithm (e.g., random search, grid search, Bayesian optimization) sampled the space. Did it systematically explore? Did it converge on a particular area?
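That visual reading has a programmatic counterpart: take the best fraction of runs and compute the range each hyperparameter spans among them. A small sketch, assuming a `data` list like the one fetched above (the values and the `promising_region` helper are illustrative, not part of the W&B API):

```python
# Hypothetical sweep results, same shape as `data` above
data = [
    {"learning_rate": 1e-4, "batch_size": 32, "val_loss": 0.41},
    {"learning_rate": 3e-4, "batch_size": 64, "val_loss": 0.19},
    {"learning_rate": 1e-3, "batch_size": 64, "val_loss": 0.22},
    {"learning_rate": 1e-2, "batch_size": 128, "val_loss": 0.78},
]

def promising_region(runs, metric="val_loss", top_fraction=0.5):
    """Range spanned by each hyperparameter among the best runs."""
    best = sorted(runs, key=lambda r: r[metric])
    best = best[: max(1, int(len(best) * top_fraction))]
    hparams = [k for k in runs[0] if k != metric]
    return {h: (min(r[h] for r in best), max(r[h] for r in best)) for h in hparams}

region = promising_region(data)
# -> {'learning_rate': (0.0003, 0.001), 'batch_size': (64, 64)}
```

Here the two best runs cluster at moderate learning rates and a batch size of 64, which is exactly the kind of band you would spot as a bundle of low-loss lines in the plot.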
You control the view by selecting which metrics and hyperparameters to display as axes. More axes give a richer, but potentially more complex, view. You can also filter runs based on their performance (e.g., show only runs with val_loss below a certain threshold) to highlight the characteristics of the successful runs.
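The same threshold filter the UI offers is a one-liner over the fetched data; a sketch, again assuming the `data` list from above (threshold and values are made up):

```python
# Hypothetical fetched runs, same shape as `data` above
data = [
    {"run_id": "a1", "val_loss": 0.41},
    {"run_id": "b2", "val_loss": 0.19},
    {"run_id": "c3", "val_loss": 0.78},
    {"run_id": "d4", "val_loss": None},  # run crashed before logging
]

threshold = 0.3
# Keep only runs that logged the metric and beat the threshold
good_runs = [d for d in data if d["val_loss"] is not None and d["val_loss"] < threshold]
# -> [{'run_id': 'b2', 'val_loss': 0.19}]
```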
The counterintuitive part is that the "noise" of many lines is precisely what makes the plot useful for sweep analysis. If you’re only looking for the single best run, a simple table sorted by your metric is more direct. But to understand the behavior of your search and identify regions of good performance, the parallel coordinates plot of a sweep is invaluable. It helps you answer: "Given my best runs, what are the common traits of their hyperparameters, and where should I focus my next, more targeted, search?"
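For comparison, the "simple table" route really is a one-liner, which is why it wins when you only want the single best run. A sketch with placeholder values:

```python
# Hypothetical fetched runs, same shape as `data` above
data = [
    {"run_id": "a1", "learning_rate": 1e-4, "val_loss": 0.41},
    {"run_id": "b2", "learning_rate": 3e-4, "val_loss": 0.19},
    {"run_id": "c3", "learning_rate": 1e-2, "val_loss": 0.78},
]

# Sort ascending by val_loss; the best run is simply the first row
leaderboard = sorted(data, key=lambda d: d["val_loss"])
best = leaderboard[0]
# -> {'run_id': 'b2', 'learning_rate': 0.0003, 'val_loss': 0.19}
```

What the table cannot show is the shape of the search: which regions were sampled densely, which were ignored, and where the low-loss lines bundle together.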
Once you’ve identified promising regions using parallel coordinates, the next step is often to perform a more focused, finer-grained sweep around those areas, perhaps using a different search strategy like Bayesian optimization.
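That follow-up sweep might look like the sketch below: a W&B sweep configuration with the search space narrowed to the region the plot surfaced and the method switched to Bayesian optimization. The ranges here are placeholders; substitute whatever band your own plot revealed:

```python
# Narrowed, Bayesian follow-up sweep; ranges are illustrative placeholders
sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {
            # Log-uniform sampling within the promising band
            "distribution": "log_uniform_values",
            "min": 3e-4,
            "max": 1e-3,
        },
        "batch_size": {"values": [32, 64]},
    },
}
# Launch when ready, e.g.:
# sweep_id = wandb.sweep(sweep_config, project="your-project")
```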