Weights & Biases (W&B) can track your Azure ML experiments by logging your metrics, parameters, and model artifacts to the W&B platform.

Let’s see it in action. Imagine you’re training a simple PyTorch model on Azure ML.

import wandb
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import Environment

# Authenticate to Azure
subscription_id = "YOUR_SUBSCRIPTION_ID"
resource_group = "YOUR_RESOURCE_GROUP"
workspace_name = "YOUR_WORKSPACE_NAME"
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace_name)

# Register an Azure ML environment with the packages your script needs
wandb_env = Environment(
    name="wandb-env",
    description="Environment with W&B and training dependencies",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",  # A base image is required alongside a conda file
    conda_file="environment.yml",  # Assume environment.yml lists wandb and required packages
)
wandb_env = ml_client.environments.create_or_update(wandb_env)  # Returns the registered environment with its version

# Define your training script
# Assume train.py contains your PyTorch training loop and wandb.init()
# Example train.py snippet:
# import wandb
# import torch
#
# # project and entity are read from WANDB_PROJECT / WANDB_ENTITY
# wandb.init()
# # ... your training code ...
# wandb.log({"accuracy": 0.95, "loss": 0.05})
# wandb.finish()

from azure.ai.ml import command, Input

# Define the command job; environment_variables here are injected into
# the process that runs train.py
job = command(
    code="./src",  # Local path to your code directory (contains train.py)
    command="python train.py --data ${{inputs.training_data}} --learning_rate 0.01 --epochs 10",
    environment=f"{wandb_env.name}:{wandb_env.version}",
    environment_variables={
        "WANDB_API_KEY": "YOUR_WANDB_API_KEY",  # Prefer a secret store over a hardcoded key in production
        "WANDB_PROJECT": "azureml-wandb-demo",
        "WANDB_ENTITY": "your-wandb-entity",  # Your W&B username or team name
    },
    inputs={
        "training_data": Input(type="uri_folder", path="azureml://datastores/workspaceblobstore/paths/datasets/mnist")
    },
    compute="cpu-cluster",  # Name of an existing compute target in your workspace
    display_name="pytorch-wandb-training",
    experiment_name="azureml-wandb-integration",
)

# Submit the job
returned_job = ml_client.jobs.create_or_update(job)

# You can then stream logs or view the job in Azure ML Studio
# returned_job.stream()

This setup allows Azure ML to execute your train.py script within a managed environment where the WANDB_API_KEY, WANDB_PROJECT, and WANDB_ENTITY are pre-configured. When wandb.init() is called within train.py, it automatically picks up these environment variables and starts logging to your specified W&B project.

The core problem W&B integration solves for Azure ML users is bridging the gap between the robust infrastructure of Azure ML for compute and data management, and the specialized experiment tracking and visualization capabilities of W&B. Without this, you’d be manually downloading logs, managing artifact versions, and trying to correlate metrics across different runs, which quickly becomes unmanageable for complex projects.

Internally, Azure ML’s command job provides a mechanism to inject environment variables into the execution context of your training script. By passing WANDB_API_KEY, WANDB_PROJECT, and WANDB_ENTITY in the job’s environment_variables dictionary, you’re telling Azure ML to make these variables available to your Python script when it runs. The W&B Python SDK, when initialized with wandb.init(), checks for these standard environment variables. If found, it uses them to authenticate with the W&B service and to direct your logs to the correct project and entity, without requiring explicit configuration in your script’s code.
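That lookup can be sketched locally without contacting W&B at all. The snippet below mimics what the SDK does internally; the precedence order and the default-project fallback are simplifications, and the values are placeholders for a dry run:

```python
import os

# The same variables the Azure ML job injects into the training process;
# the values here are placeholders, not real credentials
os.environ["WANDB_API_KEY"] = "YOUR_WANDB_API_KEY"
os.environ["WANDB_PROJECT"] = "azureml-wandb-demo"
os.environ["WANDB_ENTITY"] = "your-wandb-entity"

# wandb.init() resolves its settings roughly as:
#   explicit arguments > environment variables > defaults.
# With no arguments, the environment variables determine the destination:
project = os.environ.get("WANDB_PROJECT", "uncategorized")  # SDK falls back to a default project name
entity = os.environ.get("WANDB_ENTITY")  # None means your account's default entity
print(project, entity)
```

Because the resolution happens inside the SDK, the same train.py runs unchanged on your laptop, in CI, or on an Azure ML compute target, as long as the variables are set.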

The code parameter in the command job points to the local directory containing your training scripts (train.py in this case). Azure ML uploads this directory to cloud storage, making your code accessible to the compute instance where the job runs. The environment parameter references the Azure ML Environment object you defined, ensuring your training script has access to W&B and any other necessary Python packages (specified in environment.yml).
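For reference, a minimal environment.yml that would satisfy the conda_file reference might look like the following; the channel choice and the unpinned package versions are illustrative assumptions:

```yaml
name: wandb-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - wandb
      - torch
```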

The inputs parameter allows you to specify data sources registered in Azure ML. Here, training_data is mapped to a specific path within your Azure ML datastore. Your train.py script would then need to access this data using the path provided by Azure ML.

The command string itself is the actual shell command that Azure ML executes. It invokes python train.py and passes any necessary command-line arguments, such as --learning_rate and --epochs.
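Inside train.py, those command-line arguments and the mounted input path are typically read with argparse. This is a sketch: the --learning_rate and --epochs flags mirror the command string above, while --data is an assumed flag name for receiving the training_data mount path, and the path literal below is a stand-in for what Azure ML would actually pass:

```python
import argparse

# Mirror the flags passed in the Azure ML command string
parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, default=".")  # assumed flag: mount path of the training_data input
parser.add_argument("--learning_rate", type=float, default=0.01)
parser.add_argument("--epochs", type=int, default=10)

# Simulate the invocation Azure ML would make; the data path is a placeholder
args = parser.parse_args(
    ["--data", "/mnt/azureml/datasets/mnist", "--learning_rate", "0.01", "--epochs", "10"]
)
print(args.data, args.learning_rate, args.epochs)
```

Passing vars(args) as the config argument to wandb.init() is a common pattern for recording these hyperparameters alongside the run’s metrics.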

A common point of confusion is how wandb.init() in your script interacts with the injected environment variables. You don’t need to call wandb.login() or pass project or entity to wandb.init() if you’ve set these environment variables correctly in the job configuration; the W&B SDK is designed to pick them up automatically. This separation of concerns keeps your training script cleaner and more portable.

Once the job completes, you can navigate to your W&B project page (e.g., https://wandb.ai/your-wandb-entity/azureml-wandb-demo) to see a detailed dashboard of your experiment’s metrics, hyperparameters, and any logged artifacts. Azure ML also provides its own job tracking interface, where you can monitor resource utilization and view console logs.

The next step after successfully tracking experiments is to leverage W&B’s hyperparameter optimization capabilities within your Azure ML workflows.
