The most surprising thing about integrating Weights & Biases with XGBoost and Scikit-Learn is how little code you actually need to write to get massive visibility into your model training.
Let’s see it in action. Imagine you’re training an XGBoost model. Here’s a typical training loop:
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, log_loss
import wandb
from wandb.integration.xgboost import WandbCallback

# Load data
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize W&B
wandb.init(project="xgboost-sklearn-integration", config={
    "learning_rate": 0.1,
    "max_depth": 3,
    "n_estimators": 100
})

# Convert data to DMatrix format for XGBoost
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Define parameters, pulling hyperparameters from the W&B run config
params = {
    'objective': 'binary:logistic',
    'eval_metric': 'logloss',
    'eta': wandb.config.learning_rate,
    'max_depth': wandb.config.max_depth,
    'seed': 42
}

# Train the model; the W&B callback logs metrics at every boosting round
model = xgb.train(
    params,
    dtrain,
    num_boost_round=wandb.config.n_estimators,
    evals=[(dtrain, 'train'), (dtest, 'eval')],
    callbacks=[WandbCallback(log_model=True)]
)

# Log final metrics (optional, since the callback logs per-round metrics)
predictions = model.predict(dtest)  # probabilities for binary:logistic
accuracy = accuracy_score(y_test, predictions > 0.5)
logloss = log_loss(y_test, predictions)
wandb.log({"final_accuracy": accuracy, "final_logloss": logloss})

wandb.finish()
When you run this, WandbCallback (from wandb.integration.xgboost) does the heavy lifting. It automatically logs:
- Hyperparameters: all parameters passed to xgb.train or defined in params.
- Metrics: the eval_metric you specified (e.g., logloss) for each dataset in evals, at every boosting round.
- Model artifacts: the trained XGBoost model itself, when model logging is enabled on the callback.
The wandb.config object allows you to easily pull hyperparameters from your W&B run configuration, making sweeps and hyperparameter optimization seamless. You can even define your params dictionary directly using wandb.config values, as shown with eta and max_depth.
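To make the sweep connection concrete, here is a hypothetical sweep configuration for the hyperparameters above. The parameter ranges are illustrative, and the metric name eval-logloss assumes the callback's dataset-metric naming convention:

```python
# Sweep configuration: the keys under "parameters" must match the names
# the training code reads from wandb.config.
sweep_config = {
    "method": "bayes",  # or "grid" / "random"
    "metric": {"name": "eval-logloss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 0.01, "max": 0.3},
        "max_depth": {"values": [3, 5, 7]},
        "n_estimators": {"values": [50, 100, 200]},
    },
}

def train():
    import wandb  # imported here so the config above can be inspected standalone
    # Inside a sweep agent, wandb.init() receives the agent's suggested
    # values, which then appear in wandb.config for the training code.
    with wandb.init(project="xgboost-sklearn-integration"):
        pass  # the XGBoost training code from above goes here

# Launching the sweep requires a logged-in wandb session:
# sweep_id = wandb.sweep(sweep_config, project="xgboost-sklearn-integration")
# wandb.agent(sweep_id, function=train, count=20)
```

Because the training code already reads everything through wandb.config, it needs no changes to run under a sweep.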
For Scikit-Learn models, the integration is equally straightforward. After training, you can log a full suite of diagnostic plots with wandb.sklearn.plot_classifier() and save the fitted model as a versioned W&B artifact:
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, log_loss
import wandb

# Load and split data (as above)
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize W&B
run = wandb.init(project="xgboost-sklearn-integration", config={
    "solver": "liblinear",
    "C": 0.1,
    "max_iter": 1000
})

# Train a Scikit-Learn model
model_sklearn = LogisticRegression(
    solver=wandb.config.solver,
    C=wandb.config.C,
    max_iter=wandb.config.max_iter,
    random_state=42
)
model_sklearn.fit(X_train, y_train)

# Evaluate
y_pred_sklearn = model_sklearn.predict(X_test)
y_probas_sklearn = model_sklearn.predict_proba(X_test)
accuracy_sklearn = accuracy_score(y_test, y_pred_sklearn)
logloss_sklearn = log_loss(y_test, y_probas_sklearn)

# Log metrics
wandb.log({"sklearn_accuracy": accuracy_sklearn, "sklearn_logloss": logloss_sklearn})

# Log the full suite of classifier diagnostics
wandb.sklearn.plot_classifier(
    model_sklearn,
    X_train, X_test, y_train, y_test,
    y_pred_sklearn, y_probas_sklearn,
    labels=list(data.target_names),
    model_name="LogisticRegression",
    feature_names=data.feature_names
)

# Save the fitted model as a versioned W&B artifact
joblib.dump(model_sklearn, "sklearn_model.joblib")
artifact = wandb.Artifact("sklearn_model", type="model")
artifact.add_file("sklearn_model.joblib")
run.log_artifact(artifact)

run.finish()
The wandb.sklearn.plot_classifier() call generates an entire dashboard of diagnostics in a single line: learning curve, confusion matrix, ROC and precision-recall curves, feature importances, class proportions, and a calibration curve. This gives you immediate insight into how your model performs on unseen data without manually crafting logging code for every chart.
What many users miss is that logging the fitted model as a W&B Artifact gives you versioning for free. Each run that logs the same artifact name produces a new version, with lineage back to the run that created it, and therefore back to the exact hyperparameters captured in wandb.config. That traceability is invaluable when promoting models toward deployment.
The next step is exploring how to use these logged artifacts for model comparison and deployment.
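As a pointer in that direction, here is a minimal sketch of pulling a logged model back down in a later run. The default artifact reference and file name below are assumptions: check your project's Artifacts tab for the actual names your runs produced.

```python
def load_model_artifact(artifact_ref="sklearn_model:latest",
                        filename="sklearn_model.joblib"):
    """Download a model artifact logged by an earlier run and load it.

    Both default arguments are placeholders; substitute the names shown
    in your project's Artifacts tab.
    """
    import joblib
    import wandb

    # A fresh run marks this download in the artifact's lineage graph
    run = wandb.init(project="xgboost-sklearn-integration", job_type="inference")
    artifact = run.use_artifact(artifact_ref, type="model")
    artifact_dir = artifact.download()
    model = joblib.load(f"{artifact_dir}/{filename}")
    run.finish()
    return model
```

From there, comparing versions side by side in the W&B UI, or pinning a specific version for serving, is a matter of changing the :latest alias in the artifact reference.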