PDPilot: Exploring Partial Dependence Plots Through Ranking, Filtering, and Clustering

The image shows a screenshot of a web app displaying 12 multi-series line chart visualizations. On the left side, there are two sets of checkboxes for filtering the plots by feature type and shape and by feature name.
A screenshot of the One-way Plots tab in PDPilot for an XGBoost model trained on the Ames Housing dataset. The main area of the page shows a grid of ICE plots. Here, the user has highlighted a cluster of lines for the Total Basement Area feature (top row, middle column), and the lines for those instances are highlighted in green across all of the plots. Above each plot are overlaid histograms showing the feature's distribution for the entire dataset and for the highlighted instances. The row of controls at the top lets the user modify the plots. In this case, the user is viewing the first page of plots, sorted by importance, drawn as centered ICE plots that all share the same y-scale. The left sidebar lets the user filter the plots.
Abstract
Partial dependence plots (PDPs) and individual conditional expectation (ICE) plots are visualizations used for explaining the behavior of machine learning (ML) models trained on tabular datasets. They show how the values of a feature or pair of features impact a model's predictions. However, in models with a large number of features, it is impractical for an ML practitioner to analyze all possible plots. To address this, we present new techniques for ranking and filtering PDP and ICE plots and build upon existing strategies for clustering the lines in ICE plots. Together, these techniques aim to help ML practitioners efficiently explore PDP and ICE plots and identify interesting model behavior. We integrate these techniques into PDPilot, a visual analytics tool that runs in Jupyter notebooks. We use PDPilot to study how 7 ML practitioners utilize the ranking, filtering, and clustering techniques to analyze an ML model.
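As a rough illustration of what the plots described above contain, the sketch below computes ICE curves, the corresponding PDP, and centered ICE curves for one feature using only NumPy. It is not PDPilot's implementation or API: the helper `ice_curves`, the stand-in model `toy_predict`, and the synthetic data are all hypothetical, and centering at the first grid value is one common convention assumed here.

```python
import numpy as np

def ice_curves(predict, X, feature, grid):
    """Return an (n_instances, n_grid) array of ICE curves for one feature."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # set the feature to one grid value for every instance
        curves[:, j] = predict(X_mod)
    return curves

def toy_predict(X):
    # Hypothetical stand-in for a trained model: feature 0 interacts with feature 1.
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))      # synthetic tabular data (assumption)
grid = np.linspace(-1.0, 1.0, 5)  # values at which feature 0 is evaluated

ice = ice_curves(toy_predict, X, feature=0, grid=grid)  # one line per instance
pdp = ice.mean(axis=0)            # the PDP is the pointwise mean of the ICE lines
centered = ice - ice[:, [0]]      # centered ICE: every line starts at 0
```

Because of the interaction term in the toy model, the individual ICE lines have different slopes even though the PDP is a single straight line. Clustering ICE lines is meant to surface this kind of variation.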
Materials
GitHub | DOI | Demo video | User Study | Evaluations | BibTeX
Authors
Citation

Khoury Vis Lab — Northeastern University