Model Explainability: Making Sense of SHAP and LIME

Your model just denied someone a loan, flagged a tumour, or rejected a job application — and when the person asks “why?”, your answer is a 400-tree gradient-boosted ensemble shrugging in unison. That’s not a great look. As models have grown more accurate, they’ve also grown more inscrutable, and we’ve quietly traded “I understand this” for “trust me, the validation AUC is excellent.” Model explainability is the discipline of getting some of that understanding back. The two tools you’ll meet first are LIME and SHAP, and knowing when to reach for which is genuinely useful.

Why black boxes need a flashlight

There are three boringly practical reasons to crack open a model.

Trust. Stakeholders — doctors, loan officers, you at 2am — believe a model more when it can show its reasoning. “Approved because income is high and debt is low” beats “approved, vibes.”

Debugging. Models are masterful cheaters. The famous example: a network that “detected pneumonia” beautifully, until researchers noticed it was leaning on hospital-specific cues in the image, such as scanner markings, that revealed where the X-ray was taken rather than what was in the lungs. Explanations expose this kind of cheating before it ships.

Regulation. Frameworks like the EU’s GDPR and AI Act lean hard on the idea that automated decisions affecting people should be contestable and explainable. “The neural net said no” is increasingly not a legally comfortable place to stand.

Global vs. local: two different questions

Before the tools, one distinction that clears up most confusion. A global explanation answers “how does this model behave overall?” — which features matter across all predictions. A local explanation answers “why this prediction, for this person?” These are different questions, and a model can have a sensible global story while making a baffling individual call. Both LIME and SHAP specialise in the local question; SHAP also scales up to the global one by aggregating.

LIME: lie to me, but locally

LIME (Local Interpretable Model-agnostic Explanations) has a wonderfully cheeky core idea. Your model is a hopelessly wiggly surface, impossible to describe in plain words. But zoom in close enough to any single point and even a wiggly surface looks roughly flat. So LIME does this: take the prediction you care about, jiggle the inputs to create a cloud of nearby fake samples, ask the black box what it predicts for each, then fit a simple, readable model (usually linear) to that little cloud — weighting points by how close they are to the original.

The coefficients of that tiny surrogate are your explanation: “for this loan, debt-to-income pushed the decision toward ‘reject’.” It’s fast, intuitive, and treats your model as a sealed box, so it works on anything. The catch: those explanations can be unstable, for two reasons. The cloud of fake samples is random, so running LIME twice can give noticeably different stories. And “how close is close?” (the kernel width that sets the weights) is a knob you set somewhat arbitrarily, so changing it changes the explanation too.
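To make the recipe concrete, here is a minimal from-scratch sketch of the LIME idea using only NumPy. Everything in it is an assumption for illustration: `black_box` is a made-up nonlinear model, `lime_sketch` is a hypothetical helper, and the Gaussian jiggle and exponential kernel are simple stand-ins. The real `lime` library does more (for tabular data it discretises features, for example), so treat this as the shape of the algorithm rather than its implementation.

```python
import numpy as np

def black_box(X):
    """Hypothetical nonlinear model: returns a score in (0, 1) per row."""
    z = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 - 0.5 * X[:, 0] * X[:, 1]
    return 1.0 / (1.0 + np.exp(-z))

def lime_sketch(predict, x, n_samples=5000, scale=0.3, kernel_width=0.5, seed=0):
    """Fit a weighted linear surrogate around x; return one coefficient per feature."""
    rng = np.random.default_rng(seed)
    # 1. Jiggle the inputs: Gaussian noise around the point of interest.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Ask the black box what it predicts for each fake sample.
    y = predict(Z)
    # 3. Weight samples by proximity (exponential kernel on Euclidean distance).
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: scale rows by sqrt(w), then solve ordinary least squares.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])  # intercept + centred features
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y[:, None] * sw, rcond=None)
    return coef.ravel()[1:]  # drop the intercept, keep per-feature slopes

x0 = np.array([0.2, 1.0])
# Same point, different random clouds: the explanations differ slightly.
for seed in (0, 1):
    print("seed", seed, lime_sketch(black_box, x0, seed=seed))
# Same point, much wider kernel: the "how close is close?" knob at work.
print("wide kernel", lime_sketch(black_box, x0, kernel_width=5.0))
```

Comparing the three printed rows is a quick way to see both sources of instability: the seed moves the coefficients a little, and the kernel width can move them more.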

SHAP: paying every feature what it’s owed

SHAP (SHapley Additive exPlanations) borrows from cooperative game theory. Imagine the features are players on a team and the prediction is the prize money. Shapley values, from a 1953 result by Lloyd Shapley, give the only way to split that prize that satisfies a short list of fairness axioms: each feature’s payout is its average marginal contribution across every possible order in which features could “join the team.” Apply that to a prediction and you learn how many points each feature added or subtracted relative to the average prediction.
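The “average over every join order” definition is short enough to write out directly. The sketch below is a brute-force version in plain Python, assuming a made-up three-feature “prize” function (`value`, with hypothetical features `income`, `debt`, and `age`). In a real model the value of a coalition would be the expected prediction when only those features are known; here it is hard-coded so the arithmetic is easy to follow.

```python
from itertools import permutations
from math import factorial

def shapley_values(value, players):
    """Exact Shapley values: average marginal contribution over every join order."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p given who has already joined.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}

def value(coalition):
    """Hypothetical prize: income helps, debt hurts, income and age interact."""
    score = 0.0
    if "income" in coalition:
        score += 30
    if "debt" in coalition:
        score -= 20
    if "income" in coalition and "age" in coalition:
        score += 10  # interaction bonus, only when both are present
    return score

phi = shapley_values(value, ["income", "debt", "age"])
print(phi)  # {'income': 35.0, 'debt': -20.0, 'age': 5.0}
```

Note how the interaction bonus of 10 is split evenly between `income` and `age`, and how the three payouts sum to 20, the value of the full team.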

The payoff is rigour: SHAP values come with guarantees (a baseline plus the values always adds up to the prediction, and features that contribute equally get equal credit) that LIME’s ad-hoc fitting can’t promise. In Python it’s refreshingly direct:

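import shap
import xgboost

# Assumes X_train, y_train and X_test already exist (e.g. from your own train/test split)
model = xgboost.XGBClassifier().fit(X_train, y_train)

# TreeExplainer is fast and exact for tree ensembles.
# For a classifier like this one, the values are in log-odds units by default.
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)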

# Why did the model decide THIS one row the way it did?
shap.plots.waterfall(shap_values[0])

# Zoom out: which features matter across the whole test set?
shap.plots.beeswarm(shap_values)

That same shap_values object gives you a local waterfall and a global beeswarm — one computation, both questions answered.
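If you’d like to see what those two views are built from without installing anything heavier than NumPy, here is a minimal sketch. It assumes a hypothetical linear model with independent features, a special case where SHAP values have a known closed form (the weight times the feature’s distance from its mean). The data, weights, and variable names are invented for illustration, and this is not what TreeExplainer runs internally.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # hypothetical feature matrix
w = np.array([2.0, -1.0, 0.0])       # hypothetical weights; feature 2 is ignored
b = 0.5

def predict(X):
    return X @ w + b

base_value = predict(X).mean()       # the "average prediction"
phi = (X - X.mean(axis=0)) * w       # one row of attributions per prediction

# Local view (what a waterfall draws): base value + row 0's attributions
# reconstructs row 0's prediction.
print(base_value + phi[0].sum(), predict(X[:1])[0])

# Global view (what a beeswarm summarises): average attribution magnitude per feature.
global_importance = np.abs(phi).mean(axis=0)
print(global_importance)
```

The same `phi` matrix serves both purposes: read one row for a local explanation, or aggregate down the columns for a global ranking.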

When to use which (and the fine print)

A rough rule of thumb:

Reach for SHAP when you want consistency, trustworthy global feature importance, or you’re working with tree models — TreeExplainer is fast and exact. It’s the safer default for anything you’ll defend to a regulator.
Reach for LIME when you need a quick, cheap, intuitive read on a single prediction, especially for text or images where its “highlight the influential words/pixels” output is delightfully concrete.

Now the caveats, because both share an awkward secret: correlated features break the intuition. Both methods simulate “removing” a feature by substituting other values for it, which can conjure absurd, impossible data points (a 25-year-old with 40 years of work experience), and credit gets smeared across correlated features in misleading ways. SHAP gives you some levers here: TreeExplainer lets you choose between interventional and tree-path-dependent perturbation, and its partition-based masking can group correlated features together. These change which trade-off you make rather than removing the problem, so it’s not magic.
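It’s easy to check how often naive “removal” invents impossible people. The sketch below assumes a made-up applicant table where experience can never exceed age minus 16, then replaces experience with values drawn from other rows regardless of age, which is roughly what independent sampling does. The column names, ranges, and the `impossible_fraction` helper are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
age = rng.integers(18, 66, size=n)   # hypothetical applicants aged 18 to 65
# Experience is tied to age: nobody has worked longer than (age - 16) years.
experience = np.minimum(age - 16, rng.integers(0, 45, size=n))

def impossible_fraction(age, experience):
    """Share of rows whose experience exceeds what their age allows."""
    return float(np.mean(experience > age - 16))

# Real data: every row is consistent.
print(impossible_fraction(age, experience))  # 0.0

# "Remove" experience the naive way: borrow it from random other rows, ignoring age.
shuffled = rng.permutation(experience)
print(impossible_fraction(age, shuffled))
```

The second number is the share of perturbed rows the model was never trained to handle, yet its answers on those rows still feed into the explanation.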

The other cost is literal: exact Shapley values need every subset of features, and that count doubles with each feature you add. KernelExplainer approximates them for any model by sampling, but on a big dataset you’ll be waiting. (Tree models dodge this; their structure makes SHAP cheap.)
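To get a feel for the trade-off, here is a small sketch of one way to approximate Shapley values: sample random join orders instead of enumerating all of them. To be clear, this is permutation sampling, a simpler technique than the weighted regression KernelExplainer uses, and the 12-feature “game” (`WEIGHTS`, `value`) is hypothetical, chosen so the exact answer is known in advance.

```python
import random

def sampled_shapley(value, players, n_permutations=2000, seed=0):
    """Monte Carlo Shapley estimate: average marginal contributions over random join orders."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = value(coalition)
        for p in order:
            coalition.add(p)
            current = value(coalition)
            totals[p] += current - previous  # marginal contribution of p in this order
            previous = current
    return {p: t / n_permutations for p, t in totals.items()}

# Hypothetical 12-feature game: each feature adds its own weight,
# and features 0 and 1 earn a bonus of 10 only when both are present.
WEIGHTS = [float(i) for i in range(12)]

def value(coalition):
    total = sum(WEIGHTS[i] for i in coalition)
    if 0 in coalition and 1 in coalition:
        total += 10.0
    return total

players = list(range(12))
print(2 ** len(players))  # 4096 subsets an exact method would have to consider
estimate = sampled_shapley(value, players)
# Exact answers are 5.0 and 6.0 (own weight plus half the bonus); estimates should be close.
print(round(estimate[0], 2), round(estimate[1], 2))
```

More permutations buy a tighter estimate at a linear cost, which is a much better deal than the exponential cost of exactness once the feature count grows.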

The takeaway

Treat explanations as hypotheses, not gospel. Here’s a workable default: use SHAP’s TreeExplainer for your tree-based models to get both local waterfalls and global importance from one run; pull out LIME when you need a fast, vivid single-instance explanation on text or images. Whatever you get back, sanity-check it against domain knowledge and stay suspicious when your features are correlated. An explanation you blindly trust is just a more articulate black box.