Review of Interpretability and Stability in Soft Sensors
An interesting review paper on soft sensors was published recently, focusing on two questions practitioners care about: can we trust the model, and will it keep working when conditions change?
Overview
Soft sensors are essential in industrial process monitoring for estimating difficult-to-measure quality variables. While data-driven models (especially deep learning) have improved accuracy, they often face two critical challenges: they operate as “black boxes” lacking transparency, and they suffer from instability under shifting industrial conditions.
This post summarizes the 2025 review, published in IEEE Transactions on Instrumentation and Measurement, which provides a comprehensive analysis of methodologies to enhance both the interpretability and stability of these systems.
The Core Challenges
Interpretability
- Operators and engineers need to understand the decision-making process of a model to trust it, especially in high-stakes applications.
- There is often a trade-off: simple models (e.g., linear regression) are interpretable but less accurate, while complex models (e.g., deep neural networks) are accurate but opaque.
Stability
- Industrial environments are dynamic: most data-driven models assume that training and deployment data share the same distribution, but real-world industrial data often shifts over time (data drift).
- Models based solely on correlations may fail when operating conditions change, whereas causal relationships tend to remain stable, as the toy sketch below illustrates.
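To see why correlation alone is fragile, here is a toy sketch (my own illustration, not from the paper) in which a regressor leans on a spuriously correlated variable and then degrades once the environment changes:

```python
# Toy illustration: a spurious correlate fails under distribution shift,
# while the causal relationship would have kept predicting well.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 2000

# Training regime: x_causal drives y; x_spurious merely co-varies with y.
x_causal = rng.normal(size=n)
y = 3.0 * x_causal + rng.normal(scale=0.5, size=n)
x_spurious = y + rng.normal(scale=0.1, size=n)   # tightly correlated with y here

model = LinearRegression().fit(np.column_stack([x_causal, x_spurious]), y)

# Test regime: the environment changes and the spurious link breaks.
x_causal_t = rng.normal(size=n)
y_t = 3.0 * x_causal_t + rng.normal(scale=0.5, size=n)
x_spurious_t = rng.normal(size=n)                # no longer related to y

print("R^2 in training regime:",
      model.score(np.column_stack([x_causal, x_spurious]), y))
print("R^2 after the shift:   ",
      model.score(np.column_stack([x_causal_t, x_spurious_t]), y_t))
```

Because the spurious feature almost duplicates the target during training, the fitted model weights it heavily, and its predictions collapse the moment that shortcut disappears.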
Solutions and Methodologies
Enhancing Interpretability
The paper categorizes Interpretable Machine Learning (IML) into two main dimensions:
- Intrinsic vs. Post-hoc: Intrinsic models are inherently transparent (e.g., decision trees), while post-hoc methods explain complex models after training (a minimal intrinsic example follows this list).
- Global vs. Local: Global methods explain the model’s overall behavior, while local methods explain specific individual predictions.
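As a minimal sketch of the intrinsic, global end of this taxonomy (illustrative synthetic data and feature names, not from the paper), a shallow decision tree can be printed as human-readable rules:

```python
# Minimal sketch of an intrinsic (glass-box) model: a shallow decision tree
# whose decision rules can be read off directly. Feature names are illustrative.
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = make_regression(n_samples=300, n_features=3, random_state=0)
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# The full decision logic is human-readable: a global, intrinsic explanation.
print(export_text(tree, feature_names=["temperature", "pressure", "flow"]))
```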
Prominent post-hoc methods reviewed include LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), both of which provide instance-specific feature attributions.
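Below is a hedged sketch of how SHAP might be applied to a soft-sensor-style regressor; the synthetic data and setup are my own, not the paper's:

```python
# Post-hoc, local explanation with SHAP. TreeExplainer computes exact
# Shapley values for tree ensembles.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                          # process variables
y = 2.0 * X[:, 0] - X[:, 2] + rng.normal(scale=0.1, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)

# One row of Shapley values per instance and feature: a local attribution
# showing how each input pushed that particular prediction up or down.
shap_values = explainer.shap_values(X[:10])
print(shap_values.shape)                               # (10, 4)
```

Each row of `shap_values` is a local explanation: per-feature contributions that, together with the expected model output, sum to that instance's prediction.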
Ensuring Stability via Causal ML
To address stability, the authors emphasize Causal Machine Learning. Unlike traditional correlation-based learning, causal methods identify directional cause-and-effect relationships.
- Causal Discovery: Techniques to identify causal structures from observational data, categorized into constraint-based, score-based, and causal-function-based algorithms (a toy constraint-based check appears after this list).
- Benefits: By focusing on causal features, soft sensors become robust to environmental changes and distribution shifts.
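To make the constraint-based idea concrete, here is a toy conditional-independence check (my own illustration, using partial correlation): in the chain x → z → y, x and y are correlated marginally but become independent once z is controlled for, so a constraint-based algorithm would delete the direct x-y edge.

```python
# Toy constraint-based step: decide edges via conditional-independence tests.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
z = 2.0 * x + rng.normal(size=n)
y = -1.5 * z + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after regressing out c (least squares)."""
    c1 = np.column_stack([c, np.ones_like(c)])
    ra = a - c1 @ np.linalg.lstsq(c1, a, rcond=None)[0]
    rb = b - c1 @ np.linalg.lstsq(c1, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

print("corr(x, y)    :", np.corrcoef(x, y)[0, 1])  # strong marginal correlation
print("corr(x, y | z):", partial_corr(x, y, z))    # near zero -> drop the x-y edge
```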
Open-Source Resources
The paper highlights several open-source libraries that practitioners can use to implement these techniques.
Summary of Recommended Tools
| Domain | Tool | Description | Source |
|---|---|---|---|
| Interpretability | InterpretML | Microsoft's toolkit integrating glass-box models (e.g., EBM) with post-hoc explanations such as SHAP and LIME | GitHub |
| Interpretability | Alibi | Python library for diverse explanation methods, including counterfactual and contrastive explanations | GitHub |
| Interpretability | Captum | Library for interpreting PyTorch models, featuring Integrated Gradients and DeepLIFT | GitHub |
| Interpretability | AIX360 | IBM's toolkit providing algorithms for data, model, and prediction explanations, plus fairness detection | GitHub |
| Interpretability | DALEX | Tools for visualizing and understanding complex models (available in R and Python) | GitHub |
| Interpretability | ELI5 | Simplifies debugging and explaining classifiers; compatible with scikit-learn | GitHub |
| Interpretability | Fairlearn | Assesses and mitigates fairness issues in machine learning models | GitHub |
| Causal ML | DoWhy | Microsoft's library for principled causal inference, combining causal graphs with statistical estimation | GitHub |
| Causal ML | CausalML | Uber's package for estimating heterogeneous treatment effects and uplift modeling | GitHub |
| Causal ML | EconML | Microsoft's library bridging econometrics and ML for heterogeneous treatment effects (e.g., causal forests) | GitHub |
| Causal ML | CausalNex | QuantumBlack's library combining causal discovery with probabilistic modeling and visualization | GitHub |
| Causal ML | Tigramite | Specialized library for causal discovery in time-series data | GitHub |
| Causal ML | Causal Discovery Toolbox | Suite of causal discovery algorithms (constraint-based, score-based, functional) for various data types | GitHub |
| Causal ML | CausalPy | User-friendly API for causal analysis, effect estimation, and counterfactual reasoning | GitHub |
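As a quick taste of the causal-ML side of this list, here is a minimal DoWhy sketch on synthetic data (the variable names and toy data-generating process are mine): estimate the effect of a process variable t on quality y while adjusting for a confounder w via the backdoor criterion.

```python
# Hedged DoWhy sketch on synthetic data; names are illustrative.
import numpy as np
import pandas as pd
from dowhy import CausalModel

rng = np.random.default_rng(7)
n = 3000
w = rng.normal(size=n)                        # confounder
t = 0.8 * w + rng.normal(size=n)              # "treatment" process variable
y = 1.5 * t + 2.0 * w + rng.normal(size=n)    # quality variable
df = pd.DataFrame({"w": w, "t": t, "y": y})

# Encode the assumed causal graph as a GML string, as in the DoWhy tutorials.
gml = """graph [directed 1
  node [id "w" label "w"] node [id "t" label "t"] node [id "y" label "y"]
  edge [source "w" target "t"] edge [source "w" target "y"]
  edge [source "t" target "y"]]"""

model = CausalModel(data=df, treatment="t", outcome="y", graph=gml)
estimand = model.identify_effect()            # backdoor adjustment set: {w}
estimate = model.estimate_effect(estimand,
                                 method_name="backdoor.linear_regression")
print(estimate.value)                         # should land near the true effect, 1.5
```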
Reference
L. Cao et al., “Comprehensive Analysis on Machine Learning Approaches for Interpretable and Stable Soft Sensors,” IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1–17, 2025, doi: 10.1109/TIM.2025.3556830.