• Abstract:

    Machine learning algorithms and models constitute the dominant set of predictive methods for a wide range of complex, real-world processes and domains. However, it is generally difficult to interpret what these methods infer from data, and they have limited ability to directly yield insights into the underlying relationships between the inputs and the outcome of a process. We present a methodology, based on new predictive comparisons, for identifying the relevant inputs and interpreting their conditional and two-way associations with the outcome as inferred by machine learning methods. We establish Fisher-consistent estimators of our new estimands, along with their corresponding standard errors, under a condition on the inputs' distributions. The broad scope and significance of this predictive comparison methodology are demonstrated through illustrative simulation and case studies, and through an additive manufacturing application involving different stereolithography processes. These studies use Bayesian additive regression trees, neural networks, and support vector machines.

  • Authors: Raquel de Souza Borges Ferreira and Arman Sabbaghi
  • Affiliation: Purdue University Department of Statistics
  • : Arman Sabbaghi
  • : statistics
  • : advanced/theoretical
  • Email: sabbaghi@purdue.edu
  • Phone: 7654960234
Predictive Comparisons for Screening and Interpreting Inputs in Machine Learning