## September 26th

### Session 1 (9:15-10AM)

**The Spike Interaction**

James M. Lucas, J. M. Lucas and Associates and Carmen L. Traxler, Stat-Trax

This talk will describe spike interactions, their causes and the actions to take when they are observed. A spike interaction may be observed during an experiment or during process monitoring when an observed value or group of observed values is far from the bulk of the observations. These values are not outliers because the same observed values will be obtained if the set of conditions is repeated. The spike interactions are often deleterious. When this is the case, a major goal is to understand why they occur so their future occurrence can be avoided. We also show where a spike interaction gives a more appropriate description of a designed experiment.

We give examples of spike interactions that we have observed. We describe the modelling procedure that we use for spike interactions. Our modelling procedure uses indicator variables defined for selected high-order interactions. This modelling procedure is non-hierarchical because hierarchical modelling will produce the near equality (in absolute value) of a large number of factor effects and give a much less parsimonious model.

Mechanisms that can produce spike interactions will be described.

**Constructing a Space-Filling Mixture Experiment Design to Augment Existing Points Over a Region Specified by Linear and Nonlinear Constraints**

Bryan A. Stanfill, Greg F. Piepel, Charmayne E. Lonergan, and Scott K. Cooley, Pacific Northwest National Laboratory

This presentation describes methods used for the first time to construct a space-filling design (SFD) to augment existing points over a constrained mixture experiment region specified by linear and nonlinear constraints. A mixture experiment (ME) involves (i) combining various proportions (summing to 1.0) of the components in a mixture making up an end product, and (ii) measuring the values of one or more responses for each mixture.

The experimental design problem addressed in this presentation was challenging for several reasons. First, there is no existing literature or software that directly addresses augmenting existing points using a SFD approach. The literature for augmenting experimental designs focuses on optimal design approaches. Second, until recently, approaches and software for constructing mixture experiment SFDs or optimal designs over constrained regions have been limited to constraints linear in the mixture components, and have not accommodated nonlinear constraints. Third, our experimental design problem was high-dimensional, for which many SFD criteria do not work well. Ultimately, we addressed these challenges using (i) SFD capabilities in JMP, and (ii) the JMP scripting language to access control parameters and calculations done outside of JMP.

The presentation uses a case study involving nuclear waste glass to discuss and illustrate the SFD methods used. Low-activity waste (LAW) at the Hanford site in Washington state will be immobilized in glass. The LAW glass composition region of interest for a new study phase was the same as for the previous phase, specified by lower and upper bounds on 15 components, as well as one nonlinear and several linear multiple-component constraints. The 42 LAW glasses from the previous phase included center points and points on outer and inner layers of the experimental region. The goal of the new phase was to augment the 42 glasses to better fill the experimental region, because of some complicated response-glass composition relationships. The methods discussed in the presentation were used to generate 40 different 20-glass SFDs to augment the 42 glasses. SFD metrics and design visualizations were used to assess how well each of the 40 designs covered the experimental region and to select the 20-glass SFD used.

The methods discussed in this presentation can be used for any experimental design problem involving mixture and/or nonmixture variables with an experimental region specified by linear and/or nonlinear constraints.
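The idea of augmenting existing mixture points with space-filling points over a constrained region can be illustrated with a small sketch. This is not the JMP-based procedure used by the authors: it swaps in rejection sampling on the simplex followed by a greedy maximin selection, and the three-component region, its bounds, and both multiple-component constraints in `feasible` are hypothetical values chosen only for illustration.

```python
import math
import random


def sample_mixture(n_components, rng):
    """Draw one point uniformly from the simplex (proportions sum to 1)."""
    draws = [rng.expovariate(1.0) for _ in range(n_components)]
    total = sum(draws)
    return [d / total for d in draws]


def feasible(x):
    """Hypothetical region: component bounds, one linear and one nonlinear constraint."""
    in_bounds = all(0.05 <= xi <= 0.80 for xi in x)
    linear_ok = x[0] + x[1] <= 0.85       # assumed linear constraint
    nonlinear_ok = x[0] * x[2] >= 0.01    # assumed nonlinear constraint
    return in_bounds and linear_ok and nonlinear_ok


def augment_maximin(existing, n_new, n_candidates=2000, seed=0):
    """Greedily add feasible candidates that are farthest from the current design."""
    rng = random.Random(seed)
    candidates = []
    while len(candidates) < n_candidates:
        x = sample_mixture(len(existing[0]), rng)
        if feasible(x):
            candidates.append(x)
    design = [list(p) for p in existing]
    added = []
    for _ in range(n_new):
        # Pick the candidate whose nearest design point is as far away as possible.
        best = max(candidates, key=lambda c: min(math.dist(c, p) for p in design))
        design.append(best)
        added.append(best)
        candidates.remove(best)
    return added


existing_points = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6]]
new_points = augment_maximin(existing_points, n_new=5)
```

Because each new point is chosen relative to both the existing points and the points already added, the augmentation fills gaps left by the earlier design rather than producing a stand-alone design.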

**Statistical Process Control: Myths, Misconceptions, and Applications**

Daksha Chokshi, Aerojet Rocketdyne

Statistical Process Control (SPC) is a powerful tool used to control and monitor processes. It allows us to listen to the voice of the process and to understand the natural process variability associated with our products. Several myths and misconceptions about SPC exist that have led to improper use and under-use of the tools. Companies as a result reap fewer benefits from SPC than they could get. Misapplication of the tools may also lead to incorrect conclusions that affect business decision-making.

This presentation will reveal many of the commonly held myths and misconceptions about SPC, at both the technical and infrastructure levels, and how to avoid them. Several case studies and success stories will highlight scenarios where SPC was used correctly or incorrectly, with key learning points and resulting benefits, including how it enabled opportunities for efficiency, innovation, and enhancing the standard for success in quality.

### Session 2 (10:30AM-12PM)

**Time to Say Goodbye to Statistical Significance**

Ron Wasserstein, Executive Director, ASA and Allen Schirm, Mathematica Policy Research (retired)

The speakers will discuss what they learned from editing (with Nicole Lazar) the 43 articles in the special issue of *The American Statistician* on “Statistical Inference in the 21st Century: A World Beyond P < 0.05.” After briefly reviewing the “Don’ts” set forth in the *ASA Statement on P-Values and Statistical Significance* (and adding a new one), they offer their distillation of the wisdom of the many voices in the special issue (and the broader literature): specific Do’s for teaching, doing research, and informing decisions as well as underlying principles for good statistical practice. They will talk about the response to the suggestions in the special issue, and what next steps might be. The session will include about an hour of presentation followed by about 30 minutes of discussion with the speakers, in which they will be interested in hearing audience perspectives on the important issues raised by moving to a world beyond p < 0.05. The one-hour presentation will split at the 45-minute mark, allowing people to join who are coming from other sessions or who need to leave for other sessions. But we have no idea why you wouldn’t stay for this entire presentation!

**Case Studies: Tools for System Availability and Sensor Performance with Small Samples**

Kyle Kolsti, The Perduco Group

In recent experience supporting acquisition programs, testing has often been constrained by a combination of rapid acquisition timelines, high risk tolerance, and limited resources. In this talk I will discuss two tools designed to address specific test scenarios challenged by small sample sizes. In the first case study, system availability must be inferred following a short user acceptance test without access to actionable reliability data from system development. The proposed method uses Monte Carlo simulation to provide a lower confidence limit on availability. The method’s performance is evaluated through simulation using a variety of underlying failure time and repair time populations. The second case study involves the characterization of sensor probability of detection as a function of range. It is assumed the system and the targets are essentially stationary – in other words, significant effort is required to change the range – so the response is binary. The proposed Bayesian method assumes a realistic functional form for the performance curve and uses a uniform prior to incorporate expert predictions and typical sensor behavior. This information augments the test data to permit inference of the probability of detection across all ranges, even with a small sample.

**Tuning Algorithmic Parameters for Locating Array Construction**

Erin Lanus, Virginia Tech

Consider fault testing a system for which testing all combinations of factor levels is infeasible. An assignment of factor levels to *t* factors is a *t*-way interaction. A (*d*, *t*)-locating array is a combinatorial design that produces a test suite in which every set of at most *d* *t*-way interactions appears in a unique set of tests. When faults are caused by *t*-way interactions, using such a test suite guarantees to locate all faulty interactions in a system where there are no more than *d* faults. Partitioned Search with Column Resampling (PSCR) is a computational search algorithm to verify if an array is (*d*, *t*)-locating by partitioning the search space to decrease the number of comparisons. If a candidate array is not locating, random resampling is performed until a locating array is constructed or an iteration limit is reached. Algorithmic parameters determine which factor columns to resample and when to add additional tests to the candidate array. In this talk, we employ a 5 × 5 × 3 × 2 × 2 full factorial design to analyze the effect of the algorithmic parameters on running time and rows in the resulting locating array in order to tune the parameters to prioritize speed, accuracy, or a combination of both.
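The locating property defined above can be checked by brute force on small arrays. The sketch below is not PSCR (it does no partitioning or resampling); it simply enumerates every non-empty set of at most *d* *t*-way interactions and confirms that no two sets appear in the same set of tests. The function names and the choice to skip the empty set are assumptions made for illustration.

```python
from itertools import combinations, product


def rows_covering(array, interaction):
    """Indices of the tests (rows) in which every (column, level) pair appears."""
    return frozenset(
        r for r, row in enumerate(array)
        if all(row[col] == level for col, level in interaction)
    )


def is_locating(array, levels, d, t):
    """Brute-force check that each set of 1..d t-way interactions has a unique set of tests."""
    k = len(levels)
    interactions = [
        tuple(zip(cols, vals))
        for cols in combinations(range(k), t)
        for vals in product(*(range(levels[c]) for c in cols))
    ]
    rho = {inter: rows_covering(array, inter) for inter in interactions}
    seen = set()
    for size in range(1, d + 1):
        for group in combinations(interactions, size):
            tests = frozenset().union(*(rho[i] for i in group))
            if tests in seen:  # a different set of interactions hits the same tests
                return False
            seen.add(tests)
    return True


# Full factorial on three two-level factors: every 1-way interaction is locatable.
full_factorial = [list(r) for r in product(range(2), repeat=3)]
# Two identical columns: the factors cannot be told apart.
confounded = [[0, 0], [1, 1]]
```

The enumeration grows combinatorially with the number of factors, *d*, and *t*, which is the cost that a partitioned search is meant to reduce.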

**A Hierarchical Model for Heterogeneous Reliability Field Data**

William Meeker, Iowa State University

When analyzing field data on consumer products, model-based approaches to inference require a model with sufficient flexibility to account for multiple kinds of failures. The causes of failure, while not interesting to the consumer per se, can lead to various observed lifetime distributions. Because of this, standard lifetime models, such as a single Weibull or lognormal distribution, may be inadequate. Usually cause-of-failure information will not be available to the consumer, and thus traditional competing risk analyses cannot be performed. Furthermore, when the information carried by lifetime data is limited by sample size, censoring, and truncation, estimates can be unstable and imprecise. These limitations are typical; for example, lifetime data for high-reliability products will naturally tend to be right-censored. In this article, we present a method for joint estimation of multiple lifetime distributions based on the generalized limited failure population (GLFP) model. This five-parameter model for lifetime data accommodates lifetime distributions with multiple failure modes: early failures (sometimes referred to in the literature as “infant mortality”) and failures due to wearout. We fit the GLFP model to a heterogeneous population of devices using a hierarchical modeling approach. Borrowing strength across subpopulations, our method enables estimation with uncertainty of lifetime distributions even in cases where the number of model parameters is larger than the number of observed failures. Moreover, using this Bayesian method, comparison of different product brands across the heterogeneous population is straightforward because estimation of arbitrary functionals is easy using draws from the joint posterior distribution of the model parameters. Potential applications include assessment and comparison of reliability to inform purchasing decisions. Supplementary materials for this article are available online.
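As a rough illustration of how a five-parameter model can combine early failures with wearout, the sketch below evaluates a GLFP-style distribution function. It assumes the commonly cited form F(t) = 1 − [1 − p·F₁(t)]·[1 − F₂(t)], with Weibull F₁ (early failures affecting a proportion p of units) and Weibull F₂ (wearout); the shape/scale parameterization and the example values are assumptions, and the hierarchical Bayesian fitting described in the abstract is not shown.

```python
import math


def weibull_cdf(t, shape, scale):
    """Weibull distribution function in shape/scale form."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def glfp_cdf(t, p, shape1, scale1, shape2, scale2):
    """Assumed GLFP form: a unit fails early (with proportion p) or wears out."""
    early = p * weibull_cdf(t, shape1, scale1)    # defective subpopulation
    wearout = weibull_cdf(t, shape2, scale2)      # affects every unit
    return 1.0 - (1.0 - early) * (1.0 - wearout)


# Hypothetical parameter values: 5% defective units, wearout near 10 time units.
fractions_failing = [glfp_cdf(t, 0.05, 0.8, 0.5, 4.0, 10.0) for t in (0.5, 2.0, 8.0, 15.0)]
```

Setting `p = 0` reduces the function to the wearout distribution alone, which is a quick way to see the role of the early-failure proportion.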

**Optimal Experimental Design in the Presence of Nested Factors**

Bradley Jones, SAS Institute Inc., JMP Division

A common occurrence in practical design of experiments is that one factor, called a nested factor, can only be varied for some but not all the levels of a categorical factor, called a branching factor. In this case, it is possible, but inefficient, to proceed by performing two experiments. One experiment would be run at the level(s) of the branching factor that allow for varying the second, nested, factor. The other experiment would only include the other level(s) of the branching factor. It is preferable to perform one experiment that allows for assessing the effects of both factors. Clearly, the effect of the nested factor then is conditional on the levels of the branching factor for which it can be varied. For example, consider an experiment comparing the performance of two machines where one machine has a switch that is missing for the other machine. The investigator wants to compare the two machines but also wants to understand the effect of flipping the switch. The main effect of the switch is conditional on the machine. This article describes several example situations involving branching factors and nested factors. We provide a model that is sensible for each situation, present a general method for constructing appropriate models, and show how to generate optimal designs given these models.

### Session 3 (2-3:30PM)

**Utilizing the Structure of Two-Level Designs for Design Choice and Analysis**

David J. Edwards, Virginia Commonwealth University and Robert W. Mee, The University of Tennessee

Two-level fractional factorial designs are often used in screening scenarios to identify active factors. This presentation investigates the block diagonal structure of the information matrix of certain two-level designs. We connect the block diagonal information matrix to a class of designs known as parallel flats designs and gain insights into the structure of what is estimable and/or aliased. Three parallel flat designs represent one common example (e.g. three-quarter replicates of regular designs), but we show how the block diagonal structure arises in other contexts. Recognizing this structure helps with understanding the advantages of alternative designs as well as analysis.

**OMARS Designs: Bridging the Gap between Definitive Screening Designs and Standard Response Surface Designs**

Peter Goos and José Núñez Ares, KU Leuven

Response surface designs are a core component of the response surface methodology, which is widely used in the context of product and process optimization. In this contribution, we present a new class of 3-level response surface designs, which can be viewed as matrices with entries equal to −1, 0 and +1. Because the new designs are orthogonal for the main effects and exhibit no aliasing between the main effects and the second-order effects (two-factor interactions and quadratic effects), we call them orthogonal minimally aliased response surface designs or OMARS designs. We constructed a catalog of 55,531 OMARS designs for 3 to 7 factors using integer programming techniques. Also, we characterized each design in the catalog extensively in terms of estimation and prediction efficiency, power, fourth-order correlations, and projection capabilities, and we identified interesting designs and investigated trade-offs between the different design evaluation criteria. Finally, we developed a multi-attribute decision algorithm to select designs from the catalog. Important results of our study are that we discovered some novel designs that challenge standard response surface designs and that our catalog offers much more flexibility than the standard designs currently used.
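The two defining properties stated above (orthogonal main effects, and no aliasing between main effects and second-order effects) can be checked directly on a design matrix. The sketch below is only a property check, not the authors' integer programming construction; it uses uncentered quadratic columns as a simplifying assumption, and the example is a small foldover design built from a 4 × 4 conference matrix chosen for illustration.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length columns."""
    return sum(a * b for a, b in zip(u, v))


def is_omars(design):
    """Check main-effect orthogonality and zero main/second-order cross products."""
    k = len(design[0])
    cols = [[row[j] for row in design] for j in range(k)]
    pairs = list(combinations(range(k), 2))
    # Property 1: main-effect columns are mutually orthogonal.
    if any(dot(cols[i], cols[j]) != 0 for i, j in pairs):
        return False
    # Property 2: main effects are orthogonal to quadratic and interaction columns.
    second_order = [[a * a for a in c] for c in cols]
    second_order += [[a * b for a, b in zip(cols[i], cols[j])] for i, j in pairs]
    return all(dot(m, s) == 0 for m in cols for s in second_order)


conference = [[0, 1, 1, 1], [1, 0, -1, 1], [1, 1, 0, -1], [1, -1, 1, 0]]
foldover = conference + [[-v for v in row] for row in conference] + [[0, 0, 0, 0]]
not_orthogonal = [[1, 1], [1, -1], [-1, 1], [0, 0], [1, 0]]
```

Folding over a design (appending its mirror image) cancels every odd-order cross product, which is why the second property holds automatically for the example.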

**Laplace-Beltrami Spectra as a Tool for Statistical Shape Analysis of 2-D Manifold Data**

Xueqi Zhao and Enrique del Castillo, Pennsylvania State University

Complex datasets of part measurements are increasingly common because of the wider availability of 3D range scanners and other advanced metrology in industry. Defect detection and process control based on such complex data structures are important but not fully explored. We present a new approach that uses the Laplace-Beltrami spectra to infer the intrinsic geometrical features of the surfaces of scanned manufactured parts. The isometric invariant property avoids the computationally expensive registration pre-processing of the data sets. The discriminatory power of several nonparametric multivariate statistics is tested on simulated data through permutation tests and distribution-free control charts.

**Control Charting Techniques using Parametric, non-Parametric, and Semi-Parametric Methods**

Statistical Process Control (SPC) is widely applied to monitor quality performance within industry settings to track and improve product quality. Profile monitoring is one of the methods used in SPC to understand the functional relationship between response and explanatory variables through observing/tracking this relationship and estimating parameters. Within this method there are two phases: Phase I – creating a statistical model and estimating parameters based on historical data and Phase II – applying the established control chart to a live production process. Control charts are a visual tool used in profile monitoring, and researchers have proposed a variety of charts, which we explore and implement. Much of the work in profile monitoring concerns Phase I, with more recent research focusing on quality control approaches that consider non-parametric and semi-parametric models. In this research, we complete an extensive review of articles published in SPC and explore control charting methods in Phase I analysis using parametric, non-parametric, and semi-parametric models. We then apply a range of profile monitoring techniques to various datasets featured in previously published works and examine their performance.

**Translating Images to Information: Improving Process Evaluation and Control Systems for Additive Manufacturing**

Mindy Hotchkiss, Aerojet Rocketdyne and Sean P. Donegan, Air Force Research Laboratory (AFRL/RXCM)

Various metals-based additive manufacturing (AM) systems are currently being explored for potential applications in aerospace hardware. This advanced manufacturing technology presents significant opportunities for accelerated development, agile manufacturing, and design freedom, significantly reducing the process footprint in time and space and enabling rapid prototyping. A current limitation is the lack of well-defined methodologies for qualifying AM components through non-destructive inspection (NDI), a requirement for production applications. This multi-faceted problem has the following key aspects:

- Flaw definition, identification, and detection (both post-build and in-process)
- Flaw detection method verification and validation, and
- Flaw remediation, mitigation, and prevention.

Specifics will vary by AM machine type and material; however, a robust framework based on sound statistical methods is needed that will establish standards and provide guidance to the industry community. The Metals Affordability Initiative (MAI) NG-7 program “Material State Awareness for Additive Manufacturing” (MSAAM) demonstrated the ability to bring together information from new data-heavy inspection and characterization methods, using computational and image processing tools to register (align) multiple representations against the intended design and fuse the co-located data together for purposes of analysis and the development of prediction models. Several test cases from powder bed fusion AM systems will be discussed, comparing post-build computed tomography (CT) scans of parts built under different conditions to output from in-process monitoring (IPM) sensor systems, implemented using the open-source tool suite DREAM.3D (Digital Representation Environment for Analysis of Microstructure in 3D). Challenges and lessons learned will be discussed, as well as statistical enablers used to facilitate the iterative model development and assessment processes and adaptations to ultra-high-resolution data.

**Sequential Approach to Identify Factors Influencing Flammability of Selective Laser Melted Inconel 718**

Jonathan Tylka, NASA

Many high-profile NASA failures, including Apollo 1 and Apollo 13, have been attributed to materials incompatibility with oxygen. Most metals and all polymeric materials are flammable in 100% oxygen with sufficient pressure and temperature. Oxygen is necessary for the combustion of fuels in most conventional rocket engines and for the sustainment of humans with life support systems. Therefore, NASA requires that usage of materials in oxygen system applications shall be based on flammability and combustion test data in a relevant oxygen environment to avoid materials incompatibility with oxygen leading to a catastrophic system failure. Metals produced by additive manufacturing (AM) methods, such as powder bed fusion technologies, are now mature enough to be considered for qualification in manned spaceflight oxygen systems. NASA Space Launch System, Commercial Resupply, and Commercial Crew programs are using AM components in propulsion systems, which are likely to include various life support systems in the future. Without systematic flammability and ignition testing in oxygen there is no credible method for NASA to accurately evaluate the risk of using AM metals in oxygen systems. NASA White Sands Test Facility (WSTF) develops and maintains the expertise and facilities to evaluate material flammability and ignitability in pressurized oxygen environments for the AM materials in question. Through experimentation WSTF is working to identify and investigate specific oxygen compatibility issues unique to AM materials. Historically, flammability statistics have relied on binomial pass/fail responses, resulting in low statistical power and confidence without extensive testing. Using a sequential design of experiments approach, systematic flammability testing in oxygen has been performed at WSTF for selective laser melted Inconel 718. Identification of a meaningful continuous response, sequential orthogonal experimentation, and test matrix randomization allowed for the rigorous evaluation of many factors and their effects on flammability. Factors studied include base chemistry, post processing methods, heat treatments, and surface treatments. Several additional statistical methods were employed during experimentation, including parametric life analysis and regression model selection. By leveraging modern experimental design philosophy, a useful flammability model was developed for Inconel 718. This new methodology for evaluating factors influencing flammability was more efficient (requiring fewer tests) and yielded greater statistical confidence for making actionable engineering decisions than the previous test methodology. The methodology and results will be discussed.

## September 27th

### Session 4 (8-9:30AM)

**Nonparametric Kernel Machine Model**

Inyoung Kim, Virginia Tech

Variable selection for recovering sparsity in nonadditive and nonparametric models with high dimensional variables has been challenging. This problem becomes even more difficult due to complications in modeling unknown interaction terms among high dimensional variables. There is currently no variable selection method to overcome these limitations. Hence, in this talk we propose a variable selection approach that is developed by connecting a kernel machine with the nonparametric regression model. The advantages of our approach are that it can: (1) recover the sparsity, (2) automatically model unknown and complicated interactions, (3) connect with several existing approaches including linear nonnegative garrote and multiple kernel learning, and (4) provide flexibility for both additive and nonadditive nonparametric models. Our approach can be viewed as a nonlinear version of a nonnegative garrote method. We model the smoothing function by a Least Squares Kernel Machine (LSKM) and construct the nonnegative garrote objective function as a function of the sparse scale parameters of the kernel machine to recover sparsity of input variables, whose relevance to the response is measured by the scale parameters. We also provide the asymptotic properties of our approach. We show that sparsistency is satisfied with consistent initial kernel function coefficients under certain conditions. An efficient coordinate descent/backfitting algorithm is developed. A resampling procedure for our variable selection methodology is also proposed to improve the power.

**Fog Computing for Distributed Family Learning in Cyber-Manufacturing Modeling**

Xiaoyu Chen, Virginia Tech

Cyber-manufacturing systems (CMS) interconnect manufacturing facilities via sensing and actuation networks to provide reliable computation and communication services in smart manufacturing. In CMS, various advanced data analytics have been proposed to support effective decision-making. However, most of them were formulated in a centralized manner to be executed on single workstations, or on Cloud computation units as the data size dramatically increases. Therefore, the computation or communication service may not be responsive enough to support online decision-making in CMS. In this research, a method to decompose a group of existing advanced data analytics models (i.e., family learning for CMS modeling) into their distributed variants is proposed via the alternating direction method of multipliers (ADMM). It improves the computation services in a Fog-Cloud computation network. A simulation study is conducted to validate the advantages of the proposed distributed method on a Fog-Cloud computation network over a Cloud computation system. In addition, six performance evaluation metrics are adopted from the literature to assess the performance of computation and communication. The evaluation results also indicate the relationship between Fog-Cloud architectures and computation performance, which can contribute to the efficient design of Fog-Cloud architectures in the future.
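How ADMM splits a centralized fit into local computations plus a lightweight aggregation step can be seen in a toy consensus problem. The sketch below is not the authors' family learning model: it estimates a single pooled mean from data held on separate (hypothetical) Fog nodes, where each node solves a small local problem and only scalars are exchanged with the aggregator. The function name, penalty `rho`, and data are assumptions.

```python
def consensus_admm_mean(local_data, rho=1.0, iterations=500):
    """Consensus ADMM for min over x of sum_i 0.5 * sum_j (x - a_ij)^2."""
    n = len(local_data)
    sums = [sum(d) for d in local_data]    # local sufficient statistics
    sizes = [len(d) for d in local_data]
    u = [0.0] * n                           # scaled dual variables, one per node
    z = 0.0                                 # shared (consensus) estimate
    for _ in range(iterations):
        # Local step on each node: closed-form minimizer of the local
        # least-squares loss plus the penalty (rho / 2) * (x - z + u_i)^2.
        x = [(sums[i] + rho * (z - u[i])) / (sizes[i] + rho) for i in range(n)]
        # Aggregation step: average the local estimates and duals.
        z = sum(x[i] + u[i] for i in range(n)) / n
        # Dual step: each node accumulates its disagreement with the consensus.
        u = [u[i] + x[i] - z for i in range(n)]
    return z


node_data = [[0.0], [4.0, 4.0, 4.0], [1.0, 2.0]]
estimate = consensus_admm_mean(node_data)
pooled_mean = sum(sum(d) for d in node_data) / sum(len(d) for d in node_data)
```

The raw observations never leave their nodes; only the local estimate and dual variable are communicated in each round, which is the property that makes this style of decomposition attractive for distributed computation.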

**Computer Experiments for the Optimization and Understanding of Continuous Pharmaceutical Manufacturing Processes**

John Peterson, Tim Boung Wook Lee and Greg Stockdale, GlaxoSmithKline

The understanding and output-prediction of continuous manufacturing processes can often benefit from development and assessment of a mechanistic model. Many continuous processes have several stages and substages, which are physically linked together. Each substage may have its own mechanistic model. Hence, the overall model is a link-up of two or more models. This has a tendency to make the overall process a complex function of several process input factors. As such, it may be the case that to optimally assess the response surface associated with the overall mechanistic model, special experimental designs, such as Latin hyper-square designs, are helpful. Following the validation of the overall mechanistic model with experimental data, a computer experiment employing a modified Latin hyper-square design was used to generate data from a computer model. These data were then used to build a response surface meta-model using flexible fitting methods such as thin-plate spline and artificial neural network regression. In addition to main effects and interaction plots derived from the meta-model response surface, a global sensitivity analysis was done using functional ANOVA methods.

The construction of meta-models for practical exploration of complex mechanistic models has been used in the aerospace, automotive, and medical device areas, but is just starting to be considered for the understanding and optimization of continuous pharmaceutical manufacturing processes. It might have applications to other areas where complex (molecular, pharmacological, or epidemiological) models are employed.

**Maximum Likelihood Estimation for the Poly-Weibull Distribution**

Major Jason Freels, US Air Force Institute of Technology

Modified Weibull distributions have been introduced to model data for which the hazard function is bathtub-shaped. Each modified distribution’s performance is assessed by its ability to fit a reference dataset known to produce a bathtub-shaped hazard rate function. This paper compares the performance of modified Weibull distributions in the literature to that of the generalized poly-Weibull distribution. Numerical and analytical procedures are developed for obtaining the maximum likelihood parameter estimates, standard errors, and moments for the generalized poly-Weibull distribution. The results show that the poly-Weibull distribution fits the reference dataset better than the current best-fit models.
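A short sketch can show why a poly-Weibull model produces a bathtub-shaped hazard and what the likelihood being maximized looks like. It assumes the plain poly-Weibull form, in which the hazard is a sum of Weibull hazards (the lifetime is the minimum of independent Weibull lifetimes), with complete (uncensored) data; the generalized variant, the numerical optimizer, and the standard-error calculations from the paper are not reproduced, and the parameter values are hypothetical.

```python
import math


def poly_weibull_hazard(t, components):
    """Sum of Weibull hazards; components is a list of (shape, scale) pairs."""
    return sum((b / e) * (t / e) ** (b - 1) for b, e in components)


def poly_weibull_log_likelihood(times, components):
    """Log-likelihood for uncensored failure times: log h(t) minus cumulative hazard."""
    total = 0.0
    for t in times:
        cumulative_hazard = sum((t / e) ** b for b, e in components)
        total += math.log(poly_weibull_hazard(t, components)) - cumulative_hazard
    return total


# One decreasing-hazard component (shape < 1) and one increasing (shape > 1).
bathtub = [(0.5, 100.0), (3.0, 50.0)]
hazards = [poly_weibull_hazard(t, bathtub) for t in (1.0, 20.0, 80.0)]
```

A maximum likelihood fit would pass `poly_weibull_log_likelihood` (negated) to a numerical optimizer over the shape and scale parameters.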

**A Bayesian Hierarchical Model for Quantitative and Qualitative Responses**

Lulu Kang, Illinois Institute of Technology; Xiaoning Kang, Dongbei University of Finance and Economics; Xinwei Deng and Ran Jin, Virginia Tech

In many science and engineering systems, both quantitative and qualitative output observations are collected. If they are modeled separately, the important relationship between the two types of responses is ignored. In this article, we propose a Bayesian hierarchical modeling framework to jointly model a continuous and a binary response. Compared with the existing methods, the Bayesian method overcomes two restrictions. First, it solves the problem in which the model size (specifically, the number of parameters to be estimated) exceeds the number of observations for the continuous response. We use one example to show how such a problem can easily occur if the design of the experiment is not proper; all the frequentist approaches would fail in this case. Second, the Bayesian model can provide statistical inference on the estimated parameters and predictions, whereas it is not clear how to obtain inference using the latest method proposed by Deng and Jin (2015), which jointly models the two responses via constrained likelihood. We also develop a Gibbs sampling scheme to generate accurate estimation and prediction for the Bayesian hierarchical model. Both a simulation and a real case study are presented to illustrate the proposed method.

**Spatially Weighted PCA for Monitoring Video Image Data with Application to Additive Manufacturing**

Bianca M. Colosimo and Marco Grasso, Politecnico di Milano

Machine vision systems for in-line process monitoring in advanced manufacturing applications have attracted increasing interest in recent years. One major goal is to quickly detect and localize the onset of defects during the process. This implies the use of image-based statistical process monitoring approaches to detect both when and where a defect originated within the part. This study presents a spatiotemporal method based on principal component analysis (PCA) to characterize and synthesize the information content of image streams for statistical process monitoring. A spatially weighted version of the PCA, called ST-PCA, is proposed to characterize the temporal auto-correlation of pixel intensities over sequential frames of a video sequence while including the spatial information related to the pixel location within the image. The method is applied to the detection of defects in metal additive manufacturing processes via in-situ high-speed cameras. A k-means clustering-based alarm rule is proposed to provide an identification of defects in both time and space. A comparison analysis based on simulated and real data shows that the proposed approach is faster than competitor methods in detecting the defects. A real case study in selective laser melting (SLM) of complex geometries is presented to demonstrate the performance of the approach and its practical use.

### Session 5 (10-11:30AM)

**Functional Analysis in Chemical Process**

Flor Castillo, SABIC

Most of the responses and variables in chemical processes are functional in the sense that they are curves that depend on some other variable (e.g., time). Of particular importance are the shapes of the curves, since they offer additional insight fundamental for engineering interpretation and practical implementation. A number of approaches are available for analysis, ranging from simplistic to very complex. However, the most successful applications are those that consider the sequential nature of the data and incorporate it accordingly. Applications of functional analysis in chemical processes are presented.

**Iterative Modeling with Functional Data**

Joanne Wendelberger, Los Alamos National Laboratory (LANL)

**Active Learning for Gaussian Process considering Uncertainties with Application to Shape Control of Composite Fuselage**

Xiaowei Yue, Virginia Tech; Yuchen Wen, FedEx Services; Jeffrey H. Hunt, The Boeing Company; and Jianjun Shi, Georgia Institute of Technology

In the machine learning domain, active learning is an iterative data selection algorithm for maximizing information acquisition and improving model performance with limited training samples. It is especially useful for industrial applications where training samples are expensive, time-consuming, or difficult to obtain. Existing methods mainly focus on active learning for classification, and a few methods are designed for regression, such as linear regression or Gaussian process regression. Uncertainties from measurement errors and intrinsic input noise inevitably exist in the experimental data, which further affect the modeling performance. The existing active learning methods for the Gaussian process do not incorporate these uncertainties. In this paper, we propose two new active learning algorithms for the Gaussian process with uncertainties: a variance-based weighted active learning algorithm and a D-optimal weighted active learning algorithm. Through a numerical study, we show that the proposed approach can incorporate the impact of uncertainties and achieve better prediction performance. This approach has been applied to improve the predictive modeling for automatic shape control of a composite fuselage.

**Predictive Comparisons for Screening and Interpreting Inputs in Machine Learning**

Raquel de Souza Borges Ferreira and Arman Sabbaghi, Purdue University

Machine learning algorithms and models constitute the dominant set of predictive methods for a wide range of complex, real-world processes and domains. However, interpreting what these methods effectively infer from data is difficult in general, and they possess a limited ability to directly yield insights on the underlying relationships between inputs and the outcome for a process. We present a methodology based on new predictive comparisons to identify the relevant inputs, and interpret their conditional and two-way associations with the outcome, that are inferred by machine learning methods. Fisher consistent estimators, and their corresponding standard errors, for our new estimands are established under a condition on the inputs’ distributions. The broad scope and significance of this predictive comparison methodology are demonstrated by illustrative simulation and case studies, and an additive manufacturing application involving different stereolithography processes, that utilize Bayesian additive regression trees, neural networks, and support vector machines.

**Rethinking Control Chart Design and Evaluation**

William H. Woodall and Frederick W. Faltin, Virginia Tech

We discuss some practical issues involving the control of the number of false alarms in process monitoring. This topic is of growing importance as the number of variables being monitored and the frequency of measurement increase. An alternative formulation for evaluating and comparing the performance of control charts is given based on defining in-control, indifference and out-of-control regions of the parameter space. Methods are designed so that only changes of practical importance are to be detected quickly. This generalization of the existing framework makes control charting much more useful in practice, especially when many variables are being monitored. It also justifies to a greater extent the use of cumulative sum (CUSUM) methods.
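As a rough illustration of the CUSUM methods mentioned in the abstract above, the following minimal Python sketch implements a standard one-sided (upper) tabular CUSUM. It is not the authors' proposed formulation. The function name `upper_cusum` and the parameters `target`, `k`, and `h` are assumptions chosen for illustration. The reference value `k` means that deviations above target smaller than `k` do not accumulate, which is loosely in the spirit of ignoring changes that are not of practical importance.

```python
def upper_cusum(xs, target, k, h):
    """Return the index of the first upper CUSUM signal, or None.

    xs     : sequence of observations (hypothetical data)
    target : in-control mean
    k      : reference value (allowance); smaller deviations do not accumulate
    h      : decision interval; a signal is raised when the statistic exceeds h
    """
    s = 0.0
    for t, x in enumerate(xs):
        # Accumulate deviations above target + k, resetting at zero.
        s = max(0.0, s + (x - target - k))
        if s > h:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical data: in control for two samples, then an upward shift.
    print(upper_cusum([0.0, 0.0, 3.0, 3.0, 3.0], target=0.0, k=0.5, h=4.0))
```

Choosing a larger `k` makes the chart less sensitive to small shifts, while `h` trades off false alarms against detection delay.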

**Statistical Reasoning in Diagnostic Problem Solving — The Case of Flow-Rate Measurements**

Jeroen De Mast and Stefan H. Steiner, University of Waterloo; Rick Kuijten, Water Board de Dommel; Elly Funken-Van den Bliek, Holland Innovative

There are various methods for measuring flow rates in rivers, but all of them have practical issues and challenges. A period of exceptionally high water levels revealed substantial discrepancies between two measurement setups in the same waterway. Finding a causal explanation of the discrepancies was important, as the problem might have ramifications for other flow-rate measurement setups as well. Finding the causes of problems is called diagnostic problem-solving. We applied a branch-and-prune strategy, in which we worked with a hierarchy of hypotheses, and used statistical analysis as well as domain knowledge to rule out options. We were able to narrow down the potential explanations to one main suspect and an alternative explanation. Based on the analysis, we discuss the role of statistical techniques in diagnostic problem-solving and reasoning patterns that make the application of statistics powerful. The contribution to theory in statistics is not in the individual techniques but in their application and integration in a coherent sequence of studies – a reasoning strategy.

### Session 6 (1:30-3PM)

**Practical Considerations in the Design of Experiments for Binary Data**

Martin Bezener, Stat-Ease

Binary data is very common in experimental work. In some situations, a continuous response is not possible to measure. While the analysis of binary data is a well-developed field with an abundance of tools, design of experiments (DOE) for binary data has received little attention, especially in practical aspects that are most useful to experimenters. Most of the work in our experience has been too theoretical to put into practice. Many of the well-established designs that assume a continuous response don’t work well for binary data, yet are often used for teaching and consulting purposes. In this talk, I will briefly motivate the problem with some real-life examples we’ve seen in our consulting work. I will then provide a review of the work that has been done up to this point. Then I will explain some outstanding open problems and propose some solutions. Some simulation results and a case study will conclude the talk.

**Separation in D-optimal Experimental Designs for the Logistic Regression Model**

Anson R. Park, Michelle V. Mancenido, and Douglas C. Montgomery, Arizona State University

The D-optimality criterion is often used in computer-generated experimental designs when the response of interest is binary, such as when the attribute of interest can be categorized as pass or fail. The majority of methods in the generation of D-optimal designs focus on logistic regression as the base model for relating a set of experimental factors with the binary response. Despite the advances in computational algorithms for calculating D-optimal designs for the logistic regression model, very few have acknowledged the problem of separation, a phenomenon where the responses are perfectly separable by a hyperplane in the design space. Separation causes one or more parameters of the logistic regression model to be inestimable via maximum likelihood estimation. The objective of this paper is to investigate the tendency of computer-generated, non-sequential D-optimal designs to yield separation in small-sample experimental data. Sets of local D-optimal and Bayesian D-optimal designs with different run (sample) sizes are generated for several “ground truth” logistic regression models. A Monte Carlo simulation methodology is then used to estimate the probability of separation for each design. Results of the simulation study confirm that separation occurs frequently in small-sample data and that separation is more likely to occur when the ground truth model has interaction and quadratic terms. Finally, the paper illustrates that different designs with identical run sizes created from the same model can have significantly different chances of encountering separation.
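The Monte Carlo idea in the abstract above can be sketched for the simplest possible case. The Python code below is a minimal illustration, not the authors' methodology: it assumes a logistic model with an intercept and a single factor, where complete or quasi-complete separation reduces to checking whether the factor values of the two response groups do not interleave. The function names, the example design, and the "ground truth" coefficients are all hypothetical, and the check assumes the design has at least two distinct factor levels.

```python
import math
import random


def has_separation(x, y):
    """True if a one-factor logistic fit (intercept + slope) shows
    complete or quasi-complete separation for responses y in {0, 1}."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        return True  # all responses identical: the intercept alone separates
    # Separation when one group lies entirely on one side of the other.
    return max(x0) <= min(x1) or max(x1) <= min(x0)


def separation_probability(design, beta0, beta1, n_sim=2000, seed=0):
    """Monte Carlo estimate of the probability of separation for a design
    under an assumed true model logit(p) = beta0 + beta1 * x."""
    rng = random.Random(seed)
    probs = [1.0 / (1.0 + math.exp(-(beta0 + beta1 * x))) for x in design]
    hits = 0
    for _ in range(n_sim):
        y = [1 if rng.random() < p else 0 for p in probs]
        hits += has_separation(design, y)
    return hits / n_sim


if __name__ == "__main__":
    # Hypothetical 8-run design with four runs at each of two levels.
    design = [-1.0] * 4 + [1.0] * 4
    print(separation_probability(design, beta0=0.0, beta1=1.0))
```

For models with several factors, interactions, or quadratic terms, the simple ordering check no longer applies, and separation is usually detected with a linear programming formulation instead.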

**On-site Surrogates for Large-scale Calibration**

Jiangeng Huang and Robert B. Gramacy, Virginia Tech; Mickael Binois, Argonne National Laboratory; Mirko Libraschi, Baker Hughes, a GE Company

Motivated by a challenging computer model calibration problem from the oil and gas industry, involving the design of a so-called honeycomb seal, we develop a new Bayesian calibration methodology to cope with limitations in the canonical apparatus stemming from several factors. We propose a new strategy of on-site experiment design and surrogate modeling to emulate a computer simulator acting on a high-dimensional input space that, although relatively speedy, is prone to numerical instabilities, missing data, and nonstationary dynamics. Our aim is to strike a balance between data-faithful modeling and computational tractability within an overarching calibration framework–tuning the computer model to the outcome of a limited field experiment. Situating our on-site surrogates within the canonical calibration apparatus requires updates to that framework. In particular, we describe a novel yet intuitive Bayesian setup that carefully decomposes otherwise prohibitively large matrices by exploiting the sparse blockwise structure thus obtained. We illustrate empirically that this approach outperforms the canonical, stationary analog, and we summarize calibration results on a toy problem and on our motivating honeycomb example.

**Automated Uncertainty Analysis for Model Calibration and Machine Learning**

David Sheen, NIST

Machine learning (ML), the calibration of mathematical models using experimental data, has enabled tremendous advances in computational science. Such advances include digital character recognition, diagnoses from medical imaging, and forensics. However, models developed from ML are often treated as a black box, where data goes in and predictions come out. Very little concern is given to the uncertainty in the predictions of these models, so not much can be said about their reliability. This becomes a critical concern when dealing with questions that may have life-altering consequences. The central issue is whether life-and-death decisions can turn on the result of a model that no one understands. Knowing the uncertainty in an ML model’s predictions allows for more robust assessments of the model’s performance and can even provide some insight into its underlying behavior. Understanding and appreciation of uncertainty in models would be improved if modeling packages had a more transparent means of estimating uncertainty. Several computational methods have been developed to estimate uncertainty in ML models, including Bayesian uncertainty analysis and bootstrapping. Likewise, these algorithms have been implemented in various computer packages. In this talk, I will discuss existing uncertainty estimation codes, with a focus on the strengths of their implementation and potential areas for improvement. I will focus on NIST’s MUM-PCE as well as a machine-learning uncertainty package under development at NIST, although the conclusions that I draw will be broadly applicable to most uncertainty analysis packages. In general, although uncertainty analysis codes serve well at implementing the uncertainty analysis algorithms, they are still mostly written for uncertainty experts rather than users of common simulation packages.
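To make the bootstrapping idea mentioned in the abstract above concrete, here is a minimal Python sketch of a percentile bootstrap interval for the prediction of a simple calibrated model. It does not use or represent MUM-PCE or any NIST package. The straight-line model, the function names, and the data are assumptions for illustration only.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bootstrap_prediction_ci(xs, ys, x_new, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the fitted mean prediction at x_new,
    obtained by resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples where the slope is undefined
        by = [ys[i] for i in idx]
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    preds.sort()
    lower = preds[int((alpha / 2) * n_boot)]
    upper = preds[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical noisy calibration data.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
    print(bootstrap_prediction_ci(xs, ys, x_new=3.0))
```

The interval describes uncertainty in the fitted mean at `x_new`, not the spread of a new noisy observation, which would require adding a residual term.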

**Poisson Count Estimation**

Michaela Brydon and Erin Leatherman, Kenyon College; Kenneth Ryan, West Virginia University; Michael Hamada, Los Alamos National Laboratory (LANL)

This talk considers a series of manufactured parts, where each part is composed of the same specified regions. Part of the manufacturing process includes counting the number of occurrences of a characteristic in each region for each part. However, after collecting data in this manner for some time, a more expansive data collection method is utilized and the number of regions is increased, such that at least some of the new regional boundaries are different from the original boundaries. While the more numerous regions allow this new setting to describe the locations of occurrences with more precision, the number of observations that can be collected in the new setting is limited (e.g., due to time or financial restrictions). The ultimate goal is to estimate the average number of occurrences for the more numerous regions using both sets of data. This set-up is motivated by a proprietary application at Los Alamos National Lab; thus, simulated data is used in our study. Maximum Likelihood and Bayesian Estimators of the mean number of occurrences in the second setting are explored using analytical and simulation results from the combined datasets. Although this methodology will be discussed in terms of the manufacturing application, this technique and our code can be applied to other settings.

**Monitoring Within and Between Non-Linear Profiles Using a Gaussian Process Model with Heteroscedasticity**

Ana Valeria Quevedo Candela, Universidad de Piura; G. Geoff Vining, Virginia Tech

Some industrial processes, such as those in agriculture, aquaculture, and chemical production, need to identify at an early stage whether they are running out of control so that causes can be investigated and corrective actions taken. Most current profile monitoring procedures use a Hotelling’s T² type of chart that requires the entire profile data. We propose to use a Shewhart control chart based on a Gaussian process (GP) model with heteroscedasticity that takes into account replications to evaluate each profile as it evolves with respect to a mean function. The advantage of the GP model with replications is that it captures both the between- and within-profile correlation. Another advantage is that this chart does not need to wait until the profile ends to evaluate whether its behavior is abnormal. Our results indicate that our proposed chart is effective, especially relative to the current T² procedures under this model.