## October 5th

### Welcome & Plenary Session (8-9AM)

We Stand on the Shoulders of Giants

Ronald D. Snee, Snee Associates, LLC

Recognition by the Gerry Hahn Quality and Productivity Achievement Award provides the opportunity to celebrate the work of Gerry Hahn, one of the distinguished pioneers of industrial statistics. This important occasion also provides the opportunity to honor the pioneers of industrial statistics: the people who created and nurtured the field and profession that we enjoy today. This presentation will focus on these pioneers and the organizations that got it all started. Along the way we will review their contributions and what we can learn from their experiences to be more effective in today’s world.

### Session 1 (9:15-10AM)

A Statistical Engineering Approach to Problem Solving

Roger Hoerl, Union College

Problem solving is obviously a generic issue common to all disciplines, and a fundamental of both quality improvement and statistics. Individual methodologies, such as Six Sigma, Lean, TRIZ, and GE Workout, to name just a few, have often been proposed in the literature as the “best” or “most powerful” method available for problem solving. However, is there one approach that is always best, regardless of the uniqueness of the specific problem to be solved? Some problems are relatively straightforward, while others are large, complex, and unstructured. Also, what about Big Data analytics; can they be utilized for problem solving in addition to more traditional methodologies? If so, how would they fit? Recent research has clearly shown that for best results the problem-solving approach needs to be tailored to the individual problem. Too often, the problem seems to have been restated to be more suitable to the pre-selected problem-solving method. Given this, how should one sift through the menu of problem-solving approaches, including Big Data analytics, and select the most appropriate method for a given problem? In this presentation, I propose a statistical engineering approach to this question. Since statistical engineering is inherently tool agnostic, it is uniquely suited to help make this decision objectively based on the attributes of the problem to be addressed. A statistical engineering approach provides actionable guidelines for selection of the most suitable problem-solving methodology, which will be illustrated during the presentation.

Data Fusion and Mining of In Situ Monitoring Sensors, Process Modeling, and Defect Characterization in Powder Bed Fusion Additive Manufacturing

Sean P Donegan, AFRL

Although additive manufacturing offers the potential to revolutionize the design of structural metallic components, current limitations in understanding the relationships between process, geometry, and resulting microstructure prevent its general adoption. A key benefit of additive manufacturing technologies is the amount of available data streams characterizing all stages of the process, from engineering design to in situ build monitoring to post-build qualification. Modern analytical methods for data fusion and machine learning can take advantage of this wealth of information to accelerate our understanding of additive processes. We present several techniques for managing and mining additive manufacturing data, including an efficient architecture for the storage and manipulation of “big” additive data, fusion of multiple spatially varying data streams, and mining of these data streams to extract quantitative correlations via dimensionality reduction and clustering. We apply the techniques to several components from canonical additive builds across multiple materials systems (Ti-6Al-4V and nickel-based superalloys).

Statistical Model Uncertainty in Measurements: Three Examples

Adam L. Pintar, NIST

Careful assessments of uncertainty are a critical part of the measurement process. The “Guide to the Expression of Uncertainty in Measurement” (GUM, http://www.bipm.org/en/publications/guides/gum.html) divides uncertainty evaluated by statistical methods (Type A) from other methods (Type B). Statistical assessments of uncertainty are performed conditionally on a set of assumptions, or model. Two common assumptions involve parametric functional forms, e.g., polynomial degree, and statistical independence. Often, after the choice of model is made, the analyst’s uncertainty about that choice is not considered further even though the choice could dramatically impact the uncertainty assessment.

In many situations, a collection of statistical models may be embedded within a global model and thus informed or estimated using the observed data. An advantage of this approach is that any rigorous procedure for quantifying uncertainty will naturally include model uncertainty. Bayesian methods seem particularly useful since calculation of the joint posterior distribution for the global model will naturally incorporate model uncertainty.

In this presentation, three examples are described. The first is concerned with the calibration of thermistors. Part of the original calibration process was choosing a polynomial to use as the calibration function. The example is reworked to include model uncertainty by fitting the polynomial of the highest plausible degree, but “shrinking” the coefficients toward zero to control for over-fitting. Uncertainty in the shrinkage parameter accounts for model uncertainty.

The second example involves data from a computed tomography experiment concerned with comparing two measurements of tumor growth. Imaging phantoms designed to simulate tumor growth were used so that each measurement's ability to predict ground truth, mass, could be quantified. A random coefficients model that allows the sharing of information between phantoms when appropriate was leveraged. The random coefficients model, compared to modeling each phantom separately, incorporates model uncertainty and leads to physically more sensible predictions.

The third example considers the problem of heterogeneity in Standard Reference Materials. During the creation of these materials, great effort is typically expended to make individual units interchangeable or homogeneous. However, repeated measurements are made on multiple units so that a component of uncertainty due to material heterogeneity may be included on the certificate if needed. The “heterogeneity model” used to estimate this component, and typically the values to appear on the certificate, may lead to a large uncertainty relative to the judgment of the subject matter expert for the material. Averaging the “heterogeneity” and “homogeneity” models can help the situation while still integrating model uncertainty.

### Session 2 (10:30AM-12PM)

The Seven Deadly Sins of Big Data

Dick De Veaux, Williams College

As we are all too aware, organizations accumulate vast amounts of data from a variety of sources nearly continuously. Big data advocates promise the moon and the stars as you harvest the potential of all these data. There is certainly a lot of hype. There’s no doubt that savvy organizations are fueling their strategic decision making with insights from data mining, but what are the challenges?

Much can go wrong in the data mining process, even for trained professionals. In this talk I’ll discuss a wide variety of case studies from a range of industries to illustrate the potential dangers and mistakes that can frustrate problem solving and discovery, and that can unnecessarily waste resources. My goal is that by seeing some of the mistakes I have made, you will learn how to take advantage of data mining insights without committing the “Seven Deadly Sins.”

Beyond Reliability: Advanced Analytics for Predicting Quality

William Goodrum, Elder Research, Inc

Predictive analytics techniques model outcomes for individual entities, whether they are widgets coming off an assembly line, claims being serviced by an insurance company, or shoppers considering a purchase. This focus on particular individuals provides a contrasting and complementary view to traditional approaches for the analysis of reliability, such as Kaplan-Meier estimates or Design of Experiments, which generally focus on populations of entities. We will provide a brief overview of predictive analytics and then, using case studies from the energy sector and connected devices/IoT, demonstrate how these approaches can enhance more traditional analyses of reliability.

Planning Fatigue Tests for Polymer Composites

Yili Hong, Virginia Tech

The implementation of computer experiments is a competitive advantage in business environments where fast and cost-effective product development is critical. In many industrial applications computer experiments are replacing physical experiments because the physical creation and testing of prototypes is prohibitive in terms of time and cost. Computer experiments typically involve complex systems with numerous input variables. A primary goal with computer experiments is to develop a metamodel – a good empirical approximation to the original complex computer model – which provides an easier and faster approach to sensitivity analysis, prediction and optimization. This talk will discuss and present new strategies to efficiently design computer experiments whose input factors may not be equally important. The first part of the talk will introduce the Maximum Projection (MaxPro) Design criterion, which automatically maximizes the space-filling properties of a design on projections to all subsets of factors. The MaxPro criterion has already been incorporated into the latest version of JMP and we will illustrate it using real industrial applications. In the second part of the talk, we will present a new sequential design strategy that can quickly screen out potentially inert factors.
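
As a rough illustration of what a projection-based space-filling criterion measures, the sketch below evaluates the MaxPro criterion in the form commonly given in the literature: the average over all pairs of runs of the reciprocal product of squared coordinate differences, raised to the power 1/p. This form is an assumption drawn from the published definition, not from the abstract, and the function name and the two small designs are hypothetical.

```python
from itertools import combinations
from math import prod


def maxpro_criterion(design):
    """Return the MaxPro criterion value for a design (smaller is better).

    `design` is a list of n runs, each a list of p factor settings,
    assumed here to be scaled to the interval [0, 1].
    """
    n = len(design)
    p = len(design[0])
    total = 0.0
    for row_i, row_j in combinations(design, 2):
        # Product of squared coordinate differences for this pair of runs.
        # It is zero whenever the two runs share a level in any factor.
        denom = prod((a - b) ** 2 for a, b in zip(row_i, row_j))
        if denom == 0.0:
            return float("inf")
        total += 1.0 / denom
    n_pairs = n * (n - 1) / 2
    return (total / n_pairs) ** (1.0 / p)


# Two hypothetical 4-run, 2-factor designs on [0, 1]^2.
latin_hypercube = [[0.125, 0.625], [0.375, 0.125], [0.625, 0.875], [0.875, 0.375]]
full_factorial = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

print(maxpro_criterion(latin_hypercube))  # finite value
print(maxpro_criterion(full_factorial))   # infinite: runs coincide in one-factor projections
```

The factorial design scores infinitely badly because its runs collapse onto two points when projected onto either factor, which is the behavior a projection criterion is meant to penalize.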

Comparison of Accelerated Life Test Plans Based on Exact Small-Sample Methodology and Asymptotic Large-Sample Methodology

Caleb King, Sandia National Laboratories

The majority of the statistical literature on optimal test planning for accelerated life testing uses the asymptotic variance of the quantity of interest as the criterion for optimality. Implicit in this procedure is the assumption that the number of samples used during testing is large enough to justify the underlying theory. However, it is often the case that financial and material constraints severely limit the number of samples available for testing. In these small-sample settings, a concern is that the test plans suggested by asymptotic theory may not yield the minimum variance. This paper represents a first attempt to develop optimal accelerated life test plans based on the exact variance of a quantile estimator with small samples, providing an alternative to test plans based on large-sample theory. The optimal location of design points and sample allocation is determined for lognormal and Weibull lifetime distributions with complete data. The resulting test plans are then compared to those suggested by large-sample methods. In addition, the optimal small-sample test plans are used to determine the total number of samples needed to achieve a desired bound on the variance of the quantile estimator.

Benefits and Fast Construction of Efficient Two-Level Foldover Designs

Anna Errore, University of Minnesota

Recent work in two-level screening experiments has demonstrated the advantages of using small foldover designs, even when such designs are not orthogonal for the estimation of main effects. In this talk, we provide further support for this argument and develop a fast algorithm for constructing efficient two-level foldover (EFD) designs. We show that these designs have equal or greater efficiency for estimating the main effects model versus competitive designs in the literature and that our algorithmic approach allows the fast construction of designs with many more factors and/or runs. Our compromise algorithm allows the practitioner to choose among many designs making a trade-off between efficiency of the main effect estimates and correlation of the two-factor interactions. Using our compromise approach practitioners can decide just how much efficiency they are willing to sacrifice to avoid confounded two-factor interactions as well as lowering an omnibus measure of correlation among the two-factor interactions.

Selecting an Orthogonal or Non-Orthogonal Two-Level Design for Screening

David Edwards, Virginia Commonwealth University

This talk presents a comparison of criteria used to characterize two-level designs for screening purposes. To articulate the relationships among criteria, we focus on seven-factor designs with 16–32 runs and 11-factor designs with 20–48 runs. Screening based on selected designs for each of the run sizes considered is studied with simulation using a forward selection procedure and the Dantzig selector. This talk compares Bayesian D-optimal designs, designs created algorithmically to optimize estimation capacity over various model spaces, and orthogonal designs by estimation-based criteria and simulation. In this way, we furnish both general insights regarding various design approaches and a guide for choosing among a few final candidate designs.

### Session 3 (2-3:30PM)

H-Canonical Regression

Joseph G. Voelkel, Rochester Institute of Technology

Multivariate texts use a certain eigenanalysis in MANOVA and multivariate-regression testing. Canonical correlation, not surprisingly, is treated as a separate topic. However, in reduced-rank regression, the most natural formulation of so-called canonical regression leads to the same solution as canonical correlation, and in fact, all three approaches yield the same eigenanalysis. So, this standard approach would appear to be the most natural one to use in a multivariate regression problem in which we want to approximate a solution in lower dimensions.

However, we illustrate through an example that this standard approach clearly does not lead to a reasonable solution. The example is a 2 × 3³ experiment (n = 54 runs) on a PID controller, in which the response consists of p = 311 temperature readings over time and is modeled as a second-order function of the factors (q = 14 predictors). We propose a better solution, called H-canonical regression, which is designed for the situation in which all p responses are measured on, and are to be compared on, the same scale. We show the value of this approach; connect it to a special case of reduced-rank regression; show its relationship to principal components analysis in a limiting case; and compare it to other competing methods. We recommend that this method be used more generally for the same-scale response problem. We also show how this method can be naturally extended to multi-scale response problems, an extension that includes the above-mentioned standard approach as a special case.

Estimating Change Points in Matrix Linear Models

Yana Melnykov, The University of Alabama

**Motivation.** A variety of change point estimation and detection algorithms have been developed for random variables observed over time. In current practice, data acquisition often yields multiple observation units. The traditional treatment of such observations involves the assumption of their independence. In practice, however, this assumption is often inadequate or unrealistic. Thus, the current literature lacks effective methods capable of imposing a correlation structure on observations. The primary goal of the proposed methodology is to develop an effective and modern computerized approach to estimating change points in time series processes when the assumption of independent observations is not tenable. We apply the developed method to the analysis of property crime rates in major US cities in the 21st century. The procedure is capable of taking into account correlation arising from the geographical locations of the analyzed cities.

**Description.** The developed methodology relies on the multivariate transformation and matrix normal distribution. The latter can be used for modeling matrix data in such a way that variability associated with rows and columns is separated and can be individually addressed. The back-transform of the exponential Manly transformation applied to a normally distributed random matrix leads to a flexible distribution that we refer to as matrix Manly distribution. This allows modeling skewed matrix data efficiently and improves the robustness of the developed procedure. The covariance matrix associated with observations preserves its traditional meaning, while the one associated with time can be effectively modeled based on a specified time series process. A change point search procedure is developed based on the Bayesian information criterion. The procedure is generalized to be capable of identifying multiple Phase I change points.

**Significance.** The developed methodology provides an effective mathematical tool for detecting multiple change points in correlated observations. As shown through simulation studies, the proposed approach demonstrates robustness to deviations from the true model. An application to the analysis of property crime rates in 125 cities with population over 100,000 in the United States in the 21st century is considered.

Normalizing the I-Control Chart

Wayne Taylor, Taylor Enterprises, Inc

The I-chart is the only Shewhart control chart in Minitab that does not have the ability to be normalized. A simple method of allowing the data to be normalized is presented. The I-chart is often recommended for trending of complaint data, but complaint data depends on the number of units sold for consumables and the installed base for hardware. For this application it performs similarly to the Laney U’-chart. There are also applications for which it is uniquely suited, like the between/within model with unequal sample sizes per batch. Together, the Average and Normalized I charts will handle most control charting needs, with the only exception being data that are pronouncedly nonnormal. This greatly simplifies the process of selecting a control chart.

CUSUM for Counts: Power Considerations and the Low-Count Regime

James M. Lucas, J.M. Lucas & Associates

In this talk we discuss the power needs of control schemes and give a set of recommendations for the low-count regime. We show that the power needed by monitoring applications is less than the power required by other applications. We show that an ARL ratio (in-control ARL / out-of-control ARL) > 20 gives adequate power for monitoring applications.

The low-count regime has the complication that only large proportional shifts in level can be detected. We discuss the ability of monitoring procedures to detect order of magnitude (OOM) shifts, OOM/2 shifts and a doubling in the low-count regime. We show that it is important to consider not only the size of the shift but also the “data richness” so that the amount of data needed to detect the shift with high power will be obtained. For detecting doubling, we compare the rule of 50 used in clinical trials with a rule of 20 that works for monitoring applications. We also show that, when count levels are low, it is usually not feasible to detect improvement, so only high-side control is generally recommended.

A CUSUM is an optimum detector of a change in distribution. For shift detection, a CUSUM will detect a specified shift faster than any other control procedure that has the same in-control ARL (false alarm frequency). Trying to detect too small a shift can lead to an optimum procedure with poor performance. Optimum does not always imply good.
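
For readers unfamiliar with the mechanics, here is a minimal sketch of a high-side CUSUM for Poisson counts tuned to detect a doubling of the mean. It uses the standard textbook recursion and reference value, not anything stated in the abstract. The function names, the count series, and the decision interval `h = 5` are hypothetical; in practice `h` would be chosen to give the desired in-control ARL.

```python
from math import log


def poisson_cusum_reference(mu0, mu1):
    """Reference value k for detecting a shift in Poisson mean from mu0 to mu1."""
    return (mu1 - mu0) / (log(mu1) - log(mu0))


def high_side_cusum(counts, k, h):
    """Run a high-side CUSUM and stop at the first signal.

    Returns (index of the first signal or None, CUSUM values up to that point).
    """
    s = 0.0
    path = []
    for t, x in enumerate(counts):
        # Accumulate evidence above the reference value; never drop below zero.
        s = max(0.0, s + x - k)
        path.append(s)
        if s >= h:
            return t, path
    return None, path


# Hypothetical low-count series: mean near 1 early, near 3 from index 6 onward.
counts = [1, 0, 2, 1, 0, 1, 3, 2, 4, 3, 2]
k = poisson_cusum_reference(mu0=1.0, mu1=2.0)  # tuned for a doubling of the mean
signal_index, path = high_side_cusum(counts, k=k, h=5.0)
print(k, signal_index)
```

With these made-up counts, the statistic stays at or near zero while the counts hover around 1 and signals four observations after the level rises.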

RSM Split-Plot Designs & Diagnostics Solve Real-World Problems

Shari Kraber, Stat-Ease Inc

By way of several case studies, this talk features innovative diagnostics for split-plot designs that reveal outliers and other abnormalities. These diagnostics are derived from restricted maximum likelihood (REML) tools that estimate fixed factor effects and random variance components. The presentation begins by laying out the benefits of split-plot designs versus traditional response surface method (RSM). Power calculations and fraction of design space are illustrated to size factorial and split-plot designs, respectively. It then demonstrates how diagnostic plots tailored for split-plot designs can reveal surprising wrinkles in experimental data.

Strategies for Near Replicates in Response Surface Analysis

Tom Bzik

In the practice of design of experiments and response surface methodology, pure-error estimates based on replication are highly desirable. These model-independent estimates of experimental error, derived from the variability in the responses when the factor settings are identical, provide a powerful statistic that is particularly useful in communicating inherent system noise to decision makers, since they are intuitive to explain and calculate. In addition, pure-error estimates are required for evaluating model lack-of-fit and are helpful for assessing factor significance. Often factor levels for the replicates of the independent variables are precisely measured rather than precisely set, resulting in near replicates. Departures from exact set point values do not prevent model building by regression. However, near replicates complicate the process of obtaining an estimate of pure-error by potentially inflating the estimate due to set point variation. This diminishes the usefulness of these estimates for model lack-of-fit determination and subsequent model interpretation. In this presentation, we review, propose, and compare various strategies to estimate pure-error from near replicates that range from simple to complex and seek to provide practical advice to practitioners.
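
To make the starting point concrete, the sketch below computes the classical pure-error mean square from exact replicates by pooling the within-group sums of squares over runs with identical factor settings. It is not one of the near-replicate strategies from the talk. The function name and the data are hypothetical.

```python
from collections import defaultdict


def pure_error_variance(settings, responses):
    """Pooled within-group variance over runs with identical factor settings.

    Returns (pure-error mean square, pure-error degrees of freedom).
    """
    groups = defaultdict(list)
    for x, y in zip(settings, responses):
        groups[tuple(x)].append(y)
    ss = 0.0
    df = 0
    for ys in groups.values():
        if len(ys) < 2:
            continue  # unreplicated settings carry no pure-error information
        mean = sum(ys) / len(ys)
        ss += sum((y - mean) ** 2 for y in ys)
        df += len(ys) - 1
    if df == 0:
        raise ValueError("no replicated settings, so no pure-error estimate")
    return ss / df, df


# Hypothetical two-factor data with three center-point replicates and one replicated corner.
settings = [(0, 0), (0, 0), (0, 0), (1, -1), (1, -1), (-1, 1)]
responses = [10.0, 11.0, 12.0, 20.0, 22.0, 15.0]
mean_square, dof = pure_error_variance(settings, responses)
print(mean_square, dof)
```

Because grouping is by exact equality, a run recorded at (0.01, 0) would not be pooled with the center points. That is the difficulty with near replicates the abstract describes: the runs must either be discarded from the estimate or grouped by some rule that risks inflating it.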

### W. J. Youden Address (4-5PM)

There is no “I” in Youden, but there is “You”!

Steven Bailey, Principal Consultant and Master Black Belt

“Statistics: Powering a Revolution in Quality Improvement” is the theme for this year’s 61st Fall Technical Conference (FTC). In his Youden address 25 years ago at the 36th FTC, John Cornell described “W.J. Youden – The Man and His Methodology” as follows: “Jack Youden believed in continuous improvement; he spent his life improving the ways measurements are taken”. Indeed, Youden’s 1962 book is simply titled Experimentation and Measurement, and these interrelated topics are the focus of today’s Youden address. Experiences, approaches, and examples in the areas of Measurement Systems Analysis (MSA) and Design of Experiments (DOE) will be shared, based on decades of application in the chemical and process industries. First, a roadmap for MSA will be presented that has proved useful in guiding MSA studies for Six Sigma projects. Then the Strategy of Experimentation that has been taught and used successfully in DuPont for over 50 years will be reviewed. It involves using two-level screening designs – Plackett-Burman or fractional factorial designs – to identify the important process factors, followed by response surface designs – central composite or Box-Behnken designs – to identify the optimal settings for the critical few factors. The importance of including both “design” (or “control”) and “environmental” (or “noise”) factors in these studies to achieve both “functional” and “robust” products will be illustrated. The role of custom (algorithmic or optimal) designs and Definitive Screening Designs (the “new tool on the block”) in this strategy will be discussed. Finally, some comments on “Big Data” and the combined power of both “Historical Data Mining” and DOE will be shared.

## October 6th

### Session 4 (8-9:30AM)

Manufacturing Data Fusion

Ran Jin, Virginia Tech

Modern manufacturing needs to optimize the entire manufacturing supply chain and product lifecycles to satisfy highly diverse customer needs. With the deployment of sensor networks and computing technologies, manufacturing with data-driven decision making is able to achieve a high level of adaptability and flexibility in smart design, operations, and services with high efficiency and low cost, which is called smart manufacturing. A smart manufacturing system generates both temporally and spatially dense data sets. Heterogeneity and high dimensionality of the data pose great challenges for manufacturing decision making. This is primarily due to the lack of systematic models that handle mixed types of data representing different types of information. Motivated by this need, this presentation will summarize our recent research progress in data fusion methodologies, with broad applications in manufacturing industries.

Modeling and Diagnosis of Manufacturing Time Series Data via a Natural Language Processing Perspective

Hongyue Sun, University at Buffalo, The State University of New York

Time series data contain rich information about manufacturing activities, process conditions, etc. in smart manufacturing. These time series data are widely used in data-driven approaches for manufacturing variation reduction, efficiency improvement, and defect mitigation. However, the data analytics results usually are not easily interpretable. On the other hand, many critical decisions in manufacturing, such as troubleshooting and fault diagnosis, still need to be made by human operators. It is important to bring insights and knowledge discovered from time series data analytics to human operators in a clearly interpretable form. Motivated by this observation, we propose to analyze manufacturing time series data from a natural language processing perspective, and to perform fault diagnosis based on this modeling perspective. The methodology is validated on defect modeling and fault diagnosis for a crystal growth process.

Saving Lives with Statistics: A Framework for Statistical Applications in Pharmaceutical Manufacturing & Development

Julia O’Neill, Tunnell Consulting

A revolution in the discovery of life-changing medicines has turned up the heat on accelerating new products through design and validation into commercial production. Meanwhile, expectations for control of pharmaceutical manufacturing have shifted from inspection-based monitoring to designing quality into the process. These two forces are driving high demand for the use of statistics throughout development and manufacturing.

As a result, well-qualified FDA clinical statisticians are being asked to review manufacturing statistical applications, statisticians from other industries are finding meaningful work in pharmaceuticals and biotech, and engineers and scientists are building their practical knowledge of applied statistics.

This presentation will cover a framework for the application of statistics in the pharmaceutical industry, focusing on the big picture and what’s important. The material is relevant for formally trained statisticians without extensive experience in pharmaceutical manufacturing, and also for scientists and engineers interested in applying statistical methods in their work.

Stability Assessment with the Stability Index

John Szarka, W. L. Gore

In the pursuit of quality, process capability and process performance indices are simple tools to measure the ability of a process to meet customer requirements. To ensure a high-quality process, it is also important to assess process stability, where stability is defined as the absence of special causes. While control charts are extremely valuable and widely used as a graphical tool to assess process stability, they become more challenging to use when there are many variables of interest or many processes to monitor. The challenge is not an inability to get enough data; rather, it is to sift quickly through large amounts of data to focus improvement efforts on what is important at the moment. A simple numerical value could help quickly identify lower quality processes that would benefit from special cause reduction or consistency improvement. We will describe a metric, the stability index, that is simple to calculate, easy to interpret, sensitive to a variety of unstable scenarios, and that can be used in conjunction with the capability index. The stability index is not intended as a substitute for a control chart or as a final determination of instability, but as a preliminary trigger to signal when one needs to look more deeply. We will provide examples of the stability index computation, give guidance for its interpretation, and compare it to some other approaches in the literature. In addition to showing the stability index for the individual and grouped data scenarios, we will show how it can be applied when there is common cause variation within and between subgroups. We will also explain the relationship between the common capability indices and the stability index, and how they can be used together in a graph to effectively identify improvement opportunities. In summary, we believe industry has focused too much on capability at the expense of process stability. It is our hope that it will become more common for quality practitioners to use the stability index in conjunction with control charts and capability indices to improve quality across many processes.

An Update on Statistical Learning Methods Applied to Process Monitoring

L. Allison Jones-Farmer, Miami University

In 2016, “Statistical Learning Methods Applied to Process Monitoring: An Overview and Perspective” was published in JQT. In that paper, my co-authors and I discussed how the increasing availability of high volume, high velocity data, often with mixed variable types, brought new challenges for process monitoring and surveillance. In this talk, I will give our view of the current state of research and application of these monitoring tools, and highlight the main research areas in data-driven process monitoring. We will consider some of the benefits and many of the challenges that are presented with the application of data-driven methods in practice. One of the primary challenges for the use of data-driven methods is the need for a large representative training sample upon which to base a model; thus, we will take a closer look at the use of data-driven approaches that can be used in Phase I for establishing a baseline sample. In particular, we will consider existing methods based on Support Vector Data Description (SVDD) for use in Phase I and introduce a novel method for outlier detection that works for wide (p > n) data. Finally, I will offer some insight and suggestions for future work in this area.

A Bayesian Approach to Diagnostics for Multivariate Control Charts

Steven Rigdon, SLU

When a multivariate control chart raises an out-of-control signal, several diagnostic questions arise. When did the change occur? Which components or quality characteristics changed? For those components for which the mean shifted, what are the new values for the mean? While methods exist for addressing these questions individually, we present a Bayesian approach that addresses all three questions in a single model. We employ Markov chain Monte Carlo (MCMC) methods in a Bayesian analysis that can be used in a unified approach to the diagnostics questions for multivariate charts. We demonstrate how a reversible jump Markov chain Monte Carlo (RJMCMC) approach can be used to infer (1) the change point, (2) the change model (i.e., which components changed), and (3) post-change estimates of the mean.

### Session 5 (10-11:30AM)

Analyzing Supersaturated Designs

Maria Weese, Miami University

Screening experiments, such as Plackett-Burman designs or resolution III fractional factorials, are used in the early stages of experimentation where the goal is to identify the important factors. When experiments are constrained, either temporally or economically, a typical screening design might require too many resources. A supersaturated design, where *n* < *k*+1, might then be an attractive alternative. Clearly, supersaturated designs have too few runs to allow for the estimation of all main effects and thus require the experimenter to rely heavily on the assumption of effect sparsity. The analysis strategy for a supersaturated design is an important consideration and has been a popular topic in recent literature. This talk will review several of the author’s findings as well as the conclusions of others and attempt to provide recommendations on the analysis of supersaturated designs.

Testing for Lack of Fit and Split-Plot Design

Peter Goos, KU Leuven & University of Antwerp

Textbooks on response surface methodology emphasize the importance of lack-of-fit tests when fitting response surface models, and stress that, to be able to test for lack of fit, designed experiments should have replication and allow for pure-error estimation. In this presentation, we show how to obtain pure-error estimates and how to carry out a lack-of-fit test when the experiment is not completely randomized, but a blocked experiment, a split-plot experiment, or any other multi-stratum experiment. Our approach to calculating pure-error estimates is based on residual maximum likelihood (REML) estimation of the variance components in a full treatment model (sometimes also referred to as a cell means model). It generalizes the one suggested by Vining et al. (2005) in the sense that it works for a broader set of designs and for replicates other than center point replicates. Our lack-of-fit test also generalizes the test proposed by Khuri (1992) for data from blocked experiments because it exploits replicates other than center point replicates and works for split-plot and other multi-stratum designs as well. We demonstrate how to perform the lack-of-fit test in the SAS procedure MIXED. We re-analyze several published data sets and discover a few instances in which the usual response surface model exhibits significant lack of fit.

Using Innovative Statistics for Increased Quality, Faster Development

Katherine Giacoletti, SynoloStats LLC

When most people think of innovation in pharmaceutical (and other product) development, they usually think of innovation in formulation, packaging, dosage forms, analytical methods, or manufacturing technologies. However, innovation in statistics is also a key driver of improvements in product development and manufacturing. Well-known examples include statistical experimental designs (factorial designs, split-plot designs, etc.), regression, and ANOVA. However, mathematical innovation did not stop with those now-familiar tools; leveraging recent and ongoing innovations in statistics presents a ripe opportunity for improving quality and accelerating development, without increasing risk.

This talk will present examples of modern innovations in statistical methodology applicable to product development and manufacturing. Applications of these methodologies will be motivated through examples. The talk will also address how implementation of these types of tools can be integrated into the larger business and scientific contexts within a company. The examples will be drawn from the pharmaceutical industry, but the concepts are broadly applicable to any product/process development.

Failure to embrace statistical innovation inhibits the full realization of the benefits of innovation in other scientific areas relevant to product development and manufacturing. Statistics, like other sciences, continues to develop and new methodologies and tools provide valuable means for accelerating timelines, improving quality, and decreasing risk. Full integration of statistics into the product and process lifecycle, from development through commercial supply, is essential to modern quality by design and quality risk management.

A Panel Discussion on Collaboration: Increasing the Likelihood of Success

Peter Parker, NASA

Willis Jensen, W. L. Gore

Jennifer van Mullekom, Virginia Tech

A single statistician or statistical organization cannot possess all the skills and information necessary to answer increasingly complex data-based questions in industry. The ability to collaborate is a critical factor to successfully address this increasing complexity. Statistics and analytics problems can benefit immensely from inter-organizational collaborations among industry, government, and academia. Many of these have been the genesis of creative solutions and techniques commonly used today. While the American Statistical Association has actively recognized such collaborative efforts through SPAIG and the SPAIG award, collaborative relationships often evolve slowly and organically through networking. However, we feel this process can be both actively facilitated and accelerated through awareness, best practices, and readiness.

This panel discussion will address several issues related to encouraging collaborative statistical problem solving among academia, government, and industry including:

• What are some of the barriers to collaboration?

• How can we increase the number of collaborations?

• How can we increase the speed at which collaborations evolve?

• How can we overcome the confidentiality and cost barriers related to collaboration?

We will discuss ways in which we have fostered collaborations and addressed obstacles based on our own experiences. The audience is encouraged to share their own experiences and best practices related to collaboration. In addition, we will highlight the Industrial Statistics Virtual Collaboratory (ISVC), which is an online forum designed to promote statistical collaborations among industry, academia, and government, located at http://community.amstat.org/spes/blogs/byran-smucker. We look forward to an active exchange of insights on the topics described.

Quantitative Measures to Evaluate Process Stability and Assess Process Health

Brenda Ramirez, Amgen

In this talk, I discuss two measures that can be used to assess the stability of a process and classify the underlying variability for a process parameter as common cause or special cause. Test criteria for each of the two measures are suggested, and their overall performance is compared to the performance of Shewhart Charts. One stability metric, the SR Test, is combined with the process capability index Ppk, in order to visualize the overall health of hundreds of parameters using the Process Screening platform in JMP®, version 13.

Statistical Detective Work to Understand the Isotopic Ratios in Drum 68660 and the Radioactive Release at WIPP

Elizabeth J. Kelly, Los Alamos National Laboratory

An incident at the Department of Energy’s Waste Isolation Pilot Plant (WIPP) in 2014 resulted in the release of radioactive material into the environment. Initially, it was known that at least one drum in WIPP, identified as drum 68660, was involved. However, questions remained. Could the air-monitor isotopic ratios measured in WIPP at the time of the release be explained by materials in drum 68660 or were other drums involved? Could internal conditions in drum 68660 have caused the breach? What were the implications for 68660’s sister drum? These questions needed to be answered as quickly as possible. This analysis, which was completed in 3 weeks, combined combinatorics and uncertainty analysis to provide scientists with the timely evidence they needed to either answer these important questions or to design experiments to answer them.

### Luncheon (11:45AM-1:15PM)

It’s Not What We Said, It’s Not What They Heard, It’s What They Say They Heard

Barry D. Nussbaum, President of the American Statistical Association

Statisticians have long known that success in our profession frequently depends on our ability to succinctly explain our results so decision makers may correctly integrate our efforts into their actions. However, this is no longer enough. While we still must make sure that we carefully present results and conclusions, the real difficulty is what the recipient thinks we just said. This presentation will discuss what to do, and what not to do. Examples, including those used in court cases, executive documents, and material presented for the President of the United States will illustrate the principles.

### Session 6 (1:30-3PM)

Novel Statistical Tools for Robust Process Monitoring for Biopharmaceutical Drug Products

Nitin J. Champaneria, Genentech

1. Motivation: Increased robustness in process monitoring is being demanded by regulators to meet Continuous Process Verification (FDA, ICH) guidelines. The biopharmaceutical industry faces a significant challenge in applying conventional SPC techniques due to short run lengths, questionable conformance to underlying SPC assumptions, and increased occurrence of false alarms. As SPC run (Nelson) rules are added to distinguish special cause situations from common cause, the false alarm rate increases. Novel tools are needed to increase the robustness of process monitoring and to make reliable capability assessments so that high quality drug products are manufactured efficiently.

2. A new paradigm for PM/PA (Process Monitoring/Performance Analysis) is proposed for reliable and efficient process monitoring and process capability assessment. Easy-to-use metrics are described to reliably and efficiently detect true quality trends. We propose that the stability of the process be first evaluated using stability metric tests. If the process is stable, capability assessment can be done with traditional Cpk analysis for normally distributed data and with a quantile-based capability index for data that are not normally distributed. If the process is not stable, capability assessment should not be done. The process must be fixed to remove “special causes” of poor process stability.

3. Significance: We provide a new point of view and simple, novel tools for reliable SPC and capability analysis. Utilization of these tools will eliminate wasted effort and increase the accuracy of process monitoring investigations in the biopharmaceutical industry.
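
As a rough illustration of the capability step in item 2, the sketch below computes a traditional Cpk and a quantile-based analogue. The abstract does not say which quantile-based index is used; the percentile form here (median in place of the mean, empirical 0.135% and 99.865% quantiles in place of the 3-sigma distances) is an assumption, as are the function names and the simulated data. The stability check is assumed to have been passed already.

```python
import random
import statistics

def quantile(sorted_data, p):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = p * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])

def cpk_normal(data, lsl, usl):
    """Traditional Cpk: distance from the mean to the nearer limit, in units of 3 sigma."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return min(usl - mean, mean - lsl) / (3 * sd)

def cpk_quantile(data, lsl, usl):
    """Assumed percentile-based analogue: median and extreme quantiles replace mean and 3 sigma."""
    xs = sorted(data)
    med = quantile(xs, 0.5)
    q_lo = quantile(xs, 0.00135)
    q_hi = quantile(xs, 0.99865)
    return min((usl - med) / (q_hi - med), (med - lsl) / (med - q_lo))

# Hypothetical stable, normally distributed process with specs at 6 and 14.
rng = random.Random(0)
sample = [rng.gauss(10.0, 1.0) for _ in range(20000)]
print(f"Cpk (normal):   {cpk_normal(sample, 6.0, 14.0):.2f}")
print(f"Cpk (quantile): {cpk_quantile(sample, 6.0, 14.0):.2f}")
```

For normally distributed data the two indices should roughly agree; they diverge for skewed data, which is where a quantile-based index is meant to be used. Estimating such extreme quantiles empirically needs a large sample.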

On the Statistical Monitoring of Interpersonal Organizational Networks

Marcus Perry, University of Alabama

Statistical process control (SPC) charts have traditionally been applied to manufacturing processes; however, more recent developments highlight their application to social networks. Although recent papers emphasize the use of SPC for detecting changes in the communication levels of a network or subnetwork, methods for detecting changes in the structural tendencies of a graph are lacking. Structural balance theories have been well studied in the social sciences literature, where the “balance” of a network is most often assessed via the relationship between the reciprocal and transitive tendencies of a graph. For example, networks having low reciprocity and high transitivity tend to be hierarchical, and are often associated with increased network centralization. Further, the root cause of increased centralization can often stem from conflict or crisis affecting the organization. Armed with this understanding, we propose two useful control-charting strategies for quickly detecting shifts in the structural balance of a directed network over time. Such strategies might be useful to organizational managers when quickly detecting the onset of conflict or crisis within the organization is of interest. Using Monte Carlo simulation, we assess the ARL performances of our strategies, relative to other relevant methods proposed in the literature. Application of our proposed control-charting strategies is then demonstrated using the open source Enron email corpus.
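
The abstract does not describe the proposed network charts in detail, so the sketch below only illustrates the general idea of estimating an average run length (ARL) by Monte Carlo. It uses a plain individuals Shewhart chart with known in-control mean 0 and standard deviation 1, which is an assumption and not the authors' method; the function name and parameters are hypothetical.

```python
import random

def simulate_arl(shift=0.0, limit=3.0, reps=2000, seed=1):
    """Estimate the ARL of a Shewhart chart with +/- `limit` sigma control limits.

    Observations are drawn from N(shift, 1); a run ends at the first
    observation falling outside the control limits.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        run_length = 0
        while True:
            run_length += 1
            if abs(rng.gauss(shift, 1.0)) > limit:
                break
        total += run_length
    return total / reps

# In-control ARL (no shift) versus ARL after a one-sigma mean shift.
print(f"ARL, no shift:      {simulate_arl(0.0):.1f}")
print(f"ARL, 1-sigma shift: {simulate_arl(1.0):.1f}")
```

A good chart has a long in-control ARL (few false alarms) and a short out-of-control ARL (fast detection); comparing competing charts at a matched in-control ARL is the usual basis for the kind of performance study mentioned above.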

How to do Gage R&R when you can’t do a Gage R&R

Thomas Rust, Autoliv

A Gage R&R is a common measurement system analysis for variable data in most manufacturing industries. Sometimes performing a gage R&R to evaluate a measurement system is not possible or it becomes cumbersome and infeasible. We propose some alternate methods of setting up and performing an MSA that use the same statistics and give the same results for these conditions:

- A measurement method that cannot be repeated but is not destructive
  - Measuring the torque while installing a nut
  - Measuring the crimp depth while crimping
- When many independent measurements are measured on the same part by the same measurement method
- When one type of gage is used to measure similar characteristics across multiple processes and multiple parts
  - CMM for dimensional measurements on many parts
  - Calipers to measure dimensions

We will discuss the methods to set up these studies, calculate results similar to Gage R&R, and evaluate them. We will also talk about assumptions and how to evaluate them as well as cautions that need to be taken. We will discuss how these methods can be evaluated over time to look for changes and still be simpler than periodic Gage R&R studies.

Application of Simulation to Sample Size Calculations

Louis Johnson, SnapDat, Inc.

Modern statistical software opens the door to sample / resample and random sample generation simulation routines to statisticians and engineers alike. Simulation can answer many common sampling plan questions. How can the overall sample size of a Gage R&R be reduced by taking 3 replicates instead of 2? Why is sample size critical in a nested variability study? How many samples are required to estimate process capability? How should samples be divided between parts, operators and gages in an Expanded Gage R&R? Why is the use of Percent Study Variation in an MSA never a good idea? Simulation can answer these questions in situations where an exact solution is not readily available. It also provides a rationale that is easily understood and accepted by non-statisticians. This presentation will walk through several examples from Johnson & Johnson, Intuitive Surgical, NuSil Silicone Technologies and others to demonstrate how to use simple simulations to develop smart data collection plans for several descriptive study scenarios.
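
As a minimal sketch of the kind of simulation described above, the code below addresses one of the questions (how many samples are required to estimate process capability) by simulating how widely a Ppk estimate varies at different sample sizes. The normal process, its parameters, the specification limits, and the function name are all assumptions chosen for illustration, not examples from the talk.

```python
import random
import statistics

def simulate_ppk_spread(n, true_mean=10.0, true_sd=1.0,
                        lsl=6.0, usl=14.0, reps=2000, seed=1):
    """Return the 5th and 95th percentiles of Ppk estimates from samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        m = statistics.fmean(sample)
        s = statistics.stdev(sample)
        estimates.append(min(usl - m, m - lsl) / (3 * s))
    estimates.sort()
    return estimates[int(0.05 * reps)], estimates[int(0.95 * reps)]

# The true Ppk for these assumed settings is (14 - 10) / (3 * 1) = 1.33.
for n in (10, 30, 100):
    lo, hi = simulate_ppk_spread(n)
    print(f"n={n:4d}: 90% of Ppk estimates fall in [{lo:.2f}, {hi:.2f}]")
```

Reading off the sample size at which the interval becomes narrow enough for the decision at hand gives a data collection plan that is easy to explain without deriving the sampling distribution of Ppk.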

What My Experiment Died From: Common Types of Sources of Variation in Designed Experiments

Katherine Allen, North Carolina State University

At the heart of every study is the basic need to create a methodical, reproducible, and well-designed experiment, beginning with the identification of all potential major sources of variation. This is often a daunting and tedious task for anyone with little experience designing experiments. Courses in design of experiments focus their attention on techniques for reducing the impact of sources unrelated to the treatment factor(s), such as analysis of covariance or blocking. However, these techniques are useless if a crucial source is overlooked. To prevent neglecting a variance source, experimenters and data analysts must have a clear understanding of the types of variation that occur within a general experiment. This understanding is eventually gained through years of practice in design and analysis, leading to many failed or irreproducible experiments and poorly analyzed data in the meantime. The goal of this talk is to summarize past discussions of major sources of variation and give illustrative examples of relevant experiments based on the authors’ experiences. The summary streamlines identification of the most common types of sources of variation, increasing one’s ability to (1) design a successful experiment and (2) assess and articulate potential issues with data collected from an experiment s/he did not design.

Strategies for Mixture-Design Space Augmentation

Martin Bezener, Stat-Ease, Inc.

One of the critical steps during the planning of a mixture experiment is the selection of the component lower and upper bounds. These bounds should be chosen wide enough such that an interesting change in the response occurs somewhere in between the lows and highs, but not so wide as to create extreme conditions. Ideally, any optimum should occur near the center of the experimental region, where prediction precision is high. In practice, however, at least one of these bounds is often chosen to be too narrow, causing an optimum to occur along an edge or vertex of the experiment-design space. In our experience, industrial experimenters will often simply re-run an entirely new experiment with wider ranges due to a lack of readily available augmentation strategies. Augmentation and sequential strategies for response surface methodology (RSM) have typically been studied in the context of model order or sample size. However, strategies for the expansion of the region for experimentation are sparse, especially for mixture designs. In this talk, we briefly describe the problem at hand for RSM in general. Then we pay particular attention to mixture experiments, where expanding component ranges is more complicated because the components depend on one another. We propose several strategies, including an optimal DOE space augmentation algorithm. Several examples will be given.

### SPES Special Session (3:15-5:15PM)

Statistics Training for Industry: Up-to-Date or Out-of-Date?

William Myers, Procter & Gamble

Willis Jensen, W. L. Gore & Associates

Ron Snee, Snee & Associates

Jennifer van Mullekom, Virginia Tech

As data collection and computing power have evolved, so have the statistical methodologies used to make decisions. Companies need employees to have strong statistics and analytics skills in order to be successful. Routine application of these skills by non-statisticians produces smarter data-driven business decisions resulting in a competitive advantage for the company.

Academic preparation for non-statisticians may be limited in scope. Furthermore, it may lack the perspective of collaborative problem solving through the art of statistical practice. The task of elevating statistics and analytics skillsets requires industrial statistics groups to devote resources to developing and delivering training. Some groups will outsource training in the form of pre-existing or custom designed short courses.

So, how is the statistics community performing on the critical task of training non-statisticians? This panel will examine the issue, “Statistics Training for Industry: Up-to-Date or Out-of-Date?”, related to training content and delivery. The panelists represent industry, academia, and consulting and have extensive experience in developing and delivering training. Please attend for a lively and informative discussion, including Q&A on the following issues and more:

• Why is statistics training important?

• What are best practices in developing and delivering training?

• What is the current curriculum for industry training?

• What does the future of training look like?

• How can academic preparation in statistical practice for non-statisticians evolve to meet corporate needs?