
Sample size done differently

Category: DfSS

“… we need at least 80 subjects in our sample, otherwise the sample won’t be representative …” This is a remark I often hear while designing user studies. I’ve always wondered: why 80? When I ask for details, the only answer I get is that it has been the way of working for many years. My impression is that the number is meant to ensure the sample encompasses most of the population variation. I checked with my customer this morning, and that is exactly why they ask for 80 subjects: they want the variation between subjects to show up in their sample as well. But can we justify this from a statistical point of view? I think tolerance intervals can provide an answer.

The definition is as follows. Let L < U be two statistics, i.e., quantities calculated from the data, and let F be the cumulative distribution function of the population. Then [L, U] is called a 100β% tolerance interval at confidence level 100(1−α)% if Pr(F(U) − F(L) ≥ β) ≥ 1 − α; in words: with high probability, at least a given large part of the distribution is enclosed between L and U. Typical values are α = 0.05 and β = 0.95.

As an example, a 95% tolerance interval at confidence level 95%, assuming a normal distribution with sample mean 10, sample standard deviation 1 and sample size n = 100, works out to [7.766, 12.234] using Minitab.
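
This interval is easy to reproduce without Minitab. Below is a minimal Python sketch using Howe’s approximation for the two-sided normal tolerance factor (a standard approximation; the function name and SciPy calls are my own choice, and the result matches Minitab’s exact factor to about two decimals):

```python
import math
from scipy import stats

def normal_tolerance_interval(mean, sd, n, beta=0.95, alpha=0.05):
    """Two-sided 100*beta% tolerance interval at confidence level
    100*(1-alpha)%, assuming normality (Howe's approximation)."""
    z = stats.norm.ppf((1 + beta) / 2)       # normal quantile for coverage beta
    chi2 = stats.chi2.ppf(alpha, n - 1)      # lower alpha-quantile, n-1 df
    k = z * math.sqrt((n - 1) * (1 + 1 / n) / chi2)  # tolerance factor
    return mean - k * sd, mean + k * sd

print(normal_tolerance_interval(10, 1, 100))
# -> approximately (7.767, 12.233), in line with Minitab's [7.766, 12.234]
```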

Tolerance intervals come in two flavours: parametric ones, as in the example above, where a specific distribution (here the normal) is assumed, and non-parametric ones, where no specific distribution is assumed. Let’s focus on non-parametric tolerance intervals. Say we take L to be the sample minimum and U the sample maximum. We would like a large part of the population to lie between these two values, because then the sample captures almost the entire population variation. The question now is: how large should my sample be to make this happen? That is, how large must n be so that I can state that the interval from the sample minimum to the sample maximum contains, with high probability (95%, say), at least 95% of the population? Using the sample minimum and sample maximum, the following relation[1] holds between α, β and the sample size n:

n·β^(n−1) − (n−1)·β^n ≤ α

This can be solved iteratively for n, but a good approximation[2] is:

n ≈ (χ²(1−α; 4)/4) · (1+β)/(1−β) + 1/2

where χ²(1−α; 4) denotes the (1−α)-quantile of the chi-square distribution with 4 degrees of freedom.
For α = 1 − 0.95 = 0.05 and β = 0.95 this gives a sample size of n = 93, so roughly 90 subjects are needed. This is rather close to the 80 subjects from the rule of thumb.
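
For readers who want to verify this themselves, here is a small Python sketch (function names and the SciPy/NumPy usage are my own) that solves the relation iteratively, evaluates the approximation, and checks the coverage claim by simulation:

```python
import numpy as np
from scipy import stats

def exact_n(beta=0.95, alpha=0.05):
    """Smallest n with n*beta**(n-1) - (n-1)*beta**n <= alpha (iterative)."""
    n = 2
    while n * beta**(n - 1) - (n - 1) * beta**n > alpha:
        n += 1
    return n

def approx_n(beta=0.95, alpha=0.05):
    """Closed-form approximation from the NIST handbook [2]."""
    return stats.chi2.ppf(1 - alpha, 4) / 4 * (1 + beta) / (1 - beta) + 0.5

print(exact_n())   # 93
print(approx_n())  # about 93.0

# Monte Carlo check: with n = 93, [sample min, sample max] should cover
# at least 95% of the population in roughly 95% of repeated samples.
# The result is distribution-free, so U(0,1) samples suffice (F(x) = x).
rng = np.random.default_rng(0)
samples = rng.uniform(size=(20000, exact_n()))
coverage = samples.max(axis=1) - samples.min(axis=1)
print((coverage >= 0.95).mean())  # close to 0.95
```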

[1] A. Mood and F. Graybill, Introduction to the Theory of Statistics, pp. 515–516.

[2] http://www.itl.nist.gov/div898/handbook/prc/section2/prc255.htm

Tags: nonparametric tolerance intervals

30 June, 2015 Ruud van Lieshout
