I need someone with experience in statistics to help me with this test, which starts in 1 hour.
I have uploaded the study materials and a test sample.
The payment is here and I will send the questions on WhatsApp.
Instructions of Test-2 (STAT-101)
The due date of Test 2 is Monday, January 30, 2023, at 11:00 PM.
Test 2 covers the material of Weeks 5, 7 & 8
The test consists of 25 questions
10 T/F (0.25 marks each) and 15 MCQ (0.5 marks each)
Total Marks = 10
You have only one attempt.
You have a time limit of 5 hours (300 minutes).
This assignment will be saved and submitted automatically when the time (5hrs) is expired.
This assignment can be saved and resumed at any point until the time (5hrs) has expired.
The time will continue to run if you leave the test.
STAT 101_SEU 00967775703091 Assignment _2_2023 2_نموذج
Review Test Submission: Assignment2-STAT101-2022-23-2nd
User
Course (Current Semester) STAT-101: Statistics
Test Assignment2-STAT101-2022-23-2nd
Started 1/27/23 11:52 PM
Submitted 1/28/23 1:55 AM
Due Date 1/30/23 11:00 PM
Status Completed
Attempt Score 9.75 out of 10 points
Time Elapsed 2 hours, 3 minutes out of 5 hours
Instructions Instructions of Assignment-2(STAT-101)
The display date of Assignment 2 is Wednesday, January 25,
2023, 11:00 P.M.
The due date of Assignment 2 is Monday, January 30, 2023, at
11:00 PM.
Assignment 2 covers the material of Weeks 5, 7, & 8 (Chapters-6,
7 & 8)
The assignment consists of 25 questions
10 T/F (0.25 marks each) and 15 MCQ (0.5 marks each)
Total Marks = 10
You have only one attempt.
You have a time limit of 5 hours (300 minutes).
This assignment will be saved and submitted automatically when
the time (5hrs) is expired.
This assignment can be saved and resumed at any point until the
time (5hrs) has expired.
The time will continue to run if you leave the test.
Good luck!
Saturday, January 28, 2023 1:56:01 AM AST
Question 1
If the total area under standard normal probability distribution is k+1, then the value
of k is zero.
True
False
Question 2
If the z-score of normal distribution is –2.50, the mean of the distribution is 35 and the
standard deviation of normal distribution is 2, then the value of X for a normal
distribution is 40.
True
False
Question 3
Given that Z is a standard normal random variable. If P(Z > k)=0.0505, then the value of k is
1.64
True
False
Question 4
A confidence interval (or interval estimate) is a range (or an interval) of values used
to estimate the true value of a population parameter.
True
False
Question 5
The sample mean is not the best point estimate of the population mean.
True
False
Question 6
If the P-value for a one-sided test for testing a mean is 0.05, then the P-value for the
corresponding two-sided test would be 0.01.
True
False
Question 7
The probability of rejecting the null hypothesis when it is true is called Level of
significance.
True
False
Question 8
The alternative hypothesis for the following claim: “A car Company claims that its new car
will average more than 40 miles per gallon in the city” is H1: µ < 40.
True
False
Question 9
The alternative hypothesis for the following claim: “A motorbike company claims that its new
model will give an average at least 60 km/l on a long route” is Ha: µ < 60.
True
False
Question 10
If the original claim says that the mean working hours in a day are the same for men and
women in a company, then symbolically it is represented as p1 = p2.
True
False
Question 11
You are given the following hypothesis test:
H0: μ=100
H1: μ ≠ 100
The calculated test statistic z = –1.0, and the critical value of z = ±1.97. Then, the
decision would be to:
Reject H0 since z < –1.97
Reject H0 since –1.97 < z < 1.97
Fail to reject H0 since –1.97 < z < 1.97
Fail to reject H0 since z < –1.97
Question 12
A prescription allergy medicine is supposed to contain an average of 245 parts per
million (ppm) of active ingredient. The manufacturer periodically collects data to
determine if the production process is working properly. A random sample of 64 pills
has a mean of 250 ppm with a standard deviation of 12
ppm.
Let µ denote the average amount of the active ingredient in pills of this allergy
medicine. The null and alternative hypotheses are H0: µ = 245, Ha: µ ≠ 245. The
level of significance is 1%.
The t-test statistic is 3.33 with a P-value of 0.0014. What is the correct conclusion?
The mean amount of active ingredient in pills of this allergy medicine is equal to 245
ppm.
The mean amount of active ingredient in pills of this allergy medicine is equal to 250
ppm.
The mean amount of active ingredient in pills of this allergy medicine is not equal to 245
ppm.
The mean amount of active ingredient in pills of this allergy medicine is greater than 245
ppm.
Question 13
Among 169 Egyptian-African men, the mean systolic blood pressure was 145 mmHg
with a standard deviation of 26. The t-test statistic to conclude that the mean systolic
blood pressure for a population of Egyptian-African men is greater than 142 is
-2.5
-1.3
1.5
-1.5
Question 14
The degree of confidence is equal to:
1-α
β
α
1-β
Question 15
When carrying out a large sample test of H0: µ0 = 50, Ha: µ0 < 50, we reject H0 at
level of significance α when the calculated test statistic is:
Greater than zα
Less than – zα
Greater than zα/2
Less than zα
Question 16
A sample of 100 body temperatures has a mean of 98.6 °F. Assume that σ is known to
be 0.5 °F. Use a 0.05 significance level to test the claim that the mean body
temperature of the population is equal to 98.5 °F, as is commonly believed. What is
the value of the test statistic for this test?
1.0
3.0
-2.0
2.0
Question 17
With H0: μ = 100, Ha: μ < 100, the test statistic is z = – 1.75. Using a 0.05
significance level, the P-value and the conclusion about null hypothesis are (Given
that P(z < 1.75) =0.9599)
0.0401; reject H0
0.9599; fail to reject H0
0.0401; fail to reject H0
0.9599; reject H0
Question 18
When a passing student is failed by an examiner, it is an example of:
Type-I error
Type-II error
Best Decision
All of above
Question 19
The confidence interval, 0.548 < p < 0.834 is obtained for a population proportion, p.
The margin of error, E using these confidence interval limits is
0.143
0.286
0.691
1.382
Question 20
If the point estimate p̂ is 0.8 and the lower confidence limit is 0.6, then the upper
confidence limit is:
1.0
0.7
0.6
0.4
Question 21
If the Margin of error E is 0.5 and the upper confidence limit is 9, then the lower
confidence limit is:
10
14
8
2
Question 22
Evaluate P(-1< Z< 2), where P(Z < 2)=0.9772 and P(Z< -1)=0.1587
c. -0.1359
d. 0.8185
b. 0.1359
a. -0.8185
Question 23
The normal probability distribution curve is symmetrical about mean µ. Then P(X <
μ) = P(X > μ) is equal to
0.25
0
0.50
0.75
Question 24
Which of the following is NOT true regarding the normal distribution?
d. The points of the curve meet the X-axis at z = –3 and z = 3
b. It has a single peak
c. It is symmetrical
a. Mean, median and mode are all equal
Question 25
Assume that the thermometer readings are normally distributed with a mean of 0°C and a
standard deviation of 1°C for freezing water. If one thermometer is randomly selected, find the
probability that it reads (at the freezing point of water) greater than -1.75 degrees.
a. 0.0401
b. -0.9599
c. 0.9599
d. None
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.
Lecture Slides
Elementary Statistics
Eleventh Edition
and the Triola Statistics Series
by Mario F. Triola
Chapter 7
Estimates and Sample Sizes
7-1 Review and Preview
7-2 Estimating a Population Proportion
7-3 Estimating a Population Mean: σ Known
7-4 Estimating a Population Mean: σ Not Known
7-5 Estimating a Population Variance
Section 7-1
Review and Preview
Review
❖ In Chapters 2 & 3 we used “descriptive statistics” when we summarized data using tools such as graphs, and statistics such as the mean and standard deviation.
❖ In Chapter 6 we introduced critical values: zα denotes the z score with an area of α to its right. If α = 0.025, the critical value is z0.025 = 1.96. That is, the critical value z0.025 = 1.96 has an area of 0.025 to its right.
Preview
This chapter presents the beginning of inferential statistics.
❖ The two major activities of inferential statistics are (1) to use sample data to estimate values of population parameters, and (2) to test hypotheses or claims made about population parameters.
❖ We introduce methods for estimating values of these important population parameters: proportions, means, and variances.
❖ We also present methods for determining sample sizes necessary to estimate those parameters.
Section 7-2
Estimating a Population Proportion
Key Concept
In this section we present methods for using a
sample proportion to estimate the value of a
population proportion.
• The sample proportion is the best point
estimate of the population proportion.
• We can use a sample proportion to construct a
confidence interval to estimate the true value
of a population proportion, and we should
know how to interpret such confidence
intervals.
• We should know how to find the sample size
necessary to estimate a population proportion.
Definition
A point estimate is a single value (or
point) used to approximate a population
parameter.
Definition
The sample proportion p̂ is the best point estimate of the population proportion p.
Example:
In the Chapter Problem we noted that in a Pew Research Center poll, 70% of 1501 randomly selected adults in the United States believe in global warming, so the sample proportion is p̂ = 0.70. Find the best point estimate of the proportion of all adults in the United States who believe in global warming.
Because the sample proportion is the best point estimate of the population proportion, we conclude that the best point estimate of p is 0.70. When using the sample results to estimate the percentage of all adults in the United States who believe in global warming, the best estimate is 70%.
Definition
A confidence interval (or interval
estimate) is a range (or an interval)
of values used to estimate the true
value of a population parameter. A
confidence interval is sometimes
abbreviated as CI.
Definition
A confidence level is the probability 1 – α (often expressed as the equivalent percentage value) that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times. (The confidence level is also called degree of confidence, or the confidence coefficient.)
Most common choices are 90% (α = 10%), 95% (α = 5%), or 99% (α = 1%).
Interpreting a Confidence Interval
We must be careful to interpret confidence intervals correctly. There is a correct interpretation and many different and creative incorrect interpretations of the confidence interval 0.677 < p < 0.723.
“We are 95% confident that the interval from 0.677 to 0.723 actually does contain the true value of the population proportion p.”
This means that if we were to select many different samples of size 1501 and construct the corresponding confidence intervals, 95% of them would actually contain the value of the population proportion p.
(Note that in this correct interpretation, the level of 95% refers to the success rate of the process being used to estimate the proportion.)
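The "success rate of the process" reading can be checked with a small simulation. The sketch below is an illustration only, not part of the slides: the true proportion 0.70, the sample size 500, the 1000 repetitions, and the seed are all assumed values chosen to keep the run short, and the function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(p_true=0.70, n=500, trials=1000, level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain p_true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    hits = 0
    for _ in range(trials):
        # Draw one simple random sample of n yes/no responses.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        e = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
        if p_hat - e < p_true < p_hat + e:
            hits += 1
    return hits / trials


print(f"Share of intervals containing p: {coverage():.3f}")
```

The printed share should land near 0.95, which is what the 95% confidence level describes: the long-run behavior of the procedure, not a probability statement about any single interval.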
Caution
Know the correct interpretation of a confidence interval.
Caution
Confidence intervals can be used informally to compare different data sets, but the overlapping of confidence intervals should not be used for making formal and final conclusions about equality of proportions.
Critical Values
A standard z score can be used to distinguish between sample statistics that are likely to occur and those that are unlikely to occur. Such a z score is called a critical value. Critical values are based on the following observations:
1. Under certain conditions, the sampling distribution of sample proportions can be approximated by a normal distribution.
2. A z score associated with a sample proportion has a probability of α/2 of falling in the right tail.
3. The z score separating the right-tail region is commonly denoted by zα/2 and is referred to as a critical value because it is on the borderline separating z scores from sample proportions that are likely to occur from those that are unlikely to occur.
Definition
A critical value is the number on the borderline separating sample statistics that are likely to occur from those that are unlikely to occur. The number zα/2 is a critical value that is a z score with the property that it separates an area of α/2 in the right tail of the standard normal distribution.
[Figure: The Critical Value zα/2]
Notation for Critical Value
The critical value zα/2 is the positive z value that is at the vertical boundary separating an area of α/2 in the right tail of the standard normal distribution. (The value of –zα/2 is at the vertical boundary for the area of α/2 in the left tail.) The subscript α/2 is simply a reminder that the z score separates an area of α/2 in the right tail of the standard normal distribution.
Finding zα/2 for a 95% Confidence Level
α = 5%, so α/2 = 2.5% = 0.025
[Figure: standard normal curve with the critical values –zα/2 and zα/2 marking an area of α/2 in each tail]
Use Table A-2 to find a z score of 1.96: with α = 0.05, zα/2 = ±1.96.
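As an alternative to a table lookup, the critical value can be computed with software. The following is a minimal sketch using Python's standard library; the function name `critical_z` is an assumption made for illustration.

```python
from statistics import NormalDist


def critical_z(confidence_level: float) -> float:
    """Return z_{alpha/2}: the z score with an area of alpha/2 to its right."""
    alpha = 1.0 - confidence_level
    # inv_cdf takes the area to the LEFT, so use 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_z(level):.3f}")
```

For a 95% confidence level this agrees with the table value of 1.96 to two decimal places.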
Definition
When data from a simple random sample are used to estimate a population proportion p, the margin of error, denoted by E, is the maximum likely difference (with probability 1 – α, such as 0.95) between the observed proportion p̂ and the true value of the population proportion p. The margin of error E is also called the maximum error of the estimate and can be found by multiplying the critical value and the standard deviation of the sample proportions:
Margin of Error for Proportions
$E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Confidence Interval for Estimating a Population Proportion p
p = population proportion
p̂ = sample proportion
n = number of sample values
E = margin of error
zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution
Confidence Interval for Estimating
a Population Proportion p
1. The sample is a simple random sample.
2. The conditions for the binomial distribution
are satisfied: there is a fixed number of
trials, the trials are independent, there are
two categories of outcomes, and the
probabilities remain constant for each trial.
3. There are at least 5 successes and 5
failures.
Confidence Interval for Estimating a Population Proportion p
$\hat{p} - E < p < \hat{p} + E$ where $E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
or $\hat{p} \pm E$
or $(\hat{p} - E,\ \hat{p} + E)$
Round-Off Rule for
Confidence Interval Estimates of p
Round the confidence interval limits for p to three significant digits.
Procedure for Constructing a Confidence Interval for p
1. Verify that the required assumptions are satisfied. (The sample is a simple random sample, the conditions for the binomial distribution are satisfied, and the normal distribution can be used to approximate the distribution of sample proportions because np ≥ 5 and nq ≥ 5 are both satisfied.)
2. Refer to Table A-2 and find the critical value zα/2 that corresponds to the desired confidence level.
3. Evaluate the margin of error $E = z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n}$.
4. Using the value of the calculated margin of error E and the value of the sample proportion p̂, find the values of p̂ – E and p̂ + E. Substitute those values in the general format for the confidence interval: p̂ – E < p < p̂ + E
5. Round the resulting confidence interval limits to three significant digits.
Example:
In the Chapter Problem we noted that a Pew Research Center poll of 1501 randomly selected U.S. adults showed that 70% of the respondents believe in global warming. The sample results are n = 1501 and p̂ = 0.70.
a. Find the margin of error E that corresponds to a 95% confidence level.
b. Find the 95% confidence interval estimate of the population proportion p.
c. Based on the results, can we safely conclude that the majority of adults believe in global warming?
d. Assuming that you are a newspaper reporter, write a brief statement that accurately describes the results and includes all of the relevant information.
Requirement check: simple random sample; fixed number of trials, 1501; trials are independent; two categories of outcomes (believes or does not); probability remains constant. Note: number of successes and failures are both at least 5.
a) Use the formula to find the margin of error.
$E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 1.96 \sqrt{\dfrac{(0.70)(0.30)}{1501}} = 0.023183$
b) The 95% confidence interval:
$\hat{p} - E < p < \hat{p} + E$
$0.70 - 0.023183 < p < 0.70 + 0.023183$
$0.677 < p < 0.723$
c) Based on the confidence interval
obtained in part (b), it does appear that
the proportion of adults who believe in
global warming is greater than 0.5 (or
50%), so we can safely conclude that the
majority of adults believe in global
warming. Because the limits of 0.677 and
0.723 are likely to contain the true
population proportion, it appears that the
population proportion is a value greater
than 0.5.
d) Here is one statement that summarizes
the results: 70% of United States adults
believe that the earth is getting warmer.
That percentage is based on a Pew
Research Center poll of 1501 randomly
selected adults in the United States. In
theory, in 95% of such polls, the
percentage should differ by no more than
2.3 percentage points in either direction
from the percentage that would be found
by interviewing all adults in the United
States.
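The Pew poll example can be reproduced in a few lines. This is a sketch under the slide's assumptions (simple random sample, normal approximation valid); the function name `proportion_ci` is hypothetical, and the critical value is computed rather than read from Table A-2, so the margin of error may differ from the slide in the last decimal places.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, level=0.95):
    """Return (E, lower limit, upper limit) for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    q_hat = 1 - p_hat
    e = z * sqrt(p_hat * q_hat / n)  # margin of error
    return e, p_hat - e, p_hat + e


# Values from the example: n = 1501 adults, sample proportion 0.70.
e, lower, upper = proportion_ci(0.70, 1501)
print(f"E = {e:.6f}")
print(f"{lower:.3f} < p < {upper:.3f}")
```

Rounded to three significant digits, the limits match the interval 0.677 < p < 0.723 found in part (b).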
Analyzing Polls
When analyzing polls consider:
1. The sample should be a simple random sample,
not an inappropriate sample (such as a voluntary
response sample).
2. The confidence level should be provided. (It is
often 95%, but media reports often neglect to
identify it.)
3. The sample size should be provided. (It is usually
provided by the media, but not always.)
4. Except for relatively rare cases, the quality of the
poll results depends on the sampling method and
the size of the sample, but the size of the
population is usually not a factor.
Caution
Never follow the common misconception
that poll results are unreliable if the
sample size is a small percentage of the
population size. The population size is
usually not a factor in determining the
reliability of a poll.
Sample Size
Suppose we want to collect sample
data in order to estimate some
population proportion. The question is
how many sample items must be
obtained?
Determining Sample Size
$E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
(solve for n by algebra)
$n = \dfrac{(z_{\alpha/2})^2\, \hat{p}\hat{q}}{E^2}$
Sample Size for Estimating Proportion p
When an estimate p̂ of p is known:
$n = \dfrac{(z_{\alpha/2})^2\, \hat{p}\hat{q}}{E^2}$
When no estimate of p is known:
$n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$
Round-Off Rule for Determining
Sample Size
If the computed sample size n is not a whole number, round the value of n up to the next larger whole number.
Example:
The Internet is affecting us all in many
different ways, so there are many reasons for
estimating the proportion of adults who use
it. Assume that a manager for E-Bay wants to
determine the current percentage of U.S.
adults who now use the Internet. How many
adults must be surveyed in order to be 95%
confident that the sample percentage is in
error by no more than three percentage
points?
a. In 2006, 73% of adults used the Internet.
b. No known possible value of the proportion.
a) Use p̂ = 0.73 and q̂ = 1 – p̂ = 0.27; α = 0.05, so zα/2 = 1.96; E = 0.03.
$n = \dfrac{(z_{\alpha/2})^2\, \hat{p}\hat{q}}{E^2} = \dfrac{(1.96)^2 (0.73)(0.27)}{(0.03)^2} = 841.3104$, which rounds up to 842.
To be 95% confident that our sample percentage is within three percentage points of the true percentage for all adults, we should obtain a simple random sample of 842 adults.
b) Use α = 0.05, so zα/2 = 1.96; E = 0.03.
$n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} = \dfrac{(1.96)^2 (0.25)}{(0.03)^2} = 1067.1111$, which rounds up to 1068.
To be 95% confident that our sample percentage is within three percentage points of the true percentage for all adults, we should obtain a simple random sample of 1068 adults.
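Both cases of the Internet-use example can be checked with one small function. This is an illustrative sketch; the name `sample_size_proportion` and its parameters are assumptions, and passing no estimate falls back to p̂q̂ = 0.25 as in the second formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(margin, level=0.95, p_hat=None):
    """Sample size needed to estimate a proportion within the given margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    # With no prior estimate, use 0.25, the largest possible value of p_hat * q_hat.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Round-off rule: always round UP to the next whole number.
    return ceil(z ** 2 * pq / margin ** 2)


print(sample_size_proportion(0.03, p_hat=0.73))  # case (a): prior estimate of 73%
print(sample_size_proportion(0.03))              # case (b): no prior estimate
```

Using `ceil` rather than ordinary rounding implements the round-off rule for sample sizes, so a computed value such as 841.3 becomes 842.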
Finding the Point Estimate and E from a Confidence Interval
Point estimate of p:
$\hat{p} = \dfrac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
Margin of Error:
$E = \dfrac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$
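As a quick check of these two formulas, the sketch below recovers the point estimate and margin of error from the global-warming interval 0.677 < p < 0.723 used earlier in the slides. The function name `estimate_from_interval` is hypothetical.

```python
def estimate_from_interval(lower, upper):
    """Recover the point estimate and margin of error from confidence interval limits."""
    p_hat = (upper + lower) / 2   # midpoint of the interval
    margin = (upper - lower) / 2  # half the width of the interval
    return p_hat, margin


p_hat, margin = estimate_from_interval(0.677, 0.723)
print(f"point estimate = {p_hat:.3f}, E = {margin:.3f}")
```

The midpoint comes back as 0.70 and the half-width as 0.023, consistent with the poll example.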
Recap
In this section we have discussed:
❖ Point estimates.
❖ Confidence intervals.
❖ Confidence levels.
❖ Critical values.
❖ Margin of error.
❖ Determining sample sizes.
Section 7-3
Estimating a Population Mean: σ Known
Key Concept
This section presents methods for estimating a population mean. In addition to knowing the values of the sample data or statistics, we must also know the value of the population standard deviation, σ.
Here are three key concepts that should be learned in this section:
1. We should know that the sample mean x̄ is the best point estimate of the population mean µ.
2. We should learn how to use sample data to construct a confidence interval for estimating the value of a population mean, and we should know how to interpret such confidence intervals.
3. We should develop the ability to determine the sample size necessary to estimate a population mean.
Point Estimate of the
Population Mean
The sample mean x̄ is the best point estimate of the population mean µ.
Confidence Interval for Estimating a Population Mean (with σ Known)
µ = population mean
σ = population standard deviation
x̄ = sample mean
n = number of sample values
E = margin of error
zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution
1. The sample is a simple random sample. (All samples of the same size have an equal chance of being selected.)
2. The value of the population standard deviation σ is known.
3. Either or both of these conditions is satisfied: The population is normally distributed or n > 30.
$\bar{x} - E < \mu < \bar{x} + E$ where $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
or $\bar{x} \pm E$
or $(\bar{x} - E,\ \bar{x} + E)$
Definition
The two values x̄ – E and x̄ + E are called confidence interval limits.
Sample Mean
1. For all populations, the sample mean x̄ is an unbiased estimator of the population mean µ, meaning that the distribution of sample means tends to center about the value of the population mean µ.
2. For many populations, the distribution of sample means x̄ tends to be more consistent (with less variation) than the distributions of other sample statistics.
Procedure for Constructing a Confidence Interval for µ (with Known σ)
1. Verify that the requirements are satisfied.
2. Refer to Table A-2 or use technology to find the critical value zα/2 that corresponds to the desired confidence level.
3. Evaluate the margin of error $E = z_{\alpha/2} \cdot \sigma / \sqrt{n}$.
4. Find the values of x̄ – E and x̄ + E. Substitute those values in the general format of the confidence interval: x̄ – E < µ < x̄ + E
5. Round using the confidence intervals round-off rules.
Round-Off Rule for Confidence Intervals Used to Estimate µ
1. When using the original set of data, round the confidence interval limits to one more decimal place than used in original set of data.
2. When the original set of data is unknown and only the summary statistics (n, x̄, s) are used, round the confidence interval limits to the same number of decimal places used for the sample mean.
Example:
People have died in boat and aircraft accidents because an obsolete estimate of the mean weight of men was used. In recent decades, the mean weight of men has increased considerably, so we need to update our estimate of that mean so that boats, aircraft, elevators, and other such devices do not become dangerously overloaded. Using the weights of men from Data Set 1 in Appendix B, we obtain these sample statistics for the simple random sample: n = 40 and x̄ = 172.55 lb. Research from several other sources suggests that the population of weights of men has a standard deviation given by σ = 26 lb.
a. Find the best point estimate of the mean weight of the population of all men.
b. Construct a 95% confidence interval estimate of the mean weight of all men.
c. What do the results suggest about the mean weight of 166.3 lb that was used to determine the safe passenger capacity of water vessels in 1960 (as given in the National Transportation and Safety Board safety recommendation M-04-04)?
a. The sample mean of 172.55 lb is the best point estimate of the mean weight of the population of all men.
b. A 95% confidence interval or 0.95 implies α = 0.05, so zα/2 = 1.96.
Calculate the margin of error.
$E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} = 1.96 \cdot \dfrac{26}{\sqrt{40}} = 8.0574835$
Construct the confidence interval.
$\bar{x} - E < \mu < \bar{x} + E$
$172.55 - 8.0574835 < \mu < 172.55 + 8.0574835$
$164.49 < \mu < 180.61$
c. Based on the confidence interval, it is
possible that the mean weight of 166.3 lb
used in 1960 could be the mean weight of
men today. However, the best point estimate
of 172.55 lb suggests that the mean weight
of men is now considerably greater than
166.3 lb. Considering that an underestimate
of the mean weight of men could result in
lives lost through overloaded boats and
aircraft, these results strongly suggest that
additional data should be collected.
(Additional data have been collected, and the
assumed mean weight of men has been
increased.)
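The weight example can be recomputed with a short script. This is a sketch assuming σ is known and the sample is a simple random sample, as the slides require; the function name `mean_ci_sigma_known` is hypothetical, and because the critical value is computed rather than rounded to 1.96, E differs slightly from the slide beyond the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_sigma_known(x_bar, sigma, n, level=0.95):
    """Return (E, lower limit, upper limit) for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    e = z * sigma / sqrt(n)  # margin of error
    return e, x_bar - e, x_bar + e


# Values from the example: n = 40 men, sample mean 172.55 lb, sigma = 26 lb.
e, lower, upper = mean_ci_sigma_known(172.55, 26, 40)
print(f"E = {e:.4f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```

The limits are rounded to two decimal places, the same number used for the sample mean, following the round-off rule for summary statistics.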
Finding a Sample Size for Estimating a Population Mean
$n = \left[ \dfrac{z_{\alpha/2} \cdot \sigma}{E} \right]^2$
µ = population mean
σ = population standard deviation
x̄ = sample mean
E = desired margin of error
zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution
Round-Off Rule for Sample Size n
If the computed sample size n is not a
whole number, round the value of n up
to the next larger whole number.
Finding the Sample Size n When σ is Unknown
1. Use the range rule of thumb (see Section 3-3) to estimate the standard deviation as follows: σ ≈ range/4.
2. Start the sample collection process without knowing σ and, using the first several values, calculate the sample standard deviation s and use it in place of σ. The estimated value of σ can then be improved as more sample data are obtained, and the sample size can be refined accordingly.
3. Estimate the value of σ by using the results of some other study that was done earlier.
Example:
= 0.05
/2 = 0.025
z / 2 = 1.96
E = 3
= 15
n = 1.96 • 15 = 96.04 = 97
3
2
With a simple random sample of only
97 statistics students, we will be 95%
confident that the sample mean is
within 3 IQ points of the true
population mean .
Assume that we want to estimate the mean IQ score for
the population of statistics students. How many
statistics students must be randomly selected for IQ
tests if we want 95% confidence that the sample mean
is within 3 IQ points of the population mean, and
population standard deviation is 15 ?
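The sample size calculation above can be checked with a short sketch. The function name `sample_size_for_mean` is made up for illustration, and the critical value comes from Python's standard-library `statistics.NormalDist` rather than from Table A-2, so it is slightly more precise than 1.96.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(confidence, sigma, margin_of_error):
    """Return the smallest whole n so that z_{alpha/2} * sigma / sqrt(n) <= E."""
    alpha = 1.0 - confidence
    # z score with an area of alpha/2 in the right tail of the standard normal
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z_crit * sigma / margin_of_error) ** 2
    # Round-off rule: always round up to the next larger whole number
    return math.ceil(n_exact)


# IQ example: 95% confidence, sigma = 15, E = 3
print(sample_size_for_mean(0.95, 15, 3))  # prints 97
```

Rounding up (not to the nearest integer) is what guarantees the margin of error is no larger than E.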
Recap
In this section we have discussed:
❖ Margin of error.
❖ Confidence interval estimate of the
population mean with σ known.
❖ Round off rules.
❖ Sample size for estimating the mean μ.
Section 7-4
Estimating a Population
Mean: σ Not Known
Key Concept
This section presents methods for estimating
a population mean when the population
standard deviation is not known. With σ
unknown, we use the Student t distribution
assuming that the relevant requirements are
satisfied.
Sample Mean
The sample mean x̄ is the best point
estimate of the population mean µ.
Student t Distribution
If the distribution of a population is essentially
normal, then the distribution of
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
is a Student t Distribution for all samples of
size n. It is often referred to as a t distribution
and is used to find critical values denoted by
tα/2.
Definition
The number of degrees of freedom for a
collection of sample data is the number of
sample values that can vary after certain
restrictions have been imposed on all data
values. The number of degrees of freedom is often
abbreviated df.
degrees of freedom = n – 1
in this section.
Margin of Error E for Estimate of µ
(With σ Not Known)
Formula 7-6
$E = t_{\alpha/2} \frac{s}{\sqrt{n}}$
where tα/2 has n – 1 degrees of freedom.
Table A-3 lists values for tα/2
Notation
µ = population mean
x̄ = sample mean
s = sample standard deviation
n = number of sample values
E = margin of error
tα/2 = critical t value separating an area of α/2
in the right tail of the t distribution
Confidence Interval for the
Estimate of µ (With σ Not Known)
$\bar{x} - E < \mu < \bar{x} + E$
where $E = t_{\alpha/2} \frac{s}{\sqrt{n}}$
df = n – 1
tα/2 found in Table A-3
Procedure for Constructing a
Confidence Interval for µ
(With σ Unknown)
1. Verify that the requirements are satisfied.
2. Using n – 1 degrees of freedom, refer to Table A-3 or use
technology to find the critical value tα/2 that corresponds to
the desired confidence level.
3. Evaluate the margin of error $E = t_{\alpha/2} \cdot s / \sqrt{n}$.
4. Find the values of x̄ − E and x̄ + E. Substitute those
values in the general format for the confidence interval:
x̄ − E < µ < x̄ + E
5. Round the resulting confidence interval limits.
Example:
A common claim is that garlic lowers cholesterol
levels. In a test of the effectiveness of garlic, 49
subjects were treated with doses of raw garlic, and
their cholesterol levels were measured before and
after the treatment. The changes in their levels of LDL
cholesterol (in mg/dL) have a mean of 0.4 and a
standard deviation of 21.0. Use the sample statistics of
n = 49, x̄ = 0.4 and s = 21.0 to construct a 95%
confidence interval estimate of the mean net change in
LDL cholesterol after the garlic treatment. What does
the confidence interval suggest about the
effectiveness of garlic in reducing LDL cholesterol?
Example:
Requirements are satisfied: simple random
sample and n = 49 (i.e., n > 30).
95% implies α = 0.05.
With n = 49, the df = 49 – 1 = 48
Closest df is 50, two tails, so tα/2 = 2.009
Using tα/2 = 2.009, s = 21.0 and n = 49 the
margin of error is:
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = 2.009 \cdot \frac{21.0}{\sqrt{49}} = 6.027$
Example:
Construct the confidence
interval:
x̄ = 0.4, E = 6.027
x̄ − E < µ < x̄ + E
0.4 − 6.027 < µ < 0.4 + 6.027
−5.6 < µ < 6.4
We are 95% confident that the limits of –5.6 and 6.4
actually do contain the value of µ, the mean of the
changes in LDL cholesterol for the population. Because
the confidence interval limits contain the value of 0, it is
very possible that the mean of the changes in LDL
cholesterol is equal to 0, suggesting that the garlic
treatment did not affect the LDL cholesterol levels. It
does not appear that the garlic treatment is effective in
lowering LDL cholesterol.
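As a rough check of the garlic example, the sketch below recomputes the margin of error and the interval limits. The function name `t_interval` is hypothetical. Python's standard library has no t distribution, so the critical value 2.009 is passed in by hand, exactly as it was read from Table A-3 in the slides.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for a mean when sigma is not known.

    t_crit is the critical value t_{alpha/2} looked up for n - 1 degrees
    of freedom (from a table or from technology); it is not computed here.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Garlic example: n = 49, x-bar = 0.4, s = 21.0, table value t = 2.009
lower, upper, margin = t_interval(0.4, 21.0, 49, 2.009)
print(round(margin, 3), round(lower, 1), round(upper, 1))  # prints 6.027 -5.6 6.4
```

Because the interval contains 0, the data are consistent with no change in LDL cholesterol, which matches the conclusion stated in the example.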
Important Properties of the
Student t Distribution
1. The Student t distribution is different for different sample sizes
(see the following slide, for the cases n = 3 and n = 12).
2. The Student t distribution has the same general symmetric bell
shape as the standard normal distribution but it reflects the
greater variability (with wider distributions) that is expected
with small samples.
3. The Student t distribution has a mean of t = 0 (just as the
standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution varies with
the sample size and is greater than 1 (unlike the standard
normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t distribution gets
closer to the normal distribution.
Student t Distributions for
n = 3 and n = 12
Figure 7-5
Choosing the Appropriate Distribution
Use the normal (z) distribution:
σ known and normally distributed population
or
σ known and n > 30
Use t distribution:
σ not known and normally distributed population
or
σ not known and n > 30
Use a nonparametric method or bootstrapping:
Population is not normally distributed
and n ≤ 30
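The selection rules in this table can be written as a small decision function. This is only a sketch of the table as given: the name `choose_distribution` and the returned labels are invented for illustration.

```python
def choose_distribution(sigma_known, population_normal, n):
    """Pick the method for estimating a mean, following the table above."""
    # The z and t methods both require a normal population or n > 30
    if population_normal or n > 30:
        return "z" if sigma_known else "t"
    # Population not normal and n <= 30
    return "nonparametric or bootstrap"


print(choose_distribution(sigma_known=False, population_normal=False, n=49))  # prints t
```

The garlic example (σ not known, n = 49) falls in the second row, which is why the Student t distribution was used there.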
Finding the Point Estimate
and E from a Confidence Interval
Point estimate of µ:
$\bar{x} = \frac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
Margin of Error:
$E = \frac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$
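A minimal sketch of these two formulas, applied to the unrounded limits of the garlic interval (0.4 − 6.027 and 0.4 + 6.027). The function name is hypothetical.

```python
def point_estimate_and_margin(lower, upper):
    """Recover x-bar and E from the limits of a confidence interval."""
    x_bar = (upper + lower) / 2  # midpoint of the interval
    margin = (upper - lower) / 2  # half the width of the interval
    return x_bar, margin


x_bar, margin = point_estimate_and_margin(-5.627, 6.427)
print(round(x_bar, 3), round(margin, 3))  # prints 0.4 6.027
```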
Confidence Intervals for
Comparing Data
As in Sections 7-2 and 7-3, confidence
intervals can be used informally to
compare different data sets, but the
overlapping of confidence intervals should
not be used for making formal and final
conclusions about equality of means.
Recap
In this section we have discussed:
❖ Student t distribution.
❖ Degrees of freedom.
❖ Margin of error.
❖ Confidence intervals for μ with σ unknown.
❖ Choosing the appropriate distribution.
❖ Point estimates.
❖ Using confidence intervals to compare data.
Lecture Slides
Elementary Statistics
Eleventh Edition
and the Triola Statistics Series
by Mario F. Triola
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.
Chapter 8
Hypothesis Testing
8-1 Review and Preview
8-2 Basics of Hypothesis Testing
8-3 Testing a Claim About a Proportion
8-4 Testing a Claim About a Mean: σ Known
8-5 Testing a Claim About a Mean: σ Not Known
Section 8-1
Review and Preview
Review
In Chapters 2 and 3 we used “descriptive statistics”
when we summarized data using tools such as graphs,
and statistics such as the mean and standard
deviation.
Methods of inferential statistics use sample data to
make an inference or conclusion about a population.
The two main activities of inferential statistics are
using sample data to (1) estimate a population
parameter (such as estimating a population parameter
with a confidence interval), and (2) test a hypothesis or
claim about a population parameter.
In Chapter 7 we presented methods for estimating a
population parameter with a confidence interval, and in
this chapter we present the method of hypothesis
testing.
Definitions
In statistics, a hypothesis is a claim or
statement about a property of a population.
A hypothesis test (or test of significance) is a
standard procedure for testing a claim about a
property of a population.
Main Objective
The main objective of this chapter is to
develop the ability to conduct hypothesis
tests for claims made about a:
– a population proportion p,
– a population mean µ,
– or a population standard deviation σ.
Examples of Hypotheses that can be Tested
• Genetics: The Genetics & IVF Institute claims
that its XSORT method allows couples to increase
the probability of having a baby girl.
• Business: A newspaper headline makes the
claim that most workers get their jobs through
networking.
• Medicine: Medical researchers claim that when
people with colds are treated with echinacea, the
treatment has no effect.
Examples of Hypotheses that can be Tested
• Aircraft Safety: The Federal Aviation
Administration claims that the mean weight of an
airline passenger (including carry-on baggage) is
greater than the 185 lb that it was 20 years ago.
• Quality Control: When new equipment is used
to manufacture aircraft altimeters, the new
altimeters are better because the variation in the
errors is reduced so that the readings are more
consistent. (In many industries, the quality of
goods and services can often be improved by
reducing variation.)
Caution
When conducting hypothesis tests as
described in this chapter and the
following chapters, instead of jumping
directly to procedures and calculations,
be sure to consider the context of the
data, the source of the data, and the
sampling method used to obtain the
sample data.
Section 8-2
Basics of Hypothesis
Testing
Key Concept
This section presents individual
components of a hypothesis test. We should
know and understand the following:
• How to identify the null hypothesis and alternative
hypothesis from a given claim, and how to express
both in symbolic form
• How to calculate the value of the test statistic, given a
claim and sample
data
• How to identify the critical value(s), given a
significance level
• How to identify the P-value, given a value of the test
statistic
• How to state the conclusion about a claim in simple
and nontechnical terms
Part 1:
The Basics of Hypothesis Testing
Rare Event Rule for
Inferential Statistics
If, under a given assumption, the
probability of a particular observed event
is exceptionally small, we conclude that
the assumption is probably not correct.
Components of a
Formal Hypothesis
Test
Null Hypothesis:
H0
• The null hypothesis (denoted by H0) is
a statement that the value of a
population parameter (such as
proportion, mean, or standard
deviation) is equal to some claimed
value.
• We test the null hypothesis directly.
• Either reject H0 or fail to reject H0.
Alternative Hypothesis:
H1
• The alternative hypothesis (denoted
by H1 or Ha or HA) is the statement that
the parameter has a value that
somehow differs from the null
hypothesis.
• The symbolic form of the alternative
hypothesis must use one of these
symbols: ≠, <, >.
Note about Forming Your
Own Claims (Hypotheses)
If you are conducting a study and want
to use a hypothesis test to support
your claim, the claim must be worded
so that it becomes the alternative
hypothesis.
Note about Identifying
H0 and H1
Figure 8-2
Example:
Consider the claim that the mean weight of
airline passengers (including carry-on
baggage) is at most 195 lb (the current value
used by the Federal Aviation Administration).
Follow the three-step procedure outlined in
Figure 8-2 to identify the null hypothesis and
the alternative hypothesis.
Example:
Step 1: Express the given claim in symbolic
form. The claim that the mean is at
most 195 lb is expressed in symbolic
form as µ ≤ 195 lb.
Step 2: If µ ≤ 195 lb is false, then µ > 195 lb
must be true.
Example:
Step 3: Of the two symbolic expressions
µ ≤ 195 lb and µ > 195 lb,
we see that µ > 195 lb does not contain
equality, so we let the alternative
hypothesis H1 be µ > 195 lb.
Also, the null hypothesis must be a
statement that the mean equals 195 lb, so
we let H0 be µ = 195 lb.
Note that the original claim that the mean is at most
195 lb is neither the alternative hypothesis nor the null
hypothesis.
(However, we would be able to address the original claim
upon completion of a hypothesis test.)
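The three-step procedure can be mimicked in code. The sketch below is an illustration only: the function name, the lookup tables, and the string output format are all assumptions, not part of the slides.

```python
# Opposite of each relation (Step 2) and the relations without equality (Step 3)
OPPOSITE = {"=": "≠", "≠": "=", "<": "≥", "≥": "<", ">": "≤", "≤": ">"}
NO_EQUALITY = {"≠", "<", ">"}


def identify_hypotheses(parameter, claim_relation, value):
    """Return (H0, H1) as strings for a claim such as 'µ ≤ 195'."""
    opposite = OPPOSITE[claim_relation]
    # H1 is whichever of the claim and its opposite does not contain equality
    h1_relation = claim_relation if claim_relation in NO_EQUALITY else opposite
    h0 = f"{parameter} = {value}"  # H0 always states equality
    h1 = f"{parameter} {h1_relation} {value}"
    return h0, h1


print(identify_hypotheses("µ", "≤", 195))  # prints ('µ = 195', 'µ > 195')
```

When the claim itself contains no equality (for example p > 0.5), the claim becomes H1 directly.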
Test Statistic
The test statistic is a value used in making
a decision about the null hypothesis, and is
found by converting the sample statistic to
a score with the assumption that the null
hypothesis is true.
Test Statistic – Formulas
Test statistic for proportion:
$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
Test statistic for mean:
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$ or $t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$
Example:
Let’s again consider the claim that the XSORT
method of gender selection increases the
likelihood of having a baby girl.
Preliminary results from a test of the XSORT
method of gender selection involved 14 couples
who gave birth to 13 girls and 1 boy. Use the given
claim and the preliminary results to calculate the
value of the test statistic.
Use the format of the test statistic given above, so
that a normal distribution is used to approximate a
binomial distribution.
(There are other exact methods that do not use the
normal approximation.)
Example:
The claim that the XSORT method of gender
selection increases the likelihood of having a
baby girl results in the following null and
alternative hypotheses H0: p = 0.5 and
H1: p > 0.5.
We work under the assumption that the null
hypothesis is true with p = 0.5.
The sample proportion of 13 girls in 14 births
results in p̂ = 13/14 = 0.929. Using p = 0.5,
p̂ = 0.929, and n = 14,
we find the value of the test statistic as follows:
Example:
$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{0.929 - 0.5}{\sqrt{\frac{(0.5)(0.5)}{14}}} = 3.21$
We know from previous chapters that a z score of
3.21 is “unusual” (because it is greater than 2).
It appears that in addition to being greater than
0.5, the sample proportion of 13/14 or 0.929 is
significantly greater than 0.5.
The figure on the next slide shows that the sample
proportion of 0.929 does fall within the range of
values considered to be significant because they
are so far above 0.5 that they are not likely to
occur by chance (assuming that the population
proportion is p = 0.5).
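The test statistic for the XSORT data can be reproduced with a few lines. The function name `z_statistic_proportion` is an assumption made for this sketch; it uses the unrounded sample proportion 13/14 rather than 0.929, which gives the same value to two decimal places.

```python
import math


def z_statistic_proportion(successes, n, p0):
    """z test statistic for a proportion, assuming H0: p = p0 is true."""
    p_hat = successes / n  # sample proportion
    q0 = 1.0 - p0
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)


z = z_statistic_proportion(13, 14, 0.5)
print(round(z, 2))  # prints 3.21
```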
Example:
[Figure: normal curve divided into a “Fail to reject H0” region and a “Reject H0” region; the sample proportion p̂ = 0.929 (test statistic z = 3.21) falls in the “Reject H0” region.]
Critical Region
The critical region (or rejection region) is the
set of all values of the test statistic that
cause us to reject the null hypothesis. For
example, see the red-shaded region in the
previous figure.
Significance Level
The significance level (denoted by α) is the
probability that the test statistic will fall in the
critical region when the null hypothesis is
actually true. This is the same α introduced
in Section 7-2. Common choices for α are
0.05, 0.01, and 0.10.
Critical Value
A critical value is any value that separates the
critical region (where we reject the null
hypothesis) from the values of the test
statistic that do not lead to rejection of the null
hypothesis. The critical values depend on the
nature of the null hypothesis, the sampling
distribution that applies, and the significance
level α. See the previous figure where the
critical value of z = 1.645 corresponds to a
significance level of α = 0.05.
P-Value
The P-value (or p-value or probability value)
is the probability of getting a value of the test
statistic that is at least as extreme as the one
representing the sample data, assuming that
the null hypothesis is true.
Critical region in the left tail:
P-value = area to the left of the test statistic
Critical region in the right tail:
P-value = area to the right of the test statistic
Critical region in two tails:
P-value = twice the area in the tail beyond the test statistic
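The three cases above translate directly into code when the test statistic is a z score. This is a sketch, not a prescribed procedure: the function name `p_value` and the tail labels are invented, and areas come from the standard-library `statistics.NormalDist` instead of Table A-2.

```python
from statistics import NormalDist


def p_value(z, tail):
    """P-value for a z test statistic; tail is 'left', 'right', or 'two'."""
    left_area = NormalDist().cdf(z)  # area to the left of the test statistic
    right_area = 1.0 - left_area
    if tail == "left":
        return left_area
    if tail == "right":
        return right_area
    if tail == "two":
        # twice the area in the tail beyond the test statistic
        return 2.0 * min(left_area, right_area)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(p_value(3.21, "two"), 4))  # prints 0.0013
```

Using a table area rounded to 0.0007 gives 0.0014 instead; the difference is only rounding.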
P-Value
The null hypothesis is rejected if the P-value
is very small, such as 0.05 or less.
Here is a memory tool useful for interpreting
the P-value:
If the P is low, the null must go.
If the P is high, the null will fly.
Procedure for Finding P-Values
Figure 8-5 (the procedure branches on the alternative hypothesis: H1: <, H1: >, or H1: ≠)
Caution
Don’t confuse a P-value with a proportion p.
Know this distinction:
P-value = probability of getting a test
statistic at least as extreme as
the one representing sample
data
p = population proportion
Example
Consider the claim that with the XSORT method
of gender selection, the likelihood of having a
baby girl is different from p = 0.5, and use the
test statistic z = 3.21 found from 13 girls in 14
births.
First determine whether the given conditions
result in a critical region in the right tail, left tail,
or two tails, then use Figure 8-5 to find the P-
value. Interpret the P-value.
H0: p=0.5
H1: p≠0.5
Example
The claim that the likelihood of having a baby
girl is different from p = 0.5 can be expressed as
p ≠ 0.5 so the critical region is in two tails.
Using Figure 8-5 to find the P-value for a two-
tailed test, we see that the P-value is twice the
area to the right of the test statistic z = 3.21.
We refer to Table A-2 (or use technology) to find
that the area to the right of z = 3.21 is 0.0007.
In this case, the P-value is twice the area to the
right of the test statistic, so we have:
P-value = 2 × 0.0007 = 0.0014
Example
The P-value is 0.0014 (or 0.0013 if greater
precision is used for the calculations). The small
P-value of 0.0014 shows that there is a very
small chance of getting the sample results that
led to a test statistic of z = 3.21. This suggests
that with the XSORT method of gender
selection, the likelihood of having a baby girl is
different from 0.5.
Types of Hypothesis Tests:
Two-tailed, Left-tailed, Right-tailed
The tails in a distribution are the extreme
regions bounded by critical values.
Determinations of P-values and critical values
are affected by whether a critical region is in
two tails, the left tail, or the right tail. It
therefore becomes important to correctly
characterize a hypothesis test as two-tailed,
left-tailed, or right-tailed.
Two-tailed Test
H0: =
H1: ≠
α is divided equally between
the two tails of the critical
region
≠ means less than or greater than
Left-tailed Test
H0: =
H1: <
The < symbol points left
α is in the left tail
Right-tailed Test
H0: =
H1: >
The > symbol points right
α is in the right tail
Conclusions
in Hypothesis Testing
We always test the null hypothesis.
The initial conclusion will always be
one of the following:
1. Reject the null hypothesis.
2. Fail to reject the null hypothesis.
Decision Criterion
P-value method:
Using the significance level α:
If P-value ≤ α, reject H0.
If P-value > α, fail to reject H0.
Decision Criterion
Traditional method:
If the test statistic falls within the
critical region, reject H0.
If the test statistic does not fall
within the critical region, fail to
reject H0.
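The P-value method and the traditional method always lead to the same decision for a given test. The sketch below illustrates this for a right-tailed z test, using the XSORT statistic z = 3.21 at α = 0.05. The function names are hypothetical, and the critical value is computed with `statistics.NormalDist` rather than read from a table.

```python
from statistics import NormalDist


def decide_by_p_value(p_value, alpha):
    """P-value method: reject H0 when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


def decide_by_critical_value_right_tailed(z, alpha):
    """Traditional method for a right-tailed test: compare z with the critical value."""
    critical_value = NormalDist().inv_cdf(1.0 - alpha)  # about 1.645 for alpha = 0.05
    return "reject H0" if z >= critical_value else "fail to reject H0"


z = 3.21
alpha = 0.05
right_tail_area = 1.0 - NormalDist().cdf(z)  # P-value for a right-tailed test
print(decide_by_p_value(right_tail_area, alpha))        # prints reject H0
print(decide_by_critical_value_right_tailed(z, alpha))  # prints reject H0
```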
Decision Criterion
Another option:
Instead of using a significance
level such as 0.05, simply identify
the P-value and leave the decision
to the reader.
Decision Criterion
Confidence Intervals:
A confidence interval estimate of a
population parameter contains the
likely values of that parameter.
If a confidence interval does not
include a claimed value of a
population parameter, reject that
claim.
Wording of Final Conclusion
Figure 8-7
Caution
Never conclude a hypothesis test with a
statement of “reject the null hypothesis”
or “fail to reject the null hypothesis.”
Always make sense of the conclusion
with a statement that uses simple
nontechnical wording that addresses the
original claim.
Accept Versus Fail to Reject
• Some texts use “accept the null
hypothesis.”
• We are not proving the null hypothesis.
• “Fail to reject” says more correctly that
the available evidence is not strong
enough to warrant rejection of the null
hypothesis (such as not enough
evidence to convict a suspect).
Type I Error
• A Type I error is the mistake of
rejecting the null hypothesis when it
is actually true.
• The symbol α (alpha) is used to
represent the probability of a type I
error.
Type II Error
• A Type II error is the mistake of failing
to reject the null hypothesis when it is
actually false.
• The symbol β (beta) is used to
represent the probability of a type II
error.
Type I and Type II Errors
Example:
Assume that we are conducting a hypothesis
test of the claim that a method of gender
selection increases the likelihood of a baby
girl, so that the probability of a baby girl is p >
0.5. Here are the null and alternative
hypotheses: H0: p = 0.5, and H1: p > 0.5.
a) Identify a type I error.
b) Identify a type II error.
Example:
a) A type I error is the mistake of rejecting a
true null hypothesis, so this is a type I error:
Conclude that there is sufficient evidence to
support p > 0.5, when in reality p = 0.5.
b) A type II error is the mistake of failing to
reject the null hypothesis when it is false, so
this is a type II error: Fail to reject p = 0.5
(and therefore fail to support p > 0.5) when in
reality p > 0.5.
Controlling Type I and
Type II Errors
• For any fixed α, an increase in the sample
size n will cause a decrease in β.
• For any fixed sample size n, a decrease in
α will cause an increase in β. Conversely,
an increase in α will cause a decrease in
β.
• To decrease both α and β, increase the
sample size.
Comprehensive
Hypothesis Test –
P-Value Method
Comprehensive
Hypothesis Test –
Traditional Method
Comprehensive
Hypothesis Test – cont
A confidence interval estimate of a population
parameter contains the likely values of that
parameter. We should therefore reject a claim
that the population parameter has a value that
is not included in the confidence interval.
Caution
In some cases, a conclusion based on a
confidence interval may be different
from a conclusion based on a
hypothesis test. See the comments in
the individual sections.
Part 2:
Beyond the Basics of
Hypothesis Testing:
The Power of a Test
Definition
The power of a hypothesis test is the
probability (1 – β) of rejecting a false null
hypothesis. The value of the power is
computed by using a particular significance
level α and a particular value of the
population parameter that is an alternative to
the value assumed true in the null hypothesis.
That is, the power of the hypothesis test is the
probability of supporting an alternative
hypothesis that is true.
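To make the definition concrete, here is a minimal sketch of a power calculation for a right-tailed test of H0: p = p0, using the normal approximation to the binomial. The function name and the alternative value p = 0.6 are assumptions chosen for illustration; they do not come from the slides.

```python
import math
from statistics import NormalDist


def power_right_tailed_proportion(p0, p_true, n, alpha):
    """Approximate power (1 - beta) of a right-tailed z test of H0: p = p0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Smallest sample proportion that falls in the critical region under H0
    cutoff = p0 + z_crit * math.sqrt(p0 * (1.0 - p0) / n)
    # Probability that the sample proportion exceeds the cutoff when p = p_true
    se_true = math.sqrt(p_true * (1.0 - p_true) / n)
    return 1.0 - std_normal.cdf((cutoff - p_true) / se_true)


# Hypothetical alternative p = 0.6: power grows (beta shrinks) as n increases
for n in (100, 400):
    print(n, round(power_right_tailed_proportion(0.5, 0.6, n, 0.05), 3))
```

When the "alternative" value equals p0, the function returns α, since rejecting a true null hypothesis is a type I error.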
Power and the
Design of Experiments
Just as 0.05 is a common choice for a significance
level, a power of at least 0.80 is a common
requirement for determining that a hypothesis test is
effective. (Some statisticians argue that the power
should be higher, such as 0.85 or 0.90.) When
designing an experiment, we might consider how
much of a difference between the claimed value of a
parameter and its true value is an important amount
of difference. When designing an experiment, a goal
of having a power value of at least 0.80 can often be
used to determine the minimum required sample
size.
Recap
In this section we have discussed:
❖ Null and alternative hypotheses.
❖ Test statistics.
❖ Significance levels.
❖ P-values.
❖ Decision criteria.
❖ Type I and II errors.
❖ Power of a hypothesis test.
Section 8-3
Testing a Claim About a
Proportion
Key Concept
This section presents complete procedures
for testing a hypothesis (or claim) made about
a population proportion.
This section uses the components introduced
in the previous section for the P-value
method, the traditional method or the use of
confidence intervals.
Key Concept
Two common methods for testing a claim
about a population proportion are (1) to use a
normal distribution as an approximation to
the binomial distribution, and (2) to use an
exact method based on the binomial
probability distribution.
Part 1 of this section uses the approximate
method with the normal distribution, and Part
2 of this section briefly describes the exact
method.
Part 1:
Basic Methods of Testing Claims
about a Population Proportion p
Notation
p = population proportion (used in the
null hypothesis)
q = 1 – p
n = number of trials
p̂ = x/n (sample proportion)
Requirements for Testing Claims
About a Population Proportion p
1) The sample observations are a simple
random sample.
2) The conditions for a binomial distribution
are satisfied.
3) The conditions np ≥ 5 and nq ≥ 5 are both
satisfied, so the binomial distribution of
sample proportions can be approximated
by a normal distribution with µ = np and
$\sigma = \sqrt{npq}$. Note: p is the assumed
proportion not the sample
proportion.
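Requirement 3 is easy to check mechanically. In this sketch the function name `normal_approximation_ok` is made up; note that p is the proportion assumed in the null hypothesis, not the sample proportion.

```python
def normal_approximation_ok(n, p):
    """Check np >= 5 and nq >= 5, where p is the proportion claimed in H0."""
    q = 1.0 - p
    return n * p >= 5 and n * q >= 5


print(normal_approximation_ok(104, 0.5))  # prints True (np = nq = 52)
print(normal_approximation_ok(8, 0.5))    # prints False (np = nq = 4)
```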
Test Statistic for Testing
a Claim About a Proportion
$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
P-values:
Use the standard normal
distribution (Table A-2) and refer to
Figure 8-5
Critical Values:
Use the standard normal
distribution (Table A-2).
Caution
Don’t confuse a P-value with a proportion p.
P-value = probability of getting a test
statistic at least as extreme as
the one representing sample
data
p = population proportion
P-Value Method:
Use the same method as described
in Section 8-2 and in Figure 8-8.
Use the standard normal
distribution (Table A-2).
Traditional Method
Use the same method as described
in Section 8-2 and in Figure 8-9.
Confidence Interval Method
Use the same method as described
in Section 8-2 and in Table 8-2.
CAUTION
When testing claims about a population proportion,
the traditional method and the P-value method are
equivalent and will yield the same result since they
use the same standard deviation based on the claimed
proportion p.
However, the confidence interval uses an estimated
standard deviation based upon the sample proportion p̂.
Consequently, it is possible that the traditional and P-
value methods may yield a different conclusion than the
confidence interval method.
A good strategy is to use a confidence interval to
estimate a population proportion, but use the P-value
or traditional method for testing a claim about the
proportion.
Example:
The text refers to a study in which 57 out of
104 pregnant women correctly guessed the
sex of their babies.
Use these sample data to test the claim that
the success rate of such guesses is the 50%
success rate expected with random chance
guesses.
Use a 0.05 significance level.
Example:
Requirements are satisfied: simple random
sample; fixed number of trials (104) with two
categories (guess correctly or do not); np =
(104)(0.5) = 52 ≥ 5 and nq = (104)(0.5) = 52 ≥ 5
Step 1: original claim is that the success rate
is no different from 50%: p = 0.50
Step 2: opposite of original claim is p ≠ 0.50
Step 3: p ≠ 0.50 does not contain equality so
it is H1.
H0: p = 0.50 null hypothesis and original claim
H1: p ≠ 0.50 alternative hypothesis
Example:
Step 4: significance level is α = 0.05
Step 5: sample involves proportion so the
relevant statistic is the sample
proportion, p̂
Step 6: calculate z:
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\dfrac{57}{104} - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{104}}} = 0.98$
two-tailed test, P-value is twice the
area to the right of test statistic
Example:
Table A-2: z = 0.98 has an area of 0.8365 to its
left, so area to the right is 1 – 0.8365 = 0.1635;
doubling yields 0.3270 (technology provides a
more accurate P-value of 0.3268).
Step 7: the P-value of 0.3270 is greater than
the significance level of 0.05, so fail to
reject the null hypothesis
Here is the correct conclusion: There is not
sufficient evidence to warrant rejection of the
claim that women who guess the sex of their
babies have a success rate equal to 50%.
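The calculation in this example can be checked with a short script. This is only a sketch using Python's standard library; the function name proportion_z_test is an illustrative assumption, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0):
    """Two-tailed z test of H0: p = p0 using the normal approximation."""
    q0 = 1 - p0
    # Requirement from the slides: np >= 5 and nq >= 5 (p is the assumed proportion).
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("normal approximation requires np >= 5 and nq >= 5")
    p_hat = x / n                          # sample proportion
    z = (p_hat - p0) / sqrt(p0 * q0 / n)   # test statistic
    # Two-tailed P-value: twice the area beyond |z| under the standard normal curve.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Data from the example: 57 correct guesses out of 104, claimed p = 0.50.
z, p_value = proportion_z_test(57, 104, 0.50)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

Because the script does not round z before looking up the area, its P-value should agree with the technology value quoted above rather than with the table value.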
Obtaining p̂
p̂ sometimes is given directly
“10% of the observed sports cars are red”
is expressed as
p̂ = 0.10
p̂ sometimes must be calculated
“96 surveyed households have cable TV
and 54 do not” is calculated using
$\hat{p} = \dfrac{x}{n} = \dfrac{96}{96 + 54} = 0.64$
(determining the sample proportion of households with cable TV)
Part 2:
Exact Method for Testing Claims
about a Proportion p
Testing Claims
We can get exact results by using the binomial
probability distribution.
Binomial probabilities are a nuisance to calculate
manually, but technology makes this approach
quite simple.
Also, this exact approach does not require that np
≥ 5 and nq ≥ 5 so we have a method that applies
when that requirement is not satisfied.
To test hypotheses using the exact binomial
distribution, use the binomial probability
distribution with the P-value method, use the value
of p assumed in the null hypothesis, and find P-
values as follows:
Testing Claims
Left-tailed test:
The P-value is the probability of getting x or fewer
successes among n trials.
Right-tailed test:
The P-value is the probability of getting x or more
successes among n trials.
Testing Claims
Two-tailed test:
If p̂ > p, the P-value is twice the probability of
getting x or more successes.
If p̂ < p, the P-value is twice the probability of
getting x or fewer successes.
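The exact method for the three kinds of tests can be sketched as follows. This is a minimal illustration using only the standard library. The function names and the tail labels "left", "right" and "two" are assumptions made for the example, and two details are not in the slides: a doubled probability is capped at 1, and p̂ = p is given a P-value of 1.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes among n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_p_value(x, n, p0, tail):
    """Exact binomial P-value, using the value of p assumed in H0."""
    left = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))    # x or fewer
    right = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))   # x or more
    if tail == "left":
        return left
    if tail == "right":
        return right
    if tail == "two":
        p_hat = x / n
        if p_hat > p0:
            return min(1.0, 2 * right)   # cap at 1 (assumption, not in the slides)
        if p_hat < p0:
            return min(1.0, 2 * left)
        return 1.0                       # p_hat equals p0 (assumption)
    raise ValueError("tail must be 'left', 'right', or 'two'")

# The earlier example: 57 successes in 104 trials, H0: p = 0.50, two-tailed.
p_exact = exact_p_value(57, 104, 0.50, "two")
print(f"exact two-tailed P-value = {p_exact:.4f}")
```

The exact P-value will not match the normal-approximation value digit for digit, but for these data both are far above 0.05, so the conclusion is the same.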
Recap
In this section we have discussed:
❖ Test statistics for claims about a proportion.
❖ P-value method.
❖ Confidence interval method.
❖ Obtaining p.
Section 8-4
Testing a Claim About a
Mean: σ Known
Key Concept
This section presents methods for testing a
claim about a population mean, given that the
population standard deviation is a known
value.
This section uses the normal distribution with
the same components of hypothesis tests
that were introduced in Section 8-2.
Notation
n = sample size
x̄ = sample mean
µ_x̄ = population mean of all sample
means from samples of size n
σ = known value of the population
standard deviation
Requirements for Testing Claims About
a Population Mean (with σ Known)
1) The sample is a simple random
sample.
2) The value of the population standard
deviation is known.
3) Either or both of these conditions is
satisfied: The population is normally
distributed or
n > 30.
Test Statistic for Testing a Claim
About a Mean (with σ Known)
$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\dfrac{\sigma}{\sqrt{n}}}$
Example:
People have died in boat accidents because an
obsolete estimate of the mean weight of men was
used.
Using the weights of the simple random sample of
men, we obtain these sample statistics:
n = 40 and x̄ = 172.55 lb.
Research from several other sources suggests
that the population of weights of men has a
standard deviation given by σ = 26 lb.
Use these results to test the claim that men have a
mean weight greater than 166.3 lb, which was the
weight in the National Transportation and Safety
Board’s recommendation.
Use a 0.05 significance level, and use the P-value
method outlined in Figure 8-8.
Example:
Requirements are satisfied: simple random
sample, σ is known (26 lb), sample size is 40
(n > 30)
Step 1: Express claim as µ > 166.3 lb
Step 2: alternative to claim is µ ≤ 166.3 lb
Step 3: µ > 166.3 lb does not contain equality,
it is the alternative hypothesis:
H0: µ = 166.3 lb null hypothesis
H1: µ > 166.3 lb alternative hypothesis and
original claim
Example:
Step 4: significance level is α = 0.05
Step 5: claim is about the population mean,
so the relevant statistic is the sample
mean (172.55 lb), σ is known (26 lb),
sample size greater than 30
Step 6: calculate z
$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\dfrac{\sigma}{\sqrt{n}}} = \dfrac{172.55 - 166.3}{\dfrac{26}{\sqrt{40}}} = 1.52$
right-tailed test, so P-value is the area
to the right of z = 1.52
Example:
Table A-2: area to the left of z = 1.52
is 0.9357, so the area to the right is
1 – 0.9357 = 0.0643.
The P-value is 0.0643
Step 7: The P-value of 0.0643 is greater than
the significance level of α = 0.05, so we fail
to reject the null hypothesis.
[Figure: normal curve centered at µ = 166.3 (z = 0), with x̄ = 172.55 (z = 1.52) and the right-tail area P-value = 0.0643]
Example:
The P-value of 0.0643 tells us that if men have a
mean weight given by µ = 166.3 lb, there is a
good chance (0.0643) of getting a sample mean
of 172.55 lb.
A sample mean such as 172.55 lb could easily
occur by chance.
There is not sufficient evidence to support a
conclusion that the population mean is greater
than 166.3 lb, as in the National Transportation
and Safety Board’s recommendation.
Example:
The traditional method: Use the critical value z = 1.645
instead of finding the P-value. Since z = 1.52 does not fall
in the critical region, again fail to reject the null
hypothesis.
Confidence Interval method: Use a one-tailed
test with α = 0.05, so construct a 90%
confidence interval:
165.8 < µ < 179.3
The confidence interval contains 166.3 lb, so we
cannot support a claim that µ is greater than
166.3. Again, fail to reject the null hypothesis.
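All three methods for this example can be reproduced in a few lines. The sketch below uses only the standard library, and the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: n = 40, sample mean 172.55 lb, known sigma 26 lb,
# claimed mean 166.3 lb, significance level 0.05, right-tailed test.
n, x_bar, sigma, mu0, alpha = 40, 172.55, 26, 166.3, 0.05

std_error = sigma / sqrt(n)
z = (x_bar - mu0) / std_error

# P-value method: area to the right of z under the standard normal curve.
p_value = 1 - NormalDist().cdf(z)

# Traditional method: compare z with the right-tail critical value.
z_crit = NormalDist().inv_cdf(1 - alpha)
in_critical_region = z > z_crit

# Confidence interval method: a one-tailed test at 0.05 pairs with a 90% interval.
margin = z_crit * std_error
ci = (x_bar - margin, x_bar + margin)

print(f"z = {z:.2f}, P-value = {p_value:.4f}, critical value = {z_crit:.3f}")
print(f"90% CI: {ci[0]:.1f} < mu < {ci[1]:.1f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The software P-value is computed from the unrounded z, so it can differ from the Table A-2 value in the fourth decimal place.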
Underlying Rationale of
Hypothesis Testing
If, under a given assumption, there is an
extremely small probability of getting sample
results at least as extreme as the results that
were obtained, we conclude that the
assumption is probably not correct.
When testing a claim, we make an
assumption (null hypothesis) of equality. We
then compare the assumption and the
sample results and we form one of the
following conclusions:
Underlying Rationale of
Hypothesis Testing – cont
• If the sample results (or more extreme results)
can easily occur when the assumption (null
hypothesis) is true, we attribute the relatively
small discrepancy between the assumption and
the sample results to chance.
• If the sample results cannot easily occur when
that assumption (null hypothesis) is true, we
explain the relatively large discrepancy between
the assumption and the sample results by
concluding that the assumption is not true, so
we reject the assumption.
Recap
In this section we have discussed:
❖ Requirements for testing claims about
population means, σ known.
❖ P-value method.
❖ Traditional method.
❖ Confidence interval method.
❖ Rationale for hypothesis testing.
Section 8-5
Testing a Claim About a
Mean: σ Not Known
Key Concept
This section presents methods for testing a
claim about a population mean when we do
not know the value of σ.
The methods of this section use the Student t
distribution introduced earlier.
Notation
n = sample size
x̄ = sample mean
µ_x̄ = population mean of all sample
means from samples of size n
Requirements for Testing Claims
About a Population
Mean (with σ Not Known)
1) The sample is a simple random sample.
2) The value of the population standard
deviation is not known.
3) Either or both of these conditions is
satisfied: The population is normally
distributed or n > 30.
Test Statistic for Testing a
Claim About a Mean
(with σ Not Known)
$t = \dfrac{\bar{x} - \mu_{\bar{x}}}{\dfrac{s}{\sqrt{n}}}$
P-values and Critical Values
❖ Found in Table A-3
❖ Degrees of freedom (df) = n – 1
Important Properties of the
Student t Distribution
1. The Student t distribution is different for different
sample sizes (see Figure 7-5 in Section 7-4).
2. The Student t distribution has the same general bell
shape as the normal distribution; its wider shape
reflects the greater variability that is expected when s is
used to estimate σ.
3. The Student t distribution has a mean of t = 0 (just as
the standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution
varies with the sample size and is greater than 1 (unlike
the standard normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t
distribution gets closer to the standard normal
distribution.
Choosing between the Normal and
Student t Distributions when Testing a
Claim about a Population Mean µ
Use the Student t distribution when σ is
not known and either or both of these
conditions is satisfied:
The population is normally distributed or
n > 30.
Example:
People have died in boat accidents because an
obsolete estimate of the mean weight of men was
used.
Using the weights of the simple random sample
of men from Data Set 1 in Appendix B, we obtain
these sample statistics: n = 40 and x̄ = 172.55 lb,
and s = 26.33 lb. Do not assume that the value of
σ is known.
Use these results to test the claim that men have a
mean weight greater than 166.3 lb, which was the
weight in the National Transportation and Safety
Board’s recommendation M-04-04.
Use a 0.05 significance level, and the traditional
method outlined in Figure 8-9.
Example:
Requirements are satisfied: simple random
sample, population standard deviation is not
known, sample size is 40 (n > 30)
Step 1: Express claim as µ > 166.3 lb
Step 2: alternative to claim is µ ≤ 166.3 lb
Step 3: µ > 166.3 lb does not contain equality,
it is the alternative hypothesis:
H0: µ = 166.3 lb null hypothesis
H1: µ > 166.3 lb alternative hypothesis and
original claim
Example:
Step 4: significance level is α = 0.05
Step 5: claim is about the population mean,
so the relevant statistic is the sample
mean, 172.55 lb
Step 6: calculate t
$t = \dfrac{\bar{x} - \mu_{\bar{x}}}{\dfrac{s}{\sqrt{n}}} = \dfrac{172.55 - 166.3}{\dfrac{26.33}{\sqrt{40}}} = 1.501$
df = n – 1 = 39, area of 0.05, one-tail
yields t = 1.685
Example:
Step 7: t = 1.501 does not fall in the critical
region bounded by t = 1.685, so we fail
to reject the null hypothesis.
[Figure: Student t curve centered at µ = 166.3 (t = 0), with x̄ = 172.55 (t = 1.501) and the critical value t = 1.685]
Example:
Because we fail to reject the null hypothesis, we
conclude that there is not sufficient evidence to
support a conclusion that the population mean
is greater than 166.3 lb, as in the National
Transportation and Safety Board’s
recommendation.
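The traditional method for this example reduces to computing t and comparing it with the table value. The sketch below uses only the standard library, which has no Student t distribution, so the critical value 1.685 is typed in from the example rather than computed. Variable names are illustrative assumptions.

```python
from math import sqrt

# Values from the example: n = 40, sample mean 172.55 lb, s = 26.33 lb,
# claimed mean 166.3 lb, right-tailed test at the 0.05 level.
n, x_bar, s, mu0 = 40, 172.55, 26.33, 166.3

# Critical value quoted in the example from Table A-3 (df = 39, 0.05 in one tail).
t_crit = 1.685

df = n - 1
t = (x_bar - mu0) / (s / sqrt(n))

# Right-tailed test: reject H0 only if t falls beyond the critical value.
reject = t > t_crit

print(f"df = {df}, t = {t:.3f}, critical value = {t_crit}")
print("reject H0" if reject else "fail to reject H0")
```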
Normal Distribution Versus
Student t Distribution
The critical value in the preceding example
was t = 1.685, but if the normal distribution
were being used, the critical value would have
been z = 1.645.
The Student t critical value is larger (farther to
the right), showing that with the Student t
distribution, the sample evidence must be
more extreme before we can consider it to be
significant.
P-Value Method
❖ Use software or a TI-83/84 Plus
calculator.
❖ If technology is not available, use Table
A-3 to identify a range of P-values.
Example: Assuming that neither software nor
a TI-83 Plus calculator is available, use Table
A-3 to find a range of values for the P-value
corresponding to the given results.
a) In a left-tailed hypothesis test, the sample size
is n = 12, and the test statistic is t = –2.007.
b) In a right-tailed hypothesis test, the sample size
is n = 12, and the test statistic is t = 1.222.
c) In a two-tailed hypothesis test, the sample size
is n = 12, and the test statistic is t = –3.456.
a) The test is a left-tailed test with test
statistic t = –2.007, so the P-value is the
area to the left of –2.007. Because of the
symmetry of the t distribution, that is the
same as the area to the right of +2.007. Any
test statistic between 2.201 and 1.796 has a
right-tailed P-value that is between 0.025
and 0.05. We conclude that
0.025 < P-value < 0.05.
b) The test is a right-tailed test with test
statistic t = 1.222, so the P-value is the
area to the right of 1.222. Any test
statistic less than 1.363 has a right-tailed
P-value that is greater than 0.10. We
conclude that P-value > 0.10.
c) The test is a two-tailed test with test statistic
t = –3.456. The P-value is twice the area to
the left of –3.456 (= right of +3.456). Any test
statistic greater than 3.106 has a two-
tailed P-value that is less than 0.01. We
conclude that
P-value < 0.01.
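The table lookup in parts (a) to (c) can be mimicked with a small helper. This is only a sketch: the row below holds just the four df = 11 critical values quoted in the answers above, and the names ROW_DF11 and p_value_range are assumptions made for illustration. It also assumes the test statistic lies in the tail named by the alternative hypothesis.

```python
# One-tail areas and critical values for df = 11, as quoted in the answers above.
ROW_DF11 = [(0.10, 1.363), (0.05, 1.796), (0.025, 2.201), (0.005, 3.106)]

def p_value_range(t_stat, tails=1, row=ROW_DF11):
    """Return (lower, upper) bounds on the P-value; None means the row gives no bound."""
    t = abs(t_stat)            # symmetry of the t distribution
    lower = upper = None
    for area, crit in row:     # critical values increase as tail areas decrease
        if t > crit:
            upper = area       # tail area beyond t is smaller than this column's area
        else:
            lower = area       # tail area beyond t is at least this column's area
            break
    # A two-tailed P-value is twice the one-tail area.
    scale = lambda a: None if a is None else a * tails
    return scale(lower), scale(upper)

print(p_value_range(-2.007, tails=1))   # part (a)
print(p_value_range(1.222, tails=1))    # part (b)
print(p_value_range(-3.456, tails=2))   # part (c)
```

A result such as (0.10, None) reads as "P-value > 0.10", and (None, 0.01) reads as "P-value < 0.01".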
Recap
In this section we have discussed:
❖ Assumptions for testing claims about
population means, σ unknown.
❖ Student t distribution.
❖ P-value method.
6.1 – 2 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.
Lecture Slides
Elementary Statistics
Eleventh Edition
and the Triola Statistics Series
by Mario F. Triola
Chapter 6
Normal Probability Distributions
6-1 Review and Preview
6-2 The Standard Normal Distribution
6-3 Applications of Normal Distributions
6-4 Sampling Distributions and Estimators
6-5 The Central Limit Theorem
Section 6-1
Review and
Preview
Review
❖ Chapter 2: Distribution of data
❖ Chapter 3: Measures of data sets,
including measures of center and
variation
❖ Chapter 4: Principles of probability
❖ Chapter 5: Discrete probability
distributions
Preview
Chapter focus is on:
❖ Continuous random variables
❖ Normal distributions
Formula 6-1
$f(x) = \dfrac{e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}}{\sigma\sqrt{2\pi}}$
[Figure 6-1: bell-shaped curve of the normal distribution]
Distribution determined
by fixed values of mean
and standard deviation
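Formula 6-1 can be evaluated directly. The sketch below writes the density out by hand and compares it with the pdf method of statistics.NormalDist from the standard library. The function name normal_pdf and the sample inputs are assumptions made for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal density curve at x for mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# Peak of the standard normal curve (mu = 0, sigma = 1).
peak = normal_pdf(0, 0, 1)

# The hand-written formula should agree with the library implementation.
by_hand = normal_pdf(1.5, 0, 1)
by_library = NormalDist(0, 1).pdf(1.5)

print(f"f(0) = {peak:.4f}")
print(f"f(1.5) by hand = {by_hand:.6f}, by library = {by_library:.6f}")
```

Note that these are heights of the curve, not probabilities. Probabilities correspond to areas under the curve.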
Section 6-2
The Standard Normal
Distribution
Key Concept
This section presents the standard normal
distribution, which has three properties:
1. Its graph is bell-shaped.
2. Its mean is equal to 0 (µ = 0).
3. Its standard deviation is equal to 1 (σ = 1).
Develop the skill to find areas (or probabilities
or relative frequencies) corresponding to
various regions under the graph of the
standard normal distribution.
Find z-scores that
correspond to area under the graph.
Uniform Distribution
A continuous random variable has a
uniform distribution if its values are
spread evenly over the range of
possibilities. The graph of a uniform
distribution results in a rectangular shape.
Density Curve
A density curve is the graph of a
continuous probability distribution.
It must satisfy the following properties:
1. The total area under the curve must
equal 1.
2. Every point on the curve must have a
vertical height that is 0 or greater.
(That is, the curve cannot fall below
the x-axis.)
Area and Probability
Because the total area under the
density curve is equal to 1,
there is a correspondence
between area and probability.
Using Area to Find Probability
Given the uniform distribution illustrated, find
the probability that a randomly selected
voltage level is greater than 124.5 volts.
Shaded area
represents
voltage levels
greater than
124.5 volts.
Correspondence
between area
and probability:
0.25.
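The area calculation behind the 0.25 can be sketched as follows. The illustrated distribution is not reproduced in this text, so the bounds 123.0 and 125.0 volts are an assumption, chosen because they give the stated probability of 0.25. The function name is also illustrative.

```python
def uniform_prob_greater(x, low, high):
    """P(value > x) for a uniform distribution on [low, high]."""
    if not low <= x <= high:
        raise ValueError("x must lie between low and high")
    height = 1 / (high - low)      # total area of the rectangle must equal 1
    return (high - x) * height     # area of the shaded rectangle to the right of x

# Assumed bounds (not given in the text above): 123.0 to 125.0 volts.
p = uniform_prob_greater(124.5, 123.0, 125.0)
print(p)
```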
Standard Normal Distribution
The standard normal distribution is a
normal probability distribution with µ = 0
and σ = 1. The total area under its density
curve is equal to 1.
Finding Probabilities When
Given z-scores
❖ Table A-2 (in Appendix A)
❖ Formulas and Tables insert card
❖ Find areas for many different
regions
Finding Probabilities –
Other Methods
❖ STATDISK
❖ Minitab
❖ Excel
❖ TI-83/84 Plus
Methods for Finding Normal
Distribution Areas
Table A-2
Using Table A-2
1. It is designed only for the standard normal
distribution, which has a mean of 0 and a
standard deviation of 1.
2. It is on two pages, with one page for negative
z-scores and the other page for positive
z-scores.
3. Each value in the body of the table is a
cumulative area from the left up to a vertical
boundary above a specific z-score.
Using Table A-2
4. When working with a graph, avoid confusion
between z-scores and areas.
z Score
Distance along horizontal scale of the
standard normal distribution; refer to the
leftmost column and top row of Table A-2.
Area
Region under the curve; refer to the values in
the body of Table A-2.
5. The part of the z-score denoting hundredths
is found across the top.
Example – Thermometers
The Precision Scientific Instrument Company
manufactures thermometers that are supposed
to give readings of 0ºC at the freezing point of
water. Tests on a large sample of these
instruments reveal that at the freezing point of
water, some thermometers give readings below
0º (denoted by negative numbers) and some give
readings above 0º (denoted by positive
numbers). Assume that the mean reading is 0ºC
and the standard deviation of the readings is
1.00ºC. Also assume that the readings are
normally distributed. If one thermometer is
randomly selected, find the probability that, at
the freezing point of water, the reading is less
than 1.27º.
Example – cont
Look at Table A-2:
P (z < 1.27) = 0.8980
The probability of randomly selecting a
thermometer with a reading less than 1.27º
is 0.8980.
Or 89.80% will have readings below 1.27º.
Example – Thermometers Again
If thermometers have an average (mean) reading of 0
degrees and a standard deviation of 1 degree for
freezing water, and if one thermometer is randomly
selected, find the probability that it reads (at the
freezing point of water) above –1.23 degrees.
P (z > –1.23) = 0.8907
Probability of randomly selecting a thermometer
with a reading above –1.23º is 0.8907.
Example – cont
P (z > –1.23) = 0.8907
89.07% of the thermometers have readings
above –1.23 degrees.
Example – Thermometers III
A thermometer is randomly selected. Find the probability
that it reads (at the freezing point of water) between –2.00
and 1.50 degrees.
P (z < –2.00) = 0.0228
P (z < 1.50) = 0.9332
P (–2.00 < z < 1.50) = 0.9332 – 0.0228 = 0.9104
The probability that the chosen thermometer has a
reading between –2.00 and 1.50 degrees is 0.9104.
Example – cont
If many thermometers are selected and tested at the
freezing point of water, then 91.04% of them will read
between –2.00 and 1.50 degrees.
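The three thermometer probabilities can be reproduced without Table A-2. The sketch below uses statistics.NormalDist from the standard library, whose cdf method gives the cumulative area from the left, just as the body of the table does. Variable names are illustrative assumptions.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # readings: mean 0, standard deviation 1

# Area to the left of 1.27.
p_below = std_normal.cdf(1.27)

# Area to the right of -1.23 is 1 minus the cumulative area from the left.
p_above = 1 - std_normal.cdf(-1.23)

# Area between -2.00 and 1.50 is the difference of two cumulative areas.
p_between = std_normal.cdf(1.50) - std_normal.cdf(-2.00)

print(f"P(z < 1.27)         = {p_below:.4f}")
print(f"P(z > -1.23)        = {p_above:.4f}")
print(f"P(-2.00 < z < 1.50) = {p_between:.4f}")
```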
Notation
P(a < z < b) denotes the probability that the z score is between a and b.
P(z > a) denotes the probability that the z score is greater than a.
P(z < a) denotes the probability that the z score is less than a.
Finding a z Score When Given a
Probability Using Table A-2
1. Draw a bell-shaped curve and identify the region
under the curve that corresponds to the given
probability. If that region is not a cumulative
region from the left, work instead with a known
region that is a cumulative region from the left.
2. Using the cumulative area from the left, locate the
closest probability in the body of Table A-2 and
identify the corresponding z score.
Finding z Scores
When Given Probabilities
Finding the 95th Percentile
[Figure: standard normal curve with 5% (0.05) of the area in the right tail; the z score will be positive]
z = 1.645
Finding z Scores
When Given Probabilities – cont
Finding the Bottom 2.5% and Upper 2.5%
(One z score will be negative and the other positive)
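Going from a probability back to a z score is the inverse of the table lookup. As a sketch, the inv_cdf method of statistics.NormalDist takes a cumulative area from the left and returns the z score. Variable names are illustrative assumptions.

```python
from statistics import NormalDist

std_normal = NormalDist()

# 95th percentile: 0.95 of the area lies to the left, 0.05 to the right.
z_95 = std_normal.inv_cdf(0.95)

# Bottom 2.5% and upper 2.5%: cumulative areas of 0.025 and 0.975 from the left.
z_low = std_normal.inv_cdf(0.025)
z_high = std_normal.inv_cdf(0.975)

print(f"95th percentile:  z = {z_95:.3f}")
print(f"bottom 2.5%:      z = {z_low:.2f}")
print(f"upper 2.5%:       z = {z_high:.2f}")
```

The two tail cutoffs come out as a negative and a positive value of the same size, as the symmetry of the curve requires.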
Recap
In this section we have discussed:
❖ Density curves.
❖ Relationship between area and probability.
❖ Standard normal distribution.
❖ Using Table A-2.
Section 6-3
Applications of Normal
Distributions
Key Concept
This section presents methods for working
with normal distributions that are not standard.
That is, the mean is not 0 or the standard
deviation is not 1, or both.
The key concept is that we can use a simple
conversion that allows us to standardize any
normal distribution so that the same methods
of the previous section can be used.
Conversion Formula
$z = \dfrac{x - \mu}{\sigma}$
Round z scores to 2 decimal places
Converting to a Standard
Normal Distribution
$z = \dfrac{x - \mu}{\sigma}$
Example – Weights of
Water Taxi Passengers
In the Chapter Problem, we noted that the safe
load for a water taxi was found to be 3500
pounds. We also noted that the mean weight of a
passenger was assumed to be 140 pounds.
Assume the worst case that all passengers are
men. Assume also that the weights of the men
are normally distributed with a mean of 172
pounds and standard deviation of 29 pounds. If
one man is randomly selected, what is the
probability he weighs less than 174 pounds?
Example – cont
µ = 172
σ = 29
$z = \dfrac{174 - 172}{29} = 0.07$
Example – cont
µ = 172
σ = 29
P ( x < 174 lb.) = P(z < 0.07)
= 0.5279
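The conversion and lookup in this example can be sketched as below, using only the standard library. The z score is rounded to 2 decimal places first, as the conversion slide instructs, so that the result matches the table-based answer. Variable names are illustrative assumptions.

```python
from statistics import NormalDist

mu, sigma = 172, 29   # mean and standard deviation of the men's weights (lb)
x = 174               # weight of interest (lb)

# Standardize, rounding z to 2 decimal places as when using Table A-2.
z = round((x - mu) / sigma, 2)

# Cumulative area to the left of z under the standard normal curve.
p = NormalDist().cdf(z)

print(f"z = {z}, P(x < {x} lb) = {p:.4f}")
```

Skipping the rounding step, for example with NormalDist(mu, sigma).cdf(x), gives a slightly different value because z is then not truncated to two decimals.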
Helpful Hints
1. Don’t confuse z scores and areas. z scores are
distances along the horizontal scale, but areas
are regions under the normal curve. Table A-2
lists z scores in the left column and across the top
row, but areas are found in the body of the table.
2. Choose the correct (right/left) side of the graph.
3. A z score must be negative whenever it is located
in the left half of the normal distribution.
4. Areas (or probabilities) are positive or zero values,
but they are never negative.
Procedure for Finding Values
Using Table A-2 and Formula 6-2
1. Sketch a normal distribution curve, enter the given probability or
percentage in the appropriate region of the graph, and identify
the x value(s) being sought.
2. Use Table A-2 to find the z score corresponding to the cumulative
left area bounded by x. Refer to the body of Table A-2 to find the
closest area, then identify the corresponding z score.
3. Using Formula 6-2, enter the values for µ, σ, and the z score
found in step 2, then solve for x.
x = µ + (z • σ) (Another form of Formula 6-2)
(If z is located to the left of the mean, be sure that it is a negative
number.)
4. Refer to the sketch of the curve to verify that the solution makes
sense in the context of the graph and the context of the problem.
Example – Lightest and Heaviest
Use the data from the previous example to determine
what weight separates the lightest 99.5% from the
heaviest 0.5%?
Example –
Lightest and Heaviest – cont
x = µ + (z • σ)
x = 172 + (2.575 • 29)
x = 246.675 (247 rounded)
Example –
Lightest and Heaviest – cont
The weight of 247 pounds separates the
lightest 99.5% from the heaviest 0.5%
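The procedure for finding an x value from a given area can be sketched as follows. The first calculation uses the table value z = 2.575 from the example. The second lets the standard library's inv_cdf find z from the cumulative left area of 0.995. Variable names are illustrative assumptions.

```python
from statistics import NormalDist

mu, sigma = 172, 29   # mean and standard deviation of the men's weights (lb)

# Table-based route: z = 2.575 is the value used in the example.
z_table = 2.575
x_table = mu + z_table * sigma

# Software route: find z from the cumulative left area of 0.995.
z_exact = NormalDist().inv_cdf(0.995)
x_exact = mu + z_exact * sigma

print(f"table:    z = {z_table}, x = {x_table:.3f}")
print(f"software: z = {z_exact:.4f}, x = {x_exact:.3f}")
```

Both routes round to the same whole number of pounds, so the small difference in z does not change the answer.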
Recap
In this section we have discussed:
❖ Non-standard normal distribution.
❖ Converting to a standard normal distribution.
❖ Procedures for finding values using Table A-2
and Formula 6-2.
Section 6-4
Sampling Distributions
and Estimators
Key Concept
The main objective of this section is to
understand the concept of a sampling
distribution of a statistic, which is the
distribution of all values of that statistic
when all possible samples of the same size
are taken from the same population.
We will also see that some statistics are
better than others for estimating population
parameters.
Definition
The sampling distribution of a statistic (such
as the sample mean or sample proportion) is
the distribution of all values of the statistic
when all possible samples of the same size n
are taken from the same population. (The
sampling distribution of a statistic is typically
represented as a probability distribution in the
format of a table, probability histogram, or
formula.)
Definition
The sampling distribution of the mean is
the distribution of sample means, with all
samples having the same sample size n
taken from the same population. (The
sampling distribution of the mean is
typically represented as a probability
distribution in the format of a table,
probability histogram, or formula.)
Properties
❖ Sample means target the value of the
population mean. (That is, the mean of the
sample means is the population mean. The
expected value of the sample mean is equal
to the population mean.)
❖ The distribution of the sample means tends
to be a normal distribution.
Definition
The sampling distribution of the variance is the
distribution of sample variances, with all
samples having the same sample size n taken
from the same population. (The sampling
distribution of the variance is typically
represented as a probability distribution in the
format of a table, probability histogram, or
formula.)
Properties
❖ Sample variances target the value of the
population variance. (That is, the mean of
the sample variances is the population
variance. The expected value of the sample
variance is equal to the population variance.)
❖ The distribution of the sample variances
tends to be a distribution skewed to the
right.
Definition
The sampling distribution of the proportion
is the distribution of sample proportions,
with all samples having the same sample
size n taken from the same population.
Definition
We need to distinguish between a
population proportion p and some sample
proportion:
p = population proportion
p̂ = sample proportion
Properties
❖ Sample proportions target the value of the
population proportion. (That is, the mean of
the sample proportions is the population
proportion. The expected value of the
sample proportion is equal to the population
proportion.)
❖ The distribution of the sample proportion
tends to be a normal distribution.
Unbiased Estimators
Sample means, variances, and
proportions are unbiased estimators.
That is, they target the population
parameter.
These statistics are better at estimating
the population parameter.
Biased Estimators
Sample medians, ranges, and standard
deviations are biased estimators.
That is, they do NOT target the
population parameter.
Note: the bias with the standard
deviation is relatively small in large
samples, so s is often used to estimate σ.
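The difference between unbiased and biased estimators can be checked by listing every possible sample from a tiny population. The sketch below is an illustration only: the population values 4, 5, 9 and the sample size n = 2 are assumptions chosen to keep the list short, and sampling is done with replacement.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical population, chosen only for illustration.
population = [4, 5, 9]

# All possible samples of size n = 2, selected with replacement.
samples = list(product(population, repeat=2))

mean_of_means = mean(mean(s) for s in samples)
mean_of_variances = mean(variance(s) for s in samples)  # s^2 divides by n - 1
mean_of_stdevs = mean(stdev(s) for s in samples)        # s divides by n - 1

print(mean_of_means, mean(population))           # equal: the sample mean is unbiased
print(mean_of_variances, pvariance(population))  # equal: the sample variance is unbiased
print(mean_of_stdevs, pstdev(population))        # not equal: s underestimates sigma
```

The sample standard deviation falls short of σ on average even though the sample variance targets σ² exactly, which is the bias the note above refers to.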
Example – Sampling Distributions
Consider repeating this process: Roll a die 5
times, then find the mean x̄, the variance s², and the
proportion of odd numbers of the
results.
What do we know about the behavior of all
sample means that are generated as this
process continues indefinitely?
Example – Sampling Distributions
All outcomes are equally likely, so the
population mean is 3.5; the mean of the sample means
from the 10,000 trials is 3.49. If continued indefinitely,
the mean of the sample means will be 3.5. Also, notice the
distribution is “normal.”
Specific results from 10,000 trials
Example – Sampling Distributions
All outcomes are equally likely, so the
population variance is 2.9; the mean of the sample variances
from the 10,000 trials is 2.88. If continued indefinitely,
the mean of the sample variances will be 2.9. Also, notice
the distribution is “skewed to the right.”
Specific results from 10,000 trials
Example – Sampling Distributions
All outcomes are equally likely, so the
population proportion of odd numbers is 0.50;
the mean of the sample proportions from the 10,000 trials
is 0.50. If continued indefinitely, the mean of sample
proportions will be 0.50. Also, notice the
distribution is “approximately normal.”
Specific results from 10,000 trials
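The three die-rolling results can be reproduced with a short simulation. This is a sketch under stated assumptions, not the software used for the slides: the seed is arbitrary, `one_trial` is a hypothetical helper, and a different seed gives slightly different averages.

```python
import random
from statistics import mean, variance

random.seed(1)  # arbitrary seed so the run is repeatable

def one_trial(rolls=5):
    """Roll a fair die `rolls` times; return (mean, variance, proportion of odd)."""
    sample = [random.randint(1, 6) for _ in range(rolls)]
    prop_odd = sum(v % 2 for v in sample) / rolls
    return mean(sample), variance(sample), prop_odd

trials = [one_trial() for _ in range(10_000)]
means, variances, props = zip(*trials)

print(mean(means))      # close to the population mean 3.5
print(mean(variances))  # close to the population variance 35/12 (about 2.9)
print(mean(props))      # close to the population proportion 0.5
```

Each printed value is the mean of 10,000 sample statistics, so it estimates the center of the corresponding sampling distribution rather than any single sample's result.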
Why Sample with Replacement?
Sampling without replacement would have the very
practical advantage of avoiding wasteful duplication
whenever the same item is selected more than once.
However, we are interested in sampling with
replacement for these two reasons:
1. When selecting a relatively small sample from a
large population, it makes no significant
difference whether we sample with replacement
or without replacement.
2. Sampling with replacement results in
independent events that are unaffected by
previous outcomes, and independent events are
easier to analyze and result in simpler
calculations and formulas.
Caution
Many methods of statistics require a simple
random sample. Some samples, such as
voluntary response samples or convenience
samples, could easily result in very wrong
results.
Recap
In this section we have discussed:
❖ Sampling distribution of a statistic.
❖ Sampling distribution of the mean.
❖ Sampling distribution of the variance.
❖ Sampling distribution of the proportion.
❖ Estimators.
Section 6-5
The Central Limit
Theorem
Key Concept
The Central Limit Theorem tells us that for a
population with any distribution, the
distribution of the sample means approaches
a normal distribution as the sample size
increases.
The procedures in this section form the
foundation for estimating population
parameters and hypothesis testing.
Central Limit Theorem
Given:
1. The random variable x has a distribution (which may
or may not be normal) with mean µ and standard
deviation σ.
2. Simple random samples all of size n are selected
from the population. (The samples are selected so
that all possible samples of the same size n have the
same chance of being selected.)
Central Limit Theorem – cont.
Conclusions:
1. The distribution of sample means x̄ will, as the
sample size increases, approach a normal
distribution.
2. The mean of the sample means is the
population mean µ.
3. The standard deviation of all sample means
is σ/√n.
Practical Rules Commonly Used
1. For samples of size n larger than 30, the
distribution of the sample means can be
approximated reasonably well by a normal
distribution. The approximation gets closer
to a normal distribution as the sample size n
becomes larger.
2. If the original population is normally
distributed, then for any sample size n, the
sample means will be normally distributed
(not just the values of n larger than 30).
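The conclusions of the Central Limit Theorem can be checked numerically for a population that is far from normal. The sketch below assumes an exponential population with rate 1 (so µ = 1 and σ = 1); that population, the sample size n = 40, the 5,000 repetitions, and the seed are all illustrative choices rather than values from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2)  # arbitrary seed so the run is repeatable

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
MU, SIGMA = 1.0, 1.0

def sample_mean(n):
    """Mean of one random sample of size n from the assumed population."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

n = 40  # larger than 30, as in practical rule 1
sample_means = [sample_mean(n) for _ in range(5_000)]

print(mean(sample_means))   # close to mu = 1
print(stdev(sample_means))  # close to sigma / sqrt(n)
print(SIGMA / sqrt(n))      # about 0.158
```

Even though individual exponential values are strongly skewed to the right, the sample means cluster around µ with a spread close to σ/√n.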
Notation
µ_x̄ = µ (the mean of the sample means)
σ_x̄ = σ/√n (the standard deviation of the sample means,
often called the standard error of the mean)
Example – Normal Distribution
As we proceed from n = 1 to n = 50, we see that the
distribution of sample means is approaching the shape
of a normal distribution.
Example – Uniform Distribution
As we proceed from n = 1 to n = 50, we see that the
distribution of sample means is approaching the shape
of a normal distribution.
Example – U-Shaped Distribution
As we proceed from n = 1 to n = 50, we see that the
distribution of sample means is approaching the shape
of a normal distribution.
Important Point
As the sample size increases, the
sampling distribution of sample
means approaches a normal
distribution.
Example – Water Taxi Safety
Use the Chapter Problem. Assume the
population of weights of men is normally
distributed with a mean of 172 lb and a
standard deviation of 29 lb.
a) Find the probability that if an individual man
is randomly selected, his weight is greater
than 175 lb.
b) Find the probability that 20 randomly
selected men will have a mean weight that is
greater than 175 lb (so that their total weight
exceeds the safe capacity of 3500 pounds).
Example – cont
a) Find the probability that if an individual man
is randomly selected, his weight is greater
than 175 lb.
z = (175 – 172) / 29 = 0.10
Example – cont
b) Find the probability that 20 randomly
selected men will have a mean weight that is
greater than 175 lb (so that their total weight
exceeds the safe capacity of 3500 pounds).
z = (175 – 172) / (29 / √20) = 0.46
Example – cont
a) Find the probability that if an individual man is
randomly selected, his weight is greater than 175 lb.
P(x > 175) = 0.4602
b) Find the probability that 20 randomly selected men
will have a mean weight that is greater than 175 lb (so
that their total weight exceeds the safe capacity of
3500 pounds).
P(x̄ > 175) = 0.3228
It is much easier for an individual to deviate from the
mean than it is for a group of 20 to deviate from the mean.
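Both probabilities in the water taxi example can be reproduced with a few lines of Python. This is a minimal sketch: `prob_greater` is a hypothetical helper, and the z score is rounded to two decimals only to mimic a Table A-2 lookup (without that rounding the answers differ slightly in the third decimal place).

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 172, 29  # weights of men (lb), from the example
STANDARD_NORMAL = NormalDist()

def prob_greater(x, n=1):
    """P(mean of n values > x), with z rounded to two decimals as in Table A-2."""
    z = round((x - MU) / (SIGMA / sqrt(n)), 2)
    return 1 - STANDARD_NORMAL.cdf(z)

p_individual = prob_greater(175)   # part (a): z = 0.10
p_group = prob_greater(175, n=20)  # part (b): z = 0.46

print(round(p_individual, 4))  # 0.4602
print(round(p_group, 4))       # 0.3228
```

The only difference between the two parts is the denominator of the z score: σ for an individual value, and σ/√n for a mean of n values.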
Interpretation of Results
Given that the safe capacity of the water taxi
is 3500 pounds, there is a fairly good chance
(with probability 0.3228) that it will be
overloaded with 20 randomly selected men.
Correction for a Finite Population
When sampling without replacement and the sample
size n is greater than 5% of the finite population of
size N (that is, n > 0.05N), adjust the standard
deviation of sample means by multiplying it by the
finite population correction factor:
σ_x̄ = (σ/√n) • √((N – n)/(N – 1))
Here √((N – n)/(N – 1)) is the finite population
correction factor.
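The correction can be written as a small function. This is a sketch under stated assumptions: `std_of_sample_means` is a hypothetical name, the caller is assumed to be sampling without replacement whenever N is supplied, and the numbers in the usage lines (σ = 29, n = 20, N = 200) are made up for illustration.

```python
from math import sqrt

def std_of_sample_means(sigma, n, N=None):
    """Standard deviation of sample means.

    The finite population correction is applied only when a population
    size N is given and n > 0.05 * N (sampling without replacement assumed).
    """
    standard_error = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        standard_error *= sqrt((N - n) / (N - 1))
    return standard_error

# Hypothetical numbers: sigma = 29, samples of n = 20 from N = 200 items.
print(std_of_sample_means(29, 20))         # no correction
print(std_of_sample_means(29, 20, N=200))  # smaller, since 20 > 0.05 * 200
```

The factor is always at most 1, so the corrected standard deviation is never larger than σ/√n, and it reaches 0 when the sample is the entire population.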
Recap
In this section we have discussed:
❖ Central limit theorem.
❖ Practical rules.
❖ Effects of sample sizes.
❖ Correction for a finite population.
Section 6-7
Assessing Normality
Key Concept
This section presents criteria for determining
whether the requirement of a normal
distribution is satisfied.
The criteria involve visual inspection of a
histogram to see if it is roughly bell shaped,
identifying any outliers, and constructing a
graph called a normal quantile plot.
Definition
A normal quantile plot (or normal
probability plot) is a graph of points (x,y),
where each x value is from the original set
of sample data, and each y value is the
corresponding z score that is a quantile
value expected from the standard normal
distribution.
Procedure for Determining Whether
It Is Reasonable to Assume that
Sample Data are From a Normally
Distributed Population
1. Histogram: Construct a histogram. Reject
normality if the histogram departs dramatically
from a bell shape.
2. Outliers: Identify outliers. Reject normality if
there is more than one outlier present.
3. Normal Quantile Plot: If the histogram is
basically symmetric and there is at most one
outlier, use technology to generate a normal
quantile plot.
3. Continued
Use the following criteria to determine whether
or not the distribution is normal.
Normal Distribution: The population distribution
is normal if the pattern of the points is
reasonably close to a straight line and the
points do not show some systematic pattern
that is not a straight-line pattern.
3. Continued
Not a Normal Distribution: The population distribution is
not normal if either or both of these two conditions
applies:
❖ The points do not lie reasonably close to a straight
line.
❖ The points show some systematic pattern that is not a
straight-line pattern.
Example
Normal: The histogram of IQ scores is close to being bell-shaped,
which suggests that the IQ scores are from a normal
distribution. The normal quantile plot shows points that
are reasonably close to a straight-line pattern. It is safe to
assume that these IQ scores are from a normally
distributed population.
Example
Uniform: Histogram of data having a uniform distribution.
The corresponding normal quantile plot suggests that the
points are not normally distributed because the points
show a systematic pattern that is not a straight-line
pattern. These sample values are not from a population
having a normal distribution.
Example
Skewed: Histogram of the amounts of rainfall in Boston for
every Monday during one year. The shape of the histogram
is skewed, not bell-shaped. The corresponding normal
quantile plot shows points that are not at all close to a
straight-line pattern. These rainfall amounts are not from a
population having a normal distribution.
Manual Construction of a
Normal Quantile Plot
Step 1. First sort the data by arranging the values in
order from lowest to highest.
Step 2. With a sample of size n, each value represents a
proportion of 1/n of the sample. Using the known
sample size n, identify the areas of 1/(2n), 3/(2n),
5/(2n), and so on. These are the cumulative areas to the
left of the corresponding sample values.
Step 3. Use the standard normal distribution (Table A-2
or software or a calculator) to find the z scores
corresponding to the cumulative left areas found
in Step 2. (These are the z scores that are
expected from a normally distributed sample.)
Manual Construction of a
Normal Quantile Plot
Step 4. Match the original sorted data values with their
corresponding z scores found in Step 3, then
plot the points (x, y), where each x is an original
sample value and y is the corresponding z score.
Step 5. Examine the normal quantile plot and determine
whether or not the distribution is normal.
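Steps 1 through 4 can be sketched in Python as follows. This is an illustration, not the STATDISK procedure: `normal_quantile_points` is a hypothetical helper, the sample values are made up, and `NormalDist.inv_cdf` replaces the Table A-2 lookup in Step 3.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return the (x, y) points of a normal quantile plot."""
    xs = sorted(data)                                  # Step 1: sort the data
    n = len(xs)
    areas = [(2 * i + 1) / (2 * n) for i in range(n)]  # Step 2: 1/(2n), 3/(2n), ...
    zs = [NormalDist().inv_cdf(a) for a in areas]      # Step 3: z scores for those areas
    return list(zip(xs, zs))                           # Step 4: pair values with z scores

# Hypothetical sample values, for illustration only.
for x, z in normal_quantile_points([127, 125, 131, 124, 129]):
    print(x, round(z, 2))
```

Step 5 remains a visual judgment: plot the returned points and check whether they lie reasonably close to a straight line with no systematic curved pattern.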
Ryan-Joiner Test
The Ryan-Joiner test is one of several formal
tests of normality, each having its own
advantages and disadvantages. STATDISK has a
feature of Normality Assessment that displays a
histogram, normal quantile plot, the number of
potential outliers, and results from the Ryan-
Joiner test. Information about the Ryan-Joiner
test is readily available on the Internet.
Data Transformations
Many data sets have a distribution that is not
normal, but we can transform the data so that
the modified values have a normal distribution.
One common transformation is to replace each
value of x with log (x + 1). If the distribution of
the log (x + 1) values is a normal distribution,
the distribution of the x values is referred to as
a lognormal distribution.
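The log (x + 1) transformation is simple to apply in code. In this sketch the base of the logarithm is an assumption (base 10 is used because the slide does not specify one), `log_transform` is a hypothetical helper, and the data values are made up to show how widely spread values are pulled together.

```python
from math import log10

def log_transform(values):
    """Replace each x with log(x + 1); base 10 is an assumption here."""
    return [log10(x + 1) for x in values]

# Hypothetical right-skewed values, for illustration only.
data = [0, 9, 99, 999]
print(log_transform(data))  # evenly spaced: 0, 1, 2, 3 (up to floating-point rounding)
```

Adding 1 before taking the logarithm keeps a data value of 0 usable, since log 0 is undefined. The transformed values would then be assessed for normality with the same histogram and normal quantile plot criteria.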
Other Data Transformations
In addition to replacing each x value with
log (x + 1), there are other transformations,
such as replacing each x value with √x, or 1/x,
or x². In addition to getting a required normal
distribution when the original data values are
not normally distributed, such transformations
can be used to correct other deficiencies, such
as a requirement (found in later chapters) that
different data sets have the same variance.
Recap
In this section we have discussed:
❖ Normal quantile plot.
❖ Procedure to determine if data have a
normal distribution.