Initial Postings: Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter.
Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required but feel free to use them if they aid in your discussion.
Also, provide a graduate-level response to each of the following questions:
- Please discuss the p-value and what it is used for. Also, explain where the p-value resides on the bell curve.
[Your post must be substantive and demonstrate insight gained from the course material. Postings must be in the student’s own words – do not provide quotes!]
[Your initial post should be at least 200 words and in APA format (including Times New Roman with font size 12 and double spaced).]
Reasoning from Sample to Population
Chapter 3
© 2019 McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or distribution without the prior written consent of McGraw-Hill Education
Learning Objectives
Calculate standard summary statistics for a given data sample.
Explain the reasoning inherent in a confidence level.
Construct a confidence interval.
Explain the reasoning inherent in a hypothesis test.
Execute a hypothesis test.
Outline the roles of deductive and inductive reasoning in making active predictions.
Population parameter: a numerical expression that summarizes some feature of the population
Objective degree of support using inductive and deductive reasoning
Construction of a confidence interval
Hypothesis testing
Distributions and Sample Statistics
Random variable
A variable that can take on multiple values, with any given realization of the variable being due to chance (or randomness)
Deterministic variable
A variable whose value can be predicted with certainty
Distributions of Random Variables
Random Variables
Distributions of Random Variables
DISCRETE
COUNTABLE NUMBER OF VALUES (e.g., 5, 9, 19, 27…)
CONTINUOUS
UNCOUNTABLE INFINITE NUMBER OF VALUES (ALL THE NUMBERS, TO ANY DECIMAL PLACE, BETWEEN 0 AND 1)
Distributions of Random Variables
The probabilities of individual outcomes for a discrete random variable are represented by a probability function
EXAMPLE
OF 10 PEOPLE:
3 OF THEM ARE 25 YEARS OLD,
4 OF THEM ARE 30 YEARS OLD,
2 OF THEM ARE 40 YEARS OLD,
1 OF THEM IS 45 YEARS OLD.
PROBABILITY THAT A SINGLE DRAW WILL BE:
25 YEARS OLD IS 3/10 = 0.3
30 YEARS OLD IS 4/10 = 0.4,
40 YEARS OLD IS 2/10 = 0.2,
45 YEARS OLD IS 1/10 = 0.1.
DISCRETE RANDOM VARIABLE
POPULATION
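The probability function in the age example can be sketched in a few lines of Python. This is only an illustration: the helper name `probability_function` is hypothetical, and the population is the 10 people listed above.

```python
from collections import Counter
from fractions import Fraction

# The example population: 10 people and their ages.
ages = [25] * 3 + [30] * 4 + [40] * 2 + [45] * 1


def probability_function(values):
    """Map each distinct value to the probability of drawing it in a single random draw."""
    counts = Counter(values)
    total = len(values)
    return {value: Fraction(count, total) for value, count in counts.items()}


pmf = probability_function(ages)
for age, prob in sorted(pmf.items()):
    # float(prob) gives the decimal form used on the slide (0.3, 0.4, 0.2, 0.1).
    print(f"P(Age = {age}) = {float(prob):.1f}")
```

Because every outcome is accounted for, the probabilities sum to 1.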
Graphical Representation for a Discrete Random Variable (Age)
Distributions of Random Variables
The probabilities of individual outcomes for a continuous random variable are represented by a probability density function (pdf)
A special type of continuous random variable is the normal random variable, which has a “bell-shaped” pdf
Graphical Representation for a Normal Random Variable
Distributions and Sample Statistics
For a normal random variable, and any other continuous random variable, the pdf allows us to calculate the probabilities that the random variable falls in various ranges.
The probability that a random variable falls between two numbers A and B is the area under the pdf curve between A and B
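For a normal random variable, the area under the pdf between A and B can be computed as the difference of two cumulative probabilities. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `prob_between` and the example bounds are illustrative assumptions, not part of the slides.

```python
from statistics import NormalDist


def prob_between(a, b, mean=0.0, sd=1.0):
    """Area under a normal pdf between a and b, computed as CDF(b) - CDF(a)."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(b) - dist.cdf(a)


# Probability that a standard normal variable falls within 1.96 of its mean.
p = prob_between(-1.96, 1.96)
print(f"P(-1.96 < X < 1.96) = {p:.4f}")
```

The result is approximately 0.95, which is why 1.96 appears in 95% confidence intervals.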
Probability That a Random Variable Falls Between Two Numbers
Distributions of Random Variables
Expected Value or Population Mean
The summation of each possible realization of Xi multiplied by the probability of that realization.
Variance
A common measure for the spread of the distribution; defined by $E[(X_i - E[X_i])^2]$.
Standard Deviation
The square root of the variance.
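As a worked illustration of these three definitions, the sketch below applies them to the age distribution from the earlier discrete example (probabilities 0.3, 0.4, 0.2, 0.1). The function names are hypothetical.

```python
from math import sqrt

# Probability function for Age from the 10-person example population.
pmf = {25: 0.3, 30: 0.4, 40: 0.2, 45: 0.1}


def expected_value(pmf):
    """Sum of each possible realization multiplied by its probability."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """E[(X - E[X])^2]: probability-weighted squared distance from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


mu = expected_value(pmf)   # 25(0.3) + 30(0.4) + 40(0.2) + 45(0.1)
var = variance(pmf)
sd = sqrt(var)             # standard deviation is the square root of the variance
print(f"mean = {mu:.2f}, variance = {var:.2f}, sd = {sd:.2f}")
```

For this population the expected age is 32 and the variance is 46.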
Data Samples and Sample Statistics
Sample Size of N
A collection of N realizations of Xi: {X1, X2, …, XN}
Sample Statistics
Single measures of some feature of a data sample
Sample Mean
A common measure of the center of a sample
Sample Variance
Common measure of the spread of a sample
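For a sample of size N for random variable Xi, the sample variance is: $S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2$, where $\bar{X}$ is the sample mean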
Sample Standard Deviation
The square root of the sample variance
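For a sample of size N for random variable Xi, the sample standard deviation is: $S = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2}$, where $\bar{X}$ is the sample mean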
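These sample statistics are available in Python's standard library. The sketch below assumes the usual N − 1 divisor for the sample variance (which is what `statistics.variance` uses) and a small made-up sample of ages; the data are hypothetical and only for illustration.

```python
from statistics import mean, stdev, variance

# Hypothetical sample of observed ages (not from the source material).
sample_ages = [25, 30, 30, 40, 45, 30, 25, 40]

x_bar = mean(sample_ages)      # sample mean: center of the sample
s2 = variance(sample_ages)     # sample variance, divisor N - 1
s = stdev(sample_ages)         # sample standard deviation = sqrt(sample variance)

# The same variance computed directly from the definition.
n = len(sample_ages)
s2_manual = sum((x - x_bar) ** 2 for x in sample_ages) / (n - 1)

print(f"mean = {x_bar}, variance = {s2:.3f}, sd = {s:.3f}")
```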
Data and Sample Statistics
Confidence Interval
Suppose a firm wants to know the average age of its customers
It collects data from 872 of its customers, so its sample size is N = 872
Agei = a random variable defined as the age of a single customer
agei = the observed age of customer i in the sample
Confidence Interval
Estimator
A calculation using sample data that is used to provide information about a population parameter
Random sample
A sample where every member of the population has an equal chance of being selected
Confidence Interval
Deductive argument: If we have a random sample, the sample mean is a “reasonable guess” for the population mean
Inductive argument: Then the population mean is the same as the sample mean
How sure are we that the population mean in our example is the same as the sample mean?
Confidence interval: a range of values such that there is a specified probability that they contain a population parameter
Confidence Interval
How do we build confidence intervals and determine their objective degree of support?
Independent
The distribution of one random variable does not depend on the realization of another
Independent and identically distributed (i.i.d.)
The distribution of one random variable does not depend on the realization of another and each has identical distribution.
Confidence Interval
Unbiased estimator
An estimator whose mean is equal to the population parameter it is used to estimate
Population standard deviation
The square root of the population variance
Population variance
The variance of a random variable over the entire population
Data and Sample Statistics
In order to construct a confidence interval for the population mean and know its objective degree of support, we must know something about its standard deviation and its type of distribution
The assumption that a data sample is a random sample implies the standard deviation of the sample mean is $\sigma/\sqrt{N}$, where $\sigma$ is the population standard deviation and N is the sample size
The spread of the sample mean gets smaller as the sample size increases
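A quick numerical sketch shows how the spread of the sample mean shrinks as N grows. The population standard deviation of 10 used here is an assumed value for illustration, and the function name is hypothetical.

```python
from math import sqrt


def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean for a random sample of size n."""
    return sigma / sqrt(n)


sigma = 10.0  # assumed population standard deviation (illustrative only)
for n in (30, 100, 872):
    print(f"N = {n:4d}: sd of sample mean = {sd_of_sample_mean(sigma, n):.3f}")
```

Quadrupling the sample size halves the standard deviation of the sample mean, since N enters through its square root.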
Data and Sample Statistics
Assuming a random sample with reasonably large N (> 30) implies that the sample mean is normally distributed with a mean of µ and a standard deviation of $\sigma/\sqrt{N}$
This can be written as: $\bar{X} \sim \mathcal{N}(\mu, \sigma/\sqrt{N})$
Probability Sample Mean within 1.96 Standard Deviations of Population Mean
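Since the sample mean falls within 1.96 standard deviations of the population mean about 95% of the time, a 95% confidence interval can be sketched as the sample mean plus or minus 1.96 standard deviations of the sample mean. The code below is a minimal sketch under two assumptions not stated in the slides: the unknown population standard deviation is replaced by the sample standard deviation, and the 872 ages are generated by an arbitrary deterministic rule purely to have data to run on.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean (large N)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # assumption: sample sd stands in for the population sd
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical data: 872 made-up customer ages between 20 and 60.
sample_ages = [20 + (7 * i) % 41 for i in range(872)]

low, high = confidence_interval(sample_ages)
print(f"95% confidence interval for mean age: ({low:.2f}, {high:.2f})")
```

Raising the confidence level (for example to 0.99) widens the interval, because a larger multiple of the standard deviation is needed.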
Hypothesis Testing
A hypothesis test is the process of using sample data to assess the credibility of a hypothesis about a population
Making an assessment
Reject the hypothesis
Fail to reject the hypothesis
Hypothesis Testing
Null hypothesis
The hypothesis to be tested using a data sample
Written as H0: µ = K, where K is the hypothesized value for the population mean
The objective is to determine whether the null hypothesis is credible given the data we observe.
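If a sample of size N is a random sample, N is “large” (> 30), and µ = K, then the sample mean is normally distributed with mean K and standard deviation $\sigma/\sqrt{N}$: $\bar{X} \sim \mathcal{N}(K, \sigma/\sqrt{N})$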
Probability Sample Mean within 1.96 Standard Deviations of Hypothesized Population Mean
Hypothesis Testing
Steps in Hypothesis Testing:
State the null hypothesis
Collect the data sample and calculate the sample mean
Decide whether or not to reject the deduced distribution for the sample mean
Degree of support
Measure how many standard deviations the sample mean is from the hypothesized population mean
$Z = \frac{\bar{X} - K}{\sigma/\sqrt{N}}$
Hypothesis Testing
To calculate Z, take the difference between the sample mean and the hypothesized population mean ($\bar{X} - K$)
Then take that difference and divide it by the standard deviation of the sample mean ($\sigma/\sqrt{N}$)
The t-stat is the difference between the sample mean and the hypothesized population mean ($\bar{X} - K$) divided by the estimated standard deviation of the sample mean ($S/\sqrt{N}$, where S is the sample standard deviation), or $t = \frac{\bar{X} - K}{S/\sqrt{N}}$
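The t-stat calculation can be sketched as follows. The sample data and the hypothesized mean K = 38 are made up for illustration, and the function name `t_stat` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_stat(sample, k):
    """(sample mean - hypothesized mean) / (sample sd / sqrt(N))."""
    n = len(sample)
    return (mean(sample) - k) / (stdev(sample) / sqrt(n))


# Hypothetical data: 872 made-up customer ages between 20 and 60.
sample_ages = [20 + (7 * i) % 41 for i in range(872)]

k = 38.0  # hypothesized population mean under H0 (illustrative value)
t = t_stat(sample_ages, k)
print(f"sample mean = {mean(sample_ages):.2f}, t-stat = {t:.2f}")
```

A t-stat near zero means the sample mean is close to the hypothesized value; a large absolute value means it is many standard deviations away.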
Hypothesis Testing
Test statistic
Any single value derived from a sample that can be used to perform a hypothesis test
p-value
The probability of attaining a test statistic at least as extreme as the one that was observed
Graphical Illustration of a P-Value
Hypothesis Testing
The t-stat is an observed value from a t-distribution, a distribution that resembles a normal distribution and is centered at zero
In Excel, a p-value can be calculated using the formula:
2 × (1 − NORM.S.DIST(|t|, TRUE)), where |t| is the absolute value of the observed t-stat
If the observed t-stat is very unlikely under this distribution (has a low p-value), then reject the distribution; if it is not unlikely, fail to reject
If the p-value is less than the cutoff, reject, and fail to reject otherwise
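The same two-sided p-value can be sketched in Python, where `NormalDist().cdf` plays the role of Excel's NORM.S.DIST with the cumulative flag set to TRUE. Like the Excel formula, this uses the standard normal as a large-sample approximation to the t-distribution. The function names and the example t-stat are illustrative assumptions.

```python
from statistics import NormalDist


def two_sided_p_value(t):
    """Probability of a test statistic at least as extreme as t, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def decide(p_value, cutoff=0.05):
    """Reject if the p-value is below the cutoff; fail to reject otherwise."""
    return "reject" if p_value < cutoff else "fail to reject"


t = 2.5  # hypothetical observed t-stat
p = two_sided_p_value(t)
print(f"t = {t}, p-value = {p:.4f}, decision: {decide(p)}")
```

On the bell curve, the p-value is the combined area in the two tails beyond −|t| and +|t|.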
Hypothesis Testing
Cutoffs using p-values directly correspond to the degrees of support you chose for your inductive argument
If your chosen degree of support is D%, then the cutoff is
(100 − D)%, or 1 − D/100
With a cutoff of 0.05 (a 95% degree of support), you will incorrectly reject 5% of the time when the deduced distribution for the sample mean is correct, because in that case you will still observe a p-value less than 0.05 in 5% of samples
Hypothesis Testing
The standard degrees of confidence are 90%, 95%, and 99%; the corresponding p-value cutoffs are 0.10, 0.05, and 0.01
Reject the distribution if the p-value is less than 0.10; fail to reject otherwise. This generates a degree of support of 90%.
Reject the distribution if the p-value is less than 0.05; fail to reject otherwise. This generates a degree of support of 95%.
Reject the distribution if the p-value is less than 0.01; fail to reject otherwise. This generates a degree of support of 99%.
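The three rules above differ only in the cutoff, which follows from the degree of support D as (100 − D)/100. A small sketch, with hypothetical function names and an illustrative p-value of 0.03:

```python
def cutoff_for(degree_of_support):
    """Convert a degree of support D (in percent) to a p-value cutoff."""
    return (100 - degree_of_support) / 100


def decide(p_value, degree_of_support):
    """Reject if the p-value is below the cutoff implied by D; fail to reject otherwise."""
    return "reject" if p_value < cutoff_for(degree_of_support) else "fail to reject"


p_value = 0.03  # hypothetical p-value
for d in (90, 95, 99):
    print(f"D = {d}%: cutoff = {cutoff_for(d):.2f}, decision = {decide(p_value, d)}")
```

The same p-value can lead to rejection at 90% and 95% but not at 99%, so the degree of support should be chosen before looking at the data.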
The Interplay Between Deductive and Inductive Reasoning in Active Predictions
The underlying reason for active predictions:
Forming the prediction uses deductive reasoning
Assume the causal relationship, which then implies the prediction
Estimating the causal relationship uses deductive and inductive reasoning
Deductive reasoning: Make assumptions that imply causality between X and Y and the distribution of an estimator for the magnitude of this causality in the population
Inductive reasoning: Using an observed data sample, build a confidence interval and/or determine whether to reject a null hypothesis for the magnitude of the population-level causality