Assigned Readings:
Chapter 1. The Roles of Data and Predictive Analytics in Business
Chapter 2. Reasoning with Data
Initial Postings: Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter. Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required but feel free to use them if they aid in your discussion.
Also, provide a graduate-level response to each of the following questions:
- Based on what you have read in Chapters 1-2, please explain how data analytics applies to your current or future role? What value can data analytics bring to your position? Please share your thoughts. Please cite examples according to APA standards.
[Your post must be substantive and demonstrate insight gained from the course material. Postings must be in the student’s own words – do not provide quotes!]
[Your initial post should be at least 450 words and in APA format (including Times New Roman, font size 12, double-spaced). Post the body of your paper in the discussion thread, then attach a Word version of the paper for APA review.]
The Roles of Data and Predictive Analytics in Business
Chapter 1
© 2019 McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Learning Objectives
Explain how predictive analytics can help in business strategy formulation.
Distinguish structured from unstructured data.
Differentiate units of observation.
Outline a data-generating process.
Describe the primary ways that data analysis is used to aid business performance.
Discriminate between lead and lag information.
Discriminate between active and passive prediction.
Recognize questions pertaining to business strategy that may utilize (active) predictive analytics.
Defining Data & Data Uses in Business
Data
A collection of information
Database
Organized collection of data that firms use for analysis
Business analytics
The use of data analysis to aid in business decision making
Predictive analytics
The use of data analysis designed to form predictions about future, or unknown, events or outcomes
Business Strategy
Plan of action designed by a business practitioner to achieve a business objective
Business objectives include profit maximization, enhanced employee satisfaction, etc.
Examples of action include pricing decisions, advertising campaigns, and methods of employee compensation
Predictive Analytics for Business Strategy
Without data, a strong theoretical model is often not enough to predict effective business strategies
Sound theoretical arguments coupled with data become a strong tool to predict effective business strategies
Predictive analytics is an ideal complement to create a successful business strategy
Data Features
Structured Data
Data with well-defined units of observation which can be classified and structured in the form of a spreadsheet.
An example:
Data Features
Unstructured Data
Any data that cannot be classified and structured.
An example:
The Unit of Observation
The entity for which information has been collected
Crucial component of structured data
Tells us the way in which the information in a dataset varies
Answers the questions: What, Where, Who, When?
Four main groupings: cross-sectional data, pooled cross-sectional data, time-series data, and panel data
Data Types
Cross-Sectional Data
Data that provide a snapshot of information at one fixed point in time.
An example:
Data Types
Pooled Cross-Sectional Data
A combination of two or more unrelated cross-sectional datasets merged into one.
An example
More Data Types
Time-series data:
Data that exhibit only variation in time
An example
More Data Types
Panel data
Same cross-sectional units over multiple points in time
An example
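To contrast the four groupings side by side, here is a minimal Python sketch. The store names, years, and sales figures are made up, and the `classify` helper is a hypothetical simplification: it treats data as a panel when the same units appear in every period and as pooled cross-sections otherwise.

```python
# Hypothetical sales records: (store, year, sales). All values are made up.
records = [
    ("A", 2017, 100), ("B", 2017, 150), ("C", 2017, 120),
    ("A", 2018, 110), ("B", 2018, 160), ("C", 2018, 115),
]

def classify(rows):
    """Classify rows of (unit, period, value) by how the information varies."""
    units = {u for u, _, _ in rows}
    periods = {p for _, p, _ in rows}
    if len(periods) == 1:
        # Many units (or one), a single fixed point in time.
        return "cross-sectional"
    if len(units) == 1:
        # One unit, variation only in time.
        return "time-series"
    # Several units and several periods: compare the units seen in each period.
    units_by_period = [{u for u, p, _ in rows if p == period} for period in periods]
    same_units = all(s == units_by_period[0] for s in units_by_period)
    return "panel" if same_units else "pooled cross-sectional"

cross_section = [r for r in records if r[1] == 2017]   # snapshot of one year
time_series = [r for r in records if r[0] == "A"]      # one store over time
pooled = [("A", 2017, 100), ("B", 2017, 150),          # different stores
          ("D", 2018, 90), ("E", 2018, 130)]           # sampled in each year

for name, rows in [("records", records), ("cross_section", cross_section),
                   ("time_series", time_series), ("pooled", pooled)]:
    print(name, "->", classify(rows))
```

The unit of observation drives the result: the same six records are a panel as a whole, a cross-section when restricted to one year, and a time series when restricted to one store.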
Data Generating Process (DGP)
Data Generating Process
The underlying mechanism that produces the pieces of information contained in a dataset
Steps for DGP
Establish both formal and informal DGP
Understand what variables are important
Create a representative statistical model
Collect and analyze relevant variables and perform simple tests
Basic Uses of Data Analysis for Business
Categories include:
Queries
Pattern discovery
Causal inference
Queries
Any request for information from a database
Descriptive Statistics
Quantitative measures meant to summarize and interpret properties of a dataset
Pivot Table
A tool for data summarization that enables different views of the underlying dataset.
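As a rough illustration of these three ideas, the Python sketch below runs a query, computes a few descriptive statistics, and builds a small pivot table. The transactions are made-up data, and the nested-dictionary pivot is only one simple way to represent such a table.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical transactions: (region, product, revenue). Values are made up.
sales = [
    ("East", "Coffee", 120.0),
    ("East", "Tea", 80.0),
    ("West", "Coffee", 200.0),
    ("West", "Tea", 50.0),
    ("East", "Coffee", 100.0),
]

# Query: a request for information, here all Coffee sales in the East.
east_coffee = [row for row in sales if row[0] == "East" and row[1] == "Coffee"]

# Descriptive statistics: quantitative summaries of the revenue column.
revenues = [revenue for _, _, revenue in sales]
summary = {"mean": mean(revenues), "median": median(revenues), "max": max(revenues)}

# Pivot table: total revenue with regions as rows and products as columns.
pivot = defaultdict(lambda: defaultdict(float))
for region, product, revenue in sales:
    pivot[region][product] += revenue

print(east_coffee)
print(summary)
for region, row in pivot.items():
    print(region, dict(row))
```

Swapping the roles of region and product in the last loop gives a different view of the same underlying dataset, which is the point of a pivot table.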
Pattern Discovery
Pattern
Any distinct relationship between observations within a dataset
Pattern discovery
The process of identifying distinctive relationships between observations in a dataset
Data mining
Pattern discovery, typically in large datasets
Pattern Discovery
Types of Pattern Discovery
Association analysis
Looking for conditional probabilities to determine relationships between two or more variables
Cluster analysis
Grouping observations according to some measure of similarity
Outlier detection
Identifying small subsets of observations, if they exist, that contain information far different from the vast majority of the observations in the dataset
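Two of these techniques can be sketched in a few lines of Python. The baskets and daily sales figures are made up. The outlier rule (flagging values more than five median absolute deviations from the median) is an assumption chosen for illustration, not a method taken from the chapter.

```python
from statistics import median

# Hypothetical shopping baskets (made-up data).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
]

def conditional_probability(baskets, given, target):
    """Association analysis: estimate P(target in basket | given in basket)."""
    with_given = [b for b in baskets if given in b]
    if not with_given:
        return None  # the conditioning item never occurs in the sample
    return sum(target in b for b in with_given) / len(with_given)

def find_outliers(values, k=5.0):
    """Outlier detection: flag values more than k MADs away from the median."""
    center = median(values)
    mad = median(abs(v - center) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if abs(v - center) > k * mad]

p_butter_given_bread = conditional_probability(baskets, "bread", "butter")
daily_sales = [100, 102, 98, 101, 99, 250]  # hypothetical daily sales

print(p_butter_given_bread)        # 2 of the 3 bread baskets contain butter
print(find_outliers(daily_sales))  # the 250 day stands apart from the rest
```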
Examples of Pattern Discovery
Example of Outlier Detection and Cluster Analysis
Example of Association Analysis: Scatter Plot on Profit & Price
Causal Inference
The process of establishing a causal relationship between a variable(s) representing a cause and a variable(s) representing an effect, where a change in the cause variable results in changes in the effect variable
Causal Inference
Direct: A change in the causal variable, X, directly causes a change in the effect variable, Y
Indirect: A change in X causes a change in Y, but only through its impact on a third variable, Z
Use of Causal Inference
Causal inference occurs in two ways:
Using experimentation
Econometric models
Causal inference has two important applications:
Prediction
Campaign evaluation
Data Analysis for the Past, Present, and Future
Lag information
Information about past outcomes
Typically contains information on key performance indicators (KPIs), or variables that are used to help measure firm performance
Designed to answer the questions, “What happened?” and “What is happening?”
Lag information can be generated by queries, pattern discovery, and causal inference
Examples of Lag Information
Reports
Any structured presentation of the information in a dataset
Scorecards
Any structured assessment of variables of interest, typically KPIs, against a given benchmark
Dashboards
A graphical presentation of the current standing and historical trends for variables of interest, typically KPIs
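A scorecard, as defined above, can be sketched as a comparison of each KPI against its benchmark. The KPI names, values, and benchmarks below are hypothetical, and the "higher is better" flag is an assumption added so the comparison runs in the right direction.

```python
# Hypothetical KPIs and benchmarks; names and values are made up for illustration.
actuals = {"revenue": 1_050_000, "churn_rate": 0.08, "on_time_delivery": 0.93}
benchmarks = {
    "revenue": (1_000_000, "higher_is_better"),
    "churn_rate": (0.05, "lower_is_better"),
    "on_time_delivery": (0.95, "higher_is_better"),
}

def scorecard(actuals, benchmarks):
    """Assess each KPI against its benchmark and record whether it was met."""
    rows = {}
    for kpi, (target, direction) in benchmarks.items():
        value = actuals[kpi]
        if direction == "higher_is_better":
            met = value >= target
        else:
            met = value <= target
        rows[kpi] = {"actual": value, "benchmark": target, "met": met}
    return rows

card = scorecard(actuals, benchmarks)
for kpi, row in card.items():
    status = "met" if row["met"] else "missed"
    print(f"{kpi}: {row['actual']} vs {row['benchmark']} ({status})")
```

Because it reports outcomes that have already occurred, this kind of summary is lag information.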
Report Example
Dashboard Example
Scorecard Example
Lead Information
Lead Information
Information that provides insights about the future
Designed to answer the question, “What is going to happen?”
It helps firms form expectations and plan strategic moves for the future.
Lead information is not generally presented in a standardized format
Predictive Analytics and Lead Information
Predictive analytics is data analysis designed to provide lead information
Two ways predictive analytics can predict the future
Active prediction
Passive Prediction
Passive Prediction
Passive prediction uses predictive analytics to make predictions based on actual or hypothetical data, where no variables are exogenously altered.
Exogenously altered – a variable in a dataset that changes due to factors outside the data-generating process that are independent of all other variables within the data-generating process
Examples: weather forecasting, predicting which customers are likely to drop a service, etc.
Pattern discovery (data mining), when used to make predictions, is generally used for passive prediction
Model fit – the basis on which analysts choose among competing models for passive prediction
Active Prediction
Active prediction uses predictive analytics to make predictions based on actual or hypothetical data, for which one or more variables are exogenously altered.
Making active predictions requires a causal relationship between variable X and variable Y.
If a change in X affects Y, this occurs due to a causal relationship between the two.
Active Prediction for Business Strategy Formation
Predicting an outcome for alternative strategies requires the application of active prediction
To accurately predict an outcome for a range of competing strategies, you must establish the causal effects of those strategies on that outcome
The leap from correlation to causality is a large one, and can lead to grossly incorrect predictions
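The gap between correlation and causality can be made concrete with a small Python sketch. It assumes a purely hypothetical data-generating process in which a third variable Z drives both X and Y, while X has no causal effect on Y. A relationship fitted to observed data works for passive prediction but fails once X is exogenously altered.

```python
def generate(z, x_override=None):
    """Hypothetical DGP: Z drives both X and Y; X has no causal effect on Y."""
    x = z if x_override is None else x_override  # X normally just tracks Z
    y = 2 * z                                    # Y depends on Z only
    return x, y

# Observed data: no variable is exogenously altered.
observed = [generate(z) for z in range(1, 6)]

# Fit Y = slope * X by least squares through the origin.
slope = sum(x * y for x, y in observed) / sum(x * x for x, _ in observed)

# Passive prediction: a new observation arises from the same process (Z = 6).
x_new, y_new = generate(6)
passive_prediction = slope * x_new

# Active prediction: X is exogenously set to 10 while Z stays at 6.
x_set, y_actual = generate(6, x_override=10)
naive_active_prediction = slope * x_set

print(passive_prediction, y_new)          # the fitted correlation predicts well
print(naive_active_prediction, y_actual)  # the same correlation now misleads
```

In this sketch X and Y are perfectly correlated in the observed data, yet changing X as a strategy would leave Y unchanged, which is why active prediction needs an established causal effect rather than a good model fit.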
Reasoning with Data
Chapter 2
Learning Objectives
Define reasoning.
Execute deductive reasoning.
Explain an empirically testable conclusion.
Execute inductive reasoning.
Differentiate between deductive and inductive reasoning.
Explain how inductive reasoning can be used to evaluate an assumption.
Describe selection bias in inductive reasoning.
What is Reasoning?
Reasoning is the process of forming conclusions, judgments, or inferences from facts or data
Reasoning and logic are often used interchangeably
Logic is a description of the rules and/or steps behind the reasoning process
Two Arguments
Argument 1:
The company’s profits are up more than 10% over the past year. An increase in profits of 10% is the result of excellent management. You were the manager over the past year. Therefore, I conclude that you engaged in excellent management last year.
Argument 2:
Ten of your 300 employees came to me with complaints about your management. They indicated that you treated them unfairly by not giving them a raise they deserved. Therefore, I conclude that all of your employees are disgruntled with your management.
Understanding Reasoning
In presenting the two arguments, the goal is not to make a definitive decision about which you believe (if either)
The goal is to think about and distinguish different “lines” of reasoning
In distinguishing between the different types of reasoning, you will be able to establish why you believe or question the claims made in the two arguments
Two Major Types of Reasoning
Reasoning
Deductive Reasoning
Inductive Reasoning
Both play an important role in interpreting and drawing conclusions from data analysis
Deductive Reasoning
Deductive Reasoning
Goes from the general to the specific
Also known as top-down logic
Seeks to prove statements of the form “If A, then B”
Deductive Reasoning
Such reasoning always implies three underlying components: assumptions (“If A”), methods of proof (“then”), and conclusions (“B”)
Deductive Reasoning
The purest applications of deductive reasoning are in the field of mathematics
Two of the most used approaches are direct proofs and transposition
Direct proofs
Proof that begins with assumptions, explains methods of proof, and states the conclusion(s)
Transposition
Any time a group of assumptions implies a conclusion, then it is also true that any time the conclusion does not hold, at least one of the assumptions must not hold
Direct Proof
Let’s prove the following statement by direct proof:
If X and Y are odd numbers, then their sum (X + Y) is an even number
An Example:
If X = 5(odd) and Y = 9(odd), then their sum X + Y = 14 is an even number
Failing to find a contradiction is not the same as proving that a statement is generally true
Direct Proof: A Mathematical Approach
If X and Y are odd numbers, then their sum (X + Y) is an even number
X and Y are odd numbers
If X is an odd number, then X can be written as X = 2K + 1, where K is an integer. (Example: X = 13 = (2 × 6) + 1)
If Y is an odd number, then Y can be written as Y = 2M + 1, where M is an integer. (Example: Y = 23 = (2 × 11) + 1)
X+Y=(2K+1)+(2M+1)=2K+2M+2=2(K+M+1)
K+M+1 is an integer so X+Y is 2 times an integer
Any number that is 2 times an integer is divisible by 2
This means X+Y is even
Direct Proof: Common Sense Approach
“If McDonald’s offers breakfast all day, their revenues will increase.”
McDonald’s stores offer breakfast all day.
The addition of breakfast during lunch/dinner hours implies more choices.
Customers already choosing McDonald’s during lunch/dinner hours can continue buying the same meals at McDonald’s.
Customers not choosing McDonald’s during lunch/dinner hours may start eating at McDonald’s.
Retaining current customers and adding new ones, McDonald’s revenues will increase overall.
Transposition
While direct proofs are sufficient to prove a point logically, an alternative approach, transposition, may be more effective
Transposition
Is the equivalence between the statements “If A, then B” and “If not B, then not A”
Any time a group of assumptions implies a conclusion, then it is also true that any time the conclusion does not hold, then at least one of the assumptions must not hold
Transposition
“If A, then B” AND “If not B, then not A”
Assumptions (A)
Conclusions (B)
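The equivalence behind transposition can be checked mechanically with a truth table. The short Python sketch below enumerates every combination of truth values for A and B. For contrast it also tests the converse, “If B, then A”, which is a different statement and is not equivalent. The helper name `implies` is just an illustrative choice.

```python
from itertools import product

def implies(p, q):
    """Truth value of the statement 'If p, then q'."""
    return (not p) or q

rows = []
for a, b in product([True, False], repeat=2):
    rows.append({
        "A": a,
        "B": b,
        "if A then B": implies(a, b),
        "if not B then not A": implies(not b, not a),  # the transposition
        "if B then A": implies(b, a),                  # the converse
    })

# Transposition: the two statements agree in every row of the truth table.
transposition_holds = all(r["if A then B"] == r["if not B then not A"] for r in rows)

# The converse disagrees with the original in at least one row.
converse_equivalent = all(r["if A then B"] == r["if B then A"] for r in rows)

print(transposition_holds)  # True
print(converse_equivalent)  # False
```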
Transposition: A Mathematical Approach
Prove the statement: If X² is even, then X is even
Suppose X is not an even number; it is instead an odd number
If X is an odd number, then X = 2K + 1, where K is an integer
X² = (2K + 1)² = 4K² + 4K + 1
4K² + 4K = 4(K² + K) and so is divisible by 2
4K² + 4K is an even number
X² = 4(K² + K) + 1 is an even number plus 1, meaning it is an odd number
Transposition
The statement was: If X² is even, then X is even
Using transposition, the opposite of the conclusion is used to prove the opposite of the assumption: if X is odd, then X² is odd, so X² cannot be even
Transposition can also be used without using mathematics to prove statements like “If A, then B”.
Transposition can be particularly effective if an assumption seems indisputably obvious.
Transposition: An Example
Prove the statement: “If McDonald’s stores offer breakfast all day, revenue will increase”
McDonald’s stores’ revenues will not increase
This means total revenues from current and new customers will not increase
This means either there will be no new customers or revenues from current customers will decrease
This means there could not have been an expansion in the menu
McDonald’s stores do not offer breakfast all day
Direct Proof and Transposition
Direct Proof
State assumptions
Explain methods of proof (mathematics, common sense, etc.)
State conclusions
Transposition
Assume the opposite of the conclusion
Explain methods of proof (mathematics, common sense, etc.)
State assumption(s) that is (are) violated (not A)
Deductive Reasoning
Used commonly in the application of law
If there is disagreement with a conclusion there are two possible sources:
The method of proof, OR
The assumption
There are two ways of resolving disputes about assumptions
Show robustness: the persistent accuracy of a conclusion despite variation in the associated assumption(s) within the context of a deductive argument
Assess consistency with a collected dataset
Empirically Testable Conclusions
An empirically testable conclusion is a conclusion whose validity can be meaningfully tested using observable data.
Example:
A banana company’s management staff is divided into two groups over the product’s placement in a major grocery store chain.
Group 1 believes that changing the current location will increase sales.
Group 2 believes that the current location is good enough.
Empirically Testable Conclusions
The company has sales data for the current location.
The company chooses to move its product to a new location and collects sales data there.
Now the company can meaningfully test the validity of the management’s competing conclusions.
Making the actual decision about the validity of an empirically testable conclusion based on observable data is an application of inductive reasoning
Inductive Reasoning
Inductive reasoning
Reasoning that goes from the specific to the general; bottom-up logic
Population
The entire set of potential observations about which we want to learn
Data sample
A subset of the population that is collected and observed
Inductive Reasoning
Businesses regularly collect data samples and apply inductive reasoning to draw conclusions about the population.
A conclusion reached through inductive reasoning carries a degree of support (also called inductive probability).
Degree of support is also called the strength of the inductive argument.
Example: if we are 50% confident about the conclusion, then the degree of support is 50%.
Degrees of Support
Two Types of Degrees of Support
Both play an important role in interpreting and drawing conclusions from data analysis
Subjective degree of support: based on opinion and lacking a statistical foundation
Objective degree of support: has a statistical foundation and is thus more credible than a subjective degree of support
Evaluating Assumptions
Through deductive reasoning, an empirically testable conclusion is made
Collect a data sample
Test the conclusion by comparing the observed outcomes in the data samples to their corresponding probabilities
Use inductive reasoning to decide whether the conclusion passes or fails
If it fails, transposition implies we must reject at least one of the assumptions
If it passes, we do not reject the assumptions
Inductive Reasoning for Evaluating Assumptions
Selection Bias in Inductive Reasoning
Improper use of inductive reasoning may lead to inaccurate or biased conclusions
The data-generating process is typically the source of the bias
Survey questions constructed in a leading way
Confirmation bias is the tendency to confirm a claim
Predictable patterns are discovered
Predictable-world bias is the tendency to find order when none exists, and occurs when people “read too much” into perceived patterns from random data
Selection Bias
Selection bias
The act of drawing conclusions about a population using a selected data sample, without accounting for the means of selection
There are two common types, collector selection bias and member selection bias:
Collector selection bias occurs when the collector selects the members of the data sample in a systematic way
Availability bias, a form of collector selection bias, occurs when the collector of the data sample selects the members of the data sample according to what is most readily available
Member selection bias occurs when potential members of the data sample self-select into, or out of, the sample
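Member selection bias can be illustrated with a short Python sketch that echoes Argument 2, in which 10 of 300 employees came forward with complaints. The satisfaction scores are hypothetical, and the fixed seed is only there to make the random sample reproducible.

```python
import random
from statistics import mean

# Hypothetical population: satisfaction scores (1-10) for 300 employees.
# 10 employees are unhappy (score 2); the other 290 are content (score 8).
population = [2] * 10 + [8] * 290

# Member selection bias: only unhappy employees self-select into the sample.
self_selected = [score for score in population if score <= 3]

# A random sample gives every employee the same chance of being observed.
rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
random_sample = rng.sample(population, 30)

population_mean = mean(population)
biased_error = abs(mean(self_selected) - population_mean)
random_error = abs(mean(random_sample) - population_mean)

print(population_mean)      # 7.8
print(mean(self_selected))  # 2
print(random_error < biased_error)
```

Generalizing from the self-selected sample would suggest that every employee is disgruntled, even though the hypothetical population is overwhelmingly content. The conclusion ignores how the sample members were selected.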