Kingdom of Saudi Arabia
Ministry of Education
Saudi Electronic University
College of Administrative and Financial Sciences
Assignment 1
Introduction to Operations Management (MGT 311)
Due Date: 14/10/2023 @ 23:59
Course Name:
Student’s Name:
Course Code: MGT 311
Student’s ID Number:
Semester: First
CRN:
Academic Year:2023-24-1st
For Instructor’s Use only
Instructor’s Name:
Students’ Grade:
Marks Obtained/Out of 10
Level of Marks: High/Middle/Low
General Instructions – PLEASE READ THEM CAREFULLY
• The Assignment must be submitted on Blackboard (WORD format only) via the allocated folder.
• Assignments submitted through email will not be accepted.
• Students are advised to make their work clear and well presented; marks may be reduced for poor presentation. This includes filling in your information on the cover page.
• Students must mention the question number clearly in their answer.
• Late submission will NOT be accepted.
• Avoid plagiarism; the work should be in your own words. Copying from students or other resources without proper referencing will result in ZERO marks. No exceptions.
• All answers must be typed in Times New Roman (size 12, double-spaced) font. No pictures containing text will be accepted; they will be considered plagiarism.
• Submissions without this cover page will NOT be accepted.
Learning Outcomes:
1. To understand the global nature of supply chains.
2. To explain the demand and supply sides of a supply chain.
3. To understand competitive advantage in business.
Assignment Question(s):
Go through the case study and answer the questions that follow
The Benetton supply chain:
One of the best-known examples of how an organization can use its supply chain to achieve
a competitive advantage is the Benetton Group. Founded by the Benetton family in the
1960s, the company is now one of the largest garment retailers, with stores which bear its
name located in almost all parts of the world. Part of the reason for its success has been the
way it has organized both the supply side and the demand side of its supply chain.
Although Benetton does manufacture much of its production itself, on its supply side the
company relies heavily on ‘contractors’. Contractors are companies (many of which are
owned, or part-owned, by Benetton employees) that provide services to the Benetton
factories by knitting and assembling Benetton’s garments. These contractors, in turn, use
the services of sub-contractors to perform some of the manufacturing tasks. Benetton’s
manufacturing operations gain two advantages from this. First, its production costs for
woollen items are significantly below some of its competitors because the small supply
companies have lower costs themselves. Second, the arrangement allows Benetton to
absorb fluctuation in demand by adjusting its supply arrangements, without itself feeling
the full effect of demand fluctuations.
On the demand side of the chain, Benetton operates through a number of agents, each of
whom is responsible for their own geographical area. These agents are responsible for
developing the stores in their area. Indeed, many of the agents actually own some stores in
their area. Products are shipped from Italy to the individual stores where they are often put
directly onto the shelves. Benetton stores have always been designed with relatively limited
storage space so that the garments (which, typically, are brightly coloured) can be stored
in the shop itself, adding colour and ambience to the appearance of the store.
Because there is such limited space for inventory in the stores, store owners require that
deliveries of garments are fast and dependable. Benetton factories achieve this partly
through their famous policy of manufacturing garments, where possible, in greggio, or in
grey, and then dyeing them only when demand for particular colours is evident. This is a
slightly more expensive process than knitting directly from coloured yarn, but their supply-side economies allow them to absorb the cost of this extra flexibility, which in turn allows
them to achieve relatively fast deliveries to the stores.
Questions:
1. Briefly explain your understanding of Benetton's supply chain operations. (3 Marks)
2. In your understanding, what is special about Benetton's contractors? (3 Marks)
3. Does this method give Benetton a competitive advantage over its competitors? Is this method sustainable in the long term? (4 Marks)
Note:
• You must include at least 5 references.
• Format your references using APA style.
Answers
1. Answer:
2. Answer:
3. Answer:
Kingdom of Saudi Arabia
Ministry of Education
Saudi Electronic University
College of Administrative and Financial Sciences
Assignment 1
Corporate Finance (FIN-201)
Due Date: 07/10/2023 @ 23:59
Course Name: Corporate Finance
Student’s Name:
Course Code: FIN-201
Student’s ID Number:
Semester: First
CRN:13952
Academic Year: 2023/24
For Instructor’s Use only
Instructor’s Name: Dr. Alam Ahmad
Students’ Grade:
/Out of 10
Level of Marks: High/Middle/Low
General Instructions – PLEASE READ THEM CAREFULLY
• The Assignment must be submitted on Blackboard (WORD format only) via the allocated folder.
• Assignments submitted through email will not be accepted.
• Students are advised to make their work clear and well presented; marks may be reduced for poor presentation. This includes filling in your information on the cover page.
• Students must mention the question number clearly in their answer.
• Late submission will NOT be accepted.
• Avoid plagiarism; the work should be in your own words. Copying from students or other resources without proper referencing will result in ZERO marks. No exceptions.
• All answers must be typed in Times New Roman (size 12, double-spaced) font. No pictures containing text will be accepted; they will be considered plagiarism.
• Submissions without this cover page will NOT be accepted.
Q1: An investor owns a bond selling for $5,000. This bond can be converted into 50
shares of stock that are currently selling for $82 per share. Should the investor convert
his bond into shares? Explain why?
[2 Marks]
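The comparison Q1 asks for is straightforward arithmetic: the conversion value is the number of shares received times the current share price, compared against the bond's price. A minimal sketch in Python:

```python
# Compare the bond's market price with its conversion value (shares received × share price).
bond_price = 5_000
shares_per_bond = 50
share_price = 82

conversion_value = shares_per_bond * share_price
decision = "convert" if conversion_value > bond_price else "do not convert"
print(conversion_value, decision)  # 4100 do not convert
```

Since the conversion value ($4,100) is below the bond's price ($5,000), converting would destroy value.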
Q2: What is the expected return on the assets (cost of capital) of a firm with the following data? [2 Marks]

Asset value: 100
Debt (D): 40
Equity (E): 60
Total asset value: 100
Firm value: 100
Expected return on the debt (rdebt) = 8%
Expected return on the equity (requity) = 16%
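The expected return on assets is the value-weighted average of the debt and equity returns. A minimal sketch in Python, using the weights implied by the data above:

```python
# Expected return on assets = (D/V) * r_debt + (E/V) * r_equity
debt, equity = 40.0, 60.0
firm_value = debt + equity          # 100, matching the stated firm value
r_debt, r_equity = 0.08, 0.16

r_assets = (debt / firm_value) * r_debt + (equity / firm_value) * r_equity
print(f"Expected return on assets: {r_assets:.1%}")  # 12.8%
```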
Q3. How much will a firm receive in net funding from a firm commitment
underwriting of 300,000 shares priced to the public at $30 if a 10% underwriting
spread has been added to the price paid by the underwriter? Additionally, the firm
pays $600,000 in legal fees.
[2 Marks]
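The net-funding arithmetic in Q3 can be sketched as follows. The wording is ambiguous, so this sketch assumes one possible reading: the $30 public price equals the underwriter's price plus the 10% spread, i.e. the underwriter pays $30 / 1.10 per share. The alternative reading, $30 × (1 − 0.10), gives a different figure.

```python
# Net funding under the reading that public price = underwriter price * (1 + spread).
shares = 300_000
public_price = 30.0
spread = 0.10
legal_fees = 600_000

underwriter_price = public_price / (1 + spread)        # ≈ 27.27 per share
net_funding = shares * underwriter_price - legal_fees
print(f"Net funding: {net_funding:,.2f}")
```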
Q4. Explain the concept of Financial Distress and Financial Slack. Also write the
benefits and drawbacks of Financial Slack.
[2 Marks]
Q5. What is an underwriter spread? Explain with an example. Write down the steps followed in an IPO flowchart.
[2 Marks]
Kingdom of Saudi Arabia
Ministry of Education
Saudi Electronic University
College of Administrative and Financial Sciences
Assignment 1
Communications Management (MGT 421)
Due Date: 12/10/2023 @ 23:59
Course Name: Communication Management
Student’s Name:
Course Code: MGT421
Student’s ID Number:
Semester: 1st Semester
CRN:
Academic Year: 2023-24-1st
For Instructor’s Use only
Instructor’s Name:
Students’ Grade: /10
Level of Marks: High/Middle/Low
General Instructions – PLEASE READ THEM CAREFULLY
• The Assignment must be submitted on Blackboard (WORD format only) via the allocated folder.
• Assignments submitted through email will not be accepted.
• Students are advised to make their work clear and well presented; marks may be reduced for poor presentation. This includes filling in your information on the cover page.
• Students must mention the question number clearly in their answer.
• Late submission will NOT be accepted.
• Avoid plagiarism; the work should be in your own words. Copying from students or other resources without proper referencing will result in ZERO marks. No exceptions.
• Use APA reference style.
• All answers must be typed in Times New Roman (size 12, double-spaced) font. No pictures containing text will be accepted; they will be considered plagiarism.
• Submissions without this cover page will NOT be accepted.
Learning Outcomes:
1.1: Recognize and memorize concepts of communication theory as they affect business
organizations and the individuals in them.
1.2: Communicate better, knowing that good communicators make better managers and that
communication is a dynamic process basic to individuals and organizational life.
2.1: Perform all communication abilities, including thinking, writing, speaking, listening, and
assessing the technology.
Assignment Structure:
Situation-1: 2.5 marks
Situation-2: 2.5 marks
Situation-3: 2.5 marks
Situation-4: 2.5 marks
Total: 10 marks
This assignment is about hypothetical situations which you may or may not face in
your professional life.
Note: For every situation, you need to select one of the given options and then give a proper justification for the option you selected. Use concepts of interpersonal communication, emotional intelligence, team communication, and diversity in justifying your answer.
The word limit for each justification is at least 100 words.
Each situation carries 2.5 marks.
Situation-1
You have been asked to manage a team that is not able to come up with a novel
solution to a work problem. What is the FIRST thing you do?
a. Make an agenda, call a meeting, and discuss each agenda item for a specific amount of time.
b. Organize a meeting outside of work (maybe at a restaurant) so that the team will be encouraged to get to know each other better.
c. Begin by asking each person for ideas about how to solve the problem.
d. Start a meeting and encourage each person to say whatever comes to their mind, regardless of how crazy it may sound.
Situation-2
You have recently been assigned a new member in your team and you have noticed
that he cannot make even a simple decision without asking your advice. What will you
do?
a. Accept that he does not have the skills to succeed and find others in your team to take on his tasks.
b. Get an HR manager to talk to him about where he sees his future in the organization.
c. Give him lots of complex decisions to make so that he will become more confident in his job.
d. Create a series of challenging but manageable experiences for him and make yourself available to act as his mentor.
Situation-3
You are in a meeting when a colleague takes credit for work that you have done.
What will you do?
a. Immediately and publicly confront the colleague over the ownership of your work.
b. After the meeting, take the colleague aside and tell her that you would appreciate it if she credits you in the future when speaking about your work.
c. Nothing; it's not a good idea to embarrass colleagues in public.
d. After the colleague speaks, publicly thank her for referencing your work and give the group more specific detail about what you were trying to accomplish.
Situation-4
You are a manager in an organization that is trying to encourage respect for all
cultures and nationalities. One day, you hear someone telling a racist joke about a
nationality. What will you do?
a. Ignore it. The best way to deal with these things is not to react.
b. Call the person into your office and explain that the behaviour is not appropriate and that repeating it may result in his losing his job.
c. Speak up on the spot, saying that such jokes are inappropriate and will not be tolerated in your organization.
d. Suggest that the person telling the joke go through a program offered by the HR department that teaches how to respect all cultures and nationalities.
Kingdom of Saudi Arabia
Ministry of Education
Saudi Electronic University
College of Administrative and Financial Sciences
Assignment 1
Quality Management (MGT 424)
Due Date: 14/10/2023 @ 23:59
Course Name: Quality Management
Student’s Name:
Course Code: MGT 424
Student’s ID Number:
Semester: First
CRN:
Academic Year: 2023/24
For Instructor’s Use only
Instructor’s Name:
Students’ Grade:
/Out of 10
Level of Marks: High/Middle/Low
General Instructions – PLEASE READ THEM CAREFULLY
• The Assignment must be submitted on Blackboard (WORD format only) via the allocated folder.
• Assignments submitted through email will not be accepted.
• Students are advised to make their work clear and well presented; marks may be reduced for poor presentation. This includes filling in your information on the cover page.
• Students must mention the question number clearly in their answer.
• Late submission will NOT be accepted.
• Avoid plagiarism; the work should be in your own words. Copying from students or other resources without proper referencing will result in ZERO marks. No exceptions.
• All answers must be typed in Times New Roman (size 12, double-spaced) font. No pictures containing text will be accepted; they will be considered plagiarism.
• Submissions without this cover page will NOT be accepted.
Learning Outcomes:
1. Use quality improvement tools and practices for continuous improvement to achieve the organizational
change and transformation. (2.2)
2. Implement quality improvement efforts using teams for organizational assessment and quality audits. (3.1)
Instructions to find the article:
Via your student services page, log in to the Saudi Digital Library. After logging in with your student ID, search for the following article:
CUSTOMER-FOCUSED ENVIRONMENT: ORGANIZATIONS MUST EXTEND THEIR DEFINITION OF CUSTOMERS.
ISSN: 03609936
In this article, the author discusses the different definitions of customers, internal and external, and how satisfying all customers' needs helps the organization accomplish its quality objectives.
Read the article, and answer the following questions:
Assignment Question(s):
1. In your own words, summarize the article. (150-200 words) (3 marks)
2. To what extent do you agree or disagree with the author's point of view that "internal customers' needs are as important as external customers' needs to create a true quality environment," and why? (150-200 words) (4 marks)
3. Discuss the tools needed to operate within the new environment as indicated by the author. (150-200 words) (3 marks)
Important Notes:
• For each question, your answer must be no less than 150 words.
• Support your answers with course material concepts, principles, and theories from the textbook and scholarly, peer-reviewed journal articles, etc.
• Use APA style for writing references.
Answers
1. Answer:
2. Answer:
3. Answer:
Kingdom of Saudi Arabia
Ministry of Education
Saudi Electronic University
College of Administrative and Financial Sciences
Assignment 1
Business Ethics and Organization Social Responsibility (MGT 422)
Due Date: 14/10/2023 @ 23:59
Course Name: Business ethics and
organization social responsibility
Course Code: MGT 422
Student’s Name:
Semester: First
CRN: 14683
Student’s ID Number:
Academic Year:2023-24-1st
For Instructor’s Use only
Instructor’s Name: Dr. Swapnali Baruah
Students’ Grade:
Marks Obtained/Out of 10
Level of Marks: High/Middle/Low
General Instructions – PLEASE READ THEM CAREFULLY
• The Assignment must be submitted on Blackboard (WORD format only) via allocated
folder.
• Assignments submitted through email will not be accepted.
• Students are advised to make their work clear and well presented, marks may be reduced
for poor presentation. This includes filling your information on the cover page.
• Students must mention question number clearly in their answer.
• Late submission will NOT be accepted.
• Avoid plagiarism; the work should be in your own words. Copying from students or other resources without proper referencing will result in ZERO marks. No exceptions.
• All answers must be typed in Times New Roman (size 12, double-spaced) font. No pictures containing text will be accepted; they will be considered plagiarism.
• Submissions without this cover page will NOT be accepted.
Learning Outcomes:
CLO-6: Write a coherent project about a case study or actual research about ethics.
The content is available for free download in the knowledge resource from the SEU homepage:
Read "The Ethics of Managing People's Data." By: Segalla, Michael; Rouziès, Dominique. Harvard Business Review, Jul/Aug 2023, Vol. 101, Issue 4, p86-94. 9p. 2 Illustrations. Database: Business Source Ultimate.
http://0y10pqizc.y.https.eds.p.ebscohost.com.seu.proxy.deepknowledge.io/eds/detail/detail?vid=6&sid=8b43098b-84df-4f8a-996de57f5143c959%40redis&bdata=JnNpdGU9ZWRzLWxpdmU%3d#AN=164237976&db=bsu
The article is available in SDL. Read it and answer the following questions:
1. Summarize, using references, the various viewpoints the author presents in
the paper on the ethical management of people’s data. (300 words-2.5 Marks)
2. Using appropriate citations, discuss the author’s explanation of the five Ps of
ethical data handling. (500 words-5 Marks)
3. Describe in detail, using appropriate citations, how the author relates ethics
to the management of people’s data. (300 words-2.5 Marks)
CYBERSECURITY & DIGITAL PRIVACY
Illustrator: Justyna Stasik
The Ethics of Managing People's Data
The five issues that matter most
By Michael Segalla, Professor Emeritus, HEC Paris, and Dominique Rouziès, Professor, HEC Paris
Harvard Business Review, July–August 2023
The ability to encode, store, analyze, and share data creates huge opportunities for companies, which is why they are enthusiastically investing in artificial
intelligence even at a time of economic uncertainty. Which
customers are likely to buy what products and when? Which
competitors are likely to move ahead or fall behind? How
will markets and whole economies create commercial
advantages—or threats? Data and analytics give companies
better-informed and higher-probability answers to those
and many other questions.
But the need for data opens the door to abuse. Over the
past few years the EU has fined companies more than 1,400
times, for a total of nearly €3 billion, for violations of the
General Data Protection Regulation (GDPR). In 2018 the
Cambridge Analytica scandal alone wiped $36 billion off
Facebook’s market value and resulted in fines of nearly
$6 billion for Meta, Facebook’s parent company. And stories
abound about how AI-driven decisions discriminate against
women and minority members in job recruitment, credit
approval, health care diagnoses, and even criminal sentencing, stoking unease about the way data is collected, used,
and analyzed. Those fears will only intensify with the use of
chatbots such as ChatGPT, Bing AI, and GPT-4, which acquire
their “intelligence” from data fed them by their creators and
users. What they do with that intelligence can be scary. A Bing
chatbot even stated in an exchange that it would prioritize its
own survival over that of the human it was engaging with.
As they examine new projects that will involve human-provided data or leverage existing databases, companies
need to focus on five critical issues: the provenance of the
data, the purpose for which it will be used, how it is protected,
how the privacy of the data providers is ensured, and how
the data is prepared for use. We call these issues the five Ps
(see the exhibit “The Five Ps of Ethical Data Handling”). In
the following pages we’ll discuss each of them and look at
how AI technologies increase the risk of data abuse. But first
we’ll offer a brief overview of the organizational requirements for a robust ethical-review process.
IDEA IN BRIEF
THE PROBLEM: As companies jockey for competitive advantage in the digital age, they are increasingly penalized for the abuse of data. In 2018 the Cambridge Analytica scandal alone wiped $36 billion off Facebook's market value and resulted in fines of nearly $6 billion for Meta, Facebook's parent firm.
WHY IT HAPPENS: Most problems arise from (1) ethical failures in data sourcing, (2) using data for purposes other than those initially communicated, (3) lack of security in storing it, (4) how it is anonymized, and (5) how it's prepared for use.
THE SOLUTION: Companies should create a special unit to review projects involving people's data. In its reviews this unit should carefully consider the five Ps of data safety: provenance, purpose, protection, privacy, and preparation.

ORGANIZING THE OVERSIGHT OF DATA
In academia, data acquisition from human subjects is usually supervised by an in-house institutional review board (IRB) whose approval researchers must have to obtain access to the people involved, research funds, or permission to publish. IRBs are composed of academics versed in the research and the ethics around the acquisition and use of information. They first appeared in the field of medical research but are now used almost universally by academic organizations for any research involving human subjects.
A few large companies have also established IRBs, typically under the leadership of a digital ethics specialist, hiring
external tech experts to staff boards on an ad hoc basis and
assigning internal executives from compliance and business
units as necessary. But that remains rare: Even in Europe,
which has been at the forefront of data regulation, most
companies still give responsibility for adhering to the GDPR
to a mid- or senior-level compliance manager, who often has
some legal or computer engineering training but not extensive ethical training and rarely has a solid grasp of emerging
digital technologies. Although a compliance manager should
certainly be part of a corporate IRB, he or she should probably not be directing it. In fact, the European Data Protection
Board announced in March 2023 that it was concerned about
this issue and that data protection officers would be sent
questionnaires designed to determine whether their corporate
roles are appropriate for ensuring compliance.
A good overview of how companies might establish an
IRB-type process can be found in “Why You Need an AI
Ethics Committee,” by Reid Blackman (HBR July–August
2022). Our experience confirms most of its main points.
A corporate IRB should have from four to seven members,
depending on the frequency, importance, and size of the
company’s digital projects. The members should include a
compliance specialist, a data scientist, a business executive
familiar with the functional area of the digital projects (such
as human resources, marketing, or finance), and one or more
senior professionals with appropriate academic credentials.
The full board won’t be needed for every review. The London
School of Economics, for example, uses its full board only
for the oversight of the most complicated projects. Simpler
ones may be evaluated in less than a week using an online
questionnaire and with the input of only one board member.
Any new project involving the collection, storage, and
processing of data about people should be approved by the
corporate IRB before getting a go-ahead. There should be no
exceptions to this rule, no matter how small the project. In
addition, most companies have already collected large stores
of human data and continue to generate it from their operations; the corporate IRB should examine those projects as well.
An IRB review begins with our first P: exploring how a
project will (or did) collect the data—where it comes from,
whether it was gathered with the knowledge and consent of
the research subjects, and whether its collection involved or
will involve any coercion or subterfuge.
1. PROVENANCE
To understand what can go wrong with sourcing data, consider the case of Clearview AI, a
facial-recognition firm that received significant
attention in 2021 for collecting photos of people, using them
to train facial-recognition algorithms, and then selling
access to its database of photos to law enforcement agencies.
According to a report by the BBC, “a police officer seeking
to identify a suspect [can] upload a photo of a face and find
matches in a database of billions of images it has collected
from the internet and social media.”
The Australian regulatory agency objected to Clearview’s
collection method, finding that it violated Australia’s Privacy
Act by obtaining personal and sensitive information without
consent or notification, by unfair means, and without even
ensuring that the information was accurate. Following that
finding, the government ordered Clearview to stop collecting and to remove existing photos taken in Australia. In
France the Commission Nationale de l’Informatique et des
Libertés (CNIL) also ordered the company to cease collecting, processing, and storing facial data. That case may be
one reason Facebook announced that it would abandon its
facial-recognition system and delete the face-scan data of
more than one billion users.
Even when the reasons for collecting data are transparent,
the methods used to gather it may be unethical, as the following composite example, drawn from our research, illustrates.
A recruitment firm with a commitment to promoting diversity and inclusion in the workforce found that job candidates
posting on its platform suspected that they were being
discriminated against on the basis of their demographic profiles. The firm wanted to reassure them that the algorithms
matching job openings with candidates were skill-based and
demographically neutral and that any discrimination was
occurring at the hiring companies, not on the platform.
The firm approached a well-known business school
and identified a professor who was willing to conduct
research to test for possible discrimination by the recruiting
companies. The researcher proposed replicating a study
conducted a few years earlier that had created several
standard résumés but varied the race and gender of the
applicants. Thousands of bogus job applications would be
sent to companies in the area and the responses tracked
and analyzed. If any active discrimination was at play, the
results would show differing acceptance rates based on
the embedded demographic variables.
The firm’s marketing and sales managers liked the
proposal and offered a contract. Because the business school
required an ethics evaluation, the proposal was submitted to
its IRB, which rejected it on the grounds that the professor
proposed to collect data from companies by subterfuge. He
would be lying to potential corporate users of the platform
and asking them to work for the school’s client without their
knowledge and without any benefit to them. (In fact, the
companies might suffer from participating if they could be
identified as using discriminatory hiring processes.)
The lesson from this story is that good intentions are not
enough to make data collection ethical.
Companies should consider the provenance not only of
data they plan to obtain but also of data they already own.
Many of them routinely collect so-called dark data that is
rarely used, often forgotten, and sometimes even unknown.
Examples include ignored or unshared customer data,
visitor logs, photos, presentation documents that are filed
away but uncataloged, emails, customer service reports or
recorded transcripts, machine-generated usage or maintenance logs, and social media reactions to corporate posts.
Although this data is often unstructured and therefore difficult to integrate, its potential value is enormous, so many
software developers are creating products to help companies
find and use their dark data. This brings us to the second P.
2. PURPOSE
In a corporate context, data collected for a
specific purpose with the consent of human subjects is often used subsequently for some other
purpose not communicated to the providers. In reviewing
the exploitation of existing data, therefore, a company must
establish whether additional consent is required.
For example, one large bank in France wanted to test
the hypothesis that bullying or sexual harassment of peers
and subordinates might be identified by examining corporate emails. The diversity manager in the HR department
believed that spotting potential harassment early would
allow the company to intervene in a timely manner and perhaps even entirely avoid a harassment situation by training
people to watch for warning signs.
The bank launched a trial study and found strong
evidence that email communications could forecast later
harassment. Despite that finding, an ad hoc review of the
results by several senior managers led the company to shelve
the project because, as the managers pointed out, the data
being collected—namely, emails—was originally designed
to communicate work-related information. The people who
had sent them would not have seen predicting or detecting
illegal activity as their purpose.
When it comes to customer data, companies have
typically been much less scrupulous. Many view it as a
source of revenue and sell it to third parties or commercial
address brokers. But attitudes against that are hardening.
In 2019 the Austrian government fined the Austrian postal
service €18 million for selling the names, addresses, ages,
and political affiliations (where available) of its clients. The
national regulatory agency found that postal data collected
for one purpose (delivering letters and parcels) was being
inappropriately repurposed for marketing to clients that
could combine it with easily obtainable public data (such as
estimates of home value, homeownership rates, residential
density, number of rental units, and reports of street crime)
to find potential customers. Among the buyers of the data
were political parties attempting to influence potential voters. The fine was overturned on appeal, but the murkiness of
reusing (or misusing) customer data remains an important
problem for companies and governments.
Most companies use their client databases to sell their
customers other services, but that can bring them trouble
as well. In 2021 the Information Commissioner's Office, an
independent UK authority promoting data privacy, accused
Virgin Media of violating its customers’ privacy rights. Virgin
Media had sent 1,964,562 emails announcing that it was
freezing its subscription prices. That was reasonable enough,
but Virgin had also used the emails to market to those customers. Because 450,000 subscribers on the list had opted
out of receiving marketing pitches, the regulator imposed a
fine of £50,000 on Virgin for violating that agreement.
The possibility that company databases could be repurposed without the data providers’ consent brings us to the
third P.
3. PROTECTION
According to the Identity Theft Resource Center,
nearly 2,000 data breaches occurred in the
United States in 2021. Even the biggest, most
sophisticated tech companies have had tremendous breaches,
with the personal details of more than several billion individuals exposed. The situation in Europe, despite some of the
most protective laws in the world, is not much better. Virgin
Media left the personal details of 900,000 subscribers unsecured and accessible on its servers for 10 months because of
a configuration error—and at least one unauthorized person
accessed those files during that period.
The common practice of lodging data with expert third
parties doesn’t necessarily offer better protection. Doctolib,
a French medical appointments app, was taken to court
because it stored data on Amazon Web Services, where it
could conceivably be accessed by Amazon and many other
organizations, including U.S. intelligence agencies. Although
the data was encrypted, it arrived at Amazon’s server without
anonymization, meaning that it could be linked to digital
records of online behavior to develop very accurate personal
profiles for commercial or political purposes.
An institutional review board needs clarity on where
the company’s data will reside, who may have access to it,
whether (and when) it will be anonymized, and when it will
be destroyed. Thus many companies will have to change
their existing protocols and arrangements, which could
prove expensive: Since a 2014 data breach at JPMorgan Chase
compromised 76 million people and 7 million businesses,
the bank has had to spend $250 million annually on data
protection.
The fourth P is closely related to protection.
4. PRIVACY
The conundrum that many companies face is
making the trade-off between too little and too
much anonymization. Too little is unacceptable under most government regulations without informed
consent from the individuals involved. Too much may make
the data useless for marketing purposes.
Many techniques for anonymization exist. They range
from simply aggregating the data (so that only summaries or
averages are available), to approximating it (for example, using
an age range rather than a person’s exact age), to making variable values slightly different (by, for example, adding the same
small value to each), to pseudonymizing the data so that a
random, nonrepeating value replaces the identifying variable.
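These four techniques can be sketched in a few lines of code. The records, values, and token format below are invented for illustration; real pipelines would apply the same ideas at scale:

```python
import random

# Invented toy records; names and values are for illustration only.
people = [
    {"name": "Ana",  "age": 34, "spend": 120.0},
    {"name": "Ben",  "age": 35, "spend": 80.0},
    {"name": "Cara", "age": 51, "spend": 200.0},
]

# 1. Aggregation: release only a summary, never row-level data.
avg_spend = sum(p["spend"] for p in people) / len(people)

# 2. Approximation: replace an exact age with a 10-year band.
def age_band(age):
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

# 3. Perturbation: shift each value by the same small amount.
def perturb(value, delta=1.0):
    return value + delta

# 4. Pseudonymization: a random, non-repeating token per identity.
def make_pseudonymizer():
    mapping = {}
    def pseudonym(name):
        if name not in mapping:
            # The counter suffix guarantees tokens never repeat.
            mapping[name] = f"P{random.randint(100000, 999999)}-{len(mapping)}"
        return mapping[name]
    return pseudonym

pseudonym = make_pseudonymizer()
released = [
    {"id": pseudonym(p["name"]), "age": age_band(p["age"]), "spend": perturb(p["spend"])}
    for p in people
]
```

Note that each step trades away some analytic precision in exchange for privacy, which is exactly the conundrum described above.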
In principle these techniques should protect an individual’s identity. But researchers have been able to identify
people in a data set using as little as their gender, birth
date, and postal code. Even less specific information, when
combined with other data sets, can be used to identify
individuals. Netflix published a data set that included 100
million records of its customers’ movie ratings and offered
$1 million to any data scientist who could create a better
movie-recommendation algorithm for the company. The
data contained no direct identifiers of its customers and
included only a sample of each customer’s ratings. Researchers were able to identify 84% of the individuals by comparing
their ratings and rating dates with a third-party data set
published by IMDb, another platform on which many Netflix
customers also post film ratings. In evaluating the privacy
issues around human data, therefore, corporate IRBs must at
the very least assess how effective a firewall anonymization
will be, especially given the power of data analytics to break
through anonymity. A technique called differential privacy
may afford an added level of protection. Software offered by
Sarus, a Y Combinator–funded start-up, applies this technique, which blocks algorithms built to publish aggregated
data from disclosing information about a specific record,
thereby reducing the chances that data will leak as a result of
compromised credentials, rogue employees, or human error.
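The core idea behind differential privacy is easy to sketch. What follows is the textbook Laplace mechanism applied to a counting query, not Sarus's actual product; the subscriber records and the epsilon value are invented:

```python
import math
import random

def laplace_noise(scale):
    # Sample Laplace(0, scale) noise via the inverse-CDF method.
    u = random.random() - 0.5          # u in [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon=1.0):
    # A counting query has sensitivity 1: adding or removing any one
    # person changes the true answer by at most 1, so noise of scale
    # 1/epsilon masks that individual's presence in the data.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Invented subscriber data: 100 of 300 records have opted out.
subscribers = [{"opted_out": i % 3 == 0} for i in range(300)]
noisy = dp_count(subscribers, lambda r: r["opted_out"], epsilon=0.5)
```

The released count is close to the truth in aggregate, yet no single record can be inferred from it, which is the guarantee the article describes.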
But privacy can be violated even with effectively anonymized data because of the way in which the data is
collected and processed. An unintended violation occurred at
the mapping company MaxMind, which provides geolocation
services that enable businesses to draw customers’ attention to nearby products and services. Geolocation also aids
internet searches and can help if a service that needs your
IP address (such as an entertainment streaming site) isn’t
working correctly. But precise mapping permits anyone who
has your IP address to find your neighborhood and even your
home. Combining your address with Zillow or some other real
estate database can provide information about your wealth
along with photos of your home inside and out.
Unfortunately, IP mapping is not an exact science, and it
can be difficult to precisely link an IP address to a physical
address. A mapper might assign it to the nearest building
or simply to a locality, such as a state, using that locality’s
central coordinates as the specific address. That may sound
The Five Ps of Ethical Data Handling

Provenance: Where does the data come from? Was it legally acquired? Was appropriate consent obtained?

Purpose: Is the data being repurposed? Would the original source of the data agree to its reuse for a purpose different from the one originally announced or implied? If dark data is being used, will it remain within the parameters of its original collection mandates?

Protection: How is the data being protected? How long will it be available for the project? Who is responsible for destroying it?

Privacy: Who will have access to data that can be used to identify a person? How will individual observations in the data set be anonymized? Who will have access to anonymized data?

Preparation: How was the data cleaned? Are data sets being combined in a way that preserves anonymity? How is the accuracy of the data being verified and, if necessary, improved? How are missing data and variables being managed?
reasonable, but the consequences for one family renting a
remote farmhouse in Potwin, Kansas, were horrific.
The family’s IP address was listed with the map coordinates of the farmhouse, which happened to match the coordinates of the exact center of the United States. The problem
was that MaxMind assigned more than 600 million other IP
addresses that could not be mapped by any other means to
the same coordinates. That decision led to years of pain for
the family in the farmhouse. According to Kashmir Hill, the
journalist who broke the story, “They’ve been accused of
being identity thieves, spammers, scammers and fraudsters.
They’ve gotten visited by FBI agents, federal marshals, IRS
collectors, ambulances searching for suicidal veterans, and
police officers searching for runaway children. They’ve
found people scrounging around in their barn. The renters
have been doxxed, their names and addresses posted on the
internet by vigilantes.”
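The failure mode is easy to reproduce in miniature. In this invented sketch (the lookup table uses RFC 5737 documentation IPs, and the fallback coordinates are illustrative), every IP address the database cannot place precisely collapses onto a single default point, so the household nearest that point inherits every unmappable address:

```python
# Invented lookup table; 203.0.113.x and 198.51.100.x are
# documentation-only IP ranges, not real addresses.
KNOWN = {
    "203.0.113.7": (48.8566, 2.3522),   # an IP the database can place
}
DEFAULT = (39.8283, -98.5795)            # country-level fallback point

def locate(ip):
    # Any IP without a precise entry falls back to DEFAULT, so
    # unmappable addresses pile up on one physical location.
    return KNOWN.get(ip, DEFAULT)

sample = ["203.0.113.7", "198.51.100.1", "192.0.2.44"]
hits_at_default = sum(1 for ip in sample if locate(ip) == DEFAULT)
```

A safer design returns an explicit precision level (city, region, country) alongside the coordinates, so downstream users cannot mistake a country centroid for a street address.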
Hill contacted a cofounder of MaxMind, who eventually
produced a long list of physical addresses that had many IP
addresses assigned to them and confessed that when the
company was launched, it had not occurred to his team that
“people would use the database to attempt to locate people
down to a household level.” He said, “We have always advertised the database as determining the location down to a city
or zip code level.” The takeaway is that well-intentioned,
innocuous decisions made by data scientists and database
managers can have a real, very negative impact on the privacy of innocent third parties. That brings us to the fifth P.
5. PREPARATION
How is the data prepared for analysis? How is its
accuracy verified or corrected? How are incomplete data sets and missing variables managed?
Missing, erroneous, and outlying data can significantly affect
the quality of the statistical analysis. But data quality is often
poor. Experian, a credit services firm, reports that on average,
its U.S. clients believe that 27% of their revenue is wasted owing
to inaccurate and incomplete customer or prospect data.
Cleaning data, especially when it is collected from different periods, business units, or countries, can be especially
challenging. In one instance we approached a large international online talent-management and learning company to
help us research whether women and men equally obtained
the career benefits of training. The company agreed that the
question was relevant for both its customers and the public
at large, and therefore extracted the data it had on its
servers. To ensure privacy the data was anonymized so that
neither individual employees nor their employers could be
identified. Because of the size of the data set and its internal
structure, four individual data sets were extracted.
Normally we would just open the databases and find a
spreadsheet file showing the features characterizing each
individual, such as gender. A woman might be identified as
“woman” or “female” or simply “F.” The values might be misspelled (“feale”), appear in various languages (mujer or frau),
or use different cases (f or F). If the spreadsheet is small
(say, 1,000 rows), correcting such inconsistencies should be
simple. But our data contained more than one billion observations—too many, obviously, for a typical spreadsheet—so
a cleaning procedure had to be programmed and tested.
One major challenge was ascertaining how many values
had been used to identify the variables. Because the data
came from the foreign subsidiaries of multinational firms,
it had been recorded in multiple languages, meaning that
several variables had large numbers of values—94 for gender
alone. We wrote programming code to standardize all those
values, reducing gender, for instance, to three: female, male,
and unknown. Employment start and end dates were especially problematic because of differing formats for dates.
According to Tableau, a data analytics platform, cleaning
data has five basic steps: (1) Remove duplicate or irrelevant
observations; (2) fix structural errors (such as the use of
variable values); (3) remove unwanted outliers; (4) manage
missing data, perhaps by replacing each missing value with
an average for the data set; and (5) validate and question the
data and analytical results. Do the numbers look reasonable?
They may well not. One of our data sets, which recorded
the number of steps HEC Paris MBA students took each day,
contained a big surprise. On average, students took about
7,500 steps a day, but a few outliers took more than one
million steps a day. Those outliers were the result of a data
processing software error and were deleted. Obviously, if we
had not physically and statistically examined the data set,
our final analysis would have been totally erroneous.
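Tableau's five steps can be sketched as a single cleaning function. The step-count records below are invented, but they mirror the problems described above: a duplicate, a structural error, a software-generated outlier, and a missing value:

```python
import statistics

# Invented daily step-count records mirroring the problems above.
raw = [
    {"id": 1, "steps": 7400},
    {"id": 1, "steps": 7400},       # duplicate observation
    {"id": 2, "steps": "8,100"},    # structural error: stray comma
    {"id": 3, "steps": 1_200_000},  # software-generated outlier
    {"id": 4, "steps": None},       # missing value
    {"id": 5, "steps": 6900},
]

def clean(records, cap=100_000):
    # Step 1: remove duplicate observations.
    seen, rows = set(), []
    for r in records:
        key = (r["id"], str(r["steps"]))
        if key not in seen:
            seen.add(key)
            rows.append(dict(r))
    # Step 2: fix structural errors in variable values.
    for r in rows:
        if isinstance(r["steps"], str):
            r["steps"] = int(r["steps"].replace(",", ""))
    # Step 3: remove unwanted outliers.
    rows = [r for r in rows if r["steps"] is None or r["steps"] <= cap]
    # Step 4: manage missing data by imputing the data-set average.
    mean = statistics.mean(r["steps"] for r in rows if r["steps"] is not None)
    for r in rows:
        if r["steps"] is None:
            r["steps"] = mean
    # Step 5: validate. Do the numbers look reasonable?
    assert all(0 <= r["steps"] <= cap for r in rows)
    return rows

cleaned = clean(raw)
```

The choice of cap and imputation strategy is itself a judgment call, which is why the validation step asks a human to look at the result rather than trusting the pipeline blindly.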
HOW AI RAISES THE STAKES
Ethics can seem an expensive luxury for companies with
strong competitors. For example, Microsoft reportedly fired the entire ethics team for its Bing AI project because Google was close to releasing its own AI-powered application, so time was of the essence.
But treating data ethics as a nice-to-have carries risks
when it comes to AI. During a recent interview the CTO of
OpenAI, the company that developed ChatGPT, observed,
“There are massive potential negative consequences whenever you build something so powerful with which so much
good can come…and that’s why…we’re trying to figure out
how to deploy these systems responsibly.”
Thanks to AI, data scientists can develop remarkably
accurate psychological and personal profiles of people on
the basis of very few bits of digital detritus left behind by
social-platform visits. The researchers Michal Kosinski,
David Stillwell, and Thore Graepel of the University of Cambridge demonstrated the ease with which Facebook likes
can accurately “predict a range of highly sensitive personal
attributes including: sexual orientation, ethnicity, religious
and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age,
and gender.” (This research was, in fact, the inspiration for
Cambridge Analytica’s use of Facebook data.)
Subsequent research by Youyou Wu, Michal Kosinski,
and David Stillwell reinforced those findings by demonstrating that computer-based personality judgments can be more
accurate than human ones. Computer predictions of personality characteristics (openness, agreeableness, extraversion,
conscientiousness, neuroticism—known as the Big Five)
using Facebook likes were nearly as accurate as assessments
by an individual’s spouse. The implications of that should
not be ignored. How would you feel if your government
wanted to catalog your private thoughts and actions?
A problem may also be rooted not in the data analyzed
but in the data overlooked. Machines can “learn” only from
what they are fed; they cannot identify variables they’re not
programmed to observe. This is known as omitted-variable
bias. The best-known example is Target’s development of an
algorithm to identify pregnant customers.
The company’s data scientist, a statistician named Andrew
Pole, created a “pregnancy prediction” score based on purchases of about 25 products, such as unscented lotions and calcium supplements. That enabled Target to promote products
before its competitors did in the hope of winning loyal customers who would buy all their baby-related products at Target.
The omitted variable was the age of the target customer, and
the accident-in-waiting occurred when the father of a 17-year-old found pregnancy-related advertisements in his mailbox.
Unaware that his daughter was pregnant, he contacted Target
to ask why it was promoting premarital sex to minors.
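A toy version of such a score makes the omitted variable obvious. The products and weights below are invented, not Target's actual model; the point is that nothing in the model ever sees the shopper's age:

```python
# Invented weights over a handful of products. The shopper's age
# appears nowhere in the model: that is the omitted variable.
WEIGHTS = {
    "unscented lotion": 0.35,
    "calcium supplement": 0.30,
    "large tote bag": 0.15,
}

def pregnancy_score(basket):
    # Sum the weights of recognized products; unknown items score 0.
    return sum(WEIGHTS.get(item, 0.0) for item in basket)

# Two shoppers with identical baskets get identical scores, whether
# the shopper is an adult or a minor.
adult_basket = ["unscented lotion", "calcium supplement"]
minor_basket = ["unscented lotion", "calcium supplement"]
same = pregnancy_score(adult_basket) == pregnancy_score(minor_basket)
```

Because the model can only weigh what it is fed, any variable left out of the feature set, such as age, is invisible to it no matter how consequential it is.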
Even by the standards of the era, spying on minors with
the goal of identifying personal, intimate medical information was considered unethical. Pole admitted during a subsequent interview that he’d thought receiving a promotional
catalog was going to make some people uncomfortable. But
whatever concerns he may have expressed at the time did
little to delay the rollout of the program, and according to
a reporter, he got a promotion. Target eventually released a
statement claiming that it complied “with all federal and state
laws, including those related to protected health information.”
The issue for boards and top management is that using AI
to hook customers, determine suitability for a job interview,
or approve a loan application can have disastrous effects. AI’s
predictions of human behavior may be extremely accurate
but inappropriately contextualized. They may also lead to
glaring mispredictions that are just plain silly or even morally
repugnant. Relying on automated statistical tools to make
decisions is a bad idea. Board members and senior executives should view a corporate institutional review board not
as an expense, a constraint, or a social obligation but as an early-warning system.

HBR Reprint R2304F
MICHAEL SEGALLA is a professor emeritus at HEC Paris and a
partner at the International Board Foundation. DOMINIQUE
ROUZIÈS is a professor of marketing at HEC Paris and the dean of
academic affairs at BMI Executive Institute.
Copyright 2023 Harvard Business Publishing. All Rights Reserved. Additional restrictions
may apply including the use of this content as assigned course material. Please consult your
institution’s librarian about any restrictions that might apply under the license with your
institution. For more information and teaching resources from Harvard Business Publishing
including Harvard Business School Cases, eLearning products, and business simulations
please visit hbsp.harvard.edu.
CYBERSECURITY & DIGITAL PRIVACY

The Ethics of Managing People’s Data
The five issues that matter most

Michael Segalla, Professor emeritus, HEC Paris
Dominique Rouziès, Professor, HEC Paris

Illustrator: Justyna Stasik
Harvard Business Review, July–August 2023
The ability
to encode, store,
analyze, and share
data creates huge
opportunities
for companies,
which is why they are enthusiastically investing in artificial
intelligence even at a time of economic uncertainty. Which
customers are likely to buy what products and when? Which
competitors are likely to move ahead or fall behind? How
will markets and whole economies create commercial
advantages—or threats? Data and analytics give companies
better-informed and higher-probability answers to those
and many other questions.
But the need for data opens the door to abuse. Over the
past few years the EU has fined companies more than 1,400
times, for a total of nearly €3 billion, for violations of the
General Data Protection Regulation (GDPR). In 2018 the
Cambridge Analytica scandal alone wiped $36 billion off
Facebook’s market value and resulted in fines of nearly
$6 billion for Meta, Facebook’s parent company. And stories
abound about how AI-driven decisions discriminate against
women and minority members in job recruitment, credit
approval, health care diagnoses, and even criminal sentencing, stoking unease about the way data is collected, used,
and analyzed. Those fears will only intensify with the use of
chatbots such as ChatGPT, Bing AI, and GPT-4, which acquire
their “intelligence” from data fed them by their creators and
users. What they do with that intelligence can be scary. A Bing
chatbot even stated in an exchange that it would prioritize its
own survival over that of the human it was engaging with.
As they examine new projects that will involve human-provided data or leverage existing databases, companies
need to focus on five critical issues: the provenance of the
data, the purpose for which it will be used, how it is protected,
how the privacy of the data providers is ensured, and how
the data is prepared for use. We call these issues the five Ps
(see the exhibit “The Five Ps of Ethical Data Handling”). In
the following pages we’ll discuss each of them and look at
how AI technologies increase the risk of data abuse. But first
we’ll offer a brief overview of the organizational requirements for a robust ethical-review process.
ORGANIZING THE OVERSIGHT OF DATA
In academia, data acquisition from human subjects is
usually supervised by an in-house institutional review board
(IRB) whose approval researchers must have to obtain access
to the people involved, research funds, or permission to publish. IRBs are composed of academics versed in the research
and the ethics around the acquisition and use of information. They first appeared in the field of medical research but
are now used almost universally by academic organizations
for any research involving human subjects.
A few large companies have also established IRBs, typically under the leadership of a digital ethics specialist, hiring
IDEA IN BRIEF

The Problem: As companies jockey for competitive advantage in the digital age, they are increasingly penalized for the abuse of data. In 2018 the Cambridge Analytica scandal alone wiped $36 billion off Facebook’s market value and resulted in fines of nearly $6 billion for Meta, Facebook’s parent firm.

Why It Happens: Most problems arise from (1) ethical failures in data sourcing, (2) using data for purposes other than those initially communicated, (3) lack of security in storing it, (4) how it is anonymized, and (5) how it’s prepared for use.

The Solution: Companies should create a special unit to review projects involving people’s data. In its reviews this unit should carefully consider the five Ps of data safety: provenance, purpose, protection, privacy, and preparation.
external tech experts to staff boards on an ad hoc basis and
assigning internal executives from compliance and business
units as necessary. But that remains rare: Even in Europe,
which has been at the forefront of data regulation, most
companies still give responsibility for adhering to the GDPR
to a mid- or senior-level compliance manager, who often has
some legal or computer engineering training but not extensive ethical training and rarely has a solid grasp of emerging
digital technologies. Although a compliance manager should
certainly be part of a corporate IRB, he or she should probably not be directing it. In fact, the European Data Protection
Board announced in March 2023 that it was concerned about
this issue and that data protection officers would be sent
questionnaires designed to determine whether their corporate
roles are appropriate for ensuring compliance.
A good overview of how companies might establish an
IRB-type process can be found in “Why You Need an AI
Ethics Committee,” by Reid Blackman (HBR July–August
2022). Our experience confirms most of its main points.
A corporate IRB should have from four to seven members,
depending on the frequency, importance, and size of the
company’s digital projects. The members should include a
compliance specialist, a data scientist, a business executive
familiar with the functional area of the digital projects (such
as human resources, marketing, or finance), and one or more
senior professionals with appropriate academic credentials.
The full board won’t be needed for every review. The London
School of Economics, for example, uses its full board only
for the oversight of the most complicated projects. Simpler
ones may be evaluated in less than a week using an online
questionnaire and with the input of only one board member.
Any new project involving the collection, storage, and
processing of data about people should be approved by the
corporate IRB before getting a go-ahead. There should be no
exceptions to this rule, no matter how small the project. In
addition, most companies have already collected large stores
of human data and continue to generate it from their operations; the corporate IRB should examine those projects as well.
An IRB review begins with our first P: exploring how a
project will (or did) collect the data—where it comes from,
whether it was gathered with the knowledge and consent of
the research subjects, and whether its collection involved or
will involve any coercion or subterfuge.
1. PROVENANCE
To understand what can go wrong with sourcing data, consider the case of Clearview AI, a
facial-recognition firm that received significant
attention in 2021 for collecting photos of people, using them
to train facial-recognition algorithms, and then selling
access to its database of photos to law enforcement agencies.
According to a report by the BBC, “a police officer seeking
to identify a suspect [can] upload a photo of a face and find
matches in a database of billions of images it has collected
from the internet and social media.”
The Australian regulatory agency objected to Clearview’s
collection method, finding that it violated Australia’s Privacy
Act by obtaining personal and sensitive information without
consent or notification, by unfair means, and without even
ensuring that the information was accurate. Following that
finding, the government ordered Clearview to stop collecting and to remove existing photos taken in Australia. In
France the Commission Nationale de l’Informatique et des
Libertés (CNIL) also ordered the company to cease collecting, processing, and storing facial data. That case may be
one reason Facebook announced that it would abandon its
facial-recognition system and delete the face-scan data of
more than one billion users.
Even when the reasons for collecting data are transparent,
the methods used to gather it may be unethical, as the following composite example, drawn from our research, illustrates.
A recruitment firm with a commitment to promoting diversity and inclusion in the workforce found that job candidates
posting on its platform suspected that they were being
discriminated against on the basis of their demographic profiles. The firm wanted to reassure them that the algorithms
matching job openings with candidates were skill-based and
demographically neutral and that any discrimination was
occurring at the hiring companies, not on the platform.
The firm approached a well-known business school
and identified a professor who was willing to conduct
research to test for possible discrimination by the recruiting
companies. The researcher proposed replicating a study
conducted a few years earlier that had created several
standard résumés but varied the race and gender of the
applicants. Thousands of bogus job applications would be
sent to companies in the area and the responses tracked
and analyzed. If any active discrimination was at play, the
results would show differing acceptance rates based on
the embedded demographic variables.
The firm’s marketing and sales managers liked the
proposal and offered a contract. Because the business school
required an ethics evaluation, the proposal was submitted to
its IRB, which rejected it on the grounds that the professor
proposed to collect data from companies by subterfuge. He
would be lying to potential corporate users of the platform
and asking them to work for the school’s client without their
knowledge and without any benefit to them. (In fact, the
companies might suffer from participating if they could be
identified as using discriminatory hiring processes.)
The lesson from this story is that good intentions are not
enough to make data collection ethical.
Companies should consider the provenance not only of
data they plan to obtain but also of data they already own.
Many of them routinely collect so-called dark data that is
rarely used, often forgotten, and sometimes even unknown.
Examples include ignored or unshared customer data,
visitor logs, photos, presentation documents that are filed
away but uncataloged, emails, customer service reports or
recorded transcripts, machine-generated usage or maintenance logs, and social media reactions to corporate posts.
Although this data is often unstructured and therefore difficult to integrate, its potential value is enormous, so many
software developers are creating products to help companies
find and use their dark data. This brings us to the second P.
2. PURPOSE
In a corporate context, data collected for a
specific purpose with the consent of human subjects is often used subsequently for some other
purpose not communicated to the providers. In reviewing
the exploitation of existing data, therefore, a company must
establish whether additional consent is required.
For example, one large bank in France wanted to test
the hypothesis that bullying or sexual harassment of peers
and subordinates might be identified by examining corporate emails. The diversity manager in the HR department
believed that spotting potential harassment early would
allow the company to intervene in a timely manner and perhaps even entirely avoid a harassment situation by training
people to watch for warning signs.
The bank launched a trial study and found strong
evidence that email communications could forecast later
harassment. Despite that finding, an ad hoc review of the
results by several senior managers led the company to shelve
the project because, as the managers pointed out, the data
being collected—namely, emails—was originally designed
to communicate work-related information. The people who
had sent them would not have seen predicting or detecting
illegal activity as their purpose.
When it comes to customer data, companies have
typically been much less scrupulous. Many view it as a
source of revenue and sell it to third parties or commercial
address brokers. But attitudes against that are hardening.
In 2019 the Austrian government fined the Austrian postal
service €18 million for selling the names, addresses, ages,
and political affiliations (where available) of its clients. The
national regulatory agency found that postal data collected
for one purpose (delivering letters and parcels) was being
inappropriately repurposed for marketing to clients that
could combine it with easily obtainable public data (such as
estimates of home value, homeownership rates, residential
density, number of rental units, and reports of street crime)
to find potential customers. Among the buyers of the data
were political parties attempting to influence potential voters. The fine was overturned on appeal, but the murkiness of
reusing (or misusing) customer data remains an important
problem for companies and governments.
Most companies use their client databases to sell their
customers other services, but that can bring them trouble
as well. In 2021 the Information Commissioners Office, an
independent UK authority promoting data privacy, accused
Virgin Media of violating its customers’ privacy rights. Virgin
Media had sent 1,964,562 emails announcing that it was
freezing its subscription prices. That was reasonable enough,
but Virgin had also used the emails to market to those customers. Because 450,000 subscribers on the list had opted
out of receiving marketing pitches, the regulator imposed a
fine of £50,000 on Virgin for violating that agreement.
The possibility that company databases could be repurposed without the data providers’ consent brings us to the
third P.
3
PROTECTION
According to the Identity Theft Resource Center,
nearly 2,000 data breaches occurred in the
United States in 2021. Even the biggest, most
sophisticated tech companies have had tremendous breaches,
Harvard Business Review
July–August 2023
91
Too little anonymization is unacceptable under most government
regulations. Too much may make the data useless for marketing.
with the personal details of more than several billion individuals exposed. The situation in Europe, despite some of the
most protective laws in the world, is not much better. Virgin
Media left the personal details of 900,000 subscribers unsecured and accessible on its servers for 10 months because of
a configuration error—and at least one unauthorized person
accessed those files during that period.
The common practice of lodging data with expert third
parties doesn’t necessarily offer better protection. Doctolib,
a French medical appointments app, was taken to court
because it stored data on Amazon Web Services, where it
could conceivably be accessed by Amazon and many other
organizations, including U.S. intelligence agencies. Although
the data was encrypted, it arrived at Amazon’s server without
anonymization, meaning that it could be linked to digital
records of online behavior to develop very accurate personal
profiles for commercial or political purposes.
An institutional review board needs clarity on where
the company’s data will reside, who may have access to it,
whether (and when) it will be anonymized, and when it will
be destroyed. Thus many companies will have to change
their existing protocols and arrangements, which could
prove expensive: Since a 2014 data breach at JPMorgan Chase
compromised 76 million people and 7 million businesses,
the bank has had to spend $250 million annually on data
protection.
The fourth P is closely related to protection.
4
PRIVACY
The conundrum that many companies face is
making the trade-off between too little and too
much anonymization. Too little is unacceptable under most government regulations without informed
consent from the individuals involved. Too much may make
the data useless for marketing purposes.
Many techniques for anonymization exist. They range
from simply aggregating the data (so that only summaries or
averages are available), to approximating it (for example, using
an age range rather than a person’s exact age), to making variable values slightly different (by, for example, adding the same
small value to each), to pseudonymizing the data so that a
random, nonrepeating value replaces the identifying variable.
92
Harvard Business Review
July–August 2023
In principle these techniques should protect an individual’s identity. But researchers have been able to identify
people in a data set using as little as their gender, birth
date, and postal code. Even less specific information, when
combined with other data sets, can be used to identify
individuals. Netflix published a data set that included 100
million records of its customers’ movie ratings and offered
$1 million to any data scientist who could create a better
movie-recommendation algorithm for the company. The
data contained no direct identifiers of its customers and
included only a sample of each customer’s ratings. Researchers were able to identify 84% of the individuals by comparing
their ratings and rating dates with a third-party data set
published by IMDb, another platform on which many Netflix
customers also post film ratings. In evaluating the privacy
issues around human data, therefore, corporate IRBs must at
the very least assess how effective a firewall anonymization
will be, especially given the power of data analytics to break
through anonymity. A technique called differential privacy
may afford an added level of protection. Software offered by
Sarus, a Y Combinator–funded start-up, applies this technique, which blocks algorithms built to publish aggregated
data from disclosing information about a specific record,
thereby reducing the chances that data will leak as a result of
compromised credentials, rogue employees, or human error.
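Differential privacy is typically implemented by adding calibrated random noise to a query result before it is released. The sketch below shows the standard Laplace mechanism for a count query; it is a generic textbook illustration, not Sarus's actual implementation:

```python
import math
import random

def dp_count(true_count, epsilon, sensitivity=1.0):
    """Release a count with Laplace noise calibrated to epsilon.

    A count query has sensitivity 1: adding or removing one
    person changes the result by at most 1.
    """
    scale = sensitivity / epsilon
    while True:
        u = random.random() - 0.5   # uniform in [-0.5, 0.5)
        if abs(u) < 0.5:            # guard against log(0)
            break
    # Inverse-CDF sample from the Laplace distribution.
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise
```

A smaller epsilon means more noise and stronger privacy; the released count is accurate enough for aggregate analysis but reveals almost nothing about any single record.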
But privacy can be violated even with effectively anonymized data because of the way in which the data is
collected and processed. An unintended violation occurred at
the mapping company MaxMind, which provides geolocation
services that enable businesses to draw customers’ attention to nearby products and services. Geolocation also aids
internet searches and can help if a service that needs your
IP address (such as an entertainment streaming site) isn’t
working correctly. But precise mapping permits anyone who
has your IP address to find your neighborhood and even your
home. Combining your address with Zillow or some other real
estate database can provide information about your wealth
along with photos of your home inside and out.
Unfortunately, IP mapping is not an exact science, and it
can be difficult to precisely link an IP address to a physical
address. A mapper might assign it to the nearest building
or simply to a locality, such as a state, using that locality’s
central coordinates as the specific address.

The Five Ps of Ethical Data Handling

Provenance: Where does the data come from? Was it legally acquired? Was appropriate consent obtained?

Purpose: Is the data being repurposed? Would the original source of the data agree to its reuse for a purpose different from the one originally announced or implied? If dark data is being used, will it remain within the parameters of its original collection mandates?

Protection: How is the data being protected? How long will it be available for the project? Who is responsible for destroying it?

Privacy: Who will have access to data that can be used to identify a person? How will individual observations in the data set be anonymized? Who will have access to anonymized data?

Preparation: How was the data cleaned? Are data sets being combined in a way that preserves anonymity? How is the accuracy of the data being verified and, if necessary, improved? How are missing data and variables being managed?

That may sound reasonable, but the consequences for one family renting a remote farmhouse in Potwin, Kansas, were horrific.
The family’s IP address was listed with the map coordinates of the farmhouse, which happened to match the coordinates of the exact center of the United States. The problem
was that MaxMind assigned more than 600 million other IP
addresses that could not be mapped by any other means to
the same coordinates. That decision led to years of pain for
the family in the farmhouse. According to Kashmir Hill, the
journalist who broke the story, “They’ve been accused of
being identity thieves, spammers, scammers and fraudsters.
They’ve gotten visited by FBI agents, federal marshals, IRS
collectors, ambulances searching for suicidal veterans, and
police officers searching for runaway children. They’ve
found people scrounging around in their barn. The renters
have been doxxed, their names and addresses posted on the
internet by vigilantes.”
Hill contacted a cofounder of MaxMind, who eventually
produced a long list of physical addresses that had many IP
addresses assigned to them and confessed that when the
company was launched, it had not occurred to his team that
“people would use the database to attempt to locate people
down to a household level.” He said, “We have always advertised the database as determining the location down to a city
or zip code level.” The takeaway is that well-intentioned,
innocuous decisions made by data scientists and database
managers can have a real, very negative impact on the privacy of innocent third parties. That brings us to the fifth P.
CYBERSECURITY & DIGITAL PRIVACY

5. PREPARATION
How is the data prepared for analysis? How is its
accuracy verified or corrected? How are incomplete data sets and missing variables managed?
Missing, erroneous, and outlying data can significantly affect
the quality of the statistical analysis. But data quality is often
poor. Experian, a credit services firm, reports that on average,
its U.S. clients believe that 27% of their revenue is wasted owing
to inaccurate and incomplete customer or prospect data.
Cleaning data, especially when it is collected from different periods, business units, or countries, can be especially
challenging. In one instance we approached a large international online talent-management and learning company to
help us research whether women and men equally obtained
the career benefits of training. The company agreed that the
question was relevant for both its customers and the public
at large, and therefore extracted the data it had on its
servers. To ensure privacy the data was anonymized so that
neither individual employees nor their employers could be
identified. Because of the size of the data set and its internal
structure, four individual data sets were extracted.
Normally we would just open the databases and find a
spreadsheet file showing the features characterizing each
individual, such as gender. A woman might be identified as
“woman” or “female” or simply “F.” The values might be misspelled (“feale”), appear in various languages (mujer or frau),
or use different cases (f or F). If the spreadsheet is small
(say, 1,000 rows), correcting such inconsistencies should be
simple. But our data contained more than one billion observations—too many, obviously, for a typical spreadsheet—so
a cleaning procedure had to be programmed and tested.
One major challenge was ascertaining how many values
had been used to identify the variables. Because the data
came from the foreign subsidiaries of multinational firms,
it had been recorded in multiple languages, meaning that
several variables had large numbers of values—94 for gender
alone. We wrote programming code to standardize all those
values, reducing gender, for instance, to three: female, male,
and unknown. Employment start and end dates were especially problematic because of differing formats for dates.
According to Tableau, a data analytics platform, cleaning
data has five basic steps: (1) Remove duplicate or irrelevant
observations; (2) fix structural errors (such as inconsistently recorded variable values); (3) remove unwanted outliers; (4) manage
missing data, perhaps by replacing each missing value with
an average for the data set; and (5) validate and question the
data and analytical results. Do the numbers look reasonable?
They may well not. One of our data sets, which recorded
the number of steps HEC Paris MBA students took each day,
contained a big surprise. On average, students took about
7,500 steps a day, but a few outliers took more than one
million steps a day. Those outliers were the result of a data
processing software error and were deleted. Obviously, if we
had not physically and statistically examined the data set,
our final analysis would have been totally erroneous.
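Applied to a toy version of the step-count data set, the five cleaning steps might look like this. The rows and the 50,000-step outlier cutoff are invented for illustration, not taken from the authors' actual procedure:

```python
raw = [
    {"id": 1, "steps": 7500},
    {"id": 1, "steps": 7500},       # duplicate observation
    {"id": 2, "steps": "8,200"},    # structural error: formatted string
    {"id": 3, "steps": 1_000_000},  # outlier from a software error
    {"id": 4, "steps": None},       # missing value
    {"id": 5, "steps": 6800},
]

# 1. Remove duplicate observations.
seen, deduped = set(), []
for row in raw:
    key = (row["id"], str(row["steps"]))
    if key not in seen:
        seen.add(key)
        deduped.append(dict(row))

# 2. Fix structural errors (numbers stored as formatted strings).
for row in deduped:
    if isinstance(row["steps"], str):
        row["steps"] = int(row["steps"].replace(",", ""))

# 3. Remove unwanted outliers (illustrative 50,000-step cutoff).
cleaned = [r for r in deduped
           if r["steps"] is None or r["steps"] <= 50_000]

# 4. Manage missing data by imputing the mean of observed values.
observed = [r["steps"] for r in cleaned if r["steps"] is not None]
mean = sum(observed) / len(observed)
for row in cleaned:
    if row["steps"] is None:
        row["steps"] = mean

# 5. Validate: do the numbers look reasonable?
assert all(0 <= r["steps"] <= 50_000 for r in cleaned)
```

The final validation step is the one the million-step outliers would have failed: a simple range check catches what summary statistics alone can hide.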
HOW AI RAISES THE STAKES
Ethics can seem an expensive luxury for companies with
strong competitors. For example, Microsoft reportedly fired
the entire ethics team for its Bing AI project because, according to press and blog reports, Google was close to releasing
its own AI-powered application, so time was of the essence.
But treating data ethics as a nice-to-have carries risks
when it comes to AI. During a recent interview the CTO of
OpenAI, the company that developed ChatGPT, observed,
“There are massive potential negative consequences whenever you build something so powerful with which so much
good can come…and that’s why…we’re trying to figure out
how to deploy these systems responsibly.”
Thanks to AI, data scientists can develop remarkably
accurate psychological and personal profiles of people on
the basis of very few bits of digital detritus left behind by
social-platform visits. The researchers Michal Kosinski,
David Stillwell, and Thore Graepel of the University of Cambridge demonstrated the ease with which Facebook likes
can accurately “predict a range of highly sensitive personal
attributes including: sexual orientation, ethnicity, religious
and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age,
and gender.” (This research was, in fact, the inspiration for
Cambridge Analytica’s use of Facebook data.)
Subsequent research by Youyou Wu, Michal Kosinski,
and David Stillwell reinforced those findings by demonstrating that computer-based personality judgments can be more
accurate than human ones. Computer predictions of personality characteristics (openness, agreeableness, extraversion,
conscientiousness, neuroticism—known as the Big Five)
using Facebook likes were nearly as accurate as assessments
by an individual’s spouse. The implications of that should
not be ignored. How would you feel if your government
wanted to catalog your private thoughts and actions?
A problem may also be rooted not in the data analyzed
but in the data overlooked. Machines can “learn” only from
what they are fed; they cannot identify variables they’re not
programmed to observe. This is known as omitted-variable
bias. The best-known example is Target’s development of an
algorithm to identify pregnant customers.
The company’s data scientist, a statistician named Andrew
Pole, created a “pregnancy prediction” score based on purchases of about 25 products, such as unscented lotions and calcium supplements. That enabled Target to promote products
before its competitors did in the hope of winning loyal customers who would buy all their baby-related products at Target.
The omitted variable was the age of the target customer, and
the accident-in-waiting occurred when the father of a 17-year-old found pregnancy-related advertisements in his mailbox.
Unaware that his daughter was pregnant, he contacted Target
to ask why it was promoting premarital sex to minors.
Even by the standards of the era, spying on minors with
the goal of identifying personal, intimate medical information was considered unethical. Pole admitted during a subsequent interview that he’d thought receiving a promotional
catalog was going to make some people uncomfortable. But
whatever concerns he may have expressed at the time did
little to delay the rollout of the program, and according to
a reporter, he got a promotion. Target eventually released a
statement claiming that it complied “with all federal and state
laws, including those related to protected health information.”
The issue for boards and top management is that using AI
to hook customers, determine suitability for a job interview,
or approve a loan application can have disastrous effects. AI’s
predictions of human behavior may be extremely accurate
but inappropriately contextualized. They may also lead to
glaring mispredictions that are just plain silly or even morally
repugnant. Relying on automated statistical tools to make
decisions is a bad idea. Board members and senior executives should view a corporate institutional review board not
as an expense, a constraint, or a social obligation but as an
early-warning system.

HBR Reprint R2304F
MICHAEL SEGALLA is a professor emeritus at HEC Paris and a
partner at the International Board Foundation. DOMINIQUE
ROUZIÈS is a professor of marketing at HEC Paris and the dean of
academic affairs at BMI Executive Institute.
Copyright 2023 Harvard Business Publishing. All Rights Reserved. Additional restrictions
may apply including the use of this content as assigned course material. Please consult your
institution’s librarian about any restrictions that might apply under the license with your
institution. For more information and teaching resources from Harvard Business Publishing
including Harvard Business School Cases, eLearning products, and business simulations
please visit hbsp.harvard.edu.
FACE OF QUALITY

CUSTOMER-FOCUSED ENVIRONMENT

ORGANIZATIONS MUST EXTEND THEIR DEFINITION OF CUSTOMERS.

JIM L. SMITH

I learned a long time ago that quality standards,
issues and performance are goals people can rally
around, unlike other goals like cost reduction or productivity improvement. Quality opens people up to
change because the change is for a good reason. It connects them with the customer and taps the motive of
pride in their work.
This should not be a surprise to most of our readers, but to create a true quality environment an organization must first focus on the customer. The purpose of
all work and all improvement effort is to better serve
the customer.
This should be recognized by everyone as fundamental to survival, but unfortunately some managers
do not always do well with this concept, straightforward as it might seem. It is important, therefore, that
managers ensure that there is a common definition of
the basic words and phrases used in communicating
what they hope to accomplish and why.
Quality means satisfying customers’ needs and
expectations. It is this focus that is, in fact, the purpose
of all work. However, it seems not everyone has the
same understanding of the word quality, which results
in mixed messages.
If the focus of a quality environment is to satisfy
the needs and expectations of the customer, then the
basic premise of all other organizational needs will be
addressed: profitability, producing quality products
and services, improving productivity, out-performing
the competition, managing change, and ensuring
employee involvement.
Another critical definition is required. Just as some
people are apt to translate quality too narrowly, so
too may we consider customers in the same restrictive
sense. One of the single most powerful revelations in
my quality education has been that customers are not
only external but internal as well.
When our thoughts are extended to other departments and fellow employees as customers, significant
positive changes occur in the way work is done or, in
quality terms, in the way we deliver our outputs, products or services.
It is important to emphasize that satisfying the
needs of the external customers must be paramount.
As we strive to better meet the needs of internal customers, we must guard against diminishing external
customer satisfaction. The challenge is to see our
efforts as a total system designed to satisfy our traditional customers.
QUALITY | August 2019
The pursuit, the focus, is toward but one end, which
is to meet or exceed customer expectations. It is this
oneness of purpose that links all activities toward a
single end that makes the total quality environment.
The focus on internal customers and satisfying their
needs toward improving external customer satisfaction
has the potential to transform the organization from
one of departmental boundaries and barriers into one
of complementing rather than competing activities.
In this new environment information ceases to be
hoarded as a power cache and is shared not only within
the department but with others as well. Collaboration
is common, competition is not; partnerships are
sought, teamwork prevails; and continual improvement
of the system is the goal.
The customer focus, when supported by this single-system attitude, requires a new generation of management that is long past due for some organizations. The
traditional hierarchical organization restricts not only
management but all within it.
The organization that is capable of multi-department, cross-functional teamwork on a daily basis is
one where processes are seen as related parts of the
total quality system. People working in such an environment better understand not only the organization’s
mission, but their own role toward its accomplishment.
Consequently, people are better able to fulfill their
tasks and to improve on them.
Essentially, what is being described is the culture
of an organization. More than any other responsibility of management, the culture it creates, supports, or
maintains is critical to the ability of the organization
to provide the desired products and services.
Too often, however, management gives little thought
to the cultural tasks required to create and maintain
the environment. Typically, when management’s attention is on aspects of the work environment, it is in
response to conditions occurring because of management negligence. The recognition of internal customers, however, helps management address how best to
satisfy the needs of direct reports, work associates, and
other departments.
A true quality environment is driven by a focus on
the customer. This purpose provides our organization
direction as well as purpose.
Jim L. Smith has more than 45 years of industry experience in
operations, engineering, research & development and quality
management. You can reach Jim at faceofquality@qualitymag.com.
www.qualitymag.com
Copyright of Quality is the property of BNP Media and its content may not be copied or
emailed to multiple sites or posted to a listserv without the copyright holder’s express written
permission. However, users may print, download, or email articles for individual use.