1. Read the case, fill out the rest of the table, and follow the instructions.

2. Fill out the table about diabetic devices.

Introduction to Data Analysis and Basic Statistical Concepts

Inna Miroshnyk, PhD

New Curriculum/Spring 2022

Descriptive Statistics

Summarizing, organizing, and presenting data

Learning Objectives (Part II)

❑ Describe and calculate the measures of central tendency of data sets

❑ mean

❑ median

❑ mode

❑ Define and calculate the measures of dispersion

❑ range and IQR

❑ Variance/CV

❑ SD

❑ Describe the measures of distribution shape

❑ Compare and contrast normal and skewed distributions

❑ Differentiate between and calculate the measures of risk

❑ Incidence, prevalence, mortality rate

❑ Define and calculate performance measures for diagnostic tests

❑ Sensitivity, specificity, accuracy, positive/negative predictive values

❑ Organize and present data in a scientifically meaningful way


Spring 2021-2022

Measures of Central Tendency / Central Location

❑ Mean (average)

(appropriate for interval and ratio levels of data measurement)

❑ Median

(the value in the middle of the ordered list)

• 20 20 20 38 42 42 51 68

Most appropriate for data sets with outliers

❑ Mode

(most frequently occurring value)*

*Rarely used in medical research
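The three measures above can be checked with a minimal Python sketch. It assumes the eight values from the median example are the data of interest and uses only the standard-library `statistics` module.

```python
import statistics

# Data set taken from the median example above.
values = [20, 20, 20, 38, 42, 42, 51, 68]

mean = statistics.mean(values)      # sum of the values / number of values
median = statistics.median(values)  # middle of the ordered list (average of 38 and 42 here)
mode = statistics.mode(values)      # most frequently occurring value

print(f"mean = {mean}, median = {median}, mode = {mode}")
```

With an even number of values, the median is the average of the two middle values, which is why it is 40 even though 40 does not appear in the list.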

How to Measure the Variability between Two Groups?

Mean 1 = Mean 2

?

Measures of Variability/Dispersion

Describe how data are spread

Range

• Used for any numerical data

Interquartile Range (IQR)

• Used for any numerical data

Variance

• Used for continuous & some discrete data

Standard deviation (SD)

• Used for continuous & some discrete data

Measures of Variability: Range

Range = MAX value – MIN value

Range 1 = (202 – 170) = 32

Range 2 = (235 – 140) = 95

Interquartile Range (IQR)

The range restricted to values within the middle 50% of the distribution.

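IQR = Q3 – Q1 (the middle 50% of the values; 25% of the values lie below Q1 and 25% lie above Q3)

20 20 20 38 42 42 51 68

• Min = 20, Max = 68

• Median = Q2 = 40

• Q1 (median of the lower half: 20 20 20 38) = (20 + 20) / 2 = 20

• Q3 (median of the upper half: 42 42 51 68) = (42 + 51) / 2 = 46.5

• Range = 68 – 20 = 48

• IQR = 46.5 – 20 = 26.5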
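The quartile calculation above can be sketched in Python as follows. The helper `quartiles_by_halves` is a hypothetical name, and the handling of an odd number of values (leaving the middle value out of both halves) is an assumption, since the example only covers an even count.

```python
import statistics

def quartiles_by_halves(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # lower half of the ordered list
    upper = ordered[half + n % 2:]   # upper half; skips the middle value when n is odd (assumption)
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

values = [20, 20, 20, 38, 42, 42, 51, 68]
q1, q2, q3 = quartiles_by_halves(values)
data_range = max(values) - min(values)
iqr = q3 - q1

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, range = {data_range}, IQR = {iqr}")
```

Library routines that interpolate between values (for example `statistics.quantiles` or `numpy.percentile`) may return slightly different quartiles for the same data, so the method should be stated when reporting an IQR.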

Standard Deviation & Variance

Variance (σ²)

• Quantifies the spread around the mean

$$\sigma^2 = \frac{\sum (\text{Data Value} - \text{Mean})^2}{\text{Total \# of observations}} = SD^2$$

Standard Deviation (SD)

• The “average” deviation of all values from the sample mean

• Same units as the original data

$$SD = \sqrt{\frac{\sum (\text{Data Value} - \text{Mean})^2}{\text{Total \# of observations}}}$$

❑ Coefficient of variation (CV) shows the extent of variability

$$CV = \frac{SD}{\text{mean}}$$

Mean = 37.6

Standard Deviation (SD) = 17.2

Variance = 296

CV = 0.46
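A short Python sketch can reproduce these summary values. It assumes they refer to the data set 20 20 20 38 42 42 51 68 used earlier. Under that assumption the numbers match the sample formulas, which divide by n − 1, rather than division by the total number of observations, so both versions are computed for comparison.

```python
import statistics

# Assumed data set: the eight values used in the median and IQR examples.
values = [20, 20, 20, 38, 42, 42, 51, 68]

mean = statistics.mean(values)

# Sample formulas: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(values)
sample_sd = statistics.stdev(values)

# Population formulas: sum of squared deviations divided by n.
population_variance = statistics.pvariance(values)
population_sd = statistics.pstdev(values)

# Coefficient of variation: SD relative to the mean (unitless).
cv = sample_sd / mean

print(round(mean, 1), round(sample_sd, 1), round(sample_variance), round(cv, 2))
# 37.6 17.2 296 0.46
print(round(population_sd, 1), round(population_variance, 1))
```

The choice of denominator matters most for small data sets; for large n the two versions converge.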

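Calculation of the Standard Deviation

| x | x̄ (the mean) | x − x̄ | (x − x̄)² |
|---|---|---|---|
| 101.8 | 103 | -1.2 | 1.44 |
| 103.2 | 103 | 0.2 | 0.04 |
| 104.0 | 103 | 1.0 | 1.00 |
| 102.5 | 103 | -0.5 | 0.25 |
| 103.5 | 103 | 0.5 | 0.25 |
| Σx = 515 | | | Σ(x − x̄)² = 2.98 |

$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} = \sqrt{\frac{2.98}{5}} = 0.77$$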
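The same calculation can be followed step by step in Python. This is a minimal sketch that mirrors the columns of the worked example and divides by N, as the example does.

```python
import math

x = [101.8, 103.2, 104.0, 102.5, 103.5]

n = len(x)
mean = sum(x) / n                          # 515 / 5 = 103
deviations = [xi - mean for xi in x]       # column "x - mean"
squared = [d ** 2 for d in deviations]     # column "(x - mean)^2"
sum_of_squares = sum(squared)              # about 2.98
sd = math.sqrt(sum_of_squares / n)         # divides by N, as in the worked example

for xi, d, sq in zip(x, deviations, squared):
    print(f"{xi:6.1f} {d:6.1f} {sq:6.2f}")
print(f"sum of squares = {sum_of_squares:.2f}, SD = {sd:.2f}")
```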

Give it Some Thought!

Does this study properly describe the dose given?

Measures of Distribution Shape

Does Distribution Shape Matter?

Skewness (asymmetry) = uneven distribution of the data around the mean

Normal (Gaussian) Distribution

• Symmetrical, bell-shaped curve

• 50% of the values are on the left side and 50% are on the right side

• The skewness and (excess) kurtosis are zero

• AUC = 1

• Mean = Median = Mode

• x-axis: #SD (number of standard deviations from the mean)

Skewed Distributions

• Mean, Median and Mode are NOT equal

• Reasons: small sample size or outliers (extreme values)

• Skew refers to the direction of the tail

• Negative (left) skewness due to outliers: more high values (mean < median < mode)

• True positive (right) skewness: more low values (mean > median > mode)

• Skewed distributions usually need to be transformed into approximately normal distributions before further (parametric) analysis

• The median is the preferable measure of central tendency for skewed distributions
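The direction of skew can also be quantified. The sketch below uses the moment-based coefficient g1 = m3 / m2^1.5, which is one common definition and not necessarily the one intended in this lecture. The three samples are hypothetical and invented for illustration only.

```python
import statistics

def skewness(data):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples, invented for illustration only.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]    # long tail toward high values
left_skewed = [-v for v in right_skewed]    # mirror image: long tail toward low values
symmetric = [1, 2, 3, 4, 5]

for name, sample in [("right", right_skewed),
                     ("left", left_skewed),
                     ("symmetric", symmetric)]:
    print(name,
          round(skewness(sample), 2),
          statistics.mean(sample),
          statistics.median(sample))
```

A positive coefficient indicates a tail toward high values (mean pulled above the median), a negative coefficient indicates a tail toward low values, and a symmetric sample gives zero.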


Measures of Risk in Clinical Research

Epidemiology of Diabetes in the US

Incidence and Prevalence as Morbidity Measures

❑ Morbidity is defined as any departure, subjective or objective, from a state of physiological or psychological well-being.

❑ disease

❑ injury

❑ disability

❑ Measures of morbidity frequency

❑ Incidence

❑ Prevalence

Prevalence as Morbidity Measure

Prevalence

Measures the # of existing cases (new and preexisting) at a particular point in time

$$\text{Point Prevalence (Prevalence Rate)} = \frac{\text{\# of existing cases at a specified point in time}}{\text{population at the same specified point in time}}$$

Incidence as Morbidity Measure

Incidence

Measures the number of new cases of a disease during a given period

Incidence proportion is a measure of the risk of disease or the probability of developing the disease during the specified period.

$$\text{Incidence Proportion (Risk)} = \frac{\text{\# of NEW cases during a specified period of time}}{\text{population at start of the time interval}}$$

Global Epidemiology of COVID-19: MORTALITY

$$\text{Mortality Rate} = \frac{\text{\# of individuals who died during a specified period of time}}{\text{population at start of the time interval}}$$
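The three measures of risk defined in this part (point prevalence, incidence proportion, mortality rate) all share the same shape: a count divided by a reference population. A minimal Python sketch follows. The function names and all numbers are hypothetical and chosen only to illustrate the arithmetic.

```python
def point_prevalence(existing_cases, population):
    """Existing cases (new and preexisting) / population at the same point in time."""
    return existing_cases / population

def incidence_proportion(new_cases, population_at_start):
    """New cases during the period / population at the start of the period."""
    return new_cases / population_at_start

def mortality_rate(deaths, population_at_start):
    """Deaths during the period / population at the start of the period."""
    return deaths / population_at_start

# Hypothetical community followed for one year (numbers invented for illustration).
population = 10_000
existing_cases = 1_100
new_cases = 70
deaths = 25

print(f"prevalence = {point_prevalence(existing_cases, population):.1%}")
print(f"incidence proportion = {incidence_proportion(new_cases, population):.2%}")
print(f"mortality rate = {mortality_rate(deaths, population) * 100_000:.0f} per 100,000")
```

Small proportions are often easier to read when scaled to a fixed population size, such as per 1,000 or per 100,000, as in the last line.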

Diagnostic Tests and Their Performance Measures

Real-World Performance of COVID-19 Rapid Antigen Tests

https://asm.org/Articles/2021/December/Real-World-Performance-of-COVID-19-Rapid-Antigen-T

Test Performance Measures

GOAL: ↑True positives / ↓ False-positives

1. Sensitivity or true positive rate

Test ability to correctly identify individuals WITH disease, calculated as the proportion of individuals with the disease who are correctly identified by the test

$$\text{Sensitivity} = \frac{\text{\# of true positive (+) tests}}{\text{\# of ALL patients with the disease}}$$

Typical sensitivity ~ 80%

Test Performance Measures (cont’d)

2. Specificity or true negative rate

Test ability to correctly identify individuals WITHOUT disease, calculated as the proportion of individuals without the disease who are correctly identified by the test

$$\text{Specificity} = \frac{\text{\# of true negative (−) tests}}{\text{\# of ALL patients without the disease}}$$

Typical specificity ~ 90%

Sensitivity and specificity are properties of the test itself and, in principle, do not depend on the prevalence of the disease; the predictive values do.

Test Performance Measures (cont’d)

3. Accuracy

Proportion of all tests that are correctly classified

$$\text{Accuracy} = \frac{\text{\# of true positive (+) tests} + \text{\# of true negative (−) tests}}{\text{\# of ALL patients tested}}$$
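Sensitivity, specificity, and accuracy can all be computed from the four cells of a 2×2 table. The sketch below is a minimal illustration. The abbreviations TP, FN, TN, and FP and the counts are assumptions introduced here, with the counts chosen to match the "typical" 80% and 90% figures quoted above.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / all patients with the disease (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / all patients without the disease (TN + FP)."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Correct results (TP + TN) / all patients tested."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts: 100 patients with the disease, 100 without.
tp, fn = 80, 20   # diseased patients testing positive / negative
tn, fp = 90, 10   # healthy patients testing negative / positive

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"specificity = {specificity(tn, fp):.0%}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.0%}")
```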

Test Performance Measures (cont’d)

4. Predictive Value

Shows how likely it is that the tested individual does/does not have the disease

Positive predictive value

Probability that a positively tested patient has the disease

$$\text{PPV} = \frac{\text{\# of true positive (+) tests}}{\text{\# of true positive tests} + \text{\# of false positive tests}}$$

Negative predictive value

Probability that a negatively tested patient does not have the disease

$$\text{NPV} = \frac{\text{\# of true negative (−) tests}}{\text{\# of true negative tests} + \text{\# of false negative tests}}$$
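The predictive values can be sketched in Python in two ways: directly from counts, and from sensitivity, specificity, and prevalence via Bayes' theorem. The function names, counts, and prevalence values below are hypothetical. The second function illustrates that PPV and NPV shift with prevalence even when sensitivity and specificity are held fixed.

```python
def ppv(tp, fp):
    """Positive predictive value: TP / all positive tests (TP + FP)."""
    return tp / (tp + fp)

def npv(tn, fn):
    """Negative predictive value: TN / all negative tests (TN + FN)."""
    return tn / (tn + fn)

def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) for a given sensitivity, specificity, and prevalence."""
    tp = sens * prevalence                # diseased and test positive
    fn = (1 - sens) * prevalence          # diseased but test negative
    tn = spec * (1 - prevalence)          # healthy and test negative
    fp = (1 - spec) * (1 - prevalence)    # healthy but test positive
    return tp / (tp + fp), tn / (tn + fn)

# Hypothetical counts: TP = 80, FP = 10, TN = 90, FN = 20.
print(f"PPV = {ppv(80, 10):.1%}, NPV = {npv(90, 20):.1%}")

# Same test (80% sensitivity, 90% specificity) at different hypothetical prevalences.
for prevalence in (0.01, 0.10, 0.50):
    p, n = predictive_values(0.80, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {p:.1%}, NPV = {n:.1%}")
```

At low prevalence most positive results come from the large healthy group, so the PPV is low even for a reasonably specific test.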

Visual Representation of Data

Tables, Plots, Graphs, and Charts

Examples of Frequency Table

• Lack of labels and description of units

• Clear organization + appropriate description
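As a small illustration of a clearly organized frequency table, the Python sketch below counts each distinct value and prints labeled columns for absolute and relative frequency. The data set is the one used in the earlier examples, and the column labels are assumptions, since the slides do not state units for these values.

```python
from collections import Counter

values = [20, 20, 20, 38, 42, 42, 51, 68]

counts = Counter(values)
total = len(values)

# One row per distinct value: (value, absolute frequency, relative frequency).
rows = [(value, count, count / total) for value, count in sorted(counts.items())]

print(f"{'Value':>5} {'Frequency (n)':>14} {'Relative frequency (%)':>23}")
for value, count, share in rows:
    print(f"{value:>5} {count:>14} {share * 100:>23.1f}")
print(f"{'Total':>5} {total:>14} {100.0:>23.1f}")
```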

Example Box and Whisker Plot

Used to present the range/spread of data

❑ “Box” part = IQR

❑ Line inside the box represents the median

❑ The “whiskers” mark the min and max values

Example Bar Chart

Visualizes the ordinal data (categorical & discrete) for a question that used a Likert-type scale

How often does your pharmacist offer to provide information about the prescriptions you fill? (N = 100)

Example Histogram

Mammography screening data

Visualizes the continuous data by venue over time

Commonly used to display the frequency distribution of a single interval or ratio variable.

Example Pie Chart

Visualizes the proportions or relative quantities of values

Criteria to consider:

– Useful for a small # of categories

– Must be clearly labeled and colored (or a legend must be used)

– Logical dividing of data

Recap

Data Variables

Four Data Measurement Levels

Descriptive Statistics

• Measures of Central Tendency (Mean, Median, Mode)

• Measures of Dispersion (Range/IQR, SD and variance)

• Measures of Distribution shape (Normal vs Skewed)

Measures of Risk

• Morbidity & Mortality

Performance Measures of Diagnostic tests

• Sensitivity

• Specificity

• Accuracy

• Predictive values

Visual Data Representation

Students' Names: ___________________________

_______________________

____________________________

_________________________

____________________________

Date: _________________

Total Points (total: 84 points): ___________________

Complete the SOAP note PRIOR to the laboratory session and be able to present every category in the exact order listed in the rubric below. Every group member must present information to get credit. Your group has a maximum of 15 minutes to present the information. Review the rubric; the entire group must practice together in advance of the session. You can write your answers in the sections listed below. Your grade will be based on your group’s presentation and not the written material (the assignment does not need to be submitted).

Criteria | Full Credit (2 points) | Half Credit (1 point) | No Credit (0 points)

Subjective Section

· Chief complaint
– Find out results of blood work

· HPI
– 65-year-old female
– Caucasian

· PMH
– Type 2 DM
– Stroke

· Family history
– Mother and father 85 YO
– Both still alive
– Both DM, HTN, and hyperlipidemia

· Surgical/social history
– Walks 150 mins/week (with resistance exercises 2 times/week) and eats fried foods a couple of times a month

Objective Section

· Allergies
– NKDA

· Immunization history
– Needs pneumonia vaccine
– All others up to date

· Medication list
– Metformin 1000 mg twice a day
– ASA 81 mg once a day
– Lisinopril 10 mg once a day

· ROS/physical exam

· Vital signs
– BP: 134/96
– Pulse: 80 bpm
– Height: 5’3”
– Weight: 160 lbs
– Temp: 98.6 °F
– BMI: 28.3

· Labs/diagnostic test results
– Fasting blood glucose: 150 mg/dL
– HDL: 43 mg/dL
– LDL: 150 mg/dL
– TG: 200 mg/dL
– Total cholesterol: 243 mg/dL
– ALT: 16
– AST: 18
– Urine albumin excretion: 35 mcg/mg creatinine (same result in 2 out of 3 tests within a 3-6 month period)
– Na: 140
– K: 4.0
– Mg: 2.0
– Ca: 9.5
– Albumin: 4.0
– SCr: 1.1 (normal range for female)
– BUN: 12
– HCO3: 20
– HbA1c: 7.9%
– TSH: WNL
– GFR: 65 mL/min/1.73 m²

Assessment and Problem List_Nellie A.

· Correctly identifies primary problem

Type 2 diabetes mellitus

· Characterizes (controlled, uncontrolled, etc.) and provides appropriate action (requiring therapy initiation, therapy continuation, etc.) for primary problem

AM has uncontrolled type 2 DM, and both of her parents have DM. I would suggest adding a GLP-1 receptor agonist, since the patient is already on metformin. Since the pt would prefer a once-weekly medication, I would suggest Ozempic (semaglutide).

· Correctly identifies secondary problems, if applicable

AM also has HTN and a history of stroke

· Characterizes (controlled, uncontrolled, etc.) and provides appropriate action (requiring therapy initiation, therapy continuation, etc.) for secondary problems

AM has uncontrolled HTN, which requires additional therapy (she is currently only on lisinopril)

· Correctly identifies a second secondary problem, if applicable

AM has hyperlipidemia (perhaps she can exercise more, add more fiber to her diet, and eat healthier)

· Characterizes (controlled, uncontrolled, etc.) and provides appropriate action (requiring therapy initiation, therapy continuation, etc.) for the secondary problem

AM has hyperlipidemia and a history of stroke, yet she is not on a statin; her doctor should initiate statin therapy immediately.

Plan for Primary Problem_Nellie A.

· Includes three SMART goals of therapy (need to include time frame for every goal)

Since her A1C is uncontrolled, it needs to be lowered to under 7%; measure every 3 months until therapy changes or she is at goal.

Fasting blood glucose needs to be lowered to under 126 mg/dL

Frequent eye examinations

Monitor blood pressure