This problem set assignment will involve activities designed to solidify the concepts learned in both Modules One and Two. Problems will be similar to those you will face in Module Four and will include one or two real-world applications to prepare you to think like a biostatistician.
Check the module resource list to see which videos on the StatCrunch Help channel will help with this assignment.
For support on the concepts of descriptive statistics, variables, and sampling, visit the suggested Khan Academy videos in the module resources list.
Textbook link: https://bncvirtual.com/vb_econtent.php?ACTION=econtent&FVENCKEY=AD9EE8D798DCAFC7E76B5FB7C978DD86&j=43766531&sfmc_sub=1597096465&l=23329524_HTML&u=695880241&mid=524003857&jb=40753&utm_term=10242022&utm_source=transactional&utm_medium=email&utm_campaign=Direct_Ebooks
Textbook: Basic Biostatistics: Statistics for Public Health Practice, Chapter 3 and Chapter 4
Video: https://www.khanacademy.org/math/statistics-probability/displaying-describing-data
https://www.statcrunch.com/
IHP 525 Module Two Problem Set
Five-City Project. The Stanford Five-City Project is a comprehensive community health education study of five moderately sized Northern California towns. Multiple-risk factor intervention strategies were randomly applied to two of the communities. The other three cities served as controls. Outline the design of this study in schematic form.
Employee counseling. An employer offers its employees a program that will provide up to four free psychological counseling sessions per calendar year. To evaluate satisfaction with this service, the counseling office mails questionnaires to every 10th employee who used the benefit in the prior year. There were 1000 employees who used the benefit. Therefore, 100 surveys were sent out. However, only 25 of the potential respondents completed and returned their questionnaire.
Describe the population for the study.
Describe the sample.
What concern is raised by the fact that only 25 of the 100 questionnaires were completed and returned?
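The arithmetic in the employee counseling problem can be checked with a minimal sketch. The variable names are hypothetical and chosen only for illustration; the numbers come from the problem statement.

```python
# Figures taken from the problem statement; names are illustrative only.
benefit_users = 1000      # employees who used the counseling benefit
sampling_interval = 10    # every 10th user was mailed a questionnaire
returned = 25             # completed questionnaires received

surveys_sent = benefit_users // sampling_interval
response_rate = returned / surveys_sent

print(f"Surveys sent: {surveys_sent}")
print(f"Response rate: {response_rate:.0%}")
```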
Air samples. An environmental study looked at suspended particulate matter in air samples (µg/m³) at two different sites. Data are listed here. Construct a stemplot for each site (side-by-side stemplots would be helpful here but are not required), and then compare the distribution of suspended particulate matter between the two sites (remember to mention center and spread). No calculations are necessary.
Site 1: 68 22 36 32 42 24 28 38
Site 2: 36 38 39 40 36 34 33 32
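If you want to check a hand-drawn stemplot, the following Python sketch may help. It assumes the tens digit is the stem and the units digit is the leaf; the function name stemplot is hypothetical, and this is not a StatCrunch feature.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens digit as stem, units digit as leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines

site1 = [68, 22, 36, 32, 42, 24, 28, 38]
site2 = [36, 38, 39, 40, 36, 34, 33, 32]

for name, data in (("Site 1", site1), ("Site 2", site2)):
    print(name)
    print("\n".join(stemplot(data)))
```

Both plots use the same stem width, so center and spread can be compared by eye.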
What would you report? What is an appropriate measure of central location for data that are really skewed? What is an appropriate measure of spread for data that are really skewed?
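As a rough illustration of why the choice of summary matters for skewed data, the sketch below reuses the Site 1 air-sample values (an assumption made only for demonstration; this question does not specify a data set) and compares the mean and median with and without the largest value.

```python
import statistics

# Site 1 values reused purely as an example of data with one high value.
site1 = [68, 22, 36, 32, 42, 24, 28, 38]
without_max = sorted(site1)[:-1]  # drop the largest observation (68)

mean_all = statistics.mean(site1)
median_all = statistics.median(site1)
mean_trimmed = statistics.mean(without_max)
median_trimmed = statistics.median(without_max)

print(f"With 68:    mean = {mean_all:.2f}, median = {median_all}")
print(f"Without 68: mean = {mean_trimmed:.2f}, median = {median_trimmed}")
```

The mean shifts more than the median when the high value is removed, which is the behavior to keep in mind when answering.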
Melanoma treatment. A study by Morgan and coworkers used genetically modified white blood cells to treat patients with melanoma who had not responded to standard treatments. In patients in whom the cells were cultured ex vivo for an extended period of time (cohort 1), the cell doubling times were {8.7, 11.9, 10.0} days. In a second group of patients in whom the cells were cultured for a shorter period of time (cohort 2), the cell doubling times were {1.4, 1.0, 1.3, 1.0, 1.3, 2.0, 0.6, 0.8, 0.7, 0.9, 1.9} days. In a third group of patients (cohort 3), actively dividing cells were generated by performing a second rapid expansion via active cell transfer. Cell doubling times for cohort 3 were {0.9, 3.3, 1.2, 1.1} days.
1. Create side-by-side boxplots of these data (so each boxplot uses the same y-axis).
2. In addition, calculate the mean and standard deviation within each group. Comment on your findings.
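The summary statistics for the three cohorts can be sketched in Python as a cross-check on StatCrunch output. This is a minimal sketch under stated assumptions: the helper name summarize is hypothetical, the standard deviation is the sample standard deviation (n - 1 denominator), and the quartiles use Python's "inclusive" method, which may differ slightly from the quartile rule used by StatCrunch or the textbook.

```python
import statistics

# Cell doubling times (days) from the problem statement.
cohorts = {
    "cohort 1": [8.7, 11.9, 10.0],
    "cohort 2": [1.4, 1.0, 1.3, 1.0, 1.3, 2.0, 0.6, 0.8, 0.7, 0.9, 1.9],
    "cohort 3": [0.9, 3.3, 1.2, 1.1],
}

def summarize(values):
    """Mean, sample SD, and five-number summary for one group."""
    # Quartile conventions vary between packages; "inclusive" is one choice.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD, n - 1 denominator
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

summaries = {name: summarize(values) for name, values in cohorts.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The five-number summary in each row contains the values a boxplot displays, so the three rows can be read against one shared y-axis.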
Problems retrieved from Gerstman, B. B. (2015). Basic biostatistics: Statistics for public health practice (2nd ed.). Burlington, MA: Jones and Bartlett. ISBN: 978-1-284-03601-5