Q3
Review the data from the State of New York open-source data portal
https://healthdata.ny.gov/
1. Select two data sets from the NY state open data portal that would help you address the problem area of high costs and waste in the US healthcare system. Describe how they could help you address the problem area of interest.
2. Review your data sources and identify:
a) What medical terminology systems are used in the data (SNOMED, CPT, etc.)?
b) (i) If you do not find any standardized codes, find at least two concepts in the data that could have been captured using standard codes.
b) (ii) If you do not find any standardized codes, choose two other data sets from the NY open-data portal that use the appropriate codes. Provide a snapshot (clipping) of each data set.
c) In 3-4 sentences, explain how these standardized codes might help your analyses.
d) Describe how it might be challenging to integrate the data for analysts.
e) Review the data sets you have chosen, and look at the data and the metadata to think about what fields are available for linkages. Provide about 2-3 sentences to address each of the questions below.
1. How would you work with an analytics team to link data from the various data sources?
2. Why?
3. What fields could be used to link the data sources?
4. What kind of linkage methods might you try?
5. Would these fields be sufficient to obtain high-quality matches? Why or why not? How would you know?
6. What privacy or legal concerns might arise during the linkage effort?
7. Anything else about the data you notice?
Class Notes
Overview
Let us study healthcare’s high costs and waste as an area with opportunities for improvement, because both are serious problems for the US. A November 2020 study by Arizona State University found that between $600 billion and more than $1.9 trillion was wasted every year, or between $1,800 and $5,700 per person per year. Any improvement would therefore be beneficial to the US.
Healthcare spending is high in the US compared with other developed countries because of administrative complexity and expense, high physician salaries, and the costs of pharmaceuticals. Waste is associated with clinical inefficiencies, missed prevention opportunities, overuse, administrative waste, excessive prices, and fraud and abuse.
For the purpose of this assignment, here are the general guidelines:
1. Review the data from the State of New York open-source data portal
https://healthdata.ny.gov/
2. Think about a US healthcare problem area, high cost and waste, at a conceptual level only. You are not required to access the data files or perform any data analytics.
3. Describe the high cost and waste problem in the US healthcare system.
4. Describe the data you found on the state’s portal and how you plan to address the challenges for improvement.
Specific hints for Q 3
1. To select a data set from the NY state open data portal that would help you address the problem area of high costs and waste in the US healthcare system, you could start by browsing through the categories of data available on the website, such as “Hospital Data,” “Healthcare Costs,” “Healthcare Quality,” and “Population Health.” Within these categories, you could look for data sets that specifically pertain to healthcare costs and waste, such as data on hospital charges, claims data, or data on utilization of healthcare services.
Step 1
List and describe two data sets that could augment or improve the analysis and explain why you chose them. Provide a snapshot of each data set in your answer to demonstrate your choice.
To justify your choice, for example, you might think it is important to know whether the patients of a particular hospital had various types of health coverage. You might also want to know whether particular counties or regions of the state have patients with abnormally long hospital stays and also have other public health problems, such as high poverty rates or high obesity rates.
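The county-level comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the county names, lengths of stay, and poverty rates are made-up values, not figures from the portal, and the 1.25 ratio used to call a county's stays "abnormally long" is an arbitrary threshold chosen for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical discharge records: (county, length of stay in days).
discharges = [
    ("Albany", 3), ("Albany", 5), ("Albany", 4),
    ("Bronx", 9), ("Bronx", 11), ("Bronx", 7),
    ("Erie", 4), ("Erie", 6),
]

# Hypothetical county-level indicator: poverty rate in percent.
poverty_rate = {"Albany": 12.0, "Bronx": 27.0, "Erie": 14.0}


def mean_los_by_county(records):
    """Average the length of stay for each county."""
    stays = defaultdict(list)
    for county, los in records:
        stays[county].append(los)
    return {county: mean(values) for county, values in stays.items()}


def flag_long_stay_counties(records, indicators, ratio=1.25):
    """Return counties whose mean stay exceeds `ratio` times the overall mean,
    together with the matching indicator value (None if the county is missing)."""
    by_county = mean_los_by_county(records)
    overall = mean(los for _, los in records)
    return {
        county: {"mean_los": avg, "poverty_rate": indicators.get(county)}
        for county, avg in by_county.items()
        if avg > ratio * overall
    }


flagged = flag_long_stay_counties(discharges, poverty_rate)
print(flagged)
```

With these made-up values only one county is flagged. In the assignment itself the same reasoning stays at a conceptual level: one data set supplies the stays, the other supplies the public health indicator, and the county name is the field that connects them.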
Step 2
Once you have selected a relevant data set, you can review the data documentation to identify what medical terminology systems are used in the data.
For example, you may see that the data uses codes from the SNOMED CT (Systematized Nomenclature of Medicine – Clinical Terms) or the ICD-10 (International Classification of Diseases, 10th Revision) for diagnosis codes, or codes from the CPT (Current Procedural Terminology) for procedure codes.
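When the documentation is unclear, the shape of the values in a column can hint at which terminology is in use. The sketch below is a rough heuristic, not an official validator: the two patterns are simplified assumptions about how ICD-10 and CPT codes usually look, they do not check that a code actually exists, and SNOMED CT identifiers are not covered.

```python
import re

# Simplified shape patterns (assumptions, not official definitions).
# ICD-10 style: a letter, a digit, one more character, then an optional
# decimal point and up to four further characters, e.g. "E11.9" or "I10".
ICD10_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.?[0-9A-Z]{1,4})?$")
# CPT style: five digits, or four digits followed by F or T.
CPT_SHAPE = re.compile(r"^[0-9]{4}[0-9FT]$")


def guess_code_system(value):
    """Guess which coding system a raw field value resembles."""
    text = value.strip().upper()
    if CPT_SHAPE.match(text):
        return "CPT-like"
    if ICD10_SHAPE.match(text):
        return "ICD-10-like"
    return "unrecognized"


for sample in ["E11.9", "I10", "99213", "Diabetes"]:
    print(sample, "->", guess_code_system(sample))
```

A free-text value such as "Diabetes" falls through as unrecognized, which is the kind of concept that could have been captured with a standard code instead.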
1. If you do not find any standardized codes in your choice of data in Step 1:
a) find at least two concepts in the data that could have been captured using standard codes.
b) choose another two data sets that use the appropriate codes. This exercise is to show that you recognize these terminologies for the purpose of communication.
In each of the above situations, provide a snapshot clipping of the relevant data sets.
Step 3
Standardized codes can help your analyses by allowing you to link and aggregate data from different sources and to make comparisons across patients, providers, and settings. They can also enable you to track trends over time and evaluate the effectiveness of different treatments. Furthermore, they help to improve data quality by making the data more comparable, consistent, and accurate. This can help to inform policy decisions, evaluate the impact of interventions, and identify opportunities for improving the efficiency of the healthcare system.
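As a small illustration of linking and aggregating on a shared code, the sketch below totals two hypothetical sources by diagnosis code and keeps the codes that appear in both. The rows, field names (`dx_code`, `charge`, `visits`), and facility names are assumptions made for the example and do not come from any portal data set.

```python
from collections import defaultdict

# Hypothetical rows from two sources that both record an ICD-10 style code.
hospital_charges = [
    {"facility": "Hospital A", "dx_code": "E11.9", "charge": 12000.0},
    {"facility": "Hospital B", "dx_code": "E11.9", "charge": 18000.0},
    {"facility": "Hospital A", "dx_code": "I10", "charge": 7000.0},
]
outpatient_visits = [
    {"clinic": "Clinic X", "dx_code": "E11.9", "visits": 40},
    {"clinic": "Clinic Y", "dx_code": "I10", "visits": 25},
    {"clinic": "Clinic Y", "dx_code": "J45.909", "visits": 10},
]


def total_by_code(rows, value_field):
    """Sum one numeric field per diagnosis code."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["dx_code"]] += row[value_field]
    return dict(totals)


def link_on_code(charges, visits):
    """Join the two aggregated sources on the codes they have in common."""
    charge_totals = total_by_code(charges, "charge")
    visit_totals = total_by_code(visits, "visits")
    shared = sorted(charge_totals.keys() & visit_totals.keys())
    return {
        code: {
            "total_charge": charge_totals[code],
            "total_visits": visit_totals[code],
        }
        for code in shared
    }


linked = link_on_code(hospital_charges, outpatient_visits)
print(linked)
```

The join works only because both sources spell the diagnosis the same way. If one source recorded "type 2 diabetes" as free text and the other used a code, the records would need manual mapping before any comparison.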