Jason Li
Project Personnel, Mayo Clinic
3 active projects
Duplicate of How to Get Started with Registered Tier Data (v7)
Scientific Questions Being Studied
We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.
What should you expect? This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Record (EHR), Physical Measurements (PM), and Survey data.
Project Purpose(s)
- Educational
- Methods Development
- Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)
Scientific Approaches
This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:
1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with each major data type: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users who want to do their own cleaning.
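The data availability step amounts to counting distinct participants per data type and in the overlaps between types. The following is a minimal sketch with made-up participant IDs, not the notebooks' actual code; the sets `ehr`, `survey`, and `pm` and the helper `availability_summary` are hypothetical stand-ins for results that would come from queries against the CDR.

```python
# Hypothetical participant IDs per data type. In the workspace these would
# come from queries against the CDR rather than hard-coded sets.
ehr = {1, 2, 3, 5}
survey = {1, 2, 3, 4, 5, 6}
pm = {2, 3, 4, 5}


def availability_summary(**sources):
    """Count unique participants per data type, with every type, and with any type."""
    summary = {name: len(ids) for name, ids in sources.items()}
    summary["all_types"] = len(set.intersection(*sources.values()))
    summary["any_type"] = len(set.union(*sources.values()))
    return summary


summary = availability_summary(ehr=ehr, survey=survey, pm=pm)
print(summary)
```

Because not all participants contribute every data type, the `all_types` count is typically smaller than any single per-type count.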
Anticipated Findings
By reading and running the notebooks in this Tutorial Workspace, you will understand the following:
All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.
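As an illustration of querying by concept_id, the sketch below joins a coded demographic column to a concept lookup table to get readable names. It is an assumption-heavy toy: an in-memory SQLite database stands in for BigQuery, the rows are invented, and the `person` and `concept` tables are assumed to follow OMOP-style naming with only a few columns kept.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here; all rows are made up.
# Table and column names are assumed to follow OMOP-style naming.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT);
    CREATE TABLE person (person_id INTEGER PRIMARY KEY,
                         gender_concept_id INTEGER,
                         year_of_birth INTEGER);
    INSERT INTO concept VALUES (8507, 'MALE'), (8532, 'FEMALE');
    INSERT INTO person VALUES (1, 8532, 1960), (2, 8507, 1975),
                              (3, 8532, 1988), (4, 8532, 1990);
""")

# Join on concept_id to turn the coded value into a readable name,
# then count distinct participants per category.
query = """
    SELECT c.concept_name, COUNT(DISTINCT p.person_id) AS n_participants
    FROM person AS p
    JOIN concept AS c ON p.gender_concept_id = c.concept_id
    GROUP BY c.concept_name
    ORDER BY n_participants DESC
"""
gender_counts = conn.execute(query).fetchall()
print(gender_counts)
conn.close()
```

The same join pattern applies to other coded columns: filter or group on the concept_id, and look up the name only for display.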
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Duplicate of Demo - Medication Sequencing
Scientific Questions Being Studied
1- What are the main prescribed medication sequences that participants with type 2 diabetes and depression took over three years of treatment?
For this question, we extracted the anti-diabetic and antidepressant medications used to treat participants who have T2D and depression codes. We retrieved medications prescribed after the first diagnosis code for each disease. We represented the medications using their ATC 4th level.
2- What are the most common first anti-diabetic and antidepressant medications prescribed for All of Us participants? We extracted the first medications prescribed to treat T2D and depression, and identified the most common first medication as the one prescribed to the highest number of participants.
3- Did the percentages of participants who were prescribed the most common first medication, who were treated with only one medication, or who were treated with only the most common medication change between 2000 and 2018?
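The first two questions can be sketched in a few lines. This is a simplified illustration on invented records, not the project's code: it assumes prescriptions on or after the first diagnosis date count, that the window is 3 x 365 days, and that a sequence lists each medication class once in order of first appearance. The function name and the sample ATC 4th-level codes are chosen only for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Toy records: (person_id, ATC 4th-level class, prescription date).
prescriptions = [
    (1, "A10BA", date(2010, 3, 1)),
    (1, "A10BB", date(2011, 6, 1)),
    (1, "A10BA", date(2012, 1, 15)),  # repeat of a class already in the sequence
    (1, "A10AE", date(2015, 1, 1)),   # outside the three-year window
    (2, "A10BA", date(2009, 1, 1)),   # before the first diagnosis
    (2, "A10BB", date(2012, 5, 1)),
    (3, "A10BA", date(2014, 2, 1)),
]
first_diagnosis = {1: date(2010, 1, 1), 2: date(2012, 1, 1), 3: date(2014, 1, 1)}


def medication_sequences(prescriptions, first_diagnosis, window_days=3 * 365):
    """Ordered, de-duplicated medication classes per participant within the window."""
    sequences = {}
    for person_id, med_class, when in sorted(prescriptions, key=lambda r: (r[0], r[2])):
        start = first_diagnosis.get(person_id)
        if start is None:
            continue  # no diagnosis on record
        if not (start <= when <= start + timedelta(days=window_days)):
            continue  # before diagnosis or after the window
        seq = sequences.setdefault(person_id, [])
        if med_class not in seq:
            seq.append(med_class)
    return sequences


sequences = medication_sequences(prescriptions, first_diagnosis)
# Most common first medication: count the first element of each sequence.
first_counts = Counter(seq[0] for seq in sequences.values())
print(sequences)
print(first_counts.most_common(1))
```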
Project Purpose(s)
- Disease Focused Research (type 2 diabetes, depression)
- Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)
Scientific Approaches
In this project, we plan to use the medication sequencing approach developed at Columbia University and the OHDSI network to characterize treatment pathways at scale. We also want to demonstrate an implementation of these medication sequencing algorithms in the All of Us research dataset, showing how the various sources of data contained within the program can be used for this purpose. We will perform separate medication sequence analyses for three different common, complex diseases: type 2 diabetes, depression
1- Data manipulation:
Using Python and BigQuery to:
A- Retrieve medications and their classes
B- Create the medication sequences
2- Visualization:
A- Creating sunburst charts to visualize the sequences
B- Plotting the percentages of participants prescribed the most common first medication, and of participants treated with one medication, over the three years
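A sunburst chart of medication sequences needs one node per sequence prefix, with the number of participants who followed that prefix. The sketch below shows one way to build such rows from per-participant sequences; the input data, the function name `sunburst_rows`, and the `id`/`parent`/`label`/`value` row layout are assumptions made for illustration, and the plotting call itself is left out.

```python
from collections import Counter

# Hypothetical per-participant sequences of medication classes.
sequences = [
    ["A10BA", "A10BB"],
    ["A10BA", "A10BB"],
    ["A10BA"],
    ["A10BB", "A10BA"],
]


def sunburst_rows(sequences):
    """One row per sequence prefix, counting participants who followed it."""
    prefix_counts = Counter()
    for seq in sequences:
        for depth in range(1, len(seq) + 1):
            prefix_counts[tuple(seq[:depth])] += 1
    rows = []
    for prefix, count in sorted(prefix_counts.items()):
        rows.append({
            "id": " > ".join(prefix),
            "parent": " > ".join(prefix[:-1]),  # "" marks the innermost ring
            "label": prefix[-1],
            "value": count,
        })
    return rows


rows = sunburst_rows(sequences)
for row in rows:
    print(row)
```

Each ring of the chart then corresponds to a position in the sequence: the innermost ring is the first medication, the next ring the second, and so on.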
Anticipated Findings
For this study, we anticipate demonstrating the validity of the data by showing expected treatment patterns despite gathering data from over 30 individual EHR sites. Specifically, we expect to find:
1- Variation in the medication sequences prescribed to treat All of Us participants who had type 2 diabetes and depression.
2- The most common medication used as first-line treatment for participants with type 2 diabetes and depression diagnoses.
3- A trend or change over the study period in prescribing of the most common first medication.
4- Trend over time for the percentage of participants
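The trend in finding 3 can be computed as a per-year share. The following is a rough sketch on invented data, assuming each participant is assigned to the calendar year of their first prescription; the function name and input layout are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (year of first prescription, first medication class), one per participant.
first_prescriptions = [
    (2000, "A10BA"), (2000, "A10BB"),
    (2009, "A10BA"), (2009, "A10BA"), (2009, "A10BB"), (2009, "A10BA"),
    (2018, "A10BA"), (2018, "A10BA"),
]


def yearly_share_of_top_first_medication(first_prescriptions):
    """Return the most common first medication and its percentage share per year."""
    top, _ = Counter(med for _, med in first_prescriptions).most_common(1)[0]
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, med in first_prescriptions:
        totals[year] += 1
        if med == top:
            hits[year] += 1
    shares = {year: 100.0 * hits[year] / totals[year] for year in sorted(totals)}
    return top, shares


top, shares = yearly_share_of_top_first_medication(first_prescriptions)
print(top, shares)
```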
Importantly, the detailed code developed herein is made available within the Researcher Workbench to researchers, so that they may more easily extract medication data and class information using a common medication ontology, an approach useful in many discovery studies.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
MDD research cohort
Scientific Questions Being Studied
What can we do to improve the quality of care?
What factors significantly impact patients’ antidepressant responses?
Can we predict a patient’s response based on other information?
What side effects most likely make patients switch treatments?
Project Purpose(s)
- Disease Focused Research (MDD)
- Educational
Scientific Approaches
Linear models / mixed-effects models
Finite mixture modeling
Latent variable modeling
Multivariate adaptive regression splines
Anticipated Findings
Potential significant predictors of antidepressant response:
The severity of depression: Generally, individuals with more severe depressive symptoms may be more likely to respond to antidepressant treatment. However, it's worth noting that antidepressants can be effective for individuals with mild to moderate depression as well.
Diagnostic subtype: Different subtypes of depression, such as major depressive disorder (MDD), persistent depressive disorder (dysthymia), or bipolar depression, may have varying responses to different classes of antidepressants. Some antidepressants may be more effective for certain subtypes.
Previous treatment response: If an individual has had a positive response to a specific antidepressant in the past, there is a higher likelihood of responding to that same medication again. This information can guide treatment decisions.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier