Hiral Master
Project Personnel, All of Us Program Operational Use
13 active projects
V7 PASC Workspace
Scientific Questions Being Studied
This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.
Project Purpose(s)
- Educational
- Ancestry
- Other Purpose (practice notebook to familiarize with RW)
Scientific Approaches
We will apply algorithms developed by the RECOVER PCORnet Adult Cohort and compare the overlap in cohorts with the set derived through the N3C algorithm.
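The cohort comparison described above can be sketched as a set operation on participant IDs. This is a minimal illustration, not the project's code: the function name `cohort_overlap`, the choice of the Jaccard index as the overlap measure, and the participant IDs are all assumptions made for the example.

```python
def cohort_overlap(cohort_a, cohort_b):
    """Summarize agreement between two cohorts given as iterables of participant IDs."""
    a, b = set(cohort_a), set(cohort_b)
    both = a & b
    union = a | b
    return {
        "only_a": len(a - b),   # flagged by the first algorithm only
        "only_b": len(b - a),   # flagged by the second algorithm only
        "both": len(both),      # flagged by both algorithms
        # Jaccard index: shared members divided by all members of either cohort.
        "jaccard": len(both) / len(union) if union else 0.0,
    }


# Hypothetical participant IDs, for illustration only.
recover_cohort = [101, 102, 103, 104, 105]
n3c_cohort = [103, 104, 105, 106]

summary = cohort_overlap(recover_cohort, n3c_cohort)
print(summary)
```

A Jaccard index near 1 would indicate that the two algorithms select nearly the same participants; the counts of members unique to each cohort show where they disagree.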
Anticipated Findings
We expect to find a high degree of concordance between the RECOVER Adult Cohort algorithm and the N3C algorithm, even though the approaches were developed through different machine learning methods on different source patient data sets.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
- Mark Weiner - Mid-career Tenured Researcher, Cornell University
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Aashri Aggarwal - Undergraduate Student, Cornell University
Ruderfer - Brittain Collaboration
Scientific Questions Being Studied
The demo project aims to explore strategies for leveraging Fitbit data along with genomics, survey, and EHR data on the cloud-based platform in a cost-efficient fashion. These strategies can lay the foundation for multiple research studies that can drive evidence-based care for all. Specifically, the project aims to build a workspace on the Researcher Workbench for developing use cases for digital biomarker development from the Fitbit data and its integration with other AoU data types. One existing challenge is how to ensure that information about multiple streams of health data can be conveyed appropriately to enable fit-for-purpose analyses. Therefore, the intent of the demonstration projects is to understand the challenges faced by users who would like to leverage Fitbit data in tandem with survey, measurement, genomics, and EHR data.
Project Purpose(s)
- Population Health
- Social / Behavioral
- Educational
- Ancestry
Scientific Approaches
Data wrangling strategies to meaningfully combine Fitbit data with EHR and genomics data on the Researcher Workbench. Develop strategy in R and Python (RMarkdown and Jupyter Notebooks), including calculation of summary statistics and data visualizations, for users of varying levels of digital health literacy.
Develop educational materials to acquaint researchers with the benefits and limitations of combining Fitbit, EHR and genomics data. Materials to be developed include peer-reviewed manuscript, articles/blogs, videos, and user guides.
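The data wrangling step described above (combining Fitbit data with EHR data and computing summary statistics) might look roughly like the following. This is a sketch under stated assumptions: the record layouts, the `ehr_condition` flag, and both function names are hypothetical, and the data are invented. It uses only the Python standard library so that it runs without extra packages.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical daily step records: (person_id, date, steps).
fitbit_daily_steps = [
    (1, "2021-01-01", 4000),
    (1, "2021-01-02", 6000),
    (2, "2021-01-01", 12000),
    (2, "2021-01-02", 10000),
    (3, "2021-01-01", 8000),
]

# Hypothetical EHR-derived flag per participant (True = condition recorded).
ehr_condition = {1: True, 2: False, 4: True}


def mean_steps_per_person(records):
    """Collapse daily records into one average daily step count per participant."""
    steps_by_person = defaultdict(list)
    for person_id, _date, steps in records:
        steps_by_person[person_id].append(steps)
    return {pid: mean(values) for pid, values in steps_by_person.items()}


def summarize_by_condition(mean_steps, condition_flags):
    """Inner-join on person_id, then summarize average daily steps per group."""
    groups = defaultdict(list)
    for pid, avg in mean_steps.items():
        if pid in condition_flags:  # keep only participants present in both sources
            groups[condition_flags[pid]].append(avg)
    return {
        flag: {"n": len(values), "mean": mean(values), "median": median(values)}
        for flag, values in groups.items()
    }


per_person = mean_steps_per_person(fitbit_daily_steps)
summary = summarize_by_condition(per_person, ehr_condition)
print(summary)
```

Participants who appear in only one source (here, IDs 3 and 4) drop out of the join, which is one of the limitations of combining data streams that users would need to be made aware of.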
Anticipated Findings
The All of Us Research Program (AoURP) currently provides multiple streams of health data (i.e., genomics, surveys, electronic health records (EHR), and Fitbit) to registered users on the Researcher Workbench, a cloud-based platform. This in turn provides a unique opportunity to answer clinically relevant questions. Wearable devices enable continuous monitoring of physiological signals, which may be used for discovery, diagnostic, and prognostic purposes. The Fitbit study within the AoURP includes Fitbit data from approximately 12,000 patients. The information available from the Fitbits (such as activity, heart rate, sleep patterns, and device metadata) can be used to develop new digital biomarkers by exploring their correlations with clinical measurements, genetic risk scores, and survey information such as social determinants of health.
Demographic Categories of Interest
- Race / Ethnicity
- Geography
- Access to Care
- Education Level
- Income Level
Data Set Used
Controlled Tier
Research Team
Owner:
- Douglas Ruderfer - Mid-career Tenured Researcher, Vanderbilt University Medical Center
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Lide Han - Project Personnel, Vanderbilt University Medical Center
- Jeffrey Annis - Other, Vanderbilt University Medical Center
Collaborators:
- Brandon Lowery - Other, Vanderbilt University Medical Center
Practice Notebook to Explore AoU dataset
Scientific Questions Being Studied
This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.
Project Purpose(s)
- Educational
- Other Purpose (practice notebook to familiarize with RW)
Scientific Approaches
We will apply algorithms developed by the RECOVER PCORnet Adult Cohort and compare the overlap in cohorts with the set derived through the N3C algorithm.
Anticipated Findings
We expect to find a high degree of concordance between the RECOVER Adult Cohort algorithm and the N3C algorithm, even though the approaches were developed through different machine learning methods on different source patient data sets.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
- Mark Weiner - Mid-career Tenured Researcher, Cornell University
- Hiral Master - Project Personnel, All of Us Program Operational Use
Collaborators:
- Aashri Aggarwal - Undergraduate Student, Cornell University
Type 2 DM and Wearables Data RTDv6
Scientific Questions Being Studied
Our primary goal is to understand the interaction between activity levels and sleep quality with the development and progression of human disease with a primary focus on type 2 diabetes mellitus. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses guiding clinical and research interventions focused on activity and sleep to reduce morbidity and mortality in patients seeking care.
Project Purpose(s)
- Disease Focused Research (type 2 diabetes mellitus)
- Population Health
- Social / Behavioral
Scientific Approaches
We will examine the relationship between daily activity (steps, activity intensity) over time and the prevalence and progression of coded human diseases with a primary focus on Type 2 DM. We will use the Fitbit data, EHR-curated diagnoses, laboratory values, quality of life survey results, and clinical outcomes (hospitalizations/mortality).
Anticipated Findings
We expect to find that lower levels of activity are associated with a higher prevalence and more rapid progression of Type 2 DM and other diseases. These data will provide the rationale to link wearables data with electronic health records nationwide as a window into behavioral activity choice as a modifiable risk factor for chronic diseases. We may find substantial variation in activity and disease prevalence/severity by socioeconomic status, which would motivate studies/interventions to reduce these health disparities.
Demographic Categories of Interest
- Race / Ethnicity
- Geography
- Access to Care
- Education Level
- Income Level
Data Set Used
Registered Tier
Research Team
Owner:
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Aymone Kouame - Other, All of Us Program Operational Use
- Jeffrey Annis - Other, Vanderbilt University Medical Center
AOU_Recover_Long_Covid_v6
Scientific Questions Being Studied
The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program. The N3C, All of Us, PCORnet, and RECOVER teams collaborated on this work to strengthen overall PASC/Long COVID efforts.
Project Purpose(s)
- Disease Focused Research (Long COVID)
Scientific Approaches
To achieve this objective, data science workflows were used to apply ML algorithms on the Researcher Workbench. This effort expanded the number of participants used to evaluate the ML models that identify risk of PASC/Long COVID; it also validated one team's work and provided insight to other teams. These models were implemented within the All of Us Controlled Tier data (C2022Q2R2), which was last refreshed on June 22, 2022. We intend to provide a step-by-step guide for implementing N3C's ML model for identifying the PASC/Long COVID phenotype in the All of Us dataset. The workspace also evaluates demographic characteristics of participants identified as possibly having PASC/Long COVID and provides additional details on model performance, such as the area under the receiver operating characteristic curve and the confusion matrix.
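The two performance measures named above can be illustrated with a short, self-contained sketch. It is not the workspace's code: the function names, the 0.5 decision threshold, and the labels and scores are assumptions chosen for the example. The AUC is computed from its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
def confusion_matrix(y_true, y_score, threshold=0.5):
    """Count true/false positives and negatives at a given score threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            counts["tp"] += 1
        elif predicted_positive and label == 0:
            counts["fp"] += 1
        elif not predicted_positive and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def roc_auc(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for label, s in zip(y_true, y_score) if label == 1]
    negatives = [s for label, s in zip(y_true, y_score) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = PASC/Long COVID) and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

matrix = confusion_matrix(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(matrix, round(auc, 3))
```

The pairwise AUC computation is quadratic in the number of cases, which is fine for a demonstration; for large cohorts a library implementation would be the practical choice.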
Anticipated Findings
We intend to provide a step-by-step guide for implementing N3C's ML model for identifying the PASC/Long COVID phenotype in the All of Us dataset. The workspace includes the findings and the code used to generate the demographic characteristics of participants identified as possibly having PASC/Long COVID, and it provides additional details on model performance, such as the area under the receiver operating characteristic curve and the confusion matrix.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- WeiQi Wei - Other, All of Us Program Operational Use
- Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
- Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
- Mark Weiner - Mid-career Tenured Researcher, Cornell University
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
- David Mohs - Other, All of Us Program Operational Use
- Christopher Lord - Project Personnel, All of Us Program Operational Use
- Chenchal Subraveti - Project Personnel, All of Us Program Operational Use
Collaborators:
- Jun Qian - Other, All of Us Program Operational Use
- Chris Lunt - Other, All of Us Program Operational Use
Duplicate of Skills Assessment Training Notebooks For Users
Scientific Questions Being Studied
This workspace contains multiple notebooks that assess users' understanding of the workbench and OMOP. These notebooks are meant to help users check their knowledge not only on Python, R, and SQL, but also on the general data structure and data model used by the All of Us program.
Project Purpose(s)
- Educational
Scientific Approaches
There is no scientific approach used in this workspace because it is meant for educational purposes only. We will cover all aspects of OMOP and hence will use most datasets available in the workbench.
Anticipated Findings
We do not anticipate any findings. Instead, we are educating people on the use of the workbench and on OMOP, the common data model used by the program.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Kammarauche Aneni - Other, Yale University
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Chenchal Subraveti - Project Personnel, All of Us Program Operational Use
- Aymone Kouame - Other, All of Us Program Operational Use
Collaborators:
- Michael Lyons - Project Personnel, All of Us Program Operational Use
- Hunter Hollis - Project Personnel, All of Us Program Operational Use
- Christopher Lord - Project Personnel, All of Us Program Operational Use
Wearables and The Human Phenome (Published Work)
Scientific Questions Being Studied
Our primary goal is to understand the relation between activity levels and the development and progression of human disease. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses guiding clinical and research interventions focused on activity to reduce morbidity and mortality in patients seeking care.
This workspace is a replication workspace for the Wearables and The Human Phenome project. We replicated the workspace to provide a clean and reduced version of the code that was used to generate the findings, which were published in Nature Medicine (https://www.nature.com/articles/s41591-022-02012-w).
Project Purpose(s)
- Population Health
- Social / Behavioral
Scientific Approaches
We will examine the relationship between daily activity (steps, activity intensity) over time and the prevalence and progression of coded human diseases. We will use the Fitbit data, EHR-curated diagnoses, laboratory values, and survey results.
Anticipated Findings
We expect to find that lower levels of activity are associated with a higher prevalence and more rapid progression of chronic diseases. These data will provide the rationale to link wearables data with electronic health records nationwide as a window into behavioral activity choice as a modifiable risk factor for chronic diseases. We may find substantial variation in activity and disease prevalence/severity by socioeconomic status, which would motivate studies/interventions to reduce these health disparities.
Demographic Categories of Interest
- Race / Ethnicity
- Geography
- Access to Care
- Education Level
- Income Level
Data Set Used
Registered Tier
Research Team
Owner:
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Christopher Lord - Project Personnel, All of Us Program Operational Use
- Chenchal Subraveti - Project Personnel, All of Us Program Operational Use
- Jeffrey Annis - Other, Vanderbilt University Medical Center
Collaborators:
- Jun Qian - Other, All of Us Program Operational Use
Wearables and The Human Phenome (v3)
Scientific Questions Being Studied
This workspace replicates the Wearables and The Human Phenome workspace. We would like to create a clean and reduced version of our prior workspace to hold the public-facing code that Nature Medicine requested from us.
Our primary goal is to understand the interaction between activity levels and sleep quality with the development and progression of human disease. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses guiding clinical and research interventions focused on activity and sleep to reduce morbidity and mortality in patients seeking care.
Project Purpose(s)
- Population Health
- Social / Behavioral
Scientific Approaches
We will examine the relationship between daily activity (steps, activity intensity) over time and the prevalence and progression of coded human diseases. We will use the Fitbit data, EHR-curated diagnoses, laboratory values, quality of life survey results, and clinical outcomes (hospitalizations/mortality).
Anticipated Findings
We expect to find that lower levels of activity are associated with a higher prevalence and more rapid progression of chronic diseases. These data will provide the rationale to link wearables data with electronic health records nationwide as a window into behavioral activity choice as a modifiable risk factor for chronic diseases. We may find substantial variation in activity and disease prevalence/severity by socioeconomic status, which would motivate studies/interventions to reduce these health disparities.
Demographic Categories of Interest
- Race / Ethnicity
- Geography
- Access to Care
- Education Level
- Income Level
Data Set Used
Registered Tier
Research Team
Owner:
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Jeffrey Annis - Other, Vanderbilt University Medical Center
Work with All of Us Physical Measurements Data - Class Teaching
Scientific Questions Being Studied
How to navigate around physical measurements?
Project Purpose(s)
- Other Purpose (Testing and operations purposes)
Scientific Approaches
N/A
Anticipated Findings
N/A
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Hunter Hollis - Project Personnel, All of Us Program Operational Use
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Aymone Kouame - Other, All of Us Program Operational Use
Wearables Data and the Human Phenome
Scientific Questions Being Studied
Our primary goal is to understand the interaction between activity levels and sleep quality with the development and progression of human disease. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses guiding clinical and research interventions focused on activity and sleep to reduce morbidity and mortality in patients seeking care.
Project Purpose(s)
- Population Health
- Social / Behavioral
Scientific Approaches
We will examine the relationship between daily activity (steps, activity intensity) over time and the prevalence and progression of coded human diseases. We will use the Fitbit data, EHR-curated diagnoses, laboratory values, quality of life survey results, and clinical outcomes (hospitalizations/mortality).
Anticipated Findings
We expect to find that lower levels of activity are associated with a higher prevalence and more rapid progression of chronic diseases. These data will provide the rationale to link wearables data with electronic health records nationwide as a window into behavioral activity choice as a modifiable risk factor for chronic diseases. We may find substantial variation in activity and disease prevalence/severity by socioeconomic status, which would motivate studies/interventions to reduce these health disparities.
Demographic Categories of Interest
- Race / Ethnicity
- Geography
- Access to Care
- Education Level
- Income Level
Data Set Used
Registered Tier
Research Team
Owner:
- Shi Huang - Other, Vanderbilt University Medical Center
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Evan Brittain - Mid-career Tenured Researcher, Vanderbilt University Medical Center
- Jeffrey Annis - Other, Vanderbilt University Medical Center
Implementing Recover Algorithm on AOU Data
Scientific Questions Being Studied
Identify potential long COVID patients among three groups in the database: All COVID-19 patients, patients hospitalized with COVID-19, and patients who had COVID-19 but were not hospitalized. The models proved to be accurate, as people identified as at risk for long COVID were similar to patients seen at long COVID clinics.
Project Purpose(s)
- Disease Focused Research (Long COVID)
Scientific Approaches
An XGBoost machine learning model is developed to identify potential patients with long COVID.
The base population is defined as any non-deceased adult patient (age ≥18 years) with either an International Classification of Diseases-10-Clinical Modification COVID-19 diagnosis code (U07.1) from an inpatient or emergency visit, or a positive SARS-CoV-2 PCR or antigen test, and for whom at least 90 days have passed since the COVID-19 index date.
The model examines demographics, health-care utilization, diagnoses, and medications for adults with COVID-19.
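The base population rule above can be written as a small eligibility check. This is a sketch only, with several assumptions that the description does not state: the `Patient` fields are hypothetical, the index date is taken to be the earliest qualifying diagnosis or positive test, and age is evaluated on the index date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    """Hypothetical, simplified patient record for illustration."""
    birth_date: date
    deceased: bool
    # Date of a U07.1 diagnosis from an inpatient or emergency visit, if any.
    u071_inpatient_or_ed_date: Optional[date] = None
    # Date of a positive SARS-CoV-2 PCR or antigen test, if any.
    positive_test_date: Optional[date] = None


def covid_index_date(patient: Patient) -> Optional[date]:
    """Assumption: the index date is the earliest qualifying event."""
    events = [
        d
        for d in (patient.u071_inpatient_or_ed_date, patient.positive_test_date)
        if d is not None
    ]
    return min(events) if events else None


def age_on(birth_date: date, on_date: date) -> int:
    """Age in completed years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def in_base_population(patient: Patient, as_of: date) -> bool:
    """Non-deceased adult with a qualifying COVID-19 event at least 90 days before as_of."""
    index = covid_index_date(patient)
    if index is None or patient.deceased:
        return False
    # Assumption: adulthood is assessed on the index date.
    return age_on(patient.birth_date, index) >= 18 and (as_of - index).days >= 90


# Hypothetical evaluation date and patient, for illustration only.
as_of = date(2022, 6, 22)
example = Patient(date(1980, 5, 1), False, positive_test_date=date(2021, 1, 15))
print(in_base_population(example, as_of))
```

Keeping the index date logic in its own function makes it easy to swap in a different definition if the published algorithm specifies one.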
Anticipated Findings
Identify, with high accuracy, patients who potentially have long COVID, and find the features most important to the model.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- WeiQi Wei - Other, All of Us Program Operational Use
- Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
- Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
Collaborators:
- Chris Lunt - Other, All of Us Program Operational Use
RECOVER+AoU
Scientific Questions Being Studied
The goal of this initial cross-platform testing effort is to expand the analytical capability of available data sources that have collected data on SARS-CoV-2. As we gather data across the US, we can use independent data sources to better understand PASC in our population and identify possible interventions. As a first step, we hope to leverage available RECOVER data tools and apply them within the All of Us Researcher Workbench to assess cross-platform interoperability and analytical equivalence. This would provide a path to engage our research community and guide research toward a better understanding of PASC.
Project Purpose(s)
- Population Health
- Methods Development
- Control Set
- Other Purpose (Testing PASC ML Algorithm from N3C-RECOVER in AoU Platform)
Scientific Approaches
Bring existing data query code and data analytics code from the RECOVER researcher team into the All of Us Researcher Workbench. Use “equivalent” code sets to explore and expand our understanding of PASC and its effects on the US population. Share reproducible findings through programming “notebook” and analysis of standardized datasets (OMOP).
Anticipated Findings
This research activity will be developed in conjunction with an awareness campaign of the collaborative efforts undertaken by both RECOVER and AoU. We intend to highlight the available datasets with SARS-CoV-2 data, as well as the cloud-based researcher workspaces (RECOVER, AoU). With the awareness campaign and cross-platform testing, we intend to create an on-ramp for experienced and young researchers within two large and diverse datasets.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- WeiQi Wei - Other, All of Us Program Operational Use
- Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
- Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
- Hiral Master - Project Personnel, All of Us Program Operational Use
- Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
- Chris Lunt - Other, All of Us Program Operational Use
Test - Work With Wearable Device Data
Scientific Questions Being Studied
Testing and operational use
Project Purpose(s)
- Other Purpose (Testing and operational use)
Scientific Approaches
This Tutorial Workspace contains one Jupyter Notebook written in Python. The notebook contains information on how to extract and work with the current set of All of Us Fitbit data. By reading and running the notebook in this Tutorial Workspace, researchers will learn how to query information about steps, heart rate, and daily activity summary.
Anticipated Findings
By reading and running the notebook in this Tutorial Workspace, researchers will understand how to work with Fitbit CDR data from the workbench. They will learn how to query information about steps, heart rate, and daily activity summary.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Hiral Master - Project Personnel, All of Us Program Operational Use
Collaborators:
- Francis Ratsimbazafy - Other, All of Us Program Operational Use