Research Projects Directory

17,157 active projects

This information was updated 4/2/2025

The Research Projects Directory includes information about all projects that currently exist in the Researcher Workbench to help provide transparency about how the Workbench is being used. Each project specifies whether Registered Tier or Controlled Tier data are used.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

156 projects have 'nutrition' in the scientific questions being studied description

Nutrition x Chronic Condition Study

Scientific Questions Being Studied

We would like to know:
1. What are incidences of nutritional disorders and chronic conditions in the study population?
2. What is the nature of the relationships between nutritional disorders, chronic conditions, and population health factors? Are any of these relationships statistically significant?
3. How do age or sex play a role in prevalence of nutrition disorders and chronic conditions?

Project Purpose(s)

  • Population Health

Scientific Approaches

Our team is interested in utilizing data that capture various types of nutritional disorders, such as Nutritional Deficiency Disorder, Vitamin Disease, Disorder of Hyperalimentation, and Failure to Thrive, as well as a multitude of chronic conditions. We will also capture data about participants' social determinants of health, lifestyle, and overall health, and about how personal and family history affects nutritional disorders. We plan to access R through the Jupyter Notebook to conduct descriptive and statistical analyses of the data. Potential statistical approaches may include chi-square analysis, bivariate analysis, and regression modeling.
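
As an illustration of the chi-square analysis mentioned above, here is a minimal sketch of the Pearson chi-square statistic for a 2x2 table. It is written in Python with the standard library only (the project itself plans to use R), and the counts and the function name `chi_square_2x2` are hypothetical, not taken from the project.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = nutritional disorder (yes, no),
# columns = chronic condition (yes, no).
counts = [[30, 20], [20, 30]]
statistic = chi_square_2x2(counts)
print(statistic)  # 4.0
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to p < 0.05, so these made-up counts would suggest an association.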

Anticipated Findings

We believe that our study will help elucidate relationships between social determinants of health and other factors related to nutritional disorders and chronic conditions. We also anticipate that lower health quality and inadequate access to healthcare and nutrition may contribute to more cases of nutritional disorders combined with chronic conditions. Finally, we expect that older age groups may potentially see higher rates of both nutritional disorders and chronic conditions, while younger age groups may have more cases of nutritional disorders compared to chronic conditions. We believe that our study will help advance understanding of the relationships between these factors and health outcomes, and identify potential education/outreach needs in dietary research.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

v7 of Prevalence of carriers of inborn errors of metabolism

Scientific Questions Being Studied

Inborn errors of metabolism (IEM) are rare diseases that are often recessive. However, several large cohort studies have revealed metabolic individuality in healthy populations. In other words, one's unique genetic makeup affects the way one's body performs metabolism and one's disposition to metabolism-related diseases. Understanding the prevalence and characteristics of metabolic outliers may help us devise precision nutrition strategies that improve the health of these individuals. The specific questions we would like to investigate in this area using the All of Us data are:
1. The frequency of pathogenic variants in the population
2. The frequency of GWAS metabolism-related variants
3. The demographic characteristics of these predicted metabolic outlier individuals

Project Purpose(s)

  • Disease Focused Research (inherited metabolic disorder)
  • Ancestry

Scientific Approaches

We will first perform some exploratory analysis to understand the data structure of the All of Us data. Then, we will prepare a list of pathogenic variants on IEM genes, as well as a list of GWAS variants on IEM genes that associate with related metabolic traits from existing databases and publications. We will evaluate the frequency of these variants in the All of Us data. The demographic characteristics of the variant carriers will be summarized. As many of these variants may be rare but with large effect sizes, we will aggregate data to the gene level instead of the variant level.
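
The gene-level aggregation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the input shapes, the identifiers, and the function name `gene_level_carrier_frequency` are hypothetical, and real All of Us genomic data would need to be read from the Workbench rather than from in-memory lists.

```python
from collections import defaultdict


def gene_level_carrier_frequency(carrier_calls, variant_to_gene, n_participants):
    """Fraction of participants carrying at least one listed variant per gene.

    carrier_calls: iterable of (person_id, variant_id) pairs.
    variant_to_gene: dict mapping curated variant IDs to gene symbols.
    """
    carriers_by_gene = defaultdict(set)
    for person_id, variant_id in carrier_calls:
        gene = variant_to_gene.get(variant_id)
        if gene is not None:  # skip variants that are not on the curated list
            carriers_by_gene[gene].add(person_id)
    # A person with several variants in one gene is counted once for that gene.
    return {gene: len(people) / n_participants
            for gene, people in carriers_by_gene.items()}


# Hypothetical toy data.
variant_to_gene = {"v1": "GENE_A", "v2": "GENE_A", "v3": "GENE_B"}
calls = [(1, "v1"), (1, "v2"), (2, "v2"), (3, "v3"), (4, "v9")]
frequencies = gene_level_carrier_frequency(calls, variant_to_gene, n_participants=10)
print(frequencies)
```

Collecting carriers in a set per gene is what prevents double counting when one participant carries more than one rare variant in the same gene.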

Anticipated Findings

We expect this work will give us an estimate of metabolic outlier population size and characteristics for future studies that try to implement precision nutrition.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Ling Cai - Project Personnel, University of Texas Southwestern Medical Center

NAFLD

Scientific Questions Being Studied

What is the association of NAFLD with lifestyle, especially nutrition?

Project Purpose(s)

  • Disease Focused Research (NAFLD)
  • Population Health
  • Educational

Scientific Approaches

Explore nutritional patterns in adults and the incidence of NAFLD, and examine outcomes over time.
Correlate NAFLD incidence and prognosis among vegetarians and non-vegetarians.

Anticipated Findings

It is anticipated that vegetarians will have better outcomes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Micronutrient Levels in Mothers with hyperemesis gravidarum

Scientific Questions Being Studied

During pregnancy, the maternal intake of micronutrients supports fetal development; thus, it is of great importance for those who are pregnant to maintain adequate intake of certain micronutrients (both through diet and prenatal supplements). However, this is nearly impossible for some women who suffer from hyperemesis gravidarum (HG), a condition marked by excessive vomiting and nausea during pregnancy. HG affects anywhere from 0.3% to 10.8% of pregnant women, and the true figure may be higher because HG is often misdiagnosed as simply “morning sickness”. Undernutrition as a result of HG may impact fetal development, and HG has been shown to be associated with later neurodevelopmental disorders in children born to mothers with HG. For this project, we aim to use the All of Us cohort to determine socioeconomic and environmental factors that may exacerbate HG outcomes as well as discover potential genes involved beyond the known genes, GDF15 and IGFBP7.

Project Purpose(s)

  • Disease Focused Research (Hyperemesis gravidarum)
  • Population Health
  • Ancestry

Scientific Approaches

We aim to use the All of Us cohort to determine socioeconomic and environmental factors that may exacerbate HG outcomes as well as discover potential genes involved beyond the known genes, GDF15 and IGFBP7. Additionally, we aim to discover any co-occurring conditions that may exist with HG.

Anticipated Findings

This project would inform how maternal diet impacts metabolic and neurological health outcomes of children. It will also help provide evidence that since this condition may have long lasting impacts on both mother and child, stakeholders should take this condition more seriously.

Demographic Categories of Interest

  • Geography
  • Access to Care
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Roshonda Jones - Early Career Tenure-track Researcher, North Carolina A&T State University

Collaborators:

  • Jasmyn Grant - Undergraduate Student, North Carolina A&T State University

Periodontitis and cause-specific mortality

Scientific Questions Being Studied

Periodontitis, a chronic inflammatory disease of tooth-supporting tissues, affects approximately 50% of adults, with 10% experiencing severe manifestations. The condition increases risk of tooth loss, edentulism, and masticatory dysfunction, compromising nutrition and quality of life. Periodontitis has been associated with systemic comorbidities including diabetes, cardiovascular diseases (CVD), and various malignancies. This study investigates the associations between incident and persistent periodontitis and both all-cause and cause-specific mortality.

Project Purpose(s)

  • Disease Focused Research (periodontal disease)

Scientific Approaches

Using All of Us cohort data, we examine the prospective association between periodontitis and mortality.

Anticipated Findings

The findings strengthen the evidence supporting periodontitis as a risk factor for mortality by employing robust epidemiological designs and advanced statistical methods to address biases present in prior research. This contributes to a more accurate understanding of the association and its implications for public health.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Periodontitis and Mortality

Scientific Questions Being Studied

Periodontitis, a chronic inflammatory disease of tooth-supporting tissues, affects approximately 50% of adults, with 10% experiencing severe manifestations. The condition increases risk of tooth loss, edentulism, and masticatory dysfunction, compromising nutrition and quality of life. Periodontitis has been associated with systemic comorbidities including diabetes, cardiovascular diseases (CVD), and various malignancies. This study investigates the associations between incident and persistent periodontitis and both all-cause and cause-specific mortality.

Project Purpose(s)

  • Disease Focused Research (periodontal disease)

Scientific Approaches

Using All of Us cohort data, we examine the prospective association between periodontitis and mortality.

Anticipated Findings

The findings strengthen the evidence supporting periodontitis as a risk factor for mortality by employing robust epidemiological designs and advanced statistical methods to address biases present in prior research. This contributes to a more accurate understanding of the association and its implications for public health.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of Demo - Hypertension Prevalence

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient (age>18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted the detailed dates of each participant’s SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
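
The age standardization described above can be illustrated with a short sketch of direct standardization over the three age groups. The counts and the standard-population weights below are hypothetical placeholders, not All of Us or U.S. Census figures, and the function name `age_standardized_prevalence` is an assumption introduced for illustration.

```python
def age_standardized_prevalence(stratum_counts, standard_weights):
    """Direct standardization: weight each stratum's prevalence by the
    share of that stratum in a standard population."""
    total_weight = sum(standard_weights.values())
    rate = 0.0
    for group, (cases, population) in stratum_counts.items():
        stratum_prevalence = cases / population
        rate += stratum_prevalence * standard_weights[group] / total_weight
    return rate


# Hypothetical (cases, participants) per age group.
counts = {"18-39": (10, 100), "40-59": (30, 100), ">=60": (50, 100)}
# Hypothetical standard-population shares for the same groups.
weights = {"18-39": 0.40, "40-59": 0.35, ">=60": 0.25}

standardized = age_standardized_prevalence(counts, weights)
crude = sum(c for c, _ in counts.values()) / sum(n for _, n in counts.values())
print(round(standardized, 4), round(crude, 4))  # 0.27 0.3
```

In this toy example the standardized value is lower than the crude one because the standard population is assumed to be younger than the sample.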

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that of published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Nutrition&Cardiovascular

Scientific Questions Being Studied

This study aims to investigate the impact of nutritional and dietary factors on the risk of cardiovascular disease (CVD) and explore potential genetic susceptibility factors.

Cardiovascular disease (CVD) is a leading cause of death and morbidity worldwide. In recent years, the role of dietary and nutritional factors in the development and progression of CVD has garnered increasing attention. However, current research findings on the associations between specific dietary components, dietary patterns, and CVD risk remain inconsistent, and there is a lack of in-depth exploration into potential genetic susceptibility factors.

The findings of this study will provide important references for developing evidence-based dietary guidelines and CVD prevention strategies. They will also offer a basis for identifying high-risk populations and formulating targeted nutritional interventions, which will help reduce the incidence and mortality of CVD and alleviate the disease burden.

Project Purpose(s)

  • Disease Focused Research (cardiovascular disease)
  • Control Set

Scientific Approaches

1. Datasets used:
(1) Cardiovascular disease data in the Conditions section of Electronic Health Records.
(2) Nutritional and dietary factors measured in body fluids (saturated fatty acids, unsaturated fatty acids, dietary fiber, sodium, potassium, etc.) from the Labs & Measurements section of Electronic Health Records.
(3) SNV and WGS datasets in Genomics.
(4) Body and body shape measurements from Measurements & Wearables.
(5) Results of the Lifestyle and Overall Health questionnaires in the survey data.

2. Methods adopted:
(1) Epidemiological methods: population-based cohort study.
(2) Statistical analysis methods: multivariate regression analysis, gene-environment interaction analysis, and mediation analysis.
(3) Bioinformatics analysis methods: pathway analysis, network analysis, and Mendelian randomization.

3. Tools: R (version 4.4.2) will be used for analysis.
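
Of the methods listed above, Mendelian randomization lends itself to a compact worked example. The sketch below shows the fixed-effect inverse-variance weighted (IVW) estimator, one common way to combine per-variant effects; the project does not say which estimator it will use, so this choice is an assumption. It is written in Python for illustration only (the project names R), and all summary statistics are made up.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted Mendelian randomization estimate.

    Each argument is a list with one entry per genetic variant:
    variant-exposure effect, variant-outcome effect, and the standard
    error of the variant-outcome effect.
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    std_error = (1.0 / denominator) ** 0.5
    return estimate, std_error


# Hypothetical summary statistics for two variants.
estimate, std_error = ivw_estimate(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.05, 0.10],
    se_outcome=[0.01, 0.01],
)
print(round(estimate, 3), round(std_error, 4))
```

Here both variants imply the same ratio of outcome effect to exposure effect, so the pooled estimate equals that ratio.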

Anticipated Findings

1. Expected Outcomes
(1) Identify dietary components and dietary patterns significantly associated with CVD risk.
(2) Discover genetic susceptibility factors influencing individual responses to dietary factors.
(3) Clarify the mechanisms of interaction between dietary factors and genetic susceptibility factors.
(4) Develop risk prediction models.

2. Scientific Contributions
(1) Deepen understanding of the etiology of CVD.
(2) Advance the development of personalized nutrition interventions.
(3) Provide scientific evidence for public health policy formulation.
(4) Promote interdisciplinary collaboration.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Yu Zhou - Graduate Trainee, Indiana University

Multivariate trajectories of treatment response

Scientific Questions Being Studied

Treatment response in chronic diseases is a multifaceted and heterogeneous process that evolves over time and differs per patient. In practice, it is often defined as a binary outcome at a single point in time, which discards valuable temporal information. The goal of this proposed research is to use multivariate temporal data to identify clusters of patients with different therapeutic response trajectories. This work can inform future therapeutic decision-making, trial design, and drug discovery for patients with chronic diseases. This work will also provide a foundation that can be extended to other modalities (metabolome, microbiome, nutrition, wearables) as real-world datasets increase in volume and complexity.

Project Purpose(s)

  • Drug Development
  • Methods Development

Scientific Approaches

By applying unsupervised clustering methods (VaDER, CRLI) to multivariate longitudinal measures (laboratory results, vital signs, physical measurements) from real-world datasets (All of Us), we will identify common response trajectories to chronic disease therapies grouped by mechanisms of action. Leveraging diverse data (health records, participant surveys, genomics) we will perform quantitative and qualitative assessment of associations between identified trajectory clusters and baseline patient characteristics (demographics, comorbidities, medication use, social determinants of health, genetic variants).

Anticipated Findings

We anticipate the discovery of discrete patient clusters based on their variable response to treatment across a number of longitudinal clinical measures. We will characterize our cluster results from two deep learning-based methods (VaDER, CRLI) based on cluster associations with patient characteristics of interest, like genetic variants and comorbidities. As such, we intend to propose novel hypotheses regarding optimal prescribing practices in the context of chronic disease management. Additionally, our work will test the ability of these methods to generalize to real-world EHR data (as opposed to clinical trial or registry data).

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Demo - Hypertension Prevalence

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient (age>18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted the detailed dates of each participant’s SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
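
The case definition described above can be sketched as a small rule. This is an illustration under loudly stated assumptions: the text does not give blood pressure cutoffs or say how measurements 2 and 3 are combined, so the 140/90 mmHg defaults and the use of the mean of the two readings are hypothetical choices, as are the function names.

```python
from datetime import date


def meets_primary_definition(code_dates, n_medications):
    """EHR rule from the text: hypertension codes on at least 2 distinct
    dates and at least one hypertension medication."""
    return len(set(code_dates)) >= 2 and n_medications >= 1


def is_hypertensive(code_dates, n_medications, readings,
                    sys_cutoff=140, dia_cutoff=90):
    """Primary EHR rule, then the blood pressure add-on.

    readings: list of (systolic, diastolic) tuples in measurement order.
    ASSUMPTION: cutoffs and averaging of readings 2 and 3 are illustrative.
    """
    if meets_primary_definition(code_dates, n_medications):
        return True
    if len(readings) < 3:
        return False  # measurements 2 and 3 are not both available
    second, third = readings[1], readings[2]
    mean_sys = (second[0] + third[0]) / 2
    mean_dia = (second[1] + third[1]) / 2
    return mean_sys >= sys_cutoff or mean_dia >= dia_cutoff


by_ehr = is_hypertensive([date(2020, 1, 1), date(2020, 6, 1)], 1, [])
same_day_codes = is_hypertensive([date(2020, 1, 1), date(2020, 1, 1)], 1, [])
by_readings = is_hypertensive([], 0, [(120, 80), (150, 85), (146, 88)])
first_reading_only = is_hypertensive([], 0, [(160, 100), (120, 80), (122, 78)])
print(by_ehr, same_day_codes, by_readings, first_reading_only)
```

Deduplicating the code dates with a set is what enforces the "2 distinct dates" requirement, so two codes recorded on the same day do not qualify.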

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that of published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Exploring Coaching and Intervention datasets

Scientific Questions Being Studied

Our research aims to explore what coaching/intervention techniques are most effective in helping patients make successful changes to improve their sleep, nutritional and physical health.

Project Purpose(s)

  • Social / Behavioral
  • Methods Development

Scientific Approaches

We intend to use datasets that include patient background data (such as demographics and sleep-, fitness-, or nutrition-related medical or behavioral data), the coaching, therapy, or interventions patients received to mitigate their problems or reach the goals they set, their activity in the domains of sleep, nutrition, and fitness between two intervention sessions, and, finally, whether they solved their problem or reached their goal.

Anticipated Findings

We intend to find out what kind of intervention techniques lead to what follow-up activities in patients, and whether these activities lead to successful health outcomes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Pressure Sores risk prediction - IX machine learning

Scientific Questions Being Studied

Pressure ulcers (bed sores) are a serious and costly problem in hospitals. They cause significant pain and suffering for patients, increase the length of hospital stays, and lead to higher healthcare costs. Early identification allows for proactive interventions like specialised mattresses, frequent turning, and improved nutrition, significantly reducing the risk of these severe complications. However, predicting which patients are at risk of developing a pressure ulcer is difficult. Whilst many factors have been identified as increasing an individual's risk, including immobility, age and malnutrition, it remains unclear how these factors interact to determine a patient's overall risk. Furthermore, existing risk scores are often static, and do not account for the increase in risk over the course of a patient's hospital stay. Therefore, we aim to explore whether machine learning methods can accurately predict which hospitalised patients are at high risk of developing pressure ulcers.

Project Purpose(s)

  • Disease Focused Research (decubitus ulcer)

Scientific Approaches

This study will employ a machine learning approach to predict the risk of pressure ulcer development in hospitalized patients. We will utilize the de-identified All of Us dataset containing electronic health records (EHRs). First, we shall extract the relevant features from the data, such as age and mobility scores. We will next split the cohort into train, test, and validation series, which will be used later to assess the accuracy and generalisability of our risk model. By applying existing supervised machine learning methods to the train and test cohorts, we will build a risk model to estimate how likely a given patient is to develop a pressure sore.

We will test the accuracy of our risk tool using our validation series and compare the predictions to existing scales, such as the Braden Scale. Finally, we will investigate which features contribute most strongly to the model's predictions to gain insights into the underlying risk factors.
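
The cohort split described above can be sketched as follows. This is a minimal illustration, not the project's actual procedure: the 60/20/20 proportions, the fixed seed, and the function name `three_way_split` are assumptions. The series names follow the text, where the train and test series are used to build the model and the validation series is held out for the final accuracy check.

```python
import random


def three_way_split(patient_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split unique patient IDs into train, test, and validation series.

    Splitting by patient, not by record, keeps all of a patient's
    records in a single series.
    """
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_train = int(len(ids) * fractions[0])
    n_test = int(len(ids) * fractions[1])
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]  # remainder is held out
    return train, test, validation


train, test, validation = three_way_split(range(100))
print(len(train), len(test), len(validation))  # 60 20 20
```

Keeping the validation series untouched until the end is what makes the comparison against an existing scale such as the Braden Scale a fair out-of-sample check.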

Anticipated Findings

We anticipate that the machine learning models will demonstrate higher accuracy in predicting pressure ulcer risk compared to traditional risk assessment tools like the Braden Scale. This is because deep learning models can capture complex interactions between multiple factors that contribute to pressure ulcer development. Furthermore, the models utilizing time-series analysis may be able to capture dynamic changes in patient condition over time. This could enable more accurate and timely risk assessment, allowing for earlier interventions and potentially preventing pressure ulcer development. Finally, the interactions between the various risk factors, and how these change over time, will be elucidated.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of asthma and diabetes

Scientific Questions Being Studied

To study the co-occurrence of diabetes and asthma in the US population to evaluate their pathophysiology.
We will also examine how together they affect the nutrition status of women.

Project Purpose(s)

  • Disease Focused Research (asthma, diabetes)
  • Population Health

Scientific Approaches

This study is going to use data on lab reports, SDOH, and medication use.
We will use descriptive statistics to understand these conditions in the population.

Anticipated Findings

It is expected that those with comorbid asthma and diabetes will have poorer nutrition outcomes, including anemia and iron deficiency.

Demographic Categories of Interest

  • Sex at Birth

Data Set Used

Registered Tier

Research Team

Owner:

  • Sixtus Aguree - Early Career Tenure-track Researcher, Indiana University

Collaborators:

  • Humberto López Castillo - Early Career Tenure-track Researcher, University of Central Florida
  • Anoushka Shinde - Graduate Trainee, Indiana University

asthma and diabetes

Scientific Questions Being Studied

To study the co-occurrence of diabetes and asthma in the US population to evaluate their pathophysiology.
We will also examine how together they affect the nutrition status of women.

Project Purpose(s)

  • Disease Focused Research (asthma, diabetes)
  • Population Health

Scientific Approaches

This study is going to use data on lab reports, SDOH, and medication use.
We will use descriptive statistics to understand these conditions in the population.

Anticipated Findings

We expect that those with comorbid asthma and diabetes will have poorer nutrition outcomes, including anemia and iron deficiency.

Demographic Categories of Interest

  • Sex at Birth

Data Set Used

Registered Tier

Research Team

Owner:

  • Sixtus Aguree - Early Career Tenure-track Researcher, Indiana University

Collaborators:

  • Humberto López Castillo - Early Career Tenure-track Researcher, University of Central Florida
  • Anoushka Shinde - Graduate Trainee, Indiana University

CCDGS_WebToolsQueryDemo

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess…

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our searches as SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create datasets for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables, 2) how to merge demographic, clinical, and genetic variant data efficiently, and 3) how to generate basic descriptive statistics for a given dataset to evaluate the validity of the cohort.

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical end point. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be: blood pressure (hypertension project); BMI, weight, height, and related diseases (malnutrition); and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate the sample size, and to use R code to create summary reports on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.
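
The time-window selection described above might look roughly like the following sketch, which runs a SQL query against an in-memory SQLite database from Python. The tables endpoint and observation, their columns, and the 90-day window are all hypothetical choices made for illustration; they do not mirror the Workbench schema, and the project itself names SQL and R rather than Python.

```python
import sqlite3

# Hypothetical tables: 'endpoint' holds one clinical end point date per person,
# 'observation' holds dated measurements. Neither mirrors the Workbench schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE endpoint (person_id INTEGER, endpoint_date TEXT);
    CREATE TABLE observation (person_id INTEGER, obs_date TEXT, value REAL);
    INSERT INTO endpoint VALUES (1, '2020-06-01'), (2, '2021-01-15');
    INSERT INTO observation VALUES
        (1, '2020-05-20', 132.0),
        (1, '2019-01-01', 128.0),
        (2, '2021-02-01', 141.0),
        (2, '2022-03-01', 150.0);
""")

WINDOW_DAYS = 90  # assumed window size; the source does not specify one

# Keep only observations dated within WINDOW_DAYS of the person's end point.
query = """
    SELECT o.person_id, o.obs_date, o.value
    FROM observation AS o
    JOIN endpoint AS e ON e.person_id = o.person_id
    WHERE ABS(julianday(o.obs_date) - julianday(e.endpoint_date)) <= ?
    ORDER BY o.person_id, o.obs_date
"""
rows = conn.execute(query, (WINDOW_DAYS,)).fetchall()

# Sample-size estimate: distinct people with at least one in-window observation.
n_people = len({person_id for person_id, _, _ in rows})
print(rows, n_people)
conn.close()
```

Passing the window as a query parameter keeps the SQL text fixed while the time frame is tuned per clinical end point.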

Anticipated Findings

We expect that the demos will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements will perform well even for a huge number of cohorts. This will equip our lab members with the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This workspace will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH - NHGRI)

CCDGS_Tutorial_dev

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess…

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our searches as SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create datasets for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables, 2) how to merge demographic, clinical, and genetic variant data efficiently, and 3) how to generate basic descriptive statistics for a given dataset to evaluate the validity of the cohort.

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical end point. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be: blood pressure (hypertension project); BMI, weight, height, and related diseases (malnutrition); and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate the sample size, and to use R code to create summary reports on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.
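
The stratified tabulation mentioned above could be sketched as follows. The age bands, the ancestry labels "A" and "B", and the record layout are placeholders invented for this example, and Python stands in for the R summary code the project describes.

```python
from collections import Counter

def age_group(age):
    """Assign an age band; the cut points are illustrative assumptions."""
    if age < 18:
        return "0-17"
    if age < 40:
        return "18-39"
    return "40+"

def tabulate(samples):
    """Count samples per (ancestry, age band) stratum."""
    return Counter((s["ancestry"], age_group(s["age"])) for s in samples)

# Synthetic samples with placeholder ancestry labels.
samples = [
    {"ancestry": "A", "age": 10},
    {"ancestry": "A", "age": 12},
    {"ancestry": "B", "age": 25},
    {"ancestry": "B", "age": 60},
]
table = tabulate(samples)
for stratum, n in sorted(table.items()):
    print(stratum, n)
```

Strata with very small counts are easy to spot in such a table, which is useful before any stratified summary of clinical measurements is attempted.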

Anticipated Findings

We expect that the demos will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements will perform well even for a huge number of cohorts. This will equip our lab members with the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This workspace will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH - NHGRI)

Genomics_env_v7_Prevalence_of_carriers_of_inborn_errors_of_metabolism

Inborn errors of metabolism (IEM) are rare diseases that are often recessive. But several large cohort studies revealed metabolic individuality in healthy populations. In other words, one's unique genetic makeup affects the way our body performs metabolism and the disposition…

Scientific Questions Being Studied

Inborn errors of metabolism (IEM) are rare diseases that are often recessive. However, several large cohort studies have revealed metabolic individuality in healthy populations. In other words, a person's unique genetic makeup affects the way their body performs metabolism and their predisposition to metabolism-related diseases. Understanding the prevalence and characteristics of metabolic outliers may help us devise precision nutrition strategies that improve the health of these individuals. The specific questions we would like to investigate in this area using the All of Us data are:
1. The frequency of pathogenic variants in the population
2. The frequency of GWAS metabolism-related variants
3. The demographic characteristics of these predicted metabolic outlier individuals

Project Purpose(s)

  • Disease Focused Research (inherited metabolic disorder)
  • Ancestry

Scientific Approaches

We will first perform some exploratory analysis to understand the data structure of the All of Us data. Then, we will prepare a list of pathogenic variants on IEM genes, as well as a list of GWAS variants on IEM genes that associate with related metabolic traits from existing databases and publications. We will evaluate the frequency of these variants in the All of Us data. The demographic characteristics of the variant carriers will be summarized. As many of these variants may be rare but with large effect sizes, we will aggregate data to the gene level instead of the variant level.
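
One way to picture the gene-level aggregation is the sketch below, which collapses variant-level carrier sets into a per-gene carrier frequency. The input structures, the function name, and the variant and gene identifiers are hypothetical; the project does not describe its implementation, so this is only an illustration in Python under those assumptions.

```python
from collections import defaultdict

def gene_level_carrier_frequency(carriers_by_variant, variant_to_gene, n_participants):
    """Collapse variant-level carrier sets into per-gene carrier frequencies.

    A participant counts once per gene even if they carry several variants in it.
    """
    carriers_by_gene = defaultdict(set)
    for variant, carriers in carriers_by_variant.items():
        carriers_by_gene[variant_to_gene[variant]].update(carriers)
    return {gene: len(ids) / n_participants for gene, ids in carriers_by_gene.items()}

# Synthetic example: variant names, gene names, and participant IDs are placeholders.
carriers_by_variant = {"var1": {1, 2}, "var2": {2, 3}, "var3": {4}}
variant_to_gene = {"var1": "GENE_A", "var2": "GENE_A", "var3": "GENE_B"}
freq = gene_level_carrier_frequency(carriers_by_variant, variant_to_gene, n_participants=10)
print(freq)
```

Using sets means participant 2, who carries two variants in the same gene, is counted once for that gene, which is the point of aggregating rare variants to the gene level.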

Anticipated Findings

We expect this work will give us an estimate of metabolic outlier population size and characteristics for future studies that try to implement precision nutrition.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Ling Cai - Project Personnel, University of Texas Southwestern Medical Center

Post-Bariatric Surgery Secondary Procedures

Primary Objective: Bariatric surgery is a therapeutic intervention employed to manage obesity and its associated comorbidities. The primary objective of this study is to evaluate the surgical management of bariatric surgery patients, with a focus on characterizing the types and…

Scientific Questions Being Studied

Primary Objective:

Bariatric surgery is a therapeutic intervention employed to manage obesity and its associated comorbidities. The primary objective of this study is to evaluate the surgical management of bariatric surgery patients, with a focus on characterizing the types and frequencies of bariatric procedures currently being performed in this patient population.

Secondary Objectives:

To assess the incidence of surgery-specific complications, including wound healing disturbances, infections, nutritional deficiencies, and venous thromboembolism, in patients undergoing bariatric procedures.
To identify predictors of these complications through comparative analyses, utilizing chi-square tests and logistic regression.

Project Purpose(s)

  • Control Set

Scientific Approaches

Inclusion Criteria: Adult patients (≥18 years) who underwent bariatric surgery and had subsequent surgeries.
Variables:
Demographics: Age, sex, BMI, race, smoking status.
Clinical Data: Type of bariatric surgery, nutritional status, comorbidities, and surgical outcomes.
Complications: Wound infections, seroma, VTE, nutritional deficiencies, and readmissions.
Research Methods

Retrospective Cohort Study
Descriptive Analysis:
Summarize demographics and clinical characteristics using frequencies, means, medians, and IQRs.
Comparative Analysis:
Chi-Square/Fisher's Exact Tests: For categorical variables.
T-tests/Mann-Whitney U Tests: For continuous variables based on data normality.
Multivariate Analysis:
Logistic Regression: Identify predictors of complications (e.g., age, BMI, surgery type).
Kaplan-Meier Survival Analysis: Assess time-to-complication events.
Cox Proportional Hazards Models: Explore complication risks over time.
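
As a small worked instance of the chi-square comparison listed above, the sketch below computes the Pearson statistic and p-value for a 2x2 table using only the Python standard library. The counts are synthetic and the function name is an assumption for this example; it applies no continuity correction, and tables with small expected counts would call for Fisher's exact test instead.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Synthetic counts: rows = two surgery types, columns = complication yes / no.
stat, p = chi_square_2x2(20, 80, 10, 90)
print(round(stat, 3), round(p, 4))
```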

Anticipated Findings

This study aims to identify the incidence and predictors of complications in post-bariatric patients undergoing secondary procedures. Expected findings include:
Complication Rates:
Wound healing issues, infections, nutritional deficiencies, and VTE.
Predictors:
Type of bariatric surgery, BMI, age, sex, and nutritional status.
Nutritional Impact:
Deficiencies and their link to complications.
Comparative Analysis:
Complication rates by secondary surgery type.
Longitudinal Outcomes:
Timing, readmissions, reoperations, and mortality.
Contribution to Knowledge
Findings will:

Fill Gaps: Provide comprehensive data on complications and risk factors.
Inform Guidelines: Improve care for nutrition, wound management, and VTE prevention.
Enhance Risk Models: Predict complications and aid surgical planning.
Personalize Care: Support tailored approaches for better outcomes.

This study will enhance clinical care for post-bariatric patients.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of Demo - Hypertension Prevalence

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical…

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient-provided information (PPI) surveys of participants older than 18 and electronic health record (EHR) blood pressure measurements, and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted each participant’s detailed dates of SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
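
The age standardization step can be illustrated with direct standardization, in which each age group's prevalence is weighted by that group's share of a standard population. The sketch below assumes this is the method meant; the counts and weights are synthetic placeholders, not Census or All of Us figures, and the function name is hypothetical.

```python
def age_standardized_prevalence(cases, totals, standard_weights):
    """Direct standardization: sum of stratum prevalence times standard weight."""
    if abs(sum(standard_weights.values()) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    return sum(
        standard_weights[group] * cases[group] / totals[group]
        for group in standard_weights
    )

# Synthetic counts and placeholder weights; not Census or All of Us figures.
cases = {"18-39": 50, "40-59": 150, "60+": 300}
totals = {"18-39": 500, "40-59": 500, "60+": 500}
weights = {"18-39": 0.4, "40-59": 0.35, "60+": 0.25}
rate = age_standardized_prevalence(cases, totals, weights)
print(f"{rate:.3f}")
```

In this toy data the crude prevalence is 500/1500, about 0.333, while the standardized value is lower because the placeholder standard population gives the oldest group less weight than the sample does.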

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that reported in the published literature. The All of Us age-adjusted HTN prevalence was 27.9%, compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Eunkyoung Bae - Research Associate, Korea Advanced Institute of Science and Technology (KAIST)

Health Disparities in Cancer Cachexia

Cancer cachexia is a complex metabolic syndrome characterized by severe muscle wasting, weight loss, and systemic inflammation, significantly affecting the quality of life and survival of cancer patients. Health disparities in cancer cachexia arise due to unequal access to healthcare,…

Scientific Questions Being Studied

Cancer cachexia is a complex metabolic syndrome characterized by severe muscle wasting, weight loss, and systemic inflammation, significantly affecting the quality of life and survival of cancer patients. Health disparities in cancer cachexia arise due to unequal access to healthcare, variations in socioeconomic status, and underlying systemic inequities. These disparities are further exacerbated by delayed diagnosis, limited access to nutritional interventions, and differences in the availability of supportive care. Addressing these disparities requires a multifaceted approach, including improving healthcare access, promoting culturally competent care, and fostering inclusive research to better understand the socioeconomic and biological factors contributing to cachexia progression. Equitable interventions tailored to diverse populations can mitigate the burden of cachexia and improve health outcomes for all cancer patients.

Project Purpose(s)

  • Disease Focused Research (Cancer Cachexia)
  • Population Health

Scientific Approaches

The All of Us Research Program provides a comprehensive and diverse dataset that can be utilized to explore a variety of scientific questions, including those related to metabolic processes and health disparities. This program encompasses data from surveys, electronic health records (EHRs), genomic analyses, and social determinants of health (SDOH), creating an unparalleled resource for multidisciplinary research. Researchers can examine the interplay of biological, environmental, and social factors in health outcomes, enabling a deeper understanding of complex conditions.

Anticipated Findings

Race Disparities: The All of Us dataset includes participants from various racial and ethnic backgrounds, making it possible to explore how race may influence susceptibility to cachexia. For example, genetic, environmental, and social factors differ by race and could impact the severity of cachexia. Additionally, the dataset allows for the exploration of social determinants of health, such as access to care and socio-economic factors, that could exacerbate or mitigate cachexia in different racial groups.
Sex Disparities: Sex differences in cachexia are also well-documented, with men often exhibiting more severe muscle wasting compared to women, due to higher baseline muscle mass and sex hormones like testosterone. The All of Us dataset provides the opportunity to examine how these biological factors, such as hormonal differences and muscle metabolism, interact with other health variables across sexes.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate of Demo - Hypertension Prevalence

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical…

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Educational
  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient-provided information (PPI) surveys of participants older than 18 and electronic health record (EHR) blood pressure measurements, and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted each participant’s detailed dates of SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
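
The primary EHR definition above (codes on 2 distinct dates plus at least one hypertension medication) reduces to a simple rule, sketched here in Python. The function name and inputs are hypothetical, and the sketch covers only the EHR criterion, not the PPI blood pressure add-on, whose thresholds the text does not state.

```python
from datetime import date

def meets_ehr_definition(code_dates, n_medications):
    """True when diagnosis codes fall on >= 2 distinct dates and >= 1 medication exists."""
    return len(set(code_dates)) >= 2 and n_medications >= 1

# Synthetic participants, for illustration only.
same_day = meets_ehr_definition([date(2019, 1, 5), date(2019, 1, 5)], 1)
two_days = meets_ehr_definition([date(2019, 1, 5), date(2020, 3, 9)], 1)
no_meds = meets_ehr_definition([date(2019, 1, 5), date(2020, 3, 9)], 0)
print(same_day, two_days, no_meds)  # prints: False True False
```

Deduplicating the dates with a set is what enforces "distinct dates": two codes recorded on the same day count once.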

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that reported in the published literature. The All of Us age-adjusted HTN prevalence was 27.9%, compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Investigating prevalence and treatment of dysphagia in parkinson disease

This study explores the prevalence of dysphagia in Parkinson’s disease (PD) patients and how frequently they seek treatment. Dysphagia in PD can lead to complications like aspiration pneumonia and malnutrition, making its early identification crucial. Understanding treatment-seeking behaviors may uncover…

Scientific Questions Being Studied

This study explores the prevalence of dysphagia in Parkinson’s disease (PD) patients and how frequently they seek treatment. Dysphagia in PD can lead to complications like aspiration pneumonia and malnutrition, making its early identification crucial. Understanding treatment-seeking behaviors may uncover barriers to care, guiding support improvements. Results will inform healthcare providers on the need for proactive screening to enhance quality of life and reduce hospitalizations. Findings may also shape policies to improve access and referrals for dysphagia care in PD patients.

Project Purpose(s)

  • Disease Focused Research (Parkinson’s disease)

Scientific Approaches

In this study, we’ll use a retrospective cohort design, using CPT codes to identify Parkinson’s disease patients diagnosed with dysphagia. We’ll utilize data on the frequency and types of treatments these patients sought for dysphagia and compare them to those of a cohort with non-neurogenic causes of dysphagia. Statistical analyses will estimate the likelihood and frequency of seeking treatment for these patients. Tools such as SPSS will support data analysis, enabling comparisons of treatment-seeking behaviors and dysphagia prevalence in PD patients.

Anticipated Findings

We expect a high prevalence of dysphagia in Parkinson’s patients, with evidence that they are often undertreated. Findings may reveal lower treatment-seeking rates compared to non-neurogenic dysphagia cases. This study aims to raise awareness of dysphagia in PD and highlight the need for timely referrals. Results will support improved care pathways for PD patients experiencing dysphagia.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Nader Wehbi - Graduate Trainee, University of Arizona

Duplicate of Demo - Hypertension Prevalence

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical…

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient-provided information (PPI) surveys of participants older than 18 and electronic health record (EHR) blood pressure measurements, and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted each participant’s detailed dates of SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that reported in the published literature. The All of Us age-adjusted HTN prevalence was 27.9%, compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

CCDGS_GeneralPediatricCohortReview

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess…

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our searches as SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create datasets for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables, 2) how to merge demographic, clinical, and genetic variant data efficiently, and 3) how to generate basic descriptive statistics for a given dataset to evaluate the validity of the cohort.
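
The merge of demographic, clinical, and genetic variant data described above can be pictured as an inner join on a shared participant identifier. The sketch below uses plain Python dictionaries keyed by a hypothetical person_id; the field names and values are synthetic, and a real workspace would more likely perform the join in SQL or a data-frame library.

```python
def merge_by_person(demographics, clinical, genotypes):
    """Inner-join three person_id-keyed dicts into one flat record per person."""
    shared_ids = demographics.keys() & clinical.keys() & genotypes.keys()
    return {
        pid: {**demographics[pid], **clinical[pid], **genotypes[pid]}
        for pid in sorted(shared_ids)
    }

# Synthetic inputs keyed by a hypothetical person_id; field names are placeholders.
demographics = {1: {"age": 9}, 2: {"age": 14}, 3: {"age": 11}}
clinical = {1: {"bmi": 17.2}, 2: {"bmi": 21.5}}
genotypes = {1: {"variant_x": 0}, 2: {"variant_x": 1}, 3: {"variant_x": 2}}
merged = merge_by_person(demographics, clinical, genotypes)
print(merged)
```

Person 3 is dropped because no clinical record exists for them. An inner join like this keeps the analysis dataset complete, at the cost of silently shrinking the cohort, so the before and after counts are worth reporting.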

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical end point. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be: blood pressure (hypertension project); BMI, weight, height, and related diseases (malnutrition); and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate the sample size, and to use R code to create summary reports on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.

Anticipated Findings

We expect that the demos will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements will perform well even for a huge number of cohorts. This will equip our lab members with the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This workspace will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH - NHGRI)

Collaborators:

  • Yixing Han - Senior Researcher, National Human Genome Research Institute (NIH - NHGRI)
  • Eva Jason - Graduate Trainee, University of California, San Diego

Duplicate of Demo - Hypertension Prevalence_Version3

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical…

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical approaches to understanding and treating hypertension may benefit from the integration of a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from participant (age >18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements, and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted each participant’s detailed dates of SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
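
The primary EHR definition and the age standardization described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: the function names are hypothetical, the standard-population weights are placeholders rather than actual U.S. Census proportions, the participant records are invented, and the additional step of adding participants with elevated PPI blood pressure measurements is not shown.

```python
def meets_ehr_definition(code_dates, n_medications):
    """Primary definition: hypertension codes on at least 2 distinct dates
    and at least one hypertension medication."""
    return len(set(code_dates)) >= 2 and n_medications >= 1


def age_group(age):
    """Map an age in years to one of the three standardization groups."""
    if 18 <= age <= 39:
        return "18-39"
    if 40 <= age <= 59:
        return "40-59"
    if age >= 60:
        return "60+"
    raise ValueError(f"age {age} is outside the adult groups")


def age_standardized_prevalence(participants, weights):
    """Direct standardization: weight each group's prevalence by the
    standard population's share of that group and sum the results."""
    totals = {group: 0 for group in weights}
    cases = {group: 0 for group in weights}
    for age, is_case in participants:
        group = age_group(age)
        totals[group] += 1
        cases[group] += int(is_case)
    empty = [group for group, n in totals.items() if n == 0]
    if empty:
        raise ValueError(f"no participants in groups: {empty}")
    return sum(weights[g] * cases[g] / totals[g] for g in weights)


# Placeholder weights (must sum to 1); NOT the real Census age distribution.
PLACEHOLDER_WEIGHTS = {"18-39": 0.40, "40-59": 0.35, "60+": 0.25}

# Invented (age, has_hypertension) records for illustration only.
toy_participants = [
    (25, False), (30, True), (35, False), (38, False),
    (45, True), (50, False),
    (62, True), (65, True), (70, True), (80, False),
]

crude = sum(is_case for _, is_case in toy_participants) / len(toy_participants)
standardized = age_standardized_prevalence(toy_participants, PLACEHOLDER_WEIGHTS)
print(f"crude={crude:.4f} standardized={standardized:.4f}")
```

In this toy data the older group is over-represented relative to the placeholder weights, so the standardized prevalence comes out lower than the crude one; that gap is the reason for standardizing before comparing two populations with different age structures.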

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that of published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.