Research Projects Directory

13,577 active projects

This information was updated 10/4/2024

The Research Projects Directory includes information about all projects that currently exist in the Researcher Workbench to help provide transparency about how the Workbench is being used. Each project specifies whether Registered Tier or Controlled Tier data are used.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

150 projects have 'nutrition' in the scientific questions being studied description
asthma and diabetes

Scientific Questions Being Studied

To study the co-occurrence of diabetes and asthma in the US population to evaluate their pathophysiology.
We will also examine how, together, they affect the nutrition status of women.

Project Purpose(s)

  • Disease Focused Research (asthma, diabetes)
  • Population Health

Scientific Approaches

This study will use data on lab reports, social determinants of health (SDOH), and medication use.
We will use descriptive statistics to understand these conditions in the population.

Anticipated Findings

We expect that those with comorbid asthma and diabetes will have poorer nutrition outcomes, including anemia and iron deficiency.

Demographic Categories of Interest

  • Sex at Birth

Data Set Used

Registered Tier

Research Team

Owner:

  • Sixtus Aguree - Early Career Tenure-track Researcher, Indiana University

Collaborators:

  • Humberto López Castillo - Early Career Tenure-track Researcher, University of Central Florida
  • Anoushka Shinde - Graduate Trainee, Indiana University

Comprehensive Nutrition Status Report

Scientific Questions Being Studied

This project aims to investigate the concentration and prevalence of biochemical indicators of diet and nutrition in the U.S. population across the diversity of the United States, including biological gender, race, ethnicity, age group, geographic region, education level, and income level. The results of this project will provide a basis for future analyses and research on the association between biomedical indicators and health outcomes. The research question will also potentially inform clinicians, scientists, and public health professionals about disparities and inequities in the current nutrition status of the U.S. population.

Project Purpose(s)

  • Population Health

Scientific Approaches

The database we will be using is the All of Us study, Controlled Tier. The prevalence of deficiency, insufficiency, and sufficiency of each diet- and nutrition-related biomedical indicator will be estimated and reported as n% (95% CI). The prevalence will also be estimated by subpopulation based on biological gender, race, ethnicity, age group, geographic region, education level, and income level. For all variables, outliers will be identified as values greater or less than three times the interquartile range and biologically implausible. Outliers will be excluded from this study. For continuous variables, the medians and IQR will be reported. A p-value of 0.05 will be used to indicate statistical significance.
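
The outlier rule and the prevalence estimate described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the description does not say how the three-times-IQR rule is anchored or which confidence interval method is used, so the sketch assumes fences at Q1 - 3*IQR and Q3 + 3*IQR and a Wilson score interval. The function names are hypothetical and are not part of the project.

```python
import math
import statistics


def iqr_fences(values, k=3.0):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles.

    Assumption: the "three times IQR" rule is read as Tukey-style fences.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=3.0):
    """Keep only the values that fall inside the fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if lower <= v <= upper]


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for about 95%).

    Assumption: the source only says "95% CI"; Wilson is one common choice.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Made-up lab values; 200 is an implausible entry for illustration.
    lab_values = [10, 11, 12, 13, 14, 15, 16, 17, 18, 200]
    cleaned = drop_outliers(lab_values)
    print("median after exclusion:", statistics.median(cleaned))

    # Made-up counts: 20 deficient participants out of 100.
    low, high = wilson_ci(20, 100)
    print(f"prevalence 20.0% (95% CI {low:.1%} to {high:.1%})")
```

In practice the biological plausibility check would still need domain-specific limits for each indicator, which this sketch does not attempt.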

Anticipated Findings

These proposed analyses will allow us to have a systematic understanding of the current nutrition status in the U.S. The results of the proposed study will provide a basis for research that performs more in-depth analyses of diet- and nutrition-related biomedical indicators using All of Us data. The subpopulation prevalence of deficiency, insufficiency, and sufficiency will potentially indicate research opportunities related to the relationship between biology, lifestyle, environment, and health outcomes.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Geography
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Kirtana Devaraj - Graduate Trainee, Cornell University

Osteoporosis and food insecurity

Scientific Questions Being Studied

The question I intend to study is the correlation between self-reported food insecurity and the diagnosis of osteoporosis. Osteoporosis can be an extremely costly disease with its increased risk for fractures and its associated care. There are known ties between osteoporosis and Vitamin D or Calcium insufficiency, so I hope to see if we can indirectly connect food insecurity as a risk factor for these nutritional insufficiencies, and thus a risk factor for osteoporosis.

Project Purpose(s)

  • Disease Focused Research (osteoporosis)
  • Population Health
  • Social / Behavioral

Scientific Approaches

We will be using data from the Social Determinants of Health survey, question codes 40192517 and 40192426, for food insecurity, as well as the EHR Domains for a diagnosis of osteoporosis. We will also separate data by age at the survey to identify any confounding variables. Data will be analyzed using RStudio to identify the Pearson correlation coefficient (r), assuming linear relationships, or Spearman's rank correlation coefficient (rs) if the data contain many outliers.
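
To illustrate the difference between the two coefficients named above, here is a minimal sketch. The project itself plans to use RStudio; this sketch uses Python with only the standard library, and the function names and toy data are hypothetical. Spearman's coefficient is computed as the Pearson correlation of average ranks, which is why it is less sensitive to extreme values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy data only: a monotonic relationship with one extreme value.
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 100]
    print("Pearson r:  ", round(pearson_r(x, y), 3))
    print("Spearman rs:", round(spearman_rs(x, y), 3))
```

With the toy data, the extreme last value pulls Pearson's r below 1 while the rank-based coefficient is unaffected, which matches the stated reason for falling back to Spearman's rs.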

Anticipated Findings

The anticipated finding of this study is a relationship, correlational rather than causal, between food insecurity and the diagnosis of osteoporosis. With this knowledge, we can better educate our patients and community on low-cost foods rich in calcium or vitamin D that are still accessible using the SNAP program or that can be commonly found in limited-access environments. With enhanced education, we can help prevent the adverse, costly outcomes of osteoporosis like fractures.

Demographic Categories of Interest

  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

Food Insecurity among AYA Cancer Survivors

Scientific Questions Being Studied

Food insecurity is a pressing issue for adolescents and young adults (AYAs) who have survived cancer. AYA cancer survivors are especially vulnerable to adverse psychosocial impacts of cancer treatment, including financial toxicity and disruption to normative developmental milestones such as transitioning into adulthood, living independently, and pursuing higher education. The high rates of financial toxicity can impact basic human needs, including food, putting AYA survivors at risk of severe long-term nutritional challenges such as food insecurity. Despite the growing population of over 630,000 AYA survivors in the United States (US), no specific studies have examined their food insecurity.

Hence, our two research questions are:
What are the food insecurity rates among AYA cancer survivors in the United States?
What are the predictors associated with food insecurity among AYA cancer survivors?

Project Purpose(s)

  • Disease Focused Research (cancer)
  • Population Health

Scientific Approaches

Using a cross-sectional research design, we will identify eligible cancer survivors and comparison group individuals in the All of Us Research Program.
AYA survivors. Study inclusion criteria will be: (1) participants who were ever told by their health care provider they have or had cancer; (2) aged 18 to 39 years; and (3) completed questions on food security, cancer diagnosis, type of cancer, age, sex, race, ethnicity, and other sociodemographic information.
Control Group. Individuals who participated in the All of Us Research Program and who did not have a history of cancer diagnosis will be eligible. We will match each survivor to cancer-free individuals by age, race, and sex assigned at birth.
We will analyze the descriptive statistics for both the AYA survivor and the comparison groups.
We will apply binary logistic regression models to calculate the relative risk for food insecurity for AYA survivors and the comparison group, using 95% confidence intervals with robust standard errors.

Anticipated Findings

This will be the first study to estimate food insecurity prevalence among AYA survivors of cancer in the US. A comprehensive PubMed search completed on 08/02/2024 revealed no existing research specifically addressing food insecurity and needs assessment targeting AYA survivors of cancer. By pioneering this research, we aim to provide crucial data on the impact of food insecurity in this group, thereby informing future interventions tailored to their specific needs.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

  • Trishnee Bhurosy - Early Career Tenure-track Researcher, University of Vermont

Duplicate of Utilizing GWAS in understanding the mechanism behind kidney stones

Scientific Questions Being Studied

According to the most recent report by the National Health and Nutrition Examination Survey (NHANES), 20% of men and 10% of women will suffer from kidney stones by the age of 70 years. The reoccurrence rate for kidney stones has been estimated to be 30-40% after 5 years. Thus, there is a need for novel tools in preventing, diagnosing, and treating kidney stones.

The heritability of kidney stone formation is estimated to be around 50%. Kidney stone disease is also an example of a complex disorder, meaning that the disease is caused not by a single gene but by the interaction of many genes.

For this study, we hypothesize that utilizing the latest bioinformatic and genetic techniques such as GWAS could help identify genetic variants and genes that play a role in kidney stone formation. This would both produce a more accurate polygenic risk score (PRS) for kidney stones and identify new drug targets that can aid in prevention and diagnosis of this complex disorder.

Project Purpose(s)

  • Disease Focused Research (nephrolithiasis, urolithiasis)
  • Ancestry

Scientific Approaches

I will be using the All of Us database with the inclusion criteria of “Nephrolithiasis” and/or “Urolithiasis”, with the former being the medical term for kidney stones and the latter for stones in the ureter. I’ll select datasets with “Global Diversity Array” or “Short read whole genome sequencing” available for the participants. I will also be looking at datasets based on ancestral group (race) and age.
I will be doing a genome-wide association study (GWAS) on the data. Specifically, I will compare the frequency of SNPs between people with and without kidney stones.
I will be utilizing the population genomics tool PLINK to do quality control and analysis of the data. I will then utilize fine mapping tools such as PAINTOR and FINEMAP to identify causal variants from the loci, and co-localization tools to overlap GWAS loci with quantitative trait loci (QTL) for gene and protein expression, and metabolite levels in the blood and urine.
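
As a rough illustration of the case-control comparison described above (comparing SNP frequencies between people with and without kidney stones), the sketch below runs a one-degree-of-freedom chi-square test on allele counts for a single SNP. It is a deliberately simplified stand-in, not a replacement for PLINK's quality control and association analysis, and it ignores covariates such as ancestry and age. The function name and the counts are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Returns (chi2, p_value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up allele counts for one SNP: 60/40 in cases, 40/60 in controls.
    chi2, p = allelic_chi_square(60, 40, 40, 60)
    print(f"chi2 = {chi2:.2f}, p = {p:.4g}")
```

A genome-wide scan repeats a test like this for each variant, so a much stricter significance threshold than 0.05 is needed to account for multiple testing.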

Anticipated Findings

Based on extensive literature review, I expect that genetic variants associated with kidney stones will identify genes involved in the renal handling of calcium and phosphate, as well as urate. It is also possible that new mutations that play a role in kidney stone formation will be discovered.

I would also like to compare the differences in genetic associations among different ancestral groups, as many of them have not been accurately represented in past studies, and utilizing databases such as All of Us would address these shortcomings. Kidney stone disease is an example of a complex trait genetic disorder; thus, using tools such as GWAS to help develop an accurate PRS would be helpful for the patient and medical community in preventing and treating this ailment.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

Utilizing GWAS in understanding the mechanism behind kidney stones

Scientific Questions Being Studied

According to the most recent report by the National Health and Nutrition Examination Survey (NHANES), 20% of men and 10% of women will suffer from kidney stones by the age of 70 years. The reoccurrence rate for kidney stones has been estimated to be 30-40% after 5 years. Thus, there is a need for novel tools in preventing, diagnosing, and treating kidney stones.

The heritability of kidney stone formation is estimated to be around 50%. Kidney stone disease is also an example of a complex disorder, meaning that the disease is caused not by a single gene but by the interaction of many genes.

For this study, we hypothesize that utilizing the latest bioinformatic and genetic techniques such as GWAS could help identify genetic variants and genes that play a role in kidney stone formation. This would both produce a more accurate polygenic risk score (PRS) for kidney stones and identify new drug targets that can aid in prevention and diagnosis of this complex disorder.

Project Purpose(s)

  • Disease Focused Research (nephrolithiasis, urolithiasis)
  • Ancestry

Scientific Approaches

I will be using the All of Us database with the inclusion criteria of “Nephrolithiasis” and/or “Urolithiasis”, with the former being the medical term for kidney stones and the latter for stones in the ureter. I’ll select datasets with “Global Diversity Array” or “Short read whole genome sequencing” available for the participants. I will also be looking at datasets based on ancestral group (race) and age.
I will be doing a genome-wide association study (GWAS) on the data. Specifically, I will compare the frequency of SNPs between people with and without kidney stones.
I will be utilizing the population genomics tool PLINK to do quality control and analysis of the data. I will then utilize fine mapping tools such as PAINTOR and FINEMAP to identify causal variants from the loci, and co-localization tools to overlap GWAS loci with quantitative trait loci (QTL) for gene and protein expression, and metabolite levels in the blood and urine.

Anticipated Findings

Based on extensive literature review, I expect that genetic variants associated with kidney stones will identify genes involved in the renal handling of calcium and phosphate, as well as urate. It is also possible that new mutations that play a role in kidney stone formation will be discovered.

I would also like to compare the differences in genetic associations among different ancestral groups, as many of them have not been accurately represented in past studies, and utilizing databases such as All of Us would address these shortcomings. Kidney stone disease is an example of a complex trait genetic disorder; thus, using tools such as GWAS to help develop an accurate PRS would be helpful for the patient and medical community in preventing and treating this ailment.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • vinodh srinivasasainagendra - Project Personnel, University of Alabama at Birmingham
  • Sydney Grooms - Graduate Trainee, University of Alabama at Birmingham

CCDGS_WebToolsQueryDemo

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our search into SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create a dataset for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables, 2) how to merge demographic, clinical, and genetic variant data in an efficient way, and 3) given a dataset, how to generate basic descriptive statistics to evaluate the validity of the cohort.

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical end point. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be: blood pressure (hypertension project), BMI, weight, height, related diseases (malnutrition), and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate sample size, and to use R code to create summary reports on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.

Anticipated Findings

We expect that the demo will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements will perform well even for a huge number of cohorts. This will give our lab members the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH-NHGRI)

Collaborators:

  • Thalia Billawala - Research Assistant, National Human Genome Research Institute (NIH-NHGRI)

CCDGS_GeneralPediatricCohortReview

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our search into SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create a dataset for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables, 2) how to merge demographic, clinical, and genetic variant data in an efficient way, and 3) given a dataset, how to generate basic descriptive statistics to evaluate the validity of the cohort.

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical end point. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be: blood pressure (hypertension project), BMI, weight, height, related diseases (malnutrition), and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate sample size, and to use R code to create summary reports on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.

Anticipated Findings

We expect that the demo will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements will perform well even for a huge number of cohorts. This will give our lab members the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH-NHGRI)

Collaborators:

  • Yixing Han - Senior Researcher, National Human Genome Research Institute (NIH - NHGRI)
  • Eva Jason - Graduate Trainee, University of California, San Diego

Double and triple burden of malnutrition

Scientific Questions Being Studied

The specific question we intend to study is the prevalence and correlates of intraindividual double burden of malnutrition (DBM) and triple burden of malnutrition (TBM) in individuals assigned female at birth. In the US, DBM and TBM are serious problems. According to a 2020 publication utilizing 2006 data, the coexistence of micronutrient deficiency and overweight/obesity was 21.9% among women of reproductive age (15-49 years old). More importantly, most supplementary programs in the US focus on providing credits or foods to solve the public health problem of undernutrition. However, studies have shown that dietary disparities persisted or worsened for most dietary components among recipients, which may worsen health outcomes. The prevalence of overweight, obesity, and diet-related noncommunicable diseases in the US is rising. There is a potentially high prevalence of DBM and TBM in the US, which could cause worse outcomes.

Project Purpose(s)

  • Population Health

Scientific Approaches

The database we will be using is the All of Us study, Controlled Tier. The prevalence of each undernutrition and overnutrition indicator, any single burden of malnutrition, and intra-individual DBM and TBM will be estimated and reported as n% (95% CI). The potential correlates will be identified from previous screening of existing papers regarding risk factors for undernutrition, overnutrition, and DBM. For all variables, outliers will be identified as values greater or less than three times the interquartile range and biologically implausible. Outliers will be excluded from this study. For continuous variables, the medians and IQR will be reported. Univariate binomial models will be used to calculate unadjusted risk ratios and 95% CIs of the associations between the potential risk factors and DBM and TBM. A p-value of 0.05 will be used to indicate statistical significance.
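
For a single binary risk factor, an unadjusted risk ratio and its 95% CI can be computed directly from a 2x2 table, as sketched below. This is an assumption-laden illustration rather than the project's actual modelling code: it uses the standard log-scale (Wald) interval instead of fitting a binomial model, and the function name and counts are hypothetical.

```python
import math


def risk_ratio_ci(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Unadjusted risk ratio with a log-scale Wald confidence interval.

    Returns (rr, lower, upper). z = 1.96 gives an approximate 95% interval.
    """
    if min(exposed_cases, unexposed_cases) <= 0:
        raise ValueError("both groups need at least one case")
    if exposed_cases > exposed_total or unexposed_cases > unexposed_total:
        raise ValueError("cases cannot exceed group totals")

    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) from the delta method.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Made-up counts: 30 of 100 exposed and 15 of 100 unexposed have DBM.
    rr, lower, upper = risk_ratio_ci(30, 100, 15, 100)
    print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 corresponds to statistical significance at roughly the 0.05 level mentioned above.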

Anticipated Findings

These proposed studies will allow us to have a deeper understanding of the dimension of the rising public health issues, DBM and TBM. We will have better insights into future interventions or programs to eliminate malnutrition in the US. Future interventions and policies may not only focus on enhancing total energy intake or protein intake to eliminate undernutrition but also pay more attention to the double burden of malnutrition which may potentially exacerbate poor health outcomes.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Saurabh Mehta - Mid-career Tenured Researcher, Cornell University
  • Naiwen Ji - Graduate Trainee, Cornell University

Collaborators:

  • Srishti Sinha - Research Fellow, Cornell University
  • Samantha Huey - Research Fellow, Cornell University
  • Sarada Ghosh - Research Fellow, Cornell University

NutritionalAssessment

Scientific Questions Being Studied

This project is intended to develop methods for better assessment of malnutrition through automated interpretation of clinical observations. All of Us data is useful for defining a full range of potential clinical presentations relating to malnutrition.

Project Purpose(s)

  • Educational
  • Methods Development

Scientific Approaches

I plan to use a software package developed for application of large language models, OntoGPT (https://github.com/monarch-initiative/ontogpt). Statistical analyses will also be applied.

Anticipated Findings

The result of this study will be a provisional method for inferring likelihood of malnutrition from clinical notes.

Demographic Categories of Interest

  • Age
  • Access to Care

Data Set Used

Registered Tier

Research Team

Owner:

  • Harry Caufield - Research Associate, Lawrence Berkeley National Laboratory

Duplicate of Demo - Hypertension Prevalence

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical approaches to understanding and treating hypertension may benefit from the integration of a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use. )

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient (age>18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted each participant’s detailed dates of SNOMED code for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
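
The age standardization step described above can be sketched as direct standardization: each age group's prevalence is weighted by that group's share of a standard population. The sketch below is illustrative only. The function name is hypothetical, and the counts and weights are made up; they are not All of Us results or actual U.S. Census proportions.

```python
def age_standardized_prevalence(strata, standard_weights):
    """Directly standardized prevalence.

    strata maps age group -> (cases, participants) in the study cohort.
    standard_weights maps age group -> weight in the standard population;
    weights are normalized here, so they need not sum to 1.
    """
    total_weight = sum(standard_weights.values())
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")

    prevalence = 0.0
    for group, weight in standard_weights.items():
        cases, participants = strata[group]
        if participants <= 0:
            raise ValueError(f"no participants in age group {group!r}")
        prevalence += (weight / total_weight) * (cases / participants)
    return prevalence


if __name__ == "__main__":
    # Made-up cohort counts for the three age groups named in the text.
    cohort = {"18-39": (10, 100), "40-59": (20, 100), "60+": (50, 100)}
    # Made-up standard population shares (placeholders, not Census values).
    weights = {"18-39": 0.5, "40-59": 0.3, "60+": 0.2}
    result = age_standardized_prevalence(cohort, weights)
    print(f"age-standardized prevalence: {result:.1%}")
```

Standardizing this way lets a cohort whose age mix differs from the general population be compared with a survey such as NHANES on a common age distribution.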

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that of published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mina Peyton - Project Personnel, National Institute of Allergy and Infectious Diseases (NIH - NIAID)

Duplicate of Demo - Hypertension Prevalence

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use. )

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient (age >18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted each participant’s detailed dates of SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized HTN prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
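As a rough illustration of the two-step case definition described above, the sketch below encodes the EHR rule (codes on at least 2 distinct dates plus at least one medication) and then adds participants flagged with elevated blood pressure. The function names and inputs are hypothetical, and the elevated-pressure flag is passed in as a boolean because the description does not state the thresholds used.

```python
from datetime import date


def meets_ehr_definition(code_dates, medication_count):
    """Primary definition: hypertension codes on >= 2 distinct dates
    and >= 1 recorded hypertension medication."""
    return len(set(code_dates)) >= 2 and medication_count >= 1


def has_hypertension(code_dates, medication_count, elevated_bp):
    """Primary EHR definition, extended by an elevated-BP flag.
    `elevated_bp` is assumed to be computed elsewhere; the cut-offs
    are not given in the project description."""
    return meets_ehr_definition(code_dates, medication_count) or elevated_bp


# Hypothetical participants for illustration.
two_visits = [date(2019, 3, 1), date(2019, 9, 15)]
same_day_twice = [date(2019, 3, 1), date(2019, 3, 1)]

print(has_hypertension(two_visits, 1, False))      # EHR rule satisfied
print(has_hypertension(same_day_twice, 1, False))  # only one distinct date
print(has_hypertension([], 0, True))               # added via elevated BP
```

Using `set(code_dates)` is one way to enforce "distinct dates", so repeated codes recorded on the same day count once.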

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that of published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of LearningWorkspace

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the…

Scientific Questions Being Studied

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the fourth year of our five year grant. Therefore, we want to be as experienced as possible working with existing All of Us data prior to access of the NPH data when it is available.

Project Purpose(s)

  • Other Purpose (This workspace is used for familiarization and preparation for analysis of the Nutrition for Precision Health (NPH) data in the Workbench. The purpose is to ensure that team members are well versed in developing models using the complex data in All of Us before the NPH data is available.)

Scientific Approaches

We will be exploring the genetic data, FitBit data, and EHR data to join data sets and test code and develop models that predict outcomes (e.g. blood pressure). We will be using the Cohort Builder, Dataset Builder and SQL code to explore and combine All of Us data.

Anticipated Findings

The primary purpose of these explorations is not research driven; that is, we are not trying to predict blood pressure from genetic, physical activity, or health-related variables. Rather, the primary purpose is to ensure that our team is well versed in developing models using the complex data in All of Us before the NPH data is available.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Jonathan Day - Teacher/Instructor/Professor, West Point, United States Military Academy

Duplicate of LearningWorkspace

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the…

Scientific Questions Being Studied

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the fourth year of our five year grant. Therefore, we want to be as experienced as possible working with existing All of Us data prior to access of the NPH data when it is available.

Project Purpose(s)

  • Other Purpose (This workspace is used for familiarization and preparation for analysis of the Nutrition for Precision Health (NPH) data in the Workbench. The purpose is to ensure that team members are well versed in developing models using the complex data in All of Us before the NPH data is available.)

Scientific Approaches

We will be exploring the genetic data, FitBit data, and EHR data to join data sets and test code and develop models that predict outcomes (e.g. blood pressure). We will be using the Cohort Builder, Dataset Builder and SQL code to explore and combine All of Us data.

Anticipated Findings

The primary purpose of these explorations is not research driven; that is, we are not trying to predict blood pressure from genetic, physical activity, or health-related variables. Rather, the primary purpose is to ensure that our team is well versed in developing models using the complex data in All of Us before the NPH data is available.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mason Crow - Project Personnel, West Point, United States Military Academy
  • Raymond Blaine - Senior Researcher, West Point, United States Military Academy
  • Jessica Starck - Project Personnel, West Point, United States Military Academy
  • Jonathan Day - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jacob Baxter - Senior Researcher, West Point, United States Military Academy
  • Chris Morrell - Teacher/Instructor/Professor, West Point, United States Military Academy

Collaborators:

  • Diana Thomas - Late Career Tenured Researcher, West Point, United States Military Academy
  • Sarah Bartsch - Project Personnel, City University of New York (CUNY)
  • Robert Thomson - Mid-career Tenured Researcher, West Point, United States Military Academy
  • Michael Scioletti - Late Career Tenured Researcher, West Point, United States Military Academy
  • Marie Martinez - Project Personnel, City University of New York (CUNY)
  • Kevin Cummiskey - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Kevin Chin - Project Personnel, City University of New York (CUNY)
  • Joseph Lindquist - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jessie Heneghan - Project Personnel, City University of New York (CUNY)
  • Grover LaPorte - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Andrew Lee - Teacher/Instructor/Professor, West Point, United States Military Academy
  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University

Duplicate of LearningWorkspace

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the…

Scientific Questions Being Studied

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the fourth year of our five year grant. Therefore, we want to be as experienced as possible working with existing All of Us data prior to access of the NPH data when it is available.

Project Purpose(s)

  • Other Purpose (This workspace is used for familiarization and preparation for analysis of the Nutrition for Precision Health (NPH) data in the Workbench. The purpose is to ensure that team members are well versed in developing models using the complex data in All of Us before the NPH data is available.)

Scientific Approaches

We will be exploring the genetic data, FitBit data, and EHR data to join data sets and test code and develop models that predict outcomes (e.g. blood pressure). We will be using the Cohort Builder, Dataset Builder and SQL code to explore and combine All of Us data.

Anticipated Findings

The primary purpose of these explorations is not research driven; that is, we are not trying to predict blood pressure from genetic, physical activity, or health-related variables. Rather, the primary purpose is to ensure that our team is well versed in developing models using the complex data in All of Us before the NPH data is available.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mason Crow - Project Personnel, West Point, United States Military Academy
  • Raymond Blaine - Senior Researcher, West Point, United States Military Academy
  • Jessica Starck - Project Personnel, West Point, United States Military Academy
  • Jonathan Day - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jacob Baxter - Senior Researcher, West Point, United States Military Academy
  • Chris Morrell - Teacher/Instructor/Professor, West Point, United States Military Academy

Collaborators:

  • Diana Thomas - Late Career Tenured Researcher, West Point, United States Military Academy
  • Sarah Bartsch - Project Personnel, City University of New York (CUNY)
  • Robert Thomson - Mid-career Tenured Researcher, West Point, United States Military Academy
  • Michael Scioletti - Late Career Tenured Researcher, West Point, United States Military Academy
  • Marie Martinez - Project Personnel, City University of New York (CUNY)
  • Kevin Cummiskey - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Kevin Chin - Project Personnel, City University of New York (CUNY)
  • Joseph Lindquist - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jessie Heneghan - Project Personnel, City University of New York (CUNY)
  • Grover LaPorte - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Andrew Lee - Teacher/Instructor/Professor, West Point, United States Military Academy
  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University

Duplicate of LearningWorkspace

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the…

Scientific Questions Being Studied

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the fourth year of our five year grant. Therefore, we want to be as experienced as possible working with existing All of Us data prior to access of the NPH data when it is available.

Project Purpose(s)

  • Other Purpose (This workspace is used for familiarization and preparation for analysis of the Nutrition for Precision Health (NPH) data in the Workbench. The purpose is to ensure that team members are well versed in developing models using the complex data in All of Us before the NPH data is available.)

Scientific Approaches

We will be exploring the genetic data, FitBit data, and EHR data to join data sets and test code and develop models that predict outcomes (e.g. blood pressure). We will be using the Cohort Builder, Dataset Builder and SQL code to explore and combine All of Us data.

Anticipated Findings

The primary purpose of these explorations is not research driven; that is, we are not trying to predict blood pressure from genetic, physical activity, or health-related variables. Rather, the primary purpose is to ensure that our team is well versed in developing models using the complex data in All of Us before the NPH data is available.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mason Crow - Project Personnel, West Point, United States Military Academy
  • Raymond Blaine - Senior Researcher, West Point, United States Military Academy
  • Jessica Starck - Project Personnel, West Point, United States Military Academy
  • Jonathan Day - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jacob Baxter - Senior Researcher, West Point, United States Military Academy
  • Chris Morrell - Teacher/Instructor/Professor, West Point, United States Military Academy

Collaborators:

  • Diana Thomas - Late Career Tenured Researcher, West Point, United States Military Academy
  • Sarah Bartsch - Project Personnel, City University of New York (CUNY)
  • Robert Thomson - Mid-career Tenured Researcher, West Point, United States Military Academy
  • Michael Scioletti - Late Career Tenured Researcher, West Point, United States Military Academy
  • Marie Martinez - Project Personnel, City University of New York (CUNY)
  • Kevin Cummiskey - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Kevin Chin - Project Personnel, City University of New York (CUNY)
  • Joseph Lindquist - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jessie Heneghan - Project Personnel, City University of New York (CUNY)
  • Grover LaPorte - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Andrew Lee - Teacher/Instructor/Professor, West Point, United States Military Academy
  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University

LearningWorkspace

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the…

Scientific Questions Being Studied

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the fourth year of our five year grant. Therefore, we want to be as experienced as possible working with existing All of Us data prior to access of the NPH data when it is available.

Project Purpose(s)

  • Other Purpose (This workspace is used for familiarization and preparation for analysis of the Nutrition for Precision Health (NPH) data in the Workbench. The purpose is to ensure that team members are well versed in developing models using the complex data in All of Us before the NPH data is available.)

Scientific Approaches

We will be exploring the genetic data, FitBit data, and EHR data to join data sets and test code and develop models that predict outcomes (e.g. blood pressure). We will be using the Cohort Builder, Dataset Builder and SQL code to explore and combine All of Us data.

Anticipated Findings

The primary purpose of these explorations is not research driven; that is, we are not trying to predict blood pressure from genetic, physical activity, or health-related variables. Rather, the primary purpose is to ensure that our team is well versed in developing models using the complex data in All of Us before the NPH data is available.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mason Crow - Project Personnel, West Point, United States Military Academy
  • Raymond Blaine - Senior Researcher, West Point, United States Military Academy
  • Jessica Starck - Project Personnel, West Point, United States Military Academy
  • Jacob Baxter - Senior Researcher, West Point, United States Military Academy
  • Chris Morrell - Teacher/Instructor/Professor, West Point, United States Military Academy

Collaborators:

  • Diana Thomas - Late Career Tenured Researcher, West Point, United States Military Academy
  • Sarah Bartsch - Project Personnel, City University of New York (CUNY)
  • Robert Thomson - Mid-career Tenured Researcher, West Point, United States Military Academy
  • Michael Scioletti - Late Career Tenured Researcher, West Point, United States Military Academy
  • Marie Martinez - Project Personnel, City University of New York (CUNY)
  • Kevin Cummiskey - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Kevin Chin - Project Personnel, City University of New York (CUNY)
  • Joseph Lindquist - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Jessie Heneghan - Project Personnel, City University of New York (CUNY)
  • Grover LaPorte - Teacher/Instructor/Professor, West Point, United States Military Academy
  • Andrew Lee - Teacher/Instructor/Professor, West Point, United States Military Academy
  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University
  • Jonathan Day - Teacher/Instructor/Professor, West Point, United States Military Academy

Unsupervised Learning for Dietary Pattern Modeling

Coronary artery disease is a leading cause of death in the U.S., and diet is an important primary prevention strategy for coronary artery disease. Precision nutrition aims to create personalized dietary recommendations based on an individual’s unique physiology but is…

Scientific Questions Being Studied

Coronary artery disease is a leading cause of death in the U.S., and diet is an important primary prevention strategy for coronary artery disease. Precision nutrition aims to create personalized dietary recommendations based on an individual’s unique physiology but is not yet a reality because we cannot accurately predict what dietary pattern(s) will associate with disease at an individual level. Two major barriers, modeling complex dietary intake patterns and identifying objective and precise measures of dietary intake, need to be addressed to support the adoption of precision nutrition. This proposal aims to use innovative unsupervised learning methods to model compositional dietary exposures and compare their utility to that of traditional dietary pattern analysis methods.

Project Purpose(s)

  • Disease Focused Research (coronary artery disease)
  • Population Health
  • Methods Development

Scientific Approaches

Using linked dietary intake (total energy intake, macronutrients, micronutrients, supplements) and clinical (medication, procedure, disease status) data, I will use unsupervised learning to model complex patterns of dietary intake. I will perform a comparative analysis with traditional, data-driven dietary pattern analysis methods, such as principal components analysis (PCA) and k-means clustering, and compare: (1) the consistency and novelty of dietary patterns identified across methods; (2) the predictive power for coronary artery disease; (3) the ability to handle missing data; and (4) model interpretability/explainability.
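For readers unfamiliar with the traditional baseline named above, here is a minimal pure-Python sketch of k-means clustering applied to dietary composition. The intake values are invented (carbohydrate, protein, and fat shares of energy), and the deterministic initialization from the first k points is a simplification chosen for reproducibility; it is not the project's method.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: alternate nearest-centroid assignment and
    centroid recomputation for a fixed number of iterations."""
    centroids = [list(p) for p in points[:k]]  # simplification: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical energy shares: (carbohydrate, protein, fat).
diets = [
    (0.60, 0.15, 0.25),
    (0.30, 0.30, 0.40),
    (0.58, 0.17, 0.25),
    (0.32, 0.28, 0.40),
    (0.62, 0.14, 0.24),
]

labels, centroids = kmeans(diets, k=2)
print(labels)
```

On this toy data the higher-carbohydrate diets and the higher-protein, higher-fat diets fall into separate clusters. Real dietary data would typically need scaling or a compositional transform first, which this sketch omits.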

Anticipated Findings

I hypothesize that innovative methods applied will identify novel dietary patterns and outperform traditional methods in coronary artery disease risk prediction. By applying innovative artificial intelligence to large-scale biobank data, this project will improve the efficacy of dietary pattern modeling. Ultimately, this study has the capacity to advance the prevention of cardiometabolic disease and support precision nutrition by addressing the difficulties of modeling and analyzing complex dietary intake data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Hannah Kittrell - Graduate Trainee, Icahn School of Medicine at Mount Sinai

Duplicate of CCDGS_GeneralPediatricCohortReview eva

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess…

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our search as SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create a dataset for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables; 2) how to merge demographic, clinical, and genetic variant data efficiently; and 3) given a dataset, how to generate basic descriptive statistics to evaluate the validity of the cohort.

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical end point. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be blood pressure (hypertension project); BMI, weight, height, and related diseases (malnutrition); and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate sample size, and to use R code to create summary reports on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.
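The SQL tabulation step described above might look roughly like the following sketch, which counts distinct participants with a given measurement, stratified by ancestry and age group. It runs against an in-memory SQLite database with made-up tables and rows; the table names, column names, and age cut-off are assumptions for illustration and do not reflect the Workbench schema. Python is used here only to drive the query, whereas the project itself plans to use R for its summary reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy schema and rows, not the Workbench tables.
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, ancestry TEXT, age INTEGER);
CREATE TABLE measurement (person_id INTEGER, name TEXT, value REAL);
INSERT INTO person VALUES (1, 'AFR', 8), (2, 'AFR', 15), (3, 'EUR', 9), (4, 'EUR', 16);
INSERT INTO measurement VALUES
    (1, 'systolic_bp', 110), (2, 'systolic_bp', 118),
    (2, 'systolic_bp', 122), (3, 'systolic_bp', 105);
""")

# Count each participant once per stratum, even with repeated measurements.
query = """
SELECT p.ancestry,
       CASE WHEN p.age < 12 THEN '<12' ELSE '12+' END AS age_group,
       COUNT(DISTINCT p.person_id) AS n_participants
FROM person AS p
JOIN measurement AS m ON m.person_id = p.person_id
WHERE m.name = 'systolic_bp'
GROUP BY p.ancestry, age_group
ORDER BY p.ancestry, age_group
"""

rows = conn.execute(query).fetchall()
for ancestry, age_group, n in rows:
    print(ancestry, age_group, n)
```

`COUNT(DISTINCT ...)` matters here because longitudinal EHR data usually holds several rows per participant; a plain `COUNT(*)` would report measurements rather than people.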

Anticipated Findings

We expect that the demo will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements to perform well even for a huge number of cohorts. This will prepare our lab members with the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH-NHGRI)
  • Eva Jason - Graduate Trainee, University of California, San Diego

Creatine Global

Q1: To evaluate links between dietary creatine intake and health outcomes and biomarkers of creatine turnover in the general population Q2: To establish a threshold of creatine intake linked to impaired health in the general population Q3: To establish the…

Scientific Questions Being Studied

Q1: To evaluate links between dietary creatine intake and health outcomes and biomarkers of creatine turnover in the general population
Q2: To establish a threshold of creatine intake linked to impaired health in the general population
Q3: To establish the contribution of various demographic and nutritional variables that interfere with dietary creatine intake
Q4: To establish the correlation between creatine intake and biomarkers of creatine turnover in blood and urine

The above questions are relevant to providing additional evidence on whether creatine should be considered an essential or semi-essential nutrient.

Project Purpose(s)

  • Population Health

Scientific Approaches

Datasets: general population (0+ years)
Research methods: inferential statistics, regression/correlation models (linear and logistic)
Tools: Dietary assessment data, biospecimens, anthropometric measurements, health data, etc.

Anticipated Findings

Anticipated findings: More creatine is associated with lower risk of chronic diseases and malnutrition
Contribution: Creatine recognized as semi-essential nutrient in human nutrition

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Sergej Ostojic - Mid-career Tenured Researcher, Applied Bioenergetics Lab at the University of Novi Sad

Duplicate of LearningWorkspace_MD

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the…

Scientific Questions Being Studied

This work is in preparation for analysis of the Nutrition for Precision Health (NPH) in the Workbench. Team members have less than a year to develop the NPH models as the NPH data will not become fully available until the fourth year of our five year grant. Therefore, we want to be as experienced as possible working with existing All of Us data prior to access of the NPH data when it is available.

Project Purpose(s)

  • Other Purpose (This workspace is used for familiarization and preparation for analysis of the Nutrition for Precision Health (NPH) data in the Workbench. The purpose is to ensure that team members are well versed in developing models using the complex data in All of Us before the NPH data is available.)

Scientific Approaches

We will be exploring the genetic data, FitBit data, and EHR data to join data sets and test code and develop models that predict outcomes (e.g. blood pressure). We will be using the Cohort Builder, Dataset Builder and SQL code to explore and combine All of Us data.

Anticipated Findings

The primary purpose of these explorations is not research driven; that is, we are not trying to predict blood pressure from genetic, physical activity, or health-related variables. Rather, the primary purpose is to ensure that our team is well versed in developing models using the complex data in All of Us before the NPH data is available.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University

Duplicate of Duplicate of NPH AI Summit Workshop 2024 MD

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. By running the exercises in this workspace, researchers will become more familiar with the phenotypic data on the Workbench (e.g., surveys, electronic health records,…

Scientific Questions Being Studied

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. By running the exercises in this workspace, researchers will become more familiar with the phenotypic data on the Workbench (e.g., surveys, electronic health records, and Fitbit) and better understand how to leverage this data for Nutrition for Precision Health research.

Project Purpose(s)

  • Educational

Scientific Approaches

We are using the All of Us dataset to explore phenotypic data on the Researcher Workbench. In the workshop, we will give an introduction to the All of Us Researcher Workbench and demonstrate how to use the Cohort Builder and Jupyter Notebooks to set up a research project. Using Jupyter Notebooks, we will create a dataset containing survey, electronic health record, and Fitbit information on participants with type II diabetes.

Anticipated Findings

We anticipate that the workshop participants will be able to apply similar methods to their future research using the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University

Duplicate of NPH AI Summit Workshop 2024 MC

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. By running the exercises in this workspace, researchers will become more familiar with the phenotypic data on the Workbench (e.g., surveys, electronic health records,…

Scientific Questions Being Studied

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. By running the exercises in this workspace, researchers will become more familiar with the phenotypic data on the Workbench (e.g., surveys, electronic health records, and Fitbit) and better understand how to leverage this data for Nutrition for Precision Health research.

Project Purpose(s)

  • Educational

Scientific Approaches

We are using the All of Us dataset to explore phenotypic data on the Researcher Workbench. In the workshop, we will give an introduction to the All of Us Researcher Workbench and demonstrate how to use the Cohort Builder and Jupyter Notebooks to set up a research project. Using Jupyter Notebooks, we will create a dataset containing survey, electronic health record, and Fitbit information on participants with type II diabetes.

Anticipated Findings

We anticipate that the workshop participants will be able to apply similar methods to their future research using the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mason Crow - Project Personnel, West Point, United States Military Academy

Collaborators:

  • MOUSSA DOUMBIA - Early Career Tenure-track Researcher, Howard University

Duplicate of Demo - Hypertension Prevalence

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?" Clinical…

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.
Citation for this work: Chandler, P.D., Clark, C.R., Zhou, G. et al. Hypertension prevalence in the All of Us Research Program among groups traditionally underrepresented in medical research. Sci Rep 11, 12849 (2021). https://doi.org/10.1038/s41598-021-92143-w

Project Purpose(s)

  • Educational
  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from patient (age >18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements, and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added participants with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted the detailed dates of each participant’s SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized hypertension (HTN) prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
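The two computational steps described above, the EHR case definition and the age standardization, can be illustrated with a minimal Python sketch. This is not the project's code: the function names `has_ehr_hypertension` and `age_standardized_prevalence` are hypothetical, and every count and weight is a made-up value chosen for illustration, not an All of Us or U.S. Census figure. The standardization shown is direct standardization, which is assumed here to be what "age-standardized" refers to.

```python
# Illustrative sketch only. Function names are hypothetical, and all
# counts and weights are made-up values, not All of Us or Census data.


def has_ehr_hypertension(code_dates, n_hypertension_meds):
    """EHR definition from the text: hypertension codes on at least
    2 distinct dates and at least one hypertension medication."""
    return len(set(code_dates)) >= 2 and n_hypertension_meds >= 1


def age_standardized_prevalence(cases, totals, standard_weights):
    """Direct standardization: weight each age group's crude prevalence
    by that group's share of a standard population.

    All three arguments are dicts keyed by age-group label. The weights
    are normalized here, so they do not need to sum to 1.
    """
    weight_sum = sum(standard_weights.values())
    prevalence = 0.0
    for group, weight in standard_weights.items():
        if totals[group] == 0:
            raise ValueError(f"no participants in age group {group}")
        prevalence += (weight / weight_sum) * (cases[group] / totals[group])
    return prevalence


# Made-up example using the three age groups named in the text.
cases = {"18-39": 100, "40-59": 300, "60+": 500}
totals = {"18-39": 1000, "40-59": 1000, "60+": 1000}
weights = {"18-39": 0.40, "40-59": 0.35, "60+": 0.25}

crude = sum(cases.values()) / sum(totals.values())
standardized = age_standardized_prevalence(cases, totals, weights)
print(f"crude={crude:.3f} standardized={standardized:.3f}")
```

In this toy example the standardized value is lower than the crude one because the hypothetical standard population gives less weight to the oldest group than the sample does.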

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that of published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be utilized to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

CCDGS_Tutorial_Level1_WebToolsQueryDemo

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess…

Scientific Questions Being Studied

Our center studies genetic mechanisms that drive pediatric diseases in diverse populations. We have a couple of ongoing studies on hypertension, malnutrition, and alloimmunization progression that use both clinical data and a handful of candidate genetic variants to assess the effects of genes. To access the diverse populations enriched with those candidate genetic variants, we need to formulate our search as SQL statements, calculate the number of samples, and summarize the clinical characteristics of the samples. We also need to merge the EHR data with genetic variant data to create a dataset for statistical analysis. This workspace will contain demos on 1) how to refine clinical criteria based on different EHR tables; 2) how to merge demographic, clinical, and genetic variant data in an efficient way; and 3) given a dataset, how to generate basic descriptive statistics to evaluate the validity of the cohort.
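As a rough illustration of the merge step named above, the sketch below joins three small tables on a shared participant identifier using only the Python standard library. The helper `merge_by_person`, the field names, and all values are hypothetical and are not taken from the workspace; a real analysis would more likely rely on SQL joins or a dataframe library.

```python
# Illustrative sketch only. The helper name, field names, and values
# are hypothetical and do not come from the workspace.

demographics = {
    1: {"age": 12, "sex": "F"},
    2: {"age": 9, "sex": "M"},
    3: {"age": 15, "sex": "F"},
}
clinical = {
    1: {"systolic_bp": 118},
    2: {"systolic_bp": 131},
}
genotypes = {
    1: {"variant_a_copies": 0},
    3: {"variant_a_copies": 2},
}


def merge_by_person(*tables):
    """Inner-join dict tables on the person_id keys they all share."""
    shared_ids = set(tables[0])
    for table in tables[1:]:
        shared_ids &= set(table)

    merged = {}
    for person_id in sorted(shared_ids):
        row = {"person_id": person_id}
        for table in tables:
            row.update(table[person_id])
        merged[person_id] = row
    return merged


merged = merge_by_person(demographics, clinical, genotypes)
for row in merged.values():
    print(row)
```

Only participants present in every table survive an inner join, so comparing `len(merged)` with the size of each input is a quick check on how much of the cohort is lost at this step.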

Project Purpose(s)

  • Educational

Scientific Approaches

We are going to write SQL statements that select observation data within a reasonable time frame of our clinical endpoint. We are going to tabulate the number of samples by self-reported ethnicity, ancestry estimated from genomic data, age, and other important factors. We are going to summarize the clinical data related to the phenotypes, stratified by ancestry, age, and gender. Some examples of the clinical data would be: blood pressure (hypertension project); BMI, weight, height, and related diseases (malnutrition); and antibody reactions (alloimmunization). Our approach is to use SQL to select the appropriate EHR data to be evaluated, to use SQL to estimate the sample size, and to use R code to create a summary report on the cohorts. This will be a pipeline that others can follow if they would like to use AoU data to select an observation cohort for any given disease.
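The first two steps of this approach, selecting observations near a clinical endpoint and counting participants per stratum, can be sketched with an in-memory SQLite database. This is an assumption-laden stand-in: the Researcher Workbench uses its own database and schema, and the simplified tables, columns, 30-day window, and data below are all hypothetical. The project text names R for the summary report; Python is used here only to keep the example self-contained.

```python
import sqlite3

# Illustrative sketch only. SQLite stands in for the Workbench database;
# the tables, columns, window length, and rows are all hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, ancestry TEXT);
    CREATE TABLE endpoint (person_id INTEGER, endpoint_date TEXT);
    CREATE TABLE measurement (
        person_id INTEGER, measurement_date TEXT, systolic_bp REAL
    );
    INSERT INTO person VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO endpoint VALUES
        (1, '2020-06-01'), (2, '2020-06-01'), (3, '2020-06-01');
    INSERT INTO measurement VALUES
        (1, '2020-05-20', 120.0),
        (1, '2019-01-01', 110.0),
        (2, '2020-06-10', 140.0),
        (3, '2018-03-03', 130.0);
    """
)

WINDOW_DAYS = 30  # hypothetical "reasonable time frame" around the endpoint

# Keep measurements within WINDOW_DAYS of each participant's endpoint,
# then count participants and average the measurement per ancestry group.
query = """
    SELECT p.ancestry,
           COUNT(DISTINCT m.person_id) AS n_participants,
           AVG(m.systolic_bp) AS mean_systolic_bp
    FROM measurement AS m
    JOIN endpoint AS e ON e.person_id = m.person_id
    JOIN person AS p ON p.person_id = m.person_id
    WHERE ABS(julianday(m.measurement_date) - julianday(e.endpoint_date)) <= ?
    GROUP BY p.ancestry
    ORDER BY p.ancestry
"""
rows = conn.execute(query, (WINDOW_DAYS,)).fetchall()
conn.close()

for ancestry, n_participants, mean_systolic_bp in rows:
    print(ancestry, n_participants, mean_systolic_bp)
```

Passing the window length as a query parameter keeps the time frame easy to vary when checking how sensitive the sample size is to that choice.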

Anticipated Findings

We expect that this demo will help others understand how to use web tool queries to start an exploratory search for possible cohorts for their study of interest. Our code will help others filter through longitudinal EHR records and identify the relevant ones efficiently. We expect our SQL statements to perform well even for a large number of cohorts. This will give our lab members the knowledge and code to embark on their own large-scale analyses. Other researchers may follow a similar strategy to create their cohorts and build datasets. This workspace will serve as a repository of code and pipelines for other researchers who want to use a similar AoU data query approach.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Qing Li - Research Associate, National Human Genome Research Institute (NIH-NHGRI)

Collaborators:

  • Eva Jason - Graduate Trainee, University of California, San Diego
Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.