Research Projects Directory

1,637 active projects

This information was updated 5/28/2022

The Research Projects Directory includes information about all projects that currently exist in the Researcher Workbench to help provide transparency about how the Workbench is being used. Each project specifies whether Registered Tier or Controlled Tier data are used.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

Introductory example of GWAS with type 2 diabetes phenotype

Scientific Questions Being Studied

Not applicable - this workspace is intended to be an introductory example of how to do a genome-wide association study on the All of Us genomic data that individuals can easily click through and understand.

Project Purpose(s)

  • Educational

Scientific Approaches

Not applicable - this workspace is intended to be an introductory example of how to do a genome-wide association study on the All of Us genomic data that individuals can easily click through and understand.

Anticipated Findings

Not applicable - this workspace is intended to be an introductory example of how to do a genome-wide association study on the All of Us genomic data that individuals can easily click through and understand.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Taotao Tan - Other, Baylor College of Medicine
  • Elizabeth Atkinson - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Xyanthine Parillon - Other, Baylor College of Medicine
  • Victoria Mgbemena - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Varuna Chander - Graduate Trainee, Baylor College of Medicine
  • Titilayo Olubajo - Research Fellow, Baylor College of Medicine
  • Erik Stricker - Graduate Trainee, Baylor College of Medicine
  • Sangeeta Tiwari - Early Career Tenure-track Researcher, University of Texas at El Paso
  • Shamika Ketkar - Other, Baylor College of Medicine
  • Sabur Badmos - Research Fellow, University of Texas at El Paso
  • Robert Petrovic - Graduate Trainee, Baylor College of Medicine
  • Renita Horton - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Zaida Ramirez-Ortiz - Early Career Tenure-track Researcher, University of Massachusetts Medical School
  • Nirav Shah - Graduate Trainee, Baylor College of Medicine
  • Nilsson Holguin - Early Career Tenure-track Researcher, Icahn School of Medicine at Mount Sinai
  • Nyasha Chambwe - Early Career Tenure-track Researcher, Feinstein Institute for Medical Research
  • Chang In Moon - Graduate Trainee, Baylor College of Medicine
  • Leslie Johnson - Early Career Tenure-track Researcher, Emory University
  • Lesley Chapman Hannah - Research Fellow, National Cancer Institute (NIH - NCI)
  • Luisa Cervantes-Barragan - Early Career Tenure-track Researcher, Emory University
  • Kim Worley - Other, Baylor College of Medicine
  • Kevin Wilhelm - Graduate Trainee, Baylor College of Medicine
  • Kimiko Krieger - Research Fellow, Baylor College of Medicine
  • Panagiotis Katsonis - Other, Baylor College of Medicine
  • Jose Nolazco - Research Fellow, Baylor College of Medicine
  • Joyonna Gamble-George - Research Fellow, New York University
  • Janitza Montalvo-Ortiz - Early Career Tenure-track Researcher, Yale University
  • Heather Danhof - Other, Baylor College of Medicine
  • Paola Giusti-Rodriguez - Other, University of Florida
  • Gary Huang - Graduate Trainee, Baylor College of Medicine
  • Fei Yue - Research Fellow, Baylor College of Medicine
  • Erick Olivares Bravo - Research Fellow, University of Texas, San Antonio
  • Emily Jackson-Osagie - Early Career Tenure-track Researcher, Southern University and A&M College
  • Elisa Marroquin - Research Fellow, University of Texas Health Science Center, Houston
  • Deyana Lewis - Research Fellow, National Institutes of Health (NIH)
  • Dawanna White - Early Career Tenure-track Researcher, Hampton University
  • Cathy Samayoa - Research Fellow, University of California, San Francisco
  • Charcacia Sanders - Other, Baylor College of Medicine
  • Catherine Ann Gavile - Research Fellow, University of Utah
  • Carlos Eduardo Guerra Amorim - Early Career Tenure-track Researcher, California State University, Northridge
  • Yuan Yao - Project Personnel, Baylor College of Medicine
  • Amy Adams - Other, University of South Alabama
  • Adriana Visbal - Early Career Tenure-track Researcher, Baylor College of Medicine

Quick Demo of Plots and Analyses

Scientific Questions Being Studied

Not applicable - this workspace is intended to be an introductory example of how to easily click through analyses, for new users of the All of Us Researcher Workbench with whom this notebook will be shared. This workspace is adapted from the demo workspace "How to Get Started with the Registered Tier Data".

Project Purpose(s)

  • Educational
  • Methods Development

Scientific Approaches

Not applicable - this workspace is intended to be an introductory example of how to easily click through analyses, for new users of the All of Us Researcher Workbench with whom this notebook will be shared. This workspace is adapted from the demo workspace "How to Get Started with the Registered Tier Data".

Anticipated Findings

Not applicable - this workspace is intended to be an introductory example of how to easily click through analyses, for new users of the All of Us Researcher Workbench with whom this notebook will be shared. This workspace is adapted from the demo workspace "How to Get Started with the Registered Tier Data".

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Shamika Ketkar - Other, Baylor College of Medicine
  • Xyanthine Parillon - Other, Baylor College of Medicine
  • Victoria Mgbemena - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Varuna Chander - Graduate Trainee, Baylor College of Medicine
  • Titilayo Olubajo - Research Fellow, Baylor College of Medicine
  • Taotao Tan - Other, Baylor College of Medicine
  • Erik Stricker - Graduate Trainee, Baylor College of Medicine
  • Sangeeta Tiwari - Early Career Tenure-track Researcher, University of Texas at El Paso
  • Sabur Badmos - Research Fellow, University of Texas at El Paso
  • Robert Petrovic - Graduate Trainee, Baylor College of Medicine
  • Renita Horton - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Zaida Ramirez-Ortiz - Early Career Tenure-track Researcher, University of Massachusetts Medical School
  • Nirav Shah - Graduate Trainee, Baylor College of Medicine
  • Nilsson Holguin - Early Career Tenure-track Researcher, Icahn School of Medicine at Mount Sinai
  • Nyasha Chambwe - Early Career Tenure-track Researcher, Feinstein Institute for Medical Research
  • Chang In Moon - Graduate Trainee, Baylor College of Medicine
  • Leslie Johnson - Early Career Tenure-track Researcher, Emory University
  • Luz Garcini - Early Career Tenure-track Researcher, University of Texas Health Science Center, San Antonio
  • Lesley Chapman Hannah - Research Fellow, National Cancer Institute (NIH - NCI)
  • Luisa Cervantes-Barragan - Early Career Tenure-track Researcher, Emory University
  • Lalita Wadhwa - Other, Baylor College of Medicine
  • Kim Worley - Other, Baylor College of Medicine
  • Kevin Wilhelm - Graduate Trainee, Baylor College of Medicine
  • Kimiko Krieger - Research Fellow, Baylor College of Medicine
  • Panagiotis Katsonis - Other, Baylor College of Medicine
  • Jose Nolazco - Research Fellow, Baylor College of Medicine
  • Rommel Johnson - Early Career Tenure-track Researcher, University of Texas, Rio Grande Valley
  • Joyonna Gamble-George - Research Fellow, New York University
  • Janitza Montalvo-Ortiz - Early Career Tenure-track Researcher, Yale University
  • Heather Danhof - Other, Baylor College of Medicine
  • Paola Giusti-Rodriguez - Other, University of Florida
  • Gary Huang - Graduate Trainee, Baylor College of Medicine
  • Fei Yue - Research Fellow, Baylor College of Medicine
  • Fatemeh Choupani - Research Fellow, University of Washington
  • Erick Olivares Bravo - Research Fellow, University of Texas, San Antonio
  • Emily Jackson-Osagie - Early Career Tenure-track Researcher, Southern University and A&M College
  • Elizabeth Atkinson - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Elisa Marroquin - Research Fellow, University of Texas Health Science Center, Houston
  • Deyana Lewis - Research Fellow, National Institutes of Health (NIH)
  • Dawanna White - Early Career Tenure-track Researcher, Hampton University
  • Cathy Samayoa - Research Fellow, University of California, San Francisco
  • Charcacia Sanders - Other, Baylor College of Medicine
  • Catherine Ann Gavile - Research Fellow, University of Utah
  • Carlos Eduardo Guerra Amorim - Early Career Tenure-track Researcher, California State University, Northridge
  • Yuan Yao - Project Personnel, Baylor College of Medicine
  • Bassent Abdelbary - Other, University of Texas, Rio Grande Valley
  • Amy Adams - Other, University of South Alabama
  • Adriana Visbal - Early Career Tenure-track Researcher, Baylor College of Medicine

Studies of Autosomal Dominant Polycystic Kidney Disease (ADPKD)

Scientific Questions Being Studied

Not applicable - this workspace is intended to be an example of how to use the Researcher Workbench by studying participants with Autosomal Dominant Polycystic Kidney Disease (ADPKD).

Project Purpose(s)

  • Educational

Scientific Approaches

Not applicable - this workspace is intended to be an example of how to use the Researcher Workbench by studying participants with Autosomal Dominant Polycystic Kidney Disease (ADPKD).

Anticipated Findings

Not applicable - this workspace is intended to be an example of how to use the Researcher Workbench by studying participants with Autosomal Dominant Polycystic Kidney Disease (ADPKD).

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Xyanthine Parillon - Other, Baylor College of Medicine
  • Victoria Mgbemena - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Varuna Chander - Graduate Trainee, Baylor College of Medicine
  • Titilayo Olubajo - Research Fellow, Baylor College of Medicine
  • Taotao Tan - Other, Baylor College of Medicine
  • Erik Stricker - Graduate Trainee, Baylor College of Medicine
  • Sangeeta Tiwari - Early Career Tenure-track Researcher, University of Texas at El Paso
  • Shamika Ketkar - Other, Baylor College of Medicine
  • Sabur Badmos - Research Fellow, University of Texas at El Paso
  • Robert Petrovic - Graduate Trainee, Baylor College of Medicine
  • Renita Horton - Early Career Tenure-track Researcher, Baylor College of Medicine
  • Zaida Ramirez-Ortiz - Early Career Tenure-track Researcher, University of Massachusetts Medical School
  • Nirav Shah - Graduate Trainee, Baylor College of Medicine
  • Nilsson Holguin - Early Career Tenure-track Researcher, Icahn School of Medicine at Mount Sinai
  • Nyasha Chambwe - Early Career Tenure-track Researcher, Feinstein Institute for Medical Research
  • Chang In Moon - Graduate Trainee, Baylor College of Medicine
  • Leslie Johnson - Early Career Tenure-track Researcher, Emory University
  • Luz Garcini - Early Career Tenure-track Researcher, University of Texas Health Science Center, San Antonio
  • Lesley Chapman Hannah - Research Fellow, National Cancer Institute (NIH - NCI)
  • Luisa Cervantes-Barragan - Early Career Tenure-track Researcher, Emory University
  • Lalita Wadhwa - Other, Baylor College of Medicine
  • Kevin Wilhelm - Graduate Trainee, Baylor College of Medicine
  • Kimiko Krieger - Research Fellow, Baylor College of Medicine
  • Panagiotis Katsonis - Other, Baylor College of Medicine
  • Jose Nolazco - Research Fellow, Baylor College of Medicine
  • Rommel Johnson - Early Career Tenure-track Researcher, University of Texas, Rio Grande Valley
  • Joyonna Gamble-George - Research Fellow, New York University
  • Janitza Montalvo-Ortiz - Early Career Tenure-track Researcher, Yale University
  • Heather Danhof - Other, Baylor College of Medicine
  • Paola Giusti-Rodriguez - Other, University of Florida
  • Gary Huang - Graduate Trainee, Baylor College of Medicine
  • Fei Yue - Research Fellow, Baylor College of Medicine
  • Fatemeh Choupani - Research Fellow, University of Washington
  • Erick Olivares Bravo - Research Fellow, University of Texas, San Antonio
  • Emily Jackson-Osagie - Early Career Tenure-track Researcher, Southern University and A&M College
  • Elisa Marroquin - Research Fellow, University of Texas Health Science Center, Houston
  • Deyana Lewis - Research Fellow, National Institutes of Health (NIH)
  • Dawanna White - Early Career Tenure-track Researcher, Hampton University
  • Cathy Samayoa - Research Fellow, University of California, San Francisco
  • Charcacia Sanders - Other, Baylor College of Medicine
  • Catherine Ann Gavile - Research Fellow, University of Utah
  • Carlos Eduardo Guerra Amorim - Early Career Tenure-track Researcher, California State University, Northridge
  • Yuan Yao - Project Personnel, Baylor College of Medicine
  • Bassent Abdelbary - Other, University of Texas, Rio Grande Valley
  • Amy Adams - Other, University of South Alabama
  • Adriana Visbal - Early Career Tenure-track Researcher, Baylor College of Medicine

Duplicate of Wearables Data and the Human Phenome Tier 5

Scientific Questions Being Studied

Our primary goal is to understand how activity levels and sleep quality interact with the development and progression of human disease. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses guiding clinical and research interventions focused on activity and sleep to reduce morbidity and mortality in patients seeking care.

Project Purpose(s)

  • Population Health
  • Social / Behavioral

Scientific Approaches

We will examine the relationship between daily activity (steps, activity intensity) over time and the prevalence and progression of coded human diseases. We will use the Fitbit data, EHR-curated diagnoses, laboratory values, quality of life survey results, and clinical outcomes (hospitalizations/mortality).

Anticipated Findings

We expect to find that lower levels of activity are associated with a higher prevalence and more rapid progression of chronic diseases. These data will provide the rationale to link wearables data with electronic health records nationwide as a window into behavioral activity choice as a modifiable risk factor for chronic diseases. We may find substantial variation in activity and disease prevalence/severity by socioeconomic status, which would motivate studies/interventions to reduce these health disparities.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

RECOVER+AoU

Scientific Questions Being Studied

This initial cross-platform testing effort focuses on expanding the analytical capability of available data sources that have collected data on SARS-CoV-2. As we gather data across the US, we can use independent data sources to better understand PASC (post-acute sequelae of SARS-CoV-2 infection) in our population and identify possible interventions. As a first step, we hope to leverage available RECOVER data tools and apply them within the All of Us Researcher Workbench to assess cross-platform interoperability and analytical equivalence. This would provide a path to engage our research community and guide research toward a better understanding of PASC.

Project Purpose(s)

  • Population Health
  • Methods Development
  • Control Set
  • Other Purpose (Testing PASC ML Algorithm from N3C-RECOVER in AoU Platform)

Scientific Approaches

Bring existing data query code and data analytics code from the RECOVER researcher team into the All of Us Researcher Workbench. Use “equivalent” code sets to explore and expand our understanding of PASC and its effects on the US population. Share reproducible findings through programming “notebook” and analysis of standardized datasets (OMOP).

Anticipated Findings

This research activity will be developed in conjunction with an awareness campaign of the collaborative efforts undertaken by both RECOVER and AoU. We intend to highlight the available datasets with SARS-CoV-2 data, as well as the cloud-based researcher workspaces (RECOVER, AoU). With the awareness campaign and cross-platform testing, we intend to create an on-ramp for experienced and young researchers within two large and diverse datasets.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)

Collaborators:

  • WeiQi Wei - Other, All of Us Program Operational Use
  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center

Genomic risk prediction of opioid use disorder in diverse populations

Scientific Questions Being Studied

Genome-wide association studies of substance addiction and dependence have identified variants that contribute to a small portion of the total disorder variance. While these data have been useful in identifying novel genetic associations, in large part they have not been used to predict risk of disease. Recently, groups have begun to leverage the contribution of variants that do not reach genome-wide significance to improve the prediction of complex diseases. Unfortunately, these measures have not been developed for or applied to non-European populations. Our hope is to leverage the All of Us data to derive disease effect estimates for opioid use and opioid use disorder in diverse populations, so that an ancestry-tailored risk component can be appropriately applied to polygenic risk scores involving admixed populations (i.e., populations admixed with multiple continental ancestries, including European ancestry).

Project Purpose(s)

  • Disease Focused Research (Opioid use disorder and related mental comorbidities)
  • Population Health
  • Social / Behavioral
  • Methods Development
  • Ancestry

Scientific Approaches

We will cover non-Hispanic White, Black, and Asian populations in this study. Three types of substance use phenotypes will be investigated: 1) opioid use; 2) opioid use disorder; 3) other comorbid disorders.

Aim 1: Perform GWAS, then create and validate polygenic risk scores (PRS) in the All of Us dataset. A standard GWAS will be conducted. PRS will be created using several variant sets and approaches, including BSLMM, LDpred, PRSice, and Lasso penalized regression. These PRS will be evaluated in the second half of the All of Us cohort. The PRS with the highest area under the ROC curve, or the best value of another discrimination metric, will be selected.

Aim 2: Test and evaluate PRS in other datasets we collected previously. We will test PRS in our study populations consisting of multiple ancestry groups. Testing risk scores in African Americans will involve applying the appropriate risk estimate for African, Asian, and European ancestry to each individual based on their admixture and local ancestry.
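The selection step in Aim 1 (keep the PRS with the highest area under the ROC curve) can be illustrated with a minimal sketch. This is not the team's pipeline: the function names, the candidate labels, and the toy scores are hypothetical, and the AUC is computed with the simple pairwise (Mann-Whitney) formulation, which is adequate only for small illustrative inputs.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


def select_best_prs(candidates, labels):
    """Return the name of the candidate PRS with the highest AUC, plus all AUCs.

    `candidates` maps a hypothetical method label to per-person scores
    computed on the held-out evaluation half of the cohort.
    """
    aucs = {name: roc_auc(scores, labels) for name, scores in candidates.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Toy evaluation set: 1 = case, 0 = control (made-up values).
labels = [1, 1, 0, 0]
candidates = {
    "method_a": [0.9, 0.8, 0.1, 0.2],
    "method_b": [0.9, 0.1, 0.8, 0.2],
}
best, aucs = select_best_prs(candidates, labels)
```

In practice the scores would come from the PRS tools named above, and a different discrimination metric could be substituted for `roc_auc` without changing the selection logic.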

Anticipated Findings

1. Ancestry-shared and ancestry-specific susceptibility loci for opioid use and opioid use disorder.
2. Optimized PRS for each disease or phenotype in European and African-descent populations.
3. A clinically actionable risk prediction model incorporating PRS, family history, and environmental factors will be built to help patient treatment and risk management.
4. This will greatly improve risk prediction in non-white, minority populations for opioid use disorder and other related comorbidities.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Hongsheng Gui - Early Career Tenure-track Researcher, Henry Ford Health System

Collaborators:

  • Ze Meng - Early Career Tenure-track Researcher, Henry Ford Health System

LBP

Scientific Questions Being Studied

Practice

Project Purpose(s)

  • Other Purpose (Practice)

Scientific Approaches

Practice

Anticipated Findings

Practice

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Nilsson Holguin - Early Career Tenure-track Researcher, Icahn School of Medicine at Mount Sinai

ARI Workspace V5

Scientific Questions Being Studied

We now have 4 goals in our research. This workspace is for goals 1 through 3. We have created a new workspace for Goal #4.

1. Determine prevalence of autoimmune diseases, individually and as a class of disease, in the US.

2. Determine comorbidity of autoimmune diseases, including statistics on comorbidity of other autoimmune diseases and non-autoimmune diseases for each autoimmune disease.

3. Determine the impact of COVID-19 on the autoimmune and autoinflammatory disease population. This work will be conducted in parallel with work we are doing at University of Southern California under an IRB there.

4. Explore the genomic component of autoimmune diseases, particularly among patients with more than one autoimmune disease, so that the underlying mechanisms of disease among these diseases can be better understood.

Project Purpose(s)

  • Disease Focused Research (Autoimmune diseases)
  • Ancestry

Scientific Approaches

We will create three data sets for analysis:

1. A list of diseases rated in the following ways:

a. Evidence Class
i. Strong evidence it is autoimmune
ii. Moderate evidence it is autoimmune
iii. Weak evidence for autoimmunity
iv. A comorbidity of autoimmune disease
v. Symptom or symptom set with no known mechanism

b. Autoinflammatory versus autoimmune flag

c. “Not always autoimmune” flag – to indicate diseases that could have alternative mechanisms of cause

2. A list of patients, anonymized, with socioeconomic, geographic and other data that would be of interest to patients and public health officials to understand which communities are affected by these diseases
3. Outcomes data for patients over time assessing quality of life using PROMIS metrics

Anticipated Findings

The current NIH estimate of 23.5 million people with autoimmune disease was a guess by a knowledgeable clinician, but has no scientific support. As a consequence, there are numerous figures in the public sphere and nobody knows which one is correct.

Many reports say autoimmune diseases are on the increase, but since the number is unknown, it is impossible to say whether this is a public health issue or not. Having a methodology that can be used to recompute the number of people with autoimmune disease will help us understand if these reports are true.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Francis Ratsimbazafy - Other, All of Us Program Operational Use
  • Jeremy Harper - Senior Researcher, Autoimmune Registry
  • Jeffrey Green - Project Personnel, Autoimmune Registry
  • Emily Holladay - Project Personnel, Autoimmune Registry
  • Alexander Burrows - Research Assistant, Autoimmune Registry
  • Adnaan Jhetam - Project Personnel, Autoimmune Registry
  • Jagannadha Avasarala - Other, University of Kentucky

Heart Failure Quality of Care and Relation to Social Determinant of Health

Scientific Questions Being Studied

Background:

Multiple studies have shown that patients with heart failure (HF) who experience adverse downstream effects of social determinants of health (SDOH) and healthcare disparities are less able to access care and more likely to experience poor HF outcomes over time because they cannot achieve guideline-directed medical therapy (GDMT). The most up-to-date first-line medications for all populations are ARNIs, beta blockers, aldosterone antagonists, and SGLT2 inhibitors. We will examine adherence to GDMT in US adults with HF and its relation to race/ethnicity (Asian American, Black, Hispanic, and White) and socioeconomic status (income level, education status, and health insurance).

SCIENTIFIC QUESTIONS BEING STUDIED:

AIM 1: We will examine adherence to guideline-directed medical therapy (GDMT) for heart failure in participants of different ethnicities and socioeconomic statuses.

AIM 2: We will examine ethnicity and social determinants of health as predictors of adherence to GDMT in HF.

Project Purpose(s)

  • Disease Focused Research (Heart Failure)

Scientific Approaches

We will use the cohort builder to find US participants aged 18 years and older who have been diagnosed with HF. We would also like to utilize the participants’ provided information and electronic health records to access information about participants’ overall health standing, lifestyle, medication lists, and demographic details.

We will identify the proportion of persons with HF who are on all four recommended medications as compared to 3, 2, or only 1 recommended therapy. We will use the Chi-Square test to compare the extent of adherence to recommended medications to demographic characteristics and social determinants of health. Multiple logistic regression will be used to examine how these factors relate to the odds of being on at least 3 of the recommended therapies.
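The counting and comparison steps described above can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the medication class labels, the helper names, and the example table are hypothetical, and only the chi-square statistic and degrees of freedom are computed (a p-value would come from a statistics library). The sketch also assumes no row or column of the table sums to zero.

```python
# Hypothetical labels for the four recommended medication classes named above.
GDMT_CLASSES = ("ARNI", "beta_blocker", "aldosterone_antagonist", "SGLT2_inhibitor")


def count_gdmt(medication_classes):
    """Number of recommended GDMT classes (0 to 4) present in a participant's list."""
    present = set(medication_classes)
    return sum(1 for cls in GDMT_CLASSES if cls in present)


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, e.g. rows = demographic
    groups and columns = adherence categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Made-up counts: two groups by (on >= 3 therapies, on < 3 therapies).
example_table = [[10, 20], [20, 10]]
statistic, dof = chi_square(example_table)
```

The binary outcome used for the logistic regression (on at least 3 of the recommended therapies) would then be `count_gdmt(meds) >= 3` for each participant.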

Anticipated Findings

With our cross-sectional study, we anticipate observing significant differences in GDMT adherence among different races/ethnicities. We also expect that certain non-white ethnic groups and those of lower socioeconomic status or without health insurance will be less likely to be on GDMT. We hope our findings will make physicians more aware of socioeconomic barriers to care that may undermine the ability to achieve GDMT, and thereby help patients achieve a better quality of life.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care
  • Education Level
  • Income Level
  • Others

Data Set Used

Registered Tier

Research Team

Owner:

  • Trinh Do - Graduate Trainee, University of California, Irvine

Collaborators:

  • Yufan Gong - Graduate Trainee, University of California, Los Angeles

Srushti_LongCovid

Scientific Questions Being Studied

Train Machine Learning models to identify potential long-COVID patients among (1) all COVID-19 patients, (2) patients hospitalized with COVID-19, and (3) patients who had COVID-19 but were not hospitalized.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

To reflect that long-COVID may look different depending on the severity of the patient’s acute COVID-19, we built three different ML models using the three-site subset: (1) all patients, (2) patients who had been hospitalized with acute COVID-19, and (3) patients who were not hospitalized. The intent of each model is to identify the patients most likely to have long-COVID, using attendance at a long-COVID specialty clinic as a proxy for long-COVID diagnosis. To train and test each model, patients were randomly sampled to yield similar patient counts in both classes (long-COVID clinic patients and patients who did not attend the long-COVID clinic). For the all-patient model, data were also sampled to yield similar numbers of hospitalized and non-hospitalized patients.
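The class-balancing step described above can be sketched as a random downsampling of the larger class. This is only one way to obtain "similar patient counts in both classes"; whether the original work downsampled, and to what ratio, is not stated here, so exact equal counts are an assumption. The function name, the `long_covid_clinic` field, and the toy records are hypothetical.

```python
import random


def balance_classes(records, label_key, seed=0):
    """Randomly downsample every class to the size of the smallest class.

    `records` is a list of dicts; `label_key` names the field holding the
    class label. A fixed seed keeps the sample reproducible.
    """
    rng = random.Random(seed)
    by_class = {}
    for record in records:
        by_class.setdefault(record[label_key], []).append(record)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid leaving the classes in contiguous blocks
    return balanced


# Toy data: 10 clinic attendees and 90 non-attendees (made-up records).
patients = [{"id": i, "long_covid_clinic": 1} for i in range(10)]
patients += [{"id": i, "long_covid_clinic": 0} for i in range(10, 100)]
balanced = balance_classes(patients, "long_covid_clinic", seed=1)
```

For the all-patient model, the same helper could be applied a second time with a hospitalization field as `label_key`, or to the combination of both fields, to even out hospitalized and non-hospitalized patients.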

Anticipated Findings

The combined demographics of the long-COVID clinic patients show significant differences from the COVID-19 patients at those sites who did not attend the long-COVID clinic (third and fourth columns of Table 1). Notably, non-hospitalized long-COVID clinic patients are disproportionately female.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

VEK -- Duplicate of How to Work with All of Us Physical Measurements Data

Scientific Questions Being Studied

This is a testing workspace that is a copy of the example workspace "How to Work with All of Us Physical Measurements Data". This project focuses on learning how to navigate around physical measurements in the All of Us Researcher Workbench.

I am using this workspace to get an overview of the data available in the Registered Tier of the AoU dataset.

Project Purpose(s)

  • Other Purpose (Improve the researcher's understanding of the All of Us Researcher Workbench. I am using this workspace to get an overview of the data available in the Registered Tier of the AoU dataset.)

Scientific Approaches

This workspace uses the data provided in the original All of Us workspace "How to Work with All of Us Physical Measurements Data".

I am using this workspace to get an overview of the data available in the Registered Tier of the AoU dataset.

Anticipated Findings

The researcher will gain additional knowledge on how to use the All of Us Researcher Workbench.

I am using this workspace to get an overview of the data available in the Registered Tier of the AoU dataset.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center

VEK -- Duplicate of How to Get Started with Registered Tier Data (tier 5)

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data. What should you expect? This notebook will give you an overview of what data is available in the current…

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect? This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Record (EHR), Physical Measurements (PM), and Survey data.

I am using this to get an overview of the data available in the Registered Tier of the AoU dataset.

Project Purpose(s)

  • Educational
  • Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation. I am using this workspace to develop familiarity with the AoU workspace.)

Scientific Approaches

This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:

1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with major data types: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users that want to do their own cleaning.

I am using this workspace to get an overview of the data available in the Registered Tier of the AoU dataset.

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, you will understand the following:

All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.

I am using this workspace to get an overview of the data available in the Registered Tier of the AoU dataset.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center

Health Traits Associated with PSAT1

We would like to know what health-related traits are associated with reductions in the activity of PSAT1. The PSAT1 protein helps to produce a metabolite (serine) that is required for many cellular processes. However, the effects of PSAT1 loss of…

Scientific Questions Being Studied

We would like to know what health-related traits are associated with reductions in the activity of PSAT1. The PSAT1 protein helps to produce a metabolite (serine) that is required for many cellular processes. However, the effects of PSAT1 loss of function mutations have mostly been studied in patients with severe deficiencies in PSAT1 function. Because these patients are rare and suffer from a severe (often fatal) developmental syndrome, it is likely that there are additional health related traits associated with more modest reductions in PSAT1 activity that have not been discovered. The study we propose is important because it has been shown that for some of these patients, supplying additional serine in the diet can alleviate and, if started early, even prevent the onset of symptoms. It is therefore possible that serine supplementation is a precision treatment that may benefit individuals with specific mutations in the PSAT1 gene.

Project Purpose(s)

  • Disease Focused Research (inherited metabolic disorder)
  • Ancestry

Scientific Approaches

Our lab has developed an experimental method for predicting which sequence changes in the human PSAT1 protein are likely to impair its ability to function properly, i.e., loss of function mutations. Using this method, we have been able to characterize which PSAT1 mutations are likely to cause severe defects, mild to moderate defects, or no effect at all. We will examine the genome sequences of All of Us participants to identify a group of individuals with PSAT1 loss of function mutations. We will then use statistical methods to test whether this group of participants is at greater risk of specific health-related traits than other participants.
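
One way such a carrier versus non-carrier comparison could be tested is a one-sided Fisher's exact test on a 2x2 table. The sketch below is an assumption for illustration only: the project description does not name the statistical method, and the counts are made up.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: carriers, non-carriers. Columns: with trait, without trait.
    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` carriers with the trait if carrier status and
    the trait were independent.
    """
    carriers = a + b
    with_trait = a + c
    total = a + b + c + d
    upper = min(carriers, with_trait)
    favorable = sum(
        comb(carriers, k) * comb(total - carriers, with_trait - k)
        for k in range(a, upper + 1)
    )
    return favorable / comb(total, with_trait)


# Hypothetical counts: 3 of 4 carriers have the trait, 1 of 4 non-carriers do.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

An exact test is a natural fit when the carrier group is small, as is likely for rare loss of function variants; in practice, covariates such as age or ancestry would call for a regression model instead.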

Anticipated Findings

From this study we expect to identify a set of health related traits that may be caused by a reduction in the activity of the PSAT1 protein. Medical case reports and other scientific studies based on very small numbers of patients have suggested that such traits might include epilepsy, macular degeneration, and skin disorders. The analysis we propose would allow us to test these hypotheses in a statistically rigorous way and possibly to identify new associations between PSAT1 and health related traits.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Aimee Dudley - Mid-career Tenured Researcher, Pacific Northwest Research Institute

Collaborators:

  • Michelle Tang - Research Fellow, Pacific Northwest Research Institute
  • Michael Xie - Graduate Trainee, Pacific Northwest Research Institute
  • Russell Lo - Late Career Tenured Researcher, Pacific Northwest Research Institute

Survey of Data Relevant to Urea Cycle Disorders

Urea cycle disorders (UCDs) are rare, inherited diseases caused by mutations that affect the function of one of eight proteins in the urea cycle (ARG1, ASL, ASS1, CPS1, NAGS, OTC, SLC25A13, and SLC25A15). The urea cycle is a series of…

Scientific Questions Being Studied

Urea cycle disorders (UCDs) are rare, inherited diseases caused by mutations that affect the function of one of eight proteins in the urea cycle (ARG1, ASL, ASS1, CPS1, NAGS, OTC, SLC25A13, and SLC25A15). The urea cycle is a series of biochemical steps that remove a metabolic waste product (nitrogen) from the blood and convert it to a compound called urea, which is removed from the body through the urine. In urea cycle disorders, nitrogen accumulates in the form of a highly toxic substance (ammonia), resulting in high levels of ammonia in the blood. Ammonia then reaches the brain, where it can cause brain damage, coma and/or death.
The disease progression of urea cycle disorders is highly variable and strongly influenced by a variety of genetic and environmental factors. The overall goal of our research effort is to improve the understanding, diagnosis, and treatment of urea cycle disorders.

Project Purpose(s)

  • Disease Focused Research (urea cycle disorders)
  • Ancestry

Scientific Approaches

Our group includes clinicians who treat UCD patients, geneticists and biochemists who are experts in urea cycle biology, and technologists who have developed high throughput assays for measuring the effect of urea cycle gene mutations on the function of the proteins. In this exploratory study, we will examine the All of Us genome sequence data and health related traits to answer the following questions: How many mutations in the urea cycle genes are present in the All of Us dataset? How many of these mutations are novel or previously described (in scientific publications)? What is the degree to which these mutations overlap with mutations we are analyzing in our ongoing clinical and laboratory research? How many participants have mutations in these genes and how many of those also have detailed health data which could be useful for future analysis?

Anticipated Findings

At the completion of this exploratory study, we hope to have a thorough understanding of the extent to which the data in All of Us might allow us to perform statistically rigorous analyses of the relationships between genetic variation in the urea cycle genes and health related traits.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Aimee Dudley - Mid-career Tenured Researcher, Pacific Northwest Research Institute

Collaborators:

  • Russell Lo - Late Career Tenured Researcher, Pacific Northwest Research Institute
  • Michelle Tang - Research Fellow, Pacific Northwest Research Institute
  • Michael Xie - Graduate Trainee, Pacific Northwest Research Institute

Wernicke-Korsakoff

What are the genetic aspects of Wernicke encephalopathy? How rare are they? Could understanding these help the public identify the disease before it develops into Korsakoff syndrome?

Scientific Questions Being Studied

What are the genetic aspects of Wernicke encephalopathy? How rare are they? Could understanding these help the public identify the disease before it develops into Korsakoff syndrome?

Project Purpose(s)

  • Disease Focused Research (Wernicke encephalopathy)
  • Educational
  • Ancestry

Scientific Approaches

We hope to gather research covering the genetic aspect, environmental aspect, and treatment options for Wernicke encephalopathy.

Anticipated Findings

We hope to determine how rare Wernicke encephalopathy is and to explain this lesser-known disease to the public using infographics and creativity.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

ML Genetic Diabetes

Large-scale genome-wide association studies (GWAS) have identified many genetic variants associated with complex diseases. Most GWAS have been developed within European ancestries and have been shown to perform poorly in other racial/ethnic groups, exacerbating health disparities across ancestries. Scientific aim…

Scientific Questions Being Studied

Large-scale genome-wide association studies (GWAS) have identified many genetic variants associated with complex diseases. Most GWAS have been developed within European ancestries and have been shown to perform poorly in other racial/ethnic groups, exacerbating health disparities across ancestries.

Scientific aim 1: Collection, harmonization and integration of large-scale, multi-ancestry cohorts with diabetes traits across the life-span and genomics for the discovery of genetic variants associated with several forms of diabetes.
Scientific aim 2: Development of methods to improve PRS prediction in non-European populations by using Bayesian approaches that allow integration of linkage disequilibrium and summary statistics from several ancestries.
Scientific aim 3: Development, testing, and comparing performance of PRS for each trait, development of risk prediction tools that integrate clinical and genetic risk factors, and assessment of scenarios where PRS improve the prediction.

Project Purpose(s)

  • Disease Focused Research (type 2 diabetes mellitus, type 1 diabetes mellitus)
  • Educational
  • Drug Development
  • Methods Development
  • Control Set
  • Ancestry

Scientific Approaches

We will integrate GWAS data from other large-scale genetic studies with the All of Us data in order to:

1- Maximize the discovery of ancestry-specific genetic variants associated with a variety of forms of diabetes and complications. We will use a variety of genetic association tools, including Regenie for GWAS analysis; METAL for meta-analysis of results across ancestries; and Finemap, Susie, and COJO to identify distinct signals.

2- Develop PRSs for diverse ancestries. We will use PRS-CS, PRS-CSx and other methods to develop and test PRSs for all ancestries.

3- We will functionally annotate genetic variants using publicly available resources such as the Roadmap Epigenomics browser, FANTOM, VEP, gnomAD, etc.
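
As an illustration of the scoring step that follows PRS development, a polygenic risk score for one person is commonly computed as a weighted sum of effect-allele dosages. The sketch below shows only that final step; the variant IDs, weights, and genotypes are hypothetical, and the weight estimation itself (for example by PRS-CS or PRS-CSx) is not reproduced here.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per variant).

    Variants with a missing dosage are skipped; this is a simplifying
    assumption, and real pipelines often impute them instead.
    """
    score = 0.0
    for variant, beta in weights.items():
        dosage = dosages.get(variant)
        if dosage is None:
            continue
        score += beta * dosage
    return score


# Hypothetical per-variant effect sizes (log odds ratios).
weights = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}

# Hypothetical genotype for one participant.
participant = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_score(participant, weights)
```

Raw scores are usually standardized against a reference distribution before they are compared across individuals or ancestries.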

Anticipated Findings

We anticipate that the results of this study will, in part, address health disparities in several ways:

1- We will improve PRSs for all ancestries so that they can be useful for all individuals if applied clinically.

2- We will identify potential drug targets, including some drug targets that can only be identified when analyzing specific ancestries.

3- We will discover novel variants or genes that will help better understand the biology of diabetes and related traits.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Josephine Li - Research Fellow, Mass General Brigham

PREVENT

Our goal is to develop polygenic risk scores (PRSs) for diverse ancestry groups to ensure equitable implementation of genomic medicine and reduce the potential worsening of health disparities in the context of genomic medicine. Our focus is on atherosclerotic vascular…

Scientific Questions Being Studied

Our goal is to develop polygenic risk scores (PRSs) for diverse ancestry groups to ensure equitable implementation of genomic medicine and reduce the potential worsening of health disparities in the context of genomic medicine. Our focus is on atherosclerotic vascular disease (ASCVD) including coronary heart disease (CHD), peripheral artery disease (PAD), abdominal aortic aneurysm (AAA), and the related risk factors: hypertension, diabetes, obesity, and hypercholesterolemia. We hypothesize that we can reduce the gap in the performance of PRSs between diverse populations by developing methods to generate PRSs for populations of diverse ancestry. All of Us will be a critical resource in this context, given that diversity is a priority in this program. We will meta-analyze the available genotype data along with similar data from dbGaP and additional datasets to improve performance of PRSs in African American, Latino, and Asian populations.

Project Purpose(s)

  • Disease Focused Research (Coronary Heart Disease)
  • Population Health
  • Methods Development
  • Control Set
  • Ancestry

Scientific Approaches

To generate PRSs for diverse ancestries, we will use data from the eMERGE consortium, the Million Veteran Program (MVP), the All of Us (AoU) program, dbGaP, PRIMED consortium sites, the UK Biobank, and collaborations with several international groups representing Middle Eastern, South Asian, and East Asian cohorts. Our specific aims are: Aim 1. Integrate and harmonize data from heterogeneous sources to enable cross-platform phenotyping and generation of PRSs for common diseases in diverse ancestry groups. Aim 2. Develop PRSs for CHD and its major risk factors (hypertension, diabetes, obesity, hypercholesterolemia) in populations of diverse ancestry. Aim 3. Develop novel statistical and computational methods to account for diverse genetic ancestry and admixture in models of polygenic risk. Aim 4. Develop ‘clinic ready’ PRSs for diverse ancestry groups by creating reference distributions of a CHD PRS and integrating it with clinical information to compute absolute risk estimates.

Anticipated Findings

We anticipate that increasing representation of diverse populations in genotyped datasets will enable the generation of more robust PRSs in these populations. We expect to advance PRS methodology for diverse populations and use novel population genetics approaches. Additionally, we will develop ‘clinic ready’ PRSs for diverse ancestry groups by creating reference distributions of a CHD PRS and integrating it with clinical information to compute absolute risk estimates.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

Identifying Gaps in Health Data Capture and Standardization

What types of data are not well represented in the All of Us Controlled Tier Dataset v5 for answering precision medicine-related questions? What are the current limitations in health data capture and standardization for researchers? How can the All of…

Scientific Questions Being Studied

What types of data are not well represented in the All of Us Controlled Tier Dataset v5 for answering precision medicine-related questions? What are the current limitations in health data capture and standardization for researchers? How can the All of Us Dataset improve to better answer precision medicine-related questions? How can health systems improve EHR data capture and data standardization models to improve health outcomes? How can the Researcher Workbench provide an educational opportunity around the value of participating and engaging in health research? In what ways can the All of Us Researcher Workbench and Research Hub increase participation and engagement from underrepresented groups in health research?

Project Purpose(s)

  • Population Health
  • Social / Behavioral
  • Educational
  • Drug Development
  • Methods Development
  • Control Set
  • Ancestry
  • Ethical, Legal, and Social Implications (ELSI)
  • Other Purpose (Community Based Participatory Research (CBPR))

Scientific Approaches

The plan is to test community-driven health research questions and evaluate the practicality and data availability for addressing these specific questions within the All of Us Controlled Tier Dataset v5. Utilizing a Community Based Participatory Research (CBPR) approach, research questions will be sourced from Health Provider Organizations (HPOs), Community Engagement Partners, Participants, and prospective participants. The research methods and dissemination process will be co-developed with any relevant stakeholders for addressing each research question.

Anticipated Findings

Through the research process and dissemination of the findings, general public awareness of health research and precision medicine is expected to increase as a result of the study. Fielding precision health-related research questions from a diverse cohort of question submitters may reveal gaps in our ability to answer specific types of research questions with the All of Us Controlled Dataset v5. The findings may include specific examples where relevant health information is not well captured or standardized to be analyzable by health researchers. Identification of these gaps may be used to better inform changes in local health data infrastructure, Natural Language Processing (NLP) algorithms, and data standardization models that better enable precision medicine approaches.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Willy Ju - Project Personnel, University of California, Davis

Exercise and Vision Impairment

We intend to assess the effect of daily activity on mood symptoms in patients with vision impairment.

Scientific Questions Being Studied

We intend to assess the effect of daily activity on mood symptoms in patients with vision impairment.

Project Purpose(s)

  • Disease Focused Research (blindness, glaucoma)
  • Social / Behavioral

Scientific Approaches

There are three primary datasets we will use:
- Fitbit data to create aggregate measures of daily or weekly activity
- Comorbidity data to determine which participants suffer from vision impairment or particular ocular diseases like glaucoma
- Survey data on mood symptoms

We intend to perform statistical analysis to determine how physical activity impacts self-reported mood in patients with vision impairment.

Anticipated Findings

Irreversible vision impairment can be a devastating psychological stressor for patients, as it impacts their ability to interact with loved ones and participate in activities they enjoy. Understanding how social/behavioral factors are associated with better mood outcomes can help physicians form stronger therapeutic relationships with their patients.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Medical access and utilization among people with food allergy reactions

Food allergy is a life-threatening health condition, affecting 1 in 13 children and 1 in 10 adults in the United States. People with food allergies rely heavily on a multidisciplinary healthcare team including primary physicians, allergists, nurses, dietitians, psychologists, pharmacists,…

Scientific Questions Being Studied

Food allergy is a life-threatening health condition, affecting 1 in 13 children and 1 in 10 adults in the United States. People with food allergies rely heavily on a multidisciplinary healthcare team including primary physicians, allergists, nurses, dietitians, psychologists, pharmacists, and other health professionals for prevention, management, and treatment of food allergic reactions. Therefore, it is critical for this population to have accessible medical resources. Previous studies have suggested substantial differences in food allergy management across ethnicity and socioeconomic status in the UK and two major cities in the U.S. (i.e., Chicago and Cincinnati). This study aims to investigate the differences in medical access and utilization among people with food allergy reactions based on ethnicity and socioeconomic status using the All of Us dataset.

Project Purpose(s)

  • Disease Focused Research (food allergy)

Scientific Approaches

People with food allergy reactions will first be identified using the following conditions: (1) food anaphylaxis, (2) seafood-induced anaphylaxis, (3) peanut-induced anaphylaxis, (4) anaphylaxis due to fish, (5) anaphylaxis due to shellfish, and (6) anaphylaxis due to ingested food. Next, demographic information, health care access and utilization information, and other allergy-associated health issues such as asthma, eczema, and dermatitis will be extracted. Descriptive and inferential statistics (e.g., Chi-squared test) will be used when appropriate.
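
The Chi-squared test mentioned above can be sketched for a contingency table of group by health care utilization. The counts below are hypothetical and the function names are illustrative; the p-value helper is valid only for one degree of freedom (a 2x2 table), where the chi-squared tail probability reduces to a complementary error function.

```python
import math


def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for a table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_one_dof(stat):
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical counts. Rows: two demographic groups.
# Columns: saw an allergist in the past year, did not.
observed = [[10, 20], [20, 10]]
stat, dof = chi_squared_independence(observed)
p_value = p_value_one_dof(stat)
```

For tables with small expected counts, an exact test is generally preferred over the chi-squared approximation.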

Anticipated Findings

It is anticipated that there are disparities in health care access and utilization among people with food allergies. The findings should provide a snapshot of health inequity among people with food allergies.

Demographic Categories of Interest

  • Access to Care

Data Set Used

Registered Tier

Research Team

Owner:

Quantifying Inequality in Underreported Medical Conditions

Estimating the prevalence of a medical condition is a fundamental problem in healthcare and public health. Accurate estimates of the relative prevalence across groups --- capturing, for example, that a condition affects women more frequently than men --- facilitate effective…

Scientific Questions Being Studied

Estimating the prevalence of a medical condition is a fundamental problem in healthcare and public health. Accurate estimates of the relative prevalence across groups --- capturing, for example, that a condition affects women more frequently than men --- facilitate effective and equitable health policy which prioritizes groups who are disproportionately affected by a condition. However, it is difficult to estimate relative prevalence when a medical condition is underreported. In this project, we developed a method for accurately estimating the relative prevalence of underreported medical conditions by building upon the positive unlabeled learning framework.

In this study, we are investigating an extension of our method to quantifying underdiagnosis due to continuous variables, and are focusing on BMI. This is relevant to public health because such methods can allow us to quantitatively examine the extent to which weight bias affects patient health outcomes.

Project Purpose(s)

  • Population Health
  • Social / Behavioral
  • Methods Development

Scientific Approaches

We plan to use machine learning and statistics in this work. Specifically, we will use patient diagnoses and demographic information to train a machine learning model to estimate both the likelihood of a disease given a set of symptoms, and the likelihood of underdiagnosis given a particular demographic variable.

Anticipated Findings

We anticipate finding that underdiagnosis among higher-weight patients is common (in magnitude) and widespread (in terms of diseases). Our findings would augment the literature in a few ways: 1) we would extend a broad literature on quantifying fairness in healthcare to weight, 2) we would provide the first estimates of the extent to which diseases are underdiagnosed among heavier patients, and 3) we would provide a new method to examine how health disparities arise due to continuous, rather than categorical, variables.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

  • Divya Shanmugam - Graduate Trainee, Massachusetts Institute of Technology

Social determinants of smoking, vaping and related diseases

Background and significance: It is well-established that the promotion and point-of-sale advertising of tobacco products influence the choice of product for people who smoke as well as their transitions in and out of abstinence. For example, individual tobacco use behaviors…

Scientific Questions Being Studied

Background and significance: It is well-established that the promotion and point-of-sale advertising of tobacco products influence the choice of product for people who smoke as well as their transitions in and out of abstinence. For example, individual tobacco use behaviors are correlated with the density of tobacco outlets around their home and/or school.

Geographic variations in tobacco product availability and promotion within each participant’s local area are expected to interact dynamically with tobacco use and tobacco-related disease outcomes.

Specifically, it is hypothesized that greater product availability and promotion within each person’s neighborhood area (i.e., activity space, or address buffer, as available) will increase the likelihood of use and subsequent disease.

Project Purpose(s)

  • Disease Focused Research (Tobacco-related diseases)
  • Population Health
  • Social / Behavioral
  • Methods Development

Scientific Approaches

Individual differences in neighborhood determinants will be characterized using mixed-effect linear regression, accounting for differences due to geography. Covariates include those at the individual level (e.g. age, gender, tobacco-use history) and those describing aspects of the environment. Hierarchical generalized linear mixed models (GLMMs) will explore the impact of the retail environment on use outcomes, controlling for sociodemographic characteristics, individual attitudinal factors, and community-level effects. Multi-level versions of these models will integrate neighborhood level effects, e.g. census tract level SES variables. Retail outlet data – pre-processed and imported by our team – will be located with address-level precision, and outlet density will be computed using adaptive Kernel Density Estimation (aKDE) which produces estimates that are sensitive to local variations in the spatial pattern of outlets.
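
The idea behind adaptive kernel density estimation can be sketched as below. This is one common two-stage variant (a fixed-bandwidth pilot estimate followed by per-point bandwidths that shrink in dense areas and widen in sparse ones); the project description does not say which aKDE formulation is used, and the coordinates, bandwidth, and sensitivity parameter `alpha` are hypothetical.

```python
import math


def gaussian_2d(dx, dy, h):
    """Isotropic 2D Gaussian kernel with bandwidth h."""
    return math.exp(-(dx * dx + dy * dy) / (2 * h * h)) / (2 * math.pi * h * h)


def adaptive_kde(points, h, alpha=0.5):
    """Return a density function and the per-point bandwidths.

    Stage 1: pilot density at each point using the fixed bandwidth h.
    Stage 2: local bandwidth h_i = h * (pilot_i / g) ** (-alpha), where g
    is the geometric mean of the pilot densities.
    """
    n = len(points)
    pilot = [
        sum(gaussian_2d(px - qx, py - qy, h) for qx, qy in points) / n
        for px, py in points
    ]
    g = math.exp(sum(math.log(p) for p in pilot) / n)
    bandwidths = [h * (p / g) ** (-alpha) for p in pilot]

    def density(x, y):
        return sum(
            gaussian_2d(x - qx, y - qy, hi)
            for (qx, qy), hi in zip(points, bandwidths)
        ) / n

    return density, bandwidths


# Hypothetical outlet locations in planar coordinates (e.g. kilometers):
# a tight cluster of four outlets and one isolated outlet.
outlets = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
density, bandwidths = adaptive_kde(outlets, h=0.5)

near_cluster = density(0.05, 0.05)
between = density(2.5, 2.5)
```

The isolated outlet receives a wider kernel than the clustered ones, which is what makes the estimate sensitive to local variations in the spatial pattern.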

Anticipated Findings

With a new generation of noncombustible tobacco products introduced into the marketplace, it is unclear how point-of-sale marketing will evolve as the Tobacco Industry continues to promote smoking and vaping products, particularly to youth and young adults.

This project will provide data that directly address this knowledge gap, evaluating the association between neighborhood factors, tobacco use patterns and tobacco-related disease outcomes.

This project is also expected to seed related research projects with information about the potential to enrich the All of Us dataset with interconnections to multi-level, policy-relevant sources of data about tobacco and other harmful, addictive behaviors in the future.

Demographic Categories of Interest

  • Others

Data Set Used

Controlled Tier

Research Team

Owner:

  • Thomas Kirchner - Early Career Tenure-track Researcher, New York University

Polygenic risk score across diverse ancestries and biobanks

Polygenic risk score (PRS) is an emerging tool to evaluate the risk of complex diseases and traits using aggregated effects from millions of variants. We would like to compare the prediction accuracy of PRS between biobanks (e.g., UK Biobank). We…

Scientific Questions Being Studied

Polygenic risk score (PRS) is an emerging tool to evaluate the risk of complex diseases and traits using aggregated effects from millions of variants. We would like to compare the prediction accuracy of PRS between biobanks (e.g., UK Biobank). We hypothesize that the same ancestries in different biobanks may have different origins and exposure to different environments. We will then evaluate whether the variation might come from differences in the genetic architectures of individuals and investigate the environmental effects on disease risk for individuals. Finally, we would like to harmonize data and propose appropriate approaches and models to improve PRS prediction accuracy.

Project Purpose(s)

  • Population Health
  • Methods Development
  • Control Set
  • Ancestry

Scientific Approaches

We employ different PRS methods, including PT and PRS-CS for a single population and PRS-CSx in the cross-ancestry context. We next compare PRS performance between different biobanks using R2 or Nagelkerke R2. We then estimate the heritability of each trait in each biobank with the LD Score regression approach. Finally, we use GxE interaction analysis to explore the interaction between genetic variants and environment on disease risk.
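
For binary traits, Nagelkerke R2 can be computed from the log-likelihoods of a null model and a model that includes the PRS. The sketch below uses the standard definition; the sample size and log-likelihood values are hypothetical, and fitting the logistic models themselves is outside its scope.

```python
import math


def nagelkerke_r2(ll_null, ll_full, n):
    """Nagelkerke R2 from the null and full model log-likelihoods.

    Cox-Snell R2 = 1 - exp(2 * (ll_null - ll_full) / n), rescaled by its
    maximum attainable value 1 - exp(2 * ll_null / n).
    """
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical example: 100 participants, half of them cases.
n = 100
ll_null = n * math.log(0.5)  # intercept-only model with 50% cases
ll_full = -60.0              # assumed log-likelihood after adding the PRS

r2 = nagelkerke_r2(ll_null, ll_full, n)
```

Because the value depends on the case fraction in each sample, comparisons across biobanks with different case proportions should be made with some caution.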

Anticipated Findings

We expect to observe differences in PRS accuracy across different datasets. We next expect to identify the differences in genetic architecture and environmental effects that contribute to these differences. Finally, we hope to leverage all of these differences to improve prediction accuracy, including both PRS and environmental effects.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Leland Hull - Early Career Tenure-track Researcher, Mass General Brigham
  • Margaret Sunitha Selvaraj - Research Fellow, Broad Institute
  • Satoshi Koyama - Research Fellow, The Broad Institute

AD analysis

Age-associated neurodegenerative diseases (A2ND) contribute to the second leading cause of death worldwide and account for the majority of life-years lost to disability. As the global population ages, the prevalence and societal impact of A2ND will become ever greater with…

Scientific Questions Being Studied

Age-associated neurodegenerative diseases (A2ND) contribute to the second leading cause of death worldwide and account for the majority of life-years lost to disability. As the global population ages, the prevalence and societal impact of A2ND will become ever greater with annual costs in the US expected to exceed 1 trillion dollars by 2050. The management and treatment of A2ND risk factors, such as diabetes and hypertension, might delay or prevent A2ND onset. To this point, an untreated risk factor generates a cascade that with time increases in complexity and ultimately results in A2ND. Our aim is to select risk factor profiles that are targetable with a drug class and assess if a therapeutic or a combination of therapeutics can disrupt the risk cascade modifying A2ND onset. We aim to include in our analyses demographic data (age, sex, ethnicity, genotype, disease stage, socioeconomic status) to stratify efficacy of each therapeutic for AD prevention in a given population.

Project Purpose(s)

  • Disease Focused Research (neurodegenerative disease)

Scientific Approaches

Our first analysis will examine incidence rates of the most common A2ND diagnoses in subjects with long-term use of a drug targeting a specific risk factor, in comparison to subjects with no drug use. We will focus first on preventative drugs within classes such as the HMG-CoA reductase inhibitors, estrogen-modulating therapeutics, psychiatric drugs and therapeutics targeting the inflammatory and metabolic systems. Factors such as race, age, gender, genotype (e.g. APOE for Alzheimer's Disease (AD)) and other co-morbidities will be used to determine populations of subjects in which this preventative effect is most pronounced. Our second analysis will evaluate possible therapeutic effects on A2ND diagnosis. For this, cognitive measures in subjects with long-term use of a therapeutic will be compared to those in subjects with no known use, while accounting for variability in age, gender and race. The drug effect on cognitive scores will be examined in sub-populations defined by their genotype.
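
The comparison of incidence rates between drug users and non-users described above can be illustrated with an incidence rate ratio (IRR). The following is only a minimal sketch, not the project's actual method: the function name, the Wald confidence interval on the log scale, and the counts are all illustrative assumptions, and no adjustment for covariates such as age, sex or genotype is made.

```python
import math


def incidence_rate_ratio(cases_exposed, person_years_exposed,
                         cases_unexposed, person_years_unexposed, z=1.96):
    """Return (IRR, lower, upper) for exposed vs. unexposed subjects.

    Hypothetical helper: uses a Wald interval on the log scale,
    which assumes both case counts are greater than zero.
    """
    rate_exposed = cases_exposed / person_years_exposed
    rate_unexposed = cases_unexposed / person_years_unexposed
    irr = rate_exposed / rate_unexposed
    # Standard error of log(IRR) under a Poisson assumption for the counts
    se_log_irr = math.sqrt(1 / cases_exposed + 1 / cases_unexposed)
    lower = math.exp(math.log(irr) - z * se_log_irr)
    upper = math.exp(math.log(irr) + z * se_log_irr)
    return irr, lower, upper


# Made-up counts for illustration only (not All of Us data):
# 30 diagnoses over 10,000 person-years with long-term drug use,
# 60 diagnoses over 10,000 person-years with no drug use.
irr, ci_lower, ci_upper = incidence_rate_ratio(30, 10_000, 60, 10_000)
print(f"IRR = {irr:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

An IRR below 1 with an interval that excludes 1 would be consistent with a lower diagnosis rate among drug users, although a crude ratio like this cannot separate a drug effect from confounding.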

Anticipated Findings

Previous work from our group has shown benefits of treating AD risk factors with commonly used therapeutics (statins, estrogen replacement, selective estrogen receptor modulators, etc.) to reduce AD onset. Of clinical importance, there may be selective benefits for patients with specific genotypes, along with the ability to advance a precision prevention approach for AD. Going forward, key issues to be explored are sex differences in response to therapy, the genotypes and phenotypes most appropriate for a given therapy, and which specific therapies have the greatest preventative efficacy in the context of other variables such as sex, ApoE genotype, and other comorbidities. Results of our pilot analyses and the work proposed here will contribute to a growing body of evidence indicating a therapeutic benefit of AD risk factor treatment in a responder subset, and thus have the potential to impact the risk and course of Alzheimer's disease. We will extend our analysis to other neurodegenerative diseases.

Demographic Categories of Interest

  • Age
  • Sex at Birth
  • Gender Identity
  • Geography
  • Education Level

Data Set Used

Registered Tier

Research Team

Owner:

  • Francesca Vitali - Early Career Tenure-track Researcher, University of Arizona

mental health during covid

According to a 2021 poll by the National Council for Mental Wellbeing, nearly half of all Black, Hispanic, Asian, Native American and LGBTQ+ individuals say they have personally experienced increased mental health challenges between July 2020 and July 2021. Half…

Scientific Questions Being Studied

According to a 2021 poll by the National Council for Mental Wellbeing, nearly half of all Black, Hispanic, Asian, Native American and LGBTQ+ individuals say they have personally experienced increased mental health challenges between July 2020 and July 2021. Half or more of adults surveyed said they have frequently experienced feeling tired or having less energy (63%); had difficulty sleeping (58%); felt nervous, anxious or on edge (51%); and had trouble relaxing (50%). We propose to use the All of Us data to study whether the COVID-19 pandemic has disproportionately impacted the mental health of underrepresented populations and, if so, to shed light on its impact.

Project Purpose(s)

  • Population Health
  • Social / Behavioral

Scientific Approaches

We propose to use multiple data modalities, including EHR diagnosis codes and survey data, and appropriate statistical tools such as mixed-effects models to analyse correlations.

Anticipated Findings

We hypothesise that underrepresented groups may experience unique hardships and mental health struggles as a result of the COVID-19 pandemic. This study aims to investigate whether there are changes in minority mental health and well-being before and after the start of COVID-19.

Demographic Categories of Interest

  • Race / Ethnicity
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Bo Wang - Research Fellow, Mass General Brigham

Collaborators:

  • Zhaowen Liu - Research Fellow, Mass General Brigham
  • Yi-han Sheu - Research Fellow, Mass General Brigham
  • Jun Qian - Other, All of Us Program Operational Use
  • Young A Lee - Research Fellow, Mass General Brigham
Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.