Research Projects Directory

Information about each research project within the Workbench is available in the Research Projects Directory below. Approved researchers provide their project’s research purpose, description, populations of interest, and more. This information helps All of Us ensure transparency on the type of research being conducted.

At this time, all listed projects are using data in the Registered Tier. The Registered Tier contains individual-level data from electronic health records, survey answers, physical measurements, and Fitbit. These data have been altered to protect participant privacy.

Note: Researcher Workbench users provide information about their research projects independently. Any views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program.

Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

There are currently 422 active workspaces. This information was updated on 3/2/2021.

Mast cell disorders, depression, and inflammation

Project Purpose(s)

  • Disease Focused Research (mast cell disorders and depression)
  • Social / Behavioral ...

Scientific Questions Being Studied

With few exceptions (e.g., Nicoloro, Lobel & Wolfe, 2016), mast cell disorders have received little attention from psychological science. Therefore, reliable estimates of the prevalence of emotional distress in this population are largely nonexistent. Documenting levels, types, and contributors to depression in this population can facilitate the development of appropriate interventions and highlight pathways through which emotional states such as depression may exacerbate mast cell disorders, as hypothesized by some researchers (Theoharides & Konstantinidou, 2007). There is considerable evidence in other populations that negative emotional states can influence physical health through a variety of pathways; a number of these are implicated in mast cell disorders, including the immune, endocrine, cardiovascular, and central nervous systems (e.g., Kiecolt-Glaser, McGuire, Robles, & Glaser, 2002; Herbert & Cohen, 1993; Kiecolt-Glaser, Malarkey, Cacioppo, & Glaser, 1994).

Scientific Approaches

We plan to examine the relationship between depression and markers of inflammation in individuals with mast cell disorders.

Anticipated Findings

The findings can be used to identify individuals at risk, to develop effective interventions, to inform the care of people with mast cell disorders, and to reduce their suffering.

Demographic Categories of Interest

  • Others

Research Team

Owner:

  • Jennifer SantaBarbara - Research Fellow, University of California, Los Angeles

Maternal mortality Patient journey

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

I am interested in the systems factors that cause maternal mortality and morbidity, and in the healthcare disparities that can contribute to high rates of severe maternal morbidity (SMM).

Scientific Approaches

I plan to use information on SMM to understand the causes of healthcare disparity. First, I plan to analyze the data to identify the socioeconomic and sociodemographic factors of people with SMM. Then, if available, I will select records that include interview data from the patients.

Anticipated Findings

We anticipate identifying the systems factors that cause healthcare disparities and trying to address those disparities. Identifying those factors will help increase health equity and improve patient care.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Sreenath Chalil Madathil - Early Career Tenure-track Researcher, University of Texas at El Paso

MDD Assess Real-World Experience

Project Purpose(s)

  • Population Health
  • Social / Behavioral ...

Scientific Questions Being Studied

This initial exploratory data analysis aims to evaluate the real-world data (surveys, and potentially wearable data from Fitbit devices) for participants who have reported having depression in the past.

Scientific Approaches

Our underlying hypothesis is that there may be inter- and intra-individual differences in real-world behavior that are affected by people's lived experience of depression. We will use longitudinal wearable data to look for patterns and deviations related to depression symptoms.

Anticipated Findings

We will assess how real-world behavior, such as step counts and heart rate variability, is connected to mental health, specifically depression and anxiety.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Abhishek Pratap - Senior Researcher, Sage Bionetworks

MDD_test

Project Purpose(s)

  • Population Health ...

Scientific Questions Being Studied

Initial exploratory data analysis to assess the AoU data and identify potential hypotheses that could be explored.

Scientific Approaches

Case-control statistical analysis to identify significant differences across various sub-cohorts in AoU, based on labels derived from survey outcomes.

Anticipated Findings

To help develop a better understanding of the AoU datasets and Workbench, which will be used to inform future research studies.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Abhishek Pratap - Senior Researcher, Sage Bionetworks

Mental Health

Project Purpose(s)

  • Population Health
  • Social / Behavioral ...
  • Educational
  • Methods Development

Scientific Questions Being Studied

Predictors of mental health across racial/ethnic groups, different SES, gender identity, sexual orientation, and geography will be examined using different methodologies including machine learning, mixture modeling, and mixed effect models.

Scientific Approaches

Mental health assessments will be evaluated to examine measurement invariance across racial/ethnic groups, different SES, gender identity, sexual orientation, and geographical locations.

Anticipated Findings

Mental health assessments were shown to be invariant across different samples. The proposed study aims to assess the validity of mental health assessment among historically underrepresented populations.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Elif Dede Yildirim - Early Career Tenure-track Researcher, Auburn University

Mental Health and Substance Use Demo Projects

Project Purpose(s)

  • Disease Focused Research (disease of mental health)
  • Population Health ...

Scientific Questions Being Studied

What are the prevalences of mental health conditions in the AoURP?

Scientific Approaches

Not available.

Anticipated Findings

AoURP data can be used to assess mental health conditions in previously under-represented populations.

Demographic Categories of Interest

  • Sex at Birth
  • Education Level
  • Income Level

Research Team

Owner:

  • Chen Yeh - Project Personnel, Northwestern University

Collaborators:

  • Kai Yin Ho - Project Personnel, Northwestern University
  • Joyce Ho - Mid-career Tenured Researcher, Northwestern University

Mental Health Demonstration Project

Project Purpose(s)

  • Disease Focused Research (generalized anxiety disorder, depressive disorder, bipolar disorder)
  • Other Purpose (“This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use”.) ...

Scientific Questions Being Studied

As a demonstration project, this study aimed to explore the usability of the All of Us dataset and examined the prevalence of mental health conditions in the All of Us Research Program cohort. Specifically, we explored the lifetime prevalence of depressive disorder, bipolar disorder, and generalized anxiety disorder.

Our study looked at prevalence rates for the above conditions in the following ways:
1. Prevalence in EHR data available by various demographic factors
2. Cohort characteristics
3. Congruency for diagnoses in EHR and self-report questionnaire
4. Among individuals who self-report as having been diagnosed with a mental health condition listed above, the percentage of individuals in treatment and associations between treatment and various demographic factors

Scientific Approaches

In this analysis, we calculated the prevalence of mental health conditions by leveraging demographic information, questionnaire responses, and EHR data. Specifically, we utilized the following surveys: Basics, Overall Health, Personal Medical History, and Healthcare Access PPIs. We utilized EHR data by creating a cohort of individuals with specific diagnosis codes in their EHR. We referenced all relevant parent and child SNOMED codes for each mental health condition under investigation (documented in Concept Set). Associations were calculated using the chi-square test.
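
As a hedged illustration of the chi-square association test named above, the sketch below computes the Pearson chi-square statistic for a contingency table in plain Python. The counts, the function name `chi_square_statistic`, and the group-by-treatment framing are hypothetical and are not taken from the project.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are two demographic groups,
# columns are "in treatment" and "not in treatment".
observed = [[30, 70], [50, 50]]
statistic = chi_square_statistic(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {statistic:.3f}, dof = {degrees_of_freedom}")
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to p < 0.05, so these made-up counts would indicate an association between group and treatment status.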

Anticipated Findings

We anticipated that the prevalence rates found in All of Us will be consistent with previous large scale studies, such as the National Comorbidity Survey. We found that the All of Us dataset is sensitive to detecting mood disorders and is usable for examining mental health conditions.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Chen Yeh - Project Personnel, Northwestern University

Collaborators:

  • Kai Yin Ho - Project Personnel, Northwestern University
  • Joyce Ho - Mid-career Tenured Researcher, Northwestern University

Mental_Health_v1

Project Purpose(s)

  • Population Health
  • Social / Behavioral ...
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Questions Being Studied

The project aims to assess the links between contextual risk factors, physical activity, and mental health among fathers with young children.

Scientific Approaches

The datasets and variables will be explored to assess whether contextual risk factors affect mental health equally across racial/ethnic groups and socioeconomic statuses.

Anticipated Findings

Studies have documented robust associations between paternal mental health and the quality of father-child relationships. Further, poor mental health was found to be partially dependent on an individual's sociodemographic characteristics, including unemployment or unstable job conditions, family instability, residential status, marital status, age, education level, living under the federal poverty line, and race/ethnicity. Yet it remains unclear whether race/ethnicity accounts for the links between contextual risk factors and mental health outcomes in high-risk communities.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Elif Dede Yildirim - Early Career Tenure-track Researcher, Auburn University

miRNA research

Project Purpose(s)

  • Population Health
  • Ancestry ...

Scientific Questions Being Studied

Understanding genetic variation in the human genome. Genetic variation is what makes us all unique, and at the same time, it is the biological basis of many disorders and disease disparities. The continued efforts to sequence individuals have revealed an unprecedented number of genetic variations.
Most of these variants will have a null or negligible effect. However, hidden among them will be those with important physiological and medical consequences. My objective is to combine bioinformatic approaches with high-throughput methods to interrogate the mechanistic impact of population variants affecting miRNA function. This will give the field a new perspective for their interpretation.

Scientific Approaches

Identify genetic variants with a potential impact on RNA structure, base-pairing, or thermodynamics.
Identify the tissues in which the miRNA is expressed and, from that, propose hypothesis-driven diseases that might be of relevance.
Study a cohort of individuals with and without the genetic variants.

Anticipated Findings

Create a framework to understand the role of genetic variants on non-coding RNAs.
Establish association between variants in microRNAs and diseases.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Xavier Bofill De Ros - Research Fellow, NIH

Miscellaneous

Project Purpose(s)

  • Other Purpose (Trouble-shooting, thanks for your help, Francis.) ...

Scientific Questions Being Studied

Trouble-shooting, thanks for your help, Francis.

Scientific Approaches

No approach is necessary for this workspace because it is for operational use only.

Anticipated Findings

Trouble-shooting, thanks for your help, Francis.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Guohai Zhou - Early Career Tenure-track Researcher, Massachusetts General Hospital

Collaborators:

  • Robert Carroll - Other, All of Us Program Operational Use
  • Francis Ratsimbazafy - Other, All of Us Program Operational Use

MyFirstWorkspace

Project Purpose(s)

  • Disease Focused Research (COVID-19)
  • Population Health ...
  • Social / Behavioral
  • Drug Development
  • Methods Development
  • Control Set
  • Ancestry
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Questions Being Studied

I am specifically studying how well outcomes can be predicted from electronic health records for COVID-19 patients.

Scientific Approaches

I am more focused on causal inference, but at some point I will use deep learning as well. I will use the EHR dataset for this work, along with the deep learning library Keras.

Anticipated Findings

I aim to find an efficient deep learning framework that can be used to predict potential outcomes of COVID-19 patients when they are admitted. I aim to develop a scientific tool that doctors can use to determine the next step for COVID-19-positive patients, such as intubation or oxygenation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Mohammad Arif Ul Alam - Other, University of Maryland, Baltimore County

N3C Comparison

Project Purpose(s)

  • Disease Focused Research (COVID-19) ...

Scientific Questions Being Studied

We plan to provide phecode counts and percentages across COVID-19 diagnosis, both of enrolled participants and EHR observations for AoU and N3C. We plan to stratify by demographics of interest and age and compare measures of relative risk and odds ratios between AoU and N3C COVID-19 positive and negative participants. This analysis is at a high level, but provides some valuable measures to compare common diagnoses and comorbidities associated with COVID-19. It also provides a good measure for differences in conditions between AoU and N3C participants.
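
The relative risk and odds ratio measures mentioned above can be sketched from a 2x2 table as follows. This is a minimal illustration under stated assumptions: the counts are invented, and the function names and the layout (COVID-19 positive versus negative, diagnosis present versus absent) are hypothetical rather than the project's actual code.

```python
def relative_risk(a, b, c, d):
    """Risk of the diagnosis in the exposed group divided by risk in the unexposed group.

    a, b: exposed participants with / without the diagnosis
    c, d: unexposed participants with / without the diagnosis
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of the diagnosis in the exposed group divided by odds in the unexposed group."""
    return (a * d) / (b * c)


# Hypothetical counts for one phecode:
# COVID-19 positive: 20 with the diagnosis, 80 without
# COVID-19 negative: 10 with the diagnosis, 90 without
rr = relative_risk(20, 80, 10, 90)
oratio = odds_ratio(20, 80, 10, 90)
print(f"relative risk = {rr:.2f}, odds ratio = {oratio:.2f}")
```

Computing both measures per phecode in each dataset would give directly comparable columns for AoU and N3C, assuming the same cell definitions are used in both.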

Scientific Approaches

We plan to use N3C OMOP-compliant datasets to compare diagnostic-level counts with AoU. We will mainly work with the SQL, R, and Python programming languages to merge this information and conduct the broad analysis.

Anticipated Findings

We anticipate that some common respiratory, disease, and heart complications will be more common in the COVID-19 positive patients from N3C when compared to phenotypes from our AoU cohort. We do not have great expectations when comparing the patients not exhibiting COVID-19 in N3C with the AoU population.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Kyle Webb - Project Personnel, NIH

NAFLD

Project Purpose(s)

  • Disease Focused Research (fatty liver disease) ...

Scientific Questions Being Studied

Our primary research objective is to evaluate whether standard diagnostic tools for NAFLD, such as ALT and TG, are strong predictors of disease across various races and ethnicities. We will further evaluate whether a recent prediction model, developed in the IMI Direct cohorts using individuals classified as being of white European ancestry, applies to other racial and ethnic groups. See the initial paper here: 10.1371/journal.pmed.1003149

Scientific Approaches

We will use two strategies for selecting the clinical variables. For models 1–3, we will select variables based on clinical accessibility and their established association with fatty liver in the existing literature, without applying statistical procedures for data reduction. For model 4, a pairwise Pearson correlation matrix will be used for feature selection of the clinical variables by applying a pairwise correlation threshold of r > 0.8, and we will then select the collinear variables. Feature selection will be undertaken in the combined cohort (diabetes and non-diabetes) in order to maximize sample size and statistical power.
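
The model 4 step can be sketched as below: compute Pearson r for every pair of clinical variables and flag pairs above the 0.8 threshold. This is an illustration only. The variable names and values are made up, the helper names `pearson` and `collinear_pairs` are hypothetical, and how flagged pairs are handled afterwards is left open because the description does not specify it.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair of columns with r > threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if r > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical clinical variables for five participants.
columns = {
    "alt": [20.0, 35.0, 50.0, 65.0, 80.0],
    "ast": [18.0, 30.0, 47.0, 60.0, 78.0],
    "age": [60.0, 30.0, 55.0, 25.0, 45.0],
}
pairs = collinear_pairs(columns)
for name_a, name_b, r in pairs:
    print(f"{name_a} ~ {name_b}: r = {r:.3f}")
```

The sketch follows the stated rule literally (r > 0.8), so strongly negative correlations are not flagged; comparing the absolute value of r instead would catch those as well.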

Anticipated Findings

We anticipate identifying differences in the utility of diagnostic measures for predicting NAFLD.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Income Level

Research Team

Owner:

  • Ferris Ramadan - Project Personnel, University of Arizona

NIMIWAE

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

How can we use deep learning techniques to handle EHR data with missingness that is non-ignorable (or MNAR)?
Does a method that correctly accounts for MNAR missingness improve the imputation of missing data? Does this improvement translate to better performance in downstream tasks such as prediction of mortality or disease outcome? Would such a method aid clinicians in risk assessment and guide decisions about early intervention?

Scientific Approaches

We plan to look at real life EHR datasets, and either simulate missingness on fully-observed EHR data, or attempt to validate our method via prediction of some outcome of interest using the dataset with inherent missingness. We will use the imputed dataset to perform this learning task, in an indirect way to validate the quality of the imputation.
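
One way to simulate non-ignorable missingness on fully observed data, as described above, is to make the chance of masking a value depend on the value itself. The sketch below is a minimal example under stated assumptions: the function `mask_mnar`, the cutoff rule, the probabilities, and the lab values are all hypothetical and are not the project's method.

```python
import random


def mask_mnar(values, cutoff, p_high=0.8, p_low=0.1, seed=0):
    """Mask values with a probability that depends on the value itself (MNAR).

    Values above `cutoff` are set to None with probability `p_high`;
    all other values are set to None with probability `p_low`.
    """
    rng = random.Random(seed)
    masked = []
    for value in values:
        p_missing = p_high if value > cutoff else p_low
        masked.append(None if rng.random() < p_missing else value)
    return masked


# Hypothetical fully observed lab values.
lab_values = [4.1, 9.8, 5.0, 12.3, 3.7, 10.5, 4.9, 11.0]
masked_values = mask_mnar(lab_values, cutoff=9.0)
observed_count = sum(value is not None for value in masked_values)
print(masked_values, observed_count)
```

Because the original values are retained, imputations produced on `masked_values` can be scored directly against `lab_values`, which is the kind of validation on simulated missingness that the approach describes.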

Anticipated Findings

We anticipate that properly accounting for the MNAR nature of EHR data will increase performance of the imputation of missing data, thereby increasing performance of prediction of disease or mortality. We believe that this would help guide treatment decisions, and help with risk assessment for patients.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • David Lim - Graduate Trainee, University of North Carolina, Chapel Hill

NIV Failure Characterization

Project Purpose(s)

  • Disease Focused Research (respiratory failure) ...

Scientific Questions Being Studied

We are attempting to identify the failure mechanisms of noninvasive ventilation therapy in patients with acute respiratory failure.

Scientific Approaches

We plan to validate an existing phenotyping method for ventilation therapy patients and generate summary statistics between the different cohorts to characterize any major differences.

Anticipated Findings

We anticipate that our approach will validate the previously created phenotyping algorithm and identify major differences between patients that successfully undergo noninvasive ventilation therapy and those that fail noninvasive therapy and require subsequent endotracheal intubation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Patrick Essay - Graduate Trainee, University of Arizona

Collaborators:

  • Vignesh Subbian - Early Career Tenure-track Researcher, University of Arizona

NOVA

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

Historically, dietary quality has been assessed via singular nutrient categories. Recently, more attention has turned to ultra-processed food (UPF) from the NOVA groups as a major factor in multiple noncommunicable diseases. The many facets include calorie-dense and inexpensive convenience items.
However, limited centralized data are available on the impacts and interactions of industrial ingredients on human health. From maternal nutrition throughout the life cycle, this is a wildly under-researched dietary quality lens. This may be due to limited consensus and a lack of collaboration or awareness in non-nutrition fields. Using code recently converted from Stata to Python, I would like to investigate whether any statistically significant patterns are observed when NOVA is applied to All of Us data sets.

Scientific Approaches

Datasets include various populations, descriptive statistics and dietary assessment data. I developed NOVA coding with major input from the creators of NOVA for interoperability with Python.

Anticipated Findings

Examining populations by race, ethnicity, age, geographical location, and other comorbidities will reveal phenotypic differences between high UPF consumers and low UPF consumers. High and low categories are to be defined per cohort. Results from the exploratory analyses may inform future precision nutrition interventions and policy related to food industry marketing to children.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Kathryn Whyte - Research Fellow, Columbia University

Nutrition for Precision Health v1

Project Purpose(s)

  • Ancestry ...

Scientific Questions Being Studied

We plan to study individual differences (genetics, epigenetics, microbiome composition) that influence dietary responses in terms of metabolic and signaling pathways. Our goal is to predict biological responses to nutrition and develop dietary interventions to prevent or reverse diseases.

Scientific Approaches

We plan to develop computational tools that integrate and harmonize multi-modal and multi-dimensional nutrition-related omics data. We plan to identify genetic, epigenetic, and microbiome signatures with unique biological functions. We plan to incorporate known functional annotations to understand underlying biological mechanisms and responses to nutrition. Using AI and machine learning algorithms, we plan to build an all-inclusive model that provides dietary interventions and recommendations based on each patient's multi-omic signature.

Anticipated Findings

The anticipated findings from the study include identification of the underlying mechanisms linking nutrition and health. Our findings will contribute to the understanding of individual health trajectories and aid in the development of dietary interventions that help individuals prevent disease or transition from a disease state back to a normal one.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Jacqueline Chyr - Research Fellow, University of Texas Health Science Center, Houston

NW_AOU_Informatics

Project Purpose(s)

  • Population Health ...

Scientific Questions Being Studied

We are informatics researchers using AOU data to study research data sets. We want to understand the kinds of data enclosed and the utility for data reuse research.

Scientific Approaches

Data mining, mostly.

Anticipated Findings

We will find the data reuse potential of AOU.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Nick Williams - Research Associate, NIH

Obesity analysis

Project Purpose(s)

  • Disease Focused Research (obesity) ...

Scientific Questions Being Studied

state level obesity

Scientific Approaches

Not available.

Anticipated Findings

state level obesity disparities after adjusting for socioeconomic factors

Demographic Categories of Interest

  • Sex at Birth
  • Education Level
  • Income Level

Research Team

Owner:

  • Guohai Zhou - Early Career Tenure-track Researcher, Massachusetts General Hospital

Collaborators:

  • Francis Ratsimbazafy - Other, All of Us Program Operational Use
  • Paulette Chandler - Early Career Tenure-track Researcher, Massachusetts General Hospital
  • Andrea Ramirez - Other, All of Us Program Operational Use
  • Elizabeth Karlson - Late Career Tenured Researcher, Massachusetts General Hospital
  • Cheryl Clark

obesity_mansucript_rerun

Project Purpose(s)

  • Disease Focused Research (obesity)
  • Educational ...
  • Methods Development
  • Other Purpose (This work is the result of an All of Us Research Program Demonstration Project. Demonstration Projects are efforts by the All of Us Research Program designed to meet the goal of ensuring the quality and utility of the Research Hub as a resource for accelerating precision medicine. This work has been approved, reviewed, and overseen by the All of Us Research Program Science Committee and Data and Research Center to ensure compliance with program policy.)

Scientific Questions Being Studied

National obesity prevention and intervention strategies may benefit from precision medicine approaches that incorporate integrated data on environments, social determinants of health, and genomic factors. We examined the quality and utility of the All of Us Research Hub Workbench for accelerating precision medicine by replicating methods from existing studies that examine the prevalence of obesity at the population level. We evaluated the measurements of obesity in the participant measurement (PM) data set and the electronic health record (EHR) data set using methods similar to the Ward et al. NEJM December 2019 publication that assessed prevalence of obesity in the US by state using BRFSS data.

Scientific Approaches

For this population-based cross-sectional study of All of Us Research Workbench participants, we excluded individuals with measurements obtained during pregnancy or inpatient visits and individuals from states with fewer than 100 participants. Physical measurements (PM) of height and weight at the time of program enrollment of 142,116 participants and measured weight and height extracted from electronic health records (EHR) of 40,885 individuals were used to calculate body-mass index (BMI). We did a complete case analysis for All of Us participants with known sex (male or female), race, income and education levels and estimated state-specific and demographic subgroup-specific prevalence of categories of BMI [obesity (BMI ≥30) and extreme obesity (BMI ≥ 35)] nationwide and for each state: overall and by subgroups, male and female. We examined the difference between EHR and PM calculated BMI by state.
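
The core calculation described above, BMI from weight and height followed by prevalence at the two cutoffs (BMI ≥30 and BMI ≥35), can be sketched as follows. The participant values are invented, the function names are hypothetical, and the normal-approximation (Wald) interval is an assumption, since the description does not state how the confidence intervals were computed.

```python
from math import sqrt


def bmi(weight_kg, height_m):
    """Body-mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def prevalence_with_ci(flags, z=1.96):
    """Proportion of True flags with a normal-approximation (Wald) 95% interval."""
    n = len(flags)
    p = sum(flags) / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical (weight in kg, height in m) measurements.
participants = [(95.0, 1.70), (70.0, 1.75), (110.0, 1.65), (60.0, 1.60)]
bmis = [bmi(weight, height) for weight, height in participants]

obese = [value >= 30 for value in bmis]    # obesity cutoff from the text
extreme = [value >= 35 for value in bmis]  # extreme obesity cutoff from the text

obesity_prevalence, obesity_low, obesity_high = prevalence_with_ci(obese)
extreme_prevalence, extreme_low, extreme_high = prevalence_with_ci(extreme)
print(f"obesity: {obesity_prevalence:.2f} ({obesity_low:.2f} to {obesity_high:.2f})")
print(f"extreme obesity: {extreme_prevalence:.2f} ({extreme_low:.2f} to {extreme_high:.2f})")
```

Running the same functions on participants grouped by state or demographic subgroup would give the stratified prevalence estimates; with only four invented records the intervals here are very wide and are shown purely for illustration.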

Anticipated Findings

Using states with at least 100 participants, PM data included 142,116 individuals (mean [SD] age, 51.2 [16.6]) and EHR data on height and weight included 40,885 individuals (mean [SD] age, 52.5 [16.5]). The median BMI for PM participants was 28.4 [24.4 to 33.7]; the median BMI for EHR was 29.0 [24.8 to 34.5]. The PM national prevalence for obesity (includes BMI >30 and BMI >35) and extreme obesity (BMI >35) were 41.2% (95% Confidence Interval [CI], 40.9 to 41.4) and 20.8% (95% CI, 20.6 to 21.0), respectively, with large variations across states. Women had higher prevalence of extreme obesity than men in all selected states. Subgroups with extreme obesity (BMI >35) prevalence greater than 25% included (subgroup, N, prevalence %, 95% CI): Black NH, 8,913, 28.9 (25.8 to 32.0); individuals with income less than $25,000, 13,244, 25.1 (22.1 to 28.1); education of high school to some college, 17,272, 26.1 (23.1 to 29.1); and the region of the South, 6,639, 25.3 (22.3 to 28.3).

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Education Level
  • Income Level

Research Team

Owner:

  • Guohai Zhou - Early Career Tenure-track Researcher, Massachusetts General Hospital

Collaborators:

  • Paulette Chandler - Early Career Tenure-track Researcher, Massachusetts General Hospital
  • Elizabeth Karlson - Late Career Tenured Researcher, Massachusetts General Hospital
  • Karthik Natarajan - Other, All of Us Program Operational Use

obesitypaper02202020

Project Purpose(s)

  • Disease Focused Research (obesity) ...

Scientific Questions Being Studied

how does bmi differ by state, ses, race/ethnicity

Scientific Approaches

Not available.

Anticipated Findings

obesity will vary by region

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Paulette Chandler - Early Career Tenure-track Researcher, Massachusetts General Hospital

OGTT and cortisol

Project Purpose(s)

  • Disease Focused Research (type 2 diabetes mellitus) ...

Scientific Questions Being Studied

Is the way cortisol concentration changes throughout the day associated with how fast glucose is cleared from the blood stream? This question is important because the result could inform the creation of a new risk factor for type 2 diabetes, or a new way to predict the development of type 2 diabetes or severity of insulin resistance. This information could also inform future treatment development for pre-diabetics and diabetics.

Scientific Approaches

Data will include all participants who have both a cortisol measurement and an oral glucose tolerance test. A Cox proportional hazards model and linear and logistic regression will be used to find associations, calculate odds ratios, and calculate relative risk. Sex, age, diabetes diagnosis, body mass index, race/ethnicity, and medication use will be adjusted for in the models.

Anticipated Findings

We anticipate that people with higher morning cortisol will have poorer oral glucose tolerance test results (it will take them longer to clear the glucose from their blood stream). No one has tested this hypothesis yet and published the results. These results could inform the creation of new risk factors, mechanistic knowledge, and treatments for type 2 diabetes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Amaris Williams - Research Fellow, Ohio State University

Collaborators:

  • Bjorn Kluwe - Project Personnel, Ohio State University

Old Duplicate of Systemic Disease and Glaucoma

Project Purpose(s)

  • Disease Focused Research (primary open angle glaucoma)
  • Other Purpose (This work is the result of an All of Us Research Program Demonstration Project. Demonstration Projects are efforts by the All of Us Research Program designed to meet the goal of ensuring the quality and utility of the Research Hub as a resource for accelerating precision medicine. This work has been approved, reviewed, and overseen by the All of Us Research Program Science Committee and Data and Research Center to ensure compliance with program policy. ) ...

Scientific Questions Being Studied

We have previously published a predictive model of glaucoma progression using electronic health record (EHR) data pertaining to systemic attributes from a single institution. We aim to use the All of Us dataset to 1) serve as external validation for this single-center model and 2) to train new models focused on predicting glaucoma progression using systemic predictors. This is important to understand whether the original findings are generalizable and provide additional knowledge about the utility of systemic predictors on a national-level dataset.

Scientific Approaches

We will develop predictive models on the All of Us dataset using multivariable logistic regression, random forests, and artificial neural networks.

Anticipated Findings

We anticipate that the All of Us data will validate the findings from the model, which demonstrated that blood pressure-related metrics and certain medication classes had predictive value for glaucoma progression. In addition, we anticipate that the models trained with All of Us data will outperform the model trained with single institution data due to larger sample size and greater diversity.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Sally Baxter - Research Fellow, University of California, San Diego

Collaborators:

  • Tsung-Ting Kuo - Early Career Tenure-track Researcher, University of California, San Diego
  • Roxana Loperena Cortes - Other, All of Us Program Operational Use
  • Paulina Paul - Project Personnel, University of California, San Diego
  • Lucila Ohno-Machado
  • Luca Bonomi - Research Fellow, University of California, San Diego
  • Katherine Kim - Early Career Tenure-track Researcher, University of California, Davis
  • Jihoon Kim - Project Personnel, University of California, San Diego
  • Bharanidharan Radha Saseendrakumar - Project Personnel, University of California, San Diego

Older Adults

Project Purpose(s)

  • Population Health
  • Methods Development ...

Scientific Questions Being Studied

Exploration of traditional regression models vs. machine learning methods for large population health studies of older adults (65+).

Scientific Approaches

Application and comparison of traditional/contemporary regression models (e.g. generalized linear models) and "machine learning" methods (e.g. gradient boosting, random forest, LASSO) in supervised learning applications.

Anticipated Findings

There is an active debate in the literature as to whether (potentially) increased predictive ability of machine learning techniques is worth the additional, potential black-box complexity vs. traditional regression models. A particular thread of this discussion examines the potential for unintended bias in machine learning methods.

Demographic Categories of Interest

  • Age

Research Team

Owner:

  • John Boscardin - Other, University of California, San Francisco

OMOP

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

I am interested in studying clinical care across populations. How well can we predict adverse outcomes in large-scale datasets? How does this differ across populations and across diseases?

Scientific Approaches

We will use supervised learning methods to generate time series data to predict adverse outcomes. We will use standard baseline models like logistic regression and support vector machines and then more state-of-the-art methods including convolutional neural networks.

Anticipated Findings

I hope to determine how accurately machine learning models can predict adverse outcomes across different populations.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Irene Chen - Graduate Trainee, Massachusetts Institute of Technology

One

Project Purpose(s)

  • Ethical, Legal, and Social Implications (ELSI) ...

Scientific Questions Being Studied

I am interested in exploring the race and ethnicity distribution of AoU participants. It is important that AoU participants represent the diversity of the United States.

Scientific Approaches

We will use basic statistical methods to compare race/ethnicity and socioeconomic status of AoU participants of different US regions.
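One basic method that fits the comparison described above is a chi-square test of independence on a table of participant counts by region and race/ethnicity. The sketch below is only an illustration under stated assumptions: the function name and the counts are hypothetical, and the source does not say which test the project uses. It computes the test statistic and degrees of freedom with the standard library; a p-value would still need a chi-square distribution table or a statistics package.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts.

    Assumes every row and column total is positive, so that every
    expected count is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if region and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are regions, columns are race/ethnicity categories.
counts = [
    [10, 20],
    [20, 10],
]
stat, dof = chi_square_statistic(counts)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

A larger statistic for the same degrees of freedom indicates a stronger departure from identical distributions across regions.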

Anticipated Findings

By understanding trends in research participation, we can develop ways to promote a more equitable distribution of the risks and benefits of health research.

Demographic Categories of Interest

  • Race / Ethnicity
  • Income Level

Research Team

Owner:

  • Susan Passmore - Other, University of Wisconsin, Madison

One

Project Purpose(s)

  • Ethical, Legal, and Social Implications (ELSI) ...

Scientific Questions Being Studied

I am interested in exploring the race and ethnicity distribution of AoU participants. It is important that AoU participants represent the diversity of the United States.

Scientific Approaches

We will use basic statistical methods to compare race/ethnicity and socioeconomic status of AoU participants of different US regions.

Anticipated Findings

By understanding trends in research participation, we can develop ways to promote a more equitable distribution of the risks and benefits of health research.

Demographic Categories of Interest

  • Race / Ethnicity
  • Income Level

Research Team

Owner:

  • Susan Passmore - Other, University of Wisconsin, Madison

ophthalmology epidemiology

Project Purpose(s)

  • Disease Focused Research (eye diseases)
  • Population Health ...

Scientific Questions Being Studied

We would like to evaluate the epidemiology, treatments, and health outcomes of eye diseases using the diverse population in the All of Us project. Over 12 million people in the United States over the age of 40 have visual impairment, and over 3 million have visual impairment despite glasses, contacts, or other treatments. Visual impairment has severe impacts on patients' quality of life and mortality. There are many common causes of visual impairment, including some that are reversible (such as cataract) and others that are treatable but can still cause irreversible vision loss (macular degeneration, glaucoma, diabetic retinopathy). Some of these diseases disproportionately impact minority populations (e.g. glaucoma in African Americans and Hispanics).
We hope to broadly characterize the prevalence of eye diseases in this cohort, as well as associated medical and surgical treatments. We hope to be able to investigate risk factors, patterns and outcomes of treatment of different eye diseases.

Scientific Approaches

We plan to primarily use the EHR, survey, and physical measurements datasets to describe the epidemiology of eye diseases, using encounter-level billing codes to determine their presence or absence. We plan to investigate risk factors for these eye diseases, including demographics, medications, physical measurements (to the extent available), survey data, and other associated diagnoses. We will begin with simple descriptive statistics. For diagnoses with sufficiently large cohorts, we will also build logistic regression models to evaluate risk factors for diagnosis.
We will also evaluate treatment patterns (medical and surgical) for different eye diseases, using EHR data on medications and surgeries undergone. We will characterize demographic patterns in the use of medications and surgeries.

Anticipated Findings

We anticipate that our findings will contribute broadly to the knowledge of epidemiology of eye diseases in the US, as well as improve our understanding of patterns of treatments and outcomes of eye diseases in the US. In this diverse population, we will also be able to see if there are disparities in eye diseases and their treatment patterns and outcomes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Sophia Wang - Early Career Tenure-track Researcher, Stanford University

Collaborators:

  • Wendeng Hu - Project Personnel, Stanford University

Opioid Use

Project Purpose(s)

  • Disease Focused Research (Back Pain and Opioid Use) ...

Scientific Questions Being Studied

We will look at Opioid Use and Back Pain.

We will look to see if opioid use can be reduced with specific interventions, and who is most at risk for being prescribed opioids.

Scientific Approaches

We will use the available cohorts within the All of Us study group to analyze the risk factors for back pain and opioid use.

Anticipated Findings

We hope to find groups at risk for opioid use or back pain and intervene earlier.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Phillip Cezayirli - Research Fellow, Albert Einstein College of Medicine

Opioid Use in Cancer Patients

Project Purpose(s)

  • Disease Focused Research (cancer)
  • Population Health ...

Scientific Questions Being Studied

Does the amount of opioid use among patients with cancer differ by ethnicity and environmental setting?

Scientific Approaches

Not available.

Anticipated Findings

We anticipate a racial disparity in opioid use, as evidenced by how much opioids are prescribed.

Demographic Categories of Interest

  • Sex at Birth
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Toluwalase (Lasė) Ajayi - Early Career Tenure-track Researcher, Scripps Research

Collaborators:

  • Francis Ratsimbazafy - Other, All of Us Program Operational Use

Orgs_NCD

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

A noncommunicable disease (NCD) is traditionally thought of as a disease that is not spread from human to human, such as heart disease or cancer. Over the past few decades, however, researchers have found pathogenic organisms (POs) that either increase risk for or directly cause what would be traditionally thought of as an NCD, such as certain human papillomavirus strains causing cervical cancer or Epstein-Barr virus increasing the risk of developing multiple sclerosis. In this study we are exploring whether there are still unknown associations between POs and human disease. Previously, utilizing a different cohort, we identified many new associations. However, we would like to verify these results by seeing which associations replicate on the All of Us data. Finding these relationships between POs and human disease would highlight the importance of vaccination, as preventing the infection could also protect or reduce a person’s risk of developing other diseases later in life.

Scientific Approaches

We plan to test models developed using a different cohort on the All of Us data.
Datasets:
- Clinical diagnoses for many different diseases
- Laboratory measurements of antibody titers for different pathogenic organisms
- Sociodemographic data to adjust for possible confounding.

Research Methods:
We will be using statistical analysis to look for associations between pathogenic organisms and disease diagnoses. Our previously built model uses a logistic regression to calculate this association.
Tools:
For the most part we will be using custom R and Python code.
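As a rough illustration of the association measure described above: for a single binary exposure with no covariates, the odds ratio from a 2x2 table equals the exponentiated coefficient of a logistic regression. The sketch below computes that unadjusted odds ratio with a Wald (Woolf) 95% confidence interval. The function name and the counts are hypothetical, and this is not the project's model, which adjusts for sociodemographic confounders.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    "Exposed" here stands for seropositive for a pathogenic organism
    (an assumption for illustration). All four counts must be non-zero;
    a zero cell raises ZeroDivisionError.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases
    )
    # Standard error of log(OR) by Woolf's method.
    se_log_or = math.sqrt(
        1 / exposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_cases
        + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 of 100 seropositive people have the diagnosis,
# compared with 10 of 100 seronegative people.
estimate, low, high = odds_ratio_with_ci(30, 70, 10, 90)
print(f"OR = {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

A confidence interval that excludes 1 suggests an association in the unadjusted data; adjusting for confounders requires fitting the full regression model.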

Anticipated Findings

If we can replicate our results found in the previously analyzed cohort, we will have identified new links between certain pathogenic organisms and human diseases. Previously, similar results describing how certain human papillomavirus strains cause most cervical cancers helped spur the development of a vaccine against those strains, and, after widescale rollout of that vaccine, we are starting to see significant drops in the rate of cervical precancers. We would hope our results could encourage similar developments.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Mike Lape - Research Assistant, Cincinnati Children's Hospital Medical Center

Orgs_NCD_v4

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

A noncommunicable disease (NCD) is traditionally thought of as a disease that is not spread from human to human, such as heart disease or cancer. Over the past few decades, however, researchers have found pathogenic organisms (POs) that either increase risk for or directly cause what would be traditionally thought of as an NCD, such as certain human papillomavirus strains causing cervical cancer or Epstein-Barr virus increasing the risk of developing multiple sclerosis. In this study we are exploring whether there are still unknown associations between POs and human disease. Previously, utilizing a different cohort, we identified many new associations. However, we would like to verify these results by seeing which associations replicate on the All of Us data. Finding these relationships between POs and human disease would highlight the importance of vaccination, as preventing the infection could also protect or reduce a person’s risk of developing other diseases later in life.

Scientific Approaches

We plan to test models developed using a different cohort on the All of Us data.
Datasets:
- Clinical diagnoses for many different diseases
- Laboratory measurements of antibody titers for different pathogenic organisms
- Sociodemographic data to adjust for possible confounding.

Research Methods:
We will be using statistical analysis to look for associations between pathogenic organisms and disease diagnoses. Our previously built model uses a logistic regression to calculate this association.
Tools:
For the most part we will be using custom R and Python code.

Anticipated Findings

If we can replicate our results found in the previously analyzed cohort, we will have identified new links between certain pathogenic organisms and human diseases. Previously, similar results describing how certain human papillomavirus strains cause most cervical cancers helped spur the development of a vaccine against those strains, and, after widescale rollout of that vaccine, we are starting to see significant drops in the rate of cervical precancers. We would hope our results could encourage similar developments.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Mike Lape - Research Assistant, Cincinnati Children's Hospital Medical Center

Original ARI Workspace

Project Purpose(s)

  • Disease Focused Research (Autoimmune diseases) ...

Scientific Questions Being Studied

The goal of our research is to determine the prevalence of autoimmune diseases, individually and as a class of disease, in the US. This work will help us understand the likelihood of having an autoimmune disease, and we hope it will improve the ability of doctors to diagnose patients, as it will establish the prior probability of having one of these many diseases.

Scientific Approaches

We will create three data sets for analysis:

1. A list of diseases rated in the following ways:

   a. Evidence Class
      i. Strong evidence it is autoimmune
      ii. Moderate evidence it is autoimmune
      iii. Weak evidence for autoimmunity
      iv. A comorbidity of autoimmune disease
      v. Symptom or symptom set with no known mechanism

   b. Autoinflammatory versus autoimmune flag

   c. “Not always autoimmune” flag – to indicate diseases that could have alternative mechanisms of cause

2. A list of patients, anonymized, with socioeconomic, geographic and other data that would be of interest to patients and public health officials to understand which communities are affected by these diseases

3. Outcomes data for patients over time assessing quality of life using PROMIS metrics

Anticipated Findings

The current NIH estimate of 23.5 million people with autoimmune disease was a guess by a knowledgeable clinician, but has no scientific support. As a consequence, there are numerous figures in the public sphere and nobody knows which one is correct.

Many reports say autoimmune diseases are on the increase, but since the number is unknown, it is impossible to say whether this is a public health issue or not. Having a methodology that can be used to recompute the number of people with autoimmune disease will help us understand if these reports are true.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Aaron Abend - Senior Researcher, Autoimmune Registry

Collaborators:

  • Priya Padathula - Project Personnel, Autoimmune Registry
  • Jeffrey Green - Project Personnel, Autoimmune Registry
  • Darrison Haftarczyk - Research Assistant, Autoimmune Registry

OSA and comorbidities

Project Purpose(s)

  • Disease Focused Research (obstructive sleep apnea) ...

Scientific Questions Being Studied

We are interested in the conditions that accompany obstructive sleep apnea (OSA). There are conditions that are risk factors for OSA, so we expect to see those prior to a diagnosis of the sleep problem. There are also many comorbidities of OSA, such as cardiovascular disease, mental health problems, and diabetes. We believe that there may be mechanisms underlying OSA and these comorbidities that are present before diagnosis. We therefore want to look at when co-occurring conditions occur before and after a diagnosis of OSA. We also wish to assess those conditions in people without OSA as a comparison.

Scientific Approaches

We will look at conditions relative to the date of diagnosis of OSA. We will assess the timeline of occurrence of events (diagnosis, survey, vital) relative to OSA, and assess the conditions that occur early, simultaneously, or late. We will look at the conditions commonly associated with OSA (CV disease, mental health, etc.) in people without indication of a sleep disorder, and see whether the timeline and severity differ from those in the OSA group.
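The timing comparison described above can be sketched as a small classification step: each condition date is compared with the OSA diagnosis date and labeled early, simultaneous, or late. This is a minimal sketch under assumptions. The function name, the 30-day window that counts as "simultaneous", and the example records are hypothetical and do not come from the project description.

```python
from datetime import date


def classify_timing(event_date, index_date, window_days=30):
    """Label an event relative to the index (OSA diagnosis) date.

    Events within +/- window_days of the index date are treated as
    simultaneous; the 30-day default is an illustrative assumption.
    """
    offset_days = (event_date - index_date).days
    if offset_days < -window_days:
        return "early"
    if offset_days > window_days:
        return "late"
    return "simultaneous"


# Hypothetical records for one participant.
osa_diagnosis = date(2018, 6, 1)
conditions = {
    "hypertension": date(2016, 3, 15),
    "depression": date(2018, 6, 20),
    "type 2 diabetes": date(2019, 11, 2),
}

for name, onset in conditions.items():
    print(name, classify_timing(onset, osa_diagnosis))
```

For people without OSA there is no diagnosis date, so a comparison group would need some other index date, such as a matched date; how to choose it is left open here.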

Anticipated Findings

We expect to see conditions occurring prior to a diagnosis of OSA. This would suggest that 1) we may be detecting OSA late, and these conditions could help improve screening for OSA, and 2) we may identify possible underlying pathophysiology in OSA that is also contributing to other health problems.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Paul Macey - Mid-career Tenured Researcher, University of California, Los Angeles

Outcomes of People with IDD

Project Purpose(s)

  • Population Health
  • Social / Behavioral ...

Scientific Questions Being Studied

People with intellectual and developmental disabilities (IDD) experience health disparities compared to nondisabled people and people with other disabilities. The health disparities people with IDD face are in large part environmental – due to social determinants of health. Yet, there is less research not only about the social determinants of health of people with IDD, but also about quality standards that improve the outcomes of people with IDD. For these reasons, I plan to use the All of Us research data to explore quality metrics that are associated with improved outcomes for people with IDD.

Scientific Approaches

I plan to conduct research that explores the relationship between the survey data and EHR data of people with IDD. More specific research questions will be created after further exploring the data.

Anticipated Findings

The anticipated findings will provide guidance for promoting the outcomes and quality of life for people with IDD.

Demographic Categories of Interest

  • Disability Status

Research Team

Owner:

  • Carli Friedman - Research Associate, The Council on Quality and Leadership