Research Projects Directory

673 active projects

This information was updated 9/20/2021

Information about each project within the Researcher Workbench is available in the Research Projects Directory below. Approved researchers provide their project’s research purpose, description, populations of interest, and more. This information helps All of Us ensure transparency on the type of research being conducted.

At this time, all listed projects are using data in the Registered Tier. The Registered Tier contains individual-level data from electronic health records, surveys, physical measurements, and wearables. Personal identifiers have been removed from these data to protect participant privacy.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

African American Prostate Cancer Polygenic Risk Score - WCM SPORE

Build a multiethnic polygenic risk score for prostate cancer onset and identify modifiable risk factors associated with the score. This is important for disease screening and public health.

Scientific Questions Being Studied

Build a multiethnic polygenic risk score for prostate cancer onset and identify modifiable risk factors associated with the score. This is important for disease screening and public health.

Project Purpose(s)

  • Disease Focused Research (prostate cancer)

Scientific Approaches

Generate a polygenic risk score from All of Us data and publicly available GWAS summary statistics. Regression models will be used to achieve these aims.
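
As a rough illustration of what a polygenic risk score computes, the sketch below sums one participant's risk-allele dosages weighted by per-variant effect sizes taken from GWAS summary statistics. The variant IDs, effect sizes, dosages, and the function name are hypothetical placeholders, not the project's actual pipeline or data.

```python
from typing import Dict


def polygenic_risk_score(effect_sizes: Dict[str, float],
                         dosages: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    Variants missing from the participant's genotype data are skipped.
    """
    return sum(beta * dosages[variant]
               for variant, beta in effect_sizes.items()
               if variant in dosages)


# Hypothetical per-variant effect sizes (log odds ratios) from summary statistics.
effect_sizes = {"rsA": 0.5, "rsB": -0.25, "rsC": 0.125}
# Hypothetical risk-allele counts for one participant.
dosages = {"rsA": 2, "rsB": 1, "rsC": 0}

score = polygenic_risk_score(effect_sizes, dosages)
print(score)
```

In practice the scores would then enter a regression model alongside covariates, as the approach describes.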

Anticipated Findings

We expect to develop a cross-ancestry risk model that can be deployed for preventive measures. Existing models have poor cross-ancestry portability.

Demographic Categories of Interest

  • Race / Ethnicity

Research Team

Owner:

  • Yajas Shah - Graduate Trainee, Cornell University
  • Scott Kulm - Graduate Trainee, Cornell University

Pathways to Adverse Perinatal and Birth Outcomes Among Ethnic Minorities

There are remarkable racial disparities in perinatal and birth outcomes in the US. For example, African American women experience higher rates of perinatal mood and anxiety disorders and preterm birth/low birthweight compared to Caucasian American women. Environmental stress (e.g., racial…

Scientific Questions Being Studied

There are remarkable racial disparities in perinatal and birth outcomes in the US. For example, African American women experience higher rates of perinatal mood and anxiety disorders and preterm birth/low birthweight compared to Caucasian American women. Environmental stress (e.g., racial discrimination, SES), biological dysregulation (e.g., cortisol), unhealthy behaviors (e.g. lack of exercise), or inadequate coping resources (e.g., low social support) have been found to be risk factors for these adverse perinatal and birth outcomes. We want to investigate how these risk factors independently or interactively predict adverse outcomes for ethnically diverse women.

Project Purpose(s)

  • Disease Focused Research (perinatal mood and anxiety disorders, preterm birth, low birthweight)
  • Population Health
  • Social / Behavioral
  • Educational
  • Control Set

Scientific Approaches

We plan to analyze data from pregnant and postpartum women, including Overall Health, Lifestyle, COPE Survey, Lab Measurements, and Medical Records, through the National Institutes of Health All of Us Research Program.

Anticipated Findings

We anticipate that environmental stress and/or biological dysregulation will lead to adverse perinatal and birth outcomes with mediators/moderators including health behaviors and coping resources in ethnically diverse women.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Yuqing Guo - Mid-career Tenured Researcher, University of California, Irvine

Collaborators:

  • Zara Satre - Undergraduate Student, University of California, Irvine
  • Suheilah Abdalla - Graduate Trainee, University of California, Irvine
  • Jasmine Wang - Undergraduate Student, University of California, Irvine

Gestational Diabetes

We are exploring how placental morphology can predict gestational diabetes. Currently, gestational diabetes is diagnosed late in gestation. Lifestyle interventions are recommended as treatment and delivering these interventions earlier than current diagnoses may reverse the complication.

Scientific Questions Being Studied

We are exploring how placental morphology can predict gestational diabetes. Currently, gestational diabetes is diagnosed late in gestation. Lifestyle interventions are recommended as treatment and delivering these interventions earlier than current diagnoses may reverse the complication.

Project Purpose(s)

  • Educational

Scientific Approaches

We plan to use placental morphology data, demographic data, and pregnancy outcomes to address this question.

Anticipated Findings

We anticipate developing predictive models that can identify the presence of gestational diabetes earlier than current practice.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Diana Thomas - Late Career Tenured Researcher, West Point, United States Military Academy

Patient Time-Series-Based Record Linkage

In this study we are interested in validating a novel method that could efficiently link de-identified patient records to their time-series data (e.g. ergometric tests). This improved linkage method should aid medical institutions (e.g. rehabilitation centers) to assess the quality…

Scientific Questions Being Studied

In this study we are interested in validating a novel method that could efficiently link de-identified patient records to their time-series data (e.g. ergometric tests). This improved linkage method should aid medical institutions (e.g. rehabilitation centers) to assess the quality of their recorded data without using any personally identifying information. This kind of data quality assessment has been demonstrated in a previous peer-reviewed research article. Although we have verified our novel method on synthetic data, we want to run our algorithm on real time-series data that belongs to de-identified patient records. This way our assessment will be more sound as we will show that our method works efficiently on real datasets.

Project Purpose(s)

  • Methods Development

Scientific Approaches

We plan to apply our novel method for patient record linkage, which is based on time-series matching. By record linkage we mean identifying records (containing time-series data) pertinent to the same individual. Our linking methods/algorithms generally employ hierarchical clustering algorithms. We also use a fast sorting algorithm to help eliminate identical records. Then, we construct a graph that links similar records. Finally, we find the connected components within that graph. Please see http://www.rlatools.com for more information about these linking tools.
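
The three steps described above (sort to drop identical records, build a graph of similar records, find connected components) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' tool: the similarity measure (mean absolute difference between equal-length series), the threshold, and all function names are hypothetical, and a simple distance threshold stands in for the hierarchical clustering the project mentions.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Series = Tuple[float, ...]


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Placeholder similarity measure for two equal-length series."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def link_records(records: List[Series], threshold: float) -> List[List[Series]]:
    # Step 1: sort so identical records become adjacent, then drop duplicates.
    unique: List[Series] = []
    for rec in sorted(records):
        if not unique or rec != unique[-1]:
            unique.append(rec)

    # Step 2: build a graph with an edge between every pair of similar records.
    neighbors: Dict[int, List[int]] = {i: [] for i in range(len(unique))}
    for i, j in combinations(range(len(unique)), 2):
        if mean_abs_diff(unique[i], unique[j]) <= threshold:
            neighbors[i].append(j)
            neighbors[j].append(i)

    # Step 3: each connected component is one group of linked records.
    seen = set()
    components: List[List[Series]] = []
    for start in range(len(unique)):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(unique[node])
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(group))
    return components


# Hypothetical heart-rate series; the second is an exact duplicate of the first.
records = [
    (60.0, 62.0, 65.0),
    (60.0, 62.0, 65.0),
    (61.0, 62.0, 66.0),
    (90.0, 95.0, 99.0),
]
groups = link_records(records, threshold=1.0)
print(groups)
```

The all-pairs comparison in step 2 is quadratic in the number of records; a production method would need a faster candidate search.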

This study will use datasets of time-series data (e.g. heart rate) collected from ergometers or wearable devices, such as Fitbits. Please note that this research will never use any external data sources. Also, we are interested in a diverse sample in general.

Anticipated Findings

This study is anticipated to present a novel linking method that is faster and more efficient than currently available tools. This novel method will enable medical institutions to assess the quality of their collected data without compromising the privacy of the patients.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Nalini Ravishanker - Late Career Tenured Researcher, University of Connecticut

JW

I am interested in data in all domains: EHR, Survey Questions and Physical Measurements. My research aims are to test the impact of COVID-19 pandemic on health outcomes (e.g., overall health, mental health, CVD, type 2 diabetes, hypertension, functional limitation…

Scientific Questions Being Studied

I am interested in data in all domains: EHR, Survey Questions and Physical Measurements. My research aims are to test the impact of COVID-19 pandemic on health outcomes (e.g., overall health, mental health, CVD, type 2 diabetes, hypertension, functional limitation and disability) across groups by gender, age, race/ethnicity, comorbidities, etc. I am also interested in assessing risk factors for mortality caused by COVID-19 pandemic.

Project Purpose(s)

  • Population Health

Scientific Approaches

I am planning to use the epidemiological approach to analyze data to calculate descriptive statistics (e.g., percent) of both health outcomes and risk factors. I also plan to conduct inferential statistical analysis (e.g., logistic regression) to study the association between risk factors and health outcomes.

Anticipated Findings

The prevalence of poor health outcomes increased during the COVID-19 pandemic, particularly in groups associated with risk factors (e.g., older age with comorbidities).

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Jing Wang - Late Career Tenured Researcher, University of Texas at Arlington

COPC

Chronic pain conditions often overlap with each other, forming COPC. The project will use All of Us data to identify COPC development trajectories and genetic mechanisms once the genetic data is available.

Scientific Questions Being Studied

Chronic pain conditions often overlap with each other, forming COPC. The project will use All of Us data to identify COPC development trajectories and genetic mechanisms once the genetic data is available.

Project Purpose(s)

  • Disease Focused Research (Chronic overlapping pain conditions (COPC))

Scientific Approaches

We will use logistic regression to study the pairwise overlapping, counting the time series of the occurrence of the diseases. We will also use similar models with lasso to identify most relevant pairs and their trajectories.

Anticipated Findings

We expect to identify true COPC developing pairs and clusters, providing insights into the development of COPC and the underlying conditions, such as mental status.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth

Research Team

Owner:

  • Jungwei Fan - Early Career Tenure-track Researcher, Mayo Clinic
  • Haiquan Li - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Wenting Luo - Graduate Trainee, University of Arizona
  • Edwin Baldwin - Graduate Trainee, University of Arizona

Hispanic Community Representation

1. To what extent does the All of Us database reflect the diversity of the United States, particularly as it relates to the proportion of Hispanics in the US and those registered in the database? 2. To what extent is…

Scientific Questions Being Studied

1. To what extent does the All of Us database reflect the diversity of the United States, particularly as it relates to the proportion of Hispanics in the US and those registered in the database?

2. To what extent is diversity in terms of demographics, disease, health risk, and intersectionality represented within the sample of Hispanic participants in the All of Us Research Program?

Justification: Hispanics represent more than one in six persons in the U.S. (18.5%), 60.7 million people; the nation’s second largest population group compared to non-Hispanic White (60.1%), non-Hispanic Black (12.2%), non-Hispanic Asian (5.6%), non-Hispanic American Indian/Alaska Native (0.7%), and non-Hispanic Native Hawaiian/Other Pacific Islander (0.2%). Thus, it is in the interest of good science and the success of the All of Us Research Program to ensure adequate inclusion of the Hispanic community in the AoU database.

Project Purpose(s)

  • Other Purpose (This Workspace will be used for the Hispanic Community Representation Demo Project, which will aim to assess the extent to which the Hispanic community, including the plurality and diversity within the community, is represented in the All of Us database. As the Hispanic community is diverse in and of itself, it is important to determine the representation gaps that remain to be addressed in terms of demographics, disease, health risk, and intersectional factors.)

Scientific Approaches

First, we will select the Hispanic cohort from the demographic dataset. We will then summarize the demographic characteristics (age, gender, Hispanic origin, educational attainment, industry/occupation, geography, annual household income, insurance, etc.) of the Hispanic cohort using the self-reported data from the Participant Survey information dataset and compare the findings to the American Community Survey 2020 data set release. Finally, we will highlight any demographic gaps in the Hispanic All of Us participant cohort, identify potential barriers that may impede enrollment, and offer recommendations for targeted enrollment efforts to ensure a study cohort reflective of the general US Hispanic population.
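
A comparison of this kind can be sketched as a per-category gap between the cohort's share and a reference share. The example below is only an illustration: the categories, counts, and reference shares are invented placeholders (not All of Us or American Community Survey figures), and the function name is hypothetical.

```python
from typing import Dict


def representation_gaps(cohort_counts: Dict[str, int],
                        reference_shares: Dict[str, float]) -> Dict[str, float]:
    """Cohort share minus reference share for each category.

    Negative values mean the category is underrepresented in the cohort.
    """
    total = sum(cohort_counts.values())
    if total == 0:
        raise ValueError("cohort is empty")
    return {
        category: cohort_counts.get(category, 0) / total - share
        for category, share in reference_shares.items()
    }


# Hypothetical cohort counts and reference population shares.
cohort_counts = {"urban": 700, "suburban": 200, "rural": 100}
reference_shares = {"urban": 0.60, "suburban": 0.25, "rural": 0.15}

gaps = representation_gaps(cohort_counts, reference_shares)
for category, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{category}: {gap:+.1%}")
```

The same function could be applied to each demographic characteristic in turn (age group, educational attainment, income band, and so on).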

Anticipated Findings

We expect that the Hispanic cohort in the All of Us Research Program reflects the distribution of major Hispanic population centers, diversity of Hispanic subpopulation groups, and a mix of participants from rural and urban centers as well as key economic, educational, health access indicators. Our findings would contribute to the validity of the All of Us research hub as a diverse pool of participants that reflect the diverse composition of the United States and of the Hispanic community itself.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Natalia Cañas - Research Associate, National Alliance for Hispanic Health

Predicting Major Adverse Cardiac Events in Heart Failure Patients with COVID-19

Aim 1: Determine the predictors of mortality and hospitalization for patients with acute or chronic heart failure (A/CHF) who had a diagnosis of COVID-19. Rapid onset of new or worsening heart failure symptoms is characteristic of AHF. Concomitance of COVID-19…

Scientific Questions Being Studied

Aim 1: Determine the predictors of mortality and hospitalization for patients with acute or chronic heart failure (A/CHF) who had a diagnosis of COVID-19. Rapid onset of new or worsening heart failure symptoms is characteristic of AHF. Concomitance of COVID-19 presents additional challenges towards treating A/CHF patients. Studies provide several candidate clinical and laboratory measures associated with worse clinical outcomes for patients with A/CHF and COVID-19. Identifying COVID-19 specific predictors of mortality and hospitalization for A/CHF patients would help explain the pathophysiology behind the progression of COVID-19 in A/CHF patients.

Aim 2: Stratify the risk for suboptimal guideline-directed medical therapy (GDMT) for A/CHF patients with COVID-19. COVID-19 obstructs A/CHF patients from reaching their optimal target doses. Assigning patients into different strata at risk of not achieving optimal GDMT targets may provide clinicians with more impactful treatment options.

Project Purpose(s)

  • Disease Focused Research (severe acute respiratory syndrome, acute on chronic heart failure)

Scientific Approaches

This retrospective study will include demographic characteristics and clinical features from the All of Us A/CHF and COVID-19 combined cohorts. Missing values will be imputed by multiple imputation. Dimensionality of the data will be reduced by supervised selection. Associations between demographic and clinical features will be made with the outcome of 1-year re-hospitalization with A/CHF as the primary diagnosis. Models generated will utilize standard regression, random forests, and gradient boosting, and will be evaluated by their predictive values, sensitivity, specificity, and c-statistics.

Combined clinical features at baseline will undergo k-means cluster analysis to subset groups. Features will undergo processing as described above. A predictive model will be developed, and a Cox proportional hazards regression analysis for re-hospitalization will be performed for each subgroup. All analyses are to be conducted on the All of Us workbench in the latest version of R and Python.
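
The evaluation metrics named above (sensitivity, specificity, and the c-statistic) can be computed from a model's predictions as in the sketch below. The labels, risk scores, and the 0.5 decision threshold are invented for illustration, the function names are hypothetical, and the code assumes both outcome classes are present.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def c_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (event, non-event) pairs where the event has the higher score.

    Ties count as half; for a binary outcome this equals the area under
    the ROC curve.
    """
    pos: List[float] = [s for t, s in zip(y_true, scores) if t == 1]
    neg: List[float] = [s for t, s in zip(y_true, scores) if t == 0]
    concordant = sum(1.0 if p > n else 0.5 if p == n else 0.0
                     for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


# Hypothetical outcomes (1 = re-hospitalized within 1 year) and model risk scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
cstat = c_statistic(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} c-statistic={cstat:.2f}")
```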

Anticipated Findings

We may expect to find clinical features and laboratory parameters associated with elevated systemic inflammation, endothelial dysfunction, and hypercoagulation to be strong predictors of adverse outcomes for A/CHF patients who have contracted COVID-19. Clinical features like carbon dioxide and oxygen partial pressures in arterial blood may serve as correlates of worse outcomes. Predictive laboratory features may include high-sensitivity C-reactive protein (hs-CRP), brain and atrial natriuretic peptides (BNP/ANP), ferritin, interleukins, neutrophils, complete blood count and d-dimer quantities among others.

In stratifying patients at risk of not adhering to GDMT, we expect that a patient’s health care access and utilization, as well as the severity of their COVID-19 infection, may be associated with a greater risk of non-adherence. Severity of COVID-19 infection may be understood as a profile of high inflammation like elevated levels of hs-CRP or interleukins.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Suad Alshammari - Graduate Trainee, Virginia Commonwealth University
  • Silas Contaifer - Graduate Trainee, Virginia Commonwealth University
  • Daniel Contaifer Junior - Project Personnel, Virginia Commonwealth University
  • VIRGINIA UNIVERSITY - Graduate Trainee, Virginia Commonwealth University
  • Kevin Ledezma - Graduate Trainee, Boston University

Cutaneous Eruptions

What are the patterns of cutaneous eruptions in specific demographic groups following oncologic therapy?

Scientific Questions Being Studied

What are the patterns of cutaneous eruptions in specific demographic groups following oncologic therapy?

Project Purpose(s)

  • Disease Focused Research (Cutaneous Eruptions)

Scientific Approaches

Datasets will include cohorts of patients receiving immunotherapy and/or targeted therapy and concept sets pertaining to cutaneous eruptions. Analyses will be done in R.

Anticipated Findings

We anticipate finding differential patterns of cutaneous eruptions between demographic groups. This will increase awareness amongst dermatologists treating patients with cutaneous eruptions, thereby preventing discontinuation of immunotherapy/targeted therapy due to cutaneous eruptions.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Research Team

Owner:

Postoperative surgery pain management and recovery

We are exploring the data to identify indicators of postoperative surgery outcomes and preoperative predictors of those outcomes. We will also build predictive models of postoperative outcomes.

Scientific Questions Being Studied

We are exploring the data to identify indicators of postoperative surgery outcomes and preoperative predictors of those outcomes. We will also build predictive models of postoperative outcomes.

Project Purpose(s)

  • Social / Behavioral
  • Methods Development
  • Ancestry

Scientific Approaches

This project will explore the potential to identify indicators of postoperative surgery outcomes and to identify predictors of those outcomes from All of Us data. We plan to use All of Us EHR data, Fitbit data, and genotype data for this exploration. If successful, we will build predictive models for various postoperative surgery outcomes (e.g., pain management, recovery time, etc.).

Anticipated Findings

We anticipate demonstrating the value of combining EHR, physical activity monitoring, and genotype data to predict postoperative surgery outcomes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Casey Taylor - Early Career Tenure-track Researcher, Johns Hopkins University

Project 1

We hope to elucidate information surrounding weekly physical activity among patients with unstable angina (preinfarction syndrome) and seek to understand whether there is an impact that medically adherent patients with prescribed medication, namely beta blockers, can have on weekly physical…

Scientific Questions Being Studied

We hope to elucidate information surrounding weekly physical activity among patients with unstable angina (preinfarction syndrome) and seek to understand whether there is an impact that medically adherent patients with prescribed medication, namely beta blockers, can have on weekly physical activity levels. This information could help further our understanding of the behavioral factors that have significant influences on the overall health outcomes of this patient population.

Project Purpose(s)

  • Educational

Scientific Approaches

Initially, I will look at the existing data for patients with conditions of chest pain, more specifically preinfarction syndrome. From that data I will explore further information regarding their weekly physical activity levels as recorded by the Fitbit data as well as the existing information on prescribed medications. At this stage, we are still in an exploratory phase and have not planned a structured analysis for this data.

Anticipated Findings

I anticipate that we will see that patients with preinfarction syndrome who are medically adherent to their prescribed medication of beta blockers will be more likely to have higher levels of weekly physical activity (indicated by active minutes) than those patients who are not prescribed medication/non-adherent. Based on the information collected in this study, we would posit potential explanations for the observed results based on the existing peer-reviewed literature. With any information found with this study, the scientific community would have a further understanding of the potential factors that may or may not contribute to behavioral health outcomes, like physical activity, in this patient population and how it may contribute to overall well-being.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Lilly Derby - Graduate Trainee, Rutgers, The State University of New Jersey

Cortisol, Prolactin, and Pain Persistence in Women with Endometriosis

I am currently exploring the data at this stage to formalize a specific research question regarding women with persistent chronic pelvic pain following endometrial ablation and what role, if any, their stress, cortisol levels, and prolactin levels play in that…

Scientific Questions Being Studied

I am currently exploring the data at this stage to formalize a specific research question regarding women with persistent chronic pelvic pain following endometrial ablation and what role, if any, their stress, cortisol levels, and prolactin levels play in that relationship.

Project Purpose(s)

  • Social / Behavioral

Scientific Approaches

At this stage, I'm still exploring what questions would be most appropriate once I am able to look at the data available. It will likely be either a mediating regression model with prolactin and cortisol as mediators or a t-test to determine differences between groups depending upon what is available.

Anticipated Findings

At the moment, my work is still preliminary as I begin to formulate a specific research question. That said, I anticipate that greater prolactin and cortisol levels will be present among women with persistent chronic pelvic pain post endometrial ablation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Jacqueline Smith - Graduate Trainee, Rutgers, The State University of New Jersey

Alopecia Areata and other Immune-Mediated Dermatologic Conditions

We are interested in studying the influence of climate on the clinical course of alopecia areata (AA), atopic diseases, and other immune-mediated dermatologic conditions. We will also explore the association between AA, atopic conditions, and immune-mediated diseases. We hypothesize that…

Scientific Questions Being Studied

We are interested in studying the influence of climate on the clinical course of alopecia areata (AA), atopic diseases, and other immune-mediated dermatologic conditions. We will also explore the association between AA, atopic conditions, and immune-mediated diseases. We hypothesize that people with immune-mediated diseases may be at increased risk of developing AA. Lastly, we plan to explore whether medication utilization for AA, atopic diseases, and immune-mediated dermatologic conditions varies by race, season, geographical location, socioeconomic status, or insurance status.

Project Purpose(s)

  • Disease Focused Research (Immune-mediated dermatologic conditions)

Scientific Approaches

Using the All of Us dataset, we will evaluate the associations between various immune-mediated diseases and risk of alopecia areata. We will also utilize correlation analyses to explore whether various factors influence the pattern of disease flares and health care utilization. Statistical analysis will be conducted using STATA software and R.

Anticipated Findings

Comorbidity studies have the potential to inform screening for associated diseases and may help guide future animal model or genome wide association studies, which may ultimately improve clinical outcomes.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

Collaborators:

  • Isabelle Moseley - Graduate Trainee, Brown University

MDD_Analysis

Analysis and identification of digital biomarkers in a population with depression using sensor and other data sources.

Scientific Questions Being Studied

Analysis and identification of digital biomarkers in a population with depression using sensor and other data sources.

Project Purpose(s)

  • Disease Focused Research (major depressive disorder)

Scientific Approaches

- Statistical analysis on longitudinal sensory signals.
- Machine learning approaches to build classifiers.

Anticipated Findings

- Validation of identified biomarkers, based on existing literature, that are clear indicators of depression.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

ZhouLab

Like other forms of big data, biobank data is characterized by its volume, velocity, variety, and veracity (4V). This proposal develops several statistical methods and computational algorithms that address certain aspects of 4V for identifying biomarkers associated with cardio-metabolic related…

Scientific Questions Being Studied

Like other forms of big data, biobank data is characterized by its volume, velocity, variety, and veracity (4V). This proposal develops several statistical methods and computational algorithms that address certain aspects of 4V for identifying biomarkers associated with cardio-metabolic related traits and study their genetic overlap with cognitive functions. We focus on four specific topics. (1) We provide a foundation for developing optimization algorithms for analyzing data that cannot fit into computer memory. (2) Bag of little bootstrap (BLB) for massive variance component models. (3) Variance component selection.

Project Purpose(s)

  • Population Health
  • Drug Development
  • Methods Development
  • Ancestry
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Approaches

We will design and code implementations in a high-performance dynamic language, Python or Julia if possible. To foster scientific reproducibility and maximize software sustainability, we will embrace modern software engineering practices. We will use observational study design to extract incidence of disease outcomes, e.g., heart failure, stroke, dementia, etc. We will use time-to-event models, e.g., Cox-PH models, to analyze the incidence of diseases.

Anticipated Findings

From our proposal, we expect to develop algorithms, user-friendly open-source software, and analysis pipelines to encourage efficient and reproducible research. Additionally, from these studies, we expect to identify novel genetic variants or other clinical risk factors implicated in diseases or disease-related traits, and to gain a better understanding of how specific genetic variants may impact diseases and traits, how they interact with each other and with lifestyle factors, and how this information could be used to pursue a more personalized approach to medicine. The uniqueness of our proposal is the incorporation of time-dependent trajectories into disease prediction and early prevention.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Jin Zhou - Mid-career Tenured Researcher, University of California, Los Angeles

Collaborators:

  • Aubrey Jensen - Project Personnel, University of Arizona

Duplicate of Hypertension_Analysis

What disease phenotypes are associated with hypertension? I intend to use electronic health records to extract phenotypes not normally associated with hypertension.

Scientific Questions Being Studied

What disease phenotypes are associated with hypertension? I intend to use electronic health records to extract phenotypes not normally associated with hypertension.

Project Purpose(s)

  • Disease Focused Research (hypertension)

Scientific Approaches

I intend to build a dataset that includes people with and without hypertension and make comparisons between the two cohorts. I will use the PheWAS package to compare diseases associated with these cohorts.

Anticipated Findings

I anticipate finding different correlations of disease between the case and control cohorts. These findings will show how we can use information contained in electronic health records.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Hypertension_Analysis

What disease phenotypes are associated with hypertension? I intend to use electronic health records to extract phenotypes not normally associated with hypertension.

Scientific Questions Being Studied

What disease phenotypes are associated with hypertension? I intend to use electronic health records to extract phenotypes not normally associated with hypertension.

Project Purpose(s)

  • Disease Focused Research (hypertension)

Scientific Approaches

I intend to build a dataset that includes people with and without hypertension and make comparisons between the two cohorts. I will use the PheWAS package to compare diseases associated with these cohorts.

Anticipated Findings

I anticipate finding different correlations of disease between the case and control cohorts. These findings will show how we can use information contained in electronic health records.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Duplicate of Eye Related

I am currently exploring the data to determine how much eye health-related information exists in All of Us. This will help determine future research questions.

Scientific Questions Being Studied

I am currently exploring the data to determine how much eye health related information exists in All of Us. This will help determine future research questions.

Project Purpose(s)

  • Disease Focused Research (eye disease)

Scientific Approaches

At this stage, my use of the workbench is exploratory. I will be using data mining techniques to look for eye health related data to inform future research.

Anticipated Findings

I expect to find some data relevant to eye health to provide a basis for further study.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Comorbidities

We are hoping to compare the All of Us study population with the UK Biobank in answering the following questions: 1. Are there morbidity disparities between ethnic groups? 2. Are there comorbidity disparities between ethnic groups, with respect to the…

Scientific Questions Being Studied

We are hoping to compare the All of Us study population with the UK Biobank in answering the following questions:

1. Are there morbidity disparities between ethnic groups?
2. Are there comorbidity disparities between ethnic groups, with respect to the overall burden of comorbidities for each ethnic group?
3.1 Do comorbidity disparities between ethnic groups impact disease outcomes, with respect to the overall burden of comorbidities for each disease?
3.2 Do comorbidity disparities between ethnic groups impact disease outcomes, with respect to specific disease-risk factor comorbidities?

Overall, we hope to determine how comorbidities may serve as risk factors for ethnic health disparities.

Project Purpose(s)

  • Population Health
  • Ancestry

Scientific Approaches

We will be using the All of Us dataset alongside the UK Biobank, with a focus on the All of Us populations. In this study, we will be using the Elixhauser Comorbidity Index to categorize and describe comorbidities. For our analysis, we will be using a primarily network-based approach. All analyses will be conducted in R or Cytoscape. Limitations include that electronic health records may not have complete health information on participants, as these records depend on the presence of insurance billing codes.
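One way to picture a network-based view of comorbidities is a co-occurrence edge list, sketched below in Python. The project does not say how its networks are built, so this construction is an assumption; the function name `cooccurrence_edges` and the toy participants are hypothetical, and the sketch does not attempt the mapping from diagnosis codes to Elixhauser categories.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(participant_categories):
    """Count how many participants carry each pair of comorbidity categories.

    `participant_categories` maps a participant ID to a set of category
    names (hypothetical input; deriving categories from diagnosis codes
    is out of scope for this sketch).
    Returns a Counter keyed by alphabetically sorted (category, category) pairs.
    """
    edges = Counter()
    for categories in participant_categories.values():
        for pair in combinations(sorted(categories), 2):
            edges[pair] += 1
    return edges

# Toy data, invented for illustration only.
toy = {
    "p1": {"hypertension", "diabetes", "obesity"},
    "p2": {"hypertension", "diabetes"},
    "p3": {"obesity"},
}
edges = cooccurrence_edges(toy)
```

Each key is an edge between two categories and each value is its weight, so edge lists built separately per group could be compared side by side.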

Anticipated Findings

We anticipate that we will find:

1. Morbidity disparities between ethnic groups;
2. Comorbidity disparities between ethnic groups;
3. Specific comorbidity risk factors which differentially affect disease outcomes in ethnic groups.

Additionally, we anticipate that our results will have similarities to previous findings in the UK Biobank. However, specific disparities may be different, especially since ethnic categories are different in the United States compared with the United Kingdom. We will also investigate environmental and social factors which may be contributing to the differences we find between the two study populations. Finally, we may follow up this study by examining genetic contributions to our results.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Maria Ahmad - Project Personnel, NIH
  • Leonardo Marino-Ramirez - Senior Researcher, NIH

COVID and SHS

We are looking to explore the relationship between COVID-19 and secondhand smoke exposure.

Scientific Questions Being Studied

We are looking to explore the relationship between COVID-19 and secondhand smoke exposure.

Project Purpose(s)

  • Disease Focused Research (COVID-19)
  • Social / Behavioral

Scientific Approaches

We are going to look at the COVID-19 Participant Experience survey.

Anticipated Findings

We anticipate finding that SHS exposure increases COVID-19 disease rate and worsens outcomes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Elissa Resnick - Project Personnel, University of Illinois at Chicago

Pregnancy heart rate study

Women’s exposure to social adversity over the life course is associated with altered physiologic set points within stress regulatory systems (e.g. autonomic and endocrine systems). Such physiologic alterations render some women more vulnerable to adverse mental, physical and reproductive health…

Scientific Questions Being Studied

Women’s exposure to social adversity over the life course is associated with altered physiologic set points within stress regulatory systems (e.g. autonomic and endocrine systems). Such physiologic alterations render some women more vulnerable to adverse mental, physical and reproductive health trajectories. Circadian heart rate parameters are an emerging pre-morbid biomarker of sympathovagal balance of the autonomic stress response, but few studies have examined them in the context of pregnancy. Therefore, the purpose of this study is to 1) describe within-person trajectories of circadian heart rate over the duration of pregnancy, 2) examine whether between-person variation in these parameters is associated with social and intergenerational adversity and 3) examine whether within- and between-person nocturnal heart rate parameters are associated with physical activity in pregnancy.

Project Purpose(s)

  • Social / Behavioral
  • Methods Development

Scientific Approaches

We will curate a subset of data from the National Institutes of Health All of Us Research Program for secondary analysis. We will examine a subset of individuals with 1) a confirmed pregnancy and 2) available Fitbit data. Minute-level data on heart rate from the Fitbit will be used to compute circadian heart rate parameters (e.g. nocturnal dipping ratio). Intensive longitudinal data analysis methods will be used for within-person analyses. Multilevel modeling will be used for between-person analyses. This research will enhance understanding of stress-related programming effects on pathophysiologic pregnancy complications.
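As an illustration of how a circadian parameter might be derived from minute-level heart rate, the Python sketch below computes a nocturnal dipping ratio. The project does not define the ratio, so both the formula used here, (daytime mean minus nighttime mean) divided by daytime mean, and the 00:00 to 06:00 night window are assumptions; the function name and the readings are hypothetical.

```python
from datetime import datetime
from statistics import mean

def nocturnal_dipping_ratio(readings, night_start=0, night_end=6):
    """Return (day_mean - night_mean) / day_mean for one participant.

    `readings` is a list of (datetime, beats_per_minute) pairs.
    Hours in [night_start, night_end) count as night; this window and
    the formula itself are assumptions, not the study's definitions.
    """
    night = [bpm for ts, bpm in readings if night_start <= ts.hour < night_end]
    day = [bpm for ts, bpm in readings if not (night_start <= ts.hour < night_end)]
    if not night or not day:
        raise ValueError("need both daytime and nighttime readings")
    day_mean = mean(day)
    return (day_mean - mean(night)) / day_mean

# Toy readings, invented for illustration only.
readings = [
    (datetime(2021, 1, 1, 2, 0), 60),
    (datetime(2021, 1, 1, 3, 0), 62),
    (datetime(2021, 1, 1, 14, 0), 80),
    (datetime(2021, 1, 1, 15, 0), 84),
]
ratio = nocturnal_dipping_ratio(readings)
```

Computing this value per participant per day would give the repeated measures that within-person trajectory models need.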

Anticipated Findings

We hypothesize that pregnancy will be associated with an overall increase in heart rate and changes in the nocturnal dipping ratio over the course of pregnancy. Social and intergenerational adversity will be associated with baseline nocturnal dipping ratio and trajectories of heart rate over gestation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Crystal Epstein - Early Career Tenure-track Researcher, University of North Carolina, Greensboro

Collaborators:

  • Thomas McCoy - Project Personnel, University of North Carolina, Greensboro

Social Determinants of Colorectal Cancer

I plan to explore the social determinants of colorectal cancer (CRC) incidence and screening adherence, and which, if any, clinical factors may predict improved adherence to revised CRC screening guidelines advising that average-risk screening begin at age 45.

Scientific Questions Being Studied

I plan to explore the social determinants of colorectal cancer (CRC) incidence and screening adherence, and which, if any, clinical factors may predict improved adherence to revised CRC screening guidelines advising that average-risk screening begin at age 45.

Project Purpose(s)

  • Disease Focused Research (colorectal cancer)

Scientific Approaches

Using this new resource, I plan to explore the data, which I anticipate will be hypothesis-generating for future study.

Anticipated Findings

I would anticipate characterizing the social determinants of colorectal cancer incidence and screening adherence.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Long Nguyen - Early Career Tenure-track Researcher, Mass General Brigham

HIV immunogenetic study

Despite the public health impact on the HIV-1 pandemic from highly active antiretroviral therapy (ART) and pre-exposure prophylaxis (PrEP), questions remain about the mechanisms that define risk of sexually acquired HIV. There is no explanation why after accounting for virus…

Scientific Questions Being Studied

Despite the public health impact on the HIV-1 pandemic from highly active antiretroviral therapy (ART) and pre-exposure prophylaxis (PrEP), questions remain about the mechanisms that define risk of sexually acquired HIV. There is no explanation for why, after accounting for virus exposure and the presence of host variants modifying HIV-1 replication pathways, some highly exposed people exhibit resistance to HIV-1 while others are susceptible. Our studies suggest that a subset of immune response genes alters a person’s pro-inflammatory environment. We hypothesize that variants in these genes contribute to defining a homeostatic inflammation setpoint that impacts the risk of HIV-1 infection and other inflammatory disorders. Understanding these mechanisms could lead to novel interventions to reduce HIV-1 acquisition, along with other inflammatory conditions. We will evaluate the concept that aggregate host variation in two candidate genes modifies risk of HIV-1 and other inflammatory disorders.

Project Purpose(s)

  • Disease Focused Research (HIV and other inflammatory conditions such as Type 1 and Type 2 diabetes)
  • Ancestry

Scientific Approaches

We will build three datasets. One includes HIV-infected cases (~5000 people) and HIV-uninfected controls. Many participants identified as HIV-uninfected are not highly HIV-exposed. Therefore, we will use Cox proportional hazards modeling to generate an HIV-risk score and use this to identify the HIV-uninfected in the highest 2% of HIV-exposure to include as controls. We will develop a dataset of Type 1 diabetes mellitus (T1D) cases with controls who are at high risk of T1D but do not have T1D, and a dataset of Type 2 diabetes mellitus (T2D) cases with controls who are at high risk of T2D but do not have a diagnosis. Whole genome sequence data will be used to test the association of aggregate functional genomic variants in our candidate genes with prevalence of HIV-1, T1D and T2D, accounting for population structure. We will use the All of Us platform to access de-identified EHR, physical exam, and genomic data in a context that will protect the privacy of participants.
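The control-selection step for the HIV dataset, keeping the HIV-uninfected participants whose risk score falls in the highest 2%, can be sketched in Python as follows. Fitting the Cox model is not shown; the scores are assumed to be already computed, and the function name `top_fraction_by_score` and the toy values are hypothetical.

```python
import math

def top_fraction_by_score(scores, fraction=0.02):
    """Return the IDs with the highest risk scores, best first.

    `scores` maps a participant ID to a precomputed risk score
    (hypothetical values; fitting the risk model is out of scope here).
    The cutoff count is rounded up, so a non-empty cohort always
    yields at least one ID.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = math.ceil(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]

# Toy scores for 100 uninfected participants, invented for illustration only.
scores = {f"p{i}": i / 100 for i in range(100)}
controls = top_fraction_by_score(scores)
```

With 100 participants and a 2% cutoff, the two highest-scoring participants are kept as controls.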

Anticipated Findings

To date no reproducible, underlying genetic mechanism has been identified that connects innate inflammation to HIV-risk. We hope to demonstrate that evaluation of aggregate variation in our two candidate genes can be used to define a homeostatic inflammation setpoint that impacts risk of HIV-1 infection and other inflammatory disorders. Our study will aid in understanding the pathways and mechanisms associated with susceptibility and natural resistance to HIV-1 infection, providing a framework for future host-targeted prevention measures and anti-inflammatory treatments based on a person’s constellation of homeostatic inflammation characteristics. Our research will also shed light on the burden of different rare variants contributing to HIV-susceptibility across various ethnicities and ancestries.

Demographic Categories of Interest

  • Race / Ethnicity
  • Gender Identity
  • Sexual Orientation

Research Team

Owner:

Social Determinants and Healthcare Access in Eye Conditions - v4 Dataset

We are planning to explore disparities in healthcare access and utilization for patients with eye conditions across different demographic groups. We would like to evaluate risk of developing advanced/severe disease in different eye conditions, and understand how social determinants contribute…

Scientific Questions Being Studied

We are planning to explore disparities in healthcare access and utilization for patients with eye conditions across different demographic groups. We would like to evaluate risk of developing advanced/severe disease in different eye conditions, and understand how social determinants contribute to this risk while adjusting for other known risk factors. We are also interested in understanding the availability of social determinants of health data in this data repository compared to EHR clinical data warehouses alone.

Project Purpose(s)

  • Population Health

Scientific Approaches

We will build cohorts of patients with various eye diseases (e.g. diabetic retinopathy, retinal vein occlusions, glaucoma). Then we will develop concept sets and extract data on outcomes (i.e. development of complications), as well as predictors including clinical data and social data. We will draw on survey data and EHR data within All of Us. When genomic data and wearable data become available, we are interested in evaluating those data sources as well. We will use statistical modeling and machine learning to generate predictive models.

Anticipated Findings

We anticipate that there may be differential risk for developing complications based on disparities in healthcare access and utilization for patients with eye conditions.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Terrence Lee - Graduate Trainee, University of California, San Diego
  • Sally Baxter - Research Fellow, University of California, San Diego
  • John McDermott - Graduate Trainee, University of California, San Diego
  • Grace Ahn - Graduate Trainee, University of California, San Diego
  • Gordon Ye - Undergraduate Student, University of California, San Diego
  • Alison Chan - Graduate Trainee, University of California, San Diego
  • Bita Shahrvini - Graduate Trainee, University of California, San Diego
  • Bharanidharan Radha Saseendrakumar - Project Personnel, University of California, San Diego
  • Arash Delavar - Graduate Trainee, University of California, San Diego

Collaborators:

  • Priyanka Soe - Project Personnel, University of California, San Diego
  • Mahasweta Nayak - Undergraduate Student, University of California, San Diego
  • Cecilia Vallejos - Undergraduate Student, University of California, San Diego

Research Program for Vision Surveillance: Diabetes and Diabetic Retinopathy

How do data from the All of Us database compare against known data sources that are considered to be representative of the general population and have been traditionally used in vision health surveillance activities (such as NHANES, NHIS, etc.)? How…

Scientific Questions Being Studied

How do data from the All of Us database compare against known data sources that are considered to be representative of the general population and have been traditionally used in vision health surveillance activities (such as NHANES, NHIS, etc.)? How does All of Us compare to existing big-data sources such as IQVIA?

There is increasing interest in understanding how social factors impact health and vision outcomes. Social determinants of health are important considerations for disease management and prognosis, and our representative use case (diabetes and diabetic retinopathy) has huge implications for our health system as the leading cause of blindness and visual impairment among working-age adults in the United States. By answering the above questions, we can determine whether the All of Us database is representative and may be broadly generalizable for future studies.

Project Purpose(s)

  • Control Set

Scientific Approaches

- Develop standard cohort definition for diabetes
- Develop standard cohort definition for diabetic retinopathy
- Determine prevalence of diabetes and compare across different data sources – All of Us, NHANES, NHIS, IQVIA
  o Numerator: Number of adults with diabetes
  o Denominator: Total number of adults available in data source
- Determine prevalence of diabetic retinopathy and compare across different data sources – All of Us, NHANES, NHIS, IQVIA
  o Numerator: Number of adults with diabetic retinopathy
  o Denominator: Total number of adults available in data source vs. total number of adults with diabetes
- For prevalence calculations, will need to establish defined study periods and ensure consistency across data sources
- Potential analyses:
  o Look at state/regional variations
  o Examine demographics (age, gender, race, ethnicity) of cohorts across data sources
- Identify areas of similarity/alignment vs. differences
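The prevalence calculations listed above reduce to a numerator divided by a denominator, with diabetic retinopathy reported against two different denominators. A minimal Python sketch follows; the function name `prevalence` and all counts are invented for illustration and do not come from any of the named data sources.

```python
def prevalence(numerator, denominator):
    """Return numerator / denominator as a proportion between 0 and 1."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= numerator <= denominator:
        raise ValueError("numerator must lie between 0 and the denominator")
    return numerator / denominator

# Invented counts for a single data source and a single study period.
adults_total = 10_000
adults_with_diabetes = 1_200
adults_with_retinopathy = 300

diabetes_prevalence = prevalence(adults_with_diabetes, adults_total)
dr_among_all_adults = prevalence(adults_with_retinopathy, adults_total)
dr_among_diabetics = prevalence(adults_with_retinopathy, adults_with_diabetes)
```

Repeating the same three calculations for each data source over the same study period yields figures that can be compared directly.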

Anticipated Findings

If we are able to demonstrate that the All of Us database is representative and aligns with existing nationwide data sources, then findings regarding links between social determinants and vision health outcomes using All of Us could be considered more broadly generalizable. On the other hand, if there are major discrepancies between All of Us and previously established data sources, this would be important information for the vision research community to be aware of, and this could even inform future efforts to make the database more representative.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care

Research Team

Owner:

Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.