Research Projects Directory

525 active projects

This information was updated 4/16/2021

Information about each project within the Researcher Workbench is available in the Research Projects Directory below. Approved researchers provide their project’s research purpose, description, populations of interest, and more. This information helps All of Us ensure transparency on the type of research being conducted.

At this time, all listed projects are using data in the Registered Tier. The Registered Tier contains individual-level data from electronic health records, surveys, physical measurements, and wearables. Personal identifiers have been removed from these data to protect participant privacy.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

Version Two ophthalmology epidemiology(DV3)

Scientific Questions Being Studied

We would like to evaluate the epidemiology, treatments, and health outcomes of eye diseases using the diverse population in the All of Us project. Over 12 million people in the United States over the age of 40 have visual impairment, and over 3 million have visual impairment despite glasses, contacts, or other treatments. Visual impairment has severe impacts on patients' quality of life and mortality. There are many common causes of visual impairment, including some that are reversible (such as cataract) and others that are treatable but can still cause irreversible vision loss (macular degeneration, glaucoma, diabetic retinopathy). Some of these diseases disproportionately impact minority populations (e.g., glaucoma in African Americans and Hispanics).
We hope to broadly characterize the prevalence of eye diseases in this cohort, as well as associated medical and surgical treatments. We hope to be able to investigate risk factors, patterns and outcomes of treatment of different eye diseases.

Project Purpose(s)

  • Disease Focused Research (eye diseases)
  • Population Health

Scientific Approaches

We plan to primarily use the EHR, survey, and physical measurements datasets to describe the epidemiology of eye diseases, using encounter-level billing codes to determine their presence or absence. We plan to investigate risk factors for these eye diseases, including demographics, medications, physical measurements (to the extent available), survey data, and other associated diagnoses. We will begin with simple descriptive statistics. For diagnoses with sufficiently large cohorts, we will also build logistic regression models to evaluate risk factors for diagnosis.
We will also evaluate treatment patterns (medical and surgical) for different eye diseases, using EHR data on medications and surgeries undergone. We will characterize demographic patterns in the use of medications and surgeries.

Anticipated Findings

We anticipate that our findings will contribute broadly to the knowledge of epidemiology of eye diseases in the US, as well as improve our understanding of patterns of treatments and outcomes of eye diseases in the US. In this diverse population, we will also be able to see if there are disparities in eye diseases and their treatment patterns and outcomes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Wendeng Hu - Project Personnel, Stanford University
  • Sophia Wang - Early Career Tenure-track Researcher, Stanford University
  • Eric Lee - Graduate Trainee, Stanford University

Warfarin Modeling

Scientific Questions Being Studied

We are exploring the data to inform hypotheses regarding warfarin dose or adverse events related to warfarin therapy.

Project Purpose(s)

  • Population Health
  • Drug Development
  • Ancestry

Scientific Approaches

We plan to use a cohort of warfarin-taking individuals and use machine learning to predict outcomes related to drug therapy.

Anticipated Findings

We expect to find new sources of outcome variability that may be informative for clinically implementable predictive models.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Hayley Patterson - Project Personnel, University of Arizona

Duplicate of Research in Genetics Workshop

Scientific Questions Being Studied

Exploring data as an undergraduate biology student to formalize a study and become familiar with research techniques. I am hoping to answer a question based on infectious diseases and how they lead to other diseases and abnormalities.

Project Purpose(s)

  • Disease Focused Research (Human immunodeficiency virus infectious disease)
  • Population Health
  • Educational
  • Ancestry

Scientific Approaches

I plan to use R, together with infectious disease and healthcare data, to discover whether there are any links to motor neuron diseases, genetic mutations, etc.

Anticipated Findings

I anticipate finding more details on the relationship between HIV and associated motor neuron disease. I also hope to find associations with the CCR5-delta 32 mutation.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Access to Care

Research Team

Owner:

Effect of COVID-19 on Socialization and Isolation in Latino Populations

Scientific Questions Being Studied

The primary purpose of this study is to investigate the relationships between 1) individual characteristics (e.g., age, gender, ethnicity, etc.), 2) socialization, 3) loneliness/isolation, & 4) health. This study will also examine relationships between engagement & health. Results will be contextualized within COVID-19; the data may inform on the experience of aging during isolating events. Results may further understanding of differential effects of isolation on continued activity, socialization, & health in aging populations across cohorts. These data may help inform on the relationships between socialization/loneliness in adults.
We aim to identify:
1. Relationships between demographics, health & socialization, loneliness, & isolation; how any relationships are affected by ethnicity.
2. Relationships between socialization, loneliness & isolation, & health; how any relationships are affected by ethnicity.
3. How COVID-19 influences the data.

Project Purpose(s)

  • Social / Behavioral
  • Educational
  • Other Purpose (Findings from this study may contribute to manuscripts for scientific journals and/or conference submissions.)

Scientific Approaches

Dataset development will utilize All of Us data and will occur in the Researcher Workbench. Data will be pulled from several All of Us datasets, including: The Basics (demographics), Overall Health (health status), Lifestyle (substance use), Personal Medical History (health status), Health Care Access & Utilization (insurance), COVID-19 Participant Experience (COVID-19 experience), and Physical Measurements (physical activity and health).
All data analysis and data visualization will be conducted within the All of Us Researcher Workbench. Normality tests, regression analyses, and tests of correlation (e.g., chi-square analysis, Pearson correlation, etc.) will be utilized in analyses. Results will be reported in APA style.
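As a rough illustration of the tests named above, the following sketch computes a Pearson correlation and a chi-square statistic in plain Python. It is not the project team's code; the variable names and all numbers are hypothetical and are not All of Us data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def chi_square(table):
    """Chi-square statistic (no continuity correction) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical scores for five participants (illustration only).
socialization = [1, 2, 3, 4, 5]
loneliness = [5, 4, 3, 2, 1]
r = pearson_r(socialization, loneliness)

# Hypothetical 2x2 table: rows = ethnicity group, columns = lonely yes/no.
counts = [[10, 20], [20, 10]]
chi2 = chi_square(counts)

print(f"Pearson r = {r:.2f}, chi-square = {chi2:.2f}")
```

In practice a statistics package would also report p-values and degrees of freedom; the sketch only shows how the two statistics are formed.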

Anticipated Findings

We anticipate that variables which are associated with cumulative (dis)advantage (e.g., race, gender) & health will negatively affect downstream outcomes including socialization; persons with less disadvantage & more access may have higher levels of socialization. This may be related to reduced loneliness. Cyclically, persons with more socialization may be healthier & less lonely, & persons who are healthier & less lonely may socialize more. However, we may uncover evidence of the benefit of larger, multigenerational homes (common in Latino culture), as they help buffer isolation for at-risk persons. This contributes to marginalized peoples & health & behavioral research by exploring risk & protective factors. It may inform future interventions to improve health and reduce isolation across diverse aging populations, particularly during isolating events.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Research Team

Owner:

  • Sarah Hubner - Graduate Trainee, University of Nebraska, Omaha

Collaborators:

  • Julie Blaskewicz Boron - Mid-career Tenured Researcher, University of Nebraska, Omaha
  • Athena Ramos - Senior Researcher, University of Nebraska Medical Center

COVID-19 Resilience

Scientific Questions Being Studied

The objective of this study is to examine psychosocial aspects (i.e., social isolation, resilience, and loneliness) during the SARS-CoV-2 (COVID-19) pandemic using the de-identified data from the National Institutes of Health (NIH) All of Us Research Program (e.g., the COVID-19 Participant Experience (COPE) Survey). To accomplish this, we will pursue the following aims: 1) Examine psychosocial aspects of participant experience during COVID-19 (i.e., social isolation, resilience, and loneliness) and 2) Explore associations between overall general health, social isolation, resilience, and loneliness during COVID-19.

Project Purpose(s)

  • Population Health
  • Social / Behavioral

Scientific Approaches

All data analysis will be conducted with the NIH All of Us Research “Workbench” per NIH All of Us Research protocol. Researchers will use descriptive and inferential statistics to analyze de-identified All of Us Research participant data. We anticipate potentially using methods such as Latent class analysis. The minimum number of records needed to carry out the study objectives is 500.
De-identified data available from the All of Us Research Program include survey questions from the following surveys: The Basics, Overall Health, Lifestyle, and COVID-19 Participant Experience (COPE).

Anticipated Findings

Our study enables linkage of population-level datasets. We are able to apply big data approaches aimed toward precision public health strategies. Previous research indicates social support and overall well-being are correlated with a person’s resilience. This study seeks to explore psychosocial components (i.e., social isolation and resilience) and overall general health during COVID-19. We anticipate finding that psychosocial and overall health components are associated with resilience during the COVID-19 pandemic.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Robin Austin - Early Career Tenure-track Researcher, University of Minnesota
  • Bhavana Goparaju - Project Personnel, University of Minnesota

Effects of Physical Activity on Health in Aging Latino Populations

Scientific Questions Being Studied

The primary purpose of this study is to investigate the relationships between 1) individual characteristics (e.g., age, gender, etc.), 2) ethnicity, 3) physical activity, and 4) subjective/objective physical health. The data will be used to further understanding of the effects of antecedent characteristics on physical activity and health. Similarly, these data will help inform on the relationships between health and physical activity in aging individuals. Findings will concentrate on the context of ethnicity to contribute to research on aging ethnic minorities and marginalized populations.

The specific questions we aim to elucidate are as follows:
AIM 1. How do individual characteristics affect engagement in physical activity and subjective/objective health; how are these relationships affected by ethnicity?
AIM 2. What are the relationships between physical activity and subjective/objective health; how are these relationships affected by ethnicity?

Project Purpose(s)

  • Social / Behavioral
  • Educational
  • Other Purpose (Findings from this study may contribute to manuscripts for scientific journals and/or conference submissions.)

Scientific Approaches

Dataset development and analyses will utilize All of Us data and will occur in the Researcher Workbench. Data will be pulled from several All of Us datasets, including: The Basics (e.g., demographic, ability, etc.), Overall Health (e.g., quality of life, everyday activities, etc.), Lifestyle (e.g., substance use), Personal Medical History (e.g., cardiovascular history), and Physical Measurements (e.g., height, weight, etc.). Normality tests, regression analyses, and tests of correlation (e.g., chi-square analysis, Pearson correlation, etc.) will be utilized in analyses. Results will be reported in APA style.

Anticipated Findings

The objectives of this study are to inform on the relationships between individual characteristics, physical activity, and health, particularly in aging Latino adults. We anticipate that variables associated with increased cumulative disadvantage (e.g., race, gender, etc.) will disproportionally affect physical activity engagement and health; it is hypothesized that physical activity will positively impact health and vice versa, such that persons with more engagement are healthier. This contributes to ethnic, minority and marginalized peoples research by providing greater context for health and behavioral research by exploring both risk and protective factors. It may also serve to inform future meaningful interventions to improve health and activity in aging diverse populations.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Research Team

Owner:

  • Sarah Hubner - Graduate Trainee, University of Nebraska, Omaha

Collaborators:

  • Julie Blaskewicz Boron - Mid-career Tenured Researcher, University of Nebraska, Omaha
  • Athena Ramos - Senior Researcher, University of Nebraska Medical Center

Duplicate of Cancer

Scientific Questions Being Studied

We intend to explore differences in the prevalence of cancer within the AoU population. In particular, we will be looking at the differences between the entire population, the subset with medical records, and the subset with self-reported data.

Project Purpose(s)

  • Population Health

Scientific Approaches

We intend to select a list of SNOMED codes corresponding to primary cancers to get the subset with cancer in the medical record.

We intend to select the survey question asking about self-reported cancer to get the subset with self-reported cancer.
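The comparison described above can be sketched with plain Python sets. This is only an illustration under stated assumptions: the participant IDs and subset memberships below are hypothetical, and the real subsets would come from SNOMED-coded EHR records and the survey question.

```python
def prevalence(cases, population):
    """Share of the population that appears in the case set."""
    return len(cases & population) / len(population)


# Hypothetical participant IDs (illustration only, not All of Us data).
all_participants = set(range(1, 11))
has_ehr = {1, 2, 3, 4, 5, 6}
has_survey = {4, 5, 6, 7, 8, 9, 10}

# Hypothetical case sets: cancer found via SNOMED codes vs. self-report.
ehr_cancer = {2, 4, 5}
self_reported_cancer = {4, 5, 8, 9}

ehr_prev = prevalence(ehr_cancer, has_ehr)
survey_prev = prevalence(self_reported_cancer, has_survey)
either_prev = prevalence(ehr_cancer | self_reported_cancer, all_participants)

# Participants identified by both sources.
both_sources = ehr_cancer & self_reported_cancer

print(f"EHR subset: {ehr_prev:.2f}")
print(f"Self-report subset: {survey_prev:.2f}")
print(f"Entire population (either source): {either_prev:.2f}")
```

Keeping the denominators separate (participants with EHR data, participants with survey data, everyone) is what makes the three prevalence figures comparable.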

Anticipated Findings

We expect cancer prevalence to differ between self-report and the medical record, which could have implications for how cancer is measured at the population level.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use

Sex Differences in CVD Comorbidities & Risk Factors based on COVID-19 Positivity

Scientific Questions Being Studied

Our study will look at how cardiovascular disease comorbidities, risk factors, and behaviors differ by sex in those who are COVID-19 positive versus negative. This question is important because there is a lack of public knowledge on the association between COVID-19 and cardiovascular health and how this may differ between men and women. Therefore, we are interested in analyzing prevalent cardiovascular comorbidities, risk factors, and behaviors according to COVID-19 test positivity. Since there are some sex differences in COVID-19 outcomes, we wish to examine how comorbidities, risk factors, and behaviors may differ between men and women.

Project Purpose(s)

  • Disease Focused Research (COVID-19, cardiovascular disease, hypertension, diabetes, heart failure)

Scientific Approaches

We will use the cohort builder to find participants who tested positive for COVID-19. Our inclusion criteria will include cardiovascular risk factors, behaviors, and comorbidities. Our exclusion criteria will include people who tested negative for COVID-19. We will then create a second cohort of participants who have tested negative for COVID-19. Our inclusion criteria will remain the same as for the first cohort, but this time the exclusion criteria will include people who tested positive for COVID-19. We will also stratify our cohorts into male versus female. Afterward, we will use the dataset builder to generate concept sets on cardiovascular risk factors (e.g., hypertension, blood pressure, diabetes, obesity), behaviors (diet, smoking, physical activity), and comorbidities (prior history of coronary heart disease, stroke, heart failure, COPD, diabetes). We will then import the data into a Jupyter Notebook and write code in R to analyze our data.

Anticipated Findings

The anticipated findings from this study are that males who tested positive for COVID-19 will have the most cardiovascular risk factors, followed by males who tested negative for COVID-19, females who tested positive for COVID-19, and lastly, females who tested negative for COVID-19. We expect that males who tested positive for COVID-19 will be more likely to have hypertension, diabetes, and less physical activity compared to females. Our findings will be useful to those who have contracted COVID-19 during this pandemic by showing how they may be at risk of developing new cardiovascular conditions or aggravating current ones.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Yufan Gong - Graduate Trainee, University of California, Los Angeles

COVID19 & Anxiety

Scientific Questions Being Studied

As this is for educational purposes, my college partner and I are enrolled in Towson University's Workshop in Biology course (biol 483) taught by Dr. McDougal, and we will be studying the effects of COVID-19 on mental health in specific demographics.

Project Purpose(s)

  • Educational

Scientific Approaches

We plan to use published literature centered on our topic, the effects of COVID-19 on mental health. In addition, we will use the data available to us provided by All of Us. This data includes the COVID-19 Participant Experience (COPE) survey. We will look specifically at questions focusing on anxiety. Participants' answers from the first time the questions were asked will be compared with their answers two weeks later.

Anticipated Findings

We expect to see that anxiety has increased over time during the pandemic. We anticipate seeing differences among various demographics. We hope our findings contribute to understanding how demographic differences influence mental health in a pandemic.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Income Level

Research Team

Owner:

Collaborators:

  • Akeem Laurence - Undergraduate Student, Towson University
  • Adedola Adebamowo - Undergraduate Student, Towson University

Duplicate of How to Get Started with Registered Tier Data

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect? This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Record (EHR), Physical Measurements (PM), and Survey data.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:

1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with major data types: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users that want to do their own cleaning.
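To give a flavor of the kind of SQL query mentioned in the Example Queries step, here is a minimal sketch that runs against a toy in-memory SQLite table. It is not taken from the tutorial notebooks. It assumes the common data model resembles the OMOP `person` table (columns `person_id`, `year_of_birth`, `gender_concept_id`), the rows are invented, and the real CDR is queried through the Researcher Workbench rather than SQLite.

```python
import sqlite3

# Toy stand-in for a demographics table; the layout is an assumption
# modeled on the OMOP `person` table, and every row is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "person_id INTEGER PRIMARY KEY, "
    "year_of_birth INTEGER, "
    "gender_concept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        (1, 1950, 8532),
        (2, 1975, 8507),
        (3, 1988, 8532),
        (4, 1962, 8507),
        (5, 1990, 8532),
    ],
)

# Count unique participants per gender concept and find the earliest
# birth year in each group.
query = """
    SELECT gender_concept_id,
           COUNT(DISTINCT person_id) AS n_participants,
           MIN(year_of_birth) AS earliest_birth_year
    FROM person
    GROUP BY gender_concept_id
    ORDER BY gender_concept_id
"""
rows = conn.execute(query).fetchall()
conn.close()

for gender_concept_id, n_participants, earliest_birth_year in rows:
    print(gender_concept_id, n_participants, earliest_birth_year)
```

The concept_id values here act as the lookup keys described in this workspace: a query filters or groups on them, and a separate concept table would translate them into readable names.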

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, you will understand the following:

All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Md Huda - Research Fellow, University of California, Davis

Diabetes and vaccine

Scientific Questions Being Studied

The study will utilize the All of Us data to test whether the antibody response to the vaccine is modulated by diabetes.

Project Purpose(s)

  • Disease Focused Research (diabetes mellitus)
  • Control Set

Scientific Approaches

We will use different statistical methods and machine learning models to determine the factors contributing to the altered vaccine response in diabetic patients. Additionally, this dataset will enrich our findings in the mouse model.

Anticipated Findings

We anticipate that we will find lower vaccine responses to the common vaccine in diabetic patients.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Md Huda - Research Fellow, University of California, Davis

Duplicate of Hypoglycemia in Non-Diabetics

Scientific Questions Being Studied

Hypoglycemia is a common occurrence in hospitalized patients with diabetes and is associated with adverse clinical outcomes. Numerous prospective and retrospective studies demonstrate an increased risk of cardiovascular events, all-cause hospitalization, longer hospital stay, and all-cause mortality among diabetic patients who have experienced hypoglycemia during inpatient admissions versus those who have not. In those without diabetes, inpatient hypoglycemia may still occur. Studies demonstrate that even in patients without diabetes, hypoglycemia results in poor clinical outcomes with respect to mortality and cognitive function. There is no standard protocol for inpatient blood glucose monitoring of patients without diabetes. A standardized protocol could more closely trend blood glucose values among hospitalized non-diabetic patients who have an elevated risk of hypoglycemia in order to reduce the rate of hypoglycemia and its related complications.

Project Purpose(s)

  • Disease Focused Research (hypoglycemia)
  • Educational

Scientific Approaches

We will conduct a retrospective review of patients who have experienced at least one episode of hypoglycemia (BG < 70 mg/dL) during inpatient hospitalization and identify potential risk factors that may have contributed to such episodes. Examples include end-stage liver disease, renal disease, cardiac disease, and protein-calorie malnutrition. These risk factors will then be used in a multivariate analysis to create a scoring system that assigns specific point values to each risk factor in order to predict the risk of hypoglycemia during admission.
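One way a point-based scoring system of this kind could be derived is sketched below. This is an assumption about the general technique, not the project's method: each regression coefficient is divided by the smallest positive coefficient and rounded to an integer. The coefficient values and the example patient are hypothetical.

```python
def points_from_coefficients(coefficients):
    """Convert regression coefficients to integer points.

    Each coefficient is divided by the smallest positive coefficient and
    rounded; this convention is an assumption made for illustration.
    """
    base = min(c for c in coefficients.values() if c > 0)
    return {name: round(c / base) for name, c in coefficients.items()}


def risk_score(patient, points):
    """Sum the points of every risk factor present in the patient."""
    return sum(points[factor] for factor, present in patient.items() if present)


# Hypothetical coefficients from a multivariate model (not real results).
coefficients = {
    "end_stage_liver_disease": 1.3,
    "renal_disease": 0.8,
    "cardiac_disease": 0.4,
    "protein_calorie_malnutrition": 0.9,
}
points = points_from_coefficients(coefficients)

# Hypothetical patient with two of the four risk factors.
patient = {
    "end_stage_liver_disease": True,
    "renal_disease": False,
    "cardiac_disease": True,
    "protein_calorie_malnutrition": False,
}
score = risk_score(patient, points)

print(points)
print("score:", score)
```

Integer points trade some precision for a score that can be added up at the bedside; the cut-off that defines "elevated risk" would still need to be chosen and validated separately.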

Anticipated Findings

The findings should create a model of risk factors for hypoglycemia among hospitalized non-diabetics. Using this risk model, other researchers may be able to expand the findings to create prospective studies that aim to reduce the risk of developing hypoglycemia in groups with these risk factors.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Bijun Kannadath - Early Career Tenure-track Researcher, University of Arizona

Bipolar Disorder - Health Disparities

Scientific Questions Being Studied

Bipolar disorder (BD) is a psychiatric disorder characterized by recurrent episodes of depression and mania (BP type I) or hypomania (BP type II). BD has a lifetime prevalence of roughly 2.5%, with BP II slightly more common than BP I and females slightly more likely than males to be diagnosed (Kessler et al. 2012). Evidence from family and twin studies has established a genetic basis for BD, with more recent work suggesting that approximately 30% of heritability is due to common genetic variants. Existing GWAS studies of bipolar disorder have identified 30 loci associated with the disease; however, almost all of these studies were performed in individuals of European descent. The dearth of racial and ethnic minority populations in genetic studies is a significant problem that further excludes these populations from medical advances. In the present work, we aim to increase diversity and statistical power in bipolar genetic research.

Project Purpose(s)

  • Disease Focused Research (Bipolar Disorder)
  • Population Health
  • Ancestry

Scientific Approaches

The primary goal for this project is to facilitate genetic association studies (genome-wide) of bipolar disorder generally as well as specific features of bipolar disorder including age-of-onset, extent and features of mania/hypomania and depression, frequency of cycling, and presence of psychotic symptoms. Studies may also investigate response to pharmacotherapies (e.g., lithium) and comorbidities. A primary focus of this effort is to investigate similarities and differences in the genetic and biological underpinnings of bipolar disorder across racial groups and ethnicities. Previous research has suggested racial and ethnic differences in diagnosis, presence of specific features, and treatment. One aim of this effort is to develop the tools necessary for determining whether these observed differences are biological or social in origin. These studies have direct implications for eliminating health disparities in minority populations.

Anticipated Findings

Given the high rates of misdiagnosis in minority populations we anticipate higher levels of genetic heterogeneity when investigating the bipolar phenotype. There are also reports of differential rates of manic and psychotic symptoms, as well as medication use. This may reflect the effects of misdiagnosis or it may represent legitimate differential illness presentation. If the latter, then we would anticipate this to be reflected in GWAS studies. These efforts will help to better understand the role of structural racism and systematic bias in mental health (specifically surrounding bipolar disorder) and to understand the genetic basis of bipolar disorder with regards to racial/ethnic heterogeneity.

Demographic Categories of Interest

  • Race / Ethnicity

Research Team

Owner:

  • Eric Vallender - Mid-career Tenured Researcher, University of Mississippi Medical Center
  • Christina Jordan - Early Career Tenure-track Researcher, University of Mississippi Medical Center

Systemic Disease and Glaucoma

Scientific Questions Being Studied

We have previously published a predictive model of glaucoma progression using electronic health record (EHR) data pertaining to systemic attributes from a single institution. We aim to use the All of Us dataset to 1) serve as external validation for this single-center model and 2) to train new models focused on predicting glaucoma progression using systemic predictors. This is important to understand whether the original findings are generalizable and provide additional knowledge about the utility of systemic predictors on a national-level dataset.

Project Purpose(s)

  • Disease Focused Research (Primary open angle glaucoma)
  • Other Purpose (This work is the result of an All of Us Research Program Demonstration Project. Demonstration Projects are efforts by the All of Us Research Program designed to meet the goal of ensuring the quality and utility of the Research Hub as a resource for accelerating precision medicine. This work has been approved, reviewed, and overseen by the All of Us Research Program Science Committee and Data and Research Center to ensure compliance with program policy.)

Scientific Approaches

We plan to primarily work with EHR data contained in All of Us for a cohort of adult participants diagnosed with primary open-angle glaucoma. We will extract data on systemic conditions and medications for this cohort, as well as physical measurements and vital signs. We will clean the data such that the format is consistent with the data from our previously published model. Then, we will use this data as an external validation of a logistic regression model derived from our prior study that was based at a single academic center. Next, we will use All of Us data to train a new set of models, using techniques such as logistic regression, random forests, and artificial neural networks. We will optimize these models using feature selection methods and class balancing procedures. By evaluating performance metrics such as area under the curve (AUC), precision, recall, and accuracy, we will assess whether we can achieve superior predictive performance when training models using All of Us.
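As an illustration of the evaluation step described above, the following is a minimal sketch of how AUC, precision, recall, and accuracy could be computed for a binary progression classifier. It is not the project's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision_recall_accuracy(y_true, y_pred):
    """Compute precision, recall, and accuracy from hard 0/1 predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = progressed, 0 = did not progress.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.5, 0.1, 0.8, 0.3]  # model-predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"AUC={auc:.4f} precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

For external validation, the same functions would be applied to predictions that the previously fitted model makes on the new cohort, without refitting. AUC does not depend on the threshold, whereas precision, recall, and accuracy do.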

Anticipated Findings

We anticipate that the All of Us data will validate the findings from the model, which demonstrated that blood pressure-related metrics and certain medication classes had predictive value for glaucoma progression. In addition, we anticipate that the models trained with All of Us data will outperform the model trained with single-institution data due to the larger sample size and greater diversity. These findings will support further investigation into the relationship between systemic conditions, such as blood pressure, and glaucoma progression.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Francis Ratsimbazafy - Other, All of Us Program Operational Use
  • Nghia Nguyen - Research Fellow, University of California, San Diego
  • Joshua Morriss - Graduate Trainee, Virginia Commonwealth University

Real Data

Scientific Questions Being Studied

To find out whether there is any relationship between LDL, blood pressure, and vasculitis. This could help medical professionals diagnose vasculitis and assess its severity more easily and earlier.

Project Purpose(s)

  • Educational

Scientific Approaches

We will use data analysis methods such as regression analyses, linear graphs, box plots, scatter plots, pie charts, and possibly more. Different variables will be compared to see whether there is an increased incidence in a certain group. R programming will be very useful for analyzing the data because it is an efficient way to look at data.

Anticipated Findings

We expect to find that high blood pressure and LDL have a positive relationship with vasculitis. This could make diagnosis easier and faster. It could also lead to other studies that further advance the field.

Demographic Categories of Interest

  • Age

Research Team

Owner:

Collaborators:

  • Kathryn McDougal - Other, Towson University
  • Abby Wennick - Undergraduate Student, Towson University
  • Akeem Laurence - Undergraduate Student, Towson University

TW & GK

Scientific Questions Being Studied

My research partner and I will be exploring the effects that the COVID-19 lockdown had on physical activity and whether there are any possible correlations with mental health decline and other health-related issues. We wanted to explore this data because we feel it is important to maintain a healthy lifestyle as best as possible, but we know that not everyone may have the necessary resources to do so. We plan on looking at demographics and wearable data to help determine the health of participants before, during, and hopefully after the lockdown, and we may be able to find a way to encourage people to get outside for just a few minutes a day.

Project Purpose(s)

  • Educational

Scientific Approaches

We plan on looking at the wearable Fitbit data, COVID surveys, and some of the basic surveys as well.

Anticipated Findings

We anticipate finding that the pandemic lockdown caused a decline in physical activity, which resulted in an increase in diseases associated with a lack of physical activity.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

Collaborators:

  • Teresa Watkins - Undergraduate Student, Towson University
  • Njambi Kiguru - Undergraduate Student, Towson University
  • Kathryn McDougal - Other, Towson University
  • Akeem Laurence - Undergraduate Student, Towson University

Duplicate of Duplicate of Demo - PheWAS Smoking

Scientific Questions Being Studied

As a demonstration project, this study will present the results of Phenome-Wide Association Studies (PheWAS) to show how the various sources of data contained within the All of Us research dataset can be used to inform scientific discovery. We will perform separate PheWAS with smoking status as the independent variable. Specific questions include:

1. How can one implement a PheWAS within the All of Us Researcher Workbench?
2. How can one use heterogeneous data sources within the All of Us dataset to explore disease associations using self-reported exposures (Participant Provided Information, or “PPI”) and exposures captured in the electronic health record (EHR)?

Project Purpose(s)

  • Methods Development
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

As a method for assessing the health burden of smoking on potential observed phenotypes, we implement a Phenome-Wide Association Study (PheWAS). A PheWAS consists of an array of association tests over an indexed representation of the human phenome. In this analysis, we will conduct PheWAS for EHR-derived and PPI-derived smoking exposures included in the All of Us research dataset. We will represent “Smoking Exposure” in three ways:

  • EHR smoking ICD billing codes
  • Participant Provided Information (PPI): lifetime smoking of 100 cigarettes (yes/no)
  • Participant Provided Information (PPI): lifetime smoking every day

To perform PheWAS, we will map ICD representations of disease to a common vocabulary of PheCodes. We will then use Jupyter Notebooks to create reusable functions to perform PheWAS and generate Manhattan plots to summarize associations.
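The "array of association tests" idea can be sketched as follows. This is a simplified illustration, not the project's PheWAS package: it swaps in a Pearson chi-square test on a 2x2 table for each code (a real PheWAS would typically fit a covariate-adjusted regression per code), applies a Bonferroni threshold, and reports the -log10(p) value that a Manhattan plot would display. The function names, the `code_A`/`code_B` labels, and the toy cohort are hypothetical.

```python
import math


def chi2_p_value(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for a 2x2 table.

    a: exposed with phenotype,   b: exposed without phenotype
    c: unexposed with phenotype, d: unexposed without phenotype
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # an empty margin means no association can be tested
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def phewas(exposed, phecodes_by_person, alpha=0.05):
    """Run one association test per code and flag Bonferroni-significant hits.

    exposed: dict person_id -> bool (e.g. ever-smoker)
    phecodes_by_person: dict person_id -> set of code strings
    """
    all_codes = sorted(set().union(*phecodes_by_person.values()))
    if not all_codes:
        return []
    threshold = alpha / len(all_codes)  # Bonferroni correction
    results = []
    for code in all_codes:
        a = b = c = d = 0
        for person, is_exposed in exposed.items():
            has_code = code in phecodes_by_person.get(person, set())
            if is_exposed and has_code:
                a += 1
            elif is_exposed:
                b += 1
            elif has_code:
                c += 1
            else:
                d += 1
        p = chi2_p_value(a, b, c, d)
        results.append({
            "phecode": code,
            # Adding 0.5 to each cell keeps the odds ratio finite when a cell is zero.
            "odds_ratio": ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)),
            "p": p,
            "neg_log10_p": -math.log10(max(p, 1e-300)),  # Manhattan plot y-value
            "significant": p < threshold,
        })
    return results


# Hypothetical toy cohort: persons 0-9 are exposed, persons 10-19 are not.
exposed = {pid: pid < 10 for pid in range(20)}
phecodes_by_person = {pid: set() for pid in range(20)}
for pid in [0, 1, 2, 3, 4, 5, 6, 7, 10]:
    phecodes_by_person[pid].add("code_A")  # enriched among the exposed
for pid in [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]:
    phecodes_by_person[pid].add("code_B")  # equally common in both groups

results = phewas(exposed, phecodes_by_person)
for row in results:
    print(row["phecode"], round(row["odds_ratio"], 2), row["significant"])
```

With cell counts this small the chi-square approximation is rough; the sketch is meant only to show the loop-over-codes structure and the multiple-testing correction.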

Anticipated Findings

For this study, we anticipate that we will be able to replicate known disease associations with smoking exposure. This will serve to demonstrate the quality, utility, and diversity of the All of Us data and tools and the power of gathering multiple data sources for a single phenotype, providing researchers options for study design and validation. Importantly, the entire PheWAS package is made available for reuse by researchers in the Workbench for new hypothesis generation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Jie Chen - Late Career Tenured Researcher, Augusta University

Duplicate of Phenotype - Type 2 Diabetes

Scientific Questions Being Studied

The Notebooks in this Workspace can be used to implement well-known phenotype algorithms in one’s own research.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Phenotype Library Workspace created by the Researcher Workbench Support team. It is meant to demonstrate the implementation of key phenotype algorithms within the All of Us Research Program cohort.)

Scientific Approaches

Not Applicable

Anticipated Findings

By reading and running the Notebooks in this Phenotype Library Workspace, researchers can implement the following phenotype algorithms:

Jennifer Pacheco and Will Thompson. Northwestern University. Type 2 Diabetes Mellitus. PheKB; 2012. Available from: https://phekb.org/phenotype/18

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Jie Chen - Late Career Tenured Researcher, Augusta University

Duplicate of Hypothesis generation

Scientific Questions Being Studied

The main focus is to study patterns and trends in individuals taking complementary medicines and interventions, as well as associations of these interventions with multi-health system outcomes. We hope to use study findings to generate hypotheses for planning of clinical trials in whole-person health research.
Specific questions include:
1) What are the patterns and trends in complementary medicines such as yoga, acupuncture, mindfulness interventions etc.? What are the participants’ demographic characteristics and health/comorbidity conditions associated with combinations of interventions?
2) What are the associations between complementary medicines and behavioral interventions and health outcomes? Specifically, are there certain combinations of interventions associated with great improvement of multi-health system outcomes?
3) Are there latent traits in individuals associated with great multi-health system outcomes related to complementary medicines and interventions?

Project Purpose(s)

  • Population Health
  • Methods Development

Scientific Approaches

We plan to apply statistical methods for longitudinal data and machine learning to identify patterns, trends and associations. R will be used for all analyses. Specifically, we will:
1) Construct datasets with individuals that have measures of complementary medicines and behavioral interventions, and/or have provided answers to survey questions related to those interventions.
2) Use growth mixture modeling to identify sub-populations, to describe longitudinal change within each sub-population, and to examine differences in trends with respect to use of those interventions over time.
3) Use structural equation modeling to link identified sub-populations and trends to individual level health-related outcomes and study the associations.
4) Also use clustering techniques to explore cross-sectional data to identify combinations of individual characteristics associated with great improvement of multi-health system outcomes related to complementary medicines and behavioral interventions.

Anticipated Findings

From this study we expect to:
1) Identify patterns and trends in the use of complementary medicines and behavioral interventions.
2) Build constructs that characterize underlying traits of whole-person health.
3) Identify traits in individuals associated with great improvement of multi-health system outcomes related to complementary medicines and behavioral interventions.
Study findings will provide important information in understanding the causal relationship between complementary interventions and health outcomes, and will be used to generate hypotheses of effectiveness of a combination of behavioral and other interventions on whole person health, which can be tested in future clinical trials.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care
  • Income Level

Research Team

Owner:

TOS Abstract

Scientific Questions Being Studied

We are interested in assessing disease risk scores in the AOU dataset using survey, physical measures, EHR, and Fitbit data. We will use unsupervised and supervised ML to see whether clusters have the potential to improve on current standards such as ACSVD, DRF, and others.

Project Purpose(s)

  • Methods Development

Scientific Approaches

We will build datasets similar to the current local NYC population, based on 2019 census data, and use unsupervised (k-means, kNN) and supervised (regression) machine learning to determine whether AOU data, in the framework of the ACSVD and DRF risk factor questionnaires, perform better with different variables of interest.
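To make the clustering step concrete, here is a minimal k-means sketch in plain Python. It is an illustration under stated assumptions rather than the project's method: the two features (systolic blood pressure and BMI), the toy values, and the deterministic "first k points" initialization are all hypothetical choices made so the example is reproducible.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points with Lloyd's algorithm.

    Initialization uses the first k points (an assumption for reproducibility;
    random or k-means++ seeding is more common in practice).
    Returns (labels, centroids).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical toy data: (systolic blood pressure in mmHg, BMI in kg/m^2).
points = [
    (120.0, 25.0), (122.0, 27.0), (118.0, 24.0),
    (160.0, 35.0), (158.0, 33.0), (162.0, 36.0),
]

labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

In a real analysis the features would usually be standardized first, since Euclidean distance otherwise lets the variable with the largest numeric range dominate the clusters.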

Anticipated Findings

We anticipate that we will be able to replicate or improve risk factor assessments; risk factors may change with subgroups.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

Akeem Demo Space

Scientific Questions Being Studied

This workspace will be used to prepare instructor content and analysis protocols for a course-based research laboratory class supported by the Towson University Research Enhancement Program. The purpose of this course is for students to gain experience developing a research question in human health and then designing and implementing an analysis of publicly available data to answer it. The student research projects will focus on medical health and public health topics. In addition to learning skills important in medical and epidemiological research, students will be able to ask questions that could lead to better understanding of and treatment for diseases in traditionally under-served populations.

Project Purpose(s)

  • Educational

Scientific Approaches

Data analysis will be run in an NIH-approved "Researcher Workbench" platform using Jupyter Notebook and R. The questions students will ask will be dependent on what data All of Us has available to researchers at the time of the course. These data will include health data, physical measurement data, biospecimen-related data, and genomic data.

Anticipated Findings

In addition to learning skills important in medical and epidemiological research, students will be able to ask questions that could lead to better understanding of and treatment for diseases in traditionally under-served populations. We also hope this course will encourage undergraduate students to consider careers in medical research.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

COPE survey analysis

Scientific Questions Being Studied

We are interested in using the COPE survey questions and answers from respondents to see how differences may occur by geographic location and demographic information. Specifically, we are interested in social distancing and mental health questions.

Project Purpose(s)

  • Disease Focused Research (COVID-19)

Scientific Approaches

We plan to use the COPE survey data, linked patient information, geographic information for each site, (possibly) medication data, and condition occurrence data. We plan to use phenome-wide association studies (PheWAS) to determine likely phenotypes associated with outcomes of interest, such as social distancing measures.

Anticipated Findings

We anticipate finding that those with debilitating diseases are more concerned with social distancing; however, we are unsure which diseases will have the greater association. We also suspect that measures of depression and loss due to COVID-19 will be highly associated with more social distancing. We propose that these PheWAS analyses and investigations into differences in social distancing and mental health will contribute substantially to the study of COVID-19 and its sociological and behavioral health effects.

Demographic Categories of Interest

  • Race / Ethnicity

Research Team

Owner:

Systemic Disease and Glaucoma (Cloned)

Scientific Questions Being Studied

We have previously published a predictive model of glaucoma progression using electronic health record (EHR) data pertaining to systemic attributes from a single institution. We aim to use the All of Us dataset to 1) externally validate this single-center model and 2) train new models focused on predicting glaucoma progression using systemic predictors. This is important for understanding whether the original findings are generalizable, and it will provide additional knowledge about the utility of systemic predictors in a national-level dataset.

Project Purpose(s)

  • Disease Focused Research (Primary open angle glaucoma)
  • Other Purpose (This work is the result of an All of Us Research Program Demonstration Project. Demonstration Projects are efforts by the All of Us Research Program designed to meet the goal of ensuring the quality and utility of the Research Hub as a resource for accelerating precision medicine. This work has been approved, reviewed, and overseen by the All of Us Research Program Science Committee and Data and Research Center to ensure compliance with program policy.)

Scientific Approaches

We plan to primarily work with EHR data contained in All of Us for a cohort of adult participants diagnosed with primary open-angle glaucoma. We will extract data on systemic conditions and medications for this cohort, as well as physical measurements and vital signs. We will clean the data such that the format is consistent with the data from our previously published model. Then, we will use this data as an external validation of a logistic regression model derived from our prior study that was based at a single academic center. Next, we will use All of Us data to train a new set of models, using techniques such as logistic regression, random forests, and artificial neural networks. We will optimize these models using feature selection methods and class balancing procedures. By evaluating performance metrics such as area under the curve (AUC), precision, recall, and accuracy, we will assess whether we can achieve superior predictive performance when training models using All of Us.

Anticipated Findings

We anticipate that the All of Us data will validate the findings from the model, which demonstrated that blood pressure-related metrics and certain medication classes had predictive value for glaucoma progression. In addition, we anticipate that the models trained with All of Us data will outperform the model trained with single-institution data due to the larger sample size and greater diversity. These findings will support further investigation into the relationship between systemic conditions, such as blood pressure, and glaucoma progression.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Learning_ALT

Scientific Questions Being Studied

Obesity is one of the most important risk factors for many diseases in the United States and across the world. Differences in body weight and shape across gender and race/ethnicity have been extensively described. We sought to replicate these differences and evaluate newly emerging data from the All of Us Research Program (AoU). In this project, we ask the scientific question: How do individuals of different genders and different racial/ethnic groups in the All of Us dataset differ with respect to weight, waist and hip circumferences, cholesterol levels, and levels of alanine aminotransferase (ALT)?

Project Purpose(s)

  • Disease Focused Research (Obesity)

Scientific Approaches

Within each ethnic/racial group and each gender group, we first visually examine histograms of each outcome variable to determine the presence of any major outliers that may represent measurement errors. We then tabulate the mean values and other descriptive statistics for continuous variables such as waist and hip circumferences. We also determine the proportion of individuals with abdominal obesity. To formally test for differences among groups and to adjust for age and other covariates, we will use linear regression, transforming variables to conform to the assumptions of linear regression. Data for race and ethnicity were obtained from participant-provided information (PPI). Biological sex at birth, height, weight, waist circumference (WC), and hip circumference measurements were obtained according to AoU baseline visit protocols. Levels of alanine aminotransferase (ALT) were obtained from the EHRs of participants.
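The tabulation step described above could look roughly like the sketch below, which groups records by race/ethnicity and sex at birth and reports the mean, standard deviation, and proportion with abdominal obesity. The source does not say which definition of abdominal obesity was used; the waist circumference cutoffs here (above 102 cm for males, above 88 cm for females) are one widely used convention and are an assumption of this example, as are the field names, the "Group A"/"Group B" labels, and the toy values.

```python
from collections import defaultdict
from statistics import mean, stdev

# Assumed cutoffs (cm); the project description does not specify a definition.
WAIST_CUTOFF_CM = {"Male": 102.0, "Female": 88.0}


def summarize_waist(records):
    """Group records by (race_ethnicity, sex_at_birth) and summarize waist size."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["race_ethnicity"], rec["sex_at_birth"])].append(rec["waist_cm"])

    summary = {}
    for (race_ethnicity, sex), waists in groups.items():
        cutoff = WAIST_CUTOFF_CM.get(sex)  # None if no cutoff is defined for this value
        summary[(race_ethnicity, sex)] = {
            "n": len(waists),
            "mean_waist_cm": mean(waists),
            "sd_waist_cm": stdev(waists) if len(waists) > 1 else 0.0,
            "prop_abdominal_obesity": (
                sum(w > cutoff for w in waists) / len(waists)
                if cutoff is not None else None
            ),
        }
    return summary


# Hypothetical toy records; group labels are placeholders, not real categories.
records = [
    {"race_ethnicity": "Group A", "sex_at_birth": "Female", "waist_cm": 80.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "Female", "waist_cm": 90.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "Female", "waist_cm": 100.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "Male", "waist_cm": 95.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "Male", "waist_cm": 105.0},
    {"race_ethnicity": "Group B", "sex_at_birth": "Female", "waist_cm": 85.0},
    {"race_ethnicity": "Group B", "sex_at_birth": "Female", "waist_cm": 87.0},
]

summary = summarize_waist(records)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

The same grouping could be reused for hip circumference or ALT by changing the field that is collected; the covariate-adjusted comparison mentioned in the text would still require a regression model.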

Anticipated Findings

For this study, we anticipate that we will be able to replicate known differences in body weight and shape across gender and race/ethnicity. We anticipate that we will find racial/ethnic and gender disparities related to ALT, a surrogate marker of hepatic steatosis. We anticipate the ability to evaluate the consistency of the All of Us cohort with national averages related to obesity and indicate that this resource is likely to be a major source of scientific inquiry and discovery. This project will serve to demonstrate the quality, utility, and diversity of the All of Us data and tools and the power of gathering multiple data sources for a single set of phenotypes, providing researchers options for study design and validation.

Demographic Categories of Interest

  • Race / Ethnicity
  • Sex at Birth

Research Team

Owner:

PGHD in Heart Failure Self-Care

Scientific Questions Being Studied

Data created through personal devices can be immense and difficult to sift through. However, digital phenotyping (the quantification of the individual phenotype through the use of personal digital devices), in combination with machine learning, has the ability to make sense of the data and identify risks for decompensation, transforming and personalizing care (Vaidyam, Halamka, & Torous). This concept is promising; however, the area has not been fully developed, remaining more of a research topic than a clinical tool (Vaidyam, Halamka, & Torous). Utilization of patient-generated health data (PGHD) can increase self-care maintenance and adherence to treatment regimens (Cajita et al.; Kiyarosta et al.; Son et al.). Gaining a better understanding of the data, or digital phenotype, will provide more insight into self-management behaviors and improved outcomes. The research question identified is: How does the use of PGHD, such as data from activity trackers, in heart failure patients predict utilization of healthcare services?

Project Purpose(s)

  • Disease Focused Research (congestive heart failure)
  • Population Health

Scientific Approaches

This is a developing dissertation plan:
Initially, the idea is to conduct a quantitative analysis of Fitbit data and EHR data to develop a digital phenotype of the HF patient, evaluating whether their Fitbit use interacts with health care utilization.

Anticipated Findings

Identifying a digital phenotype from the data will create a better understanding of health behaviors to achieve personalized care. Integrating personal devices to track health information directly links to precision health interventions. Precision health is the future of chronic disease management, and patient-generated health data will have an important role in this development.

The results will inform my dissertation.

Demographic Categories of Interest

  • Age

Research Team

Owner:

Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.