Research Projects Directory

At this time, all listed projects are using data in the registered tier. The registered tier contains individual-level data from electronic health records, survey answers, and physical measurements. These data have been altered to protect participant privacy.

Note: Researcher Workbench users provide information about their research projects independently. Any views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program.

Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

There are currently 73 active workspaces. This information was updated on 7/2/2020.

D014 - Opioids

Project Purpose(s)

  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.) ...

Scientific Questions Being Studied

As a demonstration project, this study will report the prevalence of opioid use in the United States. Specific questions include:

1. What is the prevalence of prescription opioids received from healthcare systems?
2. What is the prevalence of opioid misuse, including nonmedical prescription opioid use and street opioid use?
3. Results for both preceding questions will also be stratified by geographic region.

Scientific Approaches

We will estimate the prevalence of opioid use in two ways, each stratified by state.
First, we will use EHR Drug Exposures to capture prescription opioid use.
Second, we will use the lifestyle survey questionnaire to capture substance use reported by patients themselves:
1. In your LIFETIME, which of the following substances have you ever used?
2. In the PAST THREE MONTHS, how often have you used this substance?
Because prevalence will be stratified by state, the EHR Observation Table will be used to capture each participant's state.
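The state-level stratification described above can be illustrated with a minimal sketch. The rows, state codes, and the `prevalence_by_state` helper are all hypothetical and made up for illustration; they are not the project's actual tables, queries, or results.

```python
from collections import defaultdict

# Hypothetical, made-up rows: (participant_id, state, has_opioid_exposure).
# In practice the flag would come from EHR drug exposures or survey answers.
participants = [
    (1, "MI", True),
    (2, "MI", False),
    (3, "MI", False),
    (4, "OH", True),
    (5, "OH", True),
    (6, "OH", False),
    (7, "OH", False),
]


def prevalence_by_state(rows):
    """Return {state: share of participants in that state flagged as exposed}."""
    exposed = defaultdict(int)
    total = defaultdict(int)
    for _pid, state, flag in rows:
        total[state] += 1
        if flag:
            exposed[state] += 1
    return {state: exposed[state] / total[state] for state in total}


prevalence = prevalence_by_state(participants)
for state, share in sorted(prevalence.items()):
    print(f"{state}: {share:.1%}")
```

The same helper could be run once with an EHR-derived flag and once with a survey-derived flag to compare the two sources side by side.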

Anticipated Findings

For this study, we anticipate that we will be able to replicate previous national studies that estimated the prevalence of opioid use. All of Us workbench research data also provide an alternative tool for assessing the prevalence of substance use and prescription opioid use in the US population.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Hsueh-Han Yeh - Research Associate, Henry Ford Health System

D027-MS

Project Purpose(s)

  • Disease Focused Research (multiple sclerosis)
  • Other Purpose (Provide evidence of AOU ability to replicate findings on the prevalence and demographics of MS) ...

Scientific Questions Being Studied

Objective: Determine the prevalence, demographics, and regional distribution of multiple sclerosis (MS) in the All of Us Research Program.

Scientific Approaches

Study population: All of Us Research Program participants who have given access to their electronic health record information and who have answered both the Basics survey and the Personal Medical History survey.

Data analysis: We will determine the prevalence of multiple sclerosis in the All of Us Research Program electronic medical record data and Personal Medical History survey across three cohorts: participants identified through the EHR only, through the survey only, and through both the EHR and the survey. These data will then be stratified by age, sex, race/ethnicity, and region as self-reported in the Basics PPI survey.
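As a rough illustration of the three-cohort split (EHR only, survey only, both), the sketch below uses set operations on participant IDs. The IDs, the denominator `eligible_n`, and the `split_cohorts` helper are hypothetical placeholders, not program data or the team's actual code.

```python
# Hypothetical participant IDs with an MS record in each source.
ehr_ms = {101, 102, 103, 104}
survey_ms = {103, 104, 105}


def split_cohorts(ehr_ids, survey_ids):
    """Partition identified participants by which source(s) flagged them."""
    return {
        "ehr_only": ehr_ids - survey_ids,
        "survey_only": survey_ids - ehr_ids,
        "both": ehr_ids & survey_ids,
    }


cohorts = split_cohorts(ehr_ms, survey_ms)

# Hypothetical denominator: participants with EHR access and both surveys.
eligible_n = 1000
prevalence = {name: len(ids) / eligible_n for name, ids in cohorts.items()}
```

Because the three sets are disjoint, their sizes add up to the number of participants flagged by at least one source.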

Anticipated Findings

We anticipate that the prevalence and demographics of MS in the AoURP will be similar to those reported in recent studies. We further anticipate that findings regarding MS in AoURP participants' EHR will be similar to those in the survey data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Cathryn Peltz - Other, Henry Ford Health System

Collaborators:

  • Amy Tang

D029

Project Purpose(s)

  • Disease Focused Research (cardiovascular disease, cancer (all types), diabetes)
  • Population Health ...

Scientific Questions Being Studied

The overall goal of this project is to examine whether there is evidence of the Latino Epidemiological Paradox within the All of Us Research Project (AoURP) cohort. In this proposal, we will perform analyses to examine this phenomenon. We will address the following aims:
• Specific Aim 1. To determine whether Latinos have lower prevalence of gender stratified age-adjusted CVD versus NHWs and non-Hispanic blacks in the cohort.
• Specific Aim 2. To determine whether Latinos have lower prevalence of gender stratified age-adjusted cancer (overall) versus NHWs and non-Hispanic blacks in the cohort.
• Specific Aim 3. To determine whether Latinos have higher prevalence of gender stratified age-adjusted diabetes and obesity (overall) versus NHWs and non-Hispanic blacks in the cohort.
• Specific Aim 4. To the extent possible, examine differences by Latino subgroups and among foreign born versus US born Latinos.

Scientific Approaches

Not available.

Anticipated Findings

We aim to determine whether there is evidence of the Latino epidemiological paradox in the AoURP cohort.

Demographic Categories of Interest

Not available.

Research Team

Owner:

  • Olveen Carrasquillo - Late Career Tenured Researcher, University of Miami

DataExploration

Project Purpose(s)

  • Social / Behavioral
  • Educational ...
  • Methods Development

Scientific Questions Being Studied

Explore the data collected so far and determine the types of research and education activities we can perform in future work.

Scientific Approaches

Descriptive statistics will be calculated to understand the data. In certain cases, we will also use data visualization. We will use Python and R packages for the data analysis.

Anticipated Findings

A clear understanding about the current data set.

Demographic Categories of Interest

  • Age
  • Geography
  • Disability Status
  • Access to Care

Research Team

Owner:

  • Leming Zhou - Project Personnel, University of Pittsburgh

Demographics of Mammography 2020_04

Project Purpose(s)

  • Population Health
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use) ...

Scientific Questions Being Studied

Mammography is an effective screening tool for breast cancer, often identifying tumors that can be treated before they develop invasive potential. Across the United States, it is estimated that 65% of women aged 40 and above have received a screening mammogram. However, smaller studies using data from electronic health records suggest that (1) the actual screening rate may be lower and (2) mammography screening differs by racial, ethnic, and sociodemographic characteristics, and lower rates of mammography screening may contribute to disparities in breast cancer mortality.

In this demonstration project, we will describe the distribution of mammography screening captured by the submitted electronic health records in the large and diverse participant sample of the All of Us Research Program. Further, we will describe the participant characteristics that are associated with mammography rates in women during the ages in which national guidelines suggest routine screening.

Scientific Approaches

After limiting ourselves to All of Us research participants with electronic health record information, we will identify rates of mammography screening using the procedure and diagnosis tables. Using the participant provided information from the surveys, we will use logistic regression to identify participant characteristics that are associated with higher or lower rates of screening.
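To make the logistic regression step concrete, here is a minimal sketch that fits a model with an intercept and a single binary predictor by Newton-Raphson, using only the standard library. The counts, the meaning of the predictor, and the `fit_logistic_one_predictor` helper are hypothetical; a real analysis would include many covariates and would more likely rely on an established statistics package.

```python
import math


def fit_logistic_one_predictor(rows, iterations=25):
    """Fit logit(P(y=1)) = b0 + b1*x by Newton-Raphson.

    rows: sequence of (x, y) pairs with y in {0, 1}.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up counts. x = 1 marks a hypothetical participant characteristic,
# y = 1 means a screening mammogram appears in the record.
rows = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 20 + [(0, 0)] * 20
b0, b1 = fit_logistic_one_predictor(rows)
odds_ratio = math.exp(b1)
```

With one binary predictor the fitted odds ratio matches the 2x2 table value, here (30/10) / (20/20) = 3, which makes the sketch easy to check by hand.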

Anticipated Findings

Some prior research has attempted to validate self-reported mammography screening against electronic health record verification of the screening. Largely, this research has found that (1) mammography rates are likely lower than self-report suggests and (2) certain patient characteristics are associated with lower rates of screening.

We anticipate that these findings will largely hold in the All of Us study population, and that the diversity of the All of Us participants will allow us to better identify those who may need more assistance to achieve the recommended screening frequency.

Demographic Categories of Interest

  • Race / Ethnicity
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

  • Molly Scannell Bryan - Early Career Tenure-track Researcher, University of Illinois at Chicago

Distress and T2D

Project Purpose(s)

  • Disease Focused Research (type 2 diabetes mellitus) ...

Scientific Questions Being Studied

Depression, anxiety, and other forms of mental distress are frequently co-morbid with type 2 diabetes. Are there common risk factors between the two?

Scientific Approaches

Comparison of the demographics and medications of individuals diagnosed with type 2 diabetes with and without mental distress.

Anticipated Findings

If we know that someone is at risk for mental distress, we might be able to provide increased support to mitigate the effects.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Sara Taylor - Research Fellow, Massachusetts Institute of Technology

Diversity within Eating Disorders

Project Purpose(s)

  • Population Health ...

Scientific Questions Being Studied

We will explore sociodemographic variables (e.g., gender, sexual orientation, race/ethnicity) in relation to eating disorder diagnoses. We specifically are interested in disparities in the occurrence of eating disorder diagnoses, access to treatment, age of initial diagnosis, and associated distress and impairment. That is, are some sociodemographic groups more likely to receive a diagnosis of eating disorders, receive care, have varying age of initial diagnosis, and/or experience disproportionate distress/impairment, than other sociodemographic groups? These research questions are important, as there is limited research exploring intersecting identities within eating disorders.

Scientific Approaches

Within the All of Us dataset V3, we will extract sociodemographic variables (gender, sexual orientation, race/ethnicity), eating disorder diagnosis and treatment, and items from the 'Overall Health' survey to assess distress/impairment and quality of life. Initially, descriptive statistics will be used to report the frequencies of eating disorder diagnoses and treatment as a function of the aforementioned sociodemographic variables. Should there be adequate statistical power, logistic regression models will be employed with sociodemographic variables set as 'predictors' of binary eating disorder 'outcomes.' Additionally, within individuals diagnosed with an eating disorder, we will examine sociodemographic differences in distress/impairment and quality of life via linear regression. Effect size estimates will also be reported. Should statistical power allow, interactions between sociodemographic variables will also be tested.

Anticipated Findings

There is limited research on intersecting identities among individuals diagnosed with eating disorders. By employing the All of Us dataset, we may be able to identify health disparities in the occurrence of eating disorders and/or associated distress/impairment and quality of life. Results may help guide additional research efforts into understanding the mechanisms which may place some populations at disproportionate risk, which subsequently could lead to refined and tailored eating disorder prevention and treatment approaches.

Demographic Categories of Interest

  • Race / Ethnicity
  • Gender Identity
  • Sexual Orientation
  • Access to Care

Research Team

Owner:

  • Aaron Blashill - Mid-career Tenured Researcher, San Diego State University

Collaborators:

  • Jonathan Helm - Early Career Tenure-track Researcher, San Diego State University

Duplicate of How to Work with All of Us Physical Measurements Data

Project Purpose(s)

  • Educational
  • Methods Development ...

Scientific Questions Being Studied

How to navigate around physical measurements?

Scientific Approaches

N/A

Anticipated Findings

N/A

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Ozan Dikilitas - Research Fellow, Mayo Clinic

Duplicate of How to Work with All of Us Physical Measurements Data

Project Purpose(s)

  • Educational
  • Methods Development ...

Scientific Questions Being Studied

How to navigate around physical measurements?

Scientific Approaches

N/A

Anticipated Findings

N/A

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Harry Hochheiser - Mid-career Tenured Researcher, University of Pittsburgh

Duplicate of Phenotype - Depression

Project Purpose(s)

  • Disease Focused Research (Major Depression Disorder)
  • Educational ...
  • Methods Development
  • Other Purpose (This is an All of Us Phenotype Library Workspace created by the Researcher Workbench Support team. It is meant to demonstrate the implementation of key phenotype algorithms within the All of Us Research Program cohort.)

Scientific Questions Being Studied

The Notebooks in this Workspace can be used to implement well-known phenotype algorithms of depression in one’s own research.

Scientific Approaches

Not Applicable

Anticipated Findings

By reading and running the Notebooks in this Phenotype Library Workspace, researchers can implement the following phenotype algorithms:

This Workspace contains an implementation of a phenotype algorithm for depression: This algorithm was obtained from the eMERGE network. Citation: TBA. KPWA/UW. Depression. PheKB; 2018 Available from: https://phekb.org/phenotype/1095

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Sara Taylor - Research Fellow, Massachusetts Institute of Technology

Duplicate of Systemic Disease and Glaucoma

Project Purpose(s)

  • Disease Focused Research (primary open angle glaucoma)
  • Other Purpose (This work is the result of an All of Us Research Program Demonstration Project. Demonstration Projects are efforts by the All of Us Research Program designed to meet the goal of ensuring the quality and utility of the Research Hub as a resource for accelerating precision medicine. This work has been approved, reviewed, and overseen by the All of Us Research Program Science Committee and Data and Research Center to ensure compliance with program policy. ) ...

Scientific Questions Being Studied

We have previously published a predictive model of glaucoma progression using electronic health record (EHR) data pertaining to systemic attributes from a single institution. We aim to use the All of Us dataset to 1) serve as external validation for this single-center model and 2) to train new models focused on predicting glaucoma progression using systemic predictors. This is important to understand whether the original findings are generalizable and provide additional knowledge about the utility of systemic predictors on a national-level dataset.

Scientific Approaches

We will develop predictive models using the All of Us dataset using multivariable logistic regression, random forests, and artificial neural networks.

Anticipated Findings

We anticipate that the All of Us data will validate the findings from the model, which demonstrated that blood pressure-related metrics and certain medication classes had predictive value for glaucoma progression. In addition, we anticipate that the models trained with All of Us data will outperform the model trained with single institution data due to larger sample size and greater diversity.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Sally Baxter - Research Fellow, University of California, San Diego

Collaborators:

  • Tsung-Ting Kuo - Early Career Tenure-track Researcher, University of California, San Diego
  • Roxana Loperena Cortes - Other, All of Us Program Operational Use
  • Paulina Paul
  • Lucila Ohno-Machado
  • Luca Bonomi
  • Katherine Kim - Early Career Tenure-track Researcher, University of California, Davis
  • Jihoon Kim - Other, UCSD

Evidence of the Latino Epidemiologic Paradox in the All of Us Research Project

Project Purpose(s)

  • Disease Focused Research (Cardiovascular disease)
  • Population Health ...
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Questions Being Studied

The overall goal of this project is to examine whether there is evidence of the Latino Epidemiological Paradox within the All of Us Research Project (AoURP) cohort. The specific aims are:

Specific Aim 1
To determine whether Latinos have lower prevalence of gender stratified age-adjusted CVD versus NHWs and non-Hispanic blacks in the cohort.

Specific Aim 2
To determine whether Latinos have lower prevalence of gender stratified age-adjusted cancer (overall) versus NHWs and non-Hispanic blacks in the cohort.

Specific Aim 3
To determine whether Latinos have higher prevalence of gender stratified age-adjusted diabetes and obesity (overall) versus NHWs and non-Hispanic blacks in the cohort.

Specific Aim 4
To the extent possible, examine differences by Latino subgroups and among foreign born versus US born Latinos.

Scientific Approaches

Study population. All of Us Research Project core participants. We will examine data from different data sources including electronic health records (EHR) and participant provided information (PPI) and physical measurements.

Main outcome variables: We will work with the DRC Research Support Team to obtain support for their existing classification scheme for common complex diseases, which in this project would include cardiovascular disease, cancer (including subtypes to the extent possible), and diabetes (type 2). To define diseases, we will use EHR data to preserve objective outcomes, excluding survey data for now.

Statistical analysis
We will present all data stratified by gender and age-adjusted using direct standardization. BMI categories would be <25, 25-30, 30-35, and >35. For diabetes, A1C data will be categorized (A1C <7, A1C 7-9, and A1C >9).
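Direct standardization weights each group's age-specific rates by the age distribution of a common standard population, so that groups with different age structures can be compared. The sketch below is illustrative only: the age bands, standard weights, case counts, and the `direct_standardized_rate` helper are made up and are not the project's data or code.

```python
def direct_standardized_rate(cases, population, standard_weights):
    """Age-adjusted rate: age-specific rates weighted by a standard population.

    cases, population: {age_band: count} for the group being adjusted.
    standard_weights: {age_band: weight}; weights are normalized to sum to 1.
    """
    total_weight = sum(standard_weights.values())
    return sum(
        (cases[band] / population[band]) * (weight / total_weight)
        for band, weight in standard_weights.items()
    )


# Hypothetical standard population shares by age band.
standard = {"18-44": 0.5, "45-64": 0.3, "65+": 0.2}

# Hypothetical group A (younger on average) and group B (older on average).
cases_a = {"18-44": 10, "45-64": 30, "65+": 40}
pop_a = {"18-44": 1000, "45-64": 1000, "65+": 500}
cases_b = {"18-44": 5, "45-64": 20, "65+": 90}
pop_b = {"18-44": 500, "45-64": 500, "65+": 1000}

adjusted_a = direct_standardized_rate(cases_a, pop_a, standard)
adjusted_b = direct_standardized_rate(cases_b, pop_b, standard)
crude_a = sum(cases_a.values()) / sum(pop_a.values())
crude_b = sum(cases_b.values()) / sum(pop_b.values())
```

In this made-up example the crude rates differ widely (3.2% versus 5.75%) mostly because group B is older, while the age-adjusted rates are much closer (3.0% versus 3.5%).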

Anticipated Findings

We expect to find evidence of the Latino Epidemiological Paradox within the All of Us Research Project (AoURP) cohort. We expect to find that, despite multiple social and economic disadvantages, Latinos have a more favorable health profile on many measures of population health than other racial/ethnic minority groups such as blacks, and on some measures even better health status than Non-Hispanic Whites (NHWs).

Previous studies like the Study of Latinos (SOL), which is the largest study of Latinos (16,000), aimed to examine this paradox but had the limitation that they only included Latinos, so comparative data on non-Latinos were not collected. With 40,000 Latino core participants in the AllofUs study (as well as 160,000 non-Latinos), the AoURP study is uniquely positioned to contribute to our knowledge and further understanding of this paradox.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth

Research Team

Owner:

  • Raul Montanez Valverde - Graduate Trainee, University of Miami

Collaborators:

  • Olveen Carrasquillo - Late Career Tenured Researcher, University of Miami

FHIRCat-LungCancer

Project Purpose(s)

  • Disease Focused Research (lung cancer)
  • Methods Development ...
  • Control Set
  • Ancestry

Scientific Questions Being Studied

Lung cancer continues to be the leading cause of deaths from malignancy worldwide. There have been widespread efforts to develop safe and effective screening methods to detect lung cancer at an earlier stage. The US Preventive Services Task Force (USPSTF) recommends screening for lung cancer in individuals aged 55-80 years, who have a smoking history of 30 pack-years or more, and who either currently smoke or quit within the past 15 years. However, data have shown that only a third of patients diagnosed with lung cancer in the USA meet the USPSTF screening criteria, suggesting that many potential high-risk individuals are not eligible for low-dose CT screening. Therefore, there is an urgent need to seek more sophisticated risk assessment methods incorporating clinical data, to identify those at high risk, and to optimize the lung cancer screening criteria.
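The screening criteria as quoted above (age 55-80, at least 30 pack-years, and currently smoking or having quit within the past 15 years) can be written as a simple rule. This is a minimal sketch: the function name and parameters are hypothetical, and it encodes only the thresholds stated in this description.

```python
def meets_screening_criteria(age, pack_years, currently_smokes,
                             years_since_quit=None):
    """Apply the eligibility rule as quoted in the project description.

    years_since_quit is only consulted for people who no longer smoke;
    None means the quit date is unknown, which is treated as not eligible.
    """
    if not 55 <= age <= 80:
        return False
    if pack_years < 30:
        return False
    if currently_smokes:
        return True
    return years_since_quit is not None and years_since_quit <= 15


# Hypothetical individuals.
print(meets_screening_criteria(60, 35, True))        # current heavy smoker
print(meets_screening_criteria(70, 30, False, 20))   # quit 20 years ago
```

A rule like this makes it straightforward to count how many people with a later lung cancer diagnosis would have fallen outside the criteria, which is the gap the project describes.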

Scientific Approaches

We plan to create methods and tools to characterize the phenotypic abnormalities associated with patient eligibility and risk factors using a phenome-wide association study (PheWAS). We will explore the All of Us datasets to answer our scientific questions on lung cancer screening.

Anticipated Findings

We anticipate that we can identify patient cohorts with lung cancer risk factors and demonstrate the feasibility of using the All of Us datasets to conduct a PheWAS that characterizes the phenotypic abnormalities associated with patient eligibility and risk factors.

Demographic Categories of Interest

  • Race / Ethnicity

Research Team

Owner:

  • Guoqian Jiang - Mid-career Tenured Researcher, Mayo Clinic

Collaborators:

  • Jie Na - Project Personnel, Mayo Clinic

First Test Workspace

Project Purpose(s)

  • Methods Development ...

Scientific Questions Being Studied

Exploratory data analysis to start, to see whether the data can support research questions around clinical decision support applications:

- Can the results of microbiology culture tests be accurately predicted based on available patient / clinical data at the time of test ordering?

- Can the clinical orders from new specialty consultation visits be predicted based on available patient / clinical data at the time of referral from a generalist?

Scientific Approaches

Supervised and unsupervised machine learning models (e.g., collaborative filtering) applied to clinical data sources to predict subsequent labels in the form of clinical test orders and results.
Cases where patients receive empiric antibiotic prescriptions (simultaneous antibiotics with new diagnostic microbiology culture tests).
Cases where a patient is referred to and then subsequently sees a specialist (e.g., endocrinology or hematology).

Anticipated Findings

Clinical orders and test results are sufficiently predictable given available data that they can power clinical decision support information retrieval tools to aid clinical decision making under uncertainty.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Jonathan Chen - Early Career Tenure-track Researcher, Stanford University

For training and learning

Project Purpose(s)

  • Educational ...

Scientific Questions Being Studied

The workspace aims to develop a learning module and give students exposure to the potential social-science-based implications of occupational choices.

Scientific Approaches

The workspace aims to use traditional statistical methods in Python.

Anticipated Findings

Developing a deeper understanding of the dataset and the baseline descriptives related to occupational choice.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Pankaj Patel - Late Career Tenured Researcher, Villanova University