Research Projects Directory

552 active projects

This information was updated 5/7/2021

Information about each project within the Researcher Workbench is available in the Research Projects Directory below. Approved researchers provide their project’s research purpose, description, populations of interest, and more. This information helps All of Us ensure transparency on the type of research being conducted.

At this time, all listed projects are using data in the Registered Tier. The Registered Tier contains individual-level data from electronic health records, surveys, physical measurements, and wearables. Personal identifiers have been removed from these data to protect participant privacy.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

Training_v4

Scientific Questions Being Studied

I would like to use this workspace purely for educational purposes. It will be used to demonstrate various data analysis approaches using large datasets to students and to familiarize them with the All of Us cloud storage workflow.

Project Purpose(s)

  • Educational

Scientific Approaches

To produce aggregate summary statistics and regression models for various measurement variables available in All of Us data.

Anticipated Findings

This exploratory analysis will enable us to examine heterogeneity in anthropometric measures among various racial-ethnic groups.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Heidi Steiner - Graduate Trainee, University of Arizona
  • Claire Devaney - Undergraduate Student, University of Arizona

DSMES

Scientific Questions Being Studied

Coaching methods for DSMES

Project Purpose(s)

  • Disease Focused Research (type 2 diabetes mellitus)
  • Population Health

Scientific Approaches

Type 2 diabetes DSMES

Anticipated Findings

Changes in Hemoglobin A1C, Weight, BMI and CVD

Demographic Categories of Interest

  • Race / Ethnicity

Research Team

Owner:

Getting Acquainted

Scientific Questions Being Studied

I intend to explore demographics and physical activity (Fitbit data) to assist in formulating prospective research questions. I am keenly interested in learning more about objectively measured Physical Activity counts for individuals at risk of certain chronic diseases (e.g. diabetes, cancer, CVD).

Project Purpose(s)

  • Educational

Scientific Approaches

I first must explore the data to determine the scientific approaches I will use. My purposes at this point are purely exploratory, and will likely become more focused as I become familiar with the data available in the workbench.

Anticipated Findings

For now, I will be generating descriptive statistics and running a few crosstabs. Again, my intent is to become familiar and proficient with the Researcher Workbench interface before making hypotheses or anticipating results.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sexual Orientation
  • Geography
  • Education Level
  • Income Level

Research Team

Owner:

  • Jammie Hopkins - Early Career Tenure-track Researcher, Morehouse School of Medicine

Demo - Hypertensive Disorders of Pregnancy

Scientific Questions Being Studied

1. What is the prevalence of hypertensive disorders during pregnancy?
2. What is the prevalence of hypertensive disorders during pregnancy by demographics? Of those diagnosed with a hypertensive disorder during pregnancy, what is the epidemiology of the risk factors associated with hypertension in pregnancy?
3. Are there racial disparities in hypertension during pregnancy when adjusted for these risk factors?
4. How can one use heterogeneous data sources within the All of Us dataset to explore disease associations using self-reported exposures (Participant Provided Information, or “PPI”) and exposures captured in the electronic health record (EHR)?

Project Purpose(s)

  • Disease Focused Research (Hypertensive disorder of pregnancy)
  • Social / Behavioral
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

Our sample was pulled from the 78,938 females in the AoU cohort who had EHR and PPI data. Females were identified as participants with female sex assigned at birth. Of these, only the 13,155 females who had at least one SNOMED code in their EHR classified as a "pregnancy finding" were included in the analysis. For our analyses, a participant was classified as having a hypertensive disorder of pregnancy if they had at least one SNOMED code for gestational hypertension, pre-eclampsia with or without severe features, eclampsia, or HELLP syndrome. We used published risk factors for preeclampsia, as described by the United States Preventive Services Task Force (USPSTF), in our univariate and multivariate analyses. Odds ratios were calculated for the risk factors. Descriptive statistics for the overall pregnant female cohort and the hypertensive disorder of pregnancy cohort were also calculated. We used both EHR and PPI data to identify the risk factors for hypertensive disorders of pregnancy.
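As a rough illustration of the odds ratio step, the sketch below computes an unadjusted odds ratio and a Woolf (log-based) 95% confidence interval from a 2x2 table. The function name and the counts are hypothetical and are not taken from the study; the project description does not say which interval method or software the authors used.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases, z=1.96):
    """Return (odds ratio, lower, upper) for a 2x2 table.

    Uses the Woolf log-based interval, which is an assumption here;
    the source does not name an interval method.
    """
    a, b, c, d = (exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) for independent cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Made-up counts for illustration only: participants with and without a
# given risk factor, split by hypertensive disorder of pregnancy status.
estimate, lower, upper = odds_ratio(30, 70, 10, 90)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An adjusted (multivariate) odds ratio would instead come from a logistic regression model, which is outside the scope of this sketch.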

Anticipated Findings

We anticipate seeing racial disparities in the prevalence of hypertensive disorders during pregnancy. Consistent with previous literature, we anticipate our results will show that participants who identify as African American have greater odds of being diagnosed with a hypertensive disorder of pregnancy than White participants. We also anticipate finding higher odds of being diagnosed with hypertensive disorders of pregnancy among participants who have at least one risk factor for preeclampsia as described by the USPSTF compared to participants without any risk factors. This study will serve to demonstrate the quality, utility, and diversity of the All of Us data and tools, providing researchers options for study design and validation.

Demographic Categories of Interest

  • Race / Ethnicity
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

Collaborators:

  • Guohai Zhou - Other, Mass General Brigham

Health literacy

Scientific Questions Being Studied

The Health Resources & Services Administration (HRSA) defines health literacy as "the degree to which individuals have the capacity to obtain, process, and understand basic health information needed to make appropriate health decisions." Health literacy is an important determinant of health and well-being. People with low health literacy may have challenges in accessing healthcare services; understanding health information and risk probability; completing health-related forms and assessments; and managing health conditions.

We are interested in exploring health literacy among Latinos. We want to answer the following questions:
1. How does health literacy among Latinos in the United States differ by personal factors (i.e., age, nativity, education, primary language), geographic, and social factors?
2. How is health literacy associated with self-rated health, quality of life, and healthcare experiences among Latinos?

Project Purpose(s)

  • Population Health
  • Social / Behavioral

Scientific Approaches

We plan to use data from the All of Us surveys to explore the variables of interest and use bivariate and multivariate analysis techniques.

Anticipated Findings

We hope that by answering our research questions, we will be able to better understand how health literacy differs among Latino subgroups and how it is associated with perceptions of health and well-being. We anticipate that Latinos who are older, report lower formal education, and have limited English proficiency will have lower health literacy. Further, we anticipate that those with lower health literacy will report poorer experiences with the healthcare system, but they may continue to rate their health and quality of life high. This type of information may help public health practitioners to develop, promote, and disseminate health literacy interventions. It may also help healthcare providers, healthcare institutions, and public health authorities to reflect on their practices and policies related to health literacy, patient experiences, and community engagement.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Education Level

Research Team

Owner:

  • Athena Ramos - Senior Researcher, University of Nebraska Medical Center

Student Workspace Storage

Scientific Questions Being Studied

This workspace will be used to prepare instructor content and analysis protocols for a course-based research laboratory class supported by the Towson University Research Enhancement Program. The purpose of this course is for students to gain experience developing a research question in human health and then designing and implementing an analysis of publicly available data to answer it. The student research projects will focus on medical health and public health topics. In addition to learning skills important in medical and epidemiological research, students will be able to ask questions that could lead to better understanding of and treatment for diseases in traditionally under-served populations.

Project Purpose(s)

  • Educational

Scientific Approaches

Data analysis will be run in an NIH-approved "Researcher Workbench" platform using Jupyter Notebook and R. The questions students will ask will be dependent on what data All of Us has available to researchers at the time of the course. These data will include health data, physical measurement data, biospecimen-related data, and genomic data.

Anticipated Findings

In addition to learning skills important in medical and epidemiological research, students will be able to ask questions that could lead to better understanding of and treatment for diseases in traditionally under-served populations. We also hope this course will encourage undergraduate students to consider careers in medical research.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Social support and quality of life in older adults

Scientific Questions Being Studied

The question to be studied using the All of Us Research Program data: Do older adults who report a higher quality of life have a better support system? This question will help individuals and professionals better understand the relationship between social support and quality of life. Additionally, this information will help increase validity for the All of Us Research Program data as this question is based on previous findings.

Project Purpose(s)

  • Population Health

Scientific Approaches

We will use survey responses to study this question.

Anticipated Findings

I anticipate that older adults who have more social support are more likely to report a higher quality of life. The findings of this study can help medical professionals and researchers create solutions that improve quality of life for older adults.

Demographic Categories of Interest

  • Age

Research Team

Owner:

Duplicate of Demo Project - Family History in EHR & PPI Data

Scientific Questions Being Studied

As a demonstration project, this study will summarize structured data elements available in the All of Us registered tier and compare to published survey results to describe data for reuse in disease specific outcomes. Specific questions include:

1. Could harnessing informatics tools like predictive modeling and clinical decision support to detect and alert healthcare providers to these preventative measures significantly improve the precise care we deliver to patients?
2. How can one evaluate the availability of family medical history information within the All of Us registered tier data and characterize the structured data elements from both data sources?

Project Purpose(s)

  • Methods Development
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

We utilize the Family Medical History PPI survey to capture self-reported information but exclude participants who did not know any of their family history or who skipped every survey question. We pay particular attention to the disease/relative pairings that map to the American College of Medical Genetics and Genomics’ (ACMG) list of important diseases.

We define EHR family history information as the collection of registered tier observations with "family+history" or "FH:" anywhere in their OMOP concept name. We exclude observations of “Family social history” and remove duplicate observation and value concept pairings from the same healthcare organization regarding the same participant as these were likely due to repeated entries across multiple routine annual physical exams.
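The EHR selection rule described above can be sketched as a filter followed by de-duplication. This is only an illustration under stated assumptions: "family+history" is read as "both words appear in the concept name", matching is case-insensitive, and the record fields (person_id, org_id, concept_name, value_concept) are hypothetical names rather than the actual OMOP column names used by the authors.

```python
def is_family_history(concept_name):
    """Return True if a concept name looks like EHR family history.

    Assumption: "family+history" means both words are present, and
    "Family social history" concepts are excluded, as in the text.
    """
    name = concept_name.lower()
    if "family social history" in name:
        return False
    has_both_words = "family" in name and "history" in name
    return has_both_words or "fh:" in name


def select_family_history(observations):
    """Keep family history observations, dropping repeated entries.

    A repeat is the same observation/value pairing for the same
    participant from the same healthcare organization.
    """
    seen = set()
    kept = []
    for obs in observations:
        if not is_family_history(obs["concept_name"]):
            continue
        key = (obs["person_id"], obs["org_id"],
               obs["concept_name"], obs["value_concept"])
        if key in seen:
            continue  # likely re-entered at a later routine visit
        seen.add(key)
        kept.append(obs)
    return kept


# Hypothetical rows for illustration only.
rows = [
    {"person_id": 1, "org_id": "A", "concept_name": "Family history of diabetes mellitus", "value_concept": "Mother"},
    {"person_id": 1, "org_id": "A", "concept_name": "Family history of diabetes mellitus", "value_concept": "Mother"},
    {"person_id": 1, "org_id": "A", "concept_name": "Family social history", "value_concept": "None"},
    {"person_id": 2, "org_id": "B", "concept_name": "FH: Breast cancer", "value_concept": "Sister"},
]
print(len(select_family_history(rows)))
```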

We aim to compare the data sources by summarizing the type and amount of family history information gained.

Anticipated Findings

This description of the family medical history data in the All of Us registered tier database will assist future investigators in understanding All of Us data methods and give feedback to the program on the utility of participant survey and EHR data.

We hypothesize that the survey data will provide a more complete look at family medical history due to its structured nature. However, we are also interested in determining how much overlap there is between the PPI and EHR data. It’s plausible that the free-form nature of EHR family history information yields more detailed records. We would ultimately like to determine if a gold standard method for defining a participant’s family medical history is attainable within the All of Us registered tier data.

We anticipate facing informatics challenges because of collecting data from different sources, mapping these data to a common data model, and attempting to harness data from these sources to find the common source of truth.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Melissa Patrick - Project Personnel, All of Us Program Operational Use
  • Robert Cronin - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
  • Ashley Able - Other, Vanderbilt University Medical Center

Hepatitis C and Parkinson's disease

Scientific Questions Being Studied

Previous research has described an association between Hepatitis C (HCV) infections and Parkinson’s disease. However, no research of this kind has been performed or has focused on underrepresented or minority groups in the United States, for example the Hispanic population. Therefore, we are interested in exploring the All of Us dataset for reports on HCV infection and Parkinson’s disease.

Project Purpose(s)

  • Disease Focused Research (hepatitis C and Parkinson's disease)
  • Population Health

Scientific Approaches

We will identify participants 40 years or older and assess whether they have had a report or diagnosis of HCV and/or Parkinson’s disease. The sample or cohort will be described through descriptive statistics. We intend to use inferential statistics to test for associations between both diseases.

Anticipated Findings

We anticipate that the association seen in studies for other populations will also exist in this cohort. Currently, data on Parkinson’s disease prevalence are scarce, especially for minority/underrepresented groups. We aim to better understand how both of the diseases affect these groups.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Access to Care

Research Team

Owner:

RacialEthnicDifferences_AnthropoLipidALT

Scientific Questions Being Studied

Obesity is one of the most important risks for many diseases in the United States and across the world. Differences in body weight and shape across gender and race/ethnicity have been extensively described. We sought to replicate these differences and evaluate newly emerging data from the All of Us Research Program (AoU). In this project, we ask the scientific question: How do individuals of different genders and different racial/ethnic groups in the All of Us dataset differ with respect to weight, waist and hip circumferences, cholesterol levels, and levels of alanine aminotransferase?

Project Purpose(s)

  • Disease Focused Research (Obesity)
  • Other Purpose (This work is the result of an All of Us Research Program Demonstration Project. Demonstration Projects are efforts by the All of Us Research Program designed to meet the goal of ensuring the quality and utility of the Research Hub as a resource for accelerating precision medicine. This work has been approved, reviewed, and overseen by the All of Us Research Program Science Committee and Data and Research Center to ensure compliance with program policy.)

Scientific Approaches

Within each ethnic/racial group and each gender group, we first visually examined histograms of each outcome variable to identify any major outliers that may represent measurement errors. We then tabulated the mean values and other descriptive statistics for continuous variables such as waist and hip circumferences. We also determined the proportion of individuals with abdominal obesity. To formally test for differences among groups and to adjust for age and other covariates, we will use linear regression, transforming variables to conform to the assumptions of linear regression. Data on race and ethnicity were obtained from participant-provided information (PPI). Biological sex at birth, height, weight, waist circumference (WC), and hip circumference measurements were obtained according to AoU baseline visit protocols. Levels of alanine aminotransferase (ALT) were obtained from participants' EHRs.
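The proportion with abdominal obesity in each group could be tabulated along the following lines. This is a minimal sketch, not the authors' code: the project description does not state its waist circumference cutoffs, so the commonly used thresholds of more than 102 cm for males and more than 88 cm for females are assumed here, and the record field names and group labels are hypothetical.

```python
from collections import defaultdict

# Assumed cutoffs in centimeters; not stated in the project description.
WC_CUTOFF_CM = {"male": 102.0, "female": 88.0}


def abdominal_obesity_proportions(records):
    """Return {(race_ethnicity, sex_at_birth): proportion above cutoff}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [n_above, n_total]
    for rec in records:
        cutoff = WC_CUTOFF_CM.get(rec["sex_at_birth"])
        waist = rec["waist_cm"]
        if cutoff is None or waist is None:
            continue  # skip records that cannot be classified
        group = (rec["race_ethnicity"], rec["sex_at_birth"])
        counts[group][1] += 1
        if waist > cutoff:
            counts[group][0] += 1
    return {group: above / total for group, (above, total) in counts.items()}


# Hypothetical records for illustration only.
sample = [
    {"race_ethnicity": "Group A", "sex_at_birth": "female", "waist_cm": 90.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "female", "waist_cm": 80.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "male", "waist_cm": 105.0},
    {"race_ethnicity": "Group A", "sex_at_birth": "male", "waist_cm": None},
]
print(abdominal_obesity_proportions(sample))
```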

Anticipated Findings

For this study, we anticipate that we will be able to replicate known differences in body weight and shape across gender and race/ethnicity. We anticipate that we will find racial/ethnic and gender disparities related to ALT, a surrogate marker of hepatic steatosis. We anticipate the ability to evaluate the consistency of the All of Us cohort with national averages related to obesity and indicate that this resource is likely to be a major source of scientific inquiry and discovery. This project will serve to demonstrate the quality, utility, and diversity of the All of Us data and tools and the power of gathering multiple data sources for a single set of phenotypes, providing researchers options for study design and validation.

Demographic Categories of Interest

  • Race / Ethnicity
  • Sex at Birth

Research Team

Owner:

Collaborators:

  • Jianglin Feng - Other, University of Arizona
  • Lina Sulieman - Other, All of Us Program Operational Use

Maternal Health

Scientific Questions Being Studied

My goal with this exploratory study is to look at prediction models for gestational hypertension and preeclampsia. I will attempt to understand patterns in past diagnoses that could point towards a future diagnosis of hypertension during pregnancy.

Project Purpose(s)

  • Disease Focused Research (pre-eclampsia)

Scientific Approaches

I plan to use association rule mining on prior diagnosis codes. This will enable us to develop prediction models for low resource settings without heavy use of AI or ML.
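To make the idea concrete, the sketch below scores the simplest kind of association rule, a single prior diagnosis code implying the outcome, by support, confidence, and lift. The data layout, code labels, and function name are hypothetical, and a real analysis would likely consider multi-code antecedents with an algorithm such as Apriori; nothing here reflects the project's actual implementation.

```python
from collections import Counter


def single_code_rules(histories, min_support=0.1):
    """Score rules of the form {prior code} -> outcome.

    histories: list of (set of prior diagnosis codes, outcome flag).
    support    = P(code and outcome)
    confidence = P(outcome | code)
    lift       = confidence / P(outcome)
    """
    n = len(histories)
    if n == 0:
        return []
    base_rate = sum(1 for _, outcome in histories if outcome) / n
    code_count = Counter()
    joint_count = Counter()
    for codes, outcome in histories:
        for code in set(codes):
            code_count[code] += 1
            if outcome:
                joint_count[code] += 1
    rules = []
    for code, n_code in code_count.items():
        support = joint_count[code] / n
        if support < min_support:
            continue
        confidence = joint_count[code] / n_code
        lift = confidence / base_rate if base_rate > 0 else float("nan")
        rules.append({"antecedent": code, "support": support,
                      "confidence": confidence, "lift": lift})
    return sorted(rules, key=lambda r: r["confidence"], reverse=True)


# Made-up histories for illustration only.
example = [
    ({"obesity", "chronic_htn"}, True),
    ({"obesity"}, False),
    ({"chronic_htn"}, True),
    (set(), False),
]
for rule in single_code_rules(example):
    print(rule)
```

Rules of this form are easy to read and apply by hand, which fits the stated aim of supporting low-resource settings.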

Anticipated Findings

We hope to be able to identify patterns that can help us detect gestational hypertension. If successful, we will attempt to validate the model on other datasets.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Treatment for Opioid Use Disorder

Scientific Questions Being Studied

We are interested in better understanding how patients are treated for Opioid Use Disorder (OUD) in the United States. The public health crisis presented by the opioid epidemic in the US is unprecedented, and a recent Consensus Study Report from the National Academies of Sciences, Engineering, and Medicine concluded that medication-based treatment should be used as the first-line treatment for OUD. Despite the abundance of evidence demonstrating that medication-based treatment with methadone, buprenorphine, or extended-release naltrexone is safe and effective, these treatments are still not available to many Americans. We hope to use the data from All of Us to better understand how Americans are treated for OUD and whether demographic characteristics such as race, ethnicity, gender, and others have an effect on which treatments an individual receives.

Project Purpose(s)

  • Disease Focused Research (Opioid Use Disorder)

Scientific Approaches

Descriptive methods will be used to identify individuals who have reported opioid abuse or misuse or been previously treated for opioid use disorder. An inferential analysis will then be performed to identify significant demographic characteristics that may be associated with which treatments an individual receives for OUD.

Anticipated Findings

Although the opioid epidemic affects everyone living in America, we know that it has not necessarily affected everyone equally. To date, there is little data on the accessibility of medication-based treatment for OUD for minority groups, especially US Hispanics and Latinos. We hope to better understand how demographic characteristics as well as social determinants of health influence the treatments individuals receive for OUD.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography

Research Team

Owner:

  • Kyle Melin - Mid-career Tenured Researcher, University of Puerto Rico Medical Sciences

New Workspace

Scientific Questions Being Studied

This workspace will expose students to the realities of research and database work. It will focus on data analysis and extraction using R programming.

Project Purpose(s)

  • Educational

Scientific Approaches

This workspace will be used to examine vasculitis in conjunction with LDL to find out if there are other novel ways to test for the condition. Methods that will be used include: Data collection, data analyses, statistical methods, R programming, graphs and boxplots.

Anticipated Findings

This study should establish relationships between vasculitis and other factors, including age, gender, race, and many other variables. This could provide a novel way to diagnose the condition earlier in life and benefit patients.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation

Research Team

Owner:

Collaborators:

  • Njambi Kiguru - Undergraduate Student, Towson University
  • Kathryn McDougal - Other, Towson University
  • Abby Wennick - Undergraduate Student, Towson University
  • Akeem Laurence - Undergraduate Student, Towson University

Data Exploration

Scientific Questions Being Studied

I do not intend to answer a scientific question. I am exploring the dataset for associations of HLD, HTN, DM, and smoking with CAD.

Project Purpose(s)

  • Other Purpose (Data exploration, query searching for CAD, hyperlipidemia, hypertension, diabetes and smoking history in the general cohort.)

Scientific Approaches

I will look at the cohort in general and see who has CAD. I also plan to see how the prevalence of HTN, DM, HLD, and smoking differs between those with and without CAD.

Anticipated Findings

There are no anticipated scientific findings. This is just for exploration and practice. I expect that HTN, HLD, DM, and smoking will be associated with CAD.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Cancer and Covid-19

Scientific Questions Being Studied

The goal of this project is to examine whether social distancing measures due to COVID-19 caused more stress for cancer patients than for non-cancer patients. In particular, we will use the data to examine differences by race, ethnicity, age, and sex, and to analyze responses to the following survey questions:
In the past month, have recommendations for socially distancing caused stress for you?
Have you EVER been near someone that you know, or suspect, had COVID-19 (such as co-workers, family members, or others)?
Do you think you have had COVID-19?
We will examine all of the questions above to determine whether responses differ between patients diagnosed with cancer and patients never diagnosed with cancer, and whether those differences vary by race, ethnicity, gender, and age.

Project Purpose(s)

  • Disease Focused Research (cancer)
  • Social / Behavioral

Scientific Approaches

We plan to use univariate and multivariate analyses to examine questions about perceived stress and social distancing in cancer and non-cancer patients. The multivariate analyses will include a series of logistic regression models in which the outcomes are perceived stress due to COVID-19 recommendations or whether a participant thinks they have had COVID-19, and the independent covariates include age, race, ethnicity, and gender.

Anticipated Findings

We anticipate that the recommendations for social distancing will have caused more stress to cancer patients than to non-cancer patients, and that these relationships will vary by age, race, ethnicity, and gender.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Research Team

Owner:

  • Argyrios Ziogas - Late Career Tenured Researcher, University of California, Irvine

Collaborators:

  • Steven Hiek - Project Personnel, University of California, Irvine
  • Kathryn Campbell - Project Personnel, University of California, Irvine

Practice JSP

Scientific Questions Being Studied

Does smoke inhalation of cannabis affect the lungs? This project examines the lung function effects of smoking cannabis (medical or recreational). Using the All of Us research data, I will be able to analyze and refine my results by cannabis consumption, frequency, age group, and gender. The anticipated finding is that smoking cannabis reduces lung function and affects the lungs negatively, since smoke inhalation is an irritant to the lungs and throat.

Project Purpose(s)

  • Other Purpose (The purpose of this project is to further investigate the lung effects of smoking cannabis. Medical cannabis has become a popular form of medicine in recent years.)

Scientific Approaches

I would like to use the following data sets from the All of Us Data Browser: EHR domains, drug exposure, lifestyle, conditions, and labs and measurements. Using R and Jupyter Notebook, I will create graphs and tables based on the dataset.

Anticipated Findings

Since smoke inhalation is classified as an irritant to the lungs and throat, I expect that smoking cannabis will decrease lung function over time and will trigger lung conditions such as asthma and bronchitis. This research is designed for the general population, not a specific population, because smoking cannabis is not restricted to any population.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Giselle Chuquipoma - Undergraduate Student, Towson University
  • Njambi Kiguru - Undergraduate Student, Towson University
  • Kathryn McDougal - Other, Towson University
  • Akeem Laurence - Undergraduate Student, Towson University

Employment Validation

The purpose of this research will be to compare aggregations of All of Us participants' self-reports of employment status by various demographic factors with similar distributions available through U.S. government sources such as the monthly Current Population Survey (CPS)…

Scientific Questions Being Studied

The purpose of this research will be to compare aggregations of All of Us participants' self-reports of employment status by various demographic factors with similar distributions available through U.S. government sources such as the monthly Current Population Survey (CPS) and the periodic American Community Survey (ACS). Employment status is a key basic indicator in the All of Us Research Program that is likely to be studied as either a precursor or an outcome of many All of Us indicators of overall health, lifestyle, personal and family medical history, health care access and utilization, and involvement with COVID-19. It is important to assess the comparability of All of Us data estimates with major benchmarks in U.S. social and economic data.

Project Purpose(s)

  • Methods Development

Scientific Approaches

In the All of Us "Basics" survey of demographic information, as well as in a survey of experiences and health during the coronavirus disease 2019 (COVID-19) pandemic, all participants are asked to report whether they are employed for wages (part-time or full-time) or self-employed. In the so-called "U.S. Bureau of Labor Statistics (BLS) protocol," used in many government and non-government statistical programs to allocate the labor force status of the population, people are employed only if they report holding a job with an employer or through self-employment for pay for at least one hour per week, or working 15 or more hours without pay in a family-owned farm or business. Although the employment and self-employment data collection protocols of All of Us surveys differ from those of surveys that use the BLS protocol, it is not known how these survey processes and their outcomes differ by sex, race, ethnicity, age, and time period of measurement.

Anticipated Findings

In this study, I will compare the employment-population ratio (a standard and common labor force metric applied in the U.S.) calculated from All of Us participant data with tabulations calculated from public data collected in the CPS and ACS surveys, identifying similarities and differences. This comparison is a prerequisite to any subsequent use I might propose for All of Us participant data.
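The comparison described above can be illustrated with a minimal Python sketch. It is not the researcher's method: the function name, the age groups, the sample responses, and the benchmark percentages are all hypothetical, and the denominator here is simply every respondent in a group rather than the civilian noninstitutional population used in official statistics.

```python
from collections import defaultdict


def employment_population_ratio(records):
    """Return {group: percent employed} from (group, is_employed) pairs."""
    employed = defaultdict(int)
    total = defaultdict(int)
    for group, is_employed in records:
        total[group] += 1
        if is_employed:
            employed[group] += 1
    return {group: 100.0 * employed[group] / total[group] for group in total}


# Hypothetical survey responses: (age group, reported being employed?)
sample = [
    ("25-54", True), ("25-54", True), ("25-54", False),
    ("55+", True), ("55+", False), ("55+", False), ("55+", False),
]
survey_ratios = employment_population_ratio(sample)

# Hypothetical benchmark percentages, NOT real CPS or ACS figures.
benchmark = {"25-54": 70.0, "55+": 30.0}

# Difference in percentage points between the survey and the benchmark.
differences = {group: survey_ratios[group] - benchmark[group] for group in benchmark}
```

A real analysis would also need to reconcile the differing definitions of "employed" before treating any gap as meaningful.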

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • David Passmore - Late Career Tenured Researcher, Pennsylvania State University

Opioid Use

We will look at Opioid Use and Back Pain. We will look to see if opioid use can be reduced with specific interventions, and who is most at risk for being prescribed opioids. We are looking at the available data…

Scientific Questions Being Studied

We will look at Opioid Use and Back Pain.

We will look to see if opioid use can be reduced with specific interventions, and who is most at risk for being prescribed opioids.

We are looking at the available data to try to discern risk factors for opioid use and back pain. The breadth of the study is mostly due to our unfamiliarity with the interface, data, and capabilities of "All of Us."

Project Purpose(s)

  • Disease Focused Research (Back Pain and Opioid Use)

Scientific Approaches

We will use the available cohorts within the All of Us study group to analyze the risk factors for back pain and opioid use.

Regarding determination of "at risk" groups, social determinants of health, and analyses of such data, we are limited by the available data within "All of Us," and will likely need to utilize more granular data at our institution for comparison. At this time, the "All of Us" network is very young with limited numbers; however, we believe the numbers will increase and allow associations to be found on multivariate analysis, although these will be limited by the lack of granularity. We hope that this research will ultimately help us identify risk factors for opioid use and back pain. Eventually, we could compare the more granular results from our local data set in a primarily minority and low-income area to the general data provided by "All of Us."

Anticipated Findings

We hope to find groups at risk for opioid use or back pain and intervene earlier.

There are limited data on social determinants of health in All of Us; however, we are at an institution that is working towards improving data collection on social determinants of health and intervening on them. Eventually, we could compare the cohorts; however, we cannot claim that the "All of Us" network has enough participants at this time to make any generalizable associations.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

Sex Differences in CVD Comorbidities & Risk Factors based on COVID-19 Positivity

Our study will look at how cardiovascular disease comorbidities, risk factors, and behaviors differ by sex in those who are COVID-19 positive versus negative. This question is important because there is a lack of public knowledge on the association between…

Scientific Questions Being Studied

Our study will look at how cardiovascular disease comorbidities, risk factors, and behaviors differ by sex in those who are COVID-19 positive versus negative. This question is important because there is a lack of public knowledge on the association between COVID-19 and cardiovascular health and how this may differ between men and women. Therefore, we are interested in analyzing prevalent cardiovascular comorbidities, risk factors, and behaviors according to COVID-19 test positivity. Since there are some sex differences in COVID-19 outcomes, we wish to examine how comorbidities, risk factors, and behaviors may differ between men and women.

Project Purpose(s)

  • Disease Focused Research (COVID-19, cardiovascular disease, hypertension, diabetes, heart failure)

Scientific Approaches

We will use the cohort builder to find participants who tested positive for COVID-19. Our inclusion criteria will include cardiovascular risk factors, behaviors, and comorbidities. Our exclusion criteria will include people who tested negative for COVID-19. We will then create a second cohort of participants who have tested negative for COVID-19. Our inclusion criteria will remain the same as for the first cohort, but this time the exclusion criteria will include people who tested positive for COVID-19. We will also stratify our cohorts into male versus female. Afterward, we will use the dataset builder to generate concept sets on cardiovascular risk factors (e.g., hypertension, blood pressure, diabetes, obesity), behaviors (diet, smoking, physical activity), and comorbidities (prior history of coronary heart disease, stroke, heart failure, COPD, diabetes). We will then import the data into a Jupyter Notebook and write code in R to analyze our data.
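The stratification described above can be sketched in a few lines. The project plans to use R; the Python below is only an illustration, and the field names (`covid_positive`, `sex`, `hypertension`) and the six participant records are invented for the example.

```python
from collections import defaultdict


def prevalence_by_stratum(participants, factor):
    """Return {(covid_positive, sex): share of participants with `factor`}."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [with factor, total]
    for person in participants:
        stratum = (person["covid_positive"], person["sex"])
        counts[stratum][1] += 1
        if person[factor]:
            counts[stratum][0] += 1
    return {stratum: with_factor / total
            for stratum, (with_factor, total) in counts.items()}


# Hypothetical records, not All of Us data.
participants = [
    {"covid_positive": True, "sex": "male", "hypertension": True},
    {"covid_positive": True, "sex": "male", "hypertension": False},
    {"covid_positive": True, "sex": "female", "hypertension": True},
    {"covid_positive": False, "sex": "male", "hypertension": False},
    {"covid_positive": False, "sex": "female", "hypertension": False},
    {"covid_positive": False, "sex": "female", "hypertension": True},
]
result = prevalence_by_stratum(participants, "hypertension")
```

Each of the four strata (positive or negative, male or female) gets its own prevalence, which is the quantity the anticipated findings rank.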

Anticipated Findings

The anticipated findings from this study are that males who tested positive for COVID-19 will have the most cardiovascular risk factors, followed by males who tested negative for COVID-19, females who tested positive for COVID-19, and lastly, females who tested negative for COVID-19. We expect that males who tested positive for COVID-19 will be more likely than females to have hypertension and diabetes and to be less physically active. Our findings will be useful to those who have contracted COVID-19 during this pandemic by showing how they may be at risk for developing new cardiovascular conditions or aggravating current ones.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Collaborators:

  • Yufan Gong - Graduate Trainee, University of California, Los Angeles
  • Nathan Wong - Other, University of California, Irvine
  • Divya Devineni - Graduate Trainee, University of California, Irvine

Investigating new-onset neurocognitive complications in COVID-19 patients

The physiological impact of COVID-19 on various segments of the population has been divergent. While some COVID-positive patients have developed serious cardio-pulmonary complications, others have shown relatively mild pulmonary symptoms. Recent studies in the UK and Spain have shown that…

Scientific Questions Being Studied

The physiological impact of COVID-19 on various segments of the population has been divergent. While some COVID-positive patients have developed serious cardio-pulmonary complications, others have shown relatively mild pulmonary symptoms. Recent studies in the UK and Spain have shown that a notable percentage of patients showed significant impact to the central nervous system. Whether these impacts affected patients only in the short term or have longer-term consequences is still not well understood. Recent reports suggest that neurological complications of COVID-19 also exist in the US population and may be more prevalent than in Europe. We plan to use machine learning and related computational methods to identify features that may be predictive of new-onset neurocognitive complications in people in the US population who tested positive for COVID-19.

Project Purpose(s)

  • Disease Focused Research (Neurocognitive complications of COVID-19)

Scientific Approaches

Aim 1: Is there an association between COVID-19 (at various severity levels) and new-onset neurocognitive dysfunction? To address this aim, we will first define the phenotypes of “COVID-19-related new-onset neurocognitive dysfunction” using longitudinal health record data.
Aim 2: What are some clinical and demographic features that are predictive of COVID-19-related new-onset neurocognitive changes? We will implement machine learning methods that are robust to small, imbalanced datasets, providing valuable insights while reducing the risk of misinterpretation when implemented on sparse datasets.
Aim 3: Are there factors that make these specific populations more vulnerable than the general population? To understand how the features identified in the AoU cohort may generalize to other populations, we will compare summary statistics from the identified experimental and population in the AoU COVID cohort with summary statistics from other population-level cohorts like N3C and global.health.

Anticipated Findings

Our goal is to find features that can predict a probability score or likelihood of risk for a new patient so that they can be directed to prophylactic treatment as soon as possible. We hope that this project will lay the foundation to preemptively identify and monitor new-onset neurocognitive complications due to COVID-19, and assist patients in receiving appropriate and necessary prophylactic care.

Demographic Categories of Interest

  • Disability Status

Research Team

Owner:

A1C vs RBC count analysis

We want to explore data to check if changes in RBC count or hemoglobin (Hb) concentration have any effect on A1C in normal as well as diabetic patients. The analysis will help understand if RBC count and Hb concentration can play…

Scientific Questions Being Studied

We want to explore data to check if changes in RBC count or hemoglobin (Hb) concentration have any effect on A1C in normal as well as diabetic patients. The analysis will help us understand whether RBC count and Hb concentration can play a role in the misdiagnosis of diabetic patients based on A1C levels.

Project Purpose(s)

  • Educational

Scientific Approaches

We want to check whether the data support our hypothesis before we conduct any wet lab experiments to confirm it.

Anticipated Findings

We anticipate that as RBC count and Hb concentration increase, A1C levels will go down.
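One simple way to check for the inverse relationship anticipated above is a Pearson correlation coefficient. The sketch below is only an illustration, not the team's stated method; the function name and the five paired values are hypothetical and do not come from All of Us data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired measurements: hemoglobin (g/dL) and A1C (%).
hemoglobin = [11.0, 12.0, 13.0, 14.0, 15.0]
a1c = [6.4, 6.1, 5.9, 5.6, 5.5]
r = pearson_r(hemoglobin, a1c)  # negative r is consistent with the hypothesis
```

A negative coefficient would be consistent with the hypothesis, but it would not by itself establish that hemoglobin changes cause A1C changes.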

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Alcohol Consumption and Changes During the Pandemic

We will investigate alcohol consumption patterns and changes during the COVID-19 pandemic, examine whether there are any differences among demographic/geographic subgroups, and examine any factors associated with changes in alcohol consumption. We may also examine any differences in demographic/geographic factors…

Scientific Questions Being Studied

We will investigate alcohol consumption patterns and changes during the COVID-19 pandemic, examine whether there are any differences among demographic/geographic subgroups, and examine any factors associated with changes in alcohol consumption. We may also examine any differences in demographic/geographic factors between participants and non-participants in the COPE Surveys to evaluate participation representativeness.

Project Purpose(s)

  • Population Health
  • Social / Behavioral

Scientific Approaches

We plan to use data from the core surveys (The Basics, Overall Health, and Lifestyle) as well as the COVID-19 Participant Experience (COPE) Surveys. Analytic approaches include descriptive statistics, univariate or unadjusted hypothesis testing, multivariable regression, and/or data visualizations.

Anticipated Findings

Alcohol poses different challenges during the COVID-19 pandemic. The findings will help us better understand alcohol consumption patterns and changes in the US population during the pandemic, as well as potential demographic disparities and geographic variation. The findings may also shed some light on factors associated with changes in alcohol consumption during the pandemic.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Education Level
  • Income Level

Research Team

Owner:

Collaborators:

  • Tulshi Saha - Senior Researcher, NIH
  • Nanwei Cao - Senior Researcher, NIH

Workspace 1

The purpose of this workspace is to explore the dataset and learn how to use the Researcher Workbench.

Scientific Questions Being Studied

The purpose of this workspace is to explore the dataset and learn how to use the Researcher Workbench.

Project Purpose(s)

  • Other Purpose (exploration)

Scientific Approaches

I do not plan to study a specific approach; my goal is to learn how to use the Researcher Workbench.

Anticipated Findings

None. I am just trying to explore the data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

First Workspace on Diabetes

It’s estimated that 5 million of the 18 million Americans with diabetes do not know they have it. Early diagnosis of diabetes and pre-diabetes is important so that patients can begin to manage the disease early and potentially…

Scientific Questions Being Studied

It’s estimated that 5 million of the 18 million Americans with diabetes do not know they have it. Early diagnosis of diabetes and pre-diabetes is important so that patients can begin to manage the disease early and potentially prevent or delay the serious disease complications that can decrease quality of life. Some of these complications include premature heart disease and stroke, blindness, limb amputations, and kidney failure. Our project will review clinical data to estimate each patient’s diabetes risk, which will help identify high-risk individuals who may need to seek medical attention.

Project Purpose(s)

  • Educational

Scientific Approaches

Given the importance of giving a correct early diagnosis, this team places greater importance on the model's predictive performance, which leads the team to initially lean towards random forest or boosting (particularly XGBoost) models. These more complex models might not be as interpretable and may take longer to run, but they generally yield much higher out-of-sample performance, which is the main goal of this team’s project. The team will also look into how well a CART model performs with the dataset, given that a CART tree, if it is not too complex, would offer interpretability for the end users of this research project. A CART model would be used in lieu of a more complex model if it performs on par with the more complex models.

Anticipated Findings

The overall goal of the project is to assist doctors with treating patients. Ideally, the data analysis would provide doctors with a model of the age range in which a patient is likely to be diagnosed with diabetes based on his/her risk factors. If we can provide doctors with this information, then they can have a conversation with patients about preventative care, such as lifestyle changes, in advance of that age to try to prevent patients from developing diabetes. It also allows doctors to know when to start testing patients’ A1C levels to diagnose diabetes so the disease can be treated as quickly as possible. Early treatment keeps the untreated disease from wreaking havoc on patients’ bodies and helps prevent complications from diabetes down the line.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

  • Xiaodi Hu - Graduate Trainee, Massachusetts Institute of Technology
  • Alexander Warner - Graduate Trainee, Massachusetts Institute of Technology
  • Kenneth Fan - Graduate Trainee, Massachusetts Institute of Technology

Epidemiology of Inflammatory Skin Conditions

We are interested in studying the epidemiology of inflammatory skin conditions and diseases associated with inflammatory skin conditions.

Scientific Questions Being Studied

We are interested in studying the epidemiology of inflammatory skin conditions and diseases associated with inflammatory skin conditions.

Project Purpose(s)

  • Disease Focused Research (Inflammatory skin diseases)

Scientific Approaches

We will use the All of Us dataset to describe the epidemiology of inflammatory skin conditions across racial/ethnic and age groups. We will use association testing to determine which diseases and conditions are associated with inflammatory skin conditions.
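As an illustration of the association testing mentioned above, the sketch below computes an odds ratio with an approximate 95% confidence interval (the standard log-based Woolf method) from a 2x2 table. It is one common approach, not necessarily the one this team will use; the function name and the counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: skin condition, with comorbidity     b: skin condition, without
    c: no skin condition, with comorbidity  d: no skin condition, without
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts, not All of Us data.
estimate, lower, upper = odds_ratio(30, 70, 15, 85)
```

An interval that excludes 1 suggests an association in the sample, though testing many conditions at once would call for a multiple-comparison correction.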

Anticipated Findings

We hope to better describe the burden of inflammatory skin conditions among different racial/ethnic and age groups, and we hope to show novel associations between inflammatory skin conditions and other diseases, including cardiovascular disease, autoimmune disease, and metabolic disease. These data will help improve the treatment of inflammatory skin diseases.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.