Chenjie Zeng

Research Fellow, NIH

4 active projects

Duplicate of D043 AOU_DEMO_PheRS implementation_v4

Scientific Questions Being Studied

Genetic association studies often examine features independently, potentially missing subpopulations with multiple phenotypes that share a single cause. The phenotype risk score (PheRS) is an approach published by Lisa Bastarache et al. to help identify patients with unrecognized Mendelian disease patterns using phenotypes from the electronic health record (EHR). Our specific question is whether we can replicate the PheRS approach for three Mendelian diseases, cystic fibrosis (CF), hemochromatosis (HH), and sickle cell (SC) anemia, in the All of Us cohort.

Project Purpose(s)

  • Methods Development
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

PheRS uses Mendelian disease descriptions annotated with Human Phenotype Ontology (HPO) terms, and these terms are mapped to ICD-9/10 codes. We then calculate the prevalence of each term as the number of unique individuals with that term divided by the total number of individuals in the All of Us EHR cohort. We use the -log10 of the prevalence as the weight for each term. The PheRS for a particular disease is calculated by summing the weights of the terms that are present for an individual. In addition, we produce a residualized PheRS (rPheRS) using a linear regression model adjusted for age, sex, race, and the number of unique years for which the individual has billing data in the EHR (i.e., PheRS ∼ Age + Sex + Race + uniq_encounter_years). We use a cubic spline with 3 knots for age. The rPheRS is defined as the studentized residual of the PheRS from this model. A Wilcoxon rank-sum test is used to test the difference in raw PheRS or rPheRS between case and control groups.
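The weighting and scoring steps described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names (`term_weights`, `phers`), the toy cohort, and the placeholder term labels (`HP_A`, `HP_B`, `HP_C`) are all hypothetical, and the residualization and Wilcoxon steps are left out.

```python
import math
from collections import Counter


def term_weights(cohort_terms):
    """Weight each term by -log10 of its prevalence in the cohort.

    cohort_terms maps a person ID to the set of phenotype terms
    observed for that person (hypothetical input layout).
    """
    n_total = len(cohort_terms)
    # Each person contributes at most once per term because terms are sets.
    counts = Counter(t for terms in cohort_terms.values() for t in terms)
    return {t: -math.log10(n / n_total) for t, n in counts.items()}


def phers(person_terms, disease_terms, weights):
    """Sum the weights of the disease's terms that are present for one person."""
    return sum(weights[t] for t in disease_terms & person_terms if t in weights)


# Toy cohort of 10 people with placeholder term labels (not real HPO IDs).
cohort = {f"p{i}": set() for i in range(10)}
cohort["p0"] |= {"HP_A", "HP_B", "HP_C"}
for pid in ("p1", "p2", "p3", "p4"):
    cohort[pid].add("HP_B")
cohort["p5"].add("HP_C")

weights = term_weights(cohort)

# Hypothetical term list for one disease.
disease_terms = {"HP_A", "HP_B"}
scores = {pid: phers(terms, disease_terms, weights) for pid, terms in cohort.items()}
```

In this toy cohort the rare term `HP_A` (1 of 10 people) gets a larger weight than the common term `HP_B` (5 of 10), so a person carrying both scores higher than a person carrying only the common one, and a person with no matching terms scores zero.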

Anticipated Findings

We hope to replicate the PheRS algorithm using three diseases as examples: cystic fibrosis, hereditary hemochromatosis, and sickle cell disease. We also hope that All of Us can, in the future, facilitate the discovery of pathogenic variants, refine estimates of penetrance across diverse populations, and provide a more nuanced understanding of inheritance patterns.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Mendelian diseases DX

Scientific Questions Being Studied

We are studying the phenotypic burden of disease among patients diagnosed with Mendelian diseases across multiple racial groups.

Project Purpose(s)

  • Disease Focused Research (Mendelian diseases)
  • Methods Development

Scientific Approaches

We will include all participants. We will compare the disease profiles of participants who have ICD codes or other concepts indicating a genetic diagnosis of a Mendelian disease.

Anticipated Findings

We hope to understand the burden of disease in these patients, particularly among non-white ancestral groups.

Demographic Categories of Interest

  • Race / Ethnicity

Research Team

Owner:

Duplicate of Erwin Update: DJS: Duplicate of JAMA PheWAS Final Review 05-21-2020

Scientific Questions Being Studied

As a demonstration project, this study will present the results of Phenome-Wide Association Studies (PheWAS) to show how the various sources of data contained within the All of Us research dataset can be used to inform scientific discovery. We will perform separate PheWAS studies with smoking status as the independent variable. Specific questions include:

1. How can one implement a PheWAS within the All of Us Researcher Workbench?
2. How can one use heterogeneous data sources within the All of Us dataset to explore disease associations using self-reported exposures (Participant Provided Information, or “PPI”) and exposures captured in the electronic health record (EHR)?

There is no pre-specified hypothesis. It is important to determine whether PheWAS can be conducted within the All of Us Researcher Workbench.

Project Purpose(s)

  • Methods Development
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

As a demonstration project, this study will present the results of Phenome-Wide Association Studies (PheWAS) to show how the various sources of data contained within the All of Us research dataset can be used to inform scientific discovery. We will perform separate PheWAS studies with smoking status as the independent variable. Specific questions include:

1. How can one implement a PheWAS within the All of Us Researcher Workbench?
2. How can one use heterogeneous data sources within the All of Us dataset to explore disease associations using self-reported exposures (Participant Provided Information, or “PPI”) and exposures captured in the electronic health record (EHR)?

There is no pre-specified hypothesis. It is important to determine whether PheWAS can be conducted within the All of Us Researcher Workbench.
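As a rough illustration of what a PheWAS loop with smoking status as the independent variable might look like, the Python sketch below scans every phenotype code and compares its frequency between exposed and unexposed participants. It is a simplified stand-in, not the project's PheWAS package: it computes an unadjusted odds ratio with a 0.5 continuity correction rather than fitting a covariate-adjusted regression per phenotype, and the function name, input layout, and toy data are all hypothetical.

```python
def phewas_odds_ratios(exposed, phenotypes):
    """Return an unadjusted odds ratio per phenotype code.

    exposed:    dict person ID -> bool (e.g. ever-smoker status)
    phenotypes: dict person ID -> set of phenotype codes
    Both layouts are assumptions made for this sketch.
    """
    codes = set().union(*phenotypes.values())
    results = {}
    for code in sorted(codes):
        # 2x2 table: a = exposed with code, b = exposed without,
        #            c = unexposed with code, d = unexposed without.
        a = b = c = d = 0
        for person, is_exposed in exposed.items():
            has_code = code in phenotypes.get(person, set())
            if is_exposed and has_code:
                a += 1
            elif is_exposed:
                b += 1
            elif has_code:
                c += 1
            else:
                d += 1
        # Add 0.5 to every cell so empty cells do not cause division by zero.
        results[code] = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    return results


# Toy data: four exposed (s*) and four unexposed (n*) participants.
exposed = {p: p.startswith("s") for p in ("s1", "s2", "s3", "s4", "n1", "n2", "n3", "n4")}
phenotypes = {
    "s1": {"code_x", "code_y"},
    "s2": {"code_x"},
    "s3": {"code_x"},
    "s4": set(),
    "n1": {"code_x", "code_y"},
    "n2": set(),
    "n3": set(),
    "n4": set(),
}

odds_ratios = phewas_odds_ratios(exposed, phenotypes)
```

Here `code_x` is more frequent among exposed participants and gets an odds ratio above 1, while `code_y` is equally frequent in both groups and gets an odds ratio of 1. A full analysis would also need covariate adjustment and multiple-testing correction across phenotypes.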

Anticipated Findings

For this study, we anticipate that we will be able to replicate known disease associations with smoking exposure. This will demonstrate the quality, utility, and diversity of the All of Us data and tools, as well as the power of gathering multiple data sources for a single phenotype, which gives researchers options for study design and validation. Importantly, the entire PheWAS package is made available for reuse by researchers in the Workbench for new hypothesis generation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

Duplicate of Erwin Update: DJS: Duplicate of JAMA PheWAS Final Review 05-21-2020

Scientific Questions Being Studied

As a demonstration project, this study will present the results of Phenome-Wide Association Studies (PheWAS) to show how the various sources of data contained within the All of Us research dataset can be used to inform scientific discovery. We will perform separate PheWAS studies with smoking status as the independent variable. Specific questions include:

1. How can one implement a PheWAS within the All of Us Researcher Workbench?
2. How can one use heterogeneous data sources within the All of Us dataset to explore disease associations using self-reported exposures (Participant Provided Information, or “PPI”) and exposures captured in the electronic health record (EHR)?

There is no pre-specified hypothesis. It is important to determine whether PheWAS can be conducted within the All of Us Researcher Workbench.

Project Purpose(s)

  • Methods Development
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

As a demonstration project, this study will present the results of Phenome-Wide Association Studies (PheWAS) to show how the various sources of data contained within the All of Us research dataset can be used to inform scientific discovery. We will perform separate PheWAS studies with smoking status as the independent variable. Specific questions include:

1. How can one implement a PheWAS within the All of Us Researcher Workbench?
2. How can one use heterogeneous data sources within the All of Us dataset to explore disease associations using self-reported exposures (Participant Provided Information, or “PPI”) and exposures captured in the electronic health record (EHR)?

There is no pre-specified hypothesis. It is important to determine whether PheWAS can be conducted within the All of Us Researcher Workbench.

Anticipated Findings

For this study, we anticipate that we will be able to replicate known disease associations with smoking exposure. This will demonstrate the quality, utility, and diversity of the All of Us data and tools, as well as the power of gathering multiple data sources for a single phenotype, which gives researchers options for study design and validation. Importantly, the entire PheWAS package is made available for reuse by researchers in the Workbench for new hypothesis generation.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:
