Christopher Lord

Project Personnel, All of Us Program Operational Use

13 active projects

Duplicate of Workshop: Intro to All of Us Genomics Data

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. There are five hands-on exercises during the workshop, each with a specific notebook. Exercise 1: Duplicate the workspace & start the cloud environment Exercise…

Scientific Questions Being Studied

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. There are five hands-on exercises during the workshop, each with a specific notebook.
Exercise 1: Duplicate the workspace & start the cloud environment
Exercise 2: Looking at the genomic data (notebook)
Exercise 3: GWAS - extracting phenotypic data (notebook)
Exercise 4: GWAS - running Hail GWAS (notebook)
Exercise 5: Advanced GWAS (2 notebooks)

By running the exercises in this workspace, researchers will become more familiar with the genomic data, know how to access the genomic data, see how the genomic data and tools can be used in the Researcher Workbench, and be able to start their own genomic data project.

Project Purpose(s)

  • Other Purpose (This workspace is meant for use during the Introduction to Analyzing All of Us Genomic Data workshop. In this workshop, participants will get hands-on experience with the genomic data by running a genome-wide association study (GWAS) using Hail.)

Scientific Approaches

We are using the All of Us dataset in order to run a genome-wide association study (GWAS) using Hail. In the workshop, we will give an introduction to the All of Us Researcher Workbench and demonstrate how to use the Cohort Builder and Jupyter Notebooks to set up a research project. Using Jupyter notebooks, we will create a dataset linking the All of Us phenotypic data to the short read whole genome sequencing (srWGS) data. After running the GWAS steps using Hail, we will visualize the results.
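
As a rough illustration of what a single association test in a GWAS computes, the sketch below regresses a phenotype on genotype dosage for one variant using ordinary least squares. It is plain Python rather than Hail, it omits covariates, and the function name and toy values are hypothetical; the workshop notebooks themselves rely on Hail.

```python
from math import sqrt


def variant_association(dosages, phenotype):
    """Ordinary least squares of phenotype on genotype dosage for one variant.

    dosages: alternate-allele counts per sample (0, 1, or 2).
    phenotype: quantitative trait values, one per sample.
    Returns (beta, standard_error, t_statistic).
    """
    n = len(dosages)
    if n != len(phenotype) or n < 3:
        raise ValueError("need matched inputs with at least 3 samples")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares, used to estimate the standard error of beta.
    rss = sum((y - (intercept + beta * x)) ** 2 for x, y in zip(dosages, phenotype))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = beta / se if se > 0 else float("inf")
    return beta, se, t_stat


# Hypothetical toy data: six samples, heights in centimeters.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_heights = [165.0, 167.0, 170.0, 172.0, 175.0, 177.0]
beta, se, t_stat = variant_association(toy_dosages, toy_heights)
print(f"beta={beta:.3f} se={se:.3f} t={t_stat:.3f}")
```

A real GWAS repeats a test like this across millions of variants and adjusts for covariates, which is why a distributed tool is used in practice.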

Anticipated Findings

This study runs a genome-wide association study (GWAS) in Hail, with height as the selected phenotype. We do not anticipate findings from this example workspace, but we expect that workshop participants will be able to apply similar methods to their future research.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Ghada Soliman - Other, City University of New York (CUNY)
  • Jennifer Zhang - Project Personnel, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Chris Lord - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Genevieve Brandt - Project Personnel, All of Us Program Operational Use

Duplicate of Data Wrangling in All of Us Program (v7)

For educational purposes: to show best practices when using Jupyter notebooks for data access, storage, and data manipulation (transformations, conversions, cleaning, optimization), and to address other research-support issues that are useful to multiple AoU researchers.

Scientific Questions Being Studied

For educational purposes: to show best practices when using Jupyter notebooks for data access, storage, and data manipulation (transformations, conversions, cleaning, optimization), and to address other research-support issues that are useful to multiple AoU researchers.

Project Purpose(s)

  • Educational
  • Other Purpose (For use with office hours. Notebooks for adding code snippets useful to researchers. This is a placeholder for creating notebooks on best practices, among other things.)

Scientific Approaches

For educational purposes: to show best practices when using Jupyter notebooks for data access, storage, and data manipulation (transformations, conversions, cleaning, optimization), and to address other research-support issues that are useful to multiple AoU researchers.

Anticipated Findings

For educational purposes: to show best practices when using Jupyter notebooks for data access, storage, and data manipulation (transformations, conversions, cleaning, optimization), and to address other research-support issues that are useful to multiple AoU researchers.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use

regenie_ldl_gwas_with_cromwell_aouv7_controlled

The goal of this workspace is to provide another example of how to run a GWAS, this time using Cromwell to run tools outside of the notebook.

Scientific Questions Being Studied

The goal of this workspace is to provide another example of how to run a GWAS, this time using Cromwell to run tools outside of the notebook.

Project Purpose(s)

  • Educational

Scientific Approaches

This workspace will feature notebooks using Cromwell to run regenie via WDL and a notebook to do the analysis of the regenie GWAS results. The phenotype of interest is LDL cholesterol and we'll be using participant age and sex assigned at birth as covariates along with the top 15 ancestry PCs.

Anticipated Findings

I expect to be able to reproduce the GWAS results found in TOPMed and in the v6 regenie featured workspace. Given that this will be v7 data, which contain more samples than v6, there will likely be additional hits beyond those discovered in v6 due to more statistical power.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Sophie Schwartz - Project Personnel, All of Us Program Operational Use
  • Jun Qian - Other, All of Us Program Operational Use
  • Genevieve Brandt - Project Personnel, All of Us Program Operational Use
  • Chris Lord - Project Personnel, All of Us Program Operational Use

Duplicate of Workshop: Intro to All of Us Genomics Data

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. There are five hands-on exercises during the workshop, each with a specific notebook. Exercise 1: Duplicate the workspace & start the cloud environment Exercise…

Scientific Questions Being Studied

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. There are five hands-on exercises during the workshop, each with a specific notebook.
Exercise 1: Duplicate the workspace & start the cloud environment
Exercise 2: Looking at the genomic data (notebook)
Exercise 3: GWAS - extracting phenotypic data (notebook)
Exercise 4: GWAS - running Hail GWAS (notebook)
Exercise 5: Advanced GWAS (2 notebooks)

By running the exercises in this workspace, researchers will become more familiar with the genomic data, know how to access the genomic data, see how the genomic data and tools can be used in the Researcher Workbench, and be able to start their own genomic data project.

Project Purpose(s)

  • Other Purpose (This workspace is meant for use during the Introduction to Analyzing All of Us Genomic Data workshop. In this workshop, participants will get hands-on experience with the genomic data by running a genome-wide association study (GWAS) using Hail.)

Scientific Approaches

We are using the All of Us dataset in order to run a genome-wide association study (GWAS) using Hail. In the workshop, we will give an introduction to the All of Us Researcher Workbench and demonstrate how to use the Cohort Builder and Jupyter Notebooks to set up a research project. Using Jupyter notebooks, we will create a dataset linking the All of Us phenotypic data to the short read whole genome sequencing (srWGS) data. After running the GWAS steps using Hail, we will visualize the results.

Anticipated Findings

This study runs a genome-wide association study (GWAS) in Hail, with height as the selected phenotype. We do not anticipate findings from this example workspace, but we expect that workshop participants will be able to apply similar methods to their future research.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Genevieve Brandt - Project Personnel, All of Us Program Operational Use

Duplicate of How to Work with All of Us Genomic Data (Hail - Plink)(v7)

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Scientific Questions Being Studied

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Project Purpose(s)

  • Other Purpose (Demonstrate to All of Us Researcher Workbench users how to get started with the All of Us genomic data and tools. It includes an overview of all the All of Us genomic data and shows some simple examples of how to use these data.)

Scientific Approaches

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Anticipated Findings

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Polygenic_Risk_Score_Genetic_Ancestry_Calibration

Polygenic risk scores (PRS) are available for a wide array of traits and conditions, offering many potential applications including preventative medicine. There is, however, a serious concern that clinical use of PRS could contribute to health disparities due to the…

Scientific Questions Being Studied

Polygenic risk scores (PRS) are available for a wide array of traits and conditions, offering many potential applications including preventative medicine. There is, however, a serious concern that clinical use of PRS could contribute to health disparities due to the poorer performance of PRS in non-European ancestry individuals.
We aim to improve our ability to correct the genetic ancestry-dependent bias in PRS for 10 conditions (Asthma, Atrial fibrillation, Breast Cancer, Chronic Kidney Disease, Coronary heart disease, Hypercholesterolemia, Obesity/BMI, Prostate cancer, Type 1 Diabetes, Type 2 Diabetes). We will use the AoU dataset to produce a resource that can be used to reduce the ancestry-dependent bias in these 10 PRS. This resource will initially be used by the eMERGE IV consortium, which is an NIH-funded consortium of clinical centers across the United States, with an aim to enroll a prospective cohort of 25,000 individuals.

Project Purpose(s)

  • Control Set

Scientific Approaches

Arrays will be imputed using the phasing and imputation tools Eagle2 and Minimac4. Polygenic risk scores will then be calculated using the population genomics tool PLINK. A simple linear model, which attempts to describe the macroscopic relationship between genetic ancestry and observed polygenic scores, will then be fit to the scores. The fitted parameters of this model can then be used to reduce genetic ancestry-dependent bias when calculating these scores in a clinical setting.
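
One way to picture the calibration step is the minimal sketch below, which fits a score as a linear function of a single ancestry principal component and then subtracts the ancestry-predicted mean from a raw score. This is an assumption-laden illustration, not the project's actual model: the use of one principal component, the mean-only adjustment, the function names, and the toy values are all hypothetical.

```python
def fit_ancestry_model(pc_values, scores):
    """Least-squares fit of score = intercept + slope * PC.

    pc_values: one ancestry principal-component value per sample.
    scores: raw polygenic scores, one per sample.
    Returns (intercept, slope).
    """
    n = len(pc_values)
    if n != len(scores) or n < 2:
        raise ValueError("need matched inputs with at least 2 samples")
    mean_pc = sum(pc_values) / n
    mean_score = sum(scores) / n
    spp = sum((p - mean_pc) ** 2 for p in pc_values)
    if spp == 0:
        raise ValueError("principal component has no variation")
    slope = sum(
        (p - mean_pc) * (s - mean_score) for p, s in zip(pc_values, scores)
    ) / spp
    intercept = mean_score - slope * mean_pc
    return intercept, slope


def calibrate_score(raw_score, pc_value, intercept, slope):
    """Remove the ancestry-predicted mean from a raw score."""
    return raw_score - (intercept + slope * pc_value)


# Hypothetical reference data used to fit the model.
reference_pcs = [-1.0, 0.0, 1.0, 2.0]
reference_scores = [0.5, 1.0, 1.5, 2.0]
intercept, slope = fit_ancestry_model(reference_pcs, reference_scores)

# Applying the fitted parameters to a new individual's raw score.
print(calibrate_score(2.0, 2.0, intercept, slope))
```

The fitted parameters (here `intercept` and `slope`) are the reusable resource: they can be applied to new individuals without access to the reference data.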

Anticipated Findings

We will produce a set of fitted parameters for a simple model which attempts to describe the macroscopic relationship between genetic ancestry and observed polygenic scores. The fitted parameters of this model can then be used as a resource to reduce genetic ancestry-dependent bias when calculating these scores in a clinical setting.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Michael Gatzen - Project Personnel, Broad Institute
  • Fabio Cunial - Project Personnel, Broad Institute

Demo Project: State-level Activity Inequality [Published Work]

How is physical activity distributed within states in the US? Analysis of such activity distributions and inequality can reveal important relationships between physical activity disparities, health outcomes, and modifiable factors, as Althoff et al. studied in their paper, "Large-scale physical…

Scientific Questions Being Studied

How is physical activity distributed within states in the US? Analysis of such activity distributions and inequality can reveal important relationships between physical activity disparities, health outcomes, and modifiable factors, as Althoff et al. studied in their paper, "Large-scale physical activity data reveal worldwide activity inequality" (2017).

Project Purpose(s)

  • Educational

Scientific Approaches

The cohort will consist of Fitbit users in the US, with analysis being subdivided to the state level. Various graphs will be utilized to help visualize the low- and high-activity trends across states. Well-defined measures such as the Gini coefficient will be used to aid in the analysis of activity inequality.
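
For readers unfamiliar with the Gini coefficient, the sketch below computes it from a list of non-negative values using the standard sorted-rank formula. The per-state step counts are hypothetical and the function name is illustrative; this is not the workspace's own code.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    Returns 0.0 when all values are equal and approaches 1.0 as the
    total becomes concentrated in a single observation.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    if data[0] < 0:
        raise ValueError("values must be non-negative")
    total = sum(data)
    if total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(data, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical average daily step counts for users in two states.
steps_by_state = {
    "State A": [4000, 5000, 6000, 7000],
    "State B": [1000, 2000, 9000, 12000],
}
for state, steps in steps_by_state.items():
    print(state, round(gini(steps), 3))
```

A higher value for a state indicates that activity is distributed less evenly among its users.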

Anticipated Findings

The study aims to find relationships between activity inequality and health outcomes, such as obesity levels. With the growing accessibility of fitness trackers and activity sensors built into personal devices, this study hopes to leverage the volume of available data and potentially inform measures to improve population activity and health.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate of How to Work with All of Us Genomic Data (Hail - Plink)(v7)

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Scientific Questions Being Studied

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Project Purpose(s)

  • Ancestry
  • Other Purpose (Demonstrate to All of Us Researcher Workbench users how to get started with the All of Us genomic data and tools. It includes an overview of all the All of Us genomic data and shows some simple examples of how to use these data.)

Scientific Approaches

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Anticipated Findings

Not applicable - these notebooks provide example analyses showing how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic and phenotypic data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate of How to Work with All of Us Survey Data (v7)

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data. What should you expect? By running the notebooks in this workspace, you should get familiar with how to query…

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect?
By running the notebooks in this workspace, you should become familiar with how to query PPI questions/surveys and how to find the frequencies of answers for each question in each PPI module.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Tutorial Workspace created by the Researcher Workbench Support team. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

By running the notebooks in this workspace, you should become familiar with how to query PPI questions/surveys and how to find the frequencies of answers for each question in each PPI module.
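
To make "frequencies of answers for each question" concrete, here is a small sketch that tallies answers per question from (question, answer) pairs. The record layout and the sample questions are hypothetical stand-ins; the tutorial notebooks query the actual survey data in the Workbench, which this example does not attempt to reproduce.

```python
from collections import Counter, defaultdict


def answer_frequencies(responses):
    """Count how often each answer was given for each question.

    responses: iterable of (question, answer) pairs.
    Returns a dict mapping question -> {answer: count}.
    """
    freq = defaultdict(Counter)
    for question, answer in responses:
        freq[question][answer] += 1
    return {question: dict(counts) for question, counts in freq.items()}


# Hypothetical survey responses, one pair per participant answer.
sample_responses = [
    ("General health rating", "Good"),
    ("General health rating", "Fair"),
    ("General health rating", "Good"),
    ("Smoked 100 cigarettes in lifetime", "No"),
]
for question, counts in answer_frequencies(sample_responses).items():
    print(question, counts)
```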

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, researchers will learn the following:
- how to query the survey data,
- how to summarize PPI modules and questions.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Jun Qian - Other, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Chenyu Li - Graduate Trainee, University of Pittsburgh
  • Brandy Mapes - Other, All of Us Program Operational Use

AOU_Recover_Long_Covid_v6

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Scientific Questions Being Studied

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

To achieve this objective, data science workflows were used to apply ML algorithms on the Researcher Workbench. This effort expanded the number of participants used to evaluate the ML models that identify risk of PASC/Long COVID; it also serves to validate one team's efforts and to provide insight to other teams. These models were implemented within the All of Us Controlled Tier data (C2022Q2R2), which was last refreshed on June 22, 2022. We intend to provide a step-by-step guide for implementing N3C's ML model for identifying the PASC/Long COVID phenotype in the All of Us dataset.

Anticipated Findings

We intend to provide a step-by-step guide for the implementation of N3C's ML Model for identification of PASC/Long COVID Phenotype in the All of Us dataset.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • WeiQi Wei - Other, All of Us Program Operational Use
  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Mark Weiner - Mid-career Tenured Researcher, Cornell University
  • Hiral Master - Project Personnel, All of Us Program Operational Use
  • Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
  • David Mohs - Other, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Chenchal Subraveti - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use
  • Chris Lunt - Other, All of Us Program Operational Use

GeneticAncestryDemoProject

As a demonstration project, this project will describe, characterize, and validate the extent of diversity in the All of Us cohort with respect to the participants' race & ethnicity (which are socially defined), and genetic ancestry (which can be objectively…

Scientific Questions Being Studied

As a demonstration project, this project will describe, characterize, and validate the extent of diversity in the All of Us cohort with respect to the participants' race & ethnicity (which are socially defined), and genetic ancestry (which can be objectively inferred from participants' genomes). Socially defined race & ethnicity and genetically inferred ancestry are both relevant to health outcomes. Race & ethnicity shape individuals’ lived experience and social environment, e.g., structural inequities, environmental injustice, and barriers to healthcare access. Genetic ancestry can affect health outcomes via differences in the frequencies of variants associated with disease and drug response. Specifically, we will ask:

1. What is the extent of racial, ethnic, and genetic diversity in the All of Us cohort?

2. How do genetic ancestry and admixture change over geography and with age in the US?

3. Are there associations between genetic ancestry and health outcomes in the All of Us cohort?

Project Purpose(s)

  • Population Health
  • Methods Development
  • Ancestry
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

To characterize the diversity of the All of Us cohort, we analyzed participant genetic, demographic, and geographic data.

Here is a brief list of methods used:

1. All of Us participant genome-wide genotype data were merged and harmonized with global reference population data.

2. Unsupervised clustering analysis techniques - Hopkins statistic, visual assessment of clustering tendency, K-means clustering & UMAP - to assess the extent of genetic structure in the cohort.

3. Supervised genetic ancestry inference using global reference populations, principal components analysis, and the Rye (Rapid ancestrY Estimation) program.

4. Genetic ancestry was compared to participants' self-identified race & ethnicity.

5. Geocoded data and participant age were used to measure how genetic ancestry and admixture vary with respect to participant geography and age.

6. Admixture regression to associate participant health outcomes, gleaned from electronic health records, with their genetic ancestry.

Anticipated Findings

1. The All of Us participant cohort will be racially, ethnically, and genetically diverse, consistent with the project’s aim to recruit underrepresented biomedical research groups in support of health equity.

2. All of Us participant genetic variation will be highly structured and best modeled by clusters rather than a continuum of variation.

3. All of Us participants will show patterns of genetically inferred ancestry that are correlated with their socially defined ancestry (i.e., race and ethnicity).

4. All of Us participants’ genetic ancestry and admixture will change over geography and with age.

5. All of Us participants’ genetic ancestry will be associated with a variety of health outcomes.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology
  • Jun Qian - Other, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Ashley Green - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Jennifer Zhang - Project Personnel, All of Us Program Operational Use

AFib epidemiology (AOU v4)

The overall goal of this study, as a Demonstration project, is to evaluate the ability of the All of Us Research Program data to replicate epidemiologic patterns of atrial fibrillation (AF), a common arrhythmia, previously described in other settings. We…

Scientific Questions Being Studied

The overall goal of this study, as a Demonstration project, is to evaluate the ability of the All of Us Research Program data to replicate epidemiologic patterns of atrial fibrillation (AF), a common arrhythmia, previously described in other settings. We will address this goal with these two aims:
• Specific Aim 1. To determine the association of race and ethnicity with the prevalence and incidence of atrial fibrillation (AF). We hypothesize that non-whites will have lower prevalence and incidence of AF than whites.
• Specific Aim 2. To estimate associations of established risk factors for AF with the prevalence and incidence of AF. We hypothesize that increased body mass index, higher blood pressure, diabetes, smoking and a prior history of cardiovascular diseases will be associated with increased prevalence and incidence of AF.

Project Purpose(s)

  • Population Health
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

We will select all All of Us participants who self-reported sex at birth as male or female and whose self-reported race was white, black, or Asian, as well as those who self-reported being Hispanic.

Atrial fibrillation (AF) will be identified from self-reports in the medical survey or from electronic health records (EHR).

Clinical factors will be identified from EHR and study measurements (blood pressure, weight, height).

We will evaluate the association of demographic (age, sex, race/ethnicity) and clinical (body mass index, blood pressure, smoking, cardiovascular diseases) factors with prevalence of self-reported AF and prevalence of AF in the EHR, as well as incident AF ascertained from the EHR.

Anticipated Findings

The overall goal of this project is to evaluate the prevalence and incidence of atrial fibrillation (AF), overall and by race/ethnicity, as well as to confirm the association of established risk factors for AF in the All of Us Research participants. We expect to confirm associations between demographic and clinical variables previously reported in the literature, demonstrating the value of the All of Us Research Program data to address questions regarding this common cardiovascular disease.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Vignesh Subbian - Early Career Tenure-track Researcher, University of Arizona
  • Francis Ratsimbazafy - Other, All of Us Program Operational Use
  • Aymone Kouame - Other, All of Us Program Operational Use
  • Aniqa Alam
  • Konstantinos Sidiropoulos - Other, Nova Southeastern University

Wearables and The Human Phenome (Published Work)

Our primary goal is to understand the relationship between activity levels and the development and progression of human disease. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses…

Scientific Questions Being Studied

Our primary goal is to understand the relationship between activity levels and the development and progression of human disease. Higher physical activity is associated with lower prevalence and better outcomes in virtually every human disease. These analyses will generate hypotheses guiding clinical and research interventions focused on activity to reduce morbidity and mortality in patients seeking care.

This workspace is a replication workspace for the Wearables and The Human Phenome project. We replicated the workspace to provide a clean and reduced version of the code that was used to generate the findings, which were published in Nature Medicine (https://www.nature.com/articles/s41591-022-02012-w).

Project Purpose(s)

  • Population Health
  • Social / Behavioral

Scientific Approaches

We will examine the relationship between daily activity (steps, activity intensity) over time and the prevalence and progression of coded human diseases. We will use the Fitbit data, EHR-curated diagnoses, laboratory values, and survey results.

Anticipated Findings

We expect to find that lower levels of activity are associated with a higher prevalence and more rapid progression of chronic diseases. These data will provide the rationale to link wearables data with electronic health records nationwide as a window into behavioral activity choice as a modifiable risk factor for chronic diseases. We may find substantial variation in activity and disease prevalence/severity by socioeconomic status, which would motivate studies/interventions to reduce these health disparities.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use
Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.