Jessica Hamblin

Graduate Trainee, Pennsylvania State University

6 active projects

Duplicate of How to Get Started with Registered Tier Data (v7)

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect? This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Record (EHR), Physical Measurements (PM), and Survey data.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:

1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with major data types: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users who want to do their own cleaning.

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, you will understand the following:

All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.
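
The concept_id lookup described above can be sketched as follows. This is a minimal illustration, not the tutorial's own query: it uses an in-memory SQLite database as a stand-in for the CDR, the table and column names (`person`, `concept`, `gender_concept_id`) follow the OMOP common data model, and the rows are invented toy data. The helper name `count_by_gender` is hypothetical.

```python
import sqlite3

# Toy in-memory stand-in for two OMOP-style tables.
# The rows below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT);
CREATE TABLE person (person_id INTEGER PRIMARY KEY, gender_concept_id INTEGER);
INSERT INTO concept VALUES (8507, 'MALE'), (8532, 'FEMALE');
INSERT INTO person VALUES (1, 8532), (2, 8507), (3, 8532);
""")


def count_by_gender(connection):
    """Count participants per gender by resolving concept_ids to names."""
    query = """
        SELECT c.concept_name, COUNT(*) AS n_participants
        FROM person AS p
        JOIN concept AS c ON c.concept_id = p.gender_concept_id
        GROUP BY c.concept_name
        ORDER BY c.concept_name
    """
    return connection.execute(query).fetchall()


gender_counts = count_by_gender(conn)
print(gender_counts)  # [('FEMALE', 2), ('MALE', 1)]
```

In the Researcher Workbench the same style of SQL would be sent to the CDR rather than to SQLite; that connection step is omitted here.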

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of Genomics-Workshop-July2023_jmh97

Scientific Questions Being Studied

This workspace is meant to help researchers get familiar with the All of Us Researcher Workbench. There are five hands-on exercises during the workshop, each with a specific notebook.
Exercise 1: Duplicate the workspace & start the cloud environment
Exercise 2: Looking at the genomic data (notebook)
Exercise 3: GWAS - extracting phenotypic data (notebook)
Exercise 4: GWAS - running Hail GWAS (notebook)
Exercise 5: Advanced GWAS (2 notebooks)

By running the exercises in this workspace, researchers will become more familiar with the genomic data, know how to access the genomic data, see how the genomic data and tools can be used in the Researcher Workbench, and be able to start their own genomic data project.

Project Purpose(s)

  • Other Purpose (This workspace is meant for use during the Introduction to Analyzing All of Us Genomic Data workshop. In this workshop, participants will get hands-on experience using the genomic data to run a genome-wide association study (GWAS) with Hail.)

Scientific Approaches

We are using the All of Us dataset in order to run a genome-wide association study (GWAS) using Hail. In the workshop, we will give an introduction to the All of Us Researcher Workbench and demonstrate how to use the Cohort Builder and Jupyter Notebooks to set up a research project. Using Jupyter notebooks, we will create a dataset linking the All of Us phenotypic data to the short read whole genome sequencing (srWGS) data. After running the GWAS steps using Hail, we will visualize the results.

Anticipated Findings

This study runs a genome-wide association study (GWAS) with Hail, using height as the selected phenotype. We do not anticipate findings from this example workspace, but we expect that workshop participants will be able to apply similar methods to their future research.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate of Demo - Siloed Analysis of All of Us and UK Biobank Genomic Data

Scientific Questions Being Studied

Historically, researchers responded to limitations in genomic data sharing policy and practice by conducting meta-analysis on summary outputs from isolated genomic datasets. Recent work has demonstrated the increased power of individual-level genetic analysis on pooled datasets. In addition, advancements in data access and sharing policies, coupled with technological advancements in cloud-based environments for data access and analysis, have opened up new possibilities for pooled analysis of large-scale genomic datasets. The NIH All of Us Research Program and UK Biobank are two leading examples of large, population-scale studies that combine genomic data with deep phenotypic health data. There is a grand opportunity to demonstrate how the world’s largest research-ready biomedical datasets can create more value together and advance discovery in genome science.

Project Purpose(s)

  • Other Purpose (This is a demonstration project meant to support research with All of Us genomic data. Please see https://www.biorxiv.org/content/10.1101/2022.11.29.518423)

Scientific Approaches

The primary goal of this project is to demonstrate the potential of the All of Us Researcher Workbench for pooled analyses of All of Us and UK Biobank data. Specifically, we aim to:

1. Develop and describe an approved, secure path for connecting UK Biobank data to the All of Us Researcher Workbench.
2. Conduct a genome-wide association study of blood lipids on the pooled dataset, aimed at demonstrating that biomedical researchers can be more productive when permitted to analyze the union of the cohorts, as opposed to computing aggregate results in separate data silos for each cohort and then combining those aggregates.

Anticipated Findings

The secondary goal of this project is to demonstrate and measure the experience when the same analyses are repeated in a siloed manner. Specifically, we aim to:

3. Repeat the previously described genome-wide association study on the All of Us Researcher Workbench when working with the All of Us data and on UK Biobank’s DNAnexus when working with the UK Biobank data.
4. Conduct a meta-analysis on the aggregate results for each cohort (in accordance with each program’s data use policies) and compare the result of combining those aggregates to the results from the pooled analysis. Evaluate not only differences in results, but also differences in analysis cost and analyst productivity.
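
As a rough illustration of what combining per-cohort aggregates can look like, the sketch below applies a standard fixed-effect, inverse-variance-weighted meta-analysis to one variant's effect estimates. This is an assumption about the general technique, not a description of the project's actual pipeline; the function name `inverse_variance_meta` and the numbers are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Combine (beta, standard_error) pairs with inverse-variance weights.

    Each cohort is weighted by 1 / se^2, so more precise estimates count more.
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Invented per-cohort summary statistics for a single variant.
cohort_a = (0.10, 0.02)  # (beta, standard error) from one silo
cohort_b = (0.14, 0.02)  # (beta, standard error) from the other silo

pooled_beta, pooled_se = inverse_variance_meta([cohort_a, cohort_b])
print(round(pooled_beta, 4), round(pooled_se, 4))
```

With equal standard errors the pooled effect is the simple average of the two estimates, and the pooled standard error shrinks by a factor of the square root of two.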

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Margaret Sunitha Selvaraj - Research Fellow, Broad Institute
  • Melissa Patrick - Project Personnel, All of Us Program Operational Use
  • Jennifer Zhang - Project Personnel, All of Us Program Operational Use
  • Gage Rion - Project Personnel, All of Us Program Operational Use
  • David Glazer - Other, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Aymone Kouame - Other, All of Us Program Operational Use
  • Alexander Bick - Early Career Tenure-track Researcher, Vanderbilt University Medical Center

Duplicate of How to Work with All of Us Genomic Data (Hail - Plink)(v7)

Scientific Questions Being Studied

Not applicable. These notebooks provide example analyses that demonstrate how to use Hail and PLINK to perform genome-wide association studies with the All of Us genomic and phenotypic data.

Project Purpose(s)

  • Other Purpose (Demonstrate to All of Us Researcher Workbench users how to get started with the All of Us genomic data and tools. It includes an overview of all the All of Us genomic data and shows some simple examples of how to use these data.)

Scientific Approaches

Not applicable. These notebooks provide example analyses that demonstrate how to use Hail and PLINK to perform genome-wide association studies with the All of Us genomic and phenotypic data.

Anticipated Findings

Not applicable. These notebooks provide example analyses that demonstrate how to use Hail and PLINK to perform genome-wide association studies with the All of Us genomic and phenotypic data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

PheWAS

Scientific Questions Being Studied

We want to run a phenome-wide association study to look at associations between genetic markers and diseases across multiple organ systems.

Project Purpose(s)

  • Disease Focused Research (cardiometabolic, immune-related, neurodegenerative)
  • Population Health
  • Methods Development
  • Ancestry

Scientific Approaches

We will use linear and logistic regression stratified by sex, with imputed genotype data, ICD codes (conditions), and clinical laboratory measures.
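
A minimal sketch of a sex-stratified association test follows, assuming a binary condition and a binary carrier indicator in place of imputed genotype dosages. It fits a one-predictor logistic regression separately within each stratum using Newton's method. The counts are invented toy data, and the helper names `fit_logistic` and `expand` are hypothetical; a real analysis would use an established statistics package and adjust for covariates.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit y ~ intercept + slope * x by Newton's method; returns (intercept, slope)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            residual = yi - p
            weight = p * (1.0 - p)
            # Accumulate the gradient and the 2x2 information matrix.
            g0 += residual
            g1 += residual * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton update: solve the 2x2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def expand(counts):
    """Turn {(carrier, case): n} counts into per-person lists."""
    x, y = [], []
    for (carrier, case), n in counts.items():
        x.extend([carrier] * n)
        y.extend([case] * n)
    return x, y


# Invented toy counts per stratum: (carrier, case) -> number of people.
toy_counts = {
    "female": {(0, 1): 10, (0, 0): 90, (1, 1): 20, (1, 0): 80},
    "male": {(0, 1): 15, (0, 0): 85, (1, 1): 15, (1, 0): 85},
}

# Fit the model independently within each sex and keep the genotype effect.
effects = {}
for sex, counts in toy_counts.items():
    x, y = expand(counts)
    _, effects[sex] = fit_logistic(x, y)

for sex, log_odds_ratio in effects.items():
    print(sex, round(log_odds_ratio, 4))
```

With a single binary predictor, the fitted slope equals the log odds ratio of the stratum's 2x2 table, which makes the toy result easy to check by hand.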

Anticipated Findings

We expect to find a comprehensive list of genetic variants associated with multiple disease categories in males vs females.

Demographic Categories of Interest

  • Sex at Birth

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate of Skills Assessment Training Notebooks For Users (v7)

Scientific Questions Being Studied

This workspace contains multiple notebooks that assess users' understanding of the workbench and OMOP. These notebooks are meant to help users check their knowledge not only on Python, R, and SQL, but also on the general data structure and data model used by the All of Us program.

Project Purpose(s)

  • Educational

Scientific Approaches

No scientific approach is used in this workspace because it is meant for educational purposes only. We will cover all aspects of OMOP and will therefore use most datasets available in the workbench.

Anticipated Findings

We do not anticipate any findings. Instead, we are educating people on the use of the workbench and on OMOP, the common data model used by the program.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:
