Matthew Bailey

Early Career Tenure-track Researcher, Brigham Young University

2 active projects

Duplicate of Phenotype - Breast Cancer (v7)

Scientific Questions Being Studied

The Notebooks in this Workspace can be used to implement well-known phenotype algorithms in one’s own research.

This is for basic learning on how to use the All of Us data.

Project Purpose(s)

  • Disease Focused Research (cancer)
  • Population Health
  • Educational
  • Methods Development
  • Control Set
  • Other Purpose (This is an All of Us Phenotype Library Workspace created by the Researcher Workbench Support team. It is meant to demonstrate the implementation of key phenotype algorithms within the All of Us Research Program cohort.)

Scientific Approaches

This is a temporary workspace that I duplicated in order to learn how to parse All of Us phenotype data.

Anticipated Findings

By reading and running the Notebooks in this Phenotype Library Workspace, researchers can implement the following phenotype algorithms:

Ning Shang, George Hripcsak, Chunhua Weng, Wendy K. Chung, & Katherine Crew. Breast Cancer. Retrieved from https://phekb.org/phenotype/breast-cancer.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Matthew Bailey - Early Career Tenure-track Researcher, Brigham Young University

Germline Mutations that Increase Cancer Risk

Scientific Questions Being Studied

Previous studies have reported numerous heritable gene variants that can increase the risk of developing cancer. We aim to improve understanding of these gene variants and their connection to cancer risk in a more diverse population. We are also interested in exploring how these variants connect to other reported health problems in individuals who later develop cancer. Specifically, we intend to ask the following questions:

1. Are harmful germline gene variants a good predictor of whether an individual will develop cancer during their life?
2. Are there other commonly reported health problems that can be linked to greater risk for cancer in people with these gene variants?
3. Do these findings hold across a diverse population?

Project Purpose(s)

  • Disease Focused Research (cancer)
  • Ancestry

Scientific Approaches

We will create workflows that will align, intersect, extract, integrate, and analyze known predisposition cancer mutations in the All of Us cohort.
Align: We will use the UCSC genome browser tool to ensure that predisposition variants match the human reference build of All of Us.
Intersect: We will use “bedtools intersect” and “BigQuery” to identify predisposition variants in whole-genome sequencing variant call files (VCFs).
Extract: We will store all suspected cancer predisposition variants as a first “data freeze”. This dataset will be our “training set”. We will use subsequent All of Us data releases as “test sets” for any novel associations or statistical models we identify.
Integrate: Using genomics data and insurance billing codes, we will visualize the relationships between predisposition variants, cancer occurrences, and other reported health problems.
Analyze: We will build custom scripts in Python and R to identify associations found when combining genomics and phenotypic data.
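
The Intersect and Analyze steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual workflow: the function names (index_regions, intersect, odds_ratio), the coordinates, and the counts are all hypothetical, and a plain in-memory overlap test stands in for what “bedtools intersect” or a BigQuery join would do at scale. Known predisposition sites are assumed to be BED-style intervals (0-based, half-open) and observed variants VCF-style records (1-based positions).

```python
from collections import defaultdict


def index_regions(bed_rows):
    """Group BED-style (chrom, start, end) rows by chromosome.

    BED coordinates are 0-based and half-open: [start, end).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in bed_rows:
        by_chrom[chrom].append((start, end))
    return by_chrom


def intersect(variants, regions_by_chrom):
    """Keep VCF-style (chrom, pos, ref, alt) records that overlap any region.

    VCF positions are 1-based, so a record covers [pos - 1, pos - 1 + len(ref)).
    """
    hits = []
    for chrom, pos, ref, alt in variants:
        v_start, v_end = pos - 1, pos - 1 + len(ref)
        if any(v_start < end and start < v_end
               for start, end in regions_by_chrom.get(chrom, [])):
            hits.append((chrom, pos, ref, alt))
    return hits


def odds_ratio(carrier_cases, carrier_controls, other_cases, other_controls):
    """Odds ratio for a 2x2 table, adding 0.5 to each cell to avoid division by zero."""
    a, b = carrier_cases + 0.5, carrier_controls + 0.5
    c, d = other_cases + 0.5, other_controls + 0.5
    return (a * d) / (b * c)


# Made-up coordinates, for illustration only (not real predisposition sites).
known_regions = index_regions([("chr1", 100, 101), ("chr2", 500, 510)])
observed = [
    ("chr1", 101, "A", "G"),   # covers [100, 101): overlaps the first region
    ("chr1", 102, "C", "T"),   # covers [101, 102): no overlap
    ("chr2", 505, "GT", "G"),  # covers [504, 506): overlaps the second region
    ("chr3", 10, "T", "C"),    # chromosome absent from the known list
]
carrier_variants = intersect(observed, known_regions)
```

An odds ratio above 1 from odds_ratio would suggest that carriers of a flagged variant have cancer more often than non-carriers in the sample; a real analysis would also need confidence intervals and adjustment for ancestry and other covariates.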

Anticipated Findings

We expect to see that the presence of pathogenic gene variants can help predict a person’s risk for developing cancer. We anticipate that this finding will hold across a diverse population. We also expect to find other frequently reported health problems that associate with increased occurrence of cancer.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Mary Davis - Early Career Tenure-track Researcher, Brigham Young University
  • Kylee Bates - Undergraduate Student, Brigham Young University
  • Matthew Bailey - Early Career Tenure-track Researcher, Brigham Young University
  • Adam Bates - Undergraduate Student, Brigham Young University

Collaborators:

  • Justin Bryan - Undergraduate Student, Brigham Young University
  • David Stone - Undergraduate Student, Brigham Young University
  • Spencer Boris - Undergraduate Student, Brigham Young University
  • Christian Betteridge - Undergraduate Student, Brigham Young University