Jason Karnes

Early Career Tenure-track Researcher, University of Arizona

23 active projects

ABO Systematic Review

Scientific Questions Being Studied

Research questions:

1) Can our novel ABO blood typing algorithm using genetic data be used effectively to extensively type ABO subtypes from whole genome sequencing and array data in a diverse cohort?
2) Will a SNP approach for ABO blood typing be concordant with available serotype?
3) What disease associations with ABO blood types can be replicated using the AllofUs dataset?
4) What novel disease associations, if any, with ABO blood types can be identified in a diverse cohort?

Relevance: Genomic variation in RBC antigens is associated with a myriad of conditions. The ABO locus alone is associated with many conditions including venous thromboembolism (VTE), pancreatic cancer, malaria, and COVID-19. Furthermore, it is not common practice to extensively type beyond the traditional ABO blood groups, and the studies that do so are primarily done in individuals of European ancestry. Thus, we seek to do the first PheWAS on extensively typed RBC antigens and to do so in a diverse cohort.

Project Purpose(s)

  • Disease Focused Research (red blood cell (RBC) antigen-associated diseases)

Scientific Approaches

We plan to employ a blood typing algorithm to extensively type RBC antigens from 1) whole genome sequencing and 2) array data in the AllofUs cohort, and compare the two outcomes. Then, we plan to employ the phenome-wide association study (PheWAS) approach to identify associations between RBC antigen types and other clinical phenotypes. PheWAS will be carried out using multivariable linear and logistic regression on the ABO blood types assigned by our novel algorithm. For example, in the case of the ABO blood group, ABO blood subtypes (A101, A102, Aw01, B101, etc.) will act as the independent variable and phenotypes, derived from participant provided information (PPI) and electronic health records (EHR), as the dependent variable. Initial models will include adjustments for age, gender, and race/ethnicity. Differential associations by race/ethnicity, gender, and sex will also be evaluated.
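
The comparison of WGS-derived and array-derived calls could be summarized as a simple concordance rate. The sketch below is an illustration only: the function name, the dictionary layout, and the participant identifiers are hypothetical and do not come from the project's algorithm.

```python
def concordance(wgs_calls, array_calls):
    """Return (n_compared, fraction_matching) for participants typed by both methods."""
    shared = sorted(set(wgs_calls) & set(array_calls))
    if not shared:
        return 0, float("nan")
    matches = sum(1 for pid in shared if wgs_calls[pid] == array_calls[pid])
    return len(shared), matches / len(shared)


# Hypothetical subtype calls keyed by made-up participant IDs.
wgs = {"p1": "A101", "p2": "B101", "p3": "A102", "p4": "Aw01"}
array = {"p1": "A101", "p2": "B101", "p3": "A101", "p5": "B101"}

n, rate = concordance(wgs, array)
print(f"{n} participants compared, concordance = {rate:.2f}")
```

Only participants present in both call sets are compared, so samples typed by a single platform do not affect the rate.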

Anticipated Findings

This proposed project aims to test our novel ABO blood typing algorithm on WGS and array data in the diverse AllofUs cohort. We also aim to replicate known RBC-disease associations as well as identify any novel ones that emerge within a diverse cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona
  • Sadaf Raoufi - Graduate Trainee, University of Arizona
  • Rudramani Pokhrel - Other, University of Arizona
  • Jason Giles - Research Fellow, University of Arizona
  • Ehsan Khajouei - Project Personnel, University of Arizona
  • Andrew Klein - Graduate Trainee, University of Arizona

Duplicate of Heparin-induced Thrombocytopenia (HIT) GWAS

Scientific Questions Being Studied

Research questions: Can we identify novel genomic associations with heparin-induced thrombocytopenia (HIT)?
Relevance: Heparin is a widely used anticoagulant that carries the risk of an antibody-mediated adverse drug reaction referred to as heparin-induced thrombocytopenia (HIT). A subset of heparin-treated patients produces detectable levels of antibodies against complexes of heparin bound to circulating platelet factor 4 (PF4). We aim to identify genetic variants associated with HIT using a genome-wide association study (GWAS) approach.

Project Purpose(s)

  • Disease Focused Research (heparin-induced thrombocytopenia)

Scientific Approaches

We plan to identify a HIT-positive cohort as well as a healthy control group that have genotype data available to perform a GWAS using PLINK. Our primary GWAS will feature a logistic regression of HIT status. Regression models will be adjusted for age, sex, and principal components 1 to 3.
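
For a single variant, the model described above is a logistic regression of case status on genotype dosage plus covariates. The sketch below fits that model with Newton's method in plain Python on simulated data. It is not PLINK and does not reproduce PLINK's output; the sample size, allele frequency, effect sizes, and standardized covariates are all assumptions chosen for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logistic(rows, y, iterations=15):
    """Fit logistic regression by Newton's method; an intercept column is added here."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += w * xi[i] * xi[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Simulated cohort (all values are made up): dosage, age, sex, PC1 to PC3.
random.seed(0)
rows, y = [], []
for _ in range(2000):
    dosage = sum(random.random() < 0.3 for _ in range(2))  # 0, 1, or 2 copies
    age = random.gauss(0, 1)                               # standardized age
    sex = random.randint(0, 1)
    pcs = [random.gauss(0, 1) for _ in range(3)]
    logit = -2.0 + 0.8 * dosage + 0.3 * age
    y.append(1 if random.random() < 1 / (1 + math.exp(-logit)) else 0)
    rows.append([dosage, age, sex] + pcs)

beta = fit_logistic(rows, y)
print("adjusted odds ratio per allele:", round(math.exp(beta[1]), 2))
```

The coefficient at index 1 is the per-allele log odds ratio adjusted for the other columns. A genome-wide scan repeats this fit once per variant.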

Anticipated Findings

This proposed project aims to replicate known associations between genetic variants and HIT as well as identify any novel ones that emerge within a diverse cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
  • Andrew Klein - Graduate Trainee, University of Arizona

Collaborators:

  • Kiana Martinez - Research Fellow, University of Arizona

CYP2C19 PheWAS

Scientific Questions Being Studied

Polymorphisms in CYP2C19 have pharmacogenomic relevance to drugs such as clopidogrel. We aim to perform a phenome-wide association study (PheWAS) that tests the association between CYP2C19 variants and other phenotypes in order to identify clinical impacts of pharmacogenomic variation.

Project Purpose(s)

  • Disease Focused Research (CYP2C19-associated diseases, including Stent Thrombosis)
  • Ancestry

Scientific Approaches

We aim to employ a PheWAS approach to systematically identify associations between CYP2C19 variants and clinical phenotypes. PheWAS will be carried out using multivariable logistic regression. Initial models will adjust for age, sex, and principal components. Differential associations by race/ethnicity and sex will also be evaluated.
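
Structurally, a PheWAS repeats one association test per phenotype for a fixed genetic exposure. The sketch below shows that loop using an unadjusted carrier versus non-carrier odds ratio as a simple stand-in for the multivariable logistic regression described above. The phenotype names and participant data are made up.

```python
def odds_ratio(carrier_flags, case_flags):
    """Unadjusted odds ratio from a 2x2 table, adding 0.5 to every cell (Haldane-Anscombe)."""
    a = b = c = d = 0
    for carrier, case in zip(carrier_flags, case_flags):
        if carrier and case:
            a += 1      # carrier, case
        elif carrier:
            b += 1      # carrier, control
        elif case:
            c += 1      # non-carrier, case
        else:
            d += 1      # non-carrier, control
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def phewas(carrier_flags, phenotypes):
    """Run one test per phenotype and return (name, odds ratio) pairs, largest first."""
    results = {name: odds_ratio(carrier_flags, cases) for name, cases in phenotypes.items()}
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: 8 participants, carrier status for one variant, two phenotypes.
carriers = [1, 1, 1, 1, 0, 0, 0, 0]
phenotypes = {
    "phenotype_A": [1, 1, 1, 0, 0, 0, 0, 1],
    "phenotype_B": [0, 1, 0, 1, 1, 0, 1, 0],
}

ranked = phewas(carriers, phenotypes)
for name, value in ranked:
    print(name, round(value, 2))
```

In a full analysis each test would be replaced by the covariate-adjusted model, with one fit per phenotype.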

Anticipated Findings

The proposed project aims to identify associations between CYP2C19 variants and PAH as well as novel associations with other clinical phenotypes. This will help improve risk prediction for PAH.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona
  • Sadaf Raoufi - Graduate Trainee, University of Arizona

Immunogenomic Associations with Disease and Differential Risk (v7)

Scientific Questions Being Studied

We propose to perform immunogenomic phenome-wide association studies (iPheWAS), a disease-neutral approach that identifies associations between immunogenomic variation and a broad array of phenotypes.

Project Purpose(s)

  • Ancestry

Scientific Approaches

The main scientific approach is the phenome-wide association study (PheWAS), using the genomic and EHR datasets. The influence of genetic variation in several important loci, including HLA, will be interrogated across a wide array of diseases using PheWAS. Differential risk across diverse populations and biological sex will also be interrogated.

Anticipated Findings

We expect to replicate a wide array of immunogenomic associations across diseases. We also expect to find that the influence of genetic variation differs between groups of diverse ancestries.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Sadaf Raoufi - Graduate Trainee, University of Arizona
  • Daphne Demekas - Project Personnel, University of Arizona
  • Genevieve Krause - Project Personnel, University of Arizona

haplotype phasing

Scientific Questions Being Studied

I want to explore the All of Us Workbench environment and learn what kinds of data are available. I would also like to evaluate the performance of haplotype phasing methods.

Project Purpose(s)

  • Educational

Scientific Approaches

I will select a short-read WGS VCF file (chr20) and run haplotype phasing methods such as SHAPEIT and Beagle to evaluate their performance and limitations.
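
The description does not name an evaluation metric. One commonly used measure of phasing accuracy is the switch error rate, and the sketch below assumes that metric. It compares a phased result against a reference phasing at heterozygous sites and does not call SHAPEIT or Beagle. The input layout, a list of allele pairs per site, is a hypothetical simplification of a phased VCF.

```python
def switch_error_rate(truth, inferred):
    """Count phase switches between consecutive heterozygous sites.

    truth and inferred are lists of (allele_on_hap1, allele_on_hap2) per site,
    with alleles coded 0 or 1. Returns (n_switches, rate).
    """
    orientation = []
    for (t1, t2), (i1, i2) in zip(truth, inferred):
        if t1 == t2:
            continue  # homozygous sites carry no phase information
        if {i1, i2} != {t1, t2}:
            raise ValueError("genotypes disagree between truth and inferred")
        orientation.append(t1 == i1)  # True if hap1 matches the truth at this site
    if len(orientation) < 2:
        return 0, float("nan")
    switches = sum(1 for prev, cur in zip(orientation, orientation[1:]) if prev != cur)
    return switches, switches / (len(orientation) - 1)


# Made-up example: six sites, one of them homozygous.
truth = [(0, 1), (1, 0), (0, 0), (0, 1), (1, 0), (0, 1)]
inferred = [(0, 1), (1, 0), (0, 0), (1, 0), (0, 1), (0, 1)]

n_switches, rate = switch_error_rate(truth, inferred)
print(n_switches, rate)
```

The rate is the number of switches divided by the number of adjacent heterozygous pairs, so a globally flipped haplotype pair counts as zero errors.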

Anticipated Findings

Which haplotype phasing methods are best for the diverse data in All of Us? How can I improve existing methods?

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Noor Subah - Project Personnel, University of Arizona
  • Genevieve Krause - Project Personnel, University of Arizona

Impact of APOE4 on survival from diagnosis of AD dementia in diverse populations

Scientific Questions Being Studied

The APOE4 allele is the major susceptibility gene for developing Alzheimer’s disease (AD) at older ages. APOE has 3 common alleles (APOE2, 3 & 4), giving rise to 6 genotypes (APOE2/2, 2/3, 2/4, 3/3, 3/4 & 4/4). In comparison to APOE3/3, the most common genotype, each copy of the APOE4 allele is associated with higher risk of AD dementia & younger median age at dementia onset. The impact of APOE4 on risk, rate of decline, & differential effects of the first disease-modifying AD medications has begun to have a major impact on the fight against AD. Recent studies in relatively small cohorts raised the possibility that APOE4 has a smaller impact on AD risk in African American/Black & Hispanic/Latino than in non-Hispanic persons. Confirming that possibility in a large real-world cohort could have major implications for research & care in these underrepresented groups, as well as efforts to discover protective mechanisms that could be targeted by future AD-modifying & prevention therapies.

Project Purpose(s)

  • Disease Focused Research (Alzheimer's disease)

Scientific Approaches

We propose to capitalize on longitudinal real-world electronic health record (EHR) data from All of Us to characterize differential risk of progressing to clinical diagnosis of probable AD dementia in APOE4 homozygotes (HM, 4/4), heterozygotes (HT, 3/4) & non-carriers (NC, 3/3) among African American/Black, Hispanic/Latino & non-Hispanic participants. We will use data from participants with these genotypes who are initially ages 60-80, don’t have an initial diagnosis of AD dementia & have 5+ years of subsequent EHR data. Survival analyses will control for potential confounds of age, sex, education & if available an indicator of SES. To test our hypothesis with improved statistical power, we will combine HM & HT into an aggregate APOE4 carrier group, compare survival from AD dementia in the initial analysis & control for the potential confound of differences among ethnic/racial groups in the carrier group. Exploratory analyses will characterize HM vs NC & HT vs NC in the 3 ethnic/racial groups.
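
The inclusion criteria and genotype grouping described above can be expressed as a small filter. This is a sketch under assumed inputs: the record fields (baseline_age, ad_dementia_at_baseline, followup_years, apoe) are hypothetical names, not columns of the All of Us dataset.

```python
GROUPS = {("4", "4"): "HM", ("3", "4"): "HT", ("3", "3"): "NC"}


def carrier_group(genotype):
    """Map a genotype string such as '3/4' to HM, HT, or NC; None if outside the design."""
    return GROUPS.get(tuple(sorted(genotype.split("/"))))


def eligible(record):
    """Ages 60-80 at baseline, no AD dementia at baseline, 5+ years of follow-up."""
    return (
        60 <= record["baseline_age"] <= 80
        and not record["ad_dementia_at_baseline"]
        and record["followup_years"] >= 5
        and carrier_group(record["apoe"]) is not None
    )


def build_cohort(records):
    """Return (id, group, is_carrier) for eligible records; HM and HT count as carriers."""
    cohort = []
    for rec in records:
        if eligible(rec):
            group = carrier_group(rec["apoe"])
            cohort.append((rec["id"], group, group in ("HM", "HT")))
    return cohort


def make(pid, apoe, age, ad, years):
    """Build one made-up participant record."""
    return {"id": pid, "apoe": apoe, "baseline_age": age,
            "ad_dementia_at_baseline": ad, "followup_years": years}


records = [
    make("p1", "4/4", 65, False, 6),
    make("p2", "3/4", 72, False, 5),
    make("p3", "3/3", 59, False, 9),   # too young at baseline
    make("p4", "2/3", 70, False, 7),   # genotype outside HM/HT/NC
    make("p5", "4/3", 78, False, 8),
    make("p6", "3/3", 61, True, 6),    # AD dementia already diagnosed
    make("p7", "3/3", 80, False, 10),
]

cohort = build_cohort(records)
print(cohort)
```

The boolean carrier flag corresponds to the aggregate APOE4 carrier group used in the primary comparison, while the HM, HT, and NC labels support the exploratory contrasts.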

Anticipated Findings

We hypothesize that the impact of APOE4 on a person’s AD risk is attenuated in these underrepresented groups (URGs). Confirming that possibility in a large real-world cohort could have major implications for research and care in these URGs, as well as the effort to discover protective mechanisms that could be targeted by future AD-modifying and prevention therapies.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Valentina Ghisays - Research Associate, Banner Health
  • Marcus Naymik - Mid-career Tenured Researcher, Translational Genomics Research Institute
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
  • Ignazio Piras - Early Career Tenure-track Researcher, Translational Genomics Research Institute
  • Ehsan Khajouei - Project Personnel, University of Arizona
  • Dhruman Goradia - Senior Researcher, Banner Health

Estimating Local Ancestry in the AllofUs Cohort (v7)

Scientific Questions Being Studied

Research questions:

1) Is there a correlation between global and local ancestry in pharmacogenomic (PGx) variants?
2) What novel PGx variants, if any, with pharmacogenomic traits can be identified in a diverse cohort using a GWAS approach and adjusting for local ancestry?

Relevance: While some studies have moved beyond race-based research in relation to pharmacogenomic (PGx) traits and have instead considered global ancestry, such studies rarely consider the influence of local ancestry (LA). Adding LA estimates into association analyses will provide a more comprehensive and inclusive approach to adjusting for population stratification and allow for research to better utilize diverse and admixed cohorts. Thus, we seek to focus our efforts on identifying novel PGx variants in relation to pharmacogenomic traits in a diverse cohort that includes admixed populations by incorporating local ancestry into association analyses in an effort to reduce adverse pharmacogenomic outcomes.

Project Purpose(s)

  • Disease Focused Research (pharmacogenomic-associated diseases)

Scientific Approaches

We plan to generate LA estimates, using RFMix with different iterations of K, per chromosome in the AllofUs cohort with available whole-genome sequencing (WGS) data. Appropriate reference populations will be informed from global ancestry estimates and retrieved from merged 1000 Genomes and Human Genome Diversity Project (HGDP) datasets. LA, which can be represented as the number of inherited alleles (0, 1, or 2) from each ancestral population at a particular locus, will be defined in a gene-specific manner, plus and minus 5000 base pairs to capture relevant regulatory regions. Since transitions may occur within genes, gene-based LA will also be calculated as a within-gene proportion of ancestry per individual. The LA at each clinically relevant pharmacogene will be represented as a percentage using descriptive statistics. Then, we plan to perform a GWAS analysis on pharmacogenomic traits of interest.
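
The two gene-level summaries described above, ancestry dosage at a locus and within-gene proportion of ancestry, can be sketched as follows. The tract representation (start, end, label) per haplotype is a hypothetical simplification and not RFMix's output format. The ancestry labels and coordinates are made up.

```python
FLANK_BP = 5000  # flanking distance stated in the project description


def ancestry_dosage(haplotypes, position):
    """Count how many of the two haplotypes carry each ancestry at a position (0, 1, or 2)."""
    counts = {}
    for tracts in haplotypes:
        for start, end, label in tracts:
            if start <= position < end:
                counts[label] = counts.get(label, 0) + 1
    return counts


def gene_ancestry_proportion(haplotypes, gene_start, gene_end):
    """Fraction of the flanked gene window assigned to each ancestry, over both haplotypes."""
    lo, hi = max(0, gene_start - FLANK_BP), gene_end + FLANK_BP
    covered = {}
    for tracts in haplotypes:
        for start, end, label in tracts:
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                covered[label] = covered.get(label, 0) + overlap
    total = sum(covered.values())
    if total == 0:
        return {}
    return {label: bp / total for label, bp in covered.items()}


# One made-up individual: an ancestry transition falls inside the gene on haplotype 1.
hap1 = [(0, 110_000, "AFR"), (110_000, 500_000, "EUR")]
hap2 = [(0, 500_000, "EUR")]

dosage = ancestry_dosage([hap1, hap2], 100_000)
proportion = gene_ancestry_proportion([hap1, hap2], 100_000, 120_000)
print(dosage, proportion)
```

Because the transition sits inside the gene, the single-position dosage and the window proportion give different pictures of the same individual, which is the reason the description gives for computing both.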

Anticipated Findings

Given our sample size, we expect to be able to confirm or deny the presence of correlation between LA, global ancestry, and PGx variant carriage. We expect that these approaches will be applicable to a broad range of PGx phenotypes, providing a proof of concept for the use of LA in PGx studies of admixed and diverse populations. We also expect to identify novel variants associated with drug safety and efficacy. These variants will most likely be more prevalent in admixed individuals and thus will partly address racial disparities in pharmacogenomics.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona

SNX29 PheWAS

Scientific Questions Being Studied

Pulmonary Arterial Hypertension (PAH) is a rare but severe disease. We aim to perform a phenome-wide association study (PheWAS) to test the association between SNX29 variants and PAH, and to test the associations between SNX29 variants and other phenotypes to improve risk prediction.

Project Purpose(s)

  • Disease Focused Research (SNX29-associated disease, including Pulmonary Arterial Hypertension)
  • Ancestry

Scientific Approaches

We aim to employ a PheWAS approach to systematically identify associations between SNX29 variants and clinical phenotypes. PheWAS will be carried out using multivariable logistic regression. Initial models will adjust for age, sex, and principal components. Differential associations by race/ethnicity and sex will also be evaluated.

Anticipated Findings

The proposed project aims to identify associations between SNX29 variants and PAH as well as novel associations with other clinical phenotypes. This will help improve risk prediction for PAH.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

APOE PheWAS

Scientific Questions Being Studied

Alzheimer's Disease (AD) is the most common type of dementia, and variation in APOE is strongly associated with AD in multiple race/ethnic groups. We aim to perform a phenome-wide association study (PheWAS) to test the association between APOE variants and AD, and to test the associations between APOE variants and other phenotypes to improve risk prediction.

Project Purpose(s)

  • Disease Focused Research (APOE-associated disease, including Alzheimer's Disease)
  • Ancestry

Scientific Approaches

We aim to employ a PheWAS approach to systematically identify associations between APOE variants and clinical phenotypes. PheWAS will be carried out using multivariable logistic regression. Initial models will adjust for age, sex, and principal components. Differential associations by race/ethnicity and sex will also be evaluated.

Anticipated Findings

The proposed project aims to identify novel associations between APOE variants and a wide array of clinical phenotypes. This will help improve risk prediction for Alzheimer's and other APOE-related diseases.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

ABO PheWAS - v7

Scientific Questions Being Studied

Research questions:

1) Can our novel ABO blood typing algorithm using genetic data be used effectively to extensively type ABO subtypes from whole genome sequencing and array data in a diverse cohort?
2) Will a SNP approach for ABO blood typing be concordant with available serotype?
3) What disease associations with ABO blood types can be replicated using the AllofUs dataset?
4) What novel disease associations, if any, with ABO blood types can be identified in a diverse cohort?

Relevance: Genomic variation in RBC antigens is associated with a myriad of conditions. The ABO locus alone is associated with many conditions including venous thromboembolism (VTE), pancreatic cancer, malaria, and COVID-19. Furthermore, it is not common practice to extensively type beyond the traditional ABO blood groups, and the studies that do so are primarily done in individuals of European ancestry. Thus, we seek to do the first PheWAS on extensively typed RBC antigens and to do so in a diverse cohort.

Project Purpose(s)

  • Disease Focused Research (red blood cell (RBC) antigen-associated diseases)

Scientific Approaches

We plan to employ a blood typing algorithm to extensively type RBC antigens from 1) whole genome sequencing and 2) array data in the AllofUs cohort, and compare the two outcomes. Then, we plan to employ the phenome-wide association study (PheWAS) approach to identify associations between RBC antigen types and other clinical phenotypes. PheWAS will be carried out using multivariable linear and logistic regression on the ABO blood types assigned by our novel algorithm. For example, in the case of the ABO blood group, ABO blood subtypes (A101, A102, Aw01, B101, etc.) will act as the independent variable and phenotypes, derived from participant provided information (PPI) and electronic health records (EHR), as the dependent variable. Initial models will include adjustments for age, gender, and race/ethnicity. Differential associations by race/ethnicity, gender, and sex will also be evaluated.

Anticipated Findings

This proposed project aims to test our novel ABO blood typing algorithm on WGS and array data in the diverse AllofUs cohort. We also aim to replicate known RBC-disease associations as well as identify any novel ones that emerge within a diverse cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona
  • Sadaf Raoufi - Graduate Trainee, University of Arizona
  • Rudramani Pokhrel - Other, University of Arizona
  • Jason Giles - Research Fellow, University of Arizona
  • Ehsan Khajouei - Project Personnel, University of Arizona
  • Andrew Klein - Graduate Trainee, University of Arizona

OXTR PheWAS

Scientific Questions Being Studied

Postpartum hemorrhage (PPH) is defined as cumulative blood loss of ≥1000 mL within 24 hours after birth and results in 25-60% of maternal deaths worldwide. Not only have cases of PPH risen drastically in recent years, but it is also experienced unequally by racial/ethnic groups. Oxytocin, frequently administered during labor to stimulate contractions, is associated with greater PPH. Thus, we aim to perform a phenome-wide association study (PheWAS) to 1) test the association between oxytocin receptor (OXTR) variants and PPH, and 2) test the associations between OXTR variants and other phenotypes to improve risk prediction.

Project Purpose(s)

  • Disease Focused Research (feto-maternal outcomes, including postpartum hemorrhage)
  • Ancestry

Scientific Approaches

We aim to employ a PheWAS approach to systematically identify associations between OXTR variants and clinical phenotypes. PheWAS will be carried out using multivariable logistic regression. Initial models will adjust for age, sex, and principal components. Differential associations by race/ethnicity and sex will also be evaluated.

Anticipated Findings

The proposed project aims to identify associations between OXTR variants and PPH as well as novel associations with other clinical phenotypes. This will help improve risk prediction for PPH.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
  • Elise Erickson - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Huashi Li - Project Personnel, University of Arizona
  • Ehsan Khajouei - Project Personnel, University of Arizona
  • Anthony Vicenti - Project Personnel, University of Arizona

ABO PheWAS - v6

Scientific Questions Being Studied

Research questions:

1) Can our novel ABO blood typing algorithm using genetic data be used effectively to extensively type ABO subtypes from whole genome sequencing and array data in a diverse cohort?
2) Will a SNP approach for ABO blood typing be concordant with available serotype?
3) What disease associations with ABO blood types can be replicated using the AllofUs dataset?
4) What novel disease associations, if any, with ABO blood types can be identified in a diverse cohort?

Relevance: Genomic variation in RBC antigens is associated with a myriad of conditions. The ABO locus alone is associated with many conditions including venous thromboembolism (VTE), pancreatic cancer, malaria, and COVID-19. Furthermore, it is not common practice to extensively type beyond the traditional ABO blood groups, and the studies that do so are primarily done in individuals of European ancestry. Thus, we seek to do the first PheWAS on extensively typed RBC antigens and to do so in a diverse cohort.

Project Purpose(s)

  • Disease Focused Research (red blood cell (RBC) antigen-associated diseases)

Scientific Approaches

We plan to employ a blood typing algorithm to extensively type RBC antigens from 1) whole genome sequencing and 2) array data in the AllofUs cohort, and compare the two outcomes. Then, we plan to employ the phenome-wide association study (PheWAS) approach to identify associations between RBC antigen types and other clinical phenotypes. PheWAS will be carried out using multivariable linear and logistic regression on the ABO blood types assigned by our novel algorithm. For example, in the case of the ABO blood group, ABO blood subtypes (A101, A102, Aw01, B101, etc.) will act as the independent variable and phenotypes, derived from participant provided information (PPI) and electronic health records (EHR), as the dependent variable. Initial models will include adjustments for age, gender, and race/ethnicity. Differential associations by race/ethnicity, gender, and sex will also be evaluated.

Anticipated Findings

This proposed project aims to test our novel ABO blood typing algorithm on WGS and array data in the diverse AllofUs cohort. We also aim to replicate known RBC-disease associations as well as identify any novel ones that emerge within a diverse cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
  • Jun Qian - Other, All of Us Program Operational Use

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona
  • Sadaf Raoufi - Graduate Trainee, University of Arizona

Duplicate of Cholesterol PheWAS

Scientific Questions Being Studied

Research questions:
1) What disease associations with cholesterol levels can be replicated using the AllofUs dataset?
2) Are known differences in cholesterol levels by race/ethnicity observable in the AllofUs dataset?

Project Purpose(s)

  • Methods Development

Scientific Approaches

Prior to PheWAS analyses, demographic characteristics will be acquired for the study population for which lipid panel values are available. Summary statistics for cholesterol levels and other variables, such as blood pressure and waist and hip circumference, will also be computed, including measures of central tendency and tests of normality. Cholesterol levels will be summarized by self-reported race/ethnicity categories (registered tier generalizations).
Primary statistical analyses will be carried out using multivariable linear regression with cholesterol measures as the independent variable and individual phecodes as dependent variables. Cholesterol, triglycerides, HDL, and LDL will be tested in separate PheWAS analyses. Initial models will include adjustment for age, gender, BMI, antihyperlipidemic drugs, and smoking status and alcohol intake based on participant provided information (PPI).
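The descriptive step described above (central tendency of cholesterol by self-reported category) can be sketched with the Python standard library. This is an illustration under stated assumptions: the record layout, the field names `group` and `total_chol`, and the values are hypothetical, and tests of normality are left out.

```python
from collections import defaultdict
from statistics import mean, median, stdev

def summarize_by_group(records, value_key, group_key):
    """Report n, mean, median, and sample SD of value_key within each group."""
    groups = defaultdict(list)
    for rec in records:
        if rec.get(value_key) is not None:  # skip participants with no lab value
            groups[rec[group_key]].append(rec[value_key])
    return {
        g: {
            "n": len(v),
            "mean": mean(v),
            "median": median(v),
            "sd": stdev(v) if len(v) > 1 else float("nan"),
        }
        for g, v in groups.items()
    }

# Hypothetical records (mg/dL); group labels are placeholders, not real categories.
records = [
    {"group": "Group A", "total_chol": 180},
    {"group": "Group A", "total_chol": 200},
    {"group": "Group A", "total_chol": 220},
    {"group": "Group B", "total_chol": 190},
    {"group": "Group B", "total_chol": None},
    {"group": "Group B", "total_chol": 210},
]

summary = summarize_by_group(records, "total_chol", "group")
for group, stats in sorted(summary.items()):
    print(group, stats["n"], f"{stats['mean']:.1f}", f"{stats['sd']:.1f}")
```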

Anticipated Findings

Our proposed project involves comprehensive replication of known disease associations while using existing phenotype algorithms and is thus within the scope of a demonstration project. While the PheWAS approach could be considered agnostic, our analysis will not be directed at generation of new associations. Considering the depth of existing literature on associations with lipid panel biomarkers, we do not expect our analysis to be powered to identify new associations with these laboratory values.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Anthony Vicenti - Project Personnel, University of Arizona

Immunogenomic Associations with Disease and Differential Risk

We propose to perform immunogenomic phenome-wide association studies (iPheWAS), a disease-neutral approach that identifies associations between immunogenomic variation and a broad array of phenotypes.

Scientific Questions Being Studied

We propose to perform immunogenomic phenome-wide association studies (iPheWAS), a disease-neutral approach that identifies associations between immunogenomic variation and a broad array of phenotypes.

Project Purpose(s)

  • Ancestry

Scientific Approaches

The main scientific approach is the phenome-wide association study (PheWAS), using both the genomic and EHR datasets. The influence of genetic variation in several important loci, including HLA, will be interrogated across a wide array of diseases using PheWAS. Differential risk across diverse populations and by biological sex will also be interrogated.

Anticipated Findings

We expect to replicate a wide array of immunogenomic associations across disease. We also expect to find that the influence of genetic variation differs between groups of diverse ancestries.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Travis Wheeler - Mid-career Tenured Researcher, University of Arizona
  • Sadaf Raoufi - Graduate Trainee, University of Arizona
  • Kiana Martinez - Research Fellow, University of Arizona
  • Daphne Demekas - Project Personnel, University of Arizona
  • Anthony Vicenti - Project Personnel, University of Arizona

Duplicate of Association of Cholesterol with Heart Diseases

The lipid hypothesis was based on initial evidence that cardiac diseases are associated with high total cholesterol. This hypothesis has significantly changed our lifestyle during the last half century, although many contradictory studies exist. Has the association between heart…

Scientific Questions Being Studied

The lipid hypothesis was based on initial evidence that cardiac diseases are associated with high total cholesterol. This hypothesis has significantly changed our lifestyle during the last half century, although many contradictory studies exist. Has the association between heart diseases and cholesterol changed during this long period? Or is the original association result still valid now? And is there another association mechanism that can explain the major contradictions? A re-evaluation of the association is necessary. AllOfUs provides much larger EHR datasets for this association study than the original datasets, which had only a few thousand patients.

Project Purpose(s)

  • Disease Focused Research (myocardial infarction, stroke)

Scientific Approaches

A standard quantitative association approach will be applied to the datasets for patients with cholesterol measurements.

Anticipated Findings

We expect to find changes in the association between cholesterol and heart diseases, and a new association mechanism may also be found.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Anthony Vicenti - Project Personnel, University of Arizona

Duplicate of Race/Ethnicity PheWAS

Our primary objective is to establish disparities in disease diagnosis by race and ethnicity and to characterize how patient-reported sociocultural factors such as educational level and family income contribute to these disparities. Our study will investigate diagnosis of human disease…

Scientific Questions Being Studied

Our primary objective is to establish disparities in disease diagnosis by race and ethnicity and to characterize how patient-reported sociocultural factors such as educational level and family income contribute to these disparities. Our study will investigate diagnosis of human disease comprehensively across multiple disease systems to systematically identify such disparities. The amazing breadth and impressive diversity of the All of Us dataset combined with patient-provided information enables an investigation of racial/ethnic disparities in disease diagnosis in the context of how patient-reported sociocultural factors contribute to these disparities. This project is relevant to public health in that it will combat misinterpretation and oversimplification regarding the causes of associations with race/ethnicity.

Project Purpose(s)

  • Population Health

Scientific Approaches

We plan to employ the phenome-wide association study (PheWAS) approach to determine the association of race/ethnicity across a broad array of human disease. The PheWAS catalogue will be used to generate disease classification among All of Us participants, and self-reported race and ethnicity will be considered the primary independent variable. Participant-provided survey information, including responses related to educational attainment, socio-economic status, and healthcare access, will be examined for direct effects on disease diagnosis and used as covariates in PheWAS analyses to determine how patient-reported sociocultural factors contribute to observed disparities independent of race/ethnicity.

Anticipated Findings

We expect that considering sociocultural factors will provide evidence of racial disparities in disease diagnosis that are not related to biology but to factors such as educational attainment, socioeconomic status, and limited access to specialty care. We expect that the focus of our study will be how racial/ethnic disparities in disease prevalence change overall rather than providing a detailed analysis or explanation of individual disease associations. This broader approach will prevent stigmatizing observations as well as overemphasis and misinterpretation of specific associations. We expect to stress how these sociocultural factors are driving forces for racial/ethnic differences in disease diagnosis. We expect that ultimately this analysis will combat common misinterpretations and oversimplifications regarding the causes of disease associations with race/ethnicity.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Anthony Vicenti - Project Personnel, University of Arizona

Estimating Local Ancestry in the AllofUs Cohort to be Utilized in GWAS

Research questions: 1) Is there a correlation between global and local ancestry in pharmacogenomic (PGx) variants? 2) What novel PGx variants, if any, with pharmacogenomic traits can be identified in a diverse cohort using a GWAS approach and adjusting for…

Scientific Questions Being Studied

Research questions:

1) Is there a correlation between global and local ancestry in pharmacogenomic (PGx) variants?
2) What novel PGx variants, if any, with pharmacogenomic traits can be identified in a diverse cohort using a GWAS approach and adjusting for local ancestry?

Relevance: While some studies have moved beyond race-based research in relation to pharmacogenomic (PGx) traits and have instead considered global ancestry, such studies rarely consider the influence of local ancestry (LA). Adding LA estimates into association analyses will provide a more comprehensive and inclusive approach to adjusting for population stratification and allow for research to better utilize diverse and admixed cohorts. Thus, we seek to focus our efforts on identifying novel PGx variants in relation to pharmacogenomic traits in a diverse cohort that includes admixed populations by incorporating local ancestry into association analyses in an effort to reduce adverse pharmacogenomic outcomes.

Project Purpose(s)

  • Disease Focused Research (pharmacogenomic-associated diseases)

Scientific Approaches

We plan to generate LA estimates, using RFMix with different iterations of K, per chromosome in the AllofUs cohort with available whole-genome sequencing (WGS) data. Appropriate reference populations will be informed from global ancestry estimates and retrieved from merged 1000 Genomes and Human Genome Diversity Project (HGDP) datasets. LA, which can be represented as the number of inherited alleles (0, 1, or 2) from each ancestral population at a particular locus, will be defined in a gene-specific manner, plus and minus 5000 base pairs to capture relevant regulatory regions. Since transitions may occur within genes, gene-based LA will also be calculated as a within-gene proportion of ancestry per individual. The LA at each clinically relevant pharmacogene will be represented as a percentage using descriptive statistics. Then, we plan to perform a GWAS analysis on pharmacogenomic traits of interest.
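The gene-based proportion described above can be sketched as an interval-overlap calculation. This is a minimal illustration, not RFMix's output format or the project's code: the `(start, end, ancestry)` segment tuples, the function name, the coordinates, and the labels `AFR` and `EUR` are assumptions made for the example; only the 5000 bp padding comes from the text.

```python
FLANK_BP = 5000  # padding on each side of the gene, as described in the text

def gene_ancestry_proportions(segments, gene_start, gene_end, flank=FLANK_BP):
    """Fraction of the padded gene window assigned to each ancestry.

    segments: iterable of (start, end, ancestry) with half-open [start, end)
    coordinates, pooled over both haplotypes of one individual (assumed layout).
    """
    win_start, win_end = max(0, gene_start - flank), gene_end + flank
    covered = {}
    for start, end, ancestry in segments:
        overlap = min(end, win_end) - max(start, win_start)
        if overlap > 0:
            covered[ancestry] = covered.get(ancestry, 0) + overlap
    total = sum(covered.values())
    return {a: bp / total for a, bp in covered.items()} if total else {}

# Hypothetical individual: haplotype 1 switches ancestry inside the gene,
# haplotype 2 carries a single ancestry across the whole region.
segments = [
    (0, 110_000, "AFR"), (110_000, 500_000, "EUR"),  # haplotype 1
    (0, 500_000, "EUR"),                             # haplotype 2
]
props = gene_ancestry_proportions(segments, gene_start=105_000, gene_end=115_000)
print(props)
```

Pooling both haplotypes keeps the result comparable to the 0, 1, or 2 allele-count representation: a proportion of 0.5 corresponds to one copy across the whole window.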

Anticipated Findings

Given our sample size, we expect to be able to confirm or deny the presence of correlation between LA, global ancestry, and PGx variant carriage. We expect that these approaches will be applicable to a broad range of PGx phenotypes, providing a proof of concept for the use of LA in PGx studies of admixed and diverse populations. We also expect to identify novel variants associated with drug safety and efficacy. These variants will most likely be more prevalent in admixed individuals and thus will partly address racial disparities in pharmacogenomics.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona

Heparin-induced Thrombocytopenia (HIT) GWAS

Research questions: Can we identify novel genomic associations with heparin-induced thrombocytopenia (HIT)? Relevance: Heparin is a widely used anticoagulant that carries the risk of an antibody-mediated adverse drug reaction referred to as heparin-induced thrombocytopenia (HIT). A subset of heparin-treated patients…

Scientific Questions Being Studied

Research questions: Can we identify novel genomic associations with heparin-induced thrombocytopenia (HIT)?
Relevance: Heparin is a widely used anticoagulant that carries the risk of an antibody-mediated adverse drug reaction referred to as heparin-induced thrombocytopenia (HIT). A subset of heparin-treated patients produces detectable levels of antibodies against complexes of heparin bound to circulating platelet factor 4 (PF4). We aim to identify genetic variants associated with HIT using a genome-wide association study (GWAS) approach.

Project Purpose(s)

  • Disease Focused Research (heparin-induced thrombocytopenia)

Scientific Approaches

We plan to identify a HIT-positive cohort as well as a healthy control group that have genotype data available to perform a GWAS using PLINK. Our primary GWAS will feature a logistic regression of HIT status. Regression models will be adjusted for age, sex, and principal components 1 to 3.
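For a single variant, the adjusted logistic model described above can be sketched in plain Python. This is not PLINK and not the project's pipeline: the gradient-ascent fitter `fit_logistic`, its learning rate, and the ten example rows are illustrative assumptions, and the principal components are omitted for brevity (they would enter as additional columns).

```python
import math

def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Fit logistic regression by batch gradient ascent on the mean log-likelihood.

    X: list of covariate rows (without intercept); y: list of 0/1 outcomes.
    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, yi in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            resid = yi - 1.0 / (1.0 + math.exp(-z))  # observed minus predicted
            grad[0] += resid
            for j, x in enumerate(row):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical rows: (allele count 0/1/2, standardized age, sex coded 0/1).
X = [
    (0, -1.0, 0), (0, 0.5, 1), (1, -0.5, 0), (1, 1.0, 1), (2, 0.0, 0),
    (2, -1.0, 1), (0, 1.0, 0), (1, 0.0, 1), (2, 0.5, 0), (0, -0.5, 1),
]
hit_status = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]  # 1 = HIT-positive (made-up labels)

beta = fit_logistic(X, hit_status)
print("per-allele odds ratio:", math.exp(beta[1]))
```

A GWAS repeats this fit once per variant with the same covariates; dedicated tools do so far more efficiently and also report standard errors and p-values, which this sketch does not.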

Anticipated Findings

This proposed project aims to replicate known associations between genetic variants and HIT as well as identify any novel ones within a diverse cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kiana Martinez - Research Fellow, University of Arizona
  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona

HLA PheWAS

Research questions: 1) What disease associations with HLA can be replicated using the AllofUs dataset? 2) What novel disease associations with HLA can be identified using the AllofUs dataset? Relevance: The human leukocyte antigen (HLA) system is the most polymorphic…

Scientific Questions Being Studied

Research questions:

1) What disease associations with HLA can be replicated using the AllofUs dataset?
2) What novel disease associations with HLA can be identified using the AllofUs dataset?

Relevance: The human leukocyte antigen (HLA) system is the most polymorphic region in the human genome and has been associated with protection from and predisposition to a broad array of infectious, autoimmune, and malignant diseases. Further research needs to be done in diverse populations to identify the full scope of phenotypes potentially associated with the HLA system.

Project Purpose(s)

  • Disease Focused Research (HLA-associated diseases)
  • Methods Development

Scientific Approaches

Prior to PheWAS analyses, HLA alleles will be imputed for each participant with whole-genome sequencing (WGS) data using a novel approach referencing the IPD-IMGT/HLA Database which defines the official HLA sequences named by the WHO Nomenclature Committee for Factors of the HLA System. Demographic characteristics will be acquired for the study population and summary statistics related to HLA-relevant variables will also be performed.

Primary statistical analyses will be carried out using multivariable linear regression with HLA alleles as the independent variable and individual phecodes as dependent variables. Initial models will include adjustment for age, gender, and select variables from participant provided information (PPI). Differential associations by race/ethnicity, gender, and sex will also be evaluated.

Anticipated Findings

Our project expects to successfully generate HLA alleles for all AllofUs participants with available WGS data. We then expect to validate past phenotypic associations with HLA alleles as well as discover novel ones as this work will be performed in a diverse cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Anthony Vicenti - Project Personnel, University of Arizona

Duplicate of Association between cholesterol and cancer

Recent studies show conflicting results about the association between cholesterol and cancer. All of Us provides a large-scale contemporary cohort for a detailed exploration of this association.

Scientific Questions Being Studied

Recent studies show conflicting results about the association between cholesterol and cancer. All of Us provides a large-scale contemporary cohort for a detailed exploration of this association.

Project Purpose(s)

  • Disease Focused Research (cancer)

Scientific Approaches

We will use a direct and intuitive method to study the association between different lipid measures (total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides) and different cancers. The focus will be on breast cancer. Logistic regression will be used to quantify the association.
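For a single binary predictor, the slope of a logistic regression equals the log odds ratio of the corresponding 2x2 table, so an unadjusted version of this analysis can be sketched without a modelling library. The dichotomization into "high" versus "not high" LDL, the function name, and the counts below are hypothetical and only illustrate the calculation.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval for a 2x2 table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

# Made-up counts: breast cancer cases and controls by high vs. not-high LDL.
or_est, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR={or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Adjusting for covariates, or keeping the lipid measure continuous, requires the full regression rather than this table-based shortcut.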

Anticipated Findings

We anticipate obtaining a clearer picture of the association between breast cancer and cholesterol.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Jason Karnes - Early Career Tenure-track Researcher, University of Arizona

Association of Cholesterol with Heart Diseases

The lipid hypothesis was based on initial evidence that cardiac diseases are associated with high total cholesterol. This hypothesis has significantly changed our lifestyle during the last half century, although many contradictory studies exist. Has the association between heart…

Scientific Questions Being Studied

The lipid hypothesis was based on initial evidence that cardiac diseases are associated with high total cholesterol. This hypothesis has significantly changed our lifestyle during the last half century, although many contradictory studies exist. Has the association between heart diseases and cholesterol changed during this long period? Or is the original association result still valid now? And is there another association mechanism that can explain the major contradictions? A re-evaluation of the association is necessary. AllOfUs provides much larger EHR datasets for this association study than the original datasets, which had only a few thousand patients.

Project Purpose(s)

  • Disease Focused Research (myocardial infarction, stroke)

Scientific Approaches

A standard quantitative association approach will be applied to the datasets for patients with cholesterol measurements.

Anticipated Findings

We expect to find changes in the association between cholesterol and heart diseases, and a new association mechanism may also be found.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Lina Sulieman - Other, All of Us Program Operational Use

Association between cholesterol and cancer

Recent studies show conflicting results about the association between cholesterol and cancer. All of Us provides a large-scale contemporary cohort for a detailed exploration of this association.

Scientific Questions Being Studied

Recent studies show conflicting results about the association between cholesterol and cancer. All of Us provides a large-scale contemporary cohort for a detailed exploration of this association.

Project Purpose(s)

  • Disease Focused Research (cancer)

Scientific Approaches

We will use a direct and intuitive method to study the association between different lipid measures (total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides) and different cancers. The focus will be on breast cancer. Logistic regression will be used to quantify the association.

Anticipated Findings

We anticipate obtaining a clearer picture of the association between breast cancer and cholesterol.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:
