Jason Karnes
Early Career Tenure-track Researcher, University of Arizona
12 active projects
ABO PheWAS - v6
Scientific Questions Being Studied
Research questions:
1) Can our novel ABO blood typing algorithm using genetic data be used effectively to extensively type ABO subtypes from whole genome sequencing and array data in a diverse cohort?
2) Will a SNP approach for ABO blood typing be concordant with available serotype?
3) What disease associations with ABO blood types can be replicated using the AllofUs dataset?
4) What novel disease associations, if any, with ABO blood types can be identified in a diverse cohort?
Relevance: Genomic variation in RBC antigens is associated with a myriad of conditions. The ABO locus alone is associated with many conditions including venous thromboembolism (VTE), pancreatic cancer, malaria, and COVID-19. Furthermore, it is not common practice to extensively type beyond the traditional ABO blood groups, and the studies that do so are primarily done in individuals of European ancestry. Thus, we seek to do the first PheWAS on extensively typed RBC antigens and to do so in a diverse cohort.
Project Purpose(s)
- Disease Focused Research (red blood cell (RBC) antigen-associated diseases)
Scientific Approaches
We plan to employ a blood typing algorithm to extensively type RBC antigens from 1) whole genome sequencing and 2) array data in the AllofUs cohort, and compare the two outcomes. Then, we plan to employ the phenome-wide association study (PheWAS) approach to identify associations between RBC antigen types and other clinical phenotypes. PheWAS will be carried out using multivariable linear and logistic regression on the ABO blood types assigned by our novel typing algorithm. For example, in the case of the ABO blood group, ABO blood subtypes (A101, A102, Aw01, B101, etc.) will act as the independent variable and phenotypes, derived from participant provided information (PPI) and electronic health records (EHR), as the dependent variable. Initial models will include adjustments for age, gender, and race/ethnicity. Differential associations by race/ethnicity, gender, and sex will also be evaluated.
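The adjusted association test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's pipeline: it fits a single logistic regression by Newton-Raphson for one hypothetical binary phenotype, with carrier status for one subtype and a standardized age as predictors. The toy data and the helper names `solve` and `fit_logistic` are invented for this example.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    rows: one list of predictor values per participant (no intercept column).
    y:    0/1 outcome per participant.
    Returns coefficients [intercept, beta_1, beta_2, ...].
    """
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(design, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            weight = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += weight * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical toy cohort: 100 carriers of one subtype (30 cases) and
# 100 non-carriers (10 cases); age_z is a made-up standardized age.
rows, y = [], []
for i in range(200):
    carrier = 1.0 if i < 100 else 0.0
    age_z = ((i % 30) - 14.5) / 10.0
    is_case = 1.0 if (i < 30 or 100 <= i < 110) else 0.0
    rows.append([carrier, age_z])
    y.append(is_case)

beta = fit_logistic(rows, y)
print("adjusted odds ratio for carrier status:", math.exp(beta[1]))
```

In a PheWAS the same fit would be repeated once per phenotype; a real analysis would more likely rely on an established statistics package than on a hand-written solver.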
Anticipated Findings
This proposed project aims to test our novel ABO blood typing algorithm on WGS and array data in the diverse AllofUs cohort. We also aim to replicate known RBC-disease associations as well as identify any novel ones that may emerge within a diverse cohort.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Kiana Martinez - Research Fellow, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
- Jun Qian - Other, All of Us Program Operational Use
Collaborators:
- Anthony Vicenti - Project Personnel, University of Arizona
- Sadaf Raoufi - Graduate Trainee, University of Arizona
ABO PheWAS - v7
Scientific Questions Being Studied
Research questions:
1) Can our novel ABO blood typing algorithm using genetic data be used effectively to extensively type ABO subtypes from whole genome sequencing and array data in a diverse cohort?
2) Will a SNP approach for ABO blood typing be concordant with available serotype?
3) What disease associations with ABO blood types can be replicated using the AllofUs dataset?
4) What novel disease associations, if any, with ABO blood types can be identified in a diverse cohort?
Relevance: Genomic variation in RBC antigens is associated with a myriad of conditions. The ABO locus alone is associated with many conditions including venous thromboembolism (VTE), pancreatic cancer, malaria, and COVID-19. Furthermore, it is not common practice to extensively type beyond the traditional ABO blood groups, and the studies that do so are primarily done in individuals of European ancestry. Thus, we seek to do the first PheWAS on extensively typed RBC antigens and to do so in a diverse cohort.
Project Purpose(s)
- Disease Focused Research (red blood cell (RBC) antigen-associated diseases)
Scientific Approaches
We plan to employ a blood typing algorithm to extensively type RBC antigens from 1) whole genome sequencing and 2) array data in the AllofUs cohort, and compare the two outcomes. Then, we plan to employ the phenome-wide association study (PheWAS) approach to identify associations between RBC antigen types and other clinical phenotypes. PheWAS will be carried out using multivariable linear and logistic regression on the ABO blood types assigned by our novel typing algorithm. For example, in the case of the ABO blood group, ABO blood subtypes (A101, A102, Aw01, B101, etc.) will act as the independent variable and phenotypes, derived from participant provided information (PPI) and electronic health records (EHR), as the dependent variable. Initial models will include adjustments for age, gender, and race/ethnicity. Differential associations by race/ethnicity, gender, and sex will also be evaluated.
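The comparison of WGS-derived and array-derived calls mentioned above could be summarized with percent agreement and Cohen's kappa. The project description does not name a concordance statistic, so kappa is an assumption here, and the call lists and the function name `agreement_and_kappa` are hypothetical.

```python
from collections import Counter


def agreement_and_kappa(calls_a, calls_b):
    """Return (observed agreement, Cohen's kappa) for two paired call lists."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    # Agreement expected by chance, from the marginal call frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return observed, 1.0  # both methods always give the same single call
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical blood group calls for eight participants.
wgs_calls = ["A", "A", "B", "O", "O", "AB", "O", "A"]
array_calls = ["A", "A", "B", "O", "A", "AB", "O", "A"]

observed, kappa = agreement_and_kappa(wgs_calls, array_calls)
print(f"agreement={observed:.3f} kappa={kappa:.3f}")
```

The same function would apply to a genotype-versus-serotype comparison, with one list holding the serology result for each participant.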
Anticipated Findings
This proposed project aims to test our novel ABO blood typing algorithm on WGS and array data in the diverse AllofUs cohort. We also aim to replicate known RBC-disease associations as well as identify any novel ones that may emerge within a diverse cohort.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Kiana Martinez - Research Fellow, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Anthony Vicenti - Project Personnel, University of Arizona
- Sadaf Raoufi - Graduate Trainee, University of Arizona
- Rudramani Pokhrel - Other, University of Arizona
Duplicate of Cholesterol PheWAS
Scientific Questions Being Studied
Research Question:
1) What disease associations with cholesterol levels can be replicated using the AllofUs dataset?
2) Are known differences in cholesterol levels by race/ethnicity observable in the AllofUs dataset?
Project Purpose(s)
- Methods Development
Scientific Approaches
Prior to PheWAS analyses, demographic characteristics will be acquired for the study population for which lipid panel values are available. Summary statistics for cholesterol levels and other variables, such as blood pressure and waist and hip circumference, will also be computed, including measures of central tendency and tests of normality. Cholesterol levels will be summarized by self-reported race/ethnicity categories (registered tier generalizations).
Primary statistical analyses will be carried out using multivariable linear regression with cholesterol measures as the independent variable and individual phecodes as dependent variables. Cholesterol, triglycerides, HDL, and LDL will be tested in separate PheWAS analyses. Initial models will include adjustment for age, gender, BMI, antihyperlipidemic drugs, and smoking status and alcohol intake based on participant provided information (PPI).
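The descriptive step that precedes the regression could look roughly like this. It is a sketch using only the Python standard library; the category labels, cholesterol values, and the function name `summarize_by_group` are invented, and the normality tests mentioned above are left out because they would need a statistics package.

```python
from collections import defaultdict
from statistics import mean, median, stdev


def summarize_by_group(records):
    """Summarize a lab value per category.

    records: iterable of (category_label, value) pairs.
    Returns {category_label: {"n", "mean", "median", "sd"}}.
    """
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    summary = {}
    for label, values in groups.items():
        summary[label] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            # The sample standard deviation needs at least two values.
            "sd": stdev(values) if len(values) > 1 else float("nan"),
        }
    return summary


# Hypothetical total cholesterol values (mg/dL) with placeholder categories.
records = [
    ("Category 1", 180.0),
    ("Category 1", 200.0),
    ("Category 1", 220.0),
    ("Category 2", 190.0),
    ("Category 2", 210.0),
]

summary = summarize_by_group(records)
for label, stats in sorted(summary.items()):
    print(label, stats)
```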
Anticipated Findings
Our proposed project involves comprehensive replication of known disease associations while using existing phenotype algorithms and is thus within the scope of a demonstration project. While the PheWAS approach could be considered agnostic, our analysis will not be directed at generation of new associations. Considering the depth of existing literature on associations with lipid panel biomarkers, we do not expect our analysis to be powered to identify new associations with these laboratory values.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Kiana Martinez - Research Fellow, University of Arizona
- Anthony Vicenti - Project Personnel, University of Arizona
Immunogenomic Associations with Disease and Differential Risk
Scientific Questions Being Studied
We propose to perform immunogenomic phenome-wide association studies (iPheWAS), a disease-neutral approach that identifies associations between immunogenomic variation and a broad array of phenotypes.
Project Purpose(s)
- Ancestry
Scientific Approaches
The main scientific approach is the phenome-wide association study (PheWAS). Genomic and EHR datasets will be used. The influence of genetic variation in several important loci, including HLA, will be interrogated across a wide array of diseases using PheWAS. Differential risk across diverse populations and biological sex will also be interrogated.
Anticipated Findings
We expect to replicate a wide array of immunogenomic associations across disease. We also expect to find that the influence of genetic variation differs between groups of diverse ancestries.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Travis Wheeler - Mid-career Tenured Researcher, University of Arizona
- Sadaf Raoufi - Graduate Trainee, University of Arizona
- Kiana Martinez - Research Fellow, University of Arizona
- Daphne Demekas - Project Personnel, University of Arizona
- Anthony Vicenti - Project Personnel, University of Arizona
Duplicate of Association of Cholesterol with Heart Diseases
Scientific Questions Being Studied
The lipid hypothesis was based on initial evidence that cardiac diseases are associated with high total cholesterol. This hypothesis has significantly changed our lifestyle during the last half century, although many contradictory studies exist. Has the association between heart diseases and cholesterol changed during this long period? Or is the original association still valid? And is there another association mechanism that can explain the major contradictions? A re-evaluation of the association is necessary. All of Us provides much larger EHR datasets for this association study than the original datasets, which included only a few thousand patients.
Project Purpose(s)
- Disease Focused Research (myocardial infarction, stroke)
Scientific Approaches
Standard quantitative association approach and the datasets for patients with cholesterol measurements will be used.
Anticipated Findings
We expect to find changes in the association between cholesterol and heart diseases, and a new association mechanism may also be found.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Kiana Martinez - Research Fellow, University of Arizona
- Anthony Vicenti - Project Personnel, University of Arizona
Duplicate of Race/Ethnicity PheWAS
Scientific Questions Being Studied
Our primary objective is to establish disparities in disease diagnosis by race and ethnicity and to characterize how patient-reported sociocultural factors such as educational level and family income contribute to these disparities. Our study will investigate diagnosis of human disease comprehensively across multiple disease systems to systematically identify such disparities. The amazing breadth and impressive diversity of the All of Us dataset combined with patient-provided information enables an investigation of racial/ethnic disparities in disease diagnosis in the context of how patient-reported sociocultural factors contribute to these disparities. This project is relevant to public health in that it will combat misinterpretation and oversimplification regarding the causes of associations with race/ethnicity.
Project Purpose(s)
- Population Health
Scientific Approaches
We plan to employ the phenome-wide association study approach (PheWAS) to determine the association of race/ethnicity across a broad array of human disease. The PheWAS catalogue will be used to generate disease classifications among All of Us participants, and self-reported race and ethnicity will be considered the primary independent variable. Participant-provided survey information, including responses related to educational attainment, socio-economic status, and healthcare access, will be examined for direct effects on disease diagnosis and used as covariates in PheWAS analyses to determine how patient-reported sociocultural factors contribute to observed disparities independent of race/ethnicity.
Anticipated Findings
We expect that considering sociocultural factors will provide evidence of racial disparities in disease diagnosis that are not related to biology but to factors such as educational attainment, socioeconomic status, and limited access to specialty care. We expect that the focus of our study will be how racial/ethnic disparities in disease prevalence change overall rather than providing a detailed analysis or explanation of individual disease associations. This broader approach will prevent stigmatizing observations as well as overemphasis and misinterpretation of specific associations. We expect to stress how these sociocultural factors are driving forces for racial/ethnic differences in disease diagnosis. We expect that ultimately this analysis will combat common misinterpretations and oversimplifications regarding the causes of disease associations with race/ethnicity.
Demographic Categories of Interest
- Race / Ethnicity
- Geography
- Access to Care
- Education Level
- Income Level
Data Set Used
Registered Tier
Research Team
Owner:
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Kiana Martinez - Research Fellow, University of Arizona
- Anthony Vicenti - Project Personnel, University of Arizona
Estimating Local Ancestry in the AllofUs Cohort to be Utilized in GWAS
Scientific Questions Being Studied
Research questions:
1) Is there a correlation between global and local ancestry in pharmacogenomic (PGx) variants?
2) What novel PGx variants, if any, with pharmacogenomic traits can be identified in a diverse cohort using a GWAS approach and adjusting for local ancestry?
Relevance: While some studies have moved beyond race-based research in relation to pharmacogenomic (PGx) traits and have instead considered global ancestry, such studies rarely consider the influence of local ancestry (LA). Adding LA estimates into association analyses will provide a more comprehensive and inclusive approach to adjusting for population stratification and allow for research to better utilize diverse and admixed cohorts. Thus, we seek to focus our efforts on identifying novel PGx variants in relation to pharmacogenomic traits in a diverse cohort that includes admixed populations by incorporating local ancestry into association analyses in an effort to reduce adverse pharmacogenomic outcomes.
Project Purpose(s)
- Disease Focused Research (pharmacogenomic-associated diseases)
Scientific Approaches
We plan to generate LA estimates, using RFMix with different iterations of K, per chromosome in the AllofUs cohort with available whole-genome sequencing (WGS) data. Appropriate reference populations will be informed from global ancestry estimates and retrieved from merged 1000 Genomes and Human Genome Diversity Project (HGDP) datasets. LA, which can be represented as the number of inherited alleles (0, 1, or 2) from each ancestral population at a particular locus, will be defined in a gene-specific manner, plus and minus 5000 base pairs to capture relevant regulatory regions. Since transitions may occur within genes, gene-based LA will also be calculated as a within-gene proportion of ancestry per individual. The LA at each clinically relevant pharmacogene will be represented as a percentage using descriptive statistics. Then, we plan to perform a GWAS analysis on pharmacogenomic traits of interest.
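The within-gene proportion of ancestry described above can be illustrated with a small sketch. It assumes local ancestry has already been reduced to per-haplotype segments of the form (start, end, ancestry label) with half-open coordinates. That layout, the labels, the coordinates, and the function name `gene_ancestry_proportions` are hypothetical and do not reflect RFMix's actual output format.

```python
FLANK_BP = 5000  # flank added on each side of the gene, as stated in the text


def gene_ancestry_proportions(haplotype_segments, gene_start, gene_end,
                              flank=FLANK_BP):
    """Proportion of the flanked gene window assigned to each ancestry.

    haplotype_segments: one list per haplotype of (start, end, ancestry)
    tuples, using half-open [start, end) base-pair coordinates.
    """
    win_start = max(0, gene_start - flank)
    win_end = gene_end + flank
    covered = {}
    for segments in haplotype_segments:
        for seg_start, seg_end, ancestry in segments:
            # Length of the overlap between this segment and the gene window.
            overlap = min(seg_end, win_end) - max(seg_start, win_start)
            if overlap > 0:
                covered[ancestry] = covered.get(ancestry, 0) + overlap
    total = sum(covered.values())
    if total == 0:
        return {}
    return {ancestry: bp / total for ancestry, bp in covered.items()}


# Hypothetical individual: haplotype 1 switches ancestry inside the gene,
# haplotype 2 carries a single ancestry across the whole window.
haplotypes = [
    [(0, 110_000, "AFR"), (110_000, 500_000, "EUR")],
    [(0, 500_000, "EUR")],
]

proportions = gene_ancestry_proportions(haplotypes, 100_000, 120_000)
print(proportions)
```

Because the result is a proportion per individual, an ancestry transition inside the gene is reflected in the value and is not forced into a 0, 1, or 2 allele count.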
Anticipated Findings
Given our sample size, we expect to be able to confirm or deny the presence of correlation between LA, global ancestry, and PGx variant carriage. We expect that these approaches will be applicable to a broad range of PGx phenotypes, providing a proof of concept for the use of LA in PGx studies of admixed and diverse populations. We also expect to identify novel variants associated with drug safety and efficacy. These variants will most likely be more prevalent in admixed individuals and thus will partly address racial disparities in pharmacogenomics.
Demographic Categories of Interest
- Race / Ethnicity
Data Set Used
Controlled Tier
Research Team
Owner:
- Kiana Martinez - Research Fellow, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Anthony Vicenti - Project Personnel, University of Arizona
Heparin-induced Thrombocytopenia (HIT) GWAS
Scientific Questions Being Studied
Research question: Can we identify novel genomic associations with heparin-induced thrombocytopenia (HIT)?
Relevance: Heparin is a widely used anticoagulant that carries the risk of an antibody-mediated adverse drug reaction referred to as heparin-induced thrombocytopenia (HIT). A subset of heparin-treated patients produces detectable levels of antibodies against complexes of heparin bound to circulating platelet factor 4 (PF4). We aim to identify genetic variants associated with HIT using a genome-wide association study (GWAS) approach.
Project Purpose(s)
- Disease Focused Research (heparin-induced thrombocytopenia)
Scientific Approaches
We plan to identify a HIT-positive cohort as well as a healthy control group that have genotype data available to perform a GWAS using PLINK. Our primary GWAS will feature a logistic regression of HIT status. Regression models will be adjusted for age, sex, and principal components 1 to 3.
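One preparatory step for the adjusted model is assembling the covariates (age, sex, and principal components 1 to 3) into a table. The sketch below writes them as whitespace-delimited text with family and individual IDs in the first two columns, a layout that PLINK 1.9 is understood to accept for covariate files. The column names, the sample values, and the function name `format_covariates` are assumptions, and the exact format should be checked against the PLINK documentation for the version in use.

```python
COVARIATES = ["AGE", "SEX", "PC1", "PC2", "PC3"]


def format_covariates(samples):
    """Render covariates as whitespace-delimited text with a header line.

    samples: list of dicts holding "FID", "IID", and every name in COVARIATES.
    """
    lines = [" ".join(["FID", "IID"] + COVARIATES)]
    for sample in samples:
        fields = [str(sample["FID"]), str(sample["IID"])]
        fields += [str(sample[name]) for name in COVARIATES]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical participants; IDs and values are invented.
samples = [
    {"FID": "F1", "IID": "P1", "AGE": 61, "SEX": 1,
     "PC1": 0.012, "PC2": -0.004, "PC3": 0.021},
    {"FID": "F2", "IID": "P2", "AGE": 47, "SEX": 2,
     "PC1": -0.030, "PC2": 0.008, "PC3": -0.002},
]

covariate_text = format_covariates(samples)
print(covariate_text, end="")
```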
Anticipated Findings
This proposed project aims to replicate known associations between genetic variants and HIT as well as identify any novel ones that may emerge within a diverse cohort.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Kiana Martinez - Research Fellow, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Anthony Vicenti - Project Personnel, University of Arizona
HLA PheWAS
Scientific Questions Being Studied
Research questions:
1) What disease associations with HLA can be replicated using the AllofUs dataset?
2) What novel disease associations with HLA can be identified using the AllofUs dataset?
Relevance: The human leukocyte antigen (HLA) system is the most polymorphic region of the human genome and has been associated with protection from and predisposition to a broad array of infectious, autoimmune, and malignant diseases. Further research needs to be done in diverse populations to identify the full scope of phenotypes potentially associated with the HLA system.
Project Purpose(s)
- Disease Focused Research (HLA-associated diseases)
- Methods Development
Scientific Approaches
Prior to PheWAS analyses, HLA alleles will be imputed for each participant with whole-genome sequencing (WGS) data using a novel approach referencing the IPD-IMGT/HLA Database which defines the official HLA sequences named by the WHO Nomenclature Committee for Factors of the HLA System. Demographic characteristics will be acquired for the study population and summary statistics related to HLA-relevant variables will also be performed.
Primary statistical analyses will be carried out using multivariable linear regression with HLA alleles as the independent variable and individual phecodes as dependent variables. Initial models will include adjustment for age, gender, and select variables from participant provided information (PPI). Differential associations by race/ethnicity, gender, and sex will also be evaluated.
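Before HLA alleles can enter a regression as the independent variable, each participant's two imputed alleles at a locus must be turned into a numeric predictor. The sketch below uses additive dosage coding (0, 1, or 2 copies). This coding is an assumption, since the project description does not say how alleles are encoded, and the participant IDs, genotypes, and function name `allele_dosages` are hypothetical.

```python
def allele_dosages(genotypes):
    """Convert per-participant allele pairs into per-allele dosage vectors.

    genotypes: {participant_id: (allele_1, allele_2)} for a single HLA locus.
    Returns {allele: {participant_id: copies_carried}} with copies in 0..2.
    """
    alleles = sorted({a for pair in genotypes.values() for a in pair})
    return {
        allele: {pid: pair.count(allele) for pid, pair in genotypes.items()}
        for allele in alleles
    }


# Hypothetical imputed HLA-A genotypes for three participants.
genotypes = {
    "p1": ("A*01:01", "A*02:01"),
    "p2": ("A*02:01", "A*02:01"),
    "p3": ("A*01:01", "A*03:01"),
}

dosages = allele_dosages(genotypes)
print(dosages["A*02:01"])
```

Each allele's dosage vector would then be tested against every phecode in turn, with the covariates listed above added to the model.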
Anticipated Findings
Our project expects to successfully generate HLA alleles for all AllofUs participants with available WGS data. We then expect to validate past phenotypic associations with HLA alleles as well as discover novel ones as this work will be performed in a diverse cohort.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sadaf Raoufi - Graduate Trainee, University of Arizona
- Kiana Martinez - Research Fellow, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Anthony Vicenti - Project Personnel, University of Arizona
Duplicate of Association between cholesterol and cancer
Scientific Questions Being Studied
Recent studies show conflicting results about the association between cholesterol and cancer. All of Us provides a large-scale contemporary cohort for a detailed exploration of this association.
Project Purpose(s)
- Disease Focused Research (cancer)
Scientific Approaches
We will use a direct and intuitive method to study the association between different cholesterol measures (total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides) and different cancers. The focus will be on breast cancer. Logistic regression will be used to quantify the association.
Anticipated Findings
We anticipate obtaining a clearer picture of the association between breast cancer and cholesterol.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Association of Cholesterol with Heart Diseases
Scientific Questions Being Studied
The lipid hypothesis was based on initial evidence that cardiac diseases are associated with high total cholesterol. This hypothesis has significantly changed our lifestyle during the last half century, although many contradictory studies exist. Has the association between heart diseases and cholesterol changed during this long period? Or is the original association still valid? And is there another association mechanism that can explain the major contradictions? A re-evaluation of the association is necessary. All of Us provides much larger EHR datasets for this association study than the original datasets, which included only a few thousand patients.
Project Purpose(s)
- Disease Focused Research (myocardial infarction, stroke)
Scientific Approaches
Standard quantitative association approach and the datasets for patients with cholesterol measurements will be used.
Anticipated Findings
We expect to find changes in the association between cholesterol and heart diseases, and a new association mechanism may also be found.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Jianglin Feng - Other, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona
Collaborators:
- Lina Sulieman - Other, All of Us Program Operational Use
Association between cholesterol and cancer
Scientific Questions Being Studied
Recent studies show conflicting results about the association between cholesterol and cancer. All of Us provides a large-scale contemporary cohort for a detailed exploration of this association.
Project Purpose(s)
- Disease Focused Research (cancer)
Scientific Approaches
We will use a direct and intuitive method to study the association between different cholesterol measures (total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides) and different cancers. The focus will be on breast cancer. Logistic regression will be used to quantify the association.
Anticipated Findings
We anticipate obtaining a clearer picture of the association between breast cancer and cholesterol.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Registered Tier
Research Team
Owner:
- Jianglin Feng - Other, University of Arizona
- Jason Karnes - Early Career Tenure-track Researcher, University of Arizona