Shivam Sharma

Graduate Trainee, Georgia Institute of Technology

11 active projects

Type2Diabetes-RareVariantBurden

Scientific Questions Being Studied

I want to explore genes and variants that are impacting type 2 diabetes risk between populations. This can help us better understand type 2 diabetes biology and how it manifests between population groups.

Project Purpose(s)

  • Disease Focused Research (type 2 diabetes mellitus)
  • Population Health
  • Ancestry

Scientific Approaches

What genes are associated with type 2 diabetes when restricting to specific groups of variants (e.g., missense variants and pLoFs)? I will use exome data to perform exome-wide association analysis as well as gene-set-based burden analysis.
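As a rough illustration of a gene-level burden test, the sketch below collapses qualifying variants into a carrier/non-carrier flag per participant and compares cases with controls using Fisher's exact test. This is a simple collapsing test chosen for brevity, not necessarily the method used in this workspace; the sample IDs, variant IDs, and the `gene_burden` helper are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


def gene_burden(genotypes, is_case, qualifying):
    """Collapse qualifying variants to carrier status and test cases vs. controls."""
    a = b = c = d = 0
    for sample, calls in genotypes.items():
        carrier = any(calls.get(v, 0) > 0 for v in qualifying)
        if is_case[sample]:
            a, b = a + carrier, b + (not carrier)
        else:
            c, d = c + carrier, d + (not carrier)
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


# Toy data (made up): alternate-allele counts per sample for one gene.
genotypes = {
    "s1": {"v1": 1}, "s2": {"v2": 1}, "s3": {"v1": 1}, "s4": {},
    "s5": {}, "s6": {"v3": 1}, "s7": {}, "s8": {},
}
is_case = {s: s in {"s1", "s2", "s3", "s4"} for s in genotypes}
qualifying = {"v1", "v2"}  # e.g. missense/pLoF variants; v3 is excluded by the mask

table, p_value = gene_burden(genotypes, is_case, qualifying)
print(table, round(p_value, 4))
```

In practice a regression-based burden test with covariates (age, sex, ancestry principal components) would usually replace the 2x2 comparison.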

Anticipated Findings

We want to see whether there is a burden of different types of variants associated with type 2 diabetes risk. Additionally, we want to see whether the burden differs between population groups in the All of Us Research Program.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

AncestryAdmixtureRegression

Scientific Questions Being Studied

Between and within group admixture regression design can be used to decompose genetic and socio-environmental effects on health disparities. Health outcome differences between groups may be attributed to genetic and socio-environmental factors. Systematic differences in socio-environmental factors can exist between groups, whereas systematic differences within groups are minimized. For admixed groups (e.g., Black/African American), systematic differences in ancestry proportions remain within the group. This distinction can be leveraged to decompose genetic and socio-environmental effects on health disparities.

Project Purpose(s)

  • Population Health
  • Ancestry

Scientific Approaches

We will perform continental genetic ancestry inference for the All of Us cohort. We will use this cohort to select individuals with two-way African and European ancestry. Then, using logistic regression, we will find the biggest health outcome differences between Black and White groups. Finally, we will analyze the between and within models for self-identified race and ethnicity (SIRE) groups.
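The logistic regression step can be sketched as a model of a binary health outcome on a participant's African ancestry fraction. The code below is a minimal pure-Python fit by Newton-Raphson with made-up data; it shows a single within-group regression only and does not implement the between/within decomposition. The function name and all values are illustrative assumptions.

```python
from math import exp


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson and return (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g, using the closed-form 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy data (made up): ancestry fraction and disease status for ten participants.
ancestry_fraction = [0.05, 0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.85, 0.90, 0.95]
has_disease = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

intercept, slope = fit_logistic(ancestry_fraction, has_disease)
# Odds ratio for a 10-percentage-point increase in ancestry fraction.
odds_ratio_per_10pct = exp(slope * 0.10)
print(intercept, slope, odds_ratio_per_10pct)
```

A positive slope here would mean that the odds of the outcome rise with ancestry fraction within the group, which is the within-group signal the design relies on.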

Anticipated Findings

We expect to find health outcomes (diseases) associated with genome-wide ancestry fractions. These can be used to discover and quantify ancestry-related health disparities. This will allow public health policy makers to dedicate appropriate resources to diseases that disproportionately affect minority groups in the US.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Collaborators:

  • Vincent Lam - Research Fellow, National Institutes of Health (NIH)
  • Jay Menuey - Graduate Trainee, Georgia Institute of Technology
  • Jeff Kramer - Undergraduate Student, Georgia Institute of Technology
  • Courtney Astore - Graduate Trainee, Georgia Institute of Technology

Genomic data in the All of Us Research Program - Genetic Ancestry Analysis

Scientific Questions Being Studied

This workspace will describe, characterize, and validate the extent of diversity in the All of Us cohort with respect to the participants' genetic ancestry (which can be objectively inferred from participants' genomes). Socially defined race & ethnicity and genetically inferred ancestry are both relevant to health outcomes. The workspace will contain notebooks which can be used to replicate Figure 2 in the publication: "Genomic data in the All of Us Research Program" (All of Us Research Program Genomics Investigators (2024). Genomic data in the All of Us Research Program. Nature, 10.1038/s41586-023-06957-x. Advance online publication. https://doi.org/10.1038/s41586-023-06957-x)

Project Purpose(s)

  • Ancestry

Scientific Approaches

To characterize the genetic diversity of the All of Us cohort, we analyzed participant genetic data.

Here is a brief list of methods used:

1. All of Us participant genome-wide genotype data were merged and harmonized with global reference population data.

2. Unsupervised clustering analysis techniques (UMAP on PCA) were used to assess the extent of genetic structure in the cohort.

3. Supervised genetic ancestry inference using global reference populations, principal components analysis, and the Rye (Rapid ancestrY Estimation) program.

Anticipated Findings

The All of Us participant cohort will be genetically diverse, consistent with the project’s aim to recruit groups underrepresented in biomedical research in support of health equity. These results are already published here: "Genomic data in the All of Us Research Program" (All of Us Research Program Genomics Investigators (2024). Genomic data in the All of Us Research Program. Nature, 10.1038/s41586-023-06957-x. Advance online publication. https://doi.org/10.1038/s41586-023-06957-x).

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Type2Diabetes

Scientific Questions Being Studied

I want to explore genes and variants that are impacting type 2 diabetes risk. This can help us better understand type 2 diabetes biology.

Project Purpose(s)

  • Population Health
  • Educational

Scientific Approaches

What genes are associated with Type 2 diabetes when restricting to specific groups of variants (e.g., missense variants and pLoFs)?

Anticipated Findings

We want to see if there is a burden of different types of variants associated with type 2 diabetes risk.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Collaborators:

  • Courtney Astore - Graduate Trainee, Georgia Institute of Technology

GeneticAncestryInferenceV7

Scientific Questions Being Studied

As a demonstration project, this project will describe, characterize, and validate the extent of diversity in the All of Us cohort with respect to the participants' race & ethnicity (which are socially defined), and genetic ancestry (which can be objectively inferred from participants' genomes). Socially defined race & ethnicity and genetically inferred ancestry are both relevant to health outcomes. Race & ethnicity shape individuals’ lived experience and social environment, e.g., structural inequities, environmental injustice, and barriers to healthcare access. Genetic ancestry can affect health outcomes via differences in the frequencies of variants associated with disease and drug response. Specifically, we will ask:

1. What is the extent of racial, ethnic, and genetic diversity in the All of Us cohort?

2. How do genetic ancestry and admixture change over geography and with age in the US?

3. Are there associations between genetic ancestry and health outcomes in the All of Us cohort?

Project Purpose(s)

  • Population Health
  • Methods Development
  • Ancestry
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

To characterize the diversity of the All of Us cohort, we analyzed participant genetic, demographic, and geographic data.

Here is a brief list of methods used:

1. All of Us participant genome-wide genotype data were merged and harmonized with global reference population data.

2. Unsupervised clustering analysis techniques - Hopkins statistic, visual assessment of clustering tendency, K-means clustering & UMAP - to assess the extent of genetic structure in the cohort.

3. Supervised genetic ancestry inference using global reference populations, principal components analysis, and the Rye (Rapid ancestrY Estimation) program.

4. Genetic ancestry was compared to participants' self-identified race & ethnicity.

5. Geocoded data and participant age were used to measure how genetic ancestry and admixture vary with respect to participant geography and age.

6. Admixture regression to associate participant health outcomes, gleaned from electronic health records, with their genetic ancestry.
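To make the clustering-tendency step (item 2) more concrete, here is a minimal sketch of a Hopkins statistic computed on toy two-dimensional coordinates standing in for principal components. It uses raw nearest-neighbor distances (some formulations raise them to the power of the dimension) and follows the convention where values near 0.5 suggest no structure and values near 1 suggest clustering. The data, seed, and function name are assumptions for illustration, not the workspace's code.

```python
import random
from math import dist


def hopkins(points, m, rng):
    """Hopkins statistic from m sampled points and m uniform random points."""
    dims = len(points[0])
    lows = [min(p[k] for p in points) for k in range(dims)]
    highs = [max(p[k] for p in points) for k in range(dims)]

    # w: distance from each sampled real point to its nearest other real point.
    sample_idx = rng.sample(range(len(points)), m)
    w = [
        min(dist(points[i], q) for j, q in enumerate(points) if j != i)
        for i in sample_idx
    ]

    # u: distance from each uniform random point (in the bounding box)
    # to its nearest real point.
    u = []
    for _ in range(m):
        r = [rng.uniform(lows[k], highs[k]) for k in range(dims)]
        u.append(min(dist(r, q) for q in points))

    return sum(u) / (sum(u) + sum(w))


rng = random.Random(0)
# Toy "PC1/PC2" coordinates: two tight, well-separated groups (made up).
points = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(30)]
points += [(rng.gauss(10, 0.1), rng.gauss(10, 0.1)) for _ in range(30)]

h = hopkins(points, m=10, rng=rng)
print(h)
```

Note that some software reports the complement of this value, so the direction of the convention should be checked before comparing numbers across tools.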

Anticipated Findings

1. The All of Us participant cohort will be racially, ethnically, and genetically diverse, consistent with the project’s aim to recruit groups underrepresented in biomedical research in support of health equity.

2. All of Us participant genetic variation will be highly structured and best modeled by clusters rather than a continuum of variation.

3. All of Us participants will show patterns of genetically inferred ancestry that are correlated with their socially defined ancestry (i.e., race and ethnicity).

4. All of Us participants’ genetic ancestry and admixture will change over geography and with age.

5. All of Us participants’ genetic ancestry will be associated with a variety of health outcomes.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Collaborators:

  • Courtney Astore - Graduate Trainee, Georgia Institute of Technology

MutationalLoad

Scientific Questions Being Studied

Does mutational load differ between self-identified race and ethnicity (SIRE) groups in the All of Us Research Program? Moreover, does the pattern change when it comes to genetic ancestry? We also want to evaluate heterozygosity patterns within the All of Us Research Program.

Project Purpose(s)

  • Population Health
  • Ancestry

Scientific Approaches

We intend to use variants stored in the whole genome sequencing data for the All of Us Research Program. We also want to use Hail and ACMG gene coordinates to extract the variants of interest. We will use R and Python for data analysis and visualization. We will also use SnpEff for variant effect prediction.
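The extraction step amounts to keeping variants that fall inside a set of gene intervals and then summing alternate alleles per participant. The sketch below shows that logic in plain Python rather than Hail, with placeholder gene names, made-up coordinates, and toy genotypes; none of these values come from the ACMG list or the All of Us data.

```python
# Hypothetical gene intervals: name -> (chromosome, start, end), inclusive.
gene_intervals = {
    "GENE_A": ("chr1", 100, 200),
    "GENE_B": ("chr2", 100, 300),
}

# Toy variants: (chromosome, position, {sample: alternate-allele count}).
variants = [
    ("chr1", 150, {"s1": 1, "s2": 0}),
    ("chr1", 500, {"s1": 2, "s2": 1}),  # outside every interval
    ("chr2", 120, {"s1": 0, "s2": 2}),
    ("chr2", 250, {"s1": 1, "s2": 1}),
]


def genes_at(chrom, pos, intervals):
    """Return the names of all intervals that contain the given position."""
    return [
        gene
        for gene, (c, start, end) in intervals.items()
        if c == chrom and start <= pos <= end
    ]


def mutational_load(variants, intervals):
    """Sum alternate alleles per sample over variants inside any interval."""
    load = {}
    for chrom, pos, calls in variants:
        if not genes_at(chrom, pos, intervals):
            continue
        for sample, alt_count in calls.items():
            load[sample] = load.get(sample, 0) + alt_count
    return load


load = mutational_load(variants, gene_intervals)
print(load)  # {'s1': 2, 's2': 3}
```

A real analysis would typically also restrict to variants predicted to be deleterious (for example from SnpEff annotations) before counting.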

Anticipated Findings

We expect to find differences in mutational load between SIRE groups. We also expect that mutational load may vary with genetic ancestry. This could be important to understand for the ACMG list of essential genes.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

GeneticAncestry

Scientific Questions Being Studied

As a demonstration project, this project will describe, characterize, and validate the extent of diversity in the All of Us cohort with respect to the participants' race & ethnicity (which are socially defined), and genetic ancestry (which can be objectively inferred from participants' genomes). Socially defined race & ethnicity and genetically inferred ancestry are both relevant to health outcomes. Race & ethnicity shape individuals’ lived experience and social environment, e.g., structural inequities, environmental injustice, and barriers to healthcare access. Genetic ancestry can affect health outcomes via differences in the frequencies of variants associated with disease and drug response. Specifically, we will ask:

1. What is the extent of racial, ethnic, and genetic diversity in the All of Us cohort?

2. How do genetic ancestry and admixture change over geography and with age in the US?

3. Are there associations between genetic ancestry and health outcomes in the All of Us cohort?

Project Purpose(s)

  • Population Health
  • Methods Development
  • Ancestry
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

To characterize the diversity of the All of Us cohort, we analyzed participant genetic, demographic, and geographic data.

Here is a brief list of methods used:

1. All of Us participant genome-wide genotype data were merged and harmonized with global reference population data.

2. Unsupervised clustering analysis techniques - Hopkins statistic, visual assessment of clustering tendency, K-means clustering & UMAP - to assess the extent of genetic structure in the cohort.

3. Supervised genetic ancestry inference using global reference populations, principal components analysis, and the Rye (Rapid ancestrY Estimation) program.

4. Genetic ancestry was compared to participants' self-identified race & ethnicity.

5. Geocoded data and participant age were used to measure how genetic ancestry and admixture vary with respect to participant geography and age.

6. Admixture regression to associate participant health outcomes, gleaned from electronic health records, with their genetic ancestry.

Anticipated Findings

1. The All of Us participant cohort will be racially, ethnically, and genetically diverse, consistent with the project’s aim to recruit groups underrepresented in biomedical research in support of health equity.

2. All of Us participant genetic variation will be highly structured and best modeled by clusters rather than a continuum of variation.

3. All of Us participants will show patterns of genetically inferred ancestry that are correlated with their socially defined ancestry (i.e., race and ethnicity).

4. All of Us participants’ genetic ancestry and admixture will change over geography and with age.

5. All of Us participants’ genetic ancestry will be associated with a variety of health outcomes.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Collaborators:

  • Vincent Lam - Research Fellow, National Institutes of Health (NIH)
  • Sonali Gupta - Research Assistant, National Institutes of Health (NIH)
  • Courtney Astore - Graduate Trainee, Georgia Institute of Technology

AdmixtureMapping

Scientific Questions Being Studied

Do creatinine levels vary by genetic ancestry? If yes, then can we identify specific genetic ancestral loci that are strongly associated with elevated levels of serum creatinine in humans? If these are found to be true, then we can establish that serum creatinine variation in humans is genetically determined to an extent.

Project Purpose(s)

  • Ancestry

Scientific Approaches

We want to use genotyping data and blood biomarkers (serum creatinine) from the All of Us cohort. These data will be used to infer genetic ancestry at global and local scales. We will then perform admixture mapping and draw Manhattan plots to look for haplotypes associated with higher levels of serum creatinine.
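At its core, admixture mapping tests each locus for an association between the trait and the number of haplotypes of a given ancestry carried there. The sketch below regresses toy creatinine values on a local-ancestry dosage (0, 1, or 2 copies) at two hypothetical loci and ranks loci by the t statistic. It omits the covariates a real analysis would include (notably global ancestry), and every value and name is made up.

```python
def slope_and_t(x, y):
    """Ordinary least squares of y on x: return (slope, t statistic of slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    return slope, slope / std_err


# Toy serum creatinine values (mg/dL) for eight participants.
creatinine = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 0.9, 1.2]

# Toy local-ancestry dosages: copies of the ancestry of interest at each locus.
local_ancestry = {
    "locus_1": [0, 0, 1, 1, 2, 2, 0, 2],
    "locus_2": [1, 0, 2, 1, 0, 2, 1, 1],
}

results = {
    locus: slope_and_t(dosage, creatinine)
    for locus, dosage in local_ancestry.items()
}
top_locus = max(results, key=lambda locus: abs(results[locus][1]))
print(top_locus, results[top_locus])
```

Converting each t statistic to a p-value and plotting -log10(p) against genomic position would give the Manhattan plot described above.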

Anticipated Findings

We expect to find serum creatinine associated with African ancestry in the All of Us cohort. We also hope to find some specific local ancestry loci associated with serum creatinine levels.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Hispanic / Latino Analyses

Scientific Questions Being Studied

The goal of this study is to examine intra-ethnic health disparities among Hispanic participants in the All of Us cohort. Specifically, we would like to see how racial identity and Hispanic ethnicity interact, as well as how genetic ancestry and Hispanic ethnicity interact to influence disease risk, using type 2 diabetes (T2D) as our disease of interest.

Project Purpose(s)

  • Disease Focused Research (Type 2 Diabetes)
  • Ancestry

Scientific Approaches

We plan to use the glm function in the stats package in R to model interaction effects between racial identity and Hispanic ethnicity, as well as the interaction effects between genetic ancestry and Hispanic ethnicity.
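To show what an interaction coefficient measures, the sketch below works through the simplest case: a logistic model with two binary predictors and their product, with no other covariates. In that saturated model the maximum-likelihood coefficients equal differences of group log-odds, so they can be computed directly from counts. The sketch is in Python rather than R, the group labels are neutral placeholders, and the counts are invented.

```python
from math import exp, log

# Hypothetical (cases, controls) counts, keyed by (ethnicity flag, race category flag).
counts = {
    (0, 0): (20, 80),
    (0, 1): (40, 80),
    (1, 0): (30, 60),
    (1, 1): (80, 40),
}


def log_odds(cell):
    cases, controls = counts[cell]
    return log(cases / controls)


# Saturated model: logit P(case) = b0 + b1*E + b2*R + b3*E*R
b0 = log_odds((0, 0))
b1 = log_odds((1, 0)) - log_odds((0, 0))            # ethnicity effect when R = 0
b2 = log_odds((0, 1)) - log_odds((0, 0))            # race effect when E = 0
b3 = (log_odds((1, 1)) - log_odds((1, 0))) - b2     # change in race effect when E = 1

# exp(b3) is the ratio of the two stratum-specific odds ratios.
interaction_ratio = exp(b3)
print(b0, b1, b2, b3, interaction_ratio)
```

With covariates such as age or sex in the model, the coefficients no longer have this closed form and need an iterative fit, but the interpretation of the interaction term is the same.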

Anticipated Findings

We currently hypothesize that racial identity will have a greater effect on disease outcome than ethnicity and that the impact of racial identity on disease outcome will vary between Hispanic and non-Hispanic participants. We also anticipate that the impact of ancestry on disease outcome will vary between Hispanic and non-Hispanic participants.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Vincent Lam - Research Fellow, National Institutes of Health (NIH)
  • Sonali Gupta - Research Assistant, National Institutes of Health (NIH)
  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

GeneBurdenAdmixedPopulations

Scientific Questions Being Studied

The objective of this aim is to identify genes with a high burden of evolutionarily constrained variants for each disease and ancestry group. This will be done by grouping individuals by dominant ancestry or if they are admixed, and then performing rare variant gene-burden tests on each group for each disease. A mask will be applied to the exome variants to account for those with a high Combined Annotation-Dependent Depletion (CADD) score, indicating that the variants are more likely to be deleterious and potentially constrained by evolution. An appropriate regression model that accounts for imbalanced case/control ratios will be selected. The gene-trait associations identified for each group and disease will be compared for level of significance and direction of effect.
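A variant mask of this kind can be sketched as a filter that keeps rare variants above a CADD cutoff and groups them by gene. The thresholds below (CADD of at least 20, minor allele frequency below 1%) and the record fields are assumptions chosen for illustration; the description above does not state the values used.

```python
# Toy variant annotations (made up): id, gene, CADD score, minor allele frequency.
annotations = [
    {"id": "v1", "gene": "GENE_A", "cadd": 25.0, "maf": 0.001},
    {"id": "v2", "gene": "GENE_A", "cadd": 12.0, "maf": 0.002},
    {"id": "v3", "gene": "GENE_A", "cadd": 31.0, "maf": 0.0005},
    {"id": "v4", "gene": "GENE_B", "cadd": 28.0, "maf": 0.05},
    {"id": "v5", "gene": "GENE_B", "cadd": 22.0, "maf": 0.003},
]


def build_mask(annotations, min_cadd=20.0, max_maf=0.01):
    """Group variant ids by gene, keeping rare variants with a high CADD score."""
    mask = {}
    for record in annotations:
        if record["cadd"] >= min_cadd and record["maf"] < max_maf:
            mask.setdefault(record["gene"], []).append(record["id"])
    return mask


mask = build_mask(annotations)
print(mask)  # {'GENE_A': ['v1', 'v3'], 'GENE_B': ['v5']}
```

Each gene's list of qualifying variants would then feed the burden test for that gene within each ancestry group.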

Project Purpose(s)

  • Population Health
  • Ancestry

Scientific Approaches

We will first characterize global and local ancestries of All of Us participants. We will use these to define admixed (two-way between European, African, and Native American) individuals and non-admixed individuals. We will then define masks for the burden tests. Burden tests will then be performed between these cohorts.
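One way to turn global ancestry estimates into admixed and non-admixed groups is a pair of thresholds on the ancestry fractions. The sketch below labels a participant non-admixed when one ancestry reaches a cutoff and two-way admixed when the top two ancestries together reach a cutoff. Both cutoffs (0.9) and the function name are assumptions, not values taken from this workspace.

```python
def ancestry_group(fractions, dominant_min=0.9, pair_min=0.9):
    """Classify one participant from a dict of ancestry name -> fraction."""
    ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
    (top, top_frac), (second, second_frac) = ranked[0], ranked[1]
    if top_frac >= dominant_min:
        return f"non-admixed:{top}"
    if top_frac + second_frac >= pair_min:
        # Sort the pair so the label does not depend on which ancestry is larger.
        return "admixed:" + "+".join(sorted([top, second]))
    return "other"


# Toy global ancestry fractions (made up).
participants = {
    "p1": {"African": 0.95, "European": 0.04, "Native American": 0.01},
    "p2": {"African": 0.60, "European": 0.38, "Native American": 0.02},
    "p3": {"African": 0.40, "European": 0.30, "Native American": 0.30},
}

groups = {pid: ancestry_group(frac) for pid, frac in participants.items()}
print(groups)
```

Participants labeled "other" (for example three-way admixed) would fall outside the two-way comparison and need a separate decision.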

Anticipated Findings

We expect burdens to differ between admixed and non-admixed populations. It will be informative to see how burden changes with ancestry estimates in admixed and non-admixed populations.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology

Pharmacogenomics

Scientific Questions Being Studied

Our project aims to analyze the utility of self-identified race and ethnicity (SIRE) labels in genetically informed drug predictions for different individuals.

Project Purpose(s)

  • Population Health
  • Drug Development

Scientific Approaches

We are going to use whole genome sequencing data filtered for only pharmacogenomic variants. We will then employ principal component analysis followed by a suite of machine learning models to predict SIRE labels using PC data.
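The prediction step takes principal component coordinates as features and a SIRE label as the target. As a stand-in for the suite of machine learning models, which is not specified here, the sketch below uses a nearest-centroid classifier in pure Python on toy PC coordinates. The labels, coordinates, and function names are placeholders.

```python
from math import dist


def fit_centroids(pcs, labels):
    """Average the PC vectors of each label to get one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(pcs, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, value in enumerate(vec):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Toy training data (made up): PC1/PC2 coordinates with placeholder labels.
train_pcs = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
train_labels = ["group_1", "group_1", "group_1", "group_2", "group_2", "group_2"]

centroids = fit_centroids(train_pcs, train_labels)
predictions = [predict(centroids, vec) for vec in [(0.3, 0.2), (1.8, 2.2)]]
print(predictions)  # ['group_1', 'group_2']
```

Prediction accuracy on held-out participants would then indicate how much of the SIRE labeling is recoverable from pharmacogenomic variation alone.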

Anticipated Findings

We anticipate that the PC vectors will capture the pharmacogenomic variation and predict the SIRE labels in the All of Us dataset.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Shivam Sharma - Graduate Trainee, Georgia Institute of Technology