Srushti Gangireddy

Project Personnel, Vanderbilt University Medical Center

10 active projects

Exploring_srWGS_VCF_Data

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics…

Scientific Questions Being Studied

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics Variant Search feature within a Data Browser aims to harness the power of VCF data to provide researchers, clinicians, and geneticists with a user-friendly tool to search, analyze, and visualize genetic variants efficiently. In this exploration, we will delve into the process of integrating VCF data into such an application, highlighting the importance and challenges of this endeavor. VCF files contain a wealth of information about genetic variants, including single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations. Integrating VCF data into the Genomics Variant Search App provides access to a comprehensive variant database that is crucial for research, clinical diagnostics, and drug development.

Project Purpose(s)

  • Other Purpose (Exploring VCF Data)

Scientific Approaches

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics Variant Search feature within a Data Browser aims to harness the power of VCF data to provide researchers, clinicians, and geneticists with a user-friendly tool to search, analyze, and visualize genetic variants efficiently. In this exploration, we will delve into the process of integrating VCF data into such an application, highlighting the importance and challenges of this endeavor. VCF files contain a wealth of information about genetic variants, including single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations. Integrating VCF data into the Genomics Variant Search App provides access to a comprehensive variant database that is crucial for research, clinical diagnostics, and drug development.
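
As a rough illustration of the variant types mentioned above, the sketch below splits one tab-delimited VCF data line into its fixed columns (CHROM, POS, ID, REF, ALT) and labels each ALT allele by comparing allele lengths. The function names and the length-based rule are assumptions made for this example, not the project's actual code; real pipelines normally rely on a dedicated VCF library and handle many more cases.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label one REF/ALT pair using a simple allele-length rule (illustrative only)."""
    if alt in {"*", "."}:
        return "other"       # spanning deletion or missing allele
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "structural"  # symbolic allele or breakend notation
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "MNP"             # same length, more than one base


def parse_vcf_line(line: str) -> list[dict]:
    """Return one record per ALT allele from a single VCF data line."""
    chrom, pos, _variant_id, ref, alts = line.rstrip("\n").split("\t")[:5]
    return [
        {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt,
         "type": classify_variant(ref, alt)}
        for alt in alts.split(",")
    ]
```

Calling `parse_vcf_line` on a line with a comma-separated ALT field yields one record per allele, which is the shape a search index would typically want.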

Anticipated Findings

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics Variant Search feature within a Data Browser aims to harness the power of VCF data to provide researchers, clinicians, and geneticists with a user-friendly tool to search, analyze, and visualize genetic variants efficiently. In this exploration, we will delve into the process of integrating VCF data into such an application, highlighting the importance and challenges of this endeavor. VCF files contain a wealth of information about genetic variants, including single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations. Integrating VCF data into the Genomics Variant Search App provides access to a comprehensive variant database that is crucial for research, clinical diagnostics, and drug development.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Explore_Genomics_VCF_Data

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics…

Scientific Questions Being Studied

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics Variant Search feature within a Data Browser aims to harness the power of VCF data to provide researchers, clinicians, and geneticists with a user-friendly tool to search, analyze, and visualize genetic variants efficiently. In this exploration, we will delve into the process of integrating VCF data into such an application, highlighting the importance and challenges of this endeavor. VCF files contain a wealth of information about genetic variants, including single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations. Integrating VCF data into the Genomics Variant Search App provides access to a comprehensive variant database that is crucial for research, clinical diagnostics, and drug development.

Project Purpose(s)

  • Other Purpose (Exploring VCF data)

Scientific Approaches

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics Variant Search feature within a Data Browser aims to harness the power of VCF data to provide researchers, clinicians, and geneticists with a user-friendly tool to search, analyze, and visualize genetic variants efficiently. In this exploration, we will delve into the process of integrating VCF data into such an application, highlighting the importance and challenges of this endeavor. VCF files contain a wealth of information about genetic variants, including single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations. Integrating VCF data into the Genomics Variant Search App provides access to a variant database that is crucial for research, clinical diagnostics, and drug development.

Anticipated Findings

Genomics is a rapidly evolving field with immense potential to revolutionize healthcare, genetics, and personalized medicine. Central to this field is the management and analysis of genetic variants, which are typically stored in Variant Call Format (VCF) files. A Genomics Variant Search feature within a Data Browser aims to harness the power of VCF data to provide researchers, clinicians, and geneticists with a user-friendly tool to search, analyze, and visualize genetic variants efficiently. In this exploration, we will delve into the process of integrating VCF data into such an application, highlighting the importance and challenges of this endeavor. VCF files contain a wealth of information about genetic variants, including single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations. Integrating VCF data into the Genomics Variant Search App provides access to a variant database that is crucial for research, clinical diagnostics, and drug development.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

V7 PASC Workspace

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Scientific Questions Being Studied

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Project Purpose(s)

  • Educational
  • Ancestry
  • Other Purpose (practice notebook to familiarize with RW)

Scientific Approaches

We will apply algorithms developed by the RECOVER PCORnet Adult Cohort and compare the overlap in cohorts with the set derived through the N3C algorithm.
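
One simple way to quantify the cohort overlap described above is to compare the two sets of participant identifiers directly. The sketch below is a minimal illustration; the function name, the use of integer IDs, and the Jaccard index as a summary are assumptions for this example and are not stated in the project description.

```python
def cohort_overlap(cohort_a: set[int], cohort_b: set[int]) -> dict:
    """Summarize how two cohorts of participant IDs overlap."""
    both = cohort_a & cohort_b
    union = cohort_a | cohort_b
    return {
        "both": len(both),                   # flagged by both algorithms
        "only_a": len(cohort_a - cohort_b),  # flagged by the first algorithm only
        "only_b": len(cohort_b - cohort_a),  # flagged by the second algorithm only
        # Jaccard index: shared members divided by all members (0.0 if both are empty)
        "jaccard": len(both) / len(union) if union else 0.0,
    }
```

A Jaccard index near 1 would indicate that the two algorithms select nearly the same participants.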

Anticipated Findings

We expect to find a high degree of concordance between the RECOVER Adult Cohort algorithm and the N3C algorithm, even though the approaches were developed through different machine learning methods on different source patient data sets.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Lina Sulieman - Other, All of Us Program Operational Use

AOU_Recover_Long_Covid_v6

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Scientific Questions Being Studied

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

To achieve this objective, data science workflows were used to apply ML algorithms on the Researcher Workbench. This effort expanded the number of participants used to evaluate the ML models that identify risk of PASC/Long COVID. It also serves to validate the efforts of one team and to provide insight to other teams. These models were implemented within the All of Us Controlled Tier data (C2022Q2R2), which was last refreshed on June 22, 2022. We intend to provide a step-by-step guide for the implementation of N3C's ML Model for identification of PASC/Long COVID Phenotype in the All of Us dataset.

Anticipated Findings

We intend to provide a step-by-step guide for the implementation of N3C's ML Model for identification of PASC/Long COVID Phenotype in the All of Us dataset.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • WeiQi Wei - Other, All of Us Program Operational Use
  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Mark Weiner - Mid-career Tenured Researcher, Cornell University
  • Hiral Master - Project Personnel, All of Us Program Operational Use
  • Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
  • David Mohs - Other, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Chenchal Subraveti - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use
  • Chris Lunt - Other, All of Us Program Operational Use

Practice Notebook to Explore AoU dataset

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Scientific Questions Being Studied

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Project Purpose(s)

  • Educational
  • Other Purpose (practice notebook to familiarize with RW)

Scientific Approaches

We will apply algorithms developed by the RECOVER PCORnet Adult Cohort and compare the overlap in cohorts with the set derived through the N3C algorithm.

Anticipated Findings

We expect to find a high degree of concordance between the RECOVER Adult Cohort algorithm and the N3C algorithm, even though the approaches were developed through different machine learning methods on different source patient data sets.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Mark Weiner - Mid-career Tenured Researcher, Cornell University
  • Hiral Master - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Aashri Aggarwal - Undergraduate Student, Cornell University

Exploring_AOU_Data

This study will identify patients who tested positive for COVID-19 and the conditions that occur after COVID-19. We would like to explore genomic and Fitbit data and see whether they add value in determining long COVID status.

Scientific Questions Being Studied

This study will identify patients who tested positive for COVID-19 and the conditions that occur after COVID-19. We would like to explore genomic and Fitbit data and see whether they add value in determining long COVID status.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

This project uses the RECOVER algorithm to identify patients with long COVID.
We are using the XGBoost library to run the model developed by N3C.

Anticipated Findings

We plan to study how different data types, such as Fitbit and genomic data, contribute to determining whether a patient has long COVID.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • QiPing Feng - Early Career Tenure-track Researcher, Vanderbilt University Medical Center

Collaborators:

  • Elliot Outland - Project Personnel, Vanderbilt University Medical Center

Duplicate of Demo - Siloed Analysis of All of Us and UK Biobank Genomic Data

Historically, researchers responded to limitations in genomic data sharing policy and practice by conducting meta-analysis on summary outputs from isolated genomic datasets. Recent work has demonstrated the increased power of individual-level genetic analysis on pooled datasets. In addition, advancements…

Scientific Questions Being Studied

Historically, researchers responded to limitations in genomic data sharing policy and practice by conducting meta-analysis on summary outputs from isolated genomic datasets. Recent work has demonstrated the increased power of individual-level genetic analysis on pooled datasets. In addition, advancements in data access and sharing policies, coupled with technological advancements in cloud-based environments for data access and analysis, have opened up new possibilities for pooled analysis of large-scale genomic datasets. The NIH All of Us Research Program and UK Biobank are two leading examples of large, population-scale studies that combine genomic data with deep phenotypic health data. There is a grand opportunity to demonstrate how the world’s largest research-ready biomedical datasets can create more value together and advance discovery in genome science.

Project Purpose(s)

  • Other Purpose (This is a demonstration project meant to support research with All of Us Genomic Data)

Scientific Approaches

The primary goal of this project is to demonstrate the potential of the All of Us Researcher Workbench for pooled analyses of All of Us and UK Biobank data. Specifically, we aim to: 1. Develop and describe an approved, secure path for connecting UK Biobank data to the All of Us Researcher Workbench. 2. Conduct a genome-wide association study of blood lipids on the pooled dataset aimed at demonstrating that biomedical researchers can be more productive when permitted to analyze the union of the cohorts, as opposed to computing aggregate results in separate data silos for each cohort and then combining those aggregates.

Anticipated Findings

The secondary goal of this project is to demonstrate and measure the experience when the same analyses are repeated in a siloed manner. Specifically, we aim to: 3. Repeat the previously described genome-wide association study on the All of Us Researcher Workbench when working with the All of Us data and on UK Biobank’s DNAnexus when working with the UK Biobank data. 4. Conduct a meta-analysis on the aggregate results for each cohort (in accordance with each program’s data use policies) and compare the result of combining those aggregates to the results from the pooled analysis. Evaluate not only differences in results, but also differences in analysis cost and analyst productivity.
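
To make the siloed step concrete, the sketch below combines per-cohort effect estimates for a single variant with an inverse-variance-weighted fixed-effect meta-analysis. The project description does not say which meta-analysis method is used, so this choice is an assumption, as are the function name and the `(beta, standard_error)` input format.

```python
import math


def fixed_effect_meta(estimates: list[tuple[float, float]]) -> tuple[float, float]:
    """Combine (beta, standard_error) pairs from separate cohorts.

    Each cohort is weighted by 1 / se^2, so more precise estimates count more.
    Returns the pooled beta and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se
```

Only the summary statistics cross the boundary between platforms here, which is what distinguishes this approach from a pooled analysis of individual-level data.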

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Implementing Recover Algorithm on AOU Data

Identify potential long COVID patients among three groups in the database: all COVID-19 patients, patients hospitalized with COVID-19, and patients who had COVID-19 but were not hospitalized. The models proved to be accurate, as people identified as at risk for…

Scientific Questions Being Studied

Identify potential long COVID patients among three groups in the database: all COVID-19 patients, patients hospitalized with COVID-19, and patients who had COVID-19 but were not hospitalized. The models proved to be accurate, as people identified as at risk for long COVID were similar to patients seen at long COVID clinics.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

An XGBoost machine learning model is developed to identify potential patients with long COVID.
The base population is defined as any non-deceased adult patient (age ≥18 years) with either an International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) COVID-19 diagnosis code (U07.1) from an inpatient or emergency visit, or a positive SARS-CoV-2 PCR or antigen test, and for whom at least 90 days have passed since the COVID-19 index date.
The model examines demographics, health-care utilization, diagnoses, and medications for adults with COVID-19.
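
The base population rule above can be read as a small eligibility check. The sketch below is illustrative only: the `Patient` fields and function names are hypothetical, and taking the index date as the earliest qualifying diagnosis or positive test is an assumption, since the description does not define it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    age: int
    deceased: bool
    # Earliest U07.1 diagnosis from an inpatient or emergency visit, if any
    u071_inpatient_or_ed_date: Optional[date]
    # Earliest positive SARS-CoV-2 PCR or antigen test, if any
    positive_test_date: Optional[date]


def covid_index_date(p: Patient) -> Optional[date]:
    """Assumed index date: the earliest qualifying COVID-19 event."""
    dates = [d for d in (p.u071_inpatient_or_ed_date, p.positive_test_date) if d is not None]
    return min(dates) if dates else None


def in_base_population(p: Patient, as_of: date) -> bool:
    """Non-deceased adult with a qualifying event at least 90 days before `as_of`."""
    index = covid_index_date(p)
    return (
        not p.deceased
        and p.age >= 18
        and index is not None
        and (as_of - index).days >= 90
    )
```

In practice these fields would be derived from OMOP condition, visit, and measurement tables rather than stored on a single record.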

Anticipated Findings

Identify, with high accuracy, patients who potentially have long COVID. Find the most important features.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • WeiQi Wei - Other, All of Us Program Operational Use
  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Hiral Master - Project Personnel, All of Us Program Operational Use
  • Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)

Collaborators:

  • Chris Lunt - Other, All of Us Program Operational Use

RECOVER+AoU

The goal of this initial cross-platform testing effort is focused on expanding the analytical capability of available data sources that have collected data on SARS-CoV-2. As we gather data across the US, we can use independent data sources to better…

Scientific Questions Being Studied

The goal of this initial cross-platform testing effort is focused on expanding the analytical capability of available data sources that have collected data on SARS-CoV-2. As we gather data across the US, we can use independent data sources to better understand PASC in our population and identify possible interventions. As a first step, we hope to leverage available RECOVER data tools and apply them within the All of Us Researcher Workbench to assess cross-platform interoperability and analytical equivalence. This would provide a path to engage our research community and guide research towards our understanding of PASC.

Project Purpose(s)

  • Population Health
  • Methods Development
  • Control Set
  • Other Purpose (Testing PASC ML Algorithm from N3C-RECOVER in AoU Platform)

Scientific Approaches

Bring existing data query code and data analytics code from the RECOVER researcher team into the All of Us Researcher Workbench. Use “equivalent” code sets to explore and expand our understanding of PASC and its effects on the US population. Share reproducible findings through programming “notebook” and analysis of standardized datasets (OMOP).

Anticipated Findings

This research activity will be developed in conjunction with an awareness campaign of the collaborative efforts undertaken by both RECOVER and AoU. We intend to highlight the available datasets with SARS-CoV-2 data, as well as the cloud-based researcher workspaces (RECOVER, AoU). With the awareness campaign and cross-platform testing, we intend to create an on-ramp for experienced and young researchers within two large and diverse datasets.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • WeiQi Wei - Other, All of Us Program Operational Use
  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Hiral Master - Project Personnel, All of Us Program Operational Use
  • Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
  • Chris Lunt - Other, All of Us Program Operational Use

Srushti_LongCovid

Train machine learning models to identify potential long-COVID patients among (1) all COVID-19 patients, (2) patients hospitalized with COVID-19, and (3) patients who had COVID-19 but were not hospitalized.

Scientific Questions Being Studied

Train machine learning models to identify potential long-COVID patients among (1) all COVID-19 patients, (2) patients hospitalized with COVID-19, and (3) patients who had COVID-19 but were not hospitalized.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

To reflect that long-COVID may look different depending on the severity of the patient’s acute COVID-19, we built three different ML models using the three-site subset: (1) all patients, (2) patients who had been hospitalized with acute COVID-19, and (3) patients who were not hospitalized. The intent of each model is to identify the patients most likely to have long-COVID, using attendance at a long-COVID specialty clinic as a proxy for long-COVID diagnosis. To train and test each model, patients were randomly sampled to yield similar patient counts in both classes (long-COVID clinic patients and patients who did not attend the long-COVID clinic). For the all-patient model, data were also sampled to yield similar numbers of hospitalized and non-hospitalized patients.
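
The class balancing described above can be sketched as random downsampling of the larger class. This is a minimal illustration under assumptions: the function name and seed are hypothetical, and it draws exactly equal counts, whereas the description only says the counts were "similar".

```python
import random


def balance_classes(positives: list, negatives: list, seed: int = 0) -> tuple[list, list]:
    """Randomly downsample the larger class so both classes have the same size."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    n = min(len(positives), len(negatives))
    # random.Random.sample draws without replacement
    return rng.sample(positives, n), rng.sample(negatives, n)
```

For the all-patient model, the same helper could be applied a second time within each class to even out hospitalized and non-hospitalized patients.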

Anticipated Findings

The combined demographics of the long-COVID clinic patients show significant differences from the COVID-19 patients at those sites who did not attend the long-COVID clinic (third and fourth columns of Table 1). Notably, non-hospitalized long-COVID clinic patients are disproportionately female.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:
