Sri Raj
Early Career Tenure-track Researcher, Albert Einstein College of Medicine
12 active projects
Identity-by-descent in the United States - Ishana
Scientific Questions Being Studied
We are leveraging the full genomic and population diversity of the All of Us project to understand the genetic ancestral basis of diversity in the causes, etiology and treatment of health outcomes. All of Us provides the racial and ethnic background of participants, but these are inaccurate proxies for genetic ancestry. Measuring genetic ancestry directly will help us understand the contribution of genetic ancestral differences among individuals to the biological basis of health outcomes. Therefore, we will measure genetic diversity and identify the genetic ancestry of All of Us participants throughout the United States. This information will help us better understand biological variation that contributes to differences in health outcomes.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
We are first quantifying fine-scale population substructure using genomic approaches that measure: a) global genetic diversity, or the total proportion of different global ancestries represented in an individual's genome; b) local genetic ancestry, or where in the genome this ancestry is located in an individual; c) genomic segments shared identity-by-descent (IBD). IBD segments are stretches of DNA that individuals share because they inherited them from a common ancestor. We are using Hail, PLINK, ADMIXTURE, RFMix, MOSAIC, TBWPT, in-house Python and R scripts, and other genomic software to capture this variation.
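As a rough illustration of what an IBD segment is, the toy sketch below scans two haplotypes for long runs of identical alleles. It is not how the tools listed above work: real IBD callers operate on phased genome-wide data and tolerate genotyping and phasing errors. The function name `shared_segments`, the 2 cM threshold and the example data are all assumptions made for this illustration.

```python
from typing import List, Tuple


def shared_segments(
    hap_a: List[int],
    hap_b: List[int],
    positions_cm: List[float],
    min_length_cm: float = 2.0,  # hypothetical threshold, chosen for the example
) -> List[Tuple[float, float]]:
    """Return (start_cM, end_cM) for runs of identical alleles of sufficient length."""
    if not (len(hap_a) == len(hap_b) == len(positions_cm)):
        raise ValueError("haplotypes and positions must have equal length")
    segments = []
    run_start = None  # index of the first matching site in the current run
    n = len(hap_a)
    # Iterate one step past the end so that a run reaching the last site is closed.
    for i in range(n + 1):
        matching = i < n and hap_a[i] == hap_b[i]
        if matching and run_start is None:
            run_start = i
        elif not matching and run_start is not None:
            start_cm, end_cm = positions_cm[run_start], positions_cm[i - 1]
            if end_cm - start_cm >= min_length_cm:
                segments.append((start_cm, end_cm))
            run_start = None
    return segments


# Toy data: the haplotypes differ only at the fifth site.
hap_a = [0, 1, 1, 0, 1, 0, 0, 1]
hap_b = [0, 1, 1, 0, 0, 0, 0, 1]
positions_cm = [0.0, 0.5, 1.0, 2.5, 3.0, 3.2, 3.4, 3.6]
print(shared_segments(hap_a, hap_b, positions_cm))
```

The length filter matters because short identical runs are common by chance, whereas long ones are more likely to reflect recent shared ancestry.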
Anticipated Findings
We anticipate that we will identify founder populations that are distributed differently across the United States, and distinguish population subgroups that are finer grained than either racial categories or continental ancestry categories. For example, the Latino ethnicity comprises individuals who are Dominican, Puerto Rican, Mexican, Cuban, etc. We anticipate being able to distinguish these groups, as well as the admixture among these groups, to more accurately understand the contribution of ancestry to health outcomes. Quantification of this ancestry is the first step to understanding the biological diversity within the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
Collaborators:
- William Jerome - Graduate Trainee, Albert Einstein College of Medicine
- Tinaye Mutetwa - Graduate Trainee, Albert Einstein College of Medicine
- Srilakshmi Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
- Kevin Tao - Graduate Trainee, Albert Einstein College of Medicine
- Hersh Gupta - Graduate Trainee, Albert Einstein College of Medicine
- Chynna Smith - Graduate Trainee, Albert Einstein College of Medicine
- Varun Gupta - Project Personnel, Albert Einstein College of Medicine
- Roberto Ortega - Graduate Trainee, Albert Einstein College of Medicine
- chenxin zhang - Project Personnel, Albert Einstein College of Medicine
Identity-by-descent in the United States
Scientific Questions Being Studied
We are leveraging the full genomic and population diversity of the All of Us project to understand the genetic ancestral basis of diversity in the causes, etiology and treatment of health outcomes. All of Us provides the racial and ethnic background of participants, but these are inaccurate proxies for genetic ancestry. Measuring genetic ancestry directly will help us understand the contribution of genetic ancestral differences among individuals to the biological basis of health outcomes. Therefore, we will measure genetic diversity and identify the genetic ancestry of All of Us participants throughout the United States. This information will help us better understand biological variation that contributes to differences in health outcomes.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
We are first quantifying fine-scale population substructure using genomic approaches that measure: a) global genetic diversity, or the total proportion of different global ancestries represented in an individual's genome; b) local genetic ancestry, or where in the genome this ancestry is located in an individual; c) genomic segments shared identity-by-descent (IBD). IBD segments are stretches of DNA that individuals share because they inherited them from a common ancestor. We are using Hail, PLINK, ADMIXTURE, RFMix, MOSAIC, TBWPT, in-house Python and R scripts, and other genomic software to capture this variation.
Anticipated Findings
We anticipate that we will identify founder populations that are distributed differently across the United States, and distinguish population subgroups that are finer grained than either racial categories or continental ancestry categories. For example, the Latino ethnicity comprises individuals who are Dominican, Puerto Rican, Mexican, Cuban, etc. We anticipate being able to distinguish these groups, as well as the admixture among these groups, to more accurately understand the contribution of ancestry to health outcomes. Quantification of this ancestry is the first step to understanding the biological diversity within the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Collaborators:
- William Jerome - Graduate Trainee, Albert Einstein College of Medicine
- Tinaye Mutetwa - Graduate Trainee, Albert Einstein College of Medicine
- Srilakshmi Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
- Kevin Tao - Graduate Trainee, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
- Hersh Gupta - Graduate Trainee, Albert Einstein College of Medicine
- Chynna Smith - Graduate Trainee, Albert Einstein College of Medicine
- Varun Gupta - Project Personnel, Albert Einstein College of Medicine
- Roberto Ortega - Graduate Trainee, Albert Einstein College of Medicine
- chenxin zhang - Project Personnel, Albert Einstein College of Medicine
Admixture mapping and identity by descent mapping in the United States
Scientific Questions Being Studied
We will use detailed genetic ancestry information to identify novel associations of genetic variants with diseases. This project is enabled by the diversity of the All of Us Research Program. We will integrate our research group's previous analyses of the ancestry of All of Us participants with phenotype information to identify these associations.
Project Purpose(s)
- Population Health
- Methods Development
- Ancestry
Scientific Approaches
We will use identity-by-descent mapping and admixture mapping approaches to associate local genomic ancestry, and identity-by-descent segments shared between pairs of individuals, with phecode-level health information on All of Us participants. This will be done using Hail, PLINK v2.0 and other software. We will carefully correct for covariates such as age, sex, smoking status and social determinants of health, and iterate on this analysis to understand how genetic ancestry can identify new genetic variants associated with the various phenotypes available for All of Us participants.
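The core idea of admixture mapping can be sketched as a per-locus comparison of local ancestry between cases and controls. The example below is a deliberately simplified illustration, not the project's pipeline: it computes a Welch-style statistic at a single locus and omits the covariate adjustment described above, which a real analysis would handle with a regression model. The function name `ancestry_case_control_z` and the toy data are assumptions made for this illustration.

```python
from math import sqrt
from statistics import mean, variance
from typing import List


def ancestry_case_control_z(dosages: List[int], is_case: List[bool]) -> float:
    """Compare mean local-ancestry dosage between cases and controls at one locus.

    dosages[i] is the number of copies (0, 1 or 2) that individual i carries
    from one reference ancestry at this locus. Covariates are NOT adjusted for.
    """
    if len(dosages) != len(is_case):
        raise ValueError("dosages and is_case must have equal length")
    cases = [d for d, c in zip(dosages, is_case) if c]
    controls = [d for d, c in zip(dosages, is_case) if not c]
    if len(cases) < 2 or len(controls) < 2:
        raise ValueError("need at least two cases and two controls")
    # Standard error of the difference in means, using unequal variances.
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    if se == 0:
        return 0.0
    return (mean(cases) - mean(controls)) / se


# Toy locus: cases carry more copies of the reference ancestry than controls.
dosages = [2, 2, 1, 2, 0, 1, 0, 1]
is_case = [True, True, True, True, False, False, False, False]
print(round(ancestry_case_control_z(dosages, is_case), 3))
```

A large absolute value at a locus suggests that ancestry there tracks with the phenotype, which is the signal an admixture scan looks for across the genome.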
Anticipated Findings
We expect to find several new variants, given the diversity of the All of Us sample and the large sample size of these diverse participants. This information may be used to better understand the genetic basis of heterogeneity in disease risk, etiology and outcome in the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Collaborators:
- Srilakshmi Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
Urban genetics
Scientific Questions Being Studied
The world has witnessed ongoing urbanization and globalization for decades. Currently, more than half of the world’s population lives in urban areas, and the proportion is expected to rise to about 66% by 2050. The number of people living outside their country of origin has also been increasing over the last two decades. This global trend is evident in the largest cities in the world, such as New York, suggesting rapid and extensive admixture of individuals with different ancestries within the city. This admixture differs in many respects from the admixture that occurred in the past. Thus, it is important to understand, from the perspective of population genetics, what happens to genomes in the city.
Additionally, the urban environment is distinctive: for example, high population density, heavy traffic, heavy air pollution and low levels of physical activity. These distinctive environments can drive evolution through gene-environment interaction.
Project Purpose(s)
- Ancestry
Scientific Approaches
To assess which genetic features of cities differ from those of rural areas, we will categorize All of Us participants by level of urbanization based on zip code data and compare several statistics (e.g. genetic diversity). To understand what is happening in a large city, we will analyze population structure in New York City using PCA, ADMIXTURE, UMAP and an IBD-sharing network, and compare it with that of New York State. We will also infer local ancestry for each individual using MOSAIC and gnomix to compare the ancestry composition of each individual's genome.
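One statistic that could serve for the urban versus rural comparison is expected heterozygosity, a common measure of genetic diversity. The sketch below is only an illustration under stated assumptions: the text does not say which diversity statistics will be used, and the function name `expected_heterozygosity`, the group labels and the toy genotypes are hypothetical.

```python
from typing import List


def expected_heterozygosity(genotypes: List[List[int]]) -> float:
    """Mean of 2p(1-p) over sites, where p is the alternate-allele frequency.

    genotypes[i][s] is the alternate-allele count (0, 1 or 2) of individual i
    at site s. All individuals must be genotyped at the same sites.
    """
    if not genotypes or not genotypes[0]:
        raise ValueError("need at least one individual and one site")
    n_individuals = len(genotypes)
    n_sites = len(genotypes[0])
    total = 0.0
    for s in range(n_sites):
        # Each diploid individual contributes two alleles to the frequency.
        p = sum(ind[s] for ind in genotypes) / (2 * n_individuals)
        total += 2 * p * (1 - p)
    return total / n_sites


# Hypothetical groups, as if participants had been binned by zip code.
urban = [[0, 1], [2, 1], [1, 0], [1, 2]]
rural = [[0, 0], [0, 1], [0, 0], [0, 1]]
print(expected_heterozygosity(urban), expected_heterozygosity(rural))
```

A comparison on real data would also need to account for differences in sample size and ancestry composition between the groups.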
Anticipated Findings
By analyzing NYC participants, we expect to find several founder populations and many recently admixed individuals. By incorporating EHR data, we will be able to find population-specific diseases and associated variants. This may improve our understanding of disease background in underrepresented populations, such as admixed populations and small founder populations.
We anticipate higher genetic diversity and a higher rate of assortative mating in the city than in rural areas. Our findings may help shape future frameworks for genomic studies such as genome-wide association studies (GWAS), because some genetic features, such as assortative mating, violate the assumptions of the current framework.
Inferring local ancestry is important for finding disease-associated variants in admixed individuals. We will provide the results to other researchers to advance genetic research in admixed populations.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
Collaborators:
- Srilakshmi Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Duplicate of Admixture mapping and IBD mapping in the United States
Scientific Questions Being Studied
We will use detailed genetic ancestry information to identify novel associations of genetic variants with diseases. This project is enabled by the diversity of the All of Us Research Program. We will integrate our research group's previous analyses of the ancestry of All of Us participants with phenotype information to identify these associations.
Project Purpose(s)
- Population Health
- Methods Development
- Ancestry
Scientific Approaches
We will use identity-by-descent mapping and admixture mapping approaches to associate local genomic ancestry, and identity-by-descent segments shared between pairs of individuals, with phecode-level health information on All of Us participants. This will be done using Hail, PLINK v2.0 and other software. We will carefully correct for covariates such as age, sex, smoking status and social determinants of health, and iterate on this analysis to understand how genetic ancestry can identify new genetic variants associated with the various phenotypes available for All of Us participants.
Anticipated Findings
We expect to find several new variants, given the diversity of the All of Us sample and the large sample size of these diverse participants. This information may be used to better understand the genetic basis of heterogeneity in disease risk, etiology and outcome in the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
Collaborators:
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
Duplicate of Identity-by-descent in the United States
Scientific Questions Being Studied
We are leveraging the full genomic and population diversity of the All of Us project to understand the genetic ancestral basis of diversity in the causes, etiology and treatment of health outcomes. All of Us provides the racial and ethnic background of participants, but these are inaccurate proxies for genetic ancestry. Measuring genetic ancestry directly will help us understand the contribution of genetic ancestral differences among individuals to the biological basis of health outcomes. Therefore, we will measure genetic diversity and identify the genetic ancestry of All of Us participants throughout the United States. This information will help us better understand biological variation that contributes to differences in health outcomes.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
We are first quantifying fine-scale population substructure using genomic approaches that measure: a) global genetic diversity, or the total proportion of different global ancestries represented in an individual's genome; b) local genetic ancestry, or where in the genome this ancestry is located in an individual; c) genomic segments shared identity-by-descent (IBD). IBD segments are stretches of DNA that individuals share because they inherited them from a common ancestor. We are using Hail, PLINK, ADMIXTURE, RFMix, MOSAIC, TBWPT, in-house Python and R scripts, and other genomic software to capture this variation.
Anticipated Findings
We anticipate that we will identify founder populations that are distributed differently across the United States, and distinguish population subgroups that are finer grained than either racial categories or continental ancestry categories. For example, the Latino ethnicity comprises individuals who are Dominican, Puerto Rican, Mexican, Cuban, etc. We anticipate being able to distinguish these groups, as well as the admixture among these groups, to more accurately understand the contribution of ancestry to health outcomes. Quantification of this ancestry is the first step to understanding the biological diversity within the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- William Jerome - Graduate Trainee, Albert Einstein College of Medicine
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Collaborators:
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
- Kevin Tao - Graduate Trainee, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
- Hersh Gupta - Graduate Trainee, Albert Einstein College of Medicine
- Chynna Smith - Graduate Trainee, Albert Einstein College of Medicine
- chenxin zhang - Project Personnel, Albert Einstein College of Medicine
- Tinaye Mutetwa - Graduate Trainee, Albert Einstein College of Medicine
Not_Assigned_Duplicate of Identity-by-descent in the United States
Scientific Questions Being Studied
We are leveraging the full genomic and population diversity of the All of Us project to understand the genetic ancestral basis of diversity in the causes, etiology and treatment of health outcomes. All of Us provides the racial and ethnic background of participants, but these are inaccurate proxies for genetic ancestry. Measuring genetic ancestry directly will help us understand the contribution of genetic ancestral differences among individuals to the biological basis of health outcomes. Therefore, we will measure genetic diversity and identify the genetic ancestry of All of Us participants throughout the United States. This information will help us better understand biological variation that contributes to differences in health outcomes.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
We are first quantifying fine-scale population substructure using genomic approaches that measure: a) global genetic diversity, or the total proportion of different global ancestries represented in an individual's genome; b) local genetic ancestry, or where in the genome this ancestry is located in an individual; c) genomic segments shared identity-by-descent (IBD). IBD segments are stretches of DNA that individuals share because they inherited them from a common ancestor. We are using Hail, PLINK, ADMIXTURE, RFMix, MOSAIC, TBWPT, in-house Python and R scripts, and other genomic software to capture this variation.
Anticipated Findings
We anticipate that we will identify founder populations that are distributed differently across the United States, and distinguish population subgroups that are finer grained than either racial categories or continental ancestry categories. For example, the Latino ethnicity comprises individuals who are Dominican, Puerto Rican, Mexican, Cuban, etc. We anticipate being able to distinguish these groups, as well as the admixture among these groups, to more accurately understand the contribution of ancestry to health outcomes. Quantification of this ancestry is the first step to understanding the biological diversity within the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Chynna Smith - Graduate Trainee, Albert Einstein College of Medicine
Collaborators:
- William Jerome - Graduate Trainee, Albert Einstein College of Medicine
- Tinaye Mutetwa - Graduate Trainee, Albert Einstein College of Medicine
- Srilakshmi Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
- Kevin Tao - Graduate Trainee, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
- Hersh Gupta - Graduate Trainee, Albert Einstein College of Medicine
- chenxin zhang - Project Personnel, Albert Einstein College of Medicine
- Varun Gupta - Project Personnel, Albert Einstein College of Medicine
Duplicate of test_cancers_AoUv7
Scientific Questions Being Studied
We aim to understand the role of ancestry in cancer risk in diverse populations. We will look at a variety of cancers that show population variation in outcome.
Project Purpose(s)
- Disease Focused Research (cancers)
- Educational
Scientific Approaches
Datasets: All of Us
Methods: statistical epidemiology - significant enrichment of genetic and non-genetic risk factors in cancer risk
Anticipated Findings
We anticipate that we will find variation in risk factors among individuals of different ethnicities and different ancestral backgrounds.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Duplicate of test_genetic structure_AoUv7
Scientific Questions Being Studied
I aim to understand the population genetic structure and genetic ancestry of All of Us participants. By doing so, and understanding their spatial variation, I hope to inform local healthcare strategies that benefit the local population.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
Dataset: WGS of AoU participants
Methods: ADMIXTURE, Identity by descent, network and cluster analysis
Tools: Hail, Python, PLINK
Anticipated Findings
1. elucidate population structure
2. better understanding of ancestry that will inform public health
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Collaborators:
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
Chynna Duplicate of Identity-by-descent in the United States
Scientific Questions Being Studied
We are leveraging the full genomic and population diversity of the All of Us project to understand the genetic ancestral basis of diversity in the causes, etiology and treatment of health outcomes. All of Us provides the racial and ethnic background of participants, but these are inaccurate proxies for genetic ancestry. Measuring genetic ancestry directly will help us understand the contribution of genetic ancestral differences among individuals to the biological basis of health outcomes. Therefore, we will measure genetic diversity and identify the genetic ancestry of All of Us participants throughout the United States. This information will help us better understand biological variation that contributes to differences in health outcomes.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
We are first quantifying fine-scale population substructure using genomic approaches that measure: a) global genetic diversity, or the total proportion of different global ancestries represented in an individual's genome; b) local genetic ancestry, or where in the genome this ancestry is located in an individual; c) genomic segments shared identity-by-descent (IBD). IBD segments are stretches of DNA that individuals share because they inherited them from a common ancestor. We are using Hail, PLINK, ADMIXTURE, RFMix, MOSAIC, TBWPT, in-house Python and R scripts, and other genomic software to capture this variation.
Anticipated Findings
We anticipate that we will identify founder populations that are distributed differently across the United States, and distinguish population subgroups that are finer grained than either racial categories or continental ancestry categories. For example, the Latino ethnicity comprises individuals who are Dominican, Puerto Rican, Mexican, Cuban, etc. We anticipate being able to distinguish these groups, as well as the admixture among these groups, to more accurately understand the contribution of ancestry to health outcomes. Quantification of this ancestry is the first step to understanding the biological diversity within the United States.
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Chynna Smith - Graduate Trainee, Albert Einstein College of Medicine
Collaborators:
- William Jerome - Graduate Trainee, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
- Kevin Tao - Graduate Trainee, Albert Einstein College of Medicine
- Ishana Raghuram - Project Personnel, University of California, Berkeley
- Hersh Gupta - Graduate Trainee, Albert Einstein College of Medicine
- chenxin zhang - Project Personnel, Albert Einstein College of Medicine
- Tinaye Mutetwa - Graduate Trainee, Albert Einstein College of Medicine
Leveraging health systems data to achieve health equity in the US
Scientific Questions Being Studied
People with many different racial and ethnic backgrounds live in the US. The knowledge accumulated through genomic research, however, is biased toward European ancestry. An individual's disease risk is affected by their race, ethnicity and ancestry because of shared genetic and environmental factors. Therefore, it is important to understand genetic diversity in the US in order to achieve health equity. All of Us contains genomic data from more than 160K individuals with diverse ancestral backgrounds. We therefore expect to identify fine-scale population structure in the US and to detect genomic factors underlying rare and common diseases in various US populations.
Project Purpose(s)
- Ancestry
Scientific Approaches
We will analyze fine-scale population structure by detecting identity-by-descent (IBD) segments shared between every pair of individuals. Through this approach we can identify IBD clusters based on shared ancestry, which are likely to include populations that have so far been underrepresented in genomic research. Using zip code data, we can also examine the geographic distribution of each cluster and the strength of its founder event. Then, we will examine the association between each cluster and disease, and detect variants associated with rare and common diseases by IBD mapping.
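The clustering step can be pictured as building a network in which individuals are linked when they share enough IBD, then grouping connected individuals. The sketch below uses plain connected components via union-find as a stand-in. The text does not specify a clustering algorithm, and studies of this kind often use community detection methods instead. The function name `ibd_clusters`, the 20 cM threshold and the toy data are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def ibd_clusters(
    pairwise_ibd_cm: Dict[Tuple[str, str], float],
    min_total_cm: float = 20.0,  # hypothetical threshold, chosen for the example
) -> List[List[str]]:
    """Group individuals linked by total pairwise IBD of at least min_total_cm."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for (a, b), total_cm in pairwise_ibd_cm.items():
        root_a, root_b = find(a), find(b)  # registers both individuals
        if total_cm >= min_total_cm and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for individual in list(parent):
        groups[find(individual)].add(individual)
    # Largest clusters first, ties broken alphabetically for stable output.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Toy data: total IBD sharing in cM for each pair of hypothetical individuals.
sharing = {("A", "B"): 35.0, ("B", "C"): 22.0, ("C", "D"): 5.0, ("D", "E"): 40.0}
print(ibd_clusters(sharing))
```

Connected components are easy to reason about but tend to merge groups joined by a single strong link, which is one reason real analyses often prefer community detection.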
Anticipated Findings
We anticipate finding many IBD clusters with different ancestral backgrounds across the US. Each cluster may have different disease risks. The frequency and variation of genetic factors may also differ between clusters. This information would be helpful for disease screening, diagnosis and treatment. Our findings may help reduce health disparities.
Demographic Categories of Interest
- Race / Ethnicity
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine
Collaborators:
- Srilakshmi Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
- Varun Gupta - Project Personnel, Albert Einstein College of Medicine
test_genetic structure
Scientific Questions Being Studied
I aim to understand the population genetic structure and genetic ancestry of All of Us participants. By doing so, and understanding their spatial variation, I hope to inform local healthcare strategies that benefit the local population.
Project Purpose(s)
- Population Health
- Ancestry
Scientific Approaches
Dataset: WGS of AoU participants
Methods: ADMIXTURE, Identity by descent, network and cluster analysis
Tools: Hail, Python, PLINK
Anticipated Findings
1. elucidate population structure
2. better understanding of ancestry that will inform public health
Demographic Categories of Interest
This study will not center on underrepresented populations.
Data Set Used
Controlled Tier
Research Team
Owner:
- Sri Raj - Early Career Tenure-track Researcher, Albert Einstein College of Medicine
Collaborators:
- Mariko Segawa - Research Fellow, Albert Einstein College of Medicine