Francis Ratsimbazafy

All of Us Program Operational Use

5 active projects

Descriptive Statistics

Scientific Questions Being Studied

As a demonstration project, this study will present an overview of the available data types by participant count. Surveys are separated into Part 1, the first three surveys participants completed (“The Basics”, “Overall Health”, and “Lifestyle”), and Part 2, the second set of three surveys (“Healthcare Access & Utilization”, “Family History”, and “Personal Medical History”), which were made available 90 days after enrollment. This study will also give an overview of the electronic health record (EHR) data available and the physical measurements (PM) data obtained at the time of enrollment in the program. We will also report the total number of participants who have any survey response, PM, and EHR data combined, broken down by age, race, sex at birth, and gender identity, and by underrepresented in biomedical research (UBR) groups.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This work is a result of an All of Us Research Program Demonstration Project. The projects are efforts by the Program designed to meet the program's goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. This work was reviewed and overseen by the All of Us Research Program Science Committee and the All of Us Data and Research Center to ensure compliance with program policy, including acceptable data access and use.)

Scientific Approaches

In this study, we will apply data visualization libraries to aggregate information about the cohort. We will measure age as of the date the CDR was generated. A data type (survey, PM, or EHR) is counted as present if at least one observation exists in that category. We will use "The Basics" survey to select race and ethnicity, and responses will be mapped to the race variable in the OMOP Person table. All participants responding ‘American Indian or Alaska Native’ will be removed from the CDR as All of Us engages the NIH Tribal Council on the research use of data. Program designations of UBR status will be adapted to the data available in the CDR.
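
The counting rule above (a data type is present if a participant has at least one observation in it) could be sketched as follows. This is only an illustration under stated assumptions: the records are hypothetical toy tuples, not All of Us data, the function names are invented for this example, and "combined" is read here as having all three data types.

```python
from collections import defaultdict

# Hypothetical toy records (person_id, data_type); not real All of Us data.
OBSERVATIONS = [
    (1, "survey"), (1, "EHR"),
    (2, "survey"),
    (3, "survey"), (3, "PM"), (3, "EHR"), (3, "EHR"),
    (4, "PM"),
]

def participants_by_data_type(observations):
    """Map each data type to the set of participants with at least one observation."""
    people = defaultdict(set)
    for person_id, data_type in observations:
        people[data_type].add(person_id)
    return dict(people)

def participants_with_all(observations, required_types):
    """Return participants who have at least one observation of every required type."""
    by_type = participants_by_data_type(observations)
    groups = [by_type.get(t, set()) for t in required_types]
    return set.intersection(*groups) if groups else set()

by_type = participants_by_data_type(OBSERVATIONS)
# Repeated observations (participant 3 has two EHR rows) count only once.
counts = {data_type: len(ids) for data_type, ids in by_type.items()}
all_three = participants_with_all(OBSERVATIONS, ["survey", "PM", "EHR"])

print(counts)
print(sorted(all_three))
```

Using sets of participant IDs keeps the "at least one observation" rule explicit and makes the combined count a plain set intersection.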

Anticipated Findings

In this study, we anticipate creating plots that describe the participant breakdown by age, race, ethnicity, gender, sex at birth, and data type. We will use these plots in our All of Us Research Program Demonstration Projects publication as visuals describing the initial cohort released at the Beta launch of the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Francis Ratsimbazafy - Other, All of Us Program Operational Use
  • Katie Roster - Graduate Trainee, New York Medical College School of Health Sciences and Practice
  • Jun Qian - Other, All of Us Program Operational Use
  • Eric Song - Administrator, All of Us Program Operational Use

Duplicate of R2020Q4R3 - How to Get Started with Registered Tier Data

Scientific Questions Being Studied

This will be an exploratory process to assess whether the available data are sufficient to answer research questions about the utility of pharmacogenomics in modifying chronic comorbidities in individuals 65 years of age and older.

Project Purpose(s)

  • Disease Focused Research (pharmacogenomics of aging)
  • Population Health
  • Educational
  • Methods Development
  • Ancestry

Scientific Approaches

This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:

1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with major data types: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users that want to do their own cleaning.
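
To give a flavor of the "Example Queries" step, here is a minimal sketch of a demographic SQL query. It is not taken from the tutorial notebooks. An in-memory SQLite database stands in for the CDR, which is an assumption made only so the example runs anywhere. The table and column names follow the OMOP common data model (person, concept, gender_concept_id), and the rows are toy data.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the CDR; the real CDR is accessed
# differently, and these rows are toy data, not participant records.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        gender_concept_id INTEGER,
        year_of_birth INTEGER
    );
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        concept_name TEXT
    );
    INSERT INTO concept VALUES (8507, 'MALE'), (8532, 'FEMALE');
    INSERT INTO person VALUES (1, 8532, 1950), (2, 8507, 1980), (3, 8532, 1992);
    """
)

# Join person to concept so the numeric concept_id becomes a readable name,
# then count unique participants per category.
QUERY = """
SELECT c.concept_name AS gender,
       COUNT(DISTINCT p.person_id) AS n_participants
FROM person AS p
JOIN concept AS c ON c.concept_id = p.gender_concept_id
GROUP BY c.concept_name
ORDER BY c.concept_name
"""

rows = conn.execute(QUERY).fetchall()
for gender, n_participants in rows:
    print(gender, n_participants)
conn.close()
```

The join from a *_concept_id column to the concept table is the general pattern: the same shape of query applies to other coded fields in the common data model.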

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, you will understand the following:

All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.

Demographic Categories of Interest

  • Age
  • Geography

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of How to Get Started with Registered Tier Data (tier 5)

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect? This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Record (EHR), Physical Measurements (PM), and Survey data.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:

1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with major data types: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users that want to do their own cleaning.

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, you will understand the following:

All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of Cancer

Scientific Questions Being Studied

We intend to explore how the prevalence of cancer differs across subsets of the AoU population. In particular, we will compare the entire population, the subset with medical records, and the subset with self-reported data.

Project Purpose(s)

  • Population Health

Scientific Approaches

We intend to select a list of SNOMED codes corresponding to primary cancers to identify the subset with cancer in the medical record.

We intend to select the survey question asking about self-reported cancer to identify the subset with self-reported cancer.
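
The comparison described above comes down to computing prevalence with different case definitions and different denominators. A minimal sketch, assuming hypothetical participant ID sets (none of these values come from All of Us) and an illustrative helper named prevalence:

```python
def prevalence(cases, population):
    """Share of `population` members that are also in `cases`."""
    if not population:
        raise ValueError("population must not be empty")
    return len(cases & population) / len(population)

# Hypothetical participant IDs; not real data.
everyone = set(range(1, 11))            # 10 participants in total
has_ehr = {1, 2, 3, 4, 5, 6}            # subset with medical records
has_survey = {3, 4, 5, 6, 7, 8, 9, 10}  # subset with self-reported data
ehr_cancer = {2, 3}                     # primary-cancer SNOMED code in the record
self_reported_cancer = {3, 7, 9}        # answered yes to the survey question

ehr_prev = prevalence(ehr_cancer, has_ehr)
survey_prev = prevalence(self_reported_cancer, has_survey)
overall_prev = prevalence(ehr_cancer | self_reported_cancer, everyone)

print(f"EHR: {ehr_prev:.3f}, survey: {survey_prev:.3f}, overall: {overall_prev:.3f}")
```

Keeping each denominator tied to the subset that could have produced the data (EHR cases over participants with EHR, and so on) is what makes the three figures comparable.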

Anticipated Findings

We expect cancer prevalence to differ between self-report and the medical record, which could have implications for how cancer is measured at the population level.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use

Demo - Hypertension Prevalence

Scientific Questions Being Studied

We are using the All of Us Researcher Workbench interface to answer the question, "Is hypertension prevalence in the All of Us Research Program similar to hypertension prevalence in the 2015–2016 National Health and Nutrition Examination Survey (NHANES)?". Clinical approaches to understanding and treating hypertension may benefit from a precision medicine approach that integrates data on environments, social determinants of health, behaviors, and genomic factors that contribute to hypertension risk. Hypertension is a major public health concern and remains a leading risk factor for stroke and cardiovascular disease.

Project Purpose(s)

  • Other Purpose (This work is an AoU demo project. Demo projects are efforts by the AoU Research Program designed to meet the program goal of ensuring the quality and utility of the Research Hub as a resource for accelerating discovery in science and medicine. As an approved demo project, this work was reviewed and overseen by the AoU Research Program Science Committee and the AoU Data and Research Center to ensure compliance with program policy, including policies for acceptable data access and use.)

Scientific Approaches

In this cross-sectional, population-based study, we used All of Us baseline data from participant (age > 18) provided information (PPI) surveys and electronic health record (EHR) blood pressure measurements, and retrospectively examined the prevalence of hypertension in the EHR cohort using Systematized Nomenclature of Medicine (SNOMED) codes and blood pressure medications recorded in the EHR. We used the EHR data (SNOMED codes on 2 distinct dates and at least one hypertension medication) as the primary definition, and then added subjects with elevated systolic or elevated diastolic blood pressure on measurements 2 and 3 from PPI. We extracted the dates of each participant’s SNOMED codes for essential hypertension from the Researcher Workbench table ‘cb_search_all_events’. We calculated an age-standardized hypertension (HTN) prevalence according to the age distribution of the U.S. Census, using 3 groups (18-39, 40-59, ≥ 60).
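
The two computational steps described above, the primary EHR case definition and age standardization, could be sketched roughly as follows. This is an illustration, not the project's code: the function names are invented, the counts are toy values, the weights are placeholders rather than actual U.S. Census shares, and direct standardization is assumed as the method since the description does not name one.

```python
from datetime import date

AGE_GROUPS = ("18-39", "40-59", "60+")

def meets_ehr_definition(code_dates, n_htn_medications):
    """Primary definition: SNOMED codes on >= 2 distinct dates and >= 1 medication."""
    return len(set(code_dates)) >= 2 and n_htn_medications >= 1

def age_standardized_prevalence(cases, totals, standard_weights):
    """Direct standardization: weight each group's crude prevalence by its
    share of the standard population (weights are normalized to sum to 1)."""
    weight_sum = sum(standard_weights[g] for g in AGE_GROUPS)
    return sum(
        standard_weights[g] / weight_sum * cases[g] / totals[g]
        for g in AGE_GROUPS
    )

# Toy counts and placeholder weights; not All of Us results or Census values.
cases = {"18-39": 10, "40-59": 30, "60+": 60}
totals = {"18-39": 100, "40-59": 100, "60+": 100}
weights = {"18-39": 0.5, "40-59": 0.3, "60+": 0.2}

crude = sum(cases.values()) / sum(totals.values())
standardized = age_standardized_prevalence(cases, totals, weights)

same_day = meets_ehr_definition([date(2019, 1, 1), date(2019, 1, 1)], 1)
two_days = meets_ehr_definition([date(2019, 1, 1), date(2019, 6, 1)], 1)

print(f"crude={crude:.3f} standardized={standardized:.3f}")
print(same_day, two_days)
```

In the toy data the standardized figure is lower than the crude one because the placeholder standard population is younger than the sample. Removing that kind of age-mix difference is the purpose of standardizing before comparing two cohorts.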

Anticipated Findings

The prevalence of hypertension in the All of Us cohort is similar to that reported in the published literature. All of Us age-adjusted HTN prevalence was 27.9% compared to 29.6% in the National Health and Nutrition Examination Survey. The All of Us cohort is a growing source of diverse longitudinal data that can be used to study hypertension nationwide. The prevalence of hypertension varies in the United States (U.S.) by age, sex, and socioeconomic status. Hypertension can often be treated successfully with medication, and prevented or delayed with lifestyle modifications. Even with these established hypertension intervention and prevention strategies, the prevalence of hypertension continues to be at levels of public health concern. The diversity within All of Us may provide insight into factors relevant to hypertension prevention and treatments in a variety of social and geographic contexts and population strata in the U.S.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:
