Sergio Gonzales

Graduate Trainee, Stanford University

2 active projects

Duplicate of How to Get Started with Registered Tier Data (tier 5)


Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect? This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Record (EHR), Physical Measurements (PM), and Survey data.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

This Tutorial Workspace contains two Jupyter Notebooks (one written in Python, the other in R). Each notebook is divided into the following sections:

1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
2. Data Availability Part 1: How to summarize the number of unique participants with each major data type: Physical Measurements, Survey, and EHR.
3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
4. Data Organization: An explanation of how data is organized according to our common data model.
5. Example Queries: How to directly query the CDR, using two examples of SQL queries to extract demographic data.
6. Expert Tip: How to access the base version of the CDR, for users who want to do their own cleaning.
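
As an illustration of the kind of demographic query described in step 5, here is a minimal sketch in Python. It only builds the SQL string and does not run it. The table and column names follow the standard OMOP person and concept tables, and the WORKSPACE_CDR environment variable and the fallback dataset name are assumptions; the actual notebooks may use different names and a different way of executing the query.

```python
import os


def build_demographics_query(dataset: str) -> str:
    """Build a SQL string that pulls basic demographics per participant.

    `dataset` is the fully qualified CDR dataset name (assumed format:
    "project.dataset"). Column names follow the standard OMOP person table.
    """
    return f"""
    SELECT
        p.person_id,
        p.year_of_birth,
        gender.concept_name AS gender,
        race.concept_name AS race,
        ethnicity.concept_name AS ethnicity
    FROM `{dataset}.person` AS p
    LEFT JOIN `{dataset}.concept` AS gender
        ON p.gender_concept_id = gender.concept_id
    LEFT JOIN `{dataset}.concept` AS race
        ON p.race_concept_id = race.concept_id
    LEFT JOIN `{dataset}.concept` AS ethnicity
        ON p.ethnicity_concept_id = ethnicity.concept_id
    """


# Assumption: the workspace exposes the CDR dataset name in WORKSPACE_CDR.
# The fallback value is a placeholder, not a real dataset.
dataset_name = os.environ.get("WORKSPACE_CDR", "my_project.my_dataset")
query = build_demographics_query(dataset_name)
print(query)
```

Inside a workspace, a string like this would typically be passed to whatever query client the notebook sets up in its Setup section.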

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, you will understand the following:

All of Us data are made available in a Curated Data Repository. Participants may contribute any combination of survey, physical measurement, and electronic health record data. Not all participants contribute all possible data types. Each unique piece of health information is given a unique identifier called a concept_id and organized into specific tables according to our common data model. You can use these concept_ids to query the CDR and pull data on specific health information relevant to your analysis. See our support article Learning the Basics of the All of Us Dataset for more info.
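
To make the role of concept_ids concrete, here is a small self-contained sketch that uses an in-memory SQLite database in place of the CDR. The two tables mimic the OMOP concept and measurement tables, but the concept_ids (1001, 1002), names, and values are made-up placeholders rather than real All of Us data, and the helper rows_for_concept is hypothetical.

```python
import sqlite3

# Toy stand-in for the CDR: two OMOP-style tables with placeholder rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        concept_name TEXT
    );
    CREATE TABLE measurement (
        person_id INTEGER,
        measurement_concept_id INTEGER,
        value_as_number REAL
    );
    INSERT INTO concept VALUES (1001, 'Example lab A'), (1002, 'Example lab B');
    INSERT INTO measurement VALUES
        (1, 1001, 13.5), (2, 1001, 14.1), (2, 1002, 0.9), (3, 1002, 1.1);
    """
)


def rows_for_concept(connection, concept_id):
    """Return (person_id, concept_name, value) rows for one concept_id."""
    cursor = connection.execute(
        """
        SELECT m.person_id, c.concept_name, m.value_as_number
        FROM measurement AS m
        JOIN concept AS c ON m.measurement_concept_id = c.concept_id
        WHERE m.measurement_concept_id = ?
        ORDER BY m.person_id
        """,
        (concept_id,),
    )
    return cursor.fetchall()


rows = rows_for_concept(conn, 1001)
print(rows)
```

The same pattern, filtering a domain table on a concept_id and joining to the concept table for a readable name, carries over to queries against the real CDR.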

Demographic Categories of Interest

This study will not center on underrepresented populations.

Research Team

Owner:

sap


Scientific Questions Being Studied

Assigned sex at birth (ASAB) is a socially defined, administrative variable, but it is used in countless studies as a proxy for physiology as it relates to sexual development and its contribution to the etiology of disease. Use of ASAB in this manner is therefore ill-posed, as ASAB is not purely a function of biology. Moreover, it excludes individuals with differences in sexual development (DSD), those receiving gender-affirming treatments that alter physiology (e.g. hormones), and those receiving care that alters sex physiology unrelated to gender identity (e.g. orchiectomy for testicular cancer).

The focus of this work is to create and validate a method for computing a real-valued representation of sex-associated physiology (SAP) that depends only on physiologic measurements. Using the biomarker data and health records, I am planning to train an auto-encoder that maps biomarkers to a single variable. I hypothesize this variable will better explain diseases than ASAB.

Project Purpose(s)

  • Methods Development
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Approaches

The auto-encoder I plan to develop will take as inputs the biomarker measurements that are known to vary because of SAP (e.g. sex hormones and hemoglobin). It will be a multi-task network whose outputs are diseases to which SAP is likely to contribute (e.g. autoimmune and thyroid disorders) but for which it is not a necessary factor in the development of disease, which excludes diseases such as ovarian cancer. I intend to use only All of Us data. I will use methods that create a network architecture without a priori design choices: adaptive parameter sharing and uncertainty-weighted loss functions.
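
For readers unfamiliar with uncertainty-weighted losses, the sketch below shows one commonly used simplified form for combining per-task losses in a multi-task network: each task loss is scaled by exp(-s) and the term s is added as a regularizer, where s is a learnable log-variance for that task. This is an illustrative assumption, not necessarily the formulation the project will use; the function name and example numbers are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using learnable log-variances.

    total = sum_i exp(-s_i) * L_i + s_i

    A large s_i down-weights a noisy task, while the "+ s_i" term keeps the
    model from inflating every s_i to ignore all tasks.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("Need exactly one log-variance per task loss.")
    return sum(
        math.exp(-s) * loss + s for loss, s in zip(task_losses, log_variances)
    )


# Hypothetical per-task losses (e.g. two disease-prediction heads).
losses = [1.0, 2.0]

# With all log-variances at zero, the result is the plain sum of the losses.
print(uncertainty_weighted_loss(losses, [0.0, 0.0]))

# Raising the second task's log-variance reduces that task's weight.
print(uncertainty_weighted_loss(losses, [0.0, math.log(2.0)]))
```

In a real training loop the log-variances would be parameters updated by the optimizer alongside the network weights, so the balance between tasks is learned rather than hand-tuned.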

Once the auto-encoder is trained, I will compute the SAP for All of Us participants and then compare the performance of models that use SAP as an input instead of ASAB, using standard metrics such as AUC-ROC, AUC-PR, F1, and MSE. I will also evaluate how well calibrated the models are across ASAB, gender identity, and access to care.
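
As a sketch of how such a comparison might look for one metric, the code below computes AUC-ROC from scratch as the fraction of (positive, negative) pairs that a model ranks correctly, counting ties as half. The labels and scores are toy values chosen only to exercise the function; they are not results, and model_a and model_b are hypothetical stand-ins for models using different inputs.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("Need at least one positive and one negative label.")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy labels and scores, for illustration only.
labels = [0, 0, 1, 1]
model_a_scores = [0.1, 0.4, 0.35, 0.8]
model_b_scores = [0.1, 0.2, 0.6, 0.9]

print("model_a AUC-ROC:", auc_roc(labels, model_a_scores))
print("model_b AUC-ROC:", auc_roc(labels, model_b_scores))
```

The pairwise loop is quadratic in the number of participants, so for a cohort of realistic size a library implementation based on sorting would be the practical choice.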

Anticipated Findings

I expect to find that choosing SAP as an input over ASAB improves model performance and calibration. I expect SAP will vary with time and with the duration of gender-affirming care. This will lead to more robust estimation of an individual's risk for disease, particularly for gender minorities as their physiology changes. I hope this research shows how complicated human sexual physiology is and that ASAB is reductive and an inappropriate choice in research thereof.

Studies of many diseases (e.g. autoimmune disorders) use ASAB as an important control, but the true role of sex physiology in their etiology is poorly understood. Work on a real-valued representation of sex will highlight the need for research that uses more inclusive notions of sex, as well as research on how the menstrual cycle contributes to human health.

Demographic Categories of Interest

  • Age
  • Sex at Birth
  • Gender Identity
  • Access to Care

Research Team

Owner:
