Kathryn Whyte

Research Fellow, Columbia University

3 active projects

TOS Abstract

We are interested in assessing disease risk scores in the All of Us (AoU) dataset using survey, physical measurement, EHR, and Fitbit data. We will use unsupervised and supervised machine learning to see whether clusters have the potential to improve on current standards such as ASCVD, DRF and…

Scientific Questions Being Studied

We are interested in assessing disease risk scores in the All of Us (AoU) dataset using survey, physical measurement, EHR, and Fitbit data. We will use unsupervised and supervised machine learning to see whether clusters have the potential to improve on current standards such as ASCVD, DRF, and others.

Project Purpose(s)

  • Methods Development

Scientific Approaches

We will select datasets that resemble the current local NYC population, based on 2019 census data. We will then use unsupervised (k-means) and supervised (k-nearest neighbors, regression) machine learning to determine whether AoU data, analyzed within the framework of the ASCVD and DRF risk factor questionnaires, perform better with different variables of interest.
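The unsupervised step described above can be sketched roughly as follows. This is a minimal illustration only, not the project's actual pipeline: the two features (age, systolic blood pressure), the toy records, the event flags, and the farthest-first initialization are all assumptions made for the example, and real work would standardize features and use a tested library implementation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first_centers(points, k):
    """Deterministic init (an assumption for this sketch): start at the
    first point, then repeatedly add the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means: alternate assignment and mean-update steps."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers

def event_rate_by_cluster(labels, events):
    """Share of participants with an event (1) in each cluster."""
    rates = {}
    for c in sorted(set(labels)):
        flags = [e for lab, e in zip(labels, events) if lab == c]
        rates[c] = sum(flags) / len(flags)
    return rates

# Hypothetical toy records: (age in years, systolic blood pressure in mmHg).
younger = [(30 + i, 110 + 2 * i) for i in range(10)]
older = [(60 + i, 140 + 2 * i) for i in range(10)]
points = younger + older
# Hypothetical binary outcome flags, one per record.
events = [0] * 9 + [1] + [0, 1, 0, 1, 0, 1, 0, 1, 0, 0]

labels, centers = kmeans(points, 2)
rates = event_rate_by_cluster(labels, events)
print(centers)
print(rates)
```

Comparing event rates across clusters is only a first look at whether the clusters carry risk information; the supervised comparison against an existing risk score is not shown here.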

Anticipated Findings

We anticipate that we will be able to replicate or improve risk factor assessments, and that the relevant risk factors may differ across subgroups.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

Duplicate of NOVA

Historically, dietary quality has been assessed via single nutrient categories. Recently, more attention has turned to ultra-processed food (UPF), as defined by the NOVA groups, as a major factor in multiple noncommunicable diseases. Its many facets include calorie-dense and…

Scientific Questions Being Studied

Historically, dietary quality has been assessed via single nutrient categories. Recently, more attention has turned to ultra-processed food (UPF), as defined by the NOVA groups, as a major factor in multiple noncommunicable diseases. Its many facets include calorie-dense and inexpensive convenience items.
However, limited centralized data are available on the impacts and interactions of industrial ingredients on human health. From maternal nutrition throughout the life cycle, this is a greatly under-researched dietary quality lens. This may be due to limited consensus and a lack of collaboration with, or awareness in, non-nutrition fields. Using code recently converted from Stata to Python, I would like to investigate whether any statistically significant patterns are observed when NOVA is applied to All of Us datasets.

Project Purpose(s)

  • Methods Development

Scientific Approaches

Datasets include various populations, descriptive statistics, and dietary assessment data. I developed the NOVA coding for use in Python, with major input from the creators of NOVA.

Anticipated Findings

We expect that examining populations by race, ethnicity, age, geographical location, and comorbidities will reveal phenotypic differences between high and low UPF consumers. The high and low categories will be defined per cohort. Results from the exploratory analyses may inform future precision nutrition interventions and policy related to food industry marketing to children.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:

NOVA

Historically, dietary quality has been assessed via single nutrient categories. Recently, more attention has turned to ultra-processed food (UPF), as defined by the NOVA groups, as a major factor in multiple noncommunicable diseases. Its many facets include calorie-dense and…

Scientific Questions Being Studied

Historically, dietary quality has been assessed via single nutrient categories. Recently, more attention has turned to ultra-processed food (UPF), as defined by the NOVA groups, as a major factor in multiple noncommunicable diseases. Its many facets include calorie-dense and inexpensive convenience items.
However, limited centralized data are available on the impacts and interactions of industrial ingredients on human health. From maternal nutrition throughout the life cycle, this is a greatly under-researched dietary quality lens. This may be due to limited consensus and a lack of collaboration with, or awareness in, non-nutrition fields. Using code recently converted from Stata to Python, I would like to investigate whether any statistically significant patterns are observed when NOVA is applied to All of Us datasets.

Project Purpose(s)

  • Methods Development

Scientific Approaches

Datasets include various populations, descriptive statistics, and dietary assessment data. I developed the NOVA coding for use in Python, with major input from the creators of NOVA.

Anticipated Findings

We expect that examining populations by race, ethnicity, age, geographical location, and comorbidities will reveal phenotypic differences between high and low UPF consumers. The high and low categories will be defined per cohort. Results from the exploratory analyses may inform future precision nutrition interventions and policy related to food industry marketing to children.
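One way a per-cohort high/low definition could work is sketched below. It is an illustration under stated assumptions, not the project's NOVA code: it assumes each food item has already been assigned a NOVA group (1 to 4, with group 4 being ultra-processed), measures UPF intake as the share of energy from group 4, and splits each cohort at its own median. The participant IDs, cohort names, and intake values are hypothetical.

```python
from statistics import median

def upf_energy_share(items):
    """items: list of (nova_group, kcal) pairs for one participant.
    Returns the fraction of total energy that comes from NOVA group 4."""
    total = sum(kcal for _, kcal in items)
    upf = sum(kcal for group, kcal in items if group == 4)
    return upf / total if total else 0.0

def label_high_low(shares):
    """shares: {participant_id: UPF energy share} for ONE cohort.
    Splits at that cohort's median (an assumed cutoff for this sketch)."""
    cutoff = median(shares.values())
    labels = {pid: ("high" if s > cutoff else "low") for pid, s in shares.items()}
    return labels, cutoff

# Hypothetical NOVA-coded intake records, grouped by cohort.
cohorts = {
    "cohort_a": {
        "p1": [(1, 600), (4, 400)],
        "p2": [(1, 900), (3, 100)],
        "p3": [(4, 750), (2, 250)],
        "p4": [(1, 500), (4, 500)],
    },
    "cohort_b": {
        "p5": [(4, 200), (1, 800)],
        "p6": [(4, 100), (1, 900)],
    },
}

results = {}
for name, participants in cohorts.items():
    shares = {pid: upf_energy_share(items) for pid, items in participants.items()}
    results[name] = label_high_low(shares)

for name, (labels, cutoff) in results.items():
    print(name, round(cutoff, 3), labels)
```

Because the cutoff is computed within each cohort, a participant with a 20% UPF share is labeled "high" in cohort_b, while one with a 40% share is labeled "low" in cohort_a. Other cutoffs, such as quartiles, would slot into `label_high_low` the same way.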

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Research Team

Owner:
