Mark Weiner

Mid-career Tenured Researcher, Cornell University

5 active projects

SMILE-PD

Scientific Questions Being Studied

This workspace is part of the NIH-funded SMILE-PD project -- Similarity Matching in Longitudinal Electronic Patient Data. The goal is to enable AoU users to select "control" patients that match "case" patients on demographic and clinical characteristics in a manner that is intuitive and easy to use while also being rigorous, reproducible and scientifically appropriate.

Project Purpose(s)

  • Methods Development

Scientific Approaches

Analogous to the Word2Vec technique used in Natural Language Processing (NLP), this patient-matching algorithm implements a "Patient2Vec" model in which we first define the temporal “context” around each event in the EHR sequence. The “context” around event A is the collection of events happening before and after A within a certain time window in the patient EHR corpus. Deriving effective word representations by incorporating contextual information is a fundamental problem in NLP and has been extensively studied. One recent advance that addresses this problem is the “Word2Vec” technique, which trains a two-layer neural network on a text corpus to map each word into a vector space that encodes the word's contextual correlations. Similarities evaluated in this embedded vector space (usually cosine similarity) reflect contextual associations (e.g., a high similarity between words A and B suggests that they tend to appear in the same context).
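
As a rough illustration of the two ideas described above, the sketch below builds the temporal context of each event in a single patient record and computes cosine similarity between two vectors. It is not the SMILE-PD implementation: the function names, the event codes, the 30-day window, and the tuple-based record layout are all assumptions made for this example.

```python
from datetime import date, timedelta
from math import sqrt


def temporal_contexts(events, window_days):
    """Return (target_code, context_codes) pairs for one patient's record.

    `events` is a list of (date, code) tuples. The context of an event is
    every other event in the same record that falls within `window_days`
    before or after it. (Hypothetical helper, not from the project.)
    """
    ordered = sorted(events)
    window = timedelta(days=window_days)
    pairs = []
    for i, (t_i, code_i) in enumerate(ordered):
        context = [
            code_j
            for j, (t_j, code_j) in enumerate(ordered)
            if j != i and abs(t_j - t_i) <= window
        ]
        pairs.append((code_i, context))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up record with made-up event codes, for illustration only.
record = [
    (date(2020, 1, 1), "diabetes_dx"),
    (date(2020, 1, 5), "metformin_rx"),
    (date(2020, 6, 1), "flu_vaccine"),
]
pairs = temporal_contexts(record, window_days=30)
```

In a Word2Vec-style setup, (target, context) pairs like these would be the training examples for the embedding network, and cosine similarity would then be evaluated on the learned vectors rather than on hand-written ones.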

Anticipated Findings

We expect this work to support all users of AoU data who develop risk assessment models or conduct comparative effectiveness research. In both use cases, analyses need to compare exposed cohorts with statistically similar unexposed cohorts. SMILE-PD will help users create an appropriately matched control cohort.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

V7 PASC Workspace

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Scientific Questions Being Studied

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Project Purpose(s)

  • Educational
  • Ancestry
  • Other Purpose (practice notebook to familiarize with RW)

Scientific Approaches

We will apply algorithms developed by the RECOVER PCORnet Adult Cohort and compare the overlap in cohorts with the set derived through the N3C algorithm.

Anticipated Findings

We expect to find a high degree of concordance between the RECOVER Adult Cohort algorithm and the N3C algorithm, even though the approaches were developed through different machine learning methods on different source patient data sets.
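
One simple way to quantify the overlap between two algorithm-derived cohorts is to compare their sets of participant IDs. The sketch below is only an assumption about how such a comparison could be done; the function name, the participant IDs, and the choice of the Jaccard index as the concordance measure are illustrative and do not come from the workspace.

```python
def cohort_concordance(cohort_a, cohort_b):
    """Summarize the overlap between two cohorts of participant IDs.

    Returns counts of participants flagged by both algorithms, by only one
    of them, and the Jaccard index (intersection size / union size).
    """
    a, b = set(cohort_a), set(cohort_b)
    both = a & b
    union = a | b
    return {
        "both": len(both),
        "only_a": len(a - b),
        "only_b": len(b - a),
        # Two empty cohorts are treated as fully concordant.
        "jaccard": len(both) / len(union) if union else 1.0,
    }


# Made-up participant IDs, for illustration only.
recover_ids = {101, 102, 103, 104}
n3c_ids = {103, 104, 105}
result = cohort_concordance(recover_ids, n3c_ids)
```

A Jaccard index near 1 would indicate that the two algorithms select nearly the same participants; chance-corrected measures such as Cohen's kappa would additionally require the size of the full eligible population.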

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Lina Sulieman - Other, All of Us Program Operational Use

AOU_Recover_Long_Covid_v6

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Scientific Questions Being Studied

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, which was developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

To achieve this objective, data science workflows were used to apply ML algorithms on the Researcher Workbench. This effort expanded the number of participants used to evaluate the ML models that identify risk of PASC/Long COVID; it also served to validate the efforts of one team and to provide insight to other teams. These models were implemented within the All of Us Controlled Tier data (C2022Q2R2), which was last refreshed on June 22, 2022. We intend to provide a step-by-step guide for the implementation of N3C's ML model for identification of the PASC/Long COVID phenotype in the All of Us dataset.

Anticipated Findings

We intend to provide a step-by-step guide for the implementation of N3C's ML model for identification of the PASC/Long COVID phenotype in the All of Us dataset.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • WeiQi Wei - Other, All of Us Program Operational Use
  • Vern Kerchberger - Early Career Tenure-track Researcher, Vanderbilt University Medical Center
  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Mark Weiner - Mid-career Tenured Researcher, Cornell University
  • Hiral Master - Project Personnel, All of Us Program Operational Use
  • Gabriel Anaya - Administrator, National Heart, Lung, and Blood Institute (NIH - NHLBI)
  • David Mohs - Other, All of Us Program Operational Use
  • Christopher Lord - Project Personnel, All of Us Program Operational Use
  • Chenchal Subraveti - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Jun Qian - Other, All of Us Program Operational Use
  • Chris Lunt - Other, All of Us Program Operational Use

Duplicate of Duplicate of Phenotype - Ischemic Heart Disease (v6)

The Notebooks in this workspace can be used to implement well-known phenotype algorithms in one’s own research.

Scientific Questions Being Studied

The Notebooks in this workspace can be used to implement well-known phenotype algorithms in one’s own research.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Phenotype Library Workspace created by the Researcher Workbench Support team. It is meant to demonstrate the implementation of key phenotype algorithms within the All of Us Research Program cohort.)

Scientific Approaches

Not Applicable

Anticipated Findings

By reading and running the Notebooks in this Phenotype Library Workspace, researchers can implement the following phenotype algorithms:

Christianne L. Roumie, Jana Shirey-Rice, Sunil Kripalani. Vanderbilt University. MidSouth CDRN - Coronary Heart Disease Algorithm. PheKB; 2014. Available from https://phekb.org/phenotype/234

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Mark Weiner - Mid-career Tenured Researcher, Cornell University

Practice Notebook to Explore AoU dataset

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Scientific Questions Being Studied

This project will explore the scope of patients with COVID-19 and the characteristics of patients with PASC.

Project Purpose(s)

  • Educational
  • Other Purpose (practice notebook to familiarize with RW)

Scientific Approaches

We will apply algorithms developed by the RECOVER PCORnet Adult Cohort and compare the overlap in cohorts with the set derived through the N3C algorithm.

Anticipated Findings

We expect to find a high degree of concordance between the RECOVER Adult Cohort algorithm and the N3C algorithm, even though the approaches were developed through different machine learning methods on different source patient data sets.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Srushti Gangireddy - Project Personnel, Vanderbilt University Medical Center
  • Mark Weiner - Mid-career Tenured Researcher, Cornell University
  • Hiral Master - Project Personnel, All of Us Program Operational Use

Collaborators:

  • Aashri Aggarwal - Undergraduate Student, Cornell University