Project Ideas

Below are some project ideas and resources that we hope you will find helpful for your final projects. The list covers only a small subset of possible directions and is meant as a starting point for brainstorming in your groups. You may choose any original research question related to the course content. We will keep updating this page throughout the semester.

Each topic below lists potential questions, followed by some resources (papers, benchmarks).
Causal structure discovery
  • How can the causal structure of a data-generating process be reconstructed from observational and interventional data?

  • Implement a benchmark to compare existing causal discovery algorithms (constraint-based, score-based, etc.) on various synthetic (linear/non-linear, Gaussian noise, hidden confounders, etc.) and real-world datasets with known causal graphs.

  • How can existing causal discovery algorithms be extended to heterogeneous populations (i.e., sub-populations with different causal structures) or to time-series settings?

  • How can causal graphs be extracted from language models?
  • Papers: link1, link2, link3

    Datasets/Benchmarks:
    - Database with cause-effect pairs
    - Sachs proteins
    - ADNI dataset with known graph
    Some useful packages: link1, link2, link3, link4
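For the benchmarking idea above, one common way to score an estimated graph against a known causal graph is the structural Hamming distance (SHD). The following is a minimal sketch, assuming both graphs are DAGs given as binary adjacency matrices in which entry (i, j) = 1 means an edge i -> j. The function name and the convention that a reversed edge counts as one error are our assumptions; some SHD variants count a reversal as two errors.

```python
import numpy as np


def structural_hamming_distance(true_adj, est_adj):
    """Count missing, extra, and reversed edges between two DAGs.

    Both inputs are square binary adjacency matrices where
    adj[i, j] == 1 means an edge i -> j. A reversed edge counts once
    (an assumption; other conventions count it twice).
    """
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    if true_adj.shape != est_adj.shape or true_adj.shape[0] != true_adj.shape[1]:
        raise ValueError("adjacency matrices must be square and of equal shape")

    # Entry-wise disagreement between the two graphs.
    diff = true_adj != est_adj
    # A node pair {i, j} is wrong if either direction disagrees; symmetrizing
    # makes a reversal (two disagreeing entries) count as a single error.
    wrong_pairs = diff | diff.T
    # Count each unordered pair once, ignoring the diagonal.
    return int(np.triu(wrong_pairs, k=1).sum())


# Hypothetical example: true graph 0 -> 1 -> 2; the estimate reverses 0 -> 1
# and adds a spurious edge 0 -> 2.
true_graph = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
estimated = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]
print(structural_hamming_distance(true_graph, estimated))  # → 2
```

A benchmark would typically report SHD alongside other metrics (e.g., precision and recall of edges), since SHD alone does not distinguish between kinds of errors.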
Subgroup detection in heterogeneous populations
  • How can sub-populations with similar treatment effects be identified in a heterogeneous population using data from randomized clinical trials (RCTs), observational studies, or both, including dynamic longitudinal settings?
  • link1, link2, link3
Causal estimation
  • Create a benchmark to compare existing methods for estimating average treatment effect (ATE) and/or conditional ATE on a set of observational datasets. Evaluate each method in different settings, including continuous treatments, high-dimensional covariates, small/infinite sample sizes, access to experimental data, and longitudinal datasets.

  • How can causal inference be used for drug repurposing?
  • Papers: link1, link2, link3

    Datasets/Benchmarks:
    - IWPC dataset (continuous treatment)
    - BioLINCC clinical trials
    - Student teacher achievement ratio dataset
    - Other useful datasets: link1, link2
    Some useful packages: link1, link2, link3, link4
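To make the ATE benchmarking idea concrete, here is a minimal sketch on synthetic data where the true ATE is known. It assumes a single observed confounder, a binary treatment, and a linear outcome model; the data-generating process, function names, and parameter values are all illustrative assumptions, not a recommended benchmark design. It contrasts a naive difference in means with a simple regression-adjusted estimate.

```python
import numpy as np


def simulate(n, true_ate=2.0, seed=0):
    """Draw (x, t, y) from a hypothetical confounded linear model."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                # observed confounder
    propensity = 1.0 / (1.0 + np.exp(-x))  # treatment is likelier for large x
    t = rng.binomial(1, propensity)        # binary treatment
    y = true_ate * t + 3.0 * x + rng.normal(size=n)
    return x, t, y


def naive_ate(t, y):
    """Difference in mean outcomes; biased when x affects both t and y."""
    return y[t == 1].mean() - y[t == 0].mean()


def regression_adjusted_ate(x, t, y):
    """Coefficient on t in an OLS fit of y on [1, t, x].

    This recovers the ATE here only because the simulated outcome model is
    linear with a constant treatment effect.
    """
    design = np.column_stack([np.ones_like(x), t, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


x, t, y = simulate(20_000)
print("naive estimate:   ", naive_ate(t, y))
print("adjusted estimate:", regression_adjusted_ate(x, t, y))
```

In this setup the naive estimate is pulled away from the true value of 2.0 by confounding, while the adjusted estimate should land close to it. A full benchmark would replace the toy estimators with the methods under comparison and vary the settings listed above.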
Partial identification and sensitivity analysis
  • Hidden confounding often makes causal estimands non-identifiable, i.e., multiple values of the estimand are consistent with the observed data. Researchers often make parametric assumptions (e.g., linearity of the structural causal model, SCM) or assume specific graph types (e.g., instrumental variable graphs) to obtain informative bounds on the causal estimand. What other (parametric) assumptions can lead to similar results?

  • Most existing methods for causal estimation rely heavily on knowledge of the causal graph. How can their robustness be improved if the causal graph is known only approximately (e.g., up to its Markov equivalence class)?
  • link1, link2, link3
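As a reference point for the partial identification questions above, the classical no-assumptions (Manski-style) bounds show what can be said about the ATE when only a bounded outcome range is assumed. The sketch below is illustrative: the function name and the default outcome range [0, 1] are our assumptions, and it requires at least one treated and one untreated unit. For each arm, the unobserved potential outcomes are replaced by the worst and best possible values.

```python
import numpy as np


def manski_ate_bounds(t, y, y_min=0.0, y_max=1.0):
    """No-assumptions bounds on the ATE for an outcome known to lie in
    [y_min, y_max]. Returns (lower, upper)."""
    t = np.asarray(t).astype(bool)
    y = np.asarray(y, dtype=float)

    p1 = t.mean()        # P(T = 1)
    p0 = 1.0 - p1        # P(T = 0)
    m1 = y[t].mean()     # E[Y | T = 1]
    m0 = y[~t].mean()    # E[Y | T = 0]

    # E[Y(1)]: observed for the treated, unknown (in [y_min, y_max]) otherwise.
    lo1, hi1 = m1 * p1 + y_min * p0, m1 * p1 + y_max * p0
    # E[Y(0)]: observed for the untreated, unknown otherwise.
    lo0, hi0 = m0 * p0 + y_min * p1, m0 * p0 + y_max * p1

    return lo1 - hi0, hi1 - lo0


# Hypothetical toy data with a binary outcome.
lower, upper = manski_ate_bounds(t=[1, 1, 0, 0], y=[1, 1, 0, 1])
print(lower, upper)  # → -0.25 0.75
```

The interval always has width y_max - y_min, so without further assumptions it can never exclude a zero effect. Additional assumptions (such as those mentioned above) are what narrow it.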
Causal reinforcement learning
  • How can we develop reinforcement learning (RL) algorithms that explore the (latent) causal structure of the environment?

  • How can prior knowledge be modeled as a (hierarchical) causal world model to make exploration more efficient in new RL tasks?
  • Papers/tutorials: link1, link2

    Datasets/Benchmarks:
    - CausalWorld
    - CausalCity
Other topics
  • An overview of how the human brain encodes causal information

  • How can causal relationships between high-level macro variables be learned from micro-level measurements?

  • One approach to imposing fairness in predictive models is to assess the causal links between sensitive features and the outcome and adjust for them. How can “optimal” fair predictive models be defined using the language of causality?
  • link

  • link


  • link1, link2
  • Surveys:
    - Causal machine learning for healthcare and precision medicine
    - Causal Machine Learning: A Survey and Open Problems
    - A Review of Causality for Learning Algorithms in Medical Image Analysis

  • Workshops/Conferences/Tutorials:
    - CLeaR (Causal Learning and Reasoning) 2022
    - Causal Inference & Machine Learning: Why now?
    - Causal Discovery & Causality-Inspired Machine Learning
    - Causality and Deep Learning: Synergies, Challenges and the Future
    - Elements of Reasoning: Objects, Structure, and Causality