Introduction

I am interested in almost all problems in computational biology and genomics. I expect students to propose novel statistical approaches that address challenges in the analysis and modelling of high-dimensional, large-volume biological data.

Feel free to contact me (ypp@stat.ubc.ca).

Format

You may organize your report into the following sections.

  • Problem definition (1 page): Extract mathematical/statistical problems from the paper and organize them. What are the input data? What is the expected output?

  • Significance (1 paragraph): Why is this an interesting problem? What can be learned by studying this problem? Why is it exciting for you?

  • Author contribution: How did the author(s) find the solution? What was a novel contribution beyond traditional approaches?

  • Limitations/challenges (1 paragraph): What are the assumptions? Are they realistic? Which technical limitations do the authors acknowledge, and which do they leave unaddressed?

  • Novel idea/methods (1-2 pages): Propose your idea and statistical methods. You could recast the underlying problem in a different formulation. Which related problems or frameworks did the authors not adopt?

  • Results (1-2 pages): Include one figure that sketches your approach. Show tables and figures that clearly demonstrate your methods.

  • Discussion (1 page): Briefly discuss what you have learned and what you would aim to achieve if you were to develop this into a full paper. How would you validate your findings in independent studies, including wet-lab experiments?

Available Papers

  1. Bridgeford, E. W., Powell, M., Kiar, G., Noble, S., Chung, J., Panda, S., Lawrence, R., Xu, T., Milham, M., Caffo, B., & Vogelstein, J. T. (2024). When no answer is better than a wrong answer: a causal perspective on batch effects. In bioRxiv (p. 2021.09.03.458920).

  2. Demirel, I., Alaa, A., Philippakis, A., & Sontag, D. (2024). Prediction-powered Generalization of Causal Inferences. International Conference on Machine Learning, 10385–10408.

  3. Madrigal, A., Lu, T., Soto, L. M., & Najafabadi, H. S. (2024). A unified model for interpretable latent embedding of multi-sample, multi-condition single-cell data. Nature Communications, 15(1), 6573.

  4. Qiu, Y., Sun, J., & Zhou, X.-H. (2023). Unveiling the unobservable: Causal inference on multiple derived outcomes. Journal of the American Statistical Association, 1–12.