To join this seminar virtually: Please request Zoom connection details from headsec [at] stat.ubc.ca.
Time: 11:00am – 11:30am
Speaker: Liam Gilson, UBC Statistics MSc student
Title: Extending an invariant models method to domain generalization settings in ecological data
Abstract: Applied ecology, and forest ecology in particular, is challenged by changing environmental conditions, especially climate change, which often necessitate prediction under unobserved future conditions. This can be framed as a domain generalization problem: a setting in which examples from the test task are not available at training time. Commonly used model selection methods are often inappropriate in this setting and do not emphasize reducing catastrophic error under domain shift. I propose extending a model selection method applicable to a limited domain generalization setting, invariant models, to situations commonly encountered in ecological data, specifically those where linear mixed-effects models might be applied. In these situations, some predictor variables may be unobserved or otherwise unavailable at training time, further complicating the domain generalization setting. The method is investigated in a series of simulations and on an example forest ecology dataset.
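The invariance idea underlying this line of work can be illustrated with a minimal invariant-causal-prediction-style sketch (a generic illustration, not the speaker's method): across training environments, regress the response on each candidate predictor subset and keep only those subsets whose residual distribution stays stable across environments. The simulated data and test choices below are illustrative assumptions.

```python
import numpy as np
from itertools import chain, combinations
from scipy import stats

rng = np.random.default_rng(0)

def simulate(mu1, beta2, n=1000):
    # Hypothetical structural model: X1 causes Y; X2 is a child of Y whose
    # mechanism (beta2) shifts across environments, making it spurious.
    x1 = rng.normal(mu1, 1, n)
    y = 2.0 * x1 + rng.normal(0, 1, n)
    x2 = beta2 * y + rng.normal(0, 1, n)
    return np.column_stack([x1, x2]), y

envs = [simulate(0.0, 0.5), simulate(2.0, 1.5)]  # two training environments

def residuals(X, y, subset):
    # OLS residuals using only the predictors in `subset` (empty = intercept only)
    if not subset:
        return y - y.mean()
    A = np.column_stack([np.ones(len(y)), X[:, subset]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta

def is_invariant(subset, alpha=0.05):
    # Pool environments, fit once, then test whether the residual mean and
    # variance differ between environments (Welch t-test and Levene test).
    X = np.vstack([e[0] for e in envs])
    y = np.concatenate([e[1] for e in envs])
    res = residuals(X, y, subset)
    groups, start = [], 0
    for _, ye in envs:
        groups.append(res[start:start + len(ye)])
        start += len(ye)
    p_mean = stats.ttest_ind(*groups, equal_var=False).pvalue
    p_var = stats.levene(*groups).pvalue
    return min(p_mean, p_var) > alpha

subsets = list(chain.from_iterable(combinations(range(2), k) for k in range(3)))
accepted = [s for s in subsets if is_invariant(list(s))]
print(accepted)  # typically only subsets containing the causal predictor X1 survive
```

The spurious predictor X2 is rejected because its pooled regression residuals shift with the environment, while the causal predictor X1 yields residuals with the same distribution everywhere; the talk's extension concerns what happens when some such predictors are unobserved at training time.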
Time: 11:30am – 12:00pm
Speaker: Giuliano Netto Flores Cruz, UBC Bioinformatics MSc student
Title: Evaluating omics-based tests with Bayesian Decision Curve Analysis
Abstract: Omics-based tests (OBTs) combine high-dimensional omics features into clinical prediction models that predict diagnosis, prognosis, or treatment effects. Past instances of premature implementation of OBTs in clinical trials have demonstrated the need for increased rigour in their clinical evaluation. However, their performance assessment is often limited to classification metrics such as sensitivity and specificity, with little regard for formal analysis of clinical decision-making. Decision curve analysis (DCA) complements classification metrics by combining classical assessment of predictive performance with the consequences of using a test or model to guide clinical decisions. In DCA, the best clinical decision strategy, such as diagnosing or treating based on an OBT, is the one that maximizes net benefit: the net number of true positives (or true negatives) provided by a given strategy. Before reaching real patients, we must be sufficiently confident that new OBTs actually provide superior clinical decision strategies compared to default, standard-of-care strategies. Trained on hundreds to thousands of features, OBTs are particularly prone to chance results. In this context, the present work develops parametric Bayesian approaches to DCA that allow uncertainty quantification around four fundamental questions when evaluating OBT-guided clinical decision strategies: (i) which strategies are clinically useful, (ii) what is the best available strategy, (iii) how strategies compare in direct pairwise comparisons, and (iv) what are the consequences of the current level of uncertainty. We evaluate the methods using simulation studies and present a comprehensive case study. We also provide an application to a recently developed OBT for multi-cancer early detection. A software implementation of the method is freely available in the bayesDCA R package.
Ultimately, the Bayesian DCA workflow may help clinicians and health policymakers make better-informed decisions when choosing and implementing clinical decision strategies based on OBTs.
arXiv preprint: https://arxiv.org/abs/2308.02067/
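For readers unfamiliar with decision curve analysis, the net benefit quantity the abstract refers to can be sketched as follows. This is the standard frequentist point estimate of net benefit at a probability threshold t (net benefit = TP/N − FP/N · t/(1−t)), with simulated illustrative data; it is not the Bayesian posterior computation from the talk, nor the bayesDCA implementation.

```python
import numpy as np

def net_benefit(y_true, risk, threshold):
    # Net benefit of "treat if predicted risk >= threshold":
    #   TP/N - FP/N * threshold / (1 - threshold)
    y_true = np.asarray(y_true)
    decide = np.asarray(risk) >= threshold
    n = len(y_true)
    tp = np.sum(decide & (y_true == 1)) / n
    fp = np.sum(decide & (y_true == 0)) / n
    return tp - fp * threshold / (1 - threshold)

# Hypothetical cohort: 30% outcome prevalence, with risk scores from an
# informative but imperfect model (all numbers are illustrative)
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.3, size=1000)
risk = np.clip(0.3 + 0.4 * (y - 0.3) + rng.normal(0, 0.15, size=1000), 0.01, 0.99)

for t in (0.1, 0.2, 0.3):
    nb_model = net_benefit(y, risk, t)
    nb_all = net_benefit(y, np.ones_like(risk), t)  # default: treat everyone
    print(f"t={t}: model={nb_model:.3f}, treat-all={nb_all:.3f}, treat-none=0.000")
```

A model-based strategy is clinically useful at threshold t only if its net benefit exceeds both treat-all and treat-none; the talk's contribution is quantifying the uncertainty around such comparisons before an OBT reaches patients.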