Recent advances in large-scale data integration and meta-analysis
Category: International Statistical Institute
1. Presenter: Yingying Wei (Prof., Chinese University of Hong Kong)
Title: Integrating multiple single-cell RNA-seq datasets for differential inference
When performing joint cell type clustering by integrating multiple single-cell RNA-seq (scRNA-seq) datasets, analysts typically ignore the treatment or biological conditions of the cells. Here, we propose a Bayesian hierarchical model that rigorously quantifies treatment effects on both cellular compositions and cell-type-specific gene expression levels for scRNA-seq data, together with an algorithm that handles the large number of cells. Application of the proposed method to pancreatic scRNA-seq datasets demonstrates that accounting for biological conditions further boosts clustering accuracy and identifies cell-type-specific and condition-specific differentially expressed genes.
2. Presenter: George Tseng (Prof., University of Pittsburgh)
Title: On p-value combination of independent and frequent signals: asymptotic efficiency and Fisher ensemble
Here we revisit a classical scenario of p-value combination: combining a small number of p-values while the sample size generating each p-value goes to infinity. We evaluate many traditional and recently developed modified Fisher's methods to investigate their asymptotic efficiencies and finite-sample performance, and conclude that the Fisher and adaptively weighted Fisher methods have top performance and complementary advantages across different proportions of signals. We then propose a Fisher ensemble method that combines these two Fisher-related methods using the harmonic mean ensemble approach, and show that it achieves asymptotic Bahadur optimality and integrates the strengths of both methods in simulations. We subsequently extend the Fisher ensemble to concordant effect size directions. A transcriptomics meta-analysis application confirms the theoretical and simulation conclusions.
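For readers unfamiliar with the baseline being revisited, the following is a minimal sketch of the classical Fisher combination only. It is not the adaptively weighted Fisher method or the Fisher ensemble described in the abstract. It assumes independent p-values, under which the statistic -2 * sum(log p_i) follows a chi-square distribution with 2k degrees of freedom under the null. The function name `fisher_combine` is hypothetical and chosen for illustration.

```python
import math


def fisher_combine(pvalues):
    """Combine independent p-values with the classical Fisher method.

    Returns (statistic, combined_p), where statistic = -2 * sum(log p_i)
    is referred to a chi-square distribution with 2k degrees of freedom.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    statistic = -2.0 * sum(math.log(p) for p in pvalues)

    # For even degrees of freedom 2k, the chi-square survival function has
    # the closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    half = statistic / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    combined_p = math.exp(-half) * total
    return statistic, combined_p


if __name__ == "__main__":
    stat, p = fisher_combine([0.05, 0.05])
    print(f"statistic = {stat:.4f}, combined p = {p:.4f}")
```

With a single input the combined p-value equals the input itself, which is a quick sanity check. Two p-values of 0.05 combine to roughly 0.0175.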
3. Presenter: Ming-Chieh Shih (Prof., National Dong-Hua University)
Title: Validation of observational data evidence for treatment effects with randomized clinical trials
Randomized clinical trials provide unbiased treatment effect estimates by design; however, their inclusion criteria are often restrictive. Therefore, for certain target populations, one must turn to observational studies to infer treatment effects, at the risk of bias in the observational data. Here we propose a test, based on a maximum moment restrictions approach, that uses randomized clinical trial data to validate conditional average treatment effect estimates from observational studies. We show that this test has asymptotic power of one and demonstrate its properties using real-world data from the Women's Health Initiative.
4. Presenter: Zhonghua Liu (Prof., University of Hong Kong)
Title: Mendelian Randomization Mixed-Scale Treatment Effect Robust Identification and Estimation for Causal Inference
Standard Mendelian randomization analysis can produce biased results if the genetic variant defining an instrumental variable (IV) is confounded and/or has a pleiotropic effect on the outcome not mediated by the treatment variable. We propose a novel approach, called Mendelian Randomization Mixed-Scale Treatment Effect Robust Identification (MR MiSTERI), which gives conditions under which a causal effect can be identified even with an invalid IV and is advantageous in the presence of pervasive heterogeneity of pleiotropic effects on the additive scale. To incorporate multiple, possibly correlated and weak invalid IVs, we develop a MAny Weak Invalid Instruments (MR MaWII MiSTERI) approach for strengthened identification and improved estimation accuracy. Results from simulation studies and data analysis demonstrate the robustness of the proposed methods.