64th ISI World Statistics Congress - Ottawa, Canada

IPS 316 - Design and Analysis of Experiments for Data Science

Category: IPS
Wednesday 19 July, 2 p.m. - 3:40 p.m. (Canada/Eastern), Room 106

Data science faces great challenges in data size, heterogeneity, and quality. The amount and quality of information extracted from data are often driven by how the data were collected and analyzed according to the type of experiment. Thus, the design and analysis of experiments for data science is key to achieving both analysis efficiency and computational efficiency. The purpose of this invited session is to bring together five statistics experts working in this area to showcase how experimental design and analysis play an important role in data science. We have a special issue on “Design and Analysis of Experiments for Data Science” at the New England Journal of Statistics in Data Science (NEJSDS). Detailed information can be found at https://journal.nestat.org/news/Design_and_Analysis. The four speakers were selected from those who have agreed to submit their work to this special issue. The discussant is one of the co-editors of this special issue.

The five participants are Drs. John Stufken from George Mason University (USA), Luc Pronzato from Laboratoire I3S - Sophia Antipolis (France), C. Devon Lin from Queen’s University (Canada), Lulu Kang from Illinois Institute of Technology (USA), and Simon Mak from Duke University (USA). The five participants also represent gender diversity: Drs. Lulu Kang and C. Devon Lin are female, while the other three are male.

Experimental design and analysis apply broadly and offer significant advantages for obtaining attractive inferential and computational properties. For example, as extraordinary amounts of data are being produced in many branches of science, proven statistical methods are no longer applicable to extraordinarily large datasets because of computational limitations. A critical step in big data analysis is data reduction, which is an experimental design problem. Many newly developed methodologies in this field have important applications in data science. This session aims to cover some representative work on relevant practical problems, such as multi-stage multi-fidelity Gaussian process models for computer experiments, subdata selection for data reduction, variational inference for computational efficiency, and optimal designs for nonlinear models in data science.
 
We hope this session will help facilitate cross-fertilization between experimental design and analysis and data science. Beyond the design of experiments and statistical learning communities, this session is expected to attract significant attention from audiences in the broad fields of statistics and computer science. The tentative titles for the talks are as follows:
 
1. Dr. John Stufken, Professor, Department of Statistics, School of Computing of the College of Engineering and Computing, George Mason University, USA
    Title: Subdata Selection from Big Data with a Large Number of Variables
 
2. Dr. Simon Mak, Assistant Professor, Department of Statistical Science, Duke University, USA
    Title: Design and Analysis of Multi-stage Multi-fidelity Computer Experiments
 
3. Dr. Lulu Kang, Associate Professor, Department of Applied Mathematics, Illinois Institute of Technology, USA
    Title: Energetic Variational Inference with Non-Local Interaction
 
4. Dr. Luc Pronzato, DR CNRS, Laboratoire I3S - Sophia Antipolis, France
    Title: Optimal designs for nonlinear model in data science
 
Session Format: Chair, 4 speakers, and 1 discussant

 

Organiser: Prof. Chunfang Devon Lin 

Chair: Prof. Chunfang Devon Lin 

Speaker: Dr Simon Mak 

Speaker: Dr John Stufken 

Speaker: Ryan Lekivetz 
