
Statistics Seminar

Enhancing the Study of Microbiome-Metabolome Interactions: A Transfer-Learning Approach for Precise Identification of Essential Microbes

Abstract: Recent research has revealed the essential role that microbial metabolites play in host-microbiome interactions. Although statistical and machine-learning methods have been employed to explore microbiome-metabolome interactions in multiview microbiome studies, most of these approaches focus solely on predicting microbial metabolites and therefore offer little biological interpretation. Additionally, existing methods face limitations in either prediction or inference due to small sample sizes and highly correlated microbes and metabolites. To overcome these limitations, we present a transfer-learning method that evaluates microbiome-metabolome interactions. Our approach efficiently utilizes information from comparable metabolites obtained through external databases or data-driven methods, resulting in more precise predictions of microbial metabolites and identification of the essential microbes involved in each microbial metabolite. Our numerical studies demonstrate that our method enables a deeper understanding of the mechanism of host-microbiome interactions and establishes a statistical basis for potential microbiome-based therapies for various human diseases.

 

Location:
MDS 220

Cost of Sequential Adaptation

Abstract: The possibility of early stopping or interim sample size re-estimation leads to random sample sizes. If these interim adaptations are informative, the sample size becomes part of a sufficient statistic. Consequently, statistical inference based solely on the observed sample or the likelihood function does not use all available statistical evidence. In this work, we quantify the loss of statistical evidence using (expected) Fisher information (FI), because observed Fisher information, as a function of the likelihood, does not capture this loss. We decompose the total FI into the sum of the design FI and a conditional-on-design FI. Further, the conditional-on-design FI is represented as a weighted linear combination of FI conditional on realized decisions. This decomposition of total FI yields several practical conclusions for designing sequential experiments. In addition, the FI decomposition is used to derive a sequential version of the Cramér-Rao Lower Bound (CRLB) for estimators' mean squared errors. For a given sequential design, when the data are generated from a one-parameter exponential family with canonical parameterization, the sequential CRLB is attained. Theoretical results are illustrated with a simple normal case of a two-stage design with a possibility of early stopping.
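
For readers who want the decomposition in symbols, one way to write it is sketched below. The notation is an assumption and not taken from the talk: X denotes the observed data, D the realized sequence of interim decisions (for example, the stage at which the trial stops), and θ a scalar parameter. Under the usual regularity conditions, this is the standard chain rule for Fisher information.

```latex
% Assumed notation: X = data, D = realized design decisions, \theta = scalar parameter.
\[
  \underbrace{\mathcal{I}_{(X,D)}(\theta)}_{\text{total FI}}
  \;=\;
  \underbrace{\mathcal{I}_{D}(\theta)}_{\text{design FI}}
  \;+\;
  \underbrace{\mathcal{I}_{X \mid D}(\theta)}_{\text{conditional-on-design FI}},
  \qquad
  \mathcal{I}_{X \mid D}(\theta)
  \;=\;
  \sum_{d} P_{\theta}(D = d)\, \mathcal{I}_{X \mid D = d}(\theta).
\]
```

The weights in the second identity are the probabilities of the realized decisions, which matches the abstract's description of a weighted linear combination.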

 

Link to speaker bio: 

https://www.mcw.edu/departments/biostatistics/people/sergey-tarima-phd&…;

 

Location:
MDS 220

Running Markov chain without Markov basis

 

The methodology of Markov bases initiated by Diaconis and Sturmfels (1998) stimulated active research on Markov bases for more than a decade. It also motivated improvements of algorithms for Gröbner basis computation for toric ideals, such as those implemented in 4ti2. However, at present, explicit forms of Markov bases are known only for some relatively simple models, such as the decomposable models of contingency tables. Furthermore, general algorithms for Markov basis computation often fail to produce Markov bases even for moderate-sized models in a practical amount of time. Hence, so far we have not been able to perform exact tests based on Markov basis methodology for many important practical problems. In this talk we introduce two alternative methods for running a Markov chain instead of using a Markov basis. The first is to use a Markov subbasis for connecting practical fibers. The second is to use a lattice basis, which is a basis of the integer kernel of a design matrix.
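
The second idea, running a chain with lattice-basis moves, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the model is two-way independence, so the fiber is all tables with fixed row and column sums; the target is the hypergeometric distribution; and each proposal is a random integer combination of lattice-basis moves with symmetric (Poisson-difference) coefficients. All function names and the example table are hypothetical, and the speaker's actual algorithm may differ.

```python
import math
import random


def lattice_basis(n_rows, n_cols):
    """Basis of the integer kernel of the row/column-sum map (independence model)."""
    basis = []
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            move = [[0] * n_cols for _ in range(n_rows)]
            move[0][0], move[0][j], move[i][0], move[i][j] = 1, -1, -1, 1
            basis.append(move)
    return basis


def poisson(rng, rate):
    """Knuth's multiplication method; adequate for small rates."""
    threshold, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def log_weight(table):
    """Log of 1 / prod(x_ij!), the hypergeometric target up to a constant."""
    return -sum(math.lgamma(x + 1) for row in table for x in row)


def step(table, basis, rng, rate=0.5):
    """One Metropolis step; the proposal is a random integer combination of basis moves."""
    proposal = [row[:] for row in table]
    for move in basis:
        # Difference of two Poisson draws: symmetric around 0 with unbounded support.
        coef = poisson(rng, rate) - poisson(rng, rate)
        for i, row in enumerate(move):
            for j, m in enumerate(row):
                proposal[i][j] += coef * m
    if any(x < 0 for row in proposal for x in row):
        return table  # proposal left the fiber; stay put
    diff = log_weight(proposal) - log_weight(table)
    if rng.random() < math.exp(min(0.0, diff)):
        return proposal
    return table


rng = random.Random(0)
table = [[3, 1, 2], [0, 4, 1], [2, 2, 3]]  # hypothetical observed table
basis = lattice_basis(3, 3)
for _ in range(1000):
    table = step(table, basis, rng)
```

Because the coefficients have unbounded support, any two tables in the fiber differ by some integer combination of the basis moves that is proposed with positive probability, which is why no full Markov basis is needed for connectivity in this sketch.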

Location:
University of Kentucky, Statistics Department MDS 223 Refreshments: 3:30-4:00 Seminar: MDS 312

Marginal correlation measures for unpaired clustered data under cluster-based informativeness

In the marginal analysis of clustered data, two types of informativeness have been shown to bias standard methods for marginal inference: informative cluster size, in which the number of observations in a cluster is associated with a response variable, and subcluster covariate informativeness, in which the probability that a covariate takes a certain value is associated with the response. Monte Carlo-based within-cluster resampling estimators and cluster- and covariate-weighted analytic estimators have been suggested to adjust for both of these problems. In this talk, we suggest a unifying cluster-weighting paradigm for the marginal analysis of clustered data. We then apply this paradigm to unpaired, clustered data - data which are paired at the cluster level, but unpaired within cluster - and develop marginal correlation estimators for such data. The suggested estimators are evaluated through simulation studies and illustrated with an application to data from a longitudinal dental study.

Location:
University of Kentucky, Statistics Department MDS 223 Refreshments: 3:30-4:00 Seminar: MDS 312

Yuguo Chen (from U of Illinois at Urbana-Champaign)

Sampling for Conditional Inference on Network Data

Random graphs with given vertex degrees have been widely used as a model for many real-world complex networks. We describe a sequential sampling method for sampling networks with a given degree sequence. These samples can be used to approximate closely the null distributions of a number of test statistics involved in such networks, and provide an accurate estimate of the total number of networks with given vertex degrees. We apply our method to a range of examples to demonstrate its efficiency in real problems.
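
To give a feel for sequential sampling of graphs with a given degree sequence, here is a minimal sketch in the style of the Blitzstein-Diaconis sequential algorithm. This is a swapped-in, well-known technique, not necessarily the speaker's method. Function names and the example degree sequence are hypothetical. The sketch returns the probability of the sampled edge sequence, the quantity from which importance weights would be built, but it does not implement a full counting or testing procedure.

```python
import random


def is_graphical(degrees):
    """Erdos-Gallai test: can `degrees` be realized by a simple graph?"""
    d = sorted(degrees, reverse=True)
    if sum(d) % 2 or (d and d[-1] < 0):
        return False
    for k in range(1, len(d) + 1):
        if sum(d[:k]) > k * (k - 1) + sum(min(x, k) for x in d[k:]):
            return False
    return True


def sample_graph(degrees, rng):
    """Sample one simple graph with the given degrees, adding one edge at a time."""
    d = list(degrees)  # residual degrees
    if not is_graphical(d):
        raise ValueError("degree sequence is not graphical")
    edges, sequence_prob = set(), 1.0
    while any(d):
        # Vertex with the smallest positive residual degree (lowest index on ties).
        i = min((v for v in range(len(d)) if d[v] > 0), key=lambda v: (d[v], v))
        while d[i] > 0:
            candidates = []
            for j in range(len(d)):
                if j == i or d[j] == 0 or frozenset((i, j)) in edges:
                    continue
                # Keep j only if the residual sequence stays graphical after adding {i, j}.
                d[i] -= 1
                d[j] -= 1
                if is_graphical(d):
                    candidates.append(j)
                d[i] += 1
                d[j] += 1
            if not candidates:
                raise RuntimeError("no admissible neighbour found")
            weights = [d[j] for j in candidates]
            j = rng.choices(candidates, weights=weights)[0]  # prob. proportional to degree
            sequence_prob *= d[j] / sum(weights)
            edges.add(frozenset((i, j)))
            d[i] -= 1
            d[j] -= 1
    return edges, sequence_prob


edges, sequence_prob = sample_graph([3, 3, 2, 2, 2], random.Random(0))
```

Checking graphicality before committing to each edge is the design choice that keeps the sampler from reaching a dead end, so every run ends in a valid simple graph rather than being rejected.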
 
Location:
CB 102