10th World Congress in Probability and Statistics

Plenary Tue-1

IMS Medallion Lecture (Laurent Saloff-Coste)

Conference
9:00 AM — 10:00 AM KST
Local
Jul 19 Mon, 8:00 PM — 9:00 PM EDT

Gambler's ruin problems

Laurent Saloff-Coste (Cornell University)

The classical gambler's ruin problem asks for the probability that player A wins all the money in a fair game between two players, A and B.
For this lecture, our starting point is a fair game of this sort involving three players, A, B, and C, holding a total of N tokens. That's already quite interesting. More generally, I will discuss techniques that allow us to understand the behavior of certain finite Markov chains before the time the chain is absorbed at a given boundary. This is based on joint work with Persi Diaconis and Kelsey Houston-Edwards.
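As a concrete reference point, here is a small Monte Carlo sketch of a three-player fair game. The dynamics below (two surviving players are chosen uniformly at random, a fair coin moves one token between them, and a player who reaches zero is eliminated) are one natural convention, not necessarily the exact rules treated in the lecture.

```python
import random

def win_probability(start=(4, 4, 4), trials=20000, seed=0):
    """Monte Carlo estimate of the probability that player 0 ends up
    with all N tokens in a fair three-player game.  The dynamics are
    an assumed convention, not necessarily the lecture's exact rules:
    two surviving players are chosen uniformly at random, a fair coin
    moves one token between them, and a player at 0 is eliminated."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        stacks = list(start)
        alive = [0, 1, 2]
        while len(alive) > 1:
            a, b = rng.sample(alive, 2)
            if rng.random() < 0.5:
                a, b = b, a          # the coin decides who pays whom
            stacks[a] += 1
            stacks[b] -= 1
            if stacks[b] == 0:
                alive.remove(b)
        if alive[0] == 0:
            wins += 1
    return wins / trials

print(win_probability())  # close to 1/3 by symmetry for equal stacks
```

By symmetry, equal initial stacks give each player a winning probability of 1/3; the interesting questions concern unequal stacks and the pre-absorption behavior the lecture focuses on.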

Session Chair

Qi-Man Shao (Chinese University of Hong Kong)

Plenary Tue-2

IMS Medallion Lecture (Elchanan Mossel)

Conference
10:00 AM — 11:00 AM KST
Local
Jul 19 Mon, 9:00 PM — 10:00 PM EDT

Simplicity and complexity of belief-propagation

Elchanan Mossel (Massachusetts Institute of Technology)

Belief Propagation is a very simple and popular algorithm for the inference of posteriors in probability models on trees, based on iteratively applying Bayes' rule. It is widely used in coding theory, machine learning, and evolutionary inference, among other areas. We will survey the distributional properties and statistical efficiency of Belief Propagation in some of the simplest models, with applications to phylogenetic reconstruction and to detection of block models. Finally, we will discuss the computational complexity of this seemingly simple algorithm.
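For orientation, the following minimal sketch implements the tree recursion behind Belief Propagation for a broadcast model with one shared transition matrix (the toy model and all names are illustrative, not taken from the talk): the upward message at a vertex is the entrywise product over children of M applied to the child's message, and the root posterior is the prior times the root message.

```python
import numpy as np

def upward_message(v, children, obs, M, k):
    """Likelihood vector L_v[x] = P(leaf data below v | state(v) = x)
    for a k-state broadcast model on a tree; a sketch, not the
    speaker's code."""
    if v in obs:                       # observed leaf: point mass
        L = np.zeros(k)
        L[obs[v]] = 1.0
        return L
    L = np.ones(k)
    for c in children.get(v, []):      # Bayes' rule, child by child
        L *= M @ upward_message(c, children, obs, M, k)
    return L

# toy example: root 0 with two observed leaves, symmetric binary channel
children = {0: [1, 2]}
M = np.array([[0.9, 0.1], [0.1, 0.9]])   # P(child state | parent state)
obs = {1: 0, 2: 0}
prior = np.array([0.5, 0.5])
post = prior * upward_message(0, children, obs, M, 2)
print(post / post.sum())               # posterior of the root state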

Session Chair

Krzysztof Burdzy (University of Washington)

Invited 04

Mathematical Population Genetics and Computational Statistics (Organizer: Paul Jenkins)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

Mapping genetic ancestors

Graham Coop (University of California at Davis)

Spatial patterns in genetic diversity are shaped by the movements of individuals dispersing from their parents and by populations expanding and contracting. It has long been appreciated that these patterns of movement shape the underlying genealogies along the genome, leading to geographic patterns of isolation by distance in contemporary population genetic data. The enormous amount of information contained in genealogies along recombining sequences has, until now, not been amenable to this approach. However, it is now possible to infer a sequence of gene genealogies along a recombining sequence. Here we capitalize on this important advance and develop methods that use thousands of trees to estimate time-varying per-generation dispersal rates and to locate the genetic ancestors of a sample back through time. We take a likelihood approach using a simple approximate spatial model (branching Brownian motion) as our prior distribution on genealogies. After testing our method with simulations, we apply it to the 1001 Genomes dataset of over one thousand Arabidopsis thaliana genomes sampled across a wide geographic extent. We detect a very high dispersal rate in the recent past, especially longitudinally, and use inferred ancestor locations to visualize many examples of recent long-distance dispersal and recent admixture events. We also use inferred ancestor locations to identify the origin and ancestry of the North American expansion, and to depict alternative geographic ancestries stemming from multiple glacial refugia. Our method highlights the huge amount of largely untapped information about past dispersal events and population movements contained in genome-wide genealogies.

Cellular point processes: quantifying cell signaling

Barbara Engelhardt (Princeton University)

This talk does not have an abstract.

Fitting stochastic epidemic models to gene genealogies using linear noise approximation

Vladimir Minin (University of California, Irvine)

Phylodynamics is a set of population genetics tools that aim at reconstructing the demographic history of a population based on molecular sequences of individuals sampled from the population of interest. One important task in phylodynamics is to estimate changes in (effective) population size. When applied to infectious disease sequences, such estimation of population size trajectories can provide information about changes in the number of infections. To model changes in the number of infected individuals, current phylodynamic methods use nonparametric approaches (e.g., Bayesian curve-fitting based on change-point models or Gaussian process priors), parametric approaches (e.g., based on differential equations), and stochastic modeling in conjunction with likelihood-free Bayesian methods. The first class of methods yields results that are hard to interpret epidemiologically. The second class provides estimates of important epidemiological parameters, such as infection and removal/recovery rates, but ignores variation in the dynamics of infectious disease spread. The third class is the most advantageous statistically, but relies on computationally intensive particle filtering techniques that limit its applications. We propose a Bayesian model that combines phylodynamic inference and stochastic epidemic models, and achieves computational tractability by using a linear noise approximation (LNA) --- a technique that allows us to approximate probability densities of stochastic epidemic model trajectories. The LNA opens the door to using modern Markov chain Monte Carlo tools to approximate the joint posterior distribution of the disease transmission parameters and of high-dimensional vectors describing unobserved changes in the stochastic epidemic model compartment sizes (e.g., numbers of infectious and susceptible individuals). We illustrate our new method by applying it to Ebola genealogies estimated using viral genetic data from the 2014 epidemic in Sierra Leone and Liberia.
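Schematically, the LNA referred to above has the standard van Kampen form (notation mine, not the paper's): the scaled jump process splits into a deterministic ODE limit plus Gaussian fluctuations, which is what makes transition densities tractable for MCMC.

```latex
% F: drift of the compartment model; G: its diffusion matrix;
% the Gaussian law of \eta gives closed-form transition densities.
X_n(t) \approx \phi(t) + n^{-1/2}\,\eta(t), \qquad
\frac{d\phi}{dt} = F(\phi(t)), \qquad
d\eta(t) = \nabla F(\phi(t))\,\eta(t)\,dt + \sqrt{G(\phi(t))}\,dW_t .
```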

Q&A for Invited Session 04

This talk does not have an abstract.

Session Chair

Paul Jenkins (University of Warwick)

Invited 18

Deep Learning (Organizer: Johannes Schmidt-Hieber)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

Dynamics and phase transitions in deep neural networks

Yasaman Bahri (Google Research)

The study of deep neural networks whose hidden layer widths are large has been fruitful in building theoretical foundations for deep learning. I will begin by surveying the results of our past work along these lines. For instance, infinitely wide deep neural networks can be exactly described by Gaussian processes with particular compositional kernels, both in their prior and predictive posterior distributions. Furthermore, such infinite-width deep networks can be exactly described as linear models under gradient descent, up to a maximum learning rate. At larger learning rates with squared loss, empirical evidence suggests a phase transition to a different, nonlinear regime with universal features across architectures and datasets. I will describe our theoretical understanding of this phase transition through the study of a class of simple dynamical systems distilled from neural network evolution in function space.
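As a concrete instance of the infinite-width correspondence surveyed above, the covariance (NNGP) kernel of an infinitely wide ReLU network obeys a closed-form arc-cosine recursion. The sketch below is the standard textbook form of that recursion, not the speaker's code; the width scaling sw2 = 2 (He-style) is my choice.

```python
import numpy as np

def relu_nngp_kernel(X, depth=3, sw2=2.0, sb2=0.0):
    """Kernel of the Gaussian process describing an infinitely wide
    ReLU network: iterate the arc-cosine expectation layer by layer."""
    K = sw2 * X @ X.T / X.shape[1] + sb2           # first-layer kernel
    for _ in range(depth):
        d = np.sqrt(np.diag(K))
        c = np.clip(K / np.outer(d, d), -1.0, 1.0) # correlations
        theta = np.arccos(c)
        # E[relu(u) relu(u')] for (u, u') jointly Gaussian with cov K
        E = (np.outer(d, d) / (2 * np.pi)) * (np.sin(theta) + (np.pi - theta) * c)
        K = sw2 * E + sb2
    return K

X = np.random.default_rng(0).normal(size=(5, 10))
print(relu_nngp_kernel(X).shape)   # (5, 5) covariance of network outputs
```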

Theoretical understanding of adding noises to deep generative models

Yongdai Kim (Seoul National University)

Deep generative models have received much attention recently since they can generate realistic synthetic images. Several recent studies have reported that adding noise to the data is helpful when learning deep generative models. In this talk, we provide theoretical justifications for this method. We derive the convergence rate of the maximum likelihood estimator of a deep generative model and show that the convergence rate can be improved by adding noise, in particular when the noise level of the data is small.

Adversarial examples in random deep networks

Peter Bartlett (University of California at Berkeley)

Because the phenomenon of adversarial examples in deep networks poses a serious barrier to the reliable and robust application of this methodology, there has been considerable interest in why it arises. We consider ReLU networks of constant depth with independent Gaussian parameters, and show that small perturbations of input vectors lead to large changes of outputs. Building on insights of Daniely and Schacham (2020) and of Bubeck et al (2021), we show that adversarial examples arise in these networks because the functions that they compute are very close to linear. The main result is for networks with constant depth, but we also show that some constraint on depth is necessary for a result of this kind: there are suitably deep networks that, with constant probability, compute a function that is close to constant.

Joint work with Sébastien Bubeck and Yeshwanth Cherapanamjeri
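A quick numerical illustration of the near-linearity mechanism described in the abstract (a sketch under assumed choices: constant-depth ReLU network with iid Gaussian weights and no biases, my scalings and step size): a perturbation of norm roughly ||x||/sqrt(d), tiny relative to the input, flips the sign of the output.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width, depth = 300, 300, 3

# random constant-depth ReLU network, iid Gaussian weights, no biases
Ws = [rng.standard_normal((width, d)) / np.sqrt(d)]
Ws += [rng.standard_normal((width, width)) / np.sqrt(width) for _ in range(depth - 1)]
w_out = rng.standard_normal(width) / np.sqrt(width)

def f(x):
    h = x
    for W in Ws:
        h = np.maximum(W @ h, 0.0)
    return w_out @ h

x = rng.standard_normal(d)
fx = f(x)

# numerical gradient; if f is near-linear, stepping 1.5|f(x)|/||g||
# against sign(f(x)) along g flips the sign of the output
eps = 1e-5
g = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(d)])
delta = -np.sign(fx) * (1.5 * abs(fx) / np.dot(g, g)) * g
print(fx, f(x + delta), np.linalg.norm(delta) / np.linalg.norm(x))
```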

Q&A for Invited Session 18

This talk does not have an abstract.

Session Chair

Johannes Schmidt-Hieber (University of Twente)

Organized 07

Anomalous Diffusions and Related Topics (Organizer: Zhen-Qing Chen)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

Lp-Kato class measures for symmetric Markov processes under heat kernel estimates

Kazuhiro Kuwae (Fukuoka University)


Green function estimates and Boundary Harnack principles for non-local operators whose kernels degenerate at the boundary

Panki Kim (Seoul National University)

In this talk, we discuss the potential theory of Markov processes with jump kernels decaying at the boundary of the half space. The boundary part of the kernel is comparable to the product of three terms, with parameters appearing as exponents in these terms. The constant c in the killing term can be written as a function of a parameter p, and this function is strictly increasing in p. We establish sharp two-sided estimates on the Green functions of these processes for all admissible values of p and of the parameters in the boundary part of the kernel. The estimates take three different forms depending on the region to which the parameters and p belong. As applications, we completely determine the region of the parameters where the boundary Harnack principle holds and where it fails. This talk is based on joint works with Renming Song and Zoran Vondracek.

Heat kernel upper bounds for symmetric Markov semigroups

Jian Wang (Fujian Normal University)

It is well known that Nash-type inequalities for symmetric Dirichlet forms are equivalent to on-diagonal heat kernel upper bounds for the associated symmetric Markov semigroups. In this talk, we show that both imply (and hence are equivalent to) off-diagonal heat kernel upper bounds under some mild assumptions. Our approach is based on a new generalization of Davies's method. Our results considerably extend those of Carlen-Kusuoka-Stroock for Nash-type inequalities of power order, and also those of Grigor'yan for second order differential operators on a complete non-compact manifold. The talk is based on a joint work with Z.-Q. Chen (Seattle), P. Kim (Seoul) and T. Kumagai (Kyoto).

Inverse local time of one-dimensional diffusions and its comparison theorem

Lidan Wang (Nankai University)

It is well known that for a reflecting Bessel process, the inverse local time at $0$ is an $\alpha$-stable subordinator, and the corresponding subordinate Brownian motion is a $2\alpha$-stable process. Based on discussions of some transforms and the regenerative theory of general diffusions, we obtain a comparison result between the inverse local times of Bessel processes and of perturbed Bessel processes. An immediate application is the stability of Green function estimates for trace processes.

Archimedes' principle for ideal gas

Krzysztof Burdzy (University of Washington)

I will present Archimedes' principle for a macroscopic ball in ideal gas consisting of point particles with non-zero mass. The main result is an asymptotic theorem, as the number of point particles goes to infinity and their total mass remains constant. Asymptotically, the gas has an exponential density as a function of height. The asymptotic inverse temperature of the gas is identified as the parameter of the exponential distribution.

Joint work with Jacek Malecki

Q&A for Organized Contributed Session 07

This talk does not have an abstract.

Session Chair

Zhen-Qing Chen (University of Washington)

Organized 17

The Advances in Time Series and Spatial Statistics (Organizer: Wei-Ying Wu)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

Interpretable, predictive spatio-temporal models via enhanced pairwise directions estimation

ShengLi Tzeng (National Sun Yat-sen University)

Spatio-temporal phenomena are often complicated, but kriging methods, in which only a very simple mean structure is assumed, are widely used in modeling such data. We instead develop a novel approach based on supervised dimension reduction for such data, in order to capture nonlinear mean structures without requiring a prespecified parametric model. In addition to prediction as a common interest, our approach focuses more on the exploration of geometric information in the data. The method of Pairwise Directions Estimation (PDE) is incorporated in our approach to implement the data-driven search for functions describing spatial structures and temporal patterns, which is useful in exploring data trends. We further enhance PDE, referring to the result as PDE+, by using resolution-adaptive fixed rank kriging to estimate the random effects not explained by the mean structures. Our proposal not only produces more accurate and explainable predictions, but also increases the computational efficiency of model building. Illustrative applications to two real datasets are also presented. The results demonstrate that the proposed PDE+ method is very useful for exploring and interpreting trend patterns in spatio-temporal data.

Model selection with a nested spatial correlation structure

Chun-Shu Chen (National Central University)

In spatial regression analysis, a suitable specification of the mean regression model is crucial for unbiased analysis. Suitably accounting for the underlying spatial correlation structure of the response variables is also an important issue. Here, we focus on selecting an appropriate mean model in spatial regression analysis under a general anisotropic nested spatial correlation structure. We propose a distribution-free model selection criterion, an estimate of the weighted mean squared error, based on assumptions only about the first two moments of the response data. Simulations under various covariate-selection settings reveal that the proposed criterion performs well for covariate selection in the mean model regardless of whether the underlying spatial correlation structure is nested or non-nested, isotropic or anisotropic. The proposed criterion also accommodates both continuous and count response data. Finally, a real data example regarding fine particulate matter concentrations is analyzed for illustration.

Consistent order selection for ARFIMA models

Kun Chen (Southwestern University of Finance and Economics)

Estimating the orders of the autoregressive fractionally integrated moving average (ARFIMA) model has been a long-standing challenge in time series analysis. This paper tackles the challenge by establishing the consistency of the Bayesian information criterion (BIC) for the ARFIMA model with independent errors. Since we allow the model's memory parameter to be any unknown real number, our consistency result applies simultaneously to short-memory, long-memory, and non-stationary time series. We further extend BIC's consistency to the ARFIMA model with conditionally heteroskedastic errors, thereby broadening the criterion's range of applications. Finally, the finite-sample implications of our theoretical results are illustrated using numerical examples.

Whittle likelihood for irregularly spaced spatial data

Soutir Bandyopadhyay (Colorado School of Mines)

Under some regularity conditions, including that the process is Gaussian, the sampling region is rectangular, and the parameter space $\Theta$ is compact, Matsuda and Yajima (2009) showed that the Whittle estimator $\widehat{\theta}_{n}$ minimizing their version of the Whittle likelihood is consistent (for $d\leq 3$), and that one can construct large sample confidence regions for the covariance parameters $\theta$ using the asymptotic normality of $\widehat{\theta}_{n}$. However, this requires one to estimate the asymptotic covariance matrix, which involves integrals of the spatial sampling density. Moreover, nonparametric estimation of the quantities in the asymptotic covariance matrix requires specification of a smoothing parameter and is subject to the curse of dimensionality. In comparison, we propose an approach based on a spatial frequency domain empirical likelihood method (cf. Bandyopadhyay et al. (2015), Van Hala et al. (2017)) which can be employed to produce asymptotically valid confidence regions and tests on $\theta$, without requiring explicit estimation of such quantities.
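For orientation, the object being generalized here is the classical time-series Whittle log-likelihood, which replaces the exact Gaussian likelihood by a frequency-domain approximation built from the periodogram; Matsuda and Yajima (2009) work with a spatial analogue for irregularly spaced data.

```latex
% Sum over Fourier frequencies \omega_j = 2\pi j/n; f_\theta is the
% model spectral density and I_n the periodogram.
\ell_W(\theta) \;=\; -\sum_{j}\Bigl\{\log f_\theta(\omega_j)
   + \frac{I_n(\omega_j)}{f_\theta(\omega_j)}\Bigr\},
\qquad
I_n(\omega) \;=\; \frac{1}{2\pi n}\Bigl|\sum_{t=1}^{n} X_t\,e^{-\mathrm{i}t\omega}\Bigr|^2 .
```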

Q&A for Organized Contributed Session 17

This talk does not have an abstract.

Session Chair

Wei-Ying Wu (National Dong Hwa University)

Organized 24

Advanced Statistical Methods for Complex Data (Organizer: Jongho Im)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

On the verifiable identification condition in NMAR missing data analysis

Kosuke Morikawa (Osaka University and The University of Tokyo)

Missing data often causes undesirable effects such as bias and loss of efficiency. Modeling the distribution of the complete data together with the missing-data mechanism and incorporating both into the likelihood can solve the problem. However, especially when the missing-data mechanism is NMAR (not missing at random), there are two problems: (i) we cannot verify sufficient conditions on the distribution of the complete data; (ii) guaranteeing model identifiability is difficult even for relatively simple models. Some recent studies tackle the first problem (i) by modeling the distribution of the observed data rather than the complete data, which is impossible to obtain. As for problem (ii), we have derived sufficient conditions for model identifiability under nonignorable nonresponse by specifying the outcome distribution as a normal or normal mixture, while allowing the missing-data mechanism to be any parametric model. The new conditions make it possible to check whether the assumed models are identifiable from the observed data under NMAR missingness.

Bayesian hierarchical spatial model for small-area estimation with non-ignorable nonresponses and its application to the NHANES dental caries data

Ick Hoon Jin (Yonsei University)

The National Health and Nutrition Examination Survey (NHANES) is a major program of the National Center for Health Statistics, designed to assess the health and nutritional status of adults and children in the United States. The analysis of NHANES dental caries data faces several challenges: (1) the data were collected using a complex, multistage, stratified, unequal-probability sampling design; (2) the sample size of some primary sampling units (PSUs), e.g., counties, is very small; (3) the measures of dental caries have a complicated structure and correlation; and (4) there is a substantial percentage of nonresponses, which are expected to be not missing at random, i.e., non-ignorable. We propose a Bayesian hierarchical spatial model to address these analysis challenges. We develop a two-level Potts model that closely resembles the caries evolution process and captures complicated spatial correlations between teeth and surfaces of the teeth. By adding Bayesian hierarchies to the Potts model, we account for the multistage survey sampling design, while also enabling information borrowing across PSUs for small-area estimation. We incorporate sampling weights by including them as a covariate in the model and adopt flexible B-splines to achieve robust inference. We account for non-ignorable missing outcomes and covariates using a selection model. We use data augmentation coupled with a noisy Monte Carlo algorithm to overcome the numerical difficulty caused by doubly-intractable normalizing constants and to sample from the posterior. Our analysis results show strong spatial associations between teeth and tooth surfaces, and indicate that dental hygienic factors, such as fluorosis and sealant, reduce dental disease risks.

Raking-based relabeling classification method for highly imbalanced data

Seunghwan Park (Kangwon National University)

We consider binary classification on imbalanced data. A dataset is called imbalanced if the class proportions are heavily skewed. Classification on imbalanced data is often challenging, especially for high-dimensional data, in the sense that the unequal class sizes deteriorate the performance of classifiers. Undersampling the majority class and/or oversampling the minority class are popular methods for constructing balanced samples, which helps to improve classification performance. However, many existing sampling methods cannot be easily extended to high-dimensional data and mixed variables, because they often require approximating the distribution of the attributes, and this becomes another critical issue in its own right. In this paper, we propose a new sampling strategy, called counter-matching sampling, in which attribute values of the majority class are imputed for the values of the minority class in the construction of balanced samples. The proposed algorithms achieve the same or similar performance as existing methods but are more flexible with respect to the shape of the data and the number of attributes. Our sampling algorithm is very attractive in practice, considering that it does not require any density estimation for synthetic data generation in oversampling and is not hindered by mixed variables. Also, the proposed sampling strategy can easily be combined with many existing classifiers.

Imputation approach for outcome dependent sampling design

Jongho Im (Yonsei University)

Outcome dependent sampling (ODS) has been widely used to enhance study efficiency in epidemiology and biomedical studies. We consider a biased two-phase sampling design in which the second-phase samples are selected based on the outcome variable and the covariates x are observed only at the second phase. Many methods have been proposed that incorporate the estimated inclusion probabilities into the target score function of the outcome model. In this paper, we propose an imputation method that is essentially implemented by data augmentation. The predictive distribution is nonparametrically estimated, and a Bayesian bootstrap method is then used to generate imputed values. The proposed method employs Rubin's variance formula for variance estimation of imputation estimators. A limited simulation study shows that the proposed method performs well and is comparable to the previous methods.

Q&A for Organized Contributed Session 24

This talk does not have an abstract.

Session Chair

Seunghwan Park (Kangwon National University)

Contributed 25

Time Series Analysis II

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

Robust Bayesian analysis of multivariate time series

Yixuan Liu (The University of Auckland)

There has been a surge in the literature on nonparametric Bayesian inference for multivariate time series over the last decade. Many approaches model the spectral density matrix using the Whittle likelihood, an approximation of the true likelihood commonly employed for Gaussian time series. Meier et al. (2019) propose a nonparametric Whittle likelihood procedure along with a Bernstein polynomial prior weighted by a Hermitian positive definite (Hpd) Gamma process. However, it is known that nonparametric techniques are less efficient and powerful than parametric techniques when the latter specify a model for the observations perfectly. Therefore, Kirch et al. (2019) suggest a nonparametric correction to the parametric likelihood in the univariate case that retains the efficiency of parametric models while amending their sensitivities through the nonparametric correction. Along with this novel likelihood, a Bernstein polynomial prior equipped with a Dirichlet process weight is employed. My current work extends the corrected Whittle likelihood procedure to the multivariate case by combining the work of Meier et al. (2019) and Kirch et al. (2019). Precisely, the multivariate version of the corrected Whittle likelihood is proposed along with the Hpd Gamma process weighted Bernstein polynomial prior to implement Bayesian inference. A key part of this work is to prove posterior consistency. In the talk, I will review the work of Meier et al. (2019) and Kirch et al. (2019), and then introduce the multivariate corrected Whittle likelihood procedure.

Posterior consistency for the spectral density of non-Gaussian stationary time series

Yifu Tang (The University of Auckland)

Various nonparametric approaches for Bayesian spectral density estimation of stationary time series have been suggested in the literature, mostly based on the Whittle likelihood approximation. A generalization of this approximation has been proposed in Kirch et al. (2019), who prove posterior consistency for spectral density estimation in combination with the Bernstein-Dirichlet process prior for Gaussian time series. In this talk, I will discuss how to extend the posterior consistency result to non-Gaussian time series by employing a modified version of the general consistency theorem of Shalizi (2009) for dependent data and misspecified models. As a special case, posterior consistency for the spectral density under the Whittle likelihood, as proposed by Choudhuri, Ghosal and Roy (2004), is also extended to non-Gaussian time series.

ARMA models for zero inflated count time series

Vurukonda Sathish (Indian Institute of Technology Bombay)

Zero inflation is a common nuisance while monitoring disease progression over time. This article proposes a new observation-driven model for zero-inflated and over-dispersed count time series. The counts, given the past history of the process and available information on covariates, are assumed to be distributed as a mixture of a Poisson distribution and a distribution degenerate at zero, with a time-dependent mixing probability $\pi_t$. Since count data usually suffer from overdispersion, a Gamma distribution is used to model the excess variation, resulting in a zero-inflated negative binomial (NB) regression model with mean parameter $\lambda_t$. Linear predictors with autoregressive and moving average (ARMA) type terms, covariates, seasonality, and trend are fitted to $\lambda_t$ and $\pi_t$ through canonical-link generalized linear models. Estimation is done using maximum likelihood aided by iterative algorithms, such as Newton-Raphson (NR) and expectation-maximization (EM). Theoretical results on the consistency and asymptotic normality of the estimators are given. The proposed model is illustrated using in-depth simulation studies and a dengue data set.
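In symbols, the observation-driven structure described above takes roughly the following form (notation illustrative; the paper's exact link functions and lag structure may differ):

```latex
% \delta_{\{0\}}: point mass at zero; NB(\lambda_t, r): negative
% binomial with mean \lambda_t; the e_{t-j} are past residual-type
% terms giving the moving-average part.
Y_t \mid \mathcal{F}_{t-1} \;\sim\; \pi_t\,\delta_{\{0\}}
   \;+\; (1-\pi_t)\,\mathrm{NB}(\lambda_t, r),
\qquad
\log \lambda_t = x_t^\top \beta + \sum_i \phi_i \log\lambda_{t-i}
   + \sum_j \psi_j\, e_{t-j},
\qquad
\operatorname{logit}(\pi_t) = z_t^\top \gamma .
```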

Time-series data clustering via thick pen transformation

Minji Kim (Seoul National University)

Our ultimate goal is to cluster time-series data by suggesting a new similarity measure and an optimization algorithm. Specifically, we propose a new time-series clustering method based on the Thick Pen Transformation (TPT) of Fryzlewicz and Oh (2011), whose basic idea is to draw along the data with a pen of a given thickness. The main contribution of this research is a new similarity measure for time-series data based on the overlap or gap between the two thick lines after transformation. Using the TPT to measure association exploits the strengths of the transformation: it is a multi-scale visualization technique that provides information on the temporal trends of neighbouring values. Moreover, we suggest an efficient iterative clustering optimization algorithm appropriate for the proposed measure. Our main motivation is to cluster a large number of physical step count series obtained from a wearable device. In addition, comparative numerical experiments are performed to compare our method with some existing methods. Real data analysis and simulation studies suggest that the proposed method is applicable in general to time series distributed on the same side of the axis, whose similarities can be measured in the form of a proportion of overlapping area.
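A minimal sketch of the ingredients (the exact overlap measure and the optimization in the talk may differ; the averaging over thicknesses is my choice): for a pen of thickness tau, the TPT records an upper and a lower boundary around the series, and a similarity between two series can be read off from how much their thick lines overlap.

```python
import numpy as np

def thick_pen(x, tau):
    """Upper/lower boundaries of a pen of thickness tau drawn along x
    (the Thick Pen Transformation of Fryzlewicz and Oh, 2011)."""
    n = len(x)
    U = np.array([x[max(0, t - tau):t + tau + 1].max() + tau for t in range(n)])
    L = np.array([x[max(0, t - tau):t + tau + 1].min() - tau for t in range(n)])
    return L, U

def tpt_similarity(x, y, taus=(1, 2, 4, 8)):
    """Average, over thicknesses, of the proportion of vertical overlap
    between the two thick lines; a plausible overlap-based measure in
    the spirit of the talk, not necessarily the proposed one."""
    scores = []
    for tau in taus:
        Lx, Ux = thick_pen(x, tau)
        Ly, Uy = thick_pen(y, tau)
        overlap = np.minimum(Ux, Uy) - np.maximum(Lx, Ly)
        union = np.maximum(Ux, Uy) - np.minimum(Lx, Ly)
        scores.append(np.mean(np.clip(overlap, 0, None) / union))
    return float(np.mean(scores))

rng = np.random.default_rng(1)
a = np.cumsum(rng.normal(size=200))
print(tpt_similarity(a, a + 0.1), tpt_similarity(a, -a))
```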

Q&A for Contributed Session 25

This talk does not have an abstract.

Session Chair

Joungyoun Kim (Yonsei University)

Poster I-1

Poster Session I-1

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

GMOTE: Gaussian-based minority oversampling technique for imbalanced classification adapting tail probability of outliers

Seung Jee Yang (Hanyang University)

Imbalanced data substantially affects the performance of standard classification models. As a solution, oversampling methods such as the synthetic minority oversampling technique (SMOTE) have been proposed. However, because methods such as SMOTE use linear interpolation to generate synthetic instances, the synthetic data space may appear similar to a polygon. Furthermore, oversampling methods may generate synthetic outliers in minority classes. In this paper, we propose a Gaussian-based minority oversampling technique (GMOTE) with a statistical perspective for imbalanced datasets. The proposed method generates instances by using a Gaussian mixture model, to avoid linear interpolation and to take outliers into account. Motivated by the clustering-based multivariate Gaussian outlier score, we propose handling local outliers by calculating the tail probability of each instance through its Mahalanobis distance. Experiments were conducted on a representative set of benchmark datasets, and the performance of GMOTE was compared with that of other methods. When GMOTE is combined with a classification and regression tree or a support vector machine, it produces better accuracy and F1-scores. Experimental results demonstrate this robust performance.
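The sketch below reconstructs the main idea from the abstract (it is not the authors' implementation; the component count, tail cutoff, and rejection rule are assumptions): fit a Gaussian mixture to the minority class, sample synthetic points from it, and reject draws whose Mahalanobis distance to their component lies beyond a chi-square tail cutoff.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.mixture import GaussianMixture

def gmote(X_min, n_new, n_components=2, tail=0.99, seed=0):
    """Gaussian-mixture oversampling with a Mahalanobis tail filter,
    in the spirit of GMOTE (illustrative reconstruction)."""
    gm = GaussianMixture(n_components=n_components, random_state=seed).fit(X_min)
    cutoff = chi2.ppf(tail, df=X_min.shape[1])   # tail-probability gate
    out = []
    while len(out) < n_new:
        x, label = gm.sample(1)
        x, k = x[0], int(label[0])
        diff = x - gm.means_[k]
        d2 = diff @ np.linalg.solve(gm.covariances_[k], diff)
        if d2 <= cutoff:                         # keep non-outlying draws
            out.append(x)
    return np.asarray(out)

rng = np.random.default_rng(0)
X_min = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=60)
print(gmote(X_min, n_new=100).shape)   # (100, 2) synthetic minority points
```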

Exact inference for an exponential parameter under generalized progressive type II hybrid censored competing risk data

Subin Cho (Daegu University)

Progressive censoring has the drawback that it might take a very long time to observe the m-th failure and complete the life test. For this reason, the generalized progressive type II censoring scheme was introduced. In addition, it is known that more than one risk factor may be present at the same time. In this paper, we discuss exact inference for a competing risk model with generalized progressive type II hybrid censored exponential data. We derive the conditional moment generating function of the maximum likelihood estimators of the scale parameters of the exponential distribution, and the resulting lower confidence bound, under the generalized progressive type II hybrid censoring scheme. From the example data, it can be seen that the PDF of the MLE is almost symmetric.

Meta-analysis methods for multiple related markers: applications to microbiome studies with the results on multiple $\alpha$-diversity indices

Hyunwook Koh (The State University of New York, Korea)

Meta-analysis is a practical and powerful analytic tool that enables a unified statistical inference across the results from multiple studies. Notably, researchers often report the results on multiple related markers in each study (e.g., various $\alpha$-diversity indices in microbiome studies). However, univariate meta-analyses are limited to combining the results on a single common marker at a time, whereas existing multivariate meta-analyses are limited to situations where marker-by-marker correlations are given in each study. Thus, here we introduce two meta-analysis methods, namely, multi-marker meta-analysis (mMeta) and adaptive multi-marker meta-analysis (aMeta), to combine multiple studies across multiple related markers with no a priori results on marker-by-marker correlations. mMeta is a statistical estimator for a pooled estimate and its standard error across all the studies and markers, whereas aMeta is a statistical test based on the test statistic of the minimum p-value among marker-specific meta-analyses. mMeta conducts both effect estimation and hypothesis testing based on a weighted average of marker-specific pooled estimates, while estimating marker-by-marker correlations non-parametrically via permutations, yet its power is only moderate. In contrast, aMeta closely approaches the highest power among marker-specific meta-analyses, yet it is limited to hypothesis testing. While their applications can be broader, we illustrate the use of mMeta and aMeta to combine microbiome studies across multiple $\alpha$-diversity indices. We evaluate mMeta and aMeta in silico and apply them to real microbiome studies on the disparity in $\alpha$-diversity by HIV infection status.

Estimation for a nonlinear regression model with non-zero mean errors and an application to a biomechanical model

Hojun You (Seoul National University)

We propose a modified least squares estimator for a nonlinear regression model with non-zero mean errors, motivated by the head-neck position tracking application. A nonlinear regression with multiplicative errors can be handled within the framework of the proposed method. In addition, we assume temporal dependence in the errors. We propose not only a modified least squares procedure for parameter estimation, but also a penalized least squares procedure for simultaneous parameter estimation and selection. Asymptotic properties of the proposed estimators, especially local consistency and the oracle property of the penalized least squares estimator, are established under plausible assumptions on the nonlinear function, the errors, and the penalty function. A simulation study demonstrates that the proposed estimation performs well in both parameter estimation and selection with temporally correlated errors. The analysis of, and comparison with existing methods for, the head-neck position tracking data show better performance of the proposed method in terms of the variance accounted for (VAF).

Neural network-based clustering for ischemic stroke patients

Su Hoon Choi (Chonnam National University)

Finding similar clusters of stroke patients is important because it can lead to discovering new patterns and more effective ways to manage stroke. Although lifetime clustering is an important tool, it remains a relatively unexplored topic. In general, the degree of risk is classified using SPI-II, a traditional risk score to stratify the risk of stroke recurrence. SPI-II is a verified and reliable stroke risk score. However, existing tools for predicting stroke outcome risks may have limitations because not all relevant variables can be considered. In this study, we compare several lifetime clustering methods, including the deep lifetime clustering (DLC) method, a neural network-based clustering model. The performance of each clustering method was evaluated on real-world survival datasets of patients with ischemic stroke. The SPI-II scores are grouped into three groups (low, medium, and high risk) based on previous studies; accordingly, we use three clusters in all methods. The metrics used to evaluate the clusters obtained from each method are the concordance index, Brier score, and log-rank score. We analyzed 7,650 patients with acute ischemic stroke from a local comprehensive stroke center registry. Compared to the SPI-II stroke risk score and the other clustering methods, the DLC model performed much better on all evaluation indices. These results suggest that the DLC method may be useful for grouping stroke patients with similar outcome risks. Our study has an inherent limitation in that it included only data from a single stroke center registry in the Republic of Korea; further research with independent cohorts is therefore likely to be required. Nevertheless, this is the first application of a neural network-based clustering method to stroke patients on real-world datasets.

Principal component analysis of amplitude and phase variation in multivariate functional data

Soobin Kim (Seoul National University)

In many situations, multivariate functional data have both phase and amplitude variations. A common approach is to remove phase variation using a chosen function alignment method and then apply functional principal component analysis (FPCA) to the aligned functions, which contain only amplitude variation. To consider both types of variation, we propose an extension of FPCA for amplitude and phase variation to the multivariate case. The original functions are decomposed into amplitude functions and warping functions, and the warping functions are transformed into square-integrable functions via a centered log-ratio transformation. Multivariate FPCA is then performed on each amplitude and phase component, with data-adaptive weights to balance the variational effects. The proposed method demonstrates its usefulness through a real data analysis of sea climate data in Korea.

Clustering non-stationary advanced metering infrastructure data

Donghyun Kang (Chung-Ang University)

We propose a clustering method for advanced metering infrastructure (AMI) data in Korea. As AMI data present non-stationarity, we consider time-dependent frequency domain principal component analysis and develop a new clustering method based on the time-varying eigenvectors. Our method provides a meaningful result that differs from the clustering results obtained with conventional methods, such as K-means and K-centres functional clustering. We further apply the clustering results to the evaluation of the electricity price system in South Korea, and validate the reform of the progressive electricity tariff system.

Plenary Tue-3

Levy Lecture (Massimiliano Gubinelli)

Conference
7:00 PM — 8:00 PM KST
Local
Jul 20 Tue, 6:00 AM — 7:00 AM EDT

A variational method for Euclidean quantum fields

Massimiliano Gubinelli (University of Bonn)

I will talk about recent progress in understanding the probabilistic structure of certain (bosonic) Euclidean quantum field theories in terms of a variational representation of their Laplace transform. This approach gives an alternative construction of the $\Phi^4_3$ measure in finite volume and a tool to investigate some of its properties. It also generates some interesting new mathematical objects, such as a new kind of equation which allows one to describe the infinite-volume measures.

Session Chair

Martin Hairer (Imperial College London)

Plenary Tue-4

Doob Lecture (Nicolas Curien)

Conference
8:00 PM — 9:00 PM KST
Local
Jul 20 Tue, 7:00 AM — 8:00 AM EDT

Parking on Cayley trees and Frozen Erdös-Rényi

Nicolas Curien (Paris-Saclay University)

Consider a uniform Cayley tree $T_n$ with n vertices and let m cars arrive sequentially, independently, and uniformly on its vertices. Each car tries to park on its arrival node, and if the spot is already occupied, it drives towards the root of the tree and parks as soon as possible. Using combinatorial enumeration, Lackner & Panholzer established a phase transition for this process when m is approximately n/2. We couple this model with a variation of the classical Erdös-Rényi random graph process. This enables us to completely describe the phase transition for the size of the components of parked cars using a modification of the standard multiplicative coalescent, which we named the frozen multiplicative coalescent. The geometry of critical parked clusters in the parking process is also studied. Those trees are very different from usual random trees and should converge towards the growth-fragmentation trees canonically associated with the 3/2-stable process, which already appeared in the study of random planar maps.

Based on joint work with Alice Contat
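A small simulation makes the phase transition in the abstract easy to see (a sketch of the model as stated there; rooting at vertex 0 and the convention that a car passing the root leaves the tree are my choices): the fraction of cars that manage to park stays near 1 for m well below n/2 and drops noticeably past it.

```python
import heapq
import random
from collections import deque

def uniform_tree_parents(n, rng):
    """Parent pointers (towards root 0) of a uniform Cayley tree on
    {0, ..., n-1}, built by decoding a random Pruefer sequence."""
    if n == 1:
        return {0: None}
    seq = [rng.randrange(n) for _ in range(n - 2)]
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    adj = {v: [] for v in range(n)}
    for v in seq:                      # standard Pruefer decoding
        u = heapq.heappop(leaves)
        adj[u].append(v); adj[v].append(u)
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    u, w = heapq.heappop(leaves), heapq.heappop(leaves)
    adj[u].append(w); adj[w].append(u)
    parent, queue = {0: None}, deque([0])
    while queue:                       # orient all edges towards root 0
        v = queue.popleft()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return parent

def parked_fraction(n, m, trials=100, seed=0):
    """Fraction of the m cars that park: each car lands on a uniform
    vertex, drives towards the root, parks at the first free spot,
    and leaves the tree if it passes the root."""
    rng = random.Random(seed)
    parked = 0
    for _ in range(trials):
        parent = uniform_tree_parents(n, rng)
        occupied = [False] * n
        for _ in range(m):
            v = rng.randrange(n)
            while v is not None and occupied[v]:
                v = parent[v]
            if v is not None:
                occupied[v] = True
                parked += 1
    return parked / (trials * m)

# below (m = 0.3 n) versus above (m = 0.7 n) the m ~ n/2 transition
print(parked_fraction(1000, 300), parked_fraction(1000, 700))
```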

Session Chair

Wendelin Werner (Swiss Federal Institute of Technology Zürich)

Invited 16

Bootstrap for High-dimensional Data (Organizer: Kengo Kato)

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Inference for nonlinear inverse problems

Vladimir Spokoiny (Weierstrass Institute for Applied Analysis and Stochastics and Humboldt University of Berlin)

Bayesian methods are actively used for parameter identification and uncertainty quantification when solving nonlinear inverse problems with random noise. However, there are only a few theoretical results justifying the Bayesian approach. Recent papers, see e.g. Nickl (2017); Lu (2017) and references therein, illustrate the main difficulties and challenges in studying the properties of the posterior distribution in the nonparametric setup. This paper offers a new approach for studying the frequentist properties of nonparametric Bayes procedures. The idea is to relax the nonlinear structural equation by introducing an auxiliary functional parameter, replacing the structural equation with a penalty, and imposing a prior on the auxiliary parameter. For such an extended model, we state sharp bounds on posterior concentration, on the accuracy of the penalized MLE, and on the Gaussian approximation of the posterior, along with a number of further results. All the bounds are given in terms of an effective dimension, and we show that the proposed calming device does not significantly affect this value.

Change point analysis for high-dimensional data

Xiaohui Chen (University of Illinois at Urbana-Champaign)

Cumulative sum (CUSUM) statistics are widely used in change point inference and identification. For the problem of testing for the existence of a change point in an independent sample generated from the mean-shift model, we introduce a Gaussian multiplier bootstrap to calibrate critical values of the CUSUM test statistics in high dimensions. The proposed bootstrap CUSUM test is fully data-dependent, and it has strong theoretical guarantees under arbitrary dependence structures and mild moment conditions. Specifically, we show that with a boundary removal parameter the bootstrap CUSUM test enjoys uniform validity in size under the null, and that it achieves the minimax separation rate under sparse alternatives when the dimension p can be larger than the sample size n. Once a change point is detected, we estimate the change point location by maximizing the $\ell^{\infty}$-norm of the generalized CUSUM statistics at two different weighting scales, corresponding to covariance-stationary and non-stationary CUSUM statistics. For both estimators, we derive their rates of convergence and show that the dimension impacts the rates only through logarithmic factors, which implies that consistency of the CUSUM estimators is possible when p is much larger than n. In the presence of multiple change points, we propose a principled bootstrap-assisted binary segmentation (BABS) algorithm to dynamically adjust the change point detection rule and recursively estimate the locations. We derive its rate of convergence under suitable signal separation and strength conditions. Time permitting, we may also discuss a robust extension of the change point detection problem for high-dimensional location parameters.
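A minimal sketch of the multiplier-bootstrap calibration described above (boundary trimming, the two weighting scales, and the BABS refinement are omitted; all names are mine):

```python
import numpy as np

def cusum_stat(X):
    """Max-type statistic: ||.||_inf of the CUSUM process over all
    candidate change points t = 1, ..., n-1."""
    n = X.shape[0]
    S = np.cumsum(X, axis=0)
    t = np.arange(1, n)[:, None]
    C = (S[:-1] - (t / n) * S[-1]) / np.sqrt(n)
    return np.abs(C).max()

def bootstrap_pvalue(X, B=500, seed=0):
    """Gaussian multiplier bootstrap: recompute the CUSUM statistic on
    centered rows multiplied by iid N(0,1) weights."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    stat = cusum_stat(X)
    boot = np.empty(B)
    for b in range(B):
        e = rng.standard_normal((n, 1))
        boot[b] = cusum_stat(e * Xc)
    return (boot >= stat).mean()

rng = np.random.default_rng(1)
X0 = rng.standard_normal((200, 50))                  # no change
X1 = X0.copy(); X1[100:, :5] += 1.0                  # sparse mean shift
print(bootstrap_pvalue(X0), bootstrap_pvalue(X1))
```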

Bootstrap test for multi-scale lead-lag relationships in high-frequency data

Yuta Koike (University of Tokyo)

Motivated by recent empirical findings in high-frequency financial econometrics, we consider a pair of Brownian motions having possibly different lead-lag relationships at multiple time scales. Given their discrete observation data, we aim to test at which time scales these processes have non-zero cross correlations. For this purpose, we introduce maximum-type test statistics based on scale-by-scale cross covariance estimators and develop a Gaussian approximation theory for these statistics. Since their null distributions are analytically intractable, we propose a wild bootstrap procedure to approximate them. Theoretical verification of these approximations is established through recent Gaussian approximation results for high-dimensional vectors of degenerate quadratic forms.

Q&A for Invited Session 16

This talk does not have an abstract.

Session Chair

Kengo Kato (Cornell University)

Invited 27

Random Matrices and Related Fields (Organizer: Manjunath Krishnapur)

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

The scaling limit of the characteristic polynomial of a random matrix at the spectral edge

Elliot Paquette (McGill University)

The Gaussian beta-ensemble (GbetaE) is a one-parameter generalization of the Gaussian orthogonal/unitary/symplectic ensembles which retains some integrable structure. Using this ensemble, Ramirez, Rider and Virag -- building on a heuristic of Edelman and Sutton -- constructed a limiting point process, the Airy-beta point process, which is the weak limit of the point process of eigenvalues of a random matrix in a neighborhood of the spectral edge. Jointly with Gaultier Lambert, we give a construction of a new limiting object, the stochastic Airy function (SAi); we show this is the limit of the characteristic polynomial of GbetaE in a neighborhood of the spectral edge. It is the bounded solution of the stochastic Airy equation, which is the usual Airy equation perturbed by a multiplicative white noise. We also give some basic properties of SAi.

Strong asymptotics of planar orthogonal polynomials: Gaussian weight perturbed by finite number of point charges

Seung Yeop Lee (University of South Florida)


Secular coefficients and the holomorphic multiplicative chaos

Joseph Najnudel (University of Bristol)

We study the coefficients of the characteristic polynomial (also called secular coefficients) of random unitary matrices drawn from the Circular Beta Ensemble (i.e., the joint probability density of the eigenvalues is proportional to the product of the mutual distances between the points raised to the power beta). We study the behavior of the secular coefficients when the degree of the coefficient and the dimension of the matrix tend to infinity. The order of magnitude of these coefficients depends on the value of the parameter beta; in particular, for beta = 2, we show that the middle coefficient of the characteristic polynomial of the Circular Unitary Ensemble converges to zero in probability as the dimension goes to infinity, which solves an open problem of Diaconis and Gamburd. We also find a limiting distribution for some renormalized coefficients in the case beta > 4. In order to prove our results, we introduce a holomorphic version of the Gaussian multiplicative chaos, and we also make a connection with random permutations following the Ewens measure.

Q&A for Invited Session 27

This talk does not have an abstract.

Session Chair

Ji Oon Lee (Korea Advanced Institute of Science and Technology (KAIST))

Invited 28

Statistical Inference for Graphs and Networks (Organizer: Betsy Ogburn)

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

A goodness-of-fit test for exponential random graphs

Gesine Reinert (University of Oxford)

For assessing the goodness of fit of a model, independent replicas are often assumed. When the data are given in the form of a network, usually only one network is available. If the data are hypothesised to come from an exponential random graph model, the likelihood cannot be calculated explicitly. Using Stein's method, we introduce a kernelized goodness-of-fit test and illustrate its performance.

This talk is based on joint work with Nathan Ross and with Wenkai Xu.

Networks in the presence of informative community structure

Alexander Volfovsky (Duke University)

The study of network data in the social and health sciences frequently concentrates on associating covariate information with edge formation and assessing the relationship between network information and individual outcomes. In much of this data, it is likely that latent or observed community structure plays an important role. In this talk we describe how to incorporate this community information into a class of latent space models by allowing the effects of covariates on edge formation to differ between communities (e.g., age might play a different role in friendship formation in different communities across a city). This information is lost by ignoring explicit community membership, and we show that ignoring such structure can lead to over- or underestimation of covariate importance to edge formation. We further demonstrate that when designing experiments on networks, if outcomes of interest are community driven (e.g., differential response to a treatment based on community behavior), incorporating this structure directly into the randomization procedure leads to an improvement in the ability to estimate causal effects.

Motif estimation via subgraph sampling: the fourth-moment phenomenon

Bhaswar Bhattacharya (University of Pennsylvania)

Network sampling has emerged as an indispensable tool for understanding features of large-scale complex networks where it is practically impossible to search or query over all the nodes. Examples include social networks, biological networks, internet and communication networks, and socio-economic networks, among others. In this talk we will discuss a unified framework for statistical inference for counting motifs, such as edges, triangles, and wedges, in the widely used subgraph sampling model. In particular, we will provide precise conditions for the consistency and the asymptotic normality of the natural Horvitz-Thompson (HT) estimator, which can be used for constructing confidence intervals and hypothesis tests for the motif counts. As a consequence, an interesting fourth-moment phenomenon for the asymptotic normality of the HT estimator and connections to fundamental results in random graph theory will emerge.
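For concreteness, under Bernoulli($p$) vertex sampling a copy of a motif $H$ on $r$ vertices survives in the sampled induced subgraph with probability $p^r$, so the HT estimator referred to above inverse-weights the observed motif count:

```latex
% \hat V: sampled vertex set; G_n[\hat V]: induced subgraph.
% Unbiasedness: E\, N(H, G_n[\hat V]) = p^r N(H, G_n).
\widehat{N}(H, G_n) \;=\; p^{-r}\, N\!\bigl(H,\, G_n[\hat V]\bigr),
\qquad r = |V(H)| .
```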

Q&A for Invited Session 28

This talk does not have an abstract.

Session Chair

Betsy Ogburn (Johns Hopkins University)

Invited 31

Information Theory and Concentration Inequalities (Organizer: Chandra Nair)

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Algorithmic optimal transport in Euclidean spaces

Salman Beigi (Institute for Research in Fundamental Sciences (IPM))

Transportation cost inequalities in product spaces put an upper bound on the distance that a random point in the space must traverse in order to reach a given target subset of the space. The main question in this talk is whether, given the random starting point, the target point can be found algorithmically. This is a hard problem in general, and its answer depends on the underlying product space and its metric. In this talk, after motivating this problem via applications in learning theory, answers to this question are given for Euclidean spaces. A main tool in the design and analysis of our algorithm is the tensorization property of transportation cost inequalities.

This talk is based on a joint work with Omid Etesami and Amin Gohari.

Entropy bounds for discrete log-concave distributions

Sergey Bobkov (University of Minnesota)

We will be discussing two-sided bounds for concentration functions and Renyi entropies in the class of discrete log-concave probability distributions. They are used to derive certain variants of the entropy power inequalities.

The talk is based on a joint work with Arnaud Marsiglietti and James Melbourne.

Entropy and convex geometry

Tomasz Tkocz (Carnegie Mellon University)

I shall survey several problems emerging from the interplay between convex geometry and information theory, pertaining mainly to reverse entropy power inequalities.

(Based mainly on joint works with Ball, Madiman, Melbourne, Nayar.)

Q&A for Invited Session 31

This talk does not have an abstract.

Session Chair

Chandra Nair (Chinese University of Hong Kong)

Organized 12

Recent Developments for Dependent Data (Organizer: Mikyoung Jun)

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

DeepKriging: spatially dependent deep neural networks for spatial prediction

Ying Sun (King Abdullah University of Science and Technology (KAUST))

In spatial statistics, a common objective is to predict the values of a spatial process at unobserved locations by exploiting spatial dependence. In geostatistics, Kriging provides the best linear unbiased predictor using covariance functions and is often associated with Gaussian processes. However, when considering non-linear prediction for non-Gaussian and categorical data, the Kriging prediction is not necessarily optimal, and the associated variance is often overly optimistic. We propose to use deep neural networks (DNNs) for spatial prediction. Although DNNs are widely used for general classification and prediction, they have not been studied thoroughly for data with spatial dependence. In this work, we propose a novel neural network structure for spatial prediction by adding an embedding layer of spatial coordinates with basis functions. We show in theory that the proposed DeepKriging method has multiple advantages over Kriging and over classical DNNs that use only spatial coordinates as features. We also provide density prediction for uncertainty quantification without any distributional assumption, and apply the method to PM2.5 concentrations across the continental United States.
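The core architectural idea, augmenting the network input with an embedding of the coordinates through spatial basis functions, can be sketched as follows (Gaussian radial bases at random knots and a generic MLP are illustrative stand-ins for the multi-resolution construction in the paper):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def basis_embedding(coords, n_knots=100, seed=0):
    """Embed 2-D coordinates with Gaussian radial basis functions at
    random knots; knot placement and basis family are assumptions."""
    rng = np.random.default_rng(seed)
    knots = rng.uniform(coords.min(0), coords.max(0),
                        size=(n_knots, coords.shape[1]))
    bw = np.ptp(coords) / np.sqrt(n_knots)       # crude bandwidth choice
    d2 = ((coords[:, None, :] - knots[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

rng = np.random.default_rng(1)
s = rng.uniform(size=(500, 2))                   # spatial locations
y = np.sin(6 * s[:, 0]) * np.cos(6 * s[:, 1]) + 0.1 * rng.standard_normal(500)
Z = basis_embedding(s)                           # "embedding layer"
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000,
                     random_state=0).fit(Z, y)
print(model.score(Z, y))                         # in-sample fit
```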

A model-free subsampling method based on minimum energy criterion

Wenlin Dai (Renmin University of China)

The extraordinary amounts of data generated in science today pose heavy demands on computational resources and time, which hinders the implementation of various statistical methods. An efficient and popular strategy for downsizing data volumes, and hence alleviating these challenges, is subsampling. However, existing methods either rely on specific assumptions about the underlying model or acquire only partial information from the available data. We propose a novel approach, termed adaptive subsampling based on the minimum energy criterion (ASMEC). The proposed method requires no explicit model assumptions and 'smartly' incorporates information on covariates and responses. ASMEC subsamples possess two desirable properties: space-filling and spatial adaptiveness to the full data. We investigate the theoretical properties of the ASMEC estimator under the smoothing spline regression model and show that it converges at a rate identical to that of two recently proposed basis selection methods. The effectiveness and robustness of the ASMEC approach are also supported by a variety of simulated examples and two real-life examples.

Global wind modeling with transformed Gaussian processes

Jaehong Jeong (Hanyang University)

Uncertainty quantification of wind energy potential from climate models can be limited because it requires considerable computational resources and is time-consuming. We propose a stochastic generator that aims at reproducing the data-generating mechanism of climate ensembles for global annual, monthly, and daily wind data. Inference is based on a multi-step conditional likelihood approach that balances memory storage and distributed computation for a large data set. Finally, we discuss a general framework for modeling non-Gaussian multivariate stochastic processes by transforming underlying multivariate Gaussian processes.

Threshold estimation for continuous three-phase polynomial regression models with constant mean in the middle regime

Chih-Hao Chang (National University of Kaohsiung)

This talk considers a continuous three-phase polynomial regression model with two threshold points for dependent data with heteroscedasticity. We assume the model is polynomial of order zero in the middle regime, and polynomial of higher order elsewhere. We denote this model by M2; it includes models with one or no threshold point, denoted by M1 and M0, respectively, as special cases. We provide an ordered iterative least squares (OiLS) method for estimating M2 and establish the consistency of the OiLS estimators under mild conditions. We also apply a model-selection procedure for selecting Mk, k=0,1,2. When the underlying model exists, we establish the selection consistency under the aforementioned conditions. Finally, we conduct simulation experiments to demonstrate the finite-sample performance of our asymptotic results.

Q&A for Organized Contributed Session 12

0
This talk does not have an abstract.

Session Chair

Mikyoung Jun (University of Houston)

Organized 16

Non-Euclidean Statistical Inference (Organizer: Young Kyung Lee)

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Functional linear regression model with randomly censored data: predicting conversion time to Alzheimer's disease

Seong Jun Yang (Jeonbuk National University)

3
Predicting the onset time of Alzheimer's disease is of great importance in preventive medicine. Structural changes in brain regions have been actively investigated in association studies of Alzheimer's disease diagnosis and prognosis. In this study, we propose a functional linear regression model to predict the conversion time to Alzheimer's disease among mild cognitive impairment patients. Vertical thickness change in the corpus callosum is measured from magnetic resonance imaging scans and entered into the model as a functional covariate. A synthetic response approach is taken to deal with the censored data. The simulation studies demonstrate that the proposed model successfully predicts the unobserved true survival time, but indicate that a high censoring rate may lead to poor prediction of the conversion time. In an application to ADNI data, we find that atrophy in the rear area of the corpus callosum is a possible neuroimaging marker for Alzheimer's disease prognosis.
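
One standard way to implement a synthetic response approach is the Koul-Susarla-Van Ryzin transformation, sketched below with a Kaplan-Meier estimate of the censoring distribution; the authors' exact transformation may differ.

```python
# Sketch: Koul-Susarla-Van Ryzin synthetic responses for right-censored data.
import numpy as np

def km_censoring_survival(T, delta):
    """Kaplan-Meier estimate of P(C > t) at each T_i, treating
    censoring (delta == 0) as the event."""
    order = np.argsort(T)
    T_s, d_s = T[order], delta[order]
    n = len(T)
    surv, s = np.ones(n), 1.0
    for i in range(n):
        at_risk = n - i
        if d_s[i] == 0:                    # a censoring "event"
            s *= 1.0 - 1.0 / at_risk
        surv[i] = s
    out = np.empty(n)
    out[order] = surv
    return out

rng = np.random.default_rng(3)
Y = rng.weibull(1.5, 200)                  # true survival times (toy data)
C = rng.exponential(2.0, 200)              # censoring times
T = np.minimum(Y, C)
delta = (Y <= C).astype(int)

G_bar = km_censoring_survival(T, delta)
Y_synth = delta * T / np.maximum(G_bar, 1e-8)  # synthetic response
print(Y.mean(), Y_synth.mean())            # similar means under independent censoring
```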

Deconvolution estimation on hyperspheres

Jeong Min Jeon (Katholieke Universiteit Leuven)

7
This paper considers nonparametric estimation with contaminated data observed on the unit hypersphere $S^d$. For such data, we consider deconvolution density estimation and regression analysis. Our methodology and theory are based on harmonic analysis on $S^d$, which is largely unexplored in statistics. We establish novel deconvolution density and regression estimators, and study their asymptotic properties, including rates of convergence and asymptotic distributions. We also provide asymptotic confidence intervals. We present practical details on implementation as well as the results of numerical studies.

Confidence band for persistent homology of KDEs

Jisu Kim (Inria)

3
The persistent homology of the upper level sets of a probability density function quantifies the salient topological features of data. Such a target quantity can be well estimated using the persistent homology of the upper level sets of a kernel density estimator (KDE). In this talk, I will present how a confidence band can be computed for determining the significance of the topological features in the persistent homology of KDEs, based on the bootstrap procedure. First, I will present how the confidence band can be computed for the persistent homology of KDEs computed on a grid. In practice, however, computing the persistent homology on a grid is infeasible when the dimension of the ambient space is high or when topological features live at different scales. Hence, I will consider the persistent homology of KDEs on Vietoris-Rips complexes over the sample points. I will describe how to construct a valid confidence band for the persistent homology of KDEs on Vietoris-Rips complexes based on the bootstrap procedure.
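
A minimal grid-based sketch of the bootstrap band, in the spirit of Fasy et al. (2014), on assumed toy data and with an assumed bandwidth:

```python
# Sketch: bootstrap confidence band for the persistent homology of a KDE.
# Bootstrap the sup-norm of the KDE difference; topological features with
# persistence above the band are deemed significant.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
X = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(2, 0.5, 150)])
grid = np.linspace(-5, 5, 400)

kde = gaussian_kde(X, bw_method=0.3)
p_hat = kde(grid)

B, sup_diffs = 200, []
for _ in range(B):
    Xb = rng.choice(X, size=len(X), replace=True)
    sup_diffs.append(np.max(np.abs(gaussian_kde(Xb, bw_method=0.3)(grid) - p_hat)))

c_alpha = np.quantile(sup_diffs, 0.95)   # half-width of the 95% band
print("band half-width:", c_alpha)
# In the persistence diagram of p_hat's upper level sets, features whose
# persistence (death - birth) exceeds 2 * c_alpha are deemed significant.
```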

Analysis of chemical-gene bipartite network via a user-based collaborative filtering method incorporating chemical structure information

Namgil Lee (Kangwon National University)

2
Drug repositioning refers to finding new applications and different uses for known drugs. In this study, we introduce a network analysis approach for drug repositioning. In particular, we introduce a user-based collaborative filtering method for analyzing bipartite networks between chemicals and genes. Moreover, under the assumption that structural similarity between chemicals is deeply related to functional similarity, an improved measure of similarity between chemicals is proposed. Numerical experiments are conducted to evaluate the statistical significance of the proposed method on the CTD database.
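
A minimal sketch of the scoring rule such a method might use, with an assumed blend of interaction-profile similarity and structural similarity; the blending weight and score rule are illustrative, not the authors' exact formulation.

```python
# Sketch: user-based collaborative filtering on a chemical-gene interaction
# matrix, with chemical-chemical similarity blended from interaction profiles
# and (toy) structural similarity.
import numpy as np

rng = np.random.default_rng(5)
R = (rng.random((30, 50)) < 0.1).astype(float)   # chemicals x genes interactions
S_struct = rng.random((30, 30))
S_struct = (S_struct + S_struct.T) / 2           # toy structural similarity

def cosine_sim(M):
    norm = np.linalg.norm(M, axis=1, keepdims=True) + 1e-12
    U = M / norm
    return U @ U.T

alpha = 0.5                                      # blend weight (assumed)
S = alpha * cosine_sim(R) + (1 - alpha) * S_struct
np.fill_diagonal(S, 0.0)

scores = S @ R / (S.sum(axis=1, keepdims=True) + 1e-12)  # predicted affinities
cand = np.where(R == 0, scores, -np.inf)         # rank unobserved pairs only
i, j = np.unravel_index(np.argmax(cand), cand.shape)
print(f"top repositioning candidate: chemical {i}, gene {j}, score {cand[i, j]:.3f}")
```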

Q&A for Organized Contributed Session 16

0
This talk does not have an abstract.

Session Chair

Young Kyung Lee (Kangwon National University)

Contributed 02

Financial Mathematics and Probabilistic Modeling

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Solving the selection-recombination equation: ancestral lines and duality

Frederic Alberti (Bielefeld University)

4
The selection-recombination equation is a high-dimensional, nonlinear system of ordinary differential equations that describes the evolution of the genetic type composition of a population under selection and recombination, in a law of large numbers regime. So far, explicit solutions have seemed out of reach; only in the special case of three loci, with selection acting on one of them, has an approximate solution been found, but without an obvious path to generalisation.
We consider the case of an arbitrary number of neutral loci, linked to a single selected locus. In this setting, we investigate how the (random) genealogical structure of the problem can be succinctly encoded by a novel `ancestral initiation graph', and how it gives rise to a recursive integral representation of the solution with a clear, probabilistic interpretation.

References:

-F. Alberti and E. Baake, Solving the selection-recombination equation: Ancestral lines under selection and recombination, https://arxiv.org/abs/2003.06831

-F. Alberti, E. Baake and C. Herrmann, Selection, recombination, and the ancestral initiation graph, https://arxiv.org/abs/2101.10080

Short time asymptotics for modulated rough stochastic volatility models

Barbara Pacchiarotti (Università degli studi di Roma "Tor Vergata")

2
In this paper, we establish a small time large deviation principle for log-price processes when the volatility is a function of a modulated Volterra process. By a modulated process we mean a Volterra process whose self-similar kernel is multiplied by a slowly varying function. We also deduce short time asymptotics for implied volatility and for pricing.

How to detect a salami slicer: a stochastic controller-stopper game with unknown competition

Kristoffer Lindensjö (Stockholm University)

3

Q&A for Contributed Session 02

0
This talk does not have an abstract.

Session Chair

Hyungbin Park (Seoul National University)

Contributed 07

SDEs and Fractional Brownian Motions

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Weak rough-path type solutions for singular Lévy SDEs

Helena Katharina Kremp (Freie Universität Berlin)

4
Since the works by Delarue and Diel, and by Cannizzaro and Chouk (in the Brownian noise setting), and our previous work, the existence and uniqueness of solutions to the martingale problem associated with multidimensional SDEs with additive $\alpha$-stable Lévy noise, for $\alpha \in (1,2]$ and rough Besov drift of regularity $\beta \in ((2-2\alpha)/3, 0]$, has been known. Motivated by the equivalence of probabilistic weak solutions to SDEs with bounded, measurable drift and solutions to the martingale problem, we define a (non-canonical) weak solution concept for singular Lévy diffusions, proving moreover its equivalence to the martingale solution in both the Young regime (i.e. $\beta > (1-\alpha)/2$) and the rough regime (i.e. $\beta > (2-2\alpha)/3$). This turns out to be highly non-trivial in the rough case and forces us to define the rough stochastic sewing integrals involved. In particular, we show that the canonical weak solution concept (introduced also by Athreya, Butkovsky and Mytnik in the Young case), which is well-posed in the Young case, yields non-uniqueness of solutions in the rough case.

Functional limit theorems for approximating irregular SDEs, general diffusions and their exit times

Mikhail Urusov (University of Duisburg-Essen)

4
We propose a new approach for approximating one-dimensional continuous Markov processes in law. More specifically, we discuss the following results:
(1) A functional limit theorem (FLT) for weak approximation of the paths of arbitrary continuous Markov processes;
(2) An FLT for weak approximation of the paths and exit times.
The second FLT has a stronger conclusion but requires a stronger assumption, which is essential. We propose a new scheme, called EMCEL, which satisfies the assumption of the second FLT and thus allows one to approximate every one-dimensional continuous Markov process together with its exit times. The approach is illustrated by a couple of examples with peculiar behavior: an irregular SDE for which the corresponding Euler scheme does not converge even weakly, a sticky Brownian motion, and a Brownian motion slowed down on the Cantor set.

This is joint work with Stefan Ankirchner and Thomas Kruse.

Q&A for Contributed Session 07

0
This talk does not have an abstract.

Session Chair

Ildoo Kim (Korea University)

Contributed 28

Neural Networks and Deep Learning

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Simulated Annealing-Backpropagation Algorithm on Parallel Trained Maxout Networks (SABPMAX) in detecting credit card fraud

Sheila Mae Golingay (University of the Philippines-Diliman)

5
Based on the backpropagation (BP) artificial neural network algorithm, this study introduces the idea of combining it with simulated annealing (SA), a global search algorithm, and proposes a new neural network algorithm: the Simulated Annealing-Backpropagation Algorithm on Parallel Trained Maxout Networks (SABPMAX). The proposed algorithm improves numerical stability and evaluation measures in detecting credit card fraud. It uses the global search capability of SA and the precise local search of backpropagation to improve the initial weights of the network and thereby the detection of credit card fraud. Several models were built and tested using different fraud distributions, and separate applications of the BP and SABPMAX algorithms were compared. Numerical results show a higher accuracy rate, higher sensitivity, shorter computing time, and overall better performance for the SABPMAX algorithm.

The smoking gun: statistical theory improves neural network estimates

Sophie Langer (Technische Universität Darmstadt)

5
In this talk we analyze the $L_2$ error of neural network regression estimates with one hidden layer. Under the assumption that the Fourier transform of the regression function decays suitably fast, we show that an estimate, where all initial weights are chosen according to proper uniform distributions and where the weights are learned by gradient descent, achieves a rate of convergence of $1/\sqrt{n}$ (up to a logarithmic factor). Our statistical analysis implies that the key aspect behind this result is the proper choice of the initial inner weights and the adjustment of the outer weights via gradient descent. This indicates that we can also simply use linear least squares to choose the outer weights. We prove a corresponding theoretical result and compare our new linear least squares neural network estimate with standard neural network estimates on simulated data. Our simulations show that our theoretical considerations lead to an estimate with improved performance. Hence the development of statistical theory can indeed improve neural network estimates.
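
The estimation principle in the middle of the abstract is easy to sketch: draw the inner weights of a one-hidden-layer ReLU network from uniform distributions and fit only the outer weights by linear least squares. The ranges and sizes below are illustrative assumptions.

```python
# Sketch: random uniform inner weights, outer weights by least squares.
import numpy as np

rng = np.random.default_rng(6)
n, d, K = 500, 3, 200                      # samples, input dim, hidden units
X = rng.uniform(-1, 1, (n, d))
y = np.sin(X @ np.array([2.0, -1.0, 0.5])) + 0.1 * rng.standard_normal(n)

W = rng.uniform(-5, 5, (K, d))             # random inner weights (kept fixed)
b = rng.uniform(-5, 5, K)
H = np.maximum(X @ W.T + b, 0.0)           # hidden-layer ReLU features

beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # outer weights by least squares
y_hat = H @ beta
print("train MSE:", np.mean((y - y_hat) ** 2))
```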

Stochastic block model for multiple networks

Tabea Rebafka (Sorbonne Université)

5
A model-based approach for the analysis of a collection of observed networks is considered. We propose to fit a stochastic block model to the data. The novelty is that we analyze not a single network but multiple networks. The major challenge resides in the development of a computationally efficient algorithm. Our method is an agglomerative algorithm, based on the integrated classification likelihood criterion, that performs model selection and node clustering simultaneously. Compared to the single-network context, an additional difficulty resides in the necessity to compare networks with one another and to aggregate partial solutions. We propose a distance measure to compare stochastic block models and solve the label switching problem among graphs in a computationally efficient way.

Deep neural networks for faster nonparametric regression models

Mehmet Ali Kaygusuz (The Middle East Technical University)

5
Deep neural networks have attracted attention in recent years owing to their huge success in application areas such as signal processing, biological networks, and time series analysis. Schmidt-Hieber (2020) studied feedforward neural networks with sparsity and the ReLU activation function for generalized additive models (GAMs). However, over-parametrization, where the number of parameters exceeds the number of samples, can be challenging, as studied by Bauer and Kohler (2019). We therefore use bootstrap methods to cope with this problem, since bootstrap methods (Efron, 1979) are computationally fast and reduce variance. Specifically, we propose the smooth bootstrap method (Sen et al., 2010), which can be more appropriate for nonparametric regression, capturing the nonlinearity and the interactions between variables and resulting in a better bias-variance trade-off. Combining the bootstrap and multilayer neural networks with GAM approaches, we also aim to optimize model selection in GAMs via distinct model selection criteria, namely the consistent Akaike information criterion with Fisher matrix and information complexity (Bozdogan, 1987). We evaluate the performance of all suggested models on protein-protein interaction network datasets of different dimensions and on biomedical signal data, in terms of various accuracy measures.
[1] Bauer, B. and Kohler, M., "On deep learning as a remedy for the curse of dimensionality in nonparametric regression", The Annals of Statistics, 47(4), 2019, 2261-2285.
[2] Efron, B., "Bootstrap methods: another look at the jackknife", The Annals of Statistics, 7(1), 1979, 1-26.
[3] Bozdogan, H., "Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions", Psychometrika, 52(3), 1987, 345-370.
[4] Sen, B., Banerjee, M. and Woodroofe, M., "Inconsistency of bootstrap: The Grenander estimator", The Annals of Statistics, 38(4), 2010, 1953-1977.
[5] Schmidt-Hieber, J., "Nonparametric regression using deep neural networks with ReLU activation function", The Annals of Statistics, 48(4), 2020, 1875-1897.

Generative model for fBm with deep ReLU neural networks

Michael Allouche (Ecole Polytechnique)

5
Over the last few years, a new paradigm of generative models based on neural networks has shown impressive results in simulating, with high fidelity, objects in high dimension, while being fast in the simulation phase. In this work, we focus on the simulation of continuous-time processes (infinite-dimensional objects) in the Generative Adversarial Networks (GANs) setting. More precisely, we focus on fractional Brownian motion, a centered Gaussian process with a specific covariance function. Since its stochastic simulation is known to be quite delicate, having at hand a generative model for full paths is appealing for practical use. However, designing the architecture of such neural network models is a very difficult question and is therefore often left to empirical search. We provide a high-confidence bound on the uniform approximation of fractional Brownian motion $(B^H(t): t\in[0,1])$ with Hurst parameter $H$ by a deep feedforward ReLU neural network fed with an $N$-dimensional Gaussian vector, with bounds on the network construction (number of hidden layers and total number of neurons). Our analysis relies, in the standard Brownian motion case ($H=1/2$), on the Lévy construction of $B^H$ and, in the general fractional Brownian motion case ($H \neq 1/2$), on the Lemarié-Meyer wavelet representation of $B^H$. This work gives theoretical support for using, and guidelines for constructing, new generative models based on neural networks for simulating stochastic processes. It may well open the way to handling more complicated stochastic models written as a stochastic differential equation driven by fractional Brownian motion.
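
For the $H=1/2$ case, the connection between the Lévy construction and ReLU networks can be made concrete: each Schauder hat function is a combination of three ReLUs, so a truncated Lévy expansion is exactly a one-hidden-layer ReLU network fed with Gaussian noise. A minimal sketch, with the truncation level chosen arbitrarily:

```python
# Sketch: Levy (Faber-Schauder) construction of Brownian motion on [0, 1],
# with each Schauder hat written as a combination of three ReLUs.
import numpy as np

relu = lambda x: np.maximum(x, 0.0)

def hat(x):
    # unit-height hat supported on [0, 1], built from ReLUs
    return 2.0 * (relu(x) - 2.0 * relu(x - 0.5) + relu(x - 1.0))

def brownian_relu(t, J, rng):
    B = rng.standard_normal() * relu(t)          # linear term xi_0 * t
    for j in range(J):
        for k in range(2 ** j):
            xi = rng.standard_normal()
            B += 2.0 ** (-j / 2 - 1) * xi * hat(2.0 ** j * t - k)
    return B

rng = np.random.default_rng(7)
t = np.linspace(0, 1, 1025)
path = brownian_relu(t, J=10, rng=rng)           # one approximate BM path

samples = [brownian_relu(np.array([1.0]), 6, np.random.default_rng(s))[0]
           for s in range(2000)]
print("Var B(1) ~", np.var(samples))             # should be close to 1
```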

Q&A for Contributed Session 28

0
This talk does not have an abstract.

Session Chair

Jong-June Jeon (University of Seoul)

Poster I-2

Poster Session I-2

Conference
9:30 PM — 10:00 PM KST
Local
Jul 20 Tue, 8:30 AM — 9:00 AM EDT

Geometrically Adapted Langevin Algorithm (GALA) for Markov Chain Monte Carlo (MCMC) simulations

Mariya Mamajiwala (University College London)

3
MCMC is a class of methods to sample from a given probability distribution. Of its myriad variants, the one based on the simulation of Langevin dynamics, which approaches the target distribution asymptotically, has gained prominence. The dynamics is captured by a Stochastic Differential Equation (SDE), with the drift term given by the gradient of the log-likelihood function with respect to the parameters of the distribution. However, the unbounded variation of the noise (i.e. the diffusion term) tends to slow down convergence, which limits the usefulness of the method. By recognizing that the solution of the Langevin dynamics may be interpreted as evolving on a suitably constructed Riemannian manifold (RM), considerable improvement in the performance of the method can be realised. Specifically, based on the notion of stochastic development, a concept available in the differential-geometric treatment of SDEs, we propose a geometrically adapted variant of MCMC. Unlike the standard Euclidean case, in our setting the drift term in the modified MCMC dynamics is constrained within the tangent space of an RM defined through the Fisher information metric and the related connection. We show, through extensive numerical simulations, how such a mathematically tenable geometric restriction of the flow enables significantly faster and more accurate convergence of the algorithm.
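
A much-simplified sketch of the underlying idea, using a fixed, position-independent Fisher-information-style metric to precondition an unadjusted Langevin sampler; the actual GALA construction uses stochastic development on the Riemannian manifold, which this sketch does not attempt.

```python
# Sketch: Langevin MCMC preconditioned by a fixed metric G on an
# ill-conditioned Gaussian target.
import numpy as np

rng = np.random.default_rng(8)
Sigma = np.array([[1.0, 0.95], [0.95, 1.0]])     # ill-conditioned Gaussian target
Sigma_inv = np.linalg.inv(Sigma)
grad_logp = lambda x: -Sigma_inv @ x

G = Sigma_inv                                    # metric = Fisher information here
G_inv = np.linalg.inv(G)
G_inv_sqrt = np.linalg.cholesky(G_inv)

eps, n_steps = 0.3, 5000
x, chain = np.zeros(2), []
for _ in range(n_steps):
    # unadjusted preconditioned Langevin step
    x = (x + 0.5 * eps**2 * G_inv @ grad_logp(x)
         + eps * G_inv_sqrt @ rng.standard_normal(2))
    chain.append(x.copy())

chain = np.array(chain)
print("sample covariance:\n", np.cov(chain.T))   # approaches Sigma, up to
                                                 # discretization bias
```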

Bayes estimation for the Weibull distribution under generalized adaptive hybrid progressive censored competing risks data

Yeongjae Seong (Daegu University)

2
Adaptive progressive hybrid censoring schemes have become quite popular in reliability and lifetime-testing studies. However, a drawback of the adaptive progressive hybrid censoring scheme is that it might take a very long time to complete the life test. For this reason, the generalized adaptive progressive hybrid censoring scheme was introduced. In this research, a competing risks model is considered under a generalized adaptive progressive hybrid censoring scheme. When the failure times are Weibull distributed, maximum likelihood estimates for the unknown model parameters are established, and the associated existence and uniqueness are shown. The asymptotic distribution of the maximum likelihood estimators is used to construct approximate confidence intervals via the observed Fisher information matrix. Moreover, Bayes point estimates and highest posterior density credible intervals of the unknown parameters are also presented, and the Gibbs sampling technique is used to approximate the corresponding estimates.

Large deviations of mean-field interacting particle systems in a fast varying environment

Sarath Yasodharan (Indian Institute of Science)

2
We study large deviations of a “fully coupled” finite state mean-field interacting particle system in a fast varying environment. The empirical measure of the particles evolves in the slow time scale and the random environment evolves in the fast time scale. Our main result is the path-space large deviation principle for the joint law of the empirical measure process of the particles and the occupation measure process of the fast environment. This extends previous results known for two time scale diffusions to two time scale mean-field models with jumps. Our proof is based on the method of stochastic exponentials. We characterise the rate function by studying a certain variational problem associated with an exponential martingale.

Stochastic homogenisation of Gaussian fields

Leandro Chiarini (Utrecht University)

3
In this poster we prove the convergence of a sequence of random fields that generalise the Gaussian Free Field and bi-Laplacian field. Such fields are defined in terms of non-homogeneous elliptic operators which will be sampled at random. Under standard assumptions of stochastic homogenisation, we identify the limit fields as the usual GFF and bi-Laplacian fields up to a multiplicative constant.

Concentration inequality for U-statistics for uniformly ergodic Markov chains, and applications

Quentin Duchemin (Université Gustave Eiffel)

2

A Bayesian illness-death model to approach the incidence of recurrent hip fracture and death in elderly patients

Fran Llopis-Cardona (Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO))

2
Multi-state models are a broad class of stochastic process models in which individuals move between different states over time. These models are of special interest in survival analysis as they allow one to deal with a wide range of complex scenarios. We focus on the so-called illness-death model, which includes an initial state, an illness state, and a death state, and is considered a generalization of the competing risks framework. In an illness-death scenario, competing risks models involve time to illness and time to death but do not provide evidence on the transition from illness to death. Illness-death models add this transition, which makes them preferable when progression to death after a non-terminal disease is a relevant outcome. We use an illness-death model to study the evolution of patients who have suffered a hip fracture. The dataset comes from the PREV2FO cohort and includes 34,491 patients aged 65 years and older who were discharged alive after hospitalization for an osteoporotic hip fracture and followed until recurrent hip fracture and death. Transition times, from the initial fracture to refracture and death, and from refracture to death, are modelled via Cox proportional hazards models with Weibull baseline hazard functions. For simplicity, we adjusted for the covariates sex and age at discharge. The transition from refracture to death is defined with respect to the time from the initial fracture to refracture. We use a Bayesian approach to estimate the posterior distribution of the model parameters via Markov chain Monte Carlo (MCMC) methods. Based on this distribution, we estimate posterior distributions for the cumulative incidences of refracture and death, as well as transition probabilities, including the event-free probability, the probability of remaining in the refracture state, and the probability of death after refracture. We also estimate cause-specific hazard ratios to assess the effect of covariates on each transition.

The contact process with two types of particles and priority: metastability and convergence in infinite volume

Mariela Pentón Machado (Instituto de Matemática e Estatística, Universidade de São Paulo)

2
We consider a symmetric finite-range contact process on Z with two types of particles (or infections), which propagate according to the same supercritical rate and die (or heal) at rate 1. Particles of type 1 can occupy any site in $(-\infty,0]$ that is empty or occupied by a particle of type 2 and, analogously, particles of type 2 can occupy any site in $[1,+\infty)$ that is empty or occupied by a particle of type 1. We prove that this system exhibits two metastable states: one with the two species and the other one with the family that survives the competition. In addition, we study the convergence of the process when it is defined in infinite volume.

A nonparametric instrumental approach to endogeneity in competing risks models

Jad Beyhum (ORSTAT, Katholieke Universiteit Leuven)

3
This paper discusses endogenous treatment models with duration outcomes, competing risks and random right censoring. The endogeneity issue is solved using a discrete instrumental variable. We show that the competing risks model generates a non-parametric quantile instrumental regression problem. The cause-specific cumulative incidence, the cause-specific hazard and the subdistribution hazard can be recovered from the regression function. A distinguishing feature of the model is that censoring and competing risks prevent identification at some quantiles. We characterize the set of quantiles for which exact identification is possible and give partial identification results for other quantiles. We outline an estimation procedure and discuss its properties. The finite sample performance of the estimator is evaluated through simulations. We apply the proposed method to the Health Insurance Plan of Greater New York experiment.

Invited 02

Scaling Limits of Disordered Systems and Disorder Relevance (Organizer: Rongfeng Sun)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Exceptional geodesic pairs in the directed landscape

Erik Bates (University of Wisconsin-Madison)

5
Within the Kardar-Parisi-Zhang universality class, the space-time Airy sheet is conjectured to be the canonical scaling limit for last passage percolation models. In recent work of Dauvergne, Ortmann, and Virág, this object was constructed and shown to be the limit after parabolic correction of one such model: Brownian last passage percolation. This limit object, called the directed landscape, admits geodesic paths between any two space-time points $(x,s)$ and $(y,t)$ with $s < t$. Here we examine fractal properties of the set of these paths. Our main results concern exceptional endpoints admitting disjoint geodesics. First, we fix two distinct starting locations $x_1$ and $x_2$, and consider geodesics traveling $(x_1,0)\to(y,1)$ and $(x_2,0)\to(y,1)$. We prove that the set of $y\in\mathbb{R}$ for which these geodesics coalesce only at time 1 has Hausdorff dimension one-half. Second, we consider endpoints $(x,0)$ and $(y,1)$ between which there exist two geodesics intersecting only at times 0 and 1. We prove that the set of such $(x,y)\in\mathbb{R}^2$ also has Hausdorff dimension one-half. The proofs require several inputs of independent interest, including (i) connections to the so-called difference weight profile studied by Basu, Ganguly, and Hammond; and (ii) a tail estimate on the number of disjoint geodesics starting and ending in small intervals. The latter result extends the analogous estimate proved for the prelimiting model by Hammond.

This talk is based on joint work with Shirshendu Ganguly and Alan Hammond.

Disorder relevance and the continuum random field Ising model

Adam Bowditch (University College Dublin)

4
Since its introduction by Lenz in 1920, the Ising model has been one of the most studied statistical mechanics models. It has been particularly central in the theory of critical phenomena since Peierls famously proved that it undergoes a phase transition in dimension at least 2. We discuss the long considered question of whether this picture is changed by the addition of disorder acting as a small random external field and whether the model admits a disordered continuum limit.

A CLT for KPZ on the torus

Yu Gu (Carnegie Mellon University)

5
I will present joint work with Tomasz Komorowski on proving a central limit theorem for the KPZ equation on the torus.

Q&A for Invited Session 02

0
This talk does not have an abstract.

Session Chair

Rongfeng Sun (National University of Singapore)

Invited 07

High-dimensional Robustness (Organizer: Stanislav Minsker)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Distribution-free robust linear regression

Nikita Zhivotovskiy (Swiss Federal Institute of Technology Zürich)

5
We study random design linear regression with no assumptions on the distribution of the covariates and with a heavy-tailed response variable. When learning without assumptions on the covariates, we establish boundedness of the conditional second moment of the response variable as a necessary and sufficient condition for achieving the deviation-optimal rate of convergence of the excess risk. In particular, combining the ideas of truncated least squares, median-of-means procedures and aggregation theory, we construct a non-linear estimator achieving excess risk of order d/n with the optimal sub-exponential tail. While existing approaches to learning linear classes under heavy-tailed distributions focus on proper estimators, we highlight that the improperness of our estimator is necessary for attaining non-trivial guarantees in the distribution-free setting considered in this work. Finally, as a byproduct of our analysis, we prove an optimal version of the classical bound for the truncated least squares estimator due to Györfi, Kohler, Krzyzak, and Walk.
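
A minimal sketch of the median-of-means ingredient mentioned above, for a scalar mean with heavy-tailed data; the block count is an illustrative choice, and the paper's estimator combines this idea with truncation and aggregation in the regression setting.

```python
# Sketch: median-of-means estimation of a scalar mean.
import numpy as np

def median_of_means(x, n_blocks, rng):
    x = rng.permutation(x)
    blocks = np.array_split(x, n_blocks)      # n_blocks groups of roughly equal size
    return np.median([b.mean() for b in blocks])

rng = np.random.default_rng(9)
x = rng.standard_t(df=2.1, size=10_000)       # heavy-tailed, true mean 0
print("empirical mean:  ", x.mean())
print("median of means: ", median_of_means(x, n_blocks=50, rng=rng))
```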

Algorithmic high-dimensional robust statistics

Ilias Diakonikolas (University of Wisconsin-Madison)

4
Fitting a model to a collection of observations is one of the quintessential questions in statistics. The standard assumption is that the data was generated by a model of a given type (e.g., a mixture model). This simplifying assumption is at best only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted data. This is the prototypical goal in robust statistics, a field that took shape in the 1960s with the pioneering works of Tukey and Huber. Until recently, even for the basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. Moreover, the quality of the common heuristics degrades badly as the dimension increases. In this talk, we will survey the recent progress in algorithmic high-dimensional robust statistics. We will describe the first computationally efficient algorithms for robust mean and covariance estimation and the main insights behind them. We will also present practical applications of these estimators to exploratory data analysis and adversarial machine learning. Finally, we will discuss new directions and opportunities for future work.

Robust estimation of a mean vector with respect to any norm: minimax MOM and Stahel-Donoho median-of-means estimators

Guillaume Lecué (Center for Research in Economics and Statistics (CREST))

3

Q&A for Invited Session 07

0
This talk does not have an abstract.

Session Chair

Stanislav Minsker (University of Southern California)

Invited 08

Functional Data Analysis (Organizer: Aurore Delaigle)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Partially specified covariance operators and intrinsically functional graphical models

Victor Panaretos (École polytechnique fédérale de Lausanne)

7
Motivated by the problem of covariance recovery from functional fragments, we consider the problem of completing a partially specified covariance kernel on the unit square. By representing the underlying stochastic process as an undirected graphical model with uncountable vertices and edges, we show that a canonical completion always exists and can be explicitly described, under weak assumptions. For partial covariances specified on nearly banded domains containing the diagonal, we present necessary and sufficient conditions for unique completion, and characterise all completions under non-uniqueness. Finally, we show how the estimation of the canonical completion reduces to a system of ill-posed linear inverse problems in the space of Hilbert-Schmidt operators, and derive rates of convergence under standard source conditions.

Based on joint work with K. Waghmare (EPFL).

Domain selection for functional linear models: a dynamic RKHS approach

Jane-Ling Wang (University of California at Davis)

2
In the conventional scalar-on-function linear regression model, the entire trajectory of the predictor process on the whole domain is used to model the response variable. However, the response may only be associated with the covariate process X on a subdomain. We consider the problem of estimating the domain of association when the regression coefficient function is assumed to be nonzero on a subinterval. We propose a solution based on the reproducing kernel Hilbert space (RKHS) approach to estimate both the domain and the regression function, and we develop asymptotic theory for both estimators. A simulation study illustrates the effectiveness of the proposed approach.

Simultaneous Inference for function-valued parameters: A fast and fair approach

Dominik Liebl (University of Bonn)

4
Quantifying uncertainty using confidence regions is a central goal of statistical inference. Despite this, methodologies for confidence bands in Functional Data Analysis are underdeveloped compared to those for estimation and hypothesis testing. This work represents a major leap forward in this area by presenting a new methodology for constructing simultaneous confidence bands for functional parameter estimates. These bands possess a number of striking qualities: (1) they have a nearly closed-form expression, (2) they give nearly exact coverage, (3) they have a finite sample correction, (4) they do not require an estimate of the full covariance of the parameter estimate, and (5) they can be constructed adaptively according to a desired criterion. One option for choosing bands that we find especially interesting is the concept of fair bands, which allows us to do fair (or equitable) inference over subintervals and could be especially useful in longitudinal studies over long time scales. Our bands are constructed by integrating and extending tools from Random Field Theory, an area that has yet to overlap with Functional Data Analysis.

Q&A for Invited Session 08

0
This talk does not have an abstract.

Session Chair

Yunjin Choi (University of Seoul)

Invited 32

Statistical Learning (Organizer: Yichao Wu)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Equivariant Variance Estimation for Multiple Change-point Model

Ning Hao (University of Arizona)

4
The variance of noise plays an important role in many change-point detection procedures and the associated inferences. Most commonly used variance estimators require strong assumptions on the true mean structure or normality of the error distribution, which may not hold in applications.  In this talk, we introduce a framework of equivariant variance estimation for multiple change-point models. In particular, we characterize the set of all equivariant unbiased quadratic variance estimators for a family of change-point model classes, and develop a minimax theory for such estimators.
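
One simple member of the quadratic, translation-equivariant family under study is the classical first-difference estimator, which remains nearly unbiased when the mean is piecewise constant with few change points; a minimal sketch:

```python
# Sketch: difference-based variance estimation under mean shifts.
import numpy as np

def diff_variance(y):
    d = np.diff(y)
    # E[(y_{i+1} - y_i)^2] = 2 * sigma^2 except at the few change points
    return (d ** 2).sum() / (2 * (len(y) - 1))

rng = np.random.default_rng(10)
mu = np.concatenate([np.zeros(300), 3 * np.ones(400), np.ones(300)])
y = mu + 0.5 * rng.standard_normal(1000)
print(diff_variance(y))   # close to 0.25 despite the two mean shifts
```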

A forward approach for sufficient dimension reduction in binary classification

Seung Jun Shin (Korea University)

4
Since the seminal sliced inverse regression (SIR) was proposed, inverse-type methods have been canonical in sufficient dimension reduction (SDR). However, they often suffer in binary classification since the binary response yields at most two slices. In this talk, we develop a forward approach for SDR in binary classification based on weighted large-margin classifiers. We first show that the gradient of a large-margin classifier is unbiased for SDR as long as the corresponding loss function is Fisher consistent. This leads us to propose what we call the weighted outer-product of gradients (wOPG) method. The wOPG method can recover the central subspace exhaustively without the linearity or constant variance conditions routinely required for inverse-type methods. We analyze the asymptotic behavior of the proposed estimator and demonstrate its promising finite-sample performance on both simulated and real data examples.

Variable Selection for Global Fréchet Regression

Danielle Tucker (University of Illinois at Chicago)

3
Global Fréchet regression is an extension of linear regression to cover more general types of responses, such as distributions, networks and manifolds, which are becoming more prevalent. In such models, predictors are Euclidean while responses are metric space valued. Predictor selection is of major relevance for regression modeling in the presence of multiple predictors but has not yet been addressed for Fréchet regression. Due to the metric space valued nature of the responses, Fréchet regression models do not feature model parameters, and this lack of parameters makes it a major challenge to extend existing variable selection methods for linear regression to global Fréchet regression. In this talk, we share our recent work which addresses this challenge and proposes a novel variable selection method with good practical performance. We provide theoretical support and demonstrate that the proposed variable selection method achieves selection consistency. We also explore the finite sample performance of the proposed method with numerical examples and data illustrations.

Q&A for Invited Session 32

0
This talk does not have an abstract.

Session Chair

Yichao Wu (University of Illinois at Chicago)

Invited 41

Bernoulli Paper Prize Session (Organizer: Bernoulli Society)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Bernoulli Prize for an outstanding survey article in Probability: From infinite random matrices over finite fields to square ice

Leonid Petrov (University of Virginia)
(Chair: Ofer Zeitouni)

3
Asymptotic representation theory of the symmetric group is a rich and beautiful subject with deep connections with probability, mathematical physics, and algebraic combinatorics. I will discuss a one-parameter deformation of this theory related to infinite random matrices over a finite field, which has an interesting connection to the six vertex (square ice) model and traffic systems on a 1-dimensional lattice.

Bernoulli Journal Read Paper Award: A general frequency domain method for assessing spatial covariance structures

Soutir Bandyopadhyay (Colorado School of Mines)
(Chair: Richard Samworth)

4
When examining dependence in spatial data, it can be helpful to formally assess spatial covariance structures that may not be parametrically specified or fully model-based. That is, one may wish to test for general features regarding spatial covariance without presupposing any particular, or potentially restrictive, assumptions about the joint data distribution. Current methods for testing spatial covariance are often intended for specialized inference scenarios, usually with spatial lattice data. We propose instead a general method for estimation and testing of spatial covariance structure, which is valid for a variety of inference problems (including nonparametric hypotheses) and applies to a large class of spatial sampling designs with irregular data locations. In this setting, spatial statistics have limiting distributions with complex standard errors depending on the intensity of spatial sampling, the distribution of sampling locations, and the process dependence. The proposed method has the advantage of providing valid inference in the frequency domain without estimation of such standard errors, which are often intractable, and without particular distributional assumptions about the data (e.g., Gaussianity). To illustrate, we develop the method for formally testing isotropy and separability in spatial covariance and consider confidence regions for spatial parameters in variogram model fitting. A broad result is also presented to justify the method for application to other potential problems and general scenarios with testing spatial covariance. The approach uses spatial test statistics, based on an extended version of empirical likelihood, having simple chi-square limits for calibrating tests. We demonstrate the proposed method through several numerical studies.

Q&A for Invited Session 41

0
This talk does not have an abstract.

Session Chair

Ofer Zeitouni (Weizmann Institute of Science) / Richard Samworth (University of Cambridge)

Organized 06

Theoretical Analysis of Random Walks, Random Graphs and Clustering (Organizer: Ji Oon Lee)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Spectral large deviations for sparse random matrices

Kyeongsik Nam (University of California, Los Angeles)

7
The large deviation problem for the spectrum of random matrices has attracted immense interest. It was first studied for the GUE and GOE, which are exactly solvable, and subsequently for Wigner matrices with general distributions. Once sparsity is induced (i.e. each entry is multiplied by an independent Bernoulli Ber(p) variable), the eigenvalues can exhibit drastically different behavior. Constant average degree, p ~ 1/n, is an interesting case since universality breaks down in this regime of sparsity. In this talk, I will consider Gaussian ensembles with sparsity p = 1/n and discuss the typical behavior and large deviation estimates for the largest eigenvalue. Joint work with Shirshendu Ganguly.

Robust hypergraph clustering via convex relaxation of truncated MLE

Hye Won Chung (Korea Advanced Institute of Science and Technology (KAIST))

6
We study hypergraph clustering in the weighted $d$-uniform hypergraph stochastic block model ($d$-WHSBM), where each edge consisting of $d$ nodes from the same community has higher expected weight than edges consisting of nodes from different communities. We propose a new hypergraph clustering algorithm, called CRTMLE, and provide its performance guarantee under the $d$-WHSBM for general parameter regimes. We show that the proposed method achieves the order-wise optimal or the best existing results for approximately balanced community sizes. Moreover, our results settle the first recovery guarantees for a growing number of clusters of unbalanced sizes. Through theoretical analysis and empirical results, we demonstrate the robustness of our algorithm against unbalanced community sizes and the presence of outlier nodes.

Convergence rate to the Tracy-Widom laws for the largest eigenvalue of Wigner matrices

Kevin Schnelli (KTH Royal Institute of Technology)

6
In this talk we discuss quantitative versions of the edge universality for Wigner matrices and related random matrix models. The fluctuations of the largest rescaled eigenvalues for the Gaussian invariant ensembles are described by the Tracy-Widom laws. The universality of these laws has in recent years been established for many non-invariant random matrix models. We will present new results on the convergence rate to the universal laws for the largest eigenvalues of Wigner matrices and high-dimensional sample covariance matrices.

Attributed graph alignment

Lele Wang (University of British Columbia)

5
Motivated by various data science applications including de-anonymizing user identities in social networks, we consider the graph alignment problem, where the goal is to identify the vertex/user correspondence between two correlated graphs. Existing work mostly recovers the correspondence by exploiting the user-user connections. However, in many real-world applications, additional information about the users, such as user profiles, might be publicly available. In this paper, we introduce the attributed graph alignment problem, where additional user information, referred to as attributes, is incorporated to assist graph alignment. We establish sufficient and necessary conditions for recovering vertex correspondence exactly, where the conditions match for a wide range of practical regimes. Our results recover existing tight information-theoretic limits for models where only the user-user connections are available, and further span the full spectrum between these models and models where only attribute information is available.

This is joint work with Ning Zhang and Weina Wang.

Minkowski content for the scaling limit of loop-erased random walk in three dimensions

Xinyi Li (Peking University)

4
In this talk, we will discuss loop-erased random walk in three dimensions and its scaling limit, and briefly explain how to prove the existence of the Minkowski content of the latter and why it gives the scaling limit of the former in the natural parametrization. This is joint work with Daisuke Shiraishi (Kyoto).

Q&A for Organized Contributed Session 06

0
This talk does not have an abstract.

Session Chair

Ji Oon Lee (Korea Advanced Institute of Science and Technology (KAIST))

Organized 13

Recent Advances in Complex Time Series Analysis (Organizer: Haeran Cho)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Change points detection for high dimensional time series

Likai Chen (Washington University in Saint Louis)

2
We consider multiple change-point detection for high-dimensional time series. Asymptotic theory for testing the existence of breaks, and for the consistency and asymptotic distribution of the breakpoint statistics and estimated break sizes, will be provided. The theory backs up a simple two-step procedure for detecting and estimating multiple change-points. The proposed two-step procedure involves the maximum of a MOSUM (moving sum) type statistic in the first step and a CUSUM (cumulative sum) refinement step on an aggregated time series in the second step. Thus, for a fixed time-point, we can capture both the biggest break across different coordinates and aggregate simultaneous breaks over multiple coordinates. Extending existing high-dimensional Gaussian approximation theorems to dependent data with jumps, the theory allows us to characterize the size and power of our multiple change-point test asymptotically. Moreover, we can make inferences on the breakpoint estimates when the break sizes are small. Our theoretical setup incorporates both weak temporal and strong or weak cross-sectional dependence and is suitable for heavy-tailed innovations. A robust long-run covariance matrix estimator is proposed, which may be of independent interest. An application to detecting structural changes in the U.S. unemployment rate illustrates the usefulness of our method.
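
A one-dimensional caricature of the two-step procedure, with an illustrative window size and no formal threshold calibration:

```python
# Sketch: MOSUM scan to flag a candidate region, then a local CUSUM refinement.
import numpy as np

def mosum_stat(y, G):
    c = np.concatenate([[0.0], np.cumsum(y)])
    T = np.zeros(len(y))
    for t in range(G, len(y) - G):
        # |sum over (t, t+G]  -  sum over (t-G, t]| / sqrt(2G)
        T[t] = abs((c[t + G] - c[t]) - (c[t] - c[t - G])) / np.sqrt(2 * G)
    return T

def cusum_refine(y):
    n = len(y)
    c = np.cumsum(y)
    k = np.arange(1, n)
    stat = np.abs(c[:-1] - (k / n) * c[-1]) / np.sqrt(k * (n - k) / n)
    return int(np.argmax(stat)) + 1

rng = np.random.default_rng(11)
y = np.concatenate([np.zeros(200), 1.5 * np.ones(200)]) + rng.standard_normal(400)
G = 50
t0 = int(np.argmax(mosum_stat(y, G)))            # candidate region center
lo, hi = max(0, t0 - G), min(len(y), t0 + G)
print("refined change point:", lo + cusum_refine(y[lo:hi]))  # near 200
```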

Asymptotics of large autocovariance matrices

Monika Bhattacharjee (Indian Institute of Technology Bombay)

2
We consider the high-dimensional moving average process and explore the asymptotics for eigenvalues of its sample autocovariance matrices. Under quite weak conditions, we prove, in a unified way, that the limiting spectral distribution (LSD) of any symmetric polynomial in the sample autocovariance matrices, after suitable centering and scaling, exists and is non-degenerate. We use methods from free probability in conjunction with the method of moments to establish our results. In addition, we are able to provide a general description for the limits in terms of some freely independent variables. We also establish asymptotic normality results for the traces of these matrices. We suggest statistical uses of these results in problems such as order determination of high-dimensional MA and AR processes and testing of hypotheses for coefficient matrices of such processes.

Factor models for matrix-valued high-dimensional time series

Xialu Liu (San Diego State University)

2
In finance, economics and many other fields, observations in matrix form are often collected over time. For example, many economic indicators are obtained in different countries over time, and various financial characteristics of many companies are reported over time. Although it is natural to turn a matrix observation into a long vector and then use standard vector time series models or factor analysis, it is often the case that the columns and rows of a matrix represent different sets of information that are closely interrelated in a very structural way. We propose a novel factor model that maintains and utilizes the matrix structure to achieve greater dimension reduction as well as clearer and more interpretable factor structures. The estimation procedure and its theoretical properties are investigated and demonstrated with simulated and real examples.

Multi-level changepoint inference for periodic data sequences

Anastasia Ushakova (Lancaster University)

2
Existing changepoint approaches consider changepoints to occur linearly in time; one changepoint happens after another and they are not linked. However, data processes may have regularly occurring changepoints, e.g. a yearly increase in sales of ice-cream on the first hot weekend. Using linear changepoint approaches here will miss more global features such as a decrease in sales of ice-cream in favour of sorbet. Being able to tease these global changepoint features from the more local (periodic) ones is beneficial for inference. We propose a periodic changepoint model to model this behaviour using a mixture of a periodic and linear time perspective. Built around a Reversible Jump Markov Chain Monte Carlo sampler, the Bayesian framework is used to study the local (periodic) changepoint behaviour. To identify the optimal global changepoint positions we integrate the local changepoint model into the pruned exact linear time (PELT) search algorithm. We demonstrate that the method detects both local and global changepoints with high accuracy on simulated and motivating applications that share periodic behaviour. Due to the multi-level nature of the analysis, visualisation of the results can be challenging. We additionally provide a unique multi-level perspective for changepoint visualisations in data sequences.

Q&A for Organized Contributed Session 13

0
This talk does not have an abstract.

Session Chair

Haeran Cho (University of Bristol)

Contributed 10

Reflecting Diffusion Processes, Stochastic Networks and Their Applications (Organizer: Amber Puha)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Measure valued processes characterized by a field of reflecting Brownian motions arising from certain queuing problems

Amarjit Budhiraja (University of North Carolina)

5
We study a class of queuing models in which the state of the system at any instant is given by a finite nonnegative Borel measure on the nonnegative real line which puts a unit atom at the remaining processing time of each job in system. The settings where the processing time distributions of jobs have bounded support or light tails have been investigated in previous works. In the current work we study the case where these distributions have finite second moments and regularly varying tails. By considering a parameter given in terms of the tails of processing time distributions, we consider a novel time, volume, and spatial scaling for the measure valued process and show that the scaled measure valued process converges in distribution (in the space of paths of measures). In a sharp contrast to results for bounded support and light tailed service time distributions, this time there is no state space collapse and the limiting random measures are not concentrated on a single atom. Nevertheless, the description of the limit is simple and given explicitly in terms of a certain random field of reflected Brownian motions.

This is joint work with Sayan Banerjee and Amber Puha.

Asymptotic behavior of a critical fluid model for bandwidth sharing with general file size distributions

Yingjia Fu (University of California San Diego)

5
This work concerns the asymptotic behavior of solutions to a critical fluid model for a data communication network, where file sizes are generally distributed and the network operates under a fair bandwidth sharing policy, chosen from the family of (weighted) $\alpha$-fair policies introduced by Mo and Walrand. Solutions of the fluid model are measure-valued functions of time. Under law of large numbers scaling, Gromoll and Williams proved that these solutions approximate dynamic solutions of a flow level model for congestion control in data communication networks, introduced by Massoulié and Roberts. In a recent work, we proved stability of the strictly subcritical version of this fluid model under mild assumptions. In this talk, we study the asymptotic behavior (as time goes to infinity) of solutions of the critical fluid model, in which the nominal load on each network resource is less than or equal to its capacity and at least one resource is fully loaded. For this we introduce a new Lyapunov function, inspired by the work of Kelly and Williams, Mulvany et al. and Paganini et al. Using this, under moderate conditions on the file size distributions, we prove that critical fluid model solutions converge uniformly to the set of invariant states as time goes to infinity, when started in suitable relatively compact sets. We expect that this result will play a key role in developing a diffusion approximation for the critically loaded flow level model of Massoulié and Roberts. Furthermore, the techniques developed here may be useful for studying other stochastic network models with resource sharing.

Error bounds for the one-dimensional constrained Langevin approximation for density dependent Markov chains

Felipe Campos (University of California, San Diego)

5
The stochastic dynamics of chemical reaction networks are often modeled using continuous-time Markov chains. However, except in very special cases, these processes cannot be analysed exactly and their simulation can be computationally intensive. An approach to this problem is to consider a diffusion approximation. The Constrained Langevin Approximation (CLA) is a reflected diffusion approximation for stochastic chemical reaction networks proposed by Leite & Williams. In this work, we extend this approximation to (nearly) density dependent Markov chains, when the diffusion state space is one-dimensional. Then, we provide a bound for the error of the CLA in a strong approximation. Finally, we discuss some applications for chemical reaction networks and epidemic models, illustrating these with examples.

Joint work with Ruth Williams.
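
A minimal sketch of a one-dimensional constrained Langevin approximation for the birth-death network 0 -> S (rate b), S -> 0 (rate d x), with reflection at the boundary implemented by projection; the rates, step size, and projection scheme are illustrative choices, not the authors' construction.

```python
# Sketch: reflected Euler scheme for a 1-d constrained Langevin approximation.
import numpy as np

rng = np.random.default_rng(12)
b, d = 10.0, 1.0                  # birth rate, per-capita death rate
dt, n_steps = 0.01, 20_000
x, path = b / d, []
for _ in range(n_steps):
    drift = b - d * x
    diff = np.sqrt(max(b + d * x, 0.0))          # noise from both reaction channels
    x = x + drift * dt + diff * np.sqrt(dt) * rng.standard_normal()
    x = max(x, 0.0)               # reflect (project) at the boundary 0
    path.append(x)

print("long-run mean ~", np.mean(path), "; compare with b/d =", b / d)
```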

Obliquely reflecting diffusions in nonsmooth domains: some new uniqueness results

Cristina Costantini (University of Chieti-Pescara)

5
Exhaustive existence and uniqueness results are available for Brownian motion reflecting in a polyhedron with constant direction of reflection on each face (Varadhan and Williams, 1984; Dai and Williams, 1995) or in a smooth cone with radially constant direction of reflection (Kwon and Williams, 1991). Only partial results are available for reflecting diffusions in nonsmooth domains with curved boundaries and varying directions of reflection, although these situations come up in applications (see, e.g., Kang, Kelly, Lee and Williams, 2009 or Kang and Williams, 2012). This talk will present some recent, published and unpublished, existence and uniqueness results. We consider semimartingale reflecting diffusions, characterized as solutions of Stochastic Differential Equations with Reflection (SDERs). We obtain existence and uniqueness of the solution in a piecewise smooth, 2-dimensional domain, with a varying direction of reflection on each "side", under easily verifiable, geometrically meaningful conditions. In the case of a polygon with a constant direction of reflection on each side, our conditions coincide with Dai and Williams'. Moreover, we allow for cusps (Costantini and Kurtz, 2018) and for situations where two "sides" meet smoothly but the direction of reflection is discontinuous. We also obtain existence and uniqueness in a d-dimensional domain with one singular point (such as a smooth cone or "horn"), with a varying direction of reflection, under similar assumptions. The keystone of our arguments is a new reverse ergodic theorem for nonhomogeneous, possibly killed, Markov chains (Costantini and Kurtz, 2021), which is used in combination with a result on the existence of strong Markov solutions to SDERs (Costantini and Kurtz, 2019).

Q&A for Contributed Session 10

0
This talk does not have an abstract.

Session Chair

Ruth J. Williams (University of California at San Diego)

Contributed 16

Probability Theory and Statistical Mechanics

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Coexistence of localized Gibbs measures and delocalized gradient Gibbs measures on trees

Florian Henning (Ruhr-University Bochum)

2
In statistical mechanics, Gibbs measures for a spin system (a stochastic process) indexed by a countable graph under the influence of an interaction potential describe equilibrium distributions. They are defined in terms of compatibility with the Gibbsian specification associated with the potential, a system of prescribed conditional distributions built from the potential. In the case of an unbounded local spin space, the existence of Gibbs measures does not follow directly from compactness arguments. In this talk we focus on the situation where the underlying graph is a regular tree, spins take values in the integers (or an integer lattice), and the interaction potential is spatially homogeneous and of gradient type, i.e., it depends only on the difference of spin values at neighboring sites.
We provide general conditions in terms of the relevant p-norms of the associated transfer operator Q (the exponential of the interaction potential) which ensure the existence of a countable family of spatially homogeneous Gibbs measures, describing localization at different heights. Next, we prove the existence of spatially homogeneous gradient Gibbs measures, describing increments of spin values along the edges of the tree. We construct these gradient Gibbs measures in terms of an edge-wise independent resampling process for $Z_q$-valued Gibbs measures of a suitably transformed fuzzy transfer operator $Q^q$. Then we prove that they are delocalized. Finally, we show that the two conditions on Q can be fulfilled at the same time, which implies coexistence of both types of measures.

The talk is based on joint work with Christof Kuelske, which has been accepted for publication in the Annals of Applied Probability.

Reference: arXiv:2002.09363

Inhomogeneous gradient Gibbs measures on regular trees with homogeneous interactions

Christof Kuelske (Ruhr-University Bochum)

2
It is known that some statistical mechanics models with homogeneous interactions on regular lattices may admit inhomogeneous infinite-volume states. A famous example of this phenomenon is given by the Dobrushin states for the Ising model, which lack translation invariance in three or more lattice dimensions. We investigate whether states which lack translation invariance also exist on regular trees for Z-valued spin models with nearest-neighbor gradient interactions. Our analysis includes the SOS model and the discrete Gaussian, which are important models of mathematical statistical mechanics, where they are mostly studied on the lattice. We show that, under rather general assumptions on the interaction, such inhomogeneous gradient states do exist. Our proof uses probabilistic methods in close combination with dynamical-systems methods. In a first part, we extend the probabilistic approach of our earlier work ("Coexistence of localized Gibbs measures and delocalized gradient Gibbs measures on trees", to be published in the Annals of Applied Probability). This allows us to relate the gradient Gibbs states we are aiming at to the Gibbs states of certain internal q-state spin models with discrete rotation symmetry, a relation which holds also for inhomogeneous states. In a second part, we investigate these q-spin models on the regular tree via their associated discrete dynamical systems. The proofs of existence and lack of translation invariance of infinite-volume gradient states are then based on properties of the local pseudo-unstable manifold of the corresponding discrete dynamical systems of these internal models, around the free state, at large q.

Reference: arXiv:2102.11899, Existence of gradient Gibbs measures on regular trees which are not translation invariant

Statistical mechanical model of adsorption at the surface interface contacting with an ideal gas

Changho Kim (University of California, Merced)

3
We develop a statistical mechanical model for an ideal gas interfaced with a lattice surface where adsorption and desorption of gas particles occur. While this type of model has been investigated before, we revisit it to develop a thermodynamically consistent particle-continuum hybrid model for stochastic simulations of gas-solid interfacial systems, as described below. Following the Langmuir adsorption model, we assume that the mean adsorption rate is proportional to the mean impingement rate of gas particles onto the surface and that the mean desorption rate is given as a function of surface temperature. As a result, thermodynamic equilibrium is expected to be established for a given pressure of the ideal gas and temperature of the system. By investigating the detailed balance conditions, we derive the equilibrium fluctuational properties of the ideal gas state and the surface coverage. We consider several velocity models, including the specular reflection and thermal wall models, from which the velocity of each desorbed or colliding particle is drawn. Based on this statistical mechanical model, we show how the ideal gas state and the surface coverage after a short time Δt can be updated using the adsorption and desorption counts during Δt. For the momentum and energy update, we confirm that the same thermodynamic equilibrium is established when adsorption and desorption are taken into account. The resulting time-update model gives essential information on how to construct a particle-continuum hybrid model, in which the positions of all adsorbed particles are tracked, whereas only aggregated information, such as the total mass, momentum, and energy densities, is tracked in the gas. We present preliminary results of the particle-continuum hybrid simulation method and demonstrate the importance of using a thermodynamically consistent statistical mechanical model.
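As a minimal sketch of the coverage update over a short time Δt (with hypothetical rate constants; in the model of the talk the adsorption rate is tied to the gas impingement rate and the desorption rate to the surface temperature), the occupancy fraction can be driven by Poisson adsorption/desorption counts:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters, for illustration only.
n_sites = 10_000       # lattice adsorption sites
k_ads   = 0.8          # adsorption attempts per empty site per unit time
k_des   = 0.2          # desorption rate per adsorbed particle
dt, n_steps = 0.01, 5_000

theta = 0.0  # surface coverage (fraction of occupied sites)
for _ in range(n_steps):
    empty, occupied = n_sites * (1 - theta), n_sites * theta
    # Poisson counts of adsorption/desorption events during dt
    n_ads = rng.poisson(k_ads * empty * dt)
    n_des = rng.poisson(k_des * occupied * dt)
    theta += (n_ads - n_des) / n_sites
    theta = min(max(theta, 0.0), 1.0)

# Langmuir equilibrium coverage for these rates: theta* = k_ads / (k_ads + k_des)
print(theta, k_ads / (k_ads + k_des))
```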

Q&A for Contributed Session 16

0
This talk does not have an abstract.

Session Chair

HyunJae Yoo (Hankyong National University)

Contributed 19

Detection and Segmentation

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Detection of outliers in compositional data on disabled people in the São Paulo State

Paulo Oliveira (University of São Paulo)

2
Outliers are observations that, for some reason, differ from the other observations in the data set. In univariate and bivariate data sets, outliers can be detected by analyzing a scatter plot: observations distant from the cloud formed by the data are considered unusual. In multivariate data sets, graphical detection of outliers is more difficult because we have to analyze a pair of variables at a time, which results in a long and less reliable process, since an observation may be unusual for one variable and not for the others, masking the results. Compositional data are vectors, called compositions, whose components are all positive and sum to one, so that they live in a simplex. The sum constraint induces correlation between the components, and statistical methods for analyzing such data sets must take this fact into account. The theory for compositional data was developed mainly by Aitchison in the 1980s, and since then several techniques and methods have been developed for compositional data modelling. A disabled person is any person who presents loss or abnormality of a psychological or anatomical structure or function that generates incapacity for the performance of activities; that is, they have characteristics different from most people in society, and these characteristics make their social inclusion more difficult. Disabilities can be permanent or temporary and limit the ability to perform one or more activities, such as seeing, hearing, walking, or intellectual activities. Disability is a complex, multidimensional experience and imposes several measurement challenges. Worldwide, disabled people have worse health prospects, lower levels of education, lower economic participation, and higher poverty rates compared with people without disabilities. This is partly because disabled people face barriers to services that many of us have long taken for granted, such as health, education, and employment. In statistical terms, we considered data from 3,681,111 respondents to the complete questionnaire of the IBGE (Brazilian Institute of Geography and Statistics) census, aggregated into 645 municipalities of the State of São Paulo, Brazil, taking as variables the 16 levels of disability. The following methods were used to detect multivariate outliers: the 95% confidence ellipse based on the first two principal components; the Forward Search; a method based on the MCD (Minimum Covariance Determinant); and, finally, one based on MED (Max Eigen Difference), in a comparative study of the outlier-detection performance of the different methods.
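For a sense of how such a comparison might be set up, here is a small hedged sketch (toy data, not the study's): transform compositions by the additive log-ratio to remove the sum constraint, then compute robust Mahalanobis distances with the Minimum Covariance Determinant.

```python
import numpy as np
from sklearn.covariance import MinCovDet
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Toy compositional data (hypothetical): 200 compositions with 4 parts.
raw = rng.gamma(shape=5.0, size=(200, 4))
X = raw / raw.sum(axis=1, keepdims=True)        # rows sum to one (simplex)

# Additive log-ratio (alr) transform gives full-rank unconstrained coordinates.
alr = np.log(X[:, :-1] / X[:, -1:])

# Robust location/scatter via the Minimum Covariance Determinant,
# then squared robust Mahalanobis distances.
mcd = MinCovDet(random_state=0).fit(alr)
d2 = mcd.mahalanobis(alr)

# Flag points beyond a chi-square cutoff as candidate outliers.
cutoff = chi2.ppf(0.975, df=alr.shape[1])
print(np.where(d2 > cutoff)[0])
```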

Consistent change-point detection for general distributions

Florencia Leonardi (University of São Paulo)

2
We propose a method based on regularized maximum likelihood for change-point detection of general multivariate distributions under independent sampling. We show that the estimator is consistent and almost surely recovers the set of change points under usual, easy-to-verify conditions. These conditions apply to a large variety of models, such as categorical or normal random variables and finite-state Markov chains. We also show that the estimator can be computed efficiently through a dynamic programming algorithm when the penalty term is decomposable.

This is joint work with Lucas Prates de Oliveira.
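As a rough illustration of the dynamic-programming idea (specialized, for simplicity, to Gaussian mean changes rather than the general distributions of the talk), a penalized least-squares segmentation can be computed exactly in O(n^2) time:

```python
import numpy as np

def segment_cost(csum, csum2, i, j):
    """Gaussian negative log-likelihood cost (up to constants) of segment [i, j)."""
    n = j - i
    s, s2 = csum[j] - csum[i], csum2[j] - csum2[i]
    return s2 - s * s / n            # n * within-segment variance

def dp_changepoints(x, penalty):
    """Penalized least-squares segmentation by dynamic programming."""
    n = len(x)
    csum = np.concatenate([[0.0], np.cumsum(x)])
    csum2 = np.concatenate([[0.0], np.cumsum(x ** 2)])
    best = np.full(n + 1, np.inf); best[0] = -penalty
    prev = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + penalty + segment_cost(csum, csum2, i, j)
            if c < best[j]:
                best[j], prev[j] = c, i
    cps, j = [], n                   # backtrack the segment boundaries
    while j > 0:
        cps.append(j); j = prev[j]
    return sorted(cps)[:-1]          # interior change points only

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(dp_changepoints(x, penalty=10.0))   # finds a change point near 100
```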

Change point detection under linear model: use of MOSUM approach

Joonpyo Kim (Seoul National University)

6
This talk presents a new detection method for structural change points based on a MOSUM approach under a piecewise linear model. Most existing methods focus on mean changes, assuming that the underlying model is piecewise constant. However, this stringent assumption is not applicable to many real-world processes, such as manufacturing. The proposed method significantly extends the scope of change-point structures by employing a linear regression model, so that it can identify changes in slope or smoothness of the processes beyond changes in their mean. The proposed method is computationally efficient and well suited to real-time detection thanks to the inherent nature of the moving-window approach. Furthermore, some theoretical properties of the proposed change-point estimator are investigated. Results from real data analysis and simulation examples show the promising empirical performance of the proposed method.
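A schematic sketch of the moving-window idea under a piecewise linear model (the actual MOSUM statistic and its calibration in the talk are more refined): fit local least-squares lines on adjacent windows and flag locations where the coefficients disagree.

```python
import numpy as np

def mosum_linear(y, G):
    """Moving-sum-type statistic comparing local linear fits.
    For each t, fit y ~ a + b*u separately on [t-G, t) and [t, t+G)
    and measure the discrepancy between the coefficient vectors."""
    n = len(y)
    u = np.arange(G, dtype=float)
    X = np.column_stack([np.ones(G), u])       # local design matrix
    stat = np.zeros(n)
    for t in range(G, n - G):
        left, *_ = np.linalg.lstsq(X, y[t - G:t], rcond=None)
        right, *_ = np.linalg.lstsq(X, y[t:t + G], rcond=None)
        stat[t] = np.sqrt(G) * np.linalg.norm(left - right)
    return stat

rng = np.random.default_rng(0)
t = np.arange(400)
# Piecewise linear signal with a slope change at t = 200, plus noise.
y = np.where(t < 200, 0.05 * t, 10 - 0.02 * (t - 200)) + rng.normal(0, 1, 400)
stat = mosum_linear(y, G=50)
print(stat.argmax())   # peaks near the true slope change at t = 200
```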

Interval-censored least-squares regressions

Taehwa Choi (Korea University)

2
This talk considers the linear regression model under interval-censored data, where exact event times are unobserved but known to fall in observed censoring intervals. Such data commonly arise in longitudinal studies, such as the breast cosmesis study, where patients are monitored periodically to check their clinical status. Much previous research has focused on probability-based methods such as Cox and transformation models, from both theoretical and practical viewpoints. In contrast, the accelerated failure time model has received little attention, even though it allows a direct interpretation of covariate effects on the event time. In this work, we generalize the Buckley-James method to estimate accelerated lifetime effects under interval-censored data. Coupled with the regression estimation procedure, a novel EM algorithm is devised for nonparametric likelihood estimation of the nuisance function. Asymptotic properties are established, including a slower rate of convergence for the nuisance function due to the absence of exactly observed data. Simulation studies demonstrate the finite-sample performance, and the method is applied to real data to illustrate its practical use.
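As a hedged illustration of the Buckley-James idea for interval censoring, the following sketch substitutes a Gaussian working model for the errors in place of the talk's nonparametric likelihood estimation; data and helper names are hypothetical.

```python
import numpy as np
from scipy.stats import truncnorm

def bj_interval(X, L, R, n_iter=50):
    """Buckley-James-style iteration for interval-censored responses,
    under a Gaussian working model for the errors (a simplification).
    L, R are interval endpoints with L < T <= R."""
    n, p = X.shape
    beta, sigma = np.zeros(p), 1.0
    for _ in range(n_iter):
        mu = X @ beta
        # E-step: impute T by its conditional mean given L < T <= R.
        a, b = (L - mu) / sigma, (R - mu) / sigma
        T_hat = truncnorm.mean(a, b, loc=mu, scale=sigma)
        # M-step: refit least squares on the imputed responses.
        beta, *_ = np.linalg.lstsq(X, T_hat, rcond=None)
        sigma = np.std(T_hat - X @ beta) + 1e-8
    return beta

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
T = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
width = rng.uniform(0.5, 2.0, n)
L = T - rng.uniform(0, 1, n) * width
R = T + rng.uniform(0, 1, n) * width
print(bj_interval(X, L, R))   # roughly recovers (1.0, 2.0)
```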

Q&A for Contributed Session 19

0
This talk does not have an abstract.

Session Chair

Myung Hee Lee (Weill Cornell Medicine)

Contributed 23

Bayesian Nonparametric Inference

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

Bernstein-von Mises type theorem for a scale hyperparameter in Bayesian nonparametric inference

Natalia Bochkina (University of Edinburgh)

2
We consider the problem of estimating a smooth function adaptively from a Bayesian perspective in a nonparametric regression model, observed either directly or indirectly. We consider the model in the sequence space, with a smoothing Gaussian prior on the unknown coefficients and a hyperprior on the prior scale to achieve an adaptive estimator. We show that, under some conditions on the true function, such as self-similarity, the MAP estimator of the scale hyperparameter converges to its oracle value, and the posterior distribution of the scale can be approximated by a Gaussian distribution as the number of observations grows. As far as we are aware, this is the first result giving a Gaussian approximation of the posterior distribution of a hyperparameter. We will show that this can be interpreted as estimation of the scale parameter from data under model misspecification, and that in the considered setting the posterior variance and the Fisher information of the scale parameter are of different orders. We will illustrate these results on an inverse problem with the Volterra operator.
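For intuition, here is a small sketch of this kind of setup (the model details below are illustrative assumptions, not the talk's exact setting): in a sequence-space white-noise model with a Gaussian smoothing prior, the MAP estimate of the prior scale under a flat hyperprior coincides with the maximizer of the marginal likelihood.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

# Illustrative sequence-space model:
# Y_i = theta_i + n^{-1/2} eps_i, prior theta_i ~ N(0, tau^2 * i^{-1-2*alpha}).
n, kmax, alpha = 10_000, 2_000, 1.0
i = np.arange(1, kmax + 1)
theta_true = i ** (-0.5 - alpha)          # a roughly "self-similar" truth
Y = theta_true + rng.normal(size=kmax) / np.sqrt(n)

def neg_log_marginal(tau):
    """Marginally, Y_i ~ N(0, tau^2 i^{-1-2*alpha} + 1/n); minimize the
    negative log marginal likelihood over the prior scale tau."""
    v = tau ** 2 * i ** (-1.0 - 2 * alpha) + 1.0 / n
    return 0.5 * np.sum(np.log(v) + Y ** 2 / v)

res = minimize_scalar(neg_log_marginal, bounds=(1e-4, 10.0), method="bounded")
print(res.x)   # MAP / empirical-Bayes estimate of the scale hyperparameter
```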

Convergence of unadjusted Hamiltonian Monte Carlo for mean-field models

Katharina Schuh (University of Bonn)

5
In this talk we study the unadjusted Hamiltonian Monte Carlo algorithm applied to high-dimensional probability distributions of mean-field type. We establish dimension-free convergence and discretization error bounds in Wasserstein distance. These bounds require the discretization step to be sufficiently small, but do not require strong convexity of either the unary or the pairwise potential terms present in the mean-field model. To handle high dimensionality, we use a particlewise coupling that is contractive in a complementary particlewise metric.
This talk is based on joint work with Nawaf Bou-Rabee.

Nonparametric Bayesian volatility estimation for gamma-driven stochastic differential equations

Peter Spreij (University of Amsterdam)

2
We study a nonparametric Bayesian approach to estimation of the volatility function of a stochastic differential equation (SDE) driven by a gamma process. The volatility function is assumed to be positive and Hölder continuous. We show that the SDE admits a weak solution, unique in law. The volatility function is modelled a priori as piecewise constant on a partition of the real line, and we specify a gamma prior on its coefficients. This leads to a straightforward procedure for posterior inference via the Gibbs sampler. We give the contraction rate of the posterior distribution in terms of the Hölder exponent and the sample size.

Joint work with Denis Belomestny, Shota Gugushvili, Moritz Schauer.
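As a simple illustration of the model class (the volatility function below is hypothetical, chosen only to be positive and Hölder continuous), such an SDE can be simulated by an Euler-type scheme driven by gamma-process increments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Euler-type simulation of dX_t = sigma(X_{t-}) dL_t, with L a gamma process:
# stationary independent increments L_{t+dt} - L_t ~ Gamma(shape=a*dt, scale=b).
a, b = 2.0, 1.0
sigma = lambda x: 0.5 + 0.3 * np.abs(np.sin(x))   # hypothetical volatility

T, n = 10.0, 10_000
dt = T / n
x = np.empty(n + 1); x[0] = 0.0
dL = rng.gamma(shape=a * dt, scale=b, size=n)     # gamma-process increments
for k in range(n):
    x[k + 1] = x[k] + sigma(x[k]) * dL[k]

print(x[-1])   # one simulated path endpoint
```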

Hamiltonian Monte Carlo in high dimensions

Andreas Eberle (University of Bonn)

5
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo method that is widely used in applications. It is based on a combination of Hamiltonian dynamics and momentum randomizations. The Hamiltonian dynamics is discretized, and the discretization bias can either be accepted as is (unadjusted HMC) or corrected by a Metropolis-Hastings accept-reject step (Metropolis-adjusted HMC). Despite its empirical success, until a few years ago there were almost no convergence bounds for the algorithm. This has changed in recent years, as approaches to quantifying convergence to equilibrium based on coupling, conductance and hypocoercivity have been developed. In this talk, I will present the coupling approach and show how it can be used to understand the dimension dependence of unadjusted HMC in several high-dimensional model classes. I will also mention some open questions.
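A toy illustration of the coupling idea on a standard Gaussian target (all parameters hypothetical): two unadjusted HMC chains driven by the same momentum refreshments contract towards each other, and the contraction rate here does not degrade with the dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

d, eps, n_leap = 100, 0.05, 10     # dimension, step size, leapfrog steps
grad_U = lambda x: x               # standard Gaussian target: U(x) = |x|^2 / 2

def leapfrog(x, v):
    """One unadjusted HMC move: leapfrog integration, no accept-reject step.
    The final half-kick is omitted since momentum is refreshed anyway."""
    v = v - 0.5 * eps * grad_U(x)
    for _ in range(n_leap - 1):
        x = x + eps * v
        v = v - eps * grad_U(x)
    return x + eps * v

# Synchronous coupling: both chains share the same momentum refreshments.
x, y = np.zeros(d), 5.0 * np.ones(d)
for k in range(50):
    v = rng.normal(size=d)         # shared momentum randomization
    x, y = leapfrog(x, v), leapfrog(y, v)
    if k % 10 == 0:
        print(k, np.linalg.norm(x - y))   # distance contracts geometrically
```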

Q&A for Contributed Session 23

0
This talk does not have an abstract.

Session Chair

Kyoungjae Lee (Inha University)
