Conference Agenda
Overview and details of the sessions of this conference. Please select a date or location to show only sessions on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).
Session Overview

Location: Unitobler, F-121 (52 seats, 100 m^2)

Date: Tuesday, 09/Jul/2019

10:00am - 12:00pm | MS172, part 1: Algebraic statistics
Unitobler, F-121
10:00am - 12:00pm
Algebraic Statistics

Algebraic statistics studies statistical models through the lens of algebra, geometry, and combinatorics. From model selection to inference, this interdisciplinary field has seen applications in a wide range of statistical procedures. This session will focus broadly on new developments in algebraic statistics, both on the theoretical side and the applied side. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Testing model fit for networks: algebraic statistics of mixture models and beyond
We consider statistical models for relational data that can be represented as a network. The nodes in the network are individuals, organizations, proteins, neurons, or brain regions, while edges (directed or undirected) represent specific types of relationships between the nodes, such as personal or organizational affinities, other social/financial relationships, or physical or functional links such as co-activation of brain regions. One of the key open problems in this area is testing whether a proposed statistical model fits the data at hand. Algebraic statistics is known to provide theoretically reliable tools for testing model fit for a class of models that are log-linear exponential families; let's call these log-linear ERGMs. In this talk, we will discuss how the machinery can be extended to mixtures of log-linear ERGMs and other general linear exponential-family models that need not be log-linear, and what hurdles need to be overcome for this set of tools to become generalizable, scalable, and practical.

Oriented Gaussoids
An oriented gaussoid is a combinatorial structure that captures the possible signs of correlations among Gaussian random variables. We introduce this concept and present approaches to the classification and construction of oriented gaussoids, drawing parallels to oriented matroids, which capture the possible signs of dependencies in linear algebra.

Ideals of Gaussian Graphical Models
Gaussian graphical models are semialgebraic subsets of the cone of positive definite matrices. We will report on recent results toward characterizing the vanishing ideals of these models, in particular situations where they are generated by determinantal constraints.

Combinatorial matrix theory in structural equation models
Many operations on matrices can be viewed from a combinatorial point of view by considering graphs associated to the matrix. For example, the determinant and inverse of a matrix can be computed from the linear subgraphs and 1-connections of the Coates digraph associated to the matrix. This combinatorial approach also naturally takes advantage of the sparsity structure of the matrix, which makes it ideal for applications in linear structural equation models. Another advantage of these combinatorial methods is that they are often agnostic as to whether the mixed graph contains cycles. As an example, we obtain a symbolic representation of the entries of the covariance matrix as a finite sum. In general, this sum resembles the well-known trek rule, but with each half of the trek being a 1-connection instead of a path. This method of computing the covariance matrix can be easily implemented in computer algebra systems and scales extremely well when the mixed graph has few cycles.
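The covariance computation described in the last abstract can be illustrated numerically in the simplest acyclic case, where the trek rule reduces to sums over ordinary directed paths. The graph, edge weights, and variable names below are invented for illustration; this is a sketch of the standard trek rule, not the speakers' implementation.

```python
import numpy as np

# Hypothetical linear SEM on the directed path 0 -> 1 -> 2 with edge
# weights a, b and independent errors with variances w (all invented).
a, b = 0.7, -0.3
w = np.array([1.0, 2.0, 1.5])          # error variances (Omega is diagonal)

Lam = np.zeros((3, 3))
Lam[0, 1], Lam[1, 2] = a, b            # Lam[i, j] = weight of edge i -> j
Omega = np.diag(w)

# Matrix form of the covariance: Sigma = (I - Lam)^{-T} Omega (I - Lam)^{-1}
I = np.eye(3)
Sigma = np.linalg.inv(I - Lam).T @ Omega @ np.linalg.inv(I - Lam)

# Trek rule: sigma_ij is a sum over treks from i to j; each trek contributes
# the product of its edge weights times the error variance at its top node.
sigma_02 = w[0] * a * b                            # single trek: 0 -> 1 -> 2
sigma_22 = w[2] + w[1] * b**2 + w[0] * (a * b)**2  # treks topped at 2, 1, 0

assert np.isclose(Sigma[0, 2], sigma_02)
assert np.isclose(Sigma[2, 2], sigma_22)
```

In a computer algebra system the same sum can be kept symbolic, which is where the sparsity advantages mentioned in the abstract come in.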
3:00pm - 5:00pm | MS157, part 1: Graphical models
Unitobler, F-121
3:00pm - 5:00pm
Graphical Models

Graphical models are used to express relationships between random variables. They have numerous applications in the natural sciences as well as in machine learning and big data. This minisymposium will feature talks on several different types of graphical models, including latent tree models, max-linear models, network models, Boltzmann machines, and non-Gaussian graphical models, each of which exploits intrinsic algebraic, geometric, and combinatorial structure. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Brownian motion tree models are toric
Felsenstein's classical model for Gaussian distributions on a phylogenetic tree is shown to be a toric variety in the space of concentration matrices. We give an exact semialgebraic characterization of this model, and we demonstrate how the toric structure leads to exact methods for maximum likelihood estimation.

Algebra and statistical learning for inferring phylogenetic networks
Phylogenetic trees are graphical summaries of the evolutionary history of a set of species. In a phylogenetic tree, the interior nodes represent extinct species, while the leaves represent extant, or living, species. While trees are a natural choice for representing evolution visually, restricting to the class of trees makes it possible to miss more complicated events such as hybridization and horizontal gene transfer. For more complete descriptions, phylogenetic networks, which are directed acyclic graphs, are becoming increasingly common in evolutionary biology. In this talk, we will discuss Markov models on phylogenetic networks and explore how understanding their algebra and geometry can aid in establishing identifiability and model selection. In particular, we will describe a method for network inference that combines computational algebraic geometry and statistical learning. This is joint work with Travis Barton, Colby Long, and Joseph Rusinko.

Geometry of max-linear graphical models
Motivated by extreme value theory, max-linear graphical models have recently been introduced and studied as an alternative to the classical Gaussian or discrete distributions often used in graphical modeling. We present max-linear models naturally in the framework of tropical geometry. This perspective allows us to shed light on some known results and to prove others with algebraic techniques, including conditional independence statements and maximum likelihood parameter estimation. This is joint work with Claudia Klüppelberg, Steffen Lauritzen and Ngoc Tran.

Maximum Likelihood Estimation of Toric Fano Varieties motivated by phylogenetics
We study the maximum likelihood estimation problem for several classes of toric Fano models. We start by exploring the maximum likelihood degree for all 2-dimensional Gorenstein toric Fano varieties. We show that the ML degree is equal to the degree of the surface in every case except for the quintic del Pezzo surface with two singular points of type A1, and we provide explicit expressions that allow computing the maximum likelihood estimate in closed form whenever the ML degree is less than 5. We then explore the reasons for the ML degree drop using A-discriminants and intersection theory. Finally, we show that toric Fano varieties associated to 3-valent phylogenetic trees have ML degree one and provide a formula for the maximum likelihood estimate. We prove it as a corollary to a more general result about the multiplicativity of ML degrees of codimension-zero toric fibre products; it also follows from a connection to a recent result about staged trees. This is joint work with Carlos Amendola and Kaie Kubjas.
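As a small illustration of the max-linear models in the third abstract: on a DAG, each variable is the maximum of weighted parent values and an independent innovation, and products of edge weights along paths play the role that sums play in linear models (the tropical analogue). The graph, weights, and innovation distribution below are arbitrary choices for the sketch, not from the talk.

```python
import numpy as np

# Max-linear structural equations on the DAG 0 -> 1 -> 2 (illustrative weights).
rng = np.random.default_rng(0)

def sample_max_linear(n, c01=2.0, c12=0.5):
    # Heavy-tailed Frechet(1) innovations via inverse-CDF sampling.
    Z = 1.0 / -np.log(rng.random((n, 3)))
    X = np.empty_like(Z)
    X[:, 0] = Z[:, 0]
    X[:, 1] = np.maximum(c01 * X[:, 0], Z[:, 1])   # max of weighted parent and noise
    X[:, 2] = np.maximum(c12 * X[:, 1], Z[:, 2])
    return X

X = sample_max_linear(10_000)
# Tropical path rule: since X1 >= c01*X0 and X2 >= c12*X1, the product of the
# edge weights along the path 0 -> 1 -> 2 bounds X2 from below.
assert np.all(X[:, 2] >= 2.0 * 0.5 * X[:, 0] - 1e-12)
```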
Date: Wednesday, 10/Jul/2019

10:00am - 12:00pm | MS172, part 2: Algebraic statistics
Unitobler, F-121
10:00am - 12:00pm
Algebraic Statistics

Algebraic statistics studies statistical models through the lens of algebra, geometry, and combinatorics. From model selection to inference, this interdisciplinary field has seen applications in a wide range of statistical procedures. This session will focus broadly on new developments in algebraic statistics, both on the theoretical side and the applied side. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Geometry of Exponential Graph Models
When given network data, we can either compute descriptive statistics (degree distribution, diameter, clustering coefficient, etc.) or we can find a model that explains the data. Modeling allows us to test hypotheses about edge formation, understand the uncertainty associated with the observed outcomes, and conduct inferences about whether the network substructures are more commonly observed than by chance. Modeling is also used for simulation and assessment of local effects. Exponential random graph models (ERGMs) are families of distributions defined by a set of network statistics and, thus, give rise to interesting graph-theoretic questions. Our research focuses on the ERGMs where the edge, 2-path, and triangle counts are the sufficient statistics. These models are useful for modeling networks with a transitivity effect, such as social networks. One of the most popular research questions for statisticians is goodness-of-fit testing: how well does the model "fit" the data? This is a difficult question for ERGMs, and one way to answer it is to understand the reference set. Given an observed network G, the reference set of G is the set of simple graphs with the same edge, 2-path, and triangle counts as G. In algebraic geometry, this set is called the fiber of G; its elements are the 0-1 points on an algebraic variety, which we refer to as the reference variety. The goal of this work is to understand the reference variety through the lens of algebraic geometry.

Moment Varieties of Measures on Polytopes
This talk brings many areas together: discrete geometry, statistics, algebraic geometry, invariant theory, geometric modeling, and symbolic and numerical computation. We study the algebraic relations among moments of uniform probability distributions on polytopes. This is already a non-trivial matter for quadrangles in the plane. In fact, we need to combine invariant theory of the affine group with numerical algebraic geometry to compute the first relevant relations. Moreover, the numerator of the generating function of all moments of a fixed polytope is the adjoint of the polytope, which is known from geometric modeling. We prove the conjecture that the adjoint is the unique polynomial of minimal degree which vanishes on the non-faces of a simple polytope. This talk is based on joint work with Kristian Ranestad, Boris Shapiro and Bernd Sturmfels.

The stratification of the maximum likelihood degree for toric varieties
The lattice points of a lattice polytope give rise to a family of toric varieties when we allow complex coefficients in the monomial parametrization of the "usual" toric variety associated to the polytope. The maximum likelihood degree (ML degree) of any member of this family is at most the normalized volume of the polytope. The set of coefficient vectors associated to ML degrees smaller than the volume is parametrized by Gelfand-Kapranov-Zelevinsky's principal A-determinant. Not much is known about how the ML degree changes as one moves in the parameter space. We will discuss what we know, starting with toric surfaces.

Nested Determinantal Constraints in Linear Structural Equation Models
Directed graphical models specify noisy functional relationships among a collection of random variables. In the Gaussian case, each such model corresponds to a semi-algebraic set of positive definite covariance matrices. The set is given via a parametrization, and much work has gone into obtaining an implicit description in terms of polynomial (in-)equalities. Implicit descriptions shed light on problems such as parameter identification, model equivalence, and constraint-based statistical inference. For models given by directed acyclic graphs, which represent settings where all relevant variables are observed, there is a complete theory: all conditional independence relations can be found via graphical d-separation and are sufficient for an implicit description. The situation is far more complicated, however, when some of the variables are hidden. We consider models associated to mixed graphs that capture the effects of hidden variables through correlated error terms. The notion of trek separation explains when the covariance matrix in such a model has submatrices of low rank and generalizes d-separation. However, in many cases, such as the infamous Verma graph, the polynomials defining the graphical model are not determinantal, and hence cannot be explained by d-separation or trek separation. We show that these constraints often correspond to the vanishing of nested determinants and can be graphically explained by a notion of restricted trek separation.
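The reference set (fiber) in the first abstract can be computed by brute force for very small graphs. The sketch below enumerates all simple graphs on 5 nodes and groups them by their edge, 2-path, and triangle counts; it is a toy illustration of what a fiber is, not the speakers' method.

```python
from itertools import combinations
from collections import Counter

n = 5
pairs = list(combinations(range(n), 2))   # possible edges of a simple graph

def stats(edge_set):
    # Sufficient statistics: edge count, 2-path count, triangle count.
    deg = Counter()
    for u, v in edge_set:
        deg[u] += 1
        deg[v] += 1
    edges = len(edge_set)
    two_paths = sum(d * (d - 1) // 2 for d in deg.values())
    tri = sum(1 for a, b, c in combinations(range(n), 3)
              if {(a, b), (a, c), (b, c)} <= edge_set)
    return edges, two_paths, tri

fibers = Counter()
for mask in range(1 << len(pairs)):       # all 2^10 graphs on 5 nodes
    E = frozenset(p for i, p in enumerate(pairs) if mask >> i & 1)
    fibers[stats(E)] += 1

# The fiber of an observed graph G is the set of graphs sharing its statistics.
print(max(fibers.values()))               # size of the largest fiber
```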
3:00pm - 5:00pm | MS157, part 2: Graphical models
Unitobler, F-121
3:00pm - 5:00pm
Graphical Models

Graphical models are used to express relationships between random variables. They have numerous applications in the natural sciences as well as in machine learning and big data. This minisymposium will feature talks on several different types of graphical models, including latent tree models, max-linear models, network models, Boltzmann machines, and non-Gaussian graphical models, each of which exploits intrinsic algebraic, geometric, and combinatorial structure. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Interventional Markov Equivalence for Mixed Graph Models
We will discuss the problem of characterizing Markov equivalence of graphical models under general interventions. Recently, Yang et al. (2018) gave a graphical characterization of interventional Markov equivalence for DAG models that relates to the global Markov properties of DAGs. Based on this, we extend the notion of interventional Markov equivalence using global Markov properties of loopless mixed graphs and generalize their graphical characterization to ancestral graphs. On the other hand, we also extend the notion of interventional Markov equivalence via modifications of factors of distributions Markov to acyclic directed mixed graphs. We prove that these two generalizations coincide at their intersection, i.e., for directed ancestral graphs. This yields a graphical characterization of interventional Markov equivalence for causal models that incorporate latent confounders and selection variables, under assumptions on the intervention targets that are reasonable for biological applications.

Sequential Monte Carlo-based inference in decomposable graphical models
We shall discuss a sequential Monte Carlo-based approach to approximation of probability distributions defined on spaces of decomposable graphs, or, more generally, spaces of junction (clique) trees associated with such graphs. In particular, we apply a particle Gibbs version of the algorithm to Bayesian structure learning in decomposable graphical models, where the target distribution is a junction tree posterior distribution. Moreover, we use the proposed algorithm for exploring certain fundamental combinatorial properties of decomposable graphs, e.g. clique size distributions. Our approach requires the design of a family of proposal kernels, so-called junction tree expanders, which expand junction trees by randomly connecting new nodes to the underlying graphs. The performance of the estimators is illustrated through a collection of numerical examples demonstrating the feasibility of the suggested approach in high-dimensional domains.

CausalKinetiX: Learning stable structures in kinetic systems
Learning kinetic systems from data is one of the core challenges in many fields. Identifying stable models is essential for the generalization capabilities of data-driven inference. We introduce a computationally efficient framework, called CausalKinetiX, that identifies structure from discrete-time, noisy observations generated from heterogeneous experiments. The algorithm assumes the existence of an underlying, invariant kinetic model. The results on both simulated and real-world examples suggest that learning the structure of kinetic systems can indeed benefit from a causal perspective. The talk is based on joint work with Niklas Pfister and Stefan Bauer and does not require prior knowledge of causality or kinetic systems.

Autoencoders memorize training images
The ability of deep neural networks to generalize well in the overparameterized regime has become a subject of significant research interest. We show that overparameterized autoencoders exhibit memorization, a form of inductive bias that constrains the functions learned through the optimization process to concentrate around the training examples, although the network could in principle represent a much larger function class. In particular, we prove that single-layer fully-connected autoencoders project data onto the (nonlinear) span of the training examples. In addition, we show that deep fully-connected autoencoders learn a map that is locally contractive at the training examples, and hence iterating the autoencoder results in convergence to the training examples.
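A minimal linear analogue of the projection claim in the last abstract: orthogonal projection onto the span of the training examples is idempotent, so every output lies in that span and iterating the map is stable. The construction below only sketches this linear intuition under invented data; the talk itself concerns trained nonlinear networks.

```python
import numpy as np

# Three invented training "images" in R^10 (purely illustrative data).
rng = np.random.default_rng(1)
train = rng.normal(size=(3, 10))

# Orthogonal projector onto the span of the training examples.
Q, _ = np.linalg.qr(train.T)    # orthonormal basis of span{training examples}
P = Q @ Q.T

x = rng.normal(size=10)         # arbitrary input
y = P @ x                       # "autoencoder" output in the linear analogy

# The output lies in the training span, and iterating the map changes nothing.
assert np.allclose(P @ y, y)
coeffs, *_ = np.linalg.lstsq(train.T, y, rcond=None)
assert np.allclose(train.T @ coeffs, y)   # y decomposes over training examples
```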
Date: Thursday, 11/Jul/2019

10:00am - 12:00pm | MS194: Latent graphical models
Unitobler, F-121
10:00am - 12:00pm
Latent graphical models

Algebro-geometric methods have been extensively applied to study probabilistic graphical models. They became particularly useful in the context of graphical models with hidden variables (latent graphical models). Latent variables appear in graphical models in several important contexts: to represent processes that cannot be observed or measured (e.g. economic activity in business cycle dating, ancestral species in phylogenetics), in causal modelling (confounders), and in machine learning (deep learning, dimension reduction). Graphical models with latent variables lead to sophisticated geometry problems. The simplest examples, like the latent class model, link directly to secant varieties of the Segre variety and low-rank tensors. Understanding the underlying geometry proved to be the driving force behind designing new learning algorithms and was essential to understanding the fundamental limits of these models. This minisession features three speakers who have been leading this research in the last couple of years. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Latent-variable graphical modeling with generalized linear models
We describe a convex optimization framework for fitting latent-variable graphical models in the class of generalized linear models. We discuss scaling laws under which our framework succeeds in identifying a population model with high probability, as well as experimental results with real data. We also highlight natural tradeoffs in our setup between computational resources and sample size. (Joint with Armeen Taeb and Parikshit Shah)

Representation of Markov kernels with deep graphical models
We revisit the topic of the representational power of deep probabilistic graphical models. We consider directed and undirected models with multiple layers of finite-valued hidden variables. We discuss relations between directed and undirected models, as well as relations between deep and shallow models, in terms of the number of layers, and of variables within layers, that are necessary and sufficient to express any Markov kernel.

Conditional independence statements with hidden variables
Conditional independence is an important tool in statistical modeling, as, for example, it gives a statistical interpretation to graphical models. In causal reasoning, it is important to know what constraints on the observed variables are caused by hidden variables. In general, given a sub-family of random variables satisfying a list of conditional independence (CI) statements, it is difficult to say which constraints are implied by the CI statements on this sub-family. However, the CI statements correspond to determinantal conditions on the tensor of joint probabilities of the observed random variables. Hence, the algebraic analogue of this question relates to determinantal varieties and their irreducible decompositions. In a joint project with Ollie Clarke and Johannes Rauh, we generalize the intersection axiom for CI statements, and we study a family of CI statements whose corresponding variety and its irreducible components are all determinantal varieties.
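The determinantal reading of CI statements in the last abstract can be seen concretely in the discrete two-variable case: independence of X and Y says the joint probability matrix has rank 1, i.e. all of its 2x2 minors vanish. The toy distribution below is invented for illustration.

```python
import numpy as np

# Marginals of two independent discrete variables (illustrative numbers).
px = np.array([0.2, 0.8])
py = np.array([0.5, 0.3, 0.2])
P = np.outer(px, py)            # joint distribution under independence

# All 2x2 minors of the 2x3 joint probability matrix vanish...
minors = [np.linalg.det(P[np.ix_([i, k], [j, l])])
          for i in range(2) for k in range(i + 1, 2)
          for j in range(3) for l in range(j + 1, 3)]
assert np.allclose(minors, 0)

# ...equivalently, the matrix has rank 1.
assert np.linalg.matrix_rank(P) == 1
```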
3:00pm - 5:00pm | MS139, part 1: Combinatorics and algorithms in decision and reason
Unitobler, F-121
3:00pm - 5:00pm
Combinatorics and algorithms in decision and reason

Combinatorial, or discrete, structures are a fundamental tool for modeling decision-making processes in a wide variety of fields including machine learning, biology, economics, sociology, and causality. Within these various contexts, the goal of key problems can often be phrased in terms of learning or manipulating a combinatorial object, such as a network, permutation, or directed acyclic graph, that exhibits pre-specified optimal features. In recent decades, major breakthroughs in each of these fields can be attributed to the development of effective algorithms for learning and analyzing combinatorial models. Many of these advancements are tied to new developments connecting combinatorics, algebra, geometry, and statistics, particularly through the introduction of geometric and algebraic techniques to the development of combinatorial algorithms. The goal of this session is to bring together researchers from each of these fields who are using combinatorial or discrete models in data science, so as to encourage further breakthroughs in this important area of mathematical research. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

(Machine) Learning Non-Linear Algebra
Machine learning has been used successfully to improve algorithms in optimization and computational logic. By training a neural network, one can predict the shape of the answer or select the best parameters to run an algorithm. In this presentation I discuss some experience with applying machine learning tools to improve algorithms that manipulate multivariate polynomial systems.

Network Flows in Semi-Supervised Learning via Total Variation Minimization
We study the connection between methods for semi-supervised learning on partially labeled network data and network flow optimization. Many methods for semi-supervised learning are based on minimizing the total variation of graph signals induced by the labels of data points. These methods can be interpreted as flow optimization on the network structure underlying the data. Using basic results from convex duality, we show that the dual problem of network Lasso is equivalent to maximizing the network flow over the data network. Moreover, the accuracy of network Lasso depends crucially on the existence of sufficiently large network flows between labeled data points. We also provide an interpretation of a primal-dual implementation of network Lasso as distributed flow maximization, which bears some similarity to the push-relabel maximum flow algorithm.

Scalably vertex-programmable ideological forests from certain political twitterverses around the US (2016), UK (2017) and Swedish (2018) national elections
Using customised experimental designs via the Twitter Streaming and REST APIs, we collected status updates in Twitter around three national elections. Dynamic retweet networks were transformed into empirical geometrically weighted directed graphs, where every node is a user account and every edge accounts for the number of retweets of one user by another, with a natural probabilistic interpretation. Distributed vertex programs were then used to find the most-retweeted paths from every user in the population to a given set of subsets of users (subpopulations of interest). Using a pairwise distance induced by such paths, we build a population retweet-based ideological forest. This statistic can be presented while preserving the privacy of the users, and it attempts to increase self-awareness about how one's own ideological profiles and social norms are formed and influenced in social media. Concrete hypothesis tests around the 2016 US presidential election, SPLC-defined US hate groups, and interference by Russian political bots will be the driving empirical skeleton of the talk.

The Kingman Coalescent as a density on a space of trees
Randomly pick n individuals from a population and trace their genealogy backwards in time until you reach the most recent common ancestor. The Kingman n-coalescent is a probabilistic model for the tree one obtains this way. The common definitions are stated from a stochastic point of view. The Kingman coalescent can also be described by a probability density function on a space of certain trees. We describe this approach and extend it to the multispecies coalescent model by relating population genetics to polyhedral and algebraic geometry. This is joint work in progress with Christian Haase.
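The stochastic description of the Kingman n-coalescent in the final abstract is easy to simulate: while k lineages remain, the waiting time to the next merger is exponential with rate k(k-1)/2. The sketch below checks the closed-form expected tree height, 2(1 - 1/n); it illustrates the standard stochastic model, not the density-based formulation of the talk.

```python
import random

def kingman_times(n, rng):
    # Times of the n-1 coalescence events, measured from the present.
    times = []
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2          # merger rate with k lineages
        t += rng.expovariate(rate)
        times.append(t)
    return times

rng = random.Random(42)
n = 10
heights = [kingman_times(n, rng)[-1] for _ in range(20000)]
mean_h = sum(heights) / len(heights)

# E[height] = sum_{k=2}^{n} 2/(k(k-1)) = 2*(1 - 1/n)
expected = 2 * (1 - 1 / n)
assert abs(mean_h - expected) < 0.05
```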
Date: Friday, 12/Jul/2019

10:00am - 12:00pm | MS198: Positive and negative association
Unitobler, F-121
10:00am - 12:00pm
Positive and negative association

Positive and negative association are strong and useful conditions on probability distributions that appear in several applications. Algebraic and combinatorial methods have led to techniques for understanding and sampling from important classes of these distributions. This session aims to explore some of the recent breakthroughs and applications of positive and negative association. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Negative dependence and sampling
Negative dependence occurs in various forms in probability and machine learning. A prominent example with applications in both probabilistic modeling and randomized algorithms is the class of Determinantal Point Processes (DPPs). DPPs belong to the class of Strongly Rayleigh measures, which are characterized by real stable polynomials and exhibit strong notions of negative dependence. For practical applications, it is important that procedures such as sampling can be performed efficiently; recent work suggests that negative dependence can enable exactly that. In this talk, I will summarize selected applications of Determinantal Point Processes and other Strongly Rayleigh measures, and then show results on fast mixing of Markov chains for those measures. Several of these results rely on the connections to real stable polynomials.

Log-concave polynomials: Polynomials that a drunkard can (almost) evaluate
A central question in algorithm design is: what kinds of distributions can we sample from efficiently? On the continuous side, uniform distributions over convex sets, and more generally log-concave distributions, constitute the main tractable class. We will build a parallel theory on the discrete side that yields tractability for important discrete distributions such as uniform distributions over matroids, generalizations of determinantal point processes, and some regime of the random cluster model. The hammer enabling these algorithmic advances is the introduction and study of a class of polynomials that we call completely log-concave. Sampling from discrete distributions becomes equivalent to approximately evaluating associated multivariate polynomials, and we will see how we can use very simple random walks to perform both tasks. This is based on joint work with Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant.

Total positivity in structured binary distributions
We study estimation in totally positive binary distributions, in particular quadratic exponential families of these. Using results from convex optimization, we show how the restriction of total positivity induces conditional independence restrictions on the estimated distributions. We also give necessary and sufficient conditions for the maximum likelihood estimate to exist within the corresponding exponential family and develop a globally convergent algorithm for its computation. This represents joint work with Caroline Uhler and Piotr Zwiernik.

Geometric problems in non-parametric statistics
We examine maximum likelihood estimation for probability densities that are both log-concave and totally positive. The solution to this optimization problem is intriguing. One ingredient is the study of convex polytopes that are closed under coordinate-wise maximum and minimum. This is joint work with Elina Robeva, Ngoc Tran, and Caroline Uhler.
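The negative dependence of DPPs mentioned in the first abstract has a one-line determinantal explanation: with marginal kernel K, the probability that a set A is included in the sample is det(K_A), so for two items P(i and j) = K_ii*K_jj - K_ij^2 <= P(i)*P(j). The kernel construction below is an arbitrary valid choice for a numerical sketch.

```python
import numpy as np

# Build an arbitrary valid marginal kernel: K = M (M + I)^{-1} with M PSD has
# eigenvalues in [0, 1), as required of a DPP marginal kernel.
rng = np.random.default_rng(7)
B = rng.normal(size=(5, 5))
M = B @ B.T
K = M @ np.linalg.inv(M + np.eye(5))

# Pairwise negative correlation: joint inclusion never exceeds the product
# of the individual inclusion probabilities.
for i in range(5):
    for j in range(i + 1, 5):
        p_both = np.linalg.det(K[np.ix_([i, j], [i, j])])  # det of 2x2 block
        assert p_both <= K[i, i] * K[j, j] + 1e-12
```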
3:00pm - 5:00pm | MS139, part 2: Combinatorics and algorithms in decision and reason
Unitobler, F-121
3:00pm - 5:00pm
Combinatorics and algorithms in decision and reason Combinatorial, or discrete, structures are a fundamental tool for modeling decision-making processes in a wide variety of fields including machine learning, biology, economics, sociology, and causality. Within these various contexts, the goal of key problems can often be phrased in terms of learning or manipulating a combinatorial object, such as a network, permutation, or directed acyclic graph, that exhibits pre-specified optimal features. In recent decades, major break-throughs in each of these fields can be attributed to the development of effective algorithms for learning and analyzing combinatorial models. Many of these advancements are tied to new developments connecting combinatorics, algebra, geometry, and statistics, particularly through the introduction of geometric and algebraic techniques to the development of combinatorial algorithms. The goal of this session is to bring together researchers from each of these fields who are using combinatorial or discrete models in data science so as to encourage further breakthroughs in this important area of mathematical research. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise) On the Graphs of Graphical Models Graphical models, including constraint networks, Bayesian networks, Markov random fields and influence diagrams, have become a central paradigm for knowledge representation and reasoning in artificial intelligence, and provide powerful tools for solving problems in numerous application areas. Reasoning over probabilistic graphical models typically involves answering inference queries, such as computing the most likely configuration (maximum a posteriori or MAP) or evaluating the marginals or the normalizing constant of a distribution (the partition function). 
Exact computation of such queries is known to be intractable in general, yet, the underlying graphs of graphical models provide a powerful tool that allow exploiting the problems structure in reasoning algorithms. In this talk I will provide an overview of how such graph parameters (e.g., tree-width, height, w-cutset) interact in their role for bounding graphical models complexity. Causal Inference with Unknown Intervention Targets We consider the problem of estimating causal DAG models from a mix of observational and interventional data, when the intervention targets are partially or completely unknown. This problem is highly relevant for example in genomics, since gene knockout technologies are known to have off-target effects. In this paper, we characterize the interventional Markov equivalence class of DAGs that can be identified from interventional data with unknown intervention targets. In addition, we propose the first provably consistent algorithm for learning the interventional Markov equivalence class from such data. The proposed algorithm greedily searches over the space of permutations to minimize a novel score function. The algorithm is nonparametric, which is particularly important for applications to genomics, where the relationships between variables are often non-linear and the distribution non-Gaussian. We demonstrate the performance of our algorithm on synthetic and biological datasets. On attempts to characterize facets of the chordal graph polytope Our idea of integer linear programming approach to structural learning decomposable graphical models, which are models described by undirected chordal graphs, is to encode them by special zero-one vectors, named characteristic imsets. It leads to the study of a special polytope, defined as the convex hull of all characteristic imsets for chordal graphs we name the chordal graph polytope (Studeny and Cussens; 2017). 
The talk will be devoted to attempts to characterize theoretically all facet-defining inequalities for this polytope, in order to utilize them in ILP-based procedures for learning decomposable models. |
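The zero-one encoding mentioned in the abstract can be made concrete. Below is a minimal Python sketch, assuming the standard characterization that, for a chordal graph, the characteristic imset assigns 1 exactly to the vertex sets of size at least two that induce complete subgraphs; the function name is illustrative, not from the talk.

```python
from itertools import combinations

def characteristic_imset(vertices, edges):
    """Zero-one vector encoding a chordal graph G: one coordinate per
    vertex set S with |S| >= 2, equal to 1 iff S is complete in G."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    imset = {}
    for k in range(2, len(vertices) + 1):
        for S in combinations(sorted(vertices), k):
            # S is complete iff every pair inside S is an edge of G
            imset[S] = int(all(b in adj[a] for a, b in combinations(S, 2)))
    return imset

# Path graph 1-2-3 (chordal): {1,2} and {2,3} are complete; {1,3} and {1,2,3} are not.
c = characteristic_imset([1, 2, 3], [(1, 2), (2, 3)])
```

The chordal graph polytope of the abstract is then the convex hull of these vectors over all chordal graphs on a fixed vertex set.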
| Date: Saturday, 13/Jul/2019 | |
| 10:00am - 12:00pm | MS196: Algebro-geometric methods for social network modelling |
| Unitobler, F-121 | |
10:00am - 12:00pm
Algebro-geometric methods for social network modelling
Algebraic and geometric methods have recently been proposed for statistical random (social) network models. These methods fall into three categories:
1) Understanding the geometry of network models, especially exponential random graph models (ERGMs), in order to understand the (mis)behaviour of such models in asymptotic settings, commonly known as degeneracy, which occurs frequently. In addition, many ERGMs are in fact curved exponential families, and understanding the geometry of the parameter space is of great importance.
2) Finding the model polytope of network models, i.e. the polytope of all sufficient statistics for every network of fixed size n, in order to determine the existence of the MLE for such models and also to determine which parameters are actually estimable.
3) Understanding the Markov bases of random network models specified by a multi-homogeneous ideal. This is directly relevant to goodness-of-fit testing for network models, as well as to simulating from these models.
The minisymposium will consist of four speakers, some of whom have already agreed to present their papers. A tentative list is as follows (I will add speakers to the list once attendance is finalized by the authors): (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

Goodness-of-fit testing for log-linear network models
We define and study degree- and block-based ERGMs called log-linear ERGMs. These models admit a correspondence to contingency table models, which gives us access to categorical data analysis tools. We use these tools, in combination with sampling tools stemming from discrete mathematics and algebraic statistics, to produce a non-asymptotic goodness-of-fit test of network data against these models.
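The correspondence to contingency tables is typically exploited through a Markov-basis random walk in the style of Diaconis and Sturmfels. The talk's test is more general; what follows is only a minimal sketch for the independence model on a two-way table, with illustrative function names: basic +1/-1 moves preserve the margins, and a Metropolis correction targets the hypergeometric distribution on the fiber.

```python
import random

def chi_sq(table, expected):
    """Pearson chi-square statistic of a table against expected counts."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

def markov_step(table, rng):
    """One basic move: pick two rows and two columns and add +1/-1 on the
    four corner cells, which preserves all row and column sums. Accept with
    the Metropolis probability for the hypergeometric distribution."""
    i, j = rng.sample(range(len(table)), 2)
    k, l = rng.sample(range(len(table[0])), 2)
    if table[i][l] == 0 or table[j][k] == 0:
        return  # move would create a negative entry; stay put
    accept = (table[i][l] * table[j][k]) / ((table[i][k] + 1) * (table[j][l] + 1))
    if rng.random() < accept:
        table[i][k] += 1; table[j][l] += 1
        table[i][l] -= 1; table[j][k] -= 1

def exact_pvalue(table, steps=20000, seed=1):
    """Estimated exact conditional p-value: fraction of sampled tables in
    the fiber whose chi-square statistic is at least the observed one."""
    rng = random.Random(seed)
    n = sum(map(sum, table))
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    expected = [[ri * cj / n for cj in cols] for ri in rows]
    observed = chi_sq(table, expected)
    current = [row[:] for row in table]
    hits = 0
    for _ in range(steps):
        markov_step(current, rng)
        hits += chi_sq(current, expected) >= observed - 1e-9
    return hits / steps
```

For log-linear ERGMs the fiber and the moves are more involved, which is where the algebraic-statistics machinery (Markov bases for the model's toric ideal) enters.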
Cores, shell indices and the degeneracy of a graph limit
The k-core of a graph is the maximal subgraph in which every node has degree at least k, the shell index of a node is the largest k such that the k-core contains the node, and the degeneracy of a graph is the largest shell index of any node. After a suitable normalization, these three concepts generalize to limits of dense graphs (also called graphons). In particular, the degeneracy is continuous with respect to the cut metric.

On Exchangeability in Network Models
We derive representation theorems for exchangeable distributions on finite and infinite graphs using elementary arguments based on geometric and graph-theoretic concepts. Our results elucidate some of the key differences, and their implications, between statistical network models that are finitely exchangeable and models that define a consistent sequence of probability distributions on graphs of increasing size. We also show that, for finitely exchangeable network models, the empirical subgraph densities are maximum likelihood estimates of their theoretical counterparts. We then characterize all possible conditional independence structures for finitely exchangeable random graphs. |
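For finite graphs, the three notions in the cores abstract above (k-core, shell index, degeneracy) can be computed by the standard peeling algorithm: repeatedly remove a node of minimum degree. A small illustrative sketch:

```python
def shell_indices(adj):
    """Shell index of every node by iterative peeling: repeatedly remove a
    node of minimum degree; its shell index is the running maximum of the
    minimum degrees seen so far."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # local mutable copy
    shell, k = {}, 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        k = max(k, len(adj[v]))
        shell[v] = k
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return shell

def degeneracy(adj):
    """Largest shell index of any node."""
    return max(shell_indices(adj).values())

# A triangle with a pendant vertex: the triangle is the 2-core,
# so the pendant has shell index 1 and the degeneracy is 2.
g = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
```

The graphon versions in the talk arise from this peeling picture after normalizing degrees by the number of nodes.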
| 3:00pm - 5:00pm | MS139, part 3: Combinatorics and algorithms in decision and reason |
| Unitobler, F-121 | |
3:00pm - 5:00pm
Combinatorics and algorithms in decision and reason
Combinatorial, or discrete, structures are a fundamental tool for modeling decision-making processes in a wide variety of fields including machine learning, biology, economics, sociology, and causality. Within these various contexts, the goal of key problems can often be phrased in terms of learning or manipulating a combinatorial object, such as a network, permutation, or directed acyclic graph, that exhibits pre-specified optimal features. In recent decades, major breakthroughs in each of these fields can be attributed to the development of effective algorithms for learning and analyzing combinatorial models. Many of these advancements are tied to new developments connecting combinatorics, algebra, geometry, and statistics, particularly through the introduction of geometric and algebraic techniques to the development of combinatorial algorithms. The goal of this session is to bring together researchers from each of these fields who are using combinatorial or discrete models in data science, so as to encourage further breakthroughs in this important area of mathematical research. (25 minutes for each presentation, including questions, followed by a 5-minute break; in case of x<4 talks, the first x slots are used unless indicated otherwise)

From random forests to regulatory rules: extracting interactions in high-dimensional genomic data
Individual genomic assays measure elements that interact in vivo as components of larger molecular machines. Understanding the connections between such high-order interactions and complex biological processes, from gene regulation to organ development, presents a substantial statistical challenge: identifying high-quality interaction candidates from combinatorial search spaces in genome-scale data.
Building on Random Forests (RFs) and Random Intersection Trees (RITs), and through extensive, biologically inspired simulations, we developed the iterative Random Forest algorithm (iRF). iRF trains a feature-weighted ensemble of decision trees to detect stable, high-order interactions with the same order of computational cost as RF. We define a functional relationship between interacting features and responses that decomposes RF predictions into a collection of interpretable rules, which can be used to evaluate interactions in terms of their stability and predictive accuracy. We demonstrate the utility of iRF for high-order interaction discovery in several genomics problems, where iRF recovers well-known interactions and posits novel, high-order interactions associated with gene regulation. By refining the process of interaction recovery, our approach has the potential to guide mechanistic inquiry into systems whose scale and complexity are beyond human comprehension.

Probabilistic tensors and opportunistic Boolean matrix multiplication
We introduce probabilistic extensions of classical deterministic measures of the algebraic complexity of a tensor, such as the rank and the border rank. We show that these probabilistic extensions satisfy various natural and algorithmically serendipitous properties, such as submultiplicativity under taking Kronecker products. Furthermore, the probabilistic extensions admit strictly lower rank than their deterministic counterparts for specific tensors of interest, starting from the tensor <2,2,2> that represents 2-by-2 matrix multiplication. By submultiplicativity, this leads immediately to novel randomized algorithm designs, such as algorithms for Boolean matrix multiplication as well as for detecting and estimating the number of triangles and other subgraphs in a graph. Joint work with Matti Karppa (Aalto University).
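For context on the tensor <2,2,2> mentioned in the abstract: its deterministic rank is 7, witnessed by Strassen's identities, and this deterministic baseline is what the probabilistic rank notions of the talk improve on for specific tensors. A quick check of the rank-7 decomposition (this illustrates the classical benchmark, not the talk's randomized algorithms):

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications
    (Strassen's identities, witnessing rank(<2,2,2>) <= 7)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]
```

Submultiplicativity under Kronecker products is what turns a good (probabilistic) decomposition of a small tensor like this into asymptotically fast algorithms for large Boolean matrix products.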
Reference: https://doi.org/10.1137/1.9781611975482.31

Discrete Models with Total Positivity
We consider the case of a discrete graphical log-linear model whose underlying distribution is assumed to be multivariate totally positive of order 2 (MTP2). In particular, we study the implications of total positivity for the interactions between the random variables, for the marginal polytope associated to the model, and for model selection through maximum likelihood estimation. We also compare these results to recent work in the Gaussian setting. This is joint work with Helene Massam. |
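For a discrete distribution p on a product of ordered finite state spaces, multivariate total positivity of order 2 means p(x)p(y) <= p(x ∧ y)p(x ∨ y) for all cells x and y, where ∧ and ∨ are the coordinatewise minimum and maximum. A brute-force illustrative check (the function name is an assumption, not from the talk):

```python
from itertools import product

def is_mtp2(p, levels):
    """Check p(x)p(y) <= p(meet(x,y)) * p(join(x,y)) over all pairs of
    cells, with meet/join taken coordinatewise; `levels` lists the number
    of levels of each variable, and `p` maps cells (tuples) to probabilities."""
    cells = list(product(*[range(l) for l in levels]))
    for x in cells:
        for y in cells:
            meet = tuple(min(a, b) for a, b in zip(x, y))
            join = tuple(max(a, b) for a, b in zip(x, y))
            if p[x] * p[y] > p[meet] * p[join] + 1e-12:  # tolerance for floats
                return False
    return True
```

For two binary variables this reduces to the familiar odds-ratio condition p00 * p11 >= p01 * p10, i.e. non-negative association.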
