Title: Information-Theoretic Modeling and Inference for Complex Systems: A Brief Tutorial
Abstract: In this short tutorial I will discuss the use of information theory for modeling and inference of complex systems and problems across disciplines. Simply stated, the available information is usually too complex, insufficient, and imperfect to deliver a unique model or solution for most systems and problems. Problems with multiple solutions are called under-determined, or partially identified. Information theory within a constrained optimization setup provides a way to deal with such complex problems under deep uncertainty and insufficient information: it lets us sort and rank the candidate solutions and then choose the one that satisfies our desired properties. It also offers a different way of thinking about solving (complex) problems, a way to nest models in terms of the information and decision criteria they use, new insights into basic modeling, and the ability to solve inference problems that cannot be solved with conventional methods without imposing additional structure or heroic assumptions. Though information-theoretic inference provides a general framework for modeling and inference (I call it info-metrics), the exact specification is problem-specific. In this brief tutorial I will summarize the basic idea in a simple way via a number of graphical representations of the theory and will then provide a few examples.
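As a concrete, hedged illustration of the constrained optimization setup mentioned in this abstract (not an example taken from the tutorial itself), the following Python sketch solves the classic maximum entropy die problem: find the probabilities that maximize Shannon entropy subject to a known mean. The outcome values, the target mean of 4.5, and all variable names are illustrative assumptions.

    # Minimal sketch: maximum entropy under a moment constraint (illustrative numbers).
    import numpy as np
    from scipy.optimize import minimize

    values = np.arange(1, 7)          # outcomes of a six-sided die
    target_mean = 4.5                 # assumed observed moment (the "information")

    def neg_entropy(p):
        # negative Shannon entropy; minimizing this maximizes entropy
        p = np.clip(p, 1e-12, None)
        return np.sum(p * np.log(p))

    constraints = [
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},            # probabilities sum to one
        {"type": "eq", "fun": lambda p: values @ p - target_mean},   # moment (mean) constraint
    ]
    p0 = np.full(6, 1.0 / 6.0)        # start from the uniform distribution

    result = minimize(neg_entropy, p0, bounds=[(0.0, 1.0)] * 6, constraints=constraints)
    print(np.round(result.x, 4))      # distribution tilted toward larger faces, as expected

With only the adding-up constraint the answer would be uniform; adding the moment constraint is what selects a single, least-informative distribution out of the many that are consistent with the data, which is the core of the constrained optimization view described above.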
Welcome Reception
Title: Maximum Entropy and the Principle of Causation
Abstract: The principle of maximum entropy, developed more than six decades ago, provides a systematic approach to modeling, inference, and data analysis grounded in the principles of information theory, Bayesian probability, and constrained optimization. Since its formulation, philosophical and mathematical criticisms have been raised about the consistency of the method and the role of constraints. Chief among these is the claim that maximum entropy does not satisfy the principle of causation, or, similarly, that maximum entropy updating is inconsistent because it represents causal information inadequately. In this talk I show that these criticisms rest on a misunderstanding and misapplication of the way constraints (information) must be specified within the maximum entropy method. Correcting these problems eliminates the seeming paradoxes and inconsistencies the various critics claim to have detected. I will demonstrate, via the same examples the critics used, that properly formulated maximum entropy models satisfy the principle of causation.
Title: Some Applications of Entropy Indicators for Socio-Economic Data Disaggregation
Abstract: The proliferation of databases with a high level of geographical detail has made it possible to study distributional issues in Europe from a spatial perspective. However, while there are datasets that allow the distribution of household wealth to be measured for the EU, they do not provide any geographical detail. This paper tries to fill this gap by applying a spatial disaggregation procedure that combines the information contained in several household surveys. The novelties of the technique presented are that (i) the geographical mapping produced is consistent with the national aggregates and (ii) it does not require imposing strong distributional assumptions. The proposed methodology is applied to estimate household wealth indicators for the EU regions.
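To give a hedged sense of how an entropy-based disaggregation of this kind can work (this is a generic cross-entropy sketch, not the authors' procedure), the Python snippet below tilts prior regional shares taken from a survey as little as possible, in Kullback-Leibler divergence, while remaining consistent with a known national aggregate. All numbers, the auxiliary indicator, and the variable names are illustrative assumptions.

    # Minimal sketch: cross-entropy disaggregation consistent with a national aggregate.
    import numpy as np
    from scipy.optimize import minimize

    q = np.array([0.40, 0.30, 0.20, 0.10])   # assumed prior regional shares (from a survey)
    x = np.array([32.0, 27.0, 22.0, 18.0])   # assumed regional auxiliary indicator
    national_avg = 28.0                      # assumed national aggregate to be respected

    def kl_divergence(w):
        # Kullback-Leibler divergence of the new shares w from the survey prior q
        w = np.clip(w, 1e-12, None)
        return np.sum(w * np.log(w / q))

    constraints = [
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},        # shares add up to one
        {"type": "eq", "fun": lambda w: x @ w - national_avg},   # consistency with the aggregate
    ]
    result = minimize(kl_divergence, q, bounds=[(0.0, 1.0)] * 4, constraints=constraints)
    print(np.round(result.x, 4))             # shares tilted minimally away from the survey prior

The appeal of this formulation, and the reason it matches novelty (ii) in the abstract, is that no parametric distributional form is imposed: the only inputs are the survey-based prior and the aggregate consistency constraints.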
Coffee Break
Abstract: In this talk we present different examples of inverse problems and discuss their connection with the area of info-metrics. In particular, we use these examples to discuss the connection between sparsity conditions and maximum entropy, as well as the influence of the choice of activation function in the evaluation of digital learning tools.
Title: Entropy, Information, and the Updating of Probabilities
Abstract: The concept of entropy has its origins in the 19th century, in the discovery of thermodynamics (Carnot, Clausius, Kelvin) and statistical mechanics (Maxwell, Boltzmann, Gibbs). A series of developments starting around the middle of the 20th century (mostly due to Shannon and Jaynes) liberated the concept of entropy from its physics origins and elevated it into a general-purpose tool for processing information. Thus was born the old Method of Maximum Entropy, MaxEnt. In a parallel line of research, Bayesian inference enjoyed a remarkable period of expansion in the latter half of the century. The two methods of inference flourished to a large extent independently of each other; entropic methods do not yet enjoy widespread acceptance within the orthodox Bayesian community. The connection between them has been an endless source of controversy, and even their compatibility has been repeatedly brought into question. Further developments extending into the early 21st century have, however, culminated in the complete unification of entropic and Bayesian inference methods. The goal of this lecture is to summarize the argument for the method of maximum entropy as the universal method of inference and to show that it includes the old MaxEnt, all Bayesian methods, and the general subject of large deviations as special cases. The consequences of these methods of entropic inference are potentially enormous, both for statistics (entropic priors, model selection, experimental design, etc.) and for science in general. In physics, for example, it is well known that the laws of thermodynamics can be derived from entropic methods. What might at first sight be surprising is that entropic principles can also be used to derive the dynamical laws of mechanics, both classical and quantum. Perhaps the laws of physics themselves are not laws of nature but merely rules to process information about the world.