Information Theory for Modeling and Inference
Amos Golan, American University
External Professor, Santa Fe Institute
Video Recording
Abstract:
In this talk I will discuss the use of information theory for modeling and inference of complex systems and problems across disciplines. Simply stated, the available information is usually too complex, insufficient, and imperfect to deliver a unique model or solution for most systems and problems. Problems with multiple solutions are called under-determined, or partially identified. Information Theory within a constrained optimization setup provides a way to deal with such complex problems under deep uncertainty and insufficient information. It provides us with a way to sort and rank solutions and then choose the one that satisfies our desired properties. It also provides us with a different way of thinking about solving (complex) problems and a way to nest models in terms of the information and decision criteria they use. It also provides new insights into basic modeling and allows us to solve inference problems that cannot be solved with conventional methods without imposing additional structure or heroic assumptions. Though Information-Theoretic inference provides us with a general framework for modeling and inference (I call it info-metrics), the exact specification is problem-specific. In this talk I will briefly summarize the basic idea via a number of graphical representations of the theory and will then provide a few examples.
Key Words: Complex Systems, Complex Data, Constrained Optimization, Decision Function, Deep Uncertainty, Entropy, Inference, Information Theory, Modeling
Bio:
Amos Golan (BA, MS: Hebrew University of Jerusalem; PhD: UC Berkeley) is a professor of economics and directs the interdisciplinary Info-Metrics Institute at American University. He is also an External Professor at the Santa Fe Institute and was a Senior Associate at Pembroke College, Oxford. His research is primarily in the interdisciplinary field of info-metrics - the science of modeling, reasoning, and drawing inferences under conditions of noisy and insufficient information. He has published in economics, econometrics, statistics, mathematics, physics, visualization and philosophy journals. His most recent book is ‘Foundations of Info-Metrics: Modeling, Inference, and Imperfect Information,’ OUP (2018): https://info-metrics.org/. Golan is a Fellow of the American Association for the Advancement of Science (AAAS).
Summary:
Focus Info-Metrics:
Modeling complex systems with minimal information
Accounts for all available information
Provides the best (unbiased) solution given the available information and data.
Due to insufficient information, the problems are underdetermined: there are multiple solutions consistent with the information we have
Thus, the chosen solution is not guaranteed to be correct
Possible solution:
Impose structure on the space of possible solutions
Use constrained optimization to find best solution within this space
Challenge: choose solution space, decision function and priors based on external knowledge
Info-Metrics approach:
More general constrained optimization based on information theory
Constraints (available information: conservation laws, etc.)
Come from external/prior knowledge but can be tested while evaluating the model
Decision function to optimize: way to choose one solution from among many alternatives consistent with data
(e.g. utility function, normalization)
Info-Metrics uses Shannon's entropy
Example: Unconditional discrete choice (very little information)
Throwing a three-sided die
Throw it many times and record the mean value of the throws
Based on the mean:
Bet on the next throw, or
Infer the probability distribution of the die throws
Visualize as the space of linear combinations of the possible outcomes of the die throws
3 possibilities: 1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1)
Weighted combinations represent the probabilities of the possible outcomes, e.g. (.25, .25, .5); the weights add up to 1 since it is a probability distribution
Specifying the mean selects a subset of possible distributions consistent with this mean
Line through the space, with infinitely many possible distributions
Choose one of them using a decision function that prioritizes the more useful solution
Info-Metrics uses Shannon's entropy function to prioritize more homogeneous solutions
Out of all possible solutions, it is the least informed: the one closest to the uniform distribution
I.e. the one where the probabilities of the different die faces are most similar
The best solution is the point where the entropy indifference curve is tangent to the constraint line (the set of solutions consistent with the data)
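The tangency solution for the die example can be computed directly. Below is a minimal sketch (my own illustration, not code from the talk): among all distributions over the faces (1, 2, 3) matching a given mean, the maximum-entropy one has the standard exponential form, so a bisection on the single Lagrange multiplier suffices.

```python
# Illustrative sketch: the maximum-entropy distribution over die faces
# (1, 2, 3) with a given mean has the form p_k ∝ exp(lam * k); bisect on
# lam until the implied mean matches the observed one.
import math

FACES = (1, 2, 3)

def maxent_die(mean, tol=1e-12):
    """Maximum-entropy distribution over FACES matching the given mean."""
    def mean_of(lam):
        w = [math.exp(lam * k) for k in FACES]
        z = sum(w)
        return sum(k * wk for k, wk in zip(FACES, w)) / z

    lo, hi = -50.0, 50.0  # bracket for the single Lagrange multiplier
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_of(mid) < mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * k) for k in FACES]
    z = sum(w)
    return [wk / z for wk in w]

print(maxent_die(2.0))  # mean 2 is uninformative: uniform [1/3, 1/3, 1/3]
print(maxent_die(2.5))  # mean 2.5 tilts probability toward the higher faces
```

The exponential tilt is the textbook form of the maximum-entropy solution under a linear (mean) constraint; with mean 2 the constraint line passes through the center of the simplex, so the uniform distribution is selected.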
Shannon entropy is a good choice as the decision function:
Least bias
Concentration Theorem: allows one to put confidence intervals on the range of possible solutions
Large deviation theory
Observation: one can use different generalizations of the Shannon entropy function.
They all converge on the same point in the center of the distribution but diverge elsewhere
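The talk does not name the specific generalizations; as one common example (an assumption on my part), the Rényi family recovers Shannon entropy in the limit alpha → 1 and diverges from it elsewhere:

```python
# Sketch comparing Shannon entropy with the Rényi family of generalized
# entropies: Rényi entropy approaches Shannon entropy as alpha -> 1.
import math

def shannon(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi(p, alpha):
    assert alpha != 1  # alpha = 1 is defined as the Shannon limit
    return math.log(sum(pi ** alpha for pi in p)) / (1 - alpha)

p = [0.25, 0.25, 0.5]
print(shannon(p))       # ~1.0397
print(renyi(p, 0.999))  # nearly the same near alpha = 1
print(renyi(p, 2.0))    # differs from Shannon away from alpha = 1
```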
Can use priors to shift the region of solution space with the best value
Can convert the constrained optimization problem into an unconstrained (dual) problem over the Lagrange multipliers of the problem's conservation rules
Exponentially fewer parameters/dimensions
One dimension per rule (constraint), rather than one dimension per possible outcome
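A hedged sketch of the dual idea for the die example (my own illustration): instead of optimizing over the three outcome probabilities under constraints, minimize an unconstrained convex function of a single Lagrange multiplier, one per constraint, and then recover the probabilities from it.

```python
# Sketch of the unconstrained dual: the objective log Z(lam) - lam * mean
# is convex in the single multiplier lam, and its minimizer yields the
# maximum-entropy probabilities p_k ∝ exp(lam * k).
import math

FACES = (1, 2, 3)

def dual_objective(lam, mean):
    return math.log(sum(math.exp(lam * k) for k in FACES)) - lam * mean

# a crude grid search, just to show the problem is now one-dimensional
lams = [i / 1000 for i in range(-5000, 5001)]
lam_star = min(lams, key=lambda l: dual_objective(l, 2.5))
z = sum(math.exp(lam_star * k) for k in FACES)
probs = [math.exp(lam_star * k) / z for k in FACES]
print(lam_star, probs)  # the implied mean of probs is ~2.5
```

With many outcomes but few constraints, this reduction (one multiplier per constraint instead of one probability per outcome) is what makes the problem tractable.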
Real world: accommodating model ambiguity, uncertainty and misspecification
Add uncertainty to all the constraints, including the normalization
General version: maximize Entropy(Signal, Noise) = Entropy(Signal) + Entropy(Noise)
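One way to sketch the signal-plus-noise generalization for the die example (the noise support values below are my illustrative assumption, not from the talk): maximize H(p) + H(w), where p is a distribution over die faces and w a distribution over a small noise support, subject to a single noisy mean constraint. Both distributions then share one exponential multiplier.

```python
# Hedged sketch of generalized maximum entropy with noise: maximize
# H(p) + H(w) subject to E_p[face] + E_w[noise] = observed mean.
# Stationarity gives p_k ∝ exp(lam * k) and w_j ∝ exp(lam * v_j) with a
# shared multiplier lam, found here by bisection.
import math

FACES = (1, 2, 3)
NOISE = (-0.2, 0.0, 0.2)  # illustrative noise support, an assumption

def expected(lam, support):
    w = [math.exp(lam * s) for s in support]
    z = sum(w)
    return sum(s * wi for s, wi in zip(support, w)) / z

def gme_die(observed_mean, tol=1e-12):
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected(mid, FACES) + expected(mid, NOISE) < observed_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2

    def dist(support):
        w = [math.exp(lam * s) for s in support]
        z = sum(w)
        return [wi / z for wi in w]

    return dist(FACES), dist(NOISE)

p, w = gme_die(2.1)
print(p, w)  # signal and noise distributions jointly matching the noisy mean
```

The extra noise distribution absorbs part of the observed mean, so the inferred signal distribution p is pulled less strongly by a noisy observation than in the noiseless version.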
Empirical example: Inferring strategies of competing firms
United Airlines vs American Airlines
Dataset: probability distributions over prices on the different routes shared by the airlines, for routes that each airline individually dominates