Boolean Statistics Homepage

This is the Boolean statistics homepage. To download a beta release of the boolean3 package for R by Jason Morgan, click here.

Boolean statistics are designed for situations in which multiple data-generating processes are thought to produce the same outcome (or non-outcome), but the observed outcomes cannot be sorted by the process that produced them. For example, cases of non-war are typically pooled together in statistical analyses, yet they are likely produced in different ways: Germany and France may not go to war because advanced democracies tend not to fight one another; India and Pakistan might not go to war because each is deterred from attacking the other; Bolivia and Botswana might not go to war because neither can reach the other. Unfortunately for the analyst, these three types of non-war are observationally equivalent. Boolean statistics allow the user to model each data-generating process separately (see Braumoeller and Carson 2011, below) and so avoid the bias that results from model misspecification.

More specifically, Boolean logit and probit techniques are designed for a difficult class of statistical problems: those in which one or more intervening variables are latent, or unobserved; the outcome variable is binary; and the interactions of the latent variables that produce the outcome can be characterized by logical "and"s or "or"s. So, for example:
  • We should observe a militarized dispute only when one state has the relative capabilities and the willingness to initiate it.
    • Relative capabilities and willingness are unobserved but can be predicted based on observed covariates.
    • The binary outcome (militarized interstate dispute onset) becomes increasingly likely in the presence of both capabilities and willingness, but not in the presence of either one alone.
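    • The equation of interest would then resemble
      $$\Pr(\text{dispute} = 1) = \Phi(X_{c}\beta_{c}) \times \Phi(X_{w}\beta_{w}),$$
      where $\Phi$ is the standard normal CDF (Boolean probit; the logistic CDF gives Boolean logit) and $X_{c}\beta_{c}$ and $X_{w}\beta_{w}$ are linear predictors for capabilities and willingness.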

  • Citizens may choose not to vote for any number of reasons, including apathy, indifference, or dissatisfaction.
    • Apathy, indifference, and dissatisfaction are unobserved, or imperfectly observed, but can be predicted using correlates from survey questions.
    • The binary outcome (nonvoting) becomes increasingly likely in the presence of any one of the three. Accordingly, as any one of them becomes more likely, the marginal impact of the others decreases.
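    • The equation of interest would then resemble
      $$\Pr(\text{nonvote} = 1) = 1 - [1 - \Phi(X_{a}\beta_{a})][1 - \Phi(X_{i}\beta_{i})][1 - \Phi(X_{d}\beta_{d})],$$
      with $\Phi$ the standard normal CDF (or the logistic CDF for Boolean logit) and one linear predictor each for apathy, indifference, and dissatisfaction.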

The boolean3 package for R is designed to estimate these and other models that combine unobserved antecedent conditions with logical "and"s and "or"s.
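
To make the two kinds of model concrete, here is a minimal from-scratch sketch in Python of Boolean probit probabilities and the Bernoulli log-likelihood they imply. It does not use or reproduce the boolean3 interface. The function names (`p_and`, `p_or`, `log_likelihood`) and the example index values are hypothetical, chosen only for illustration.

```python
import math


def probit(index):
    """Standard normal CDF evaluated at a linear index (x'beta)."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))


def p_and(*indices):
    """Pr(Y = 1) when every latent condition must hold (logical 'and')."""
    return math.prod(probit(i) for i in indices)


def p_or(*indices):
    """Pr(Y = 1) when any one latent condition suffices (logical 'or')."""
    return 1.0 - math.prod(1.0 - probit(i) for i in indices)


def log_likelihood(outcomes, probabilities):
    """Bernoulli log-likelihood of binary outcomes given Pr(Y = 1) per case."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )


# Hypothetical linear indices for two dyads: (capabilities, willingness).
dyads = [(1.5, 1.2), (1.5, -2.0)]
probs = [p_and(cap, will) for cap, will in dyads]
print(probs)  # high when both indices are high, low when either is low
print(log_likelihood([1, 0], probs))
```

In an actual estimation, each index would be a linear combination of observed covariates, and the coefficients would be chosen to maximize this log-likelihood. The sketch evaluates it only at fixed, made-up values.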

   
Fig. 1:  Functional forms capturing arguments "X1 and X2 produce Y" (left) and "X1 or X2 produces Y" (right).
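
The shapes described in the caption can be approximated numerically. The sketch below assumes a logistic link (Boolean logit) and treats X1 and X2 directly as the linear indices. It tabulates both surfaces on a coarse grid and compares the effect of raising X1 at low and high values of X2. The grid values and function names are illustrative assumptions, not taken from the original figure.

```python
import math


def logistic(index):
    """Logistic CDF evaluated at a linear index."""
    return 1.0 / (1.0 + math.exp(-index))


def surface_and(x1, x2):
    """'X1 and X2 produce Y': product of the two component probabilities."""
    return logistic(x1) * logistic(x2)


def surface_or(x1, x2):
    """'X1 or X2 produces Y': one minus the probability that both fail."""
    return 1.0 - (1.0 - logistic(x1)) * (1.0 - logistic(x2))


grid = [-4.0, 0.0, 4.0]  # illustrative index values, not from the figure

for name, surface in (("and", surface_and), ("or", surface_or)):
    print(f"Pr(Y = 1) under '{name}' (rows: X1, columns: X2)")
    for x1 in grid:
        print("  " + "  ".join(f"{surface(x1, x2):.3f}" for x2 in grid))

# Effect of raising X1 from -4 to 4 while X2 is held low or high.
effect_and_low = surface_and(4.0, -4.0) - surface_and(-4.0, -4.0)
effect_and_high = surface_and(4.0, 4.0) - surface_and(-4.0, 4.0)
effect_or_low = surface_or(4.0, -4.0) - surface_or(-4.0, -4.0)
effect_or_high = surface_or(4.0, 4.0) - surface_or(-4.0, 4.0)
```

Under "and", raising X1 changes the outcome probability substantially only when X2 is also high. Under "or", raising X1 matters mainly when X2 is low, since a high X2 already makes the outcome likely.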

The paper that describes the technique in detail is Braumoeller, Bear F. (2003) “Causal Complexity and the Study of Politics,” Political Analysis 11(3): 209-233. An application that models political irrelevance, joint democracy, and the absence of incentives for war as separate data-generating processes that lead to peace can be found in Braumoeller, Bear F., and Austin Carson (2011) “Political Irrelevance, Democracy, and the Limits of Militarized Conflict,” Journal of Conflict Resolution 55(2): 292-320.