Over the last several decades, quantitative research on armed conflicts has become increasingly influential in theory building and policy making. However, this strand of research raises numerous ontological and methodological issues that have received relatively little attention from social scientists.

First, much of the research is data-driven. Most studies of terrorism and insurgencies rely on a small number of online databases, most of which were compiled by journalists or from journalistic sources without rigorous data-collection protocols. Indeed, comparing databases that aim to document the same phenomenon reveals large discrepancies.

Second, because the data was not collected for a particular study, scholars are often forced to ignore variables that are highly relevant to testing their hypotheses but absent from the database. In many cases researchers rely on proxy variables in place of more appropriate or accurate ones, and on data coded at an aggregate level (e.g., the country), a practice that masks variation across time (e.g., days, months) and space (e.g., localities, regions).

Third, due to data availability, the most frequent subject for research on terrorism and counterterrorism is the Arab-Israeli conflict, while Africa serves as the main hub for quantitative analyses of civil wars.  Given the focus on statistical analyses, much of the research on conflicts is conducted by scholars with greater knowledge of sophisticated statistical techniques than of the cases under consideration. 

The result has been a volume of statistically sophisticated research that lacks solid theoretical and substantive foundations.  In many cases, the lack of these pillars along with the limited quality of the data leads to distorted findings, flawed conclusions, and dangerous policy recommendations.

The project aims to address these problems through the creation of a multidisciplinary research group. We will work together to generate an innovative “high-definition” dataset of the Arab-Israeli conflict from 1936 to the present.

First, we will build a new “high-definition” dataset of security-related incidents in the context of the Arab-Israeli conflict on a day-by-day basis. We have complete coverage of all events and complete access to information in various forms, including print and online sources (in English and Hebrew). Data sources include materials and documents, many of which have been released by the Israel Defense Forces (IDF) and the Israeli police. We are therefore able to document systematically every event that took place within the realm of the Arab-Israeli conflict. We will apply rigorous academic standards to our coding protocols and record every step that we take. We will conceptualize and operationalize each variable based on the literature in the field. For instance, when coding terrorist attacks, we will draw on the academic literature and, to the extent possible, apply consensual definitions in order to identify the cases that qualify as “terrorist attacks” (i.e., events that are “terrorist” rather than “insurgency,” “guerrilla warfare,” etc.). When possible, we will use broad definitions so that we can incorporate as many cases as possible. The dataset will record the timing (to the day) and the specific location of each attack in order to provide a more accurate picture of the conflict. Given the rigor of the coding protocol, the quality of the data that we will use, and the transparency of our processes, we are confident that we will be able to provide an accurate, or “high-definition,” quantitative picture of the Arab-Israeli conflict. We will also collect raw data, more qualitative in nature, that can be used to generate additional variables.

Second, and consistent with the interdisciplinary nature of this project, we will produce a modular open-source dataset that can be used by scholars from a variety of disciplines. Our project remains inductive to the extent that we initially gather data without a specific set of questions in mind; however, with a diverse group of researchers, we are confident that we can cover a wide variety of variables. In addition, because we will collect and document raw data for each event, researchers whose questions are not supported by the coded data will be able to return to the same original sources and add variables that are relevant to their research. Further variables may be added in the same way as new questions arise. In this way, we reduce the need for proxy variables (approximations) and aggregate (country-level, year-level) variables.
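To make the contrast between event-level and aggregate data concrete, the following is a minimal Python sketch, not the project's actual schema. The `Event` fields, the `count_by` helper, and all sample values (placeholder localities, dates, and source references) are assumptions invented purely for illustration. It shows that records coded by day and specific location can be rolled up to any coarser level, whereas a country-year count cannot be broken back down.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """One coded incident (hypothetical fields, for illustration only)."""
    day: date          # timing, to the day
    locality: str      # specific location
    country: str       # coarser spatial unit
    event_type: str    # category assigned by the coding protocol
    source_ref: str    # pointer back to the raw source material


# Placeholder records; none of these values describe real incidents.
EVENTS = [
    Event(date(2001, 3, 4), "Locality A", "Country X", "attack", "doc-001"),
    Event(date(2001, 3, 4), "Locality B", "Country X", "attack", "doc-002"),
    Event(date(2001, 9, 17), "Locality A", "Country X", "attack", "doc-003"),
]


def count_by(events, key):
    """Count events per group, where `key` maps an event to its group."""
    return Counter(key(e) for e in events)


# Aggregate view: a single number per country and year.
country_year = count_by(EVENTS, lambda e: (e.country, e.day.year))

# Finer view derived from the same records: locality and month.
locality_month = count_by(
    EVENTS, lambda e: (e.locality, e.day.year, e.day.month)
)

print(country_year)
print(locality_month)
```

Because each record keeps a `source_ref`, a researcher who needs a variable that was not coded could, under this sketch, return to the raw material for exactly the events of interest.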

Third, this project aims to offer an alternative methodological route for quantitative research on conflict-related issues. Our objective is to avoid the problems that arise when researchers rely on easily accessible or readily available data and then try to overcome the resulting methodological problems with sophisticated statistical analyses. We will do so by rigorously collecting and coding original data. We believe that better and more detailed data will lead to fewer problems later in the research process. As a result, researchers will be able to generate more accurate findings, more robust conclusions, and better-informed policy recommendations. This project is about more than documenting the Arab-Israeli conflict; it is also dedicated to facilitating an alternative way of conducting statistical analyses that is more transparent and more precise.

This project builds on the intellectual exchange that science is all about: sharing ideas and generating accurate and detailed open-source data. Eventually, these efforts should lead to much better research and better byproducts of that research.