Mapping bilateral information interests using the activity of Wikipedia editors



Abstract

We live in a global village where electronic communication has eliminated the geographical barriers of information exchange. With global information exchange, the road is open to worldwide convergence of opinions and interests. However, it remains unknown to what extent information interests actually have become global. To address how interests differ between countries, we analyze the information exchange in Wikipedia, the world's largest online collaborative encyclopedia. From the editing activity in Wikipedia, we extract the interest profiles of editors from different countries. Based on a statistical model for interest profiles, we create a network of significant links between countries with similar activity. Using clustering, we find that countries can be divided into 18 clusters with similar interest profiles, which suggests that language, geographical proximity, religion, and historical background diversify the interests. We quantify the effects of these factors using regression analysis and find that information exchange indeed is constrained by the impact of social and economic factors connected to shared interests.

Method
We identify the interest profile of a country by aggregating the edits of all Wikipedia editors whose IP addresses are recorded in that country. If an article is co-edited by editors located in different countries, we say that the countries share a common interest in the information of the article. In other words, we connect countries if their editors co-edit the same articles. From the co-editing data, we create a network that represents countries as nodes and shared interests as links. We propose a statistical validation method that filters out connections that could exist only due to size effects or noise.
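As a rough illustration of the co-editing step, the following minimal Python sketch builds a weighted country network from a list of (article, country) edit records. The function name `co_editing_network`, the toy records, and the choice of link weight (the number of articles two countries have both edited) are assumptions made for this example. The statistical validation step is not shown, since its details are not given here.

```python
from collections import defaultdict
from itertools import combinations


def co_editing_network(edits):
    """Build a weighted co-editing network from (article, country) pairs.

    Hypothetical helper: the link weight is assumed to be the number of
    articles edited from both countries; the source does not specify it.
    """
    # Interest profile step: collect the set of countries editing each article.
    countries_by_article = defaultdict(set)
    for article, country in edits:
        countries_by_article[article].add(country)

    # Link step: every pair of countries sharing an article gains one unit.
    weights = defaultdict(int)
    for countries in countries_by_article.values():
        for a, b in combinations(sorted(countries), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Toy edit records (invented for illustration, not real Wikipedia data).
edits = [
    ("Fika", "SE"), ("Fika", "NO"), ("Fika", "SE"),
    ("Sauna", "FI"), ("Sauna", "SE"),
    ("Fjord", "NO"), ("Fjord", "SE"),
    ("Tango", "AR"),
]
network = co_editing_network(edits)
```

In this toy data, an article edited from only one country contributes no link, and repeated edits from the same country to the same article are counted once. A raw network like this is still dominated by size effects, since countries with many editors co-edit with almost everyone, which is why a filtering step is needed before interpreting the links.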


Results
We find that information interests remain diverse despite globalization, and that the highways and barriers of information exchange are formed by social and economic factors connected to shared interests. In line with earlier studies, we find that language, religion, geographical proximity, historical background, and trade have a great impact on the diversity of interests. Since local interests limit global information exchange, these results can help us to better understand the concept of globalization. The information interest network can also be used to model the propagation of information on a global scale. Moreover, the general statistical validation method that we introduce to extract the network can be applied to extract significant dyadic connections in weighted networks in similar systems.