The app analytics framework

Representing the app ecosystem as a graph

Download our Neo4j Graph Database:

Dataset D1: googlePlayDB-Jan2014.rar

Dataset D2: googlePlayDB-Mar2014.rar

Partial view of our Graph database:

We query the graph database with Cypher, Neo4j's declarative query language. A Cypher query describes a pattern of nodes and relationships, and the database returns every subgraph that matches it.

Examples of queries:
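As an illustration of the pattern-matching style, here is a minimal Python sketch that wraps one Cypher query. The schema is an assumption made for this example: the labels `App` and `Permission`, the relationship type `REQUESTS`, and the `name` property are hypothetical and may not match the published database. The helper `apps_requesting` is also hypothetical; it accepts any session-like object with a `run(query, **params)` method, such as a session from the official neo4j Python driver.

```python
# Hypothetical schema: (:App {name})-[:REQUESTS]->(:Permission {name}).
# The real labels, relationship types, and property names in the
# published database may differ.
APPS_REQUESTING_PERMISSION = """
MATCH (a:App)-[:REQUESTS]->(p:Permission)
WHERE p.name = $permission
RETURN a.name AS app
ORDER BY app
LIMIT $limit
"""


def apps_requesting(session, permission, limit=10):
    """Return the names of apps that request `permission`.

    `session` is any object exposing run(query, **params) whose result
    can be iterated as records indexable by column name.
    """
    result = session.run(
        APPS_REQUESTING_PERMISSION, permission=permission, limit=limit
    )
    return [record["app"] for record in result]
```

The `$permission` and `$limit` placeholders are query parameters, so values are passed separately instead of being spliced into the query text. Older Neo4j releases use a different parameter syntax, so the query may need adjusting for the server version that loads the dump.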

Statistics on the collected apps

Our dataset consists of 46,644 mobile apps available on the Google Play Store (as of March 2014). The apps span the 27 categories defined by Google and the 4 predefined subcategories (free, paid, new free, and new paid).

App Statistics

The following chart shows the distribution of apps across the categories of the Google Play Store.

The following chart shows the distribution of apps across the categories and subcategories of the Google Play Store.

Permission Statistics

The following graph reports the number of apps requesting each number of permissions. The median app requests 6 permissions.
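A statistic of this kind can be computed as in the following sketch. The data is a toy sample invented for illustration, not taken from the dataset, and the variable names are hypothetical.

```python
from collections import Counter
from statistics import median

# Toy data (not from the dataset): app identifier -> requested permissions.
permissions_by_app = {
    "app.a": ["INTERNET", "CAMERA", "READ_CONTACTS"],
    "app.b": ["INTERNET"],
    "app.c": [],
    "app.d": ["INTERNET", "ACCESS_FINE_LOCATION"],
    "app.e": ["INTERNET"],
}

# Number of permissions requested by each app.
counts = [len(perms) for perms in permissions_by_app.values()]

# Histogram: number of permissions -> number of apps requesting that many.
histogram = Counter(counts)

# Median number of permissions per app.
median_permissions = median(counts)
```

The histogram corresponds to what the graph plots, and `median_permissions` is the summary figure quoted alongside it.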

The following chart summarizes the distribution of apps without permissions in each category of the Google Play Store. Most of the apps without permissions belong to the Personalization and Wallpaper categories.

The following chart illustrates the average number of permissions requested by apps in each category of the Google Play Store. Communication apps request the most permissions on average.
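The per-category average can be sketched as a simple group-and-mean computation. The records below are invented toy values, not figures from the dataset, and all names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy records (not from the dataset): (category, number of permissions).
apps = [
    ("Communication", 12),
    ("Communication", 8),
    ("Personalization", 0),
    ("Personalization", 2),
    ("Tools", 5),
]

# Group the permission counts by category.
counts_by_category = defaultdict(list)
for category, n_permissions in apps:
    counts_by_category[category].append(n_permissions)

# Average number of permissions per category.
average_by_category = {
    category: mean(values) for category, values in counts_by_category.items()
}

# Category whose apps request the most permissions on average.
most_demanding = max(average_by_category, key=average_by_category.get)
```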

The following chart illustrates the distribution of permission requests across the permission categories.

Dataset Evolution Statistics

The following graph depicts the evolution of our dataset over time in terms of the number of apps and the number of distinct permissions.

The following graph summarizes the evolution of our dataset in terms of the number of app additions, removals, and updates. The trend in the Google Play Store is toward adding new apps rather than updating or removing existing ones.
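One way to derive additions, removals, and updates is to compare two snapshots of the store, as in this minimal sketch. It assumes each snapshot maps an app identifier to a version string and that a changed version string counts as an update. The page does not say how updates are actually detected, so that rule and the function name `diff_snapshots` are hypothetical.

```python
def diff_snapshots(old, new):
    """Compare two snapshots mapping app identifier -> version string.

    Returns three sets: apps added, apps removed, and apps updated.
    An app counts as updated when it appears in both snapshots with a
    different version string (an assumption made for this sketch).
    """
    added = new.keys() - old.keys()
    removed = old.keys() - new.keys()
    updated = {app for app in old.keys() & new.keys() if old[app] != new[app]}
    return added, removed, updated


# Toy snapshots (not from the dataset).
january = {"app.a": "1.0", "app.b": "1.0", "app.c": "2.1"}
march = {"app.a": "1.0", "app.b": "1.1", "app.d": "1.0"}

added, removed, updated = diff_snapshots(january, march)
```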