
Recommendation Systems

Recommendation systems are mechanisms within crowdsourcing platforms that influence how workers and tasks are assigned to each other. By steering this assignment, they can help the overall system reach goals defined for the platform.
In existing crowdsourcing platforms, the assignment of workers and tasks depends on the worker's choice. In most platforms, the worker is aided by a categorization of tasks provided by the task requester. Tasks are often not equally distributed across these categories, and in some cases the categorization is only very coarse or does not exist at all. This insufficient support for the worker's choice prevents workers from finding suitable tasks and ultimately results in a suboptimal assignment of workers and tasks. We showed this in a user study with 500 participants, in which a majority of workers agreed that it is not easy to find interesting tasks [1].
Recommendation systems can provide further support in finding matching tasks for a worker. Well-known approaches such as collaborative filtering, which is used in e-commerce systems, can be applied where products are available on a long-term basis. In crowdsourcing platforms, however, tasks are only available for a short time and in limited numbers, so a recommended item would often no longer be available. To leverage the known approaches, similarity measures between tasks have to be identified, so that conclusions can be drawn from already processed but no longer available tasks towards recommending open tasks.
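
As a rough illustration of this idea, the following sketch ranks currently open tasks by their textual similarity to tasks a worker has already completed. The task descriptions and the TF-IDF/cosine-similarity setup are illustrative assumptions and not the project's actual pipeline.

# Minimal sketch of similarity-based task recommendation: open tasks are ranked
# by their textual similarity to tasks the worker has already completed.
# Task descriptions and the TF-IDF/cosine setup are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

completed_tasks = [
    "Transcribe a short audio interview into English text",
    "Categorize product images into clothing categories",
]
open_tasks = [
    "Transcribe a podcast episode and add punctuation",
    "Label street photos with the objects they contain",
    "Write a 100-word product review for a kitchen appliance",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(completed_tasks + open_tasks)
completed_vecs = matrix[: len(completed_tasks)]
open_vecs = matrix[len(completed_tasks):]

# Score each open task by its highest similarity to any completed task.
scores = cosine_similarity(open_vecs, completed_vecs).max(axis=1)
for score, task in sorted(zip(scores, open_tasks), reverse=True):
    print(f"{score:.2f}  {task}")
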

To create recommendation systems that are accepted by the workers, the behaviour and priorities of the workers have to be known. In a user study with 151 workers of a micro-task market, we asked how relevant different criteria are for their task choice. Out of nine criteria, "most money" and [best] "payment per time" came in first and second position, while "similar" [to a previously chosen task] was ranked third [2]. In contrast to the first two criteria, similarity cannot easily be derived from the metadata of the tasks.
How workers perceive this "similarity" of tasks was examined in our user study with 500 participants [2].
Fourteen different aspects were chosen for analysis: some of them can be derived from the task description (e.g. complexity, domain, or action), some can be taken from the task metadata provided by the task requester (e.g. payment or time), and some depend on the task requester's attributes (e.g. origin or type of requester). The results of the study show that the aspects "[required] action", "comprehensibility [of the task description]", "domain", and "purpose" are regarded as most valuable for determining task similarity. Overall, the five aspects which have to be derived from the task description occupy the first five positions. These results correlate with the results of an analysis of 3.5 million processed tasks spanning six years [3].

To compute task similarities using the identified similarity aspects shown above, we examined different approaches and feature sets for the classification and clustering of tasks [4]. To evaluate the performance of the features used for the clustering algorithm, we created a manually annotated corpus of ten tasks with all their pairwise similarities regarding the top three similarity aspects mentioned above. For the aspect of "required action", for example, we found that WordNet's path similarity combined with a simple verb phrase grammar yields the best results, reaching a Spearman rank-order correlation of 0.697.
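
A minimal sketch of the "required action" idea is shown below: the main verbs of two tasks are compared via WordNet's path similarity, and the predicted scores are checked against human ratings with a Spearman rank correlation. The verb pairs and gold scores are made up for illustration; the study itself extracted verbs from full task descriptions using a verb phrase grammar.

# Sketch of "required action" similarity: compare task verbs via WordNet path
# similarity and evaluate against human ratings with Spearman's rank correlation.
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')
from scipy.stats import spearmanr

def verb_similarity(verb_a: str, verb_b: str) -> float:
    """Maximum WordNet path similarity over all verb senses of the two words."""
    scores = []
    for s_a in wn.synsets(verb_a, pos=wn.VERB):
        for s_b in wn.synsets(verb_b, pos=wn.VERB):
            sim = s_a.path_similarity(s_b)
            if sim is not None:
                scores.append(sim)
    return max(scores, default=0.0)

# Hypothetical verb pairs with made-up human similarity ratings (0..1).
pairs = [("transcribe", "translate"), ("categorize", "classify"), ("write", "photograph")]
gold = [0.6, 0.9, 0.1]

predicted = [verb_similarity(a, b) for a, b in pairs]
rho, _ = spearmanr(predicted, gold)
print("Spearman rank correlation:", round(rho, 3))
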
The domain of job offers is very similar to the domain of tasks offered in crowdsourcing platforms. Providers of job search engines also have to deal with a highly dynamic system in which collaborative filtering approaches cannot be used. We examined how an automated classification of job offers can be used efficiently in such a scenario [5]. Deploying Support Vector Machines within a combined setup of ensemble learning and active learning, a classification with high accuracy could be reached while fewer documents have to be annotated by experts.
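
The following sketch shows pool-based active learning with an SVM and uncertainty sampling, in the spirit of reducing the number of expert-annotated documents. It uses a single LinearSVC on a public stand-in corpus and omits the ensemble part of the published setup, so it is an assumption-laden illustration rather than the method from [5].

# Sketch of pool-based active learning with an SVM and uncertainty sampling.
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X = TfidfVectorizer(stop_words="english").fit_transform(data.data)
y = np.array(data.target)

rng = np.random.default_rng(0)
labeled = list(rng.choice(len(y), size=20, replace=False))   # small seed set
pool = [i for i in range(len(y)) if i not in labeled]

for _ in range(10):
    clf = LinearSVC().fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the pool document closest to the decision boundary.
    margins = np.abs(clf.decision_function(X[pool]))
    query = pool.pop(int(np.argmin(margins)))
    labeled.append(query)            # in practice, an expert would label this document

print("Documents labeled:", len(labeled))
print("Accuracy on remaining pool:", round(clf.score(X[pool], y[pool]), 3))
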


[1] Steffen Schnitzer, Svenja Neitzel, Sebastian Schmidt and Christoph Rensing. "Perceived Task Similarities for Task Recommendation in Crowdsourcing Systems". In: Proceedings of the Conference Companion on World Wide Web. Montreal, Canada, Apr. 2016.
[2] Steffen Schnitzer, Christoph Rensing, Sebastian Schmidt, Kathrin Borchert, Matthias Hirth and Phuoc Tran-Gia. "Demands on Task Recommendation in Crowdsourcing Platforms - the Worker's Perspective". In: Proceedings of the CrowdRec Workshop. Vienna, Austria, Sep. 2015.
[3] Martin Becker, Kathrin Borchert, Matthias Hirth, Hauke Mewes, Andreas Hotho and Phuoc Tran-Gia. "MicroTrails: Comparing Hypotheses About Task Selection on a Crowdsourcing Platform". In: Proceedings of the International Conference on Knowledge Technologies and Data-driven Business. Graz, Austria, Oct. 2015.
[4] Svenja Neitzel. "Towards Similarity-Based Task Recommendation in Crowdsourcing Systems". Master's thesis. Darmstadt, Germany: Technische Universität Darmstadt, 2016.
[5] Sebastian Schmidt, Steffen Schnitzer and Christoph Rensing. "Text Classification Based Filters for a Domain-specific Search Engine". In: Computers in Industry 78.C (May 2015).