For the reading club next Monday 26 October I propose the following paper:

Web-Scale Querying through Linked Data Fragments by Ruben Verborgh, Miel Vander Sande, Pieter Colpaert, Sam Coppens, Erik Mannens and Rik Van de Walle.

Abstract: To unlock the full potential of Linked Data sources, we need flexible ways to query them. Public SPARQL endpoints aim to fulfill that need, but their availability is notoriously problematic. We therefore introduce Linked Data Fragments, a publishing method that allows efficient offloading of query execution from servers to clients through a lightweight partitioning strategy. It enables servers to maintain availability rates as high as any regular HTTP server, allowing querying to scale reliably to much larger numbers of clients. This paper explains the core concepts behind Linked Data Fragments and experimentally verifies their Web-level scalability, at the cost of increased query times. We show how trading server-side query execution for inexpensive data resources with relevant affordances enables a new generation of intelligent clients.

Notes / Discussion summary / Action points:

---

For the reading club next Monday 19 October I propose the following paper:

Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications 

by Tim Clark, Paolo N Ciccarese and Carole A Goble.

Micropublications are web-friendly and machine-tractable models of scientific publications, which enable representing scientific claims together with the corresponding evidence (data and citations).

... 33 pages: you can read the workshop paper ...

--

Towards Linked Open Data enabled Data Mining: Strategies for Feature Generation, Propositionalization, Selection, and Consolidation 

http://ub-madoc.bib.uni-mannheim.de/38852/1/Petar_Ristoski_-_PhD_Symposium_ESWC_2015.pdf

--

Why linked data is not enough for scientists http://www.sciencedirect.com/science/article/pii/S0167739X11001439

It describes an additional Linked Data layer (research objects) which allows bundling different aspects of the scientific process (data, method, attribution, publication). This paper probably fits better with the research interests of Martine/Dena/Tobias, but here is why I am interested in reading it: I am exploring ways of supporting Digital Humanities researchers in doing research on the Rijksmuseum dataset. The paper is from 2011, published in the Future Generation Computer Systems journal, and one of the things I am curious about is how much of it has been realised in the meantime.

Further discussion points that came up while reading it:
- Would the proposed Research Objects approach be applicable to your domain?
- Is this one of many best practices?
- Is the practice widely adopted?
- Are subproblems solved by others (e.g. the PROV vocabulary)?
- Are there applications supporting researchers in creating these bundles?
- Did you ever try reusing data/methods?
- How would you address versioning of graphs?

Document with the overview of the upcoming papers: https://docs.google.com/document/d/1VIGX88GunenRCPe4fQslcS7hSrW5TU2cejDfTcmc1uE/edit

----

Ranking Buildings and Mining the Web for Popular Architectural Patterns

In this paper, crowdsourcing, social media, linked open data and machine learning are combined in order to find influential architectural factors for buildings. I think this is a useful combination and a unique topic, yet one that is applicable to other domains as well.

...

For this paper, there appear to be some considerable (at least superficial) differences between the author's version of the paper (Martine's link) and the official version provided by ACM: http://dl.acm.org/citation.cfm?id=2001297

----

For the reading club next Monday 14 September I propose the following topic "publishing negative results".

I think it is widely agreed that publishing negative results may provide valuable information and may benefit scientific progress.

In the discussion on Monday I would like to focus on how to actually write a publication presenting negative findings.

I suggest the following paper: Don't turn social media into another "Literary Digest" poll

In this paper the author aims to provide a balanced view of the actual possibilities of social media analytics. In order to do that, he repeats (in a much more detailed way) previous studies on electoral prediction from Twitter data. In contrast to the previous studies, his results show that the outcome of the 2008 U.S. Presidential Elections could not have been predicted from Twitter data using commonly applied methods.

In an opinion paper the author discusses how he came to write this paper, and how hard it was to get it published.

As a preparation for Monday, I would like you to think about the following questions:

1) Your own research: 

    Did you ever obtain negative results that supported your null hypothesis, or that did not fit with the current scientific thinking?

    Did you publish them, and how did you do that?

2) Ways of presenting:

    How would you present your negative findings in an attractive way? The suggested paper shows one possible format, i.e., redoing previous studies and comparing the results. Can you think of other ways, or do you have examples of such papers?

(see also the Google document on the reading club with the schedule and notes)