Dr. Paolo Missier, PhD (Computer Science)

School of Computing, Newcastle University
Urban Sciences Building, Firebrick Avenue
Newcastle upon Tyne, NE4 5TG, United Kingdom

paolo dot missier at newcastle dot ac dot uk
Official staff page @Newcastle
my Google Scholar profile
Scopus WOS page

[my self maintained list] [Official University page]
[Google Scholar] [DBLP] [ResearchGate]

http://twitter.com/pMissier http://uk.linkedin.com/pub/paolo-missier/0/254/b4a  

Current Projects

  • 2016-2019: PI, ReComp: sustained value extraction from analytics by recurring, selective re-computation (EPSRC funding, £585,000, 2016-2019)
  • 2017-2021: Co-I, Flood-PREPARED: Predicting Rainfall Events by Physical Analytics of REaltime Data (NERC funding, £1.9M, 48 months)
  • 2017-2019: PI for Newcastle, CEM-DIT: Communication and Trust in Emergencies, with Heriot-Watt and Coventry Universities (Office of Naval Research Global, £110,000, 3 years)
  • 2017-present: Enabling a fair IoT data marketplace without central trust. PhD student: Shaimaa Bajoudah
    • follows the 2016-2017 Researcher in Residence programme, Digital Economy Catapult (EPSRC funding, £25,000, Nov. 2016-June 2017)
Recent past projects

Recent publications

Bio and research profile

I am a Reader in Large-Scale Information Management (roughly equivalent to Associate Professor if you are not from the UK) in the School of Computing at Newcastle University, with 20+ years of experience in CS research, development, and research management.

The broad goal of my research is to understand the role of metadata, most notably data provenance [7], in making sense of the underlying (big) data as well as improving and optimising the processes that produce and extract added value from the data (i.e. through “big data” analytics).
I call this metadata analytics.
I have been leading, as Principal Investigator, the ReComp project (2016-2019, EPSRC; recomp.org.uk), which focuses on preserving value from large-scale data analytics over time through selective re-computation. The challenge of collecting provenance metadata and extracting value from it through analytics techniques is central to this research. [invited talk]

I am also interested in the role of provenance in making experimental science more reproducible [11,3,13], in helping track scientific data assets as a way to incentivise scientists to share their data in an Open Science setting (Data Trajectories: a research agenda) [4], and in the automatic creation of views over provenance to facilitate limited-trust data exchange [8] (funded projects: Trusted Dynamic Coalitions, PI, 2012-2013, EPSRC; CEM-DIT: Communication and Trust in Emergencies, CO-I, 2017-2019, ONRG).

I have also been involved in the specification of the W3C PROV data model for provenance (2011-2013), the successor to the Open Provenance Model [15], where I contributed to the main recommendation documents [12,14].

Additional research

My other research interests are centred around (large-scale) information management:
  • Social media analytics (Twitter) to help health authorities combat Zika and Dengue epidemics [5]
  • Enabling trust-less and fair marketplaces for “personal” IoT data streams using blockchain technology [2]
  • Real-time multi-source data analytics to predict rainfall events and mitigate their impact (funded project: Flood-PREPARED, 2017-2021, Co-I, NERC)
  • Implementing efficient and cost-effective genomics data processing pipelines using workflow technology on the Cloud (funded project: Cloud-eGenome, 2013-2015, PI, MRC/NIHR) [6]
  • Online active learning for Human Activity Recognition [9]
  • Analysing the effect of cognitive load on car drivers [10]
  • Data and Information Quality: during my PhD I proposed the notion of Quality Views [16,17], a semantics-based method for semi-automatically adding data quality control to scientific workflows. I am currently Sr. Associate Editor for the ACM Journal on Data and Information Quality (JDIQ).


I am responsible for our School’s post-graduate academic teaching on Big Data Analytics and for coordinating the School's new curriculum on Data Science.
Students interested in projects (UG/PGT/PGR) should look here.

News and blog posts

  • Invited talk: Data Provenance and its role in Data Science, Islamabad, Pakistan, April 2017. Very interesting and well-attended workshop organised and sponsored by the Higher Education Commission of Pakistan. The programme is here and here is the talk I presented.
    Posted May 7, 2017, 8:43 AM by Paolo Missier
  • Microsoft Azure Research Award for ReComp. ReComp receives a Microsoft Azure Research Award valued at $20,000 in Azure cloud resources, roughly equivalent to 28 CPU cores, 24 hours a day, for a year, valid until September ...
    Posted Sep 20, 2016, 3:46 PM by Paolo Missier
  • TAPP’16 Workshop paper and presentation. Short paper presented at the TAPP’16 workshop (Theory and Practice of Provenance), as part of ProvenanceWeek 2016 on 6–9 June in McLean, Virginia, US: Missier, Paolo, Jacek Cala ...
    Posted Sep 20, 2016, 3:45 PM by Paolo Missier
  • Feb. 2016: Best paper award at IDCC'16. IDCC'16 is the annual Conference on Data Curation. Here's the scoop: Best paper award! Sweet... and here's the paper.
    Posted Mar 4, 2016, 3:16 AM by Paolo Missier
  • Oct. 2015: ReComp project funded by the EPSRC. ReComp is a new EPSRC-funded project that will run between Feb. 2016 and Jan. 2019 at Newcastle University School of Computing Science, in collaboration with Civil Engineering and with the ...
    Posted Nov 17, 2015, 3:52 AM by Paolo Missier
  • Interesting Times Higher Education article on metrics and research management. A colleague at Newcastle just forwarded this THE opinion article "Metrics are no substitute for good research management" to our internal list. To the entire CS staff, in fact. I ...
    Posted Aug 27, 2015, 6:36 AM by Paolo Missier
  • 6/2015: Invited talk at the 7th NGS Data Congress, June 15th, London. Presenting interim results from our Cloud-e-Genome project: "Scalable WES Processing And Variant Interpretation. With Provenance Recording. Using Workflow On The Cloud." Link to slides. A very successful talk ...
    Posted Jun 19, 2015, 1:38 AM by Paolo Missier
  • 2/2015: Future Generation Computer Systems -- new Special Issue on Scalable Workflow models and technology. Long time coming, and a lot of work, but at last our FGCS special issue on Advances in Scalable Workflow models and technology is finally ready: http://www.sciencedirect.com ...
    Posted Feb 28, 2015, 3:07 PM by Paolo Missier
  • DataONE receives a $15M second round of funding, proud to be personally involved. From the NSF news piece: "DataONE: the Data Observation Network for Earth (www.dataone.org) is a distributed cyberinfrastructure that meets the needs of science and society for open, persistent ...
    Posted Oct 24, 2014, 2:55 AM by Paolo Missier
  • Springer releases download figures for our DILS'09 proceedings eBook. Nice of Springer to release download figures to volume editors -- that does not seem to include hardcopy purchases, but really, who orders hard copies these days? This is for the ...
    Posted Jun 11, 2014, 1:21 AM by Paolo Missier