Program and Papers

DAY 1: Thursday 22 August

09:00 - 09:30

Registration

09:30 - 09:40

Opening and Welcome

Deutsche Nationalbibliothek / IFLA Big Data SIG

SESSION 1: On library collections as data

09:40 - 10:10

An innovative approach to scalable semantic embedding

Embedding words and entities in compact, semantically meaningful vector spaces allows for computable semantic similarity/relatedness, which could make search more intelligent. In our quest for practical solutions to support libraries in this field, we revisit the global co-occurrence based embedding methods and propose a conceptually simple and computationally lightweight approach. We will show the potential of this scalable semantic embedding method for other applications such as entity disambiguation, citation recommendation, and clustering and collection exploration.

Dr. Shenghui Wang (OCLC EMEA, the Netherlands)

Slides
Paper

10:10 - 10:40

A macro-analytical exploration of text novels

The demands on the information supply have changed considerably as a result of the digital transformation. The introduction of new research methods such as automated data and text analysis of digital collections is accompanied by the need for altered forms of inventory provision. Paperback fiction novels have become the subject of literary research again in the last 10 to 15 years. We report on the results of an evaluation of 9 000 German-language book novels to assess how genres of book novels differ from one another and how book novels differ from scholarly literature.

Prof. Fotis Jannidis & Leonard Konle (University of Würzburg, Germany); Dr. Peter Leinen (Deutsche Nationalbibliothek, Germany)

Slides

10:40 - 10:50

Questions / Discussion

10:50 - 11:20

SESSION 2: On data analysis and visualisation

11:20 - 11:50

ezPAARSE and ezMESURE: assembling dashboards on a national repository from fine-grained and locally generated usage statistics to electronic resources

ezPAARSE is open source software which comes with a list of more than 200 community-contributed parsers. Collecting ezPAARSE-generated and enriched Access Events into the ezMESURE French national repository for dynamic dashboard creation, consolidation and representation has been a sustained effort for INIST-CNRS, Couperin.org and its members. ezMESURE is able to store, retrieve and display this data in a variety of dashboards. With this ecosystem we hope to provide our users with a complete processing chain to analyse their data in fine detail and to give them strategic elements to support their subscription campaigns.

Thomas Porquet (Couperin.org Consortium, France)

Slides
Paper

11:50 - 12:20

Using graph visualization to enhance representation and evaluation of work clusters

Work clustering is achieved by creating and matching keys which combine different metadata elements of a bibliographic record. Graph visualisation enables users to obtain a more transparent view of the connections underlying the structure of a work cluster. In this presentation we show the use of graph visualisation for display, analysis and evaluation of work clusters. Using graph visualisation not only assists in grasping the internal structure of a work cluster but also helps managing and evaluating large datasets, ultimately supporting data representation and findability.

Dr. Angela Vorndran (German National Library, Germany)

Slides
Paper

12:20 - 12:50

Meaning construction in data visualisation

In this digital data age, users are at the center of data visualisation, which exists to meet their information needs. Tools, software and technology have advanced rapidly, allowing information professionals to process and visualise data in many ways. The author will present her research and teaching experience on the process of meaning construction in data visualisation. This critical study analyzes how meaning is constructed in data visualisation, offering insights that help information professionals design visualisations, understand users better and serve them well in the digital data age.

Dr. Yan Ma (University of Rhode Island, United States of America)

Slides
Paper

12:50 - 13:00

Questions / Discussion

13:00 - 14:00

SESSION 3: On data skills and data services

14:00 - 14:30

An analysis of selected data practices: a case study of researchers and scientists at the national museum of the Philippines

This paper describes a survey of data practices administered to the researchers and scientists at the National Museum of the Philippines. The survey results may inform the museum libraries in developing new data services and instruction, while also highlighting the need for additional research into data practices for specific disciplinary areas or types of researchers and scientists. This research highlights the need for further, in-depth study of the museum, particularly focusing on qualitative methods. Such a study could help predict tool and storage needs so that the libraries can be better positioned to serve them.

Gianina O. Cabanilla (National Museum of the Philippines, the Philippines)

Paper (not in IFLA Library)

14:30 - 15:00

Data literacy training needs of researchers at South African universities

The purpose of this study was to determine the data literacy training needs of researchers at South African universities. The outcome of the study would assist the researchers in developing a data literacy training programme which addresses the identified needs. Areas identified for training include research data management planning, metadata management, consistent file naming, and data citation. As a recommendation, librarians and research support personnel should collaborate on the development of data literacy training material to be used by researchers.

Dr. Mathew Moyo (North-West University, South Africa)

Slides

15:00 - 15:30

Data intelligence and AI in the library - from vision to reality

The library world is developing and changing faster than ever. As the profession discusses trends such as Artificial Intelligence, data intelligence, data-driven decision making, machine learning and big data, it is important not to lose sight of practical benefits when moving from vision to reality. In this presentation I will discuss how we can use artificial intelligence and data intelligence in our systems to support library processes. It will include an overview of important concepts and discuss concrete steps to put theory into practice. Using examples from the Ex Libris systems Alma, Primo and Summon, I will look at administrative processes in a library system as well as library discovery, and at how new technologies can create new services that help users find their way in a world of big data.

Christine Stohn (Ex Libris, Germany)

Slides

15:30 - 16:00

A study of associated organisations of research data and the energy industry's effective utilisation of an emergency management knowledge database

This study examines the institutional research knowledge database from three perspectives: knowledge organisation, knowledge reuse and knowledge effectiveness. It covers research datasets in the natural and social sciences, using emergency management for the energy industry as a case study. Taking research data and its effective utilisation in an emergency management knowledge database of the energy industry as its research object, the study designs and plans a strategy for associated organisations of research data.

Du Ping-Ping (China University of Mining and Technology, China)

Paper (not in IFLA Library)

16:00 - 16:10

Questions / Discussion

16:10 - 16:30

16:30 - 17:30

Tour of the German National Library (DNB)

17:30 - 20:00

Sponsored social event

Depot 1899, Textorstraße 33, Sachsenhausen, Frankfurt

DAY 2: Friday 23 August

SESSION 4: On linking and sharing data

09:00 - 09:30

Leaving comfort behind - a national union catalogue transition to linked data

The content of the Swedish Union Catalogue is a joint effort of cataloguers working in more than 500 member libraries. The new version of the Union Catalogue, Libris XL, is a custom-built open source system based on BIBFRAME 2.0 and linked open data. With linked data at its core it creates new ways of opening up data previously locked away in library-specific formats and structures. We will present on the complexities of the transition to Libris XL which involved migrating 10 million bibliographic records from MARC to BIBFRAME and adopting a new format-driven cataloguing editor which introduces a completely new way of representing information in Libris.

Bodil Wennerlund (National Library of Sweden, Sweden)

Slides
Paper

09:30 - 10:00

Information management as knowledge graph curation: how the role of information management specialists at Cochrane evolved from scientific literature search co-ordinators to knowledge graph curators as the organisation embraced FAIR data principles

Historically, Cochrane's systematic review production process was based on primary evidence collated via scientific literature searches performed by information specialists. This had operational scalability limitations and offered no way of generating notifications of primary evidence updates. To address this, Cochrane adopted the principles of FAIR data, built on a lightweight "PICO" ontology that encodes Cochrane's formalised framework for clinical healthcare. This allowed information specialists to represent the knowledge encoded in each review, generating a knowledge graph populated by inputs from citizen scientists, machines and subject experts, and curated by information specialists.

Julian Everett (Data Language, England)

Slides

10:00 - 10:30

Performance optimisation in sharing big geoscience data

Online geoscience data sharing plays a key role in enabling organisations to make informed decisions. To maximise the benefits, the structured data workflows and frameworks governing the data have to be continually experimented with. In this paper we present a framework that addresses geoscience big data issues through the integration of data visualisation, a sub-division framework, data provenance and automated cloud computing deployment techniques. The results are evaluated through an extensive set of experiments that validate the approach and highlight the key benefits of the proposed solution.

Basuti Bolo (Botswana International University of Science and Technology, Botswana)

Slides
Paper

10:30 - 10:40

Questions / Discussion

10:40 - 11:00

SESSION 5: On open data and data repositories

11:00 - 11:30

A world of carrots and sticks - Open data policy compliance will change academia forever, but what level is awareness globally?

This paper discusses the top-down approach of funder mandates and recent pilots by the National Institutes of Health (NIH) to move towards FAIR sharing of the research data it funds. At the researcher level, we will also report on the results of the 2019 State of Open Data, the biggest researcher survey on opinions on open data. Do the mandates drive adoption of policy compliance, or are researchers motivated in other ways? Both stories will help us understand how communities can better drive engagement and advance FAIR research.

Mark Hahnel (Figshare, England)

Slides

11:30 - 12:00

Management of open government data in Nigeria academic libraries: status, strategies and challenges

Since the return to democratic government in 1999, Nigerian citizens have been engaging government for good leadership anchored in open governance. A way of achieving open government is making government data available to the people. This paper will examine open government data in Nigeria and management strategies academic libraries could deploy to create awareness and access for researchers. The research will adopt document analysis of Nigerian government websites to examine their status in terms of openness and accessibility and deploy a review of literature to articulate strategies to manage them and the challenges therein.

Dr. Ifeanyi J Ezema (Enugu State University of Science and Technology, Nigeria)

Slides
Paper

12:00 - 12:30

Webometric analysis of 20 online open access data repositories

Data repositories are growing in number on the Internet, and the data they hold is also growing rapidly. This paper reveals the present status of the world's top open access online data repositories. We give a brief introduction to a webometric analysis of the top 20 data repositories, based on the topics covered, the total number of data sheets available, the software used for data retrieval, data formats and data visualisation. We also show how these repositories are helpful for data librarians and reference librarians.

Arshad Mahmood (State Bank of Pakistan, Pakistan)

Slides

12:30 - 12:40

Questions / Discussion

12:40 - 12:45

Wrap-up and closing

IFLA Big Data SIG

CONTACT INFORMATION
Dr. Peter Leinen
Deutsche Nationalbibliothek
Head of the Information Infrastructure Department
Adickesallee 1
60322 Frankfurt am Main
Phone: +49 69 1525-1700
Fax: +49 69 1525-1799
Email: p.leinen@dnb.de
http://www.dnb.de