Abstracts

Title: A Study of the Influence of Articles in the Large-Scale Citation Network

Authors: Frederick Kin Hing Phoa, Livia Lin Hsuan Chang, Junji Nakano

Abstract

Many research metrics now exist at the author, article, and journal levels, such as the impact factor. However, none of them possesses a universally meaningful interpretation of research influence at all levels, and many are subject-biased and consider neighboring relations only. In this work, we introduce a new network-based research metric called the network influence. It considers all information in the whole network and is universal to all levels. Due to its statistical origin, this metric is computationally efficient and statistically interpretable even when applied to a large-scale network. This work demonstrates the analysis of networks via network influence using a large-scale citation database, the Web of Science. Considering only the articles within the statistics community in 2005-2014, the network influence of every article is calculated and compared, resulting in a top-ten list of important articles that differs slightly from the list obtained via impact factors. This metric can be easily extended to author citation networks and many similar networks embedded in the Web of Science.

Keywords: Network Influence, Research Metrics, Network Centrality, Web of Science

Title: Ranking of Scientific Authors in Large Scale Networks

Authors: Jorge Silva, David Aparício, Fernando Silva

Abstract

Citation-based metrics such as the h-index are often used as an indicator of researchers' merit, but they have serious drawbacks due to biases inherent to specific fields of science. Ranking authors more carefully (i.e., with network metrics) is a challenging research task, and a number of random walk algorithms have been proposed. Typically, these PageRank-like algorithms assume that the full network is known. However, due to the sheer size of citation datasets, it is (a) not feasible to obtain or process the full data and (b) impossible to apply these algorithms to the complete network. In this work, we present a strategy that summarizes part of the network (in order to decrease the computational and memory cost of the ranking strategy) and uses that summarized information to improve the predicted ranks. Furthermore, our algorithm leverages information such as author productivity, the venue, and the year of publication, which have previously been established as relevant factors for author ranking. We compare our method with other state-of-the-art approaches in several scenarios and demonstrate that our method yields better results. Finally, we apply our method to produce a ranking of Portuguese authors.
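
For readers unfamiliar with the PageRank-like random walk algorithms mentioned in this abstract, the following is a minimal sketch of plain PageRank by power iteration. It is not the authors' summarization-based method; the function name, the damping value of 0.85, and the toy citation graph are assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for node in nodes:
            if out_links[node]:
                # Each node passes an equal share of its rank to every node it cites.
                share = damping * rank[node] / len(out_links[node])
                for target in out_links[node]:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: an edge (a, b) means "a cites b".
toy_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("A", "D")]
scores = pagerank(toy_edges)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Note that this sketch needs the complete edge list in memory, which is exactly the assumption the abstract identifies as impractical for very large citation datasets.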

Title: Research diversity index for evaluation of research

Authors: Hiroka Hamada, Mio Takei and Keisuke Honda

Abstract

Research organizations, along with their funding agencies, routinely need to measure the performance of their organization. However, funding agencies and grant review panels that support seed research must make difficult predictions about the likelihood of future scientific success. It is a widely held view that seed research is very important as an opportunity for innovation but is not suited to evaluation by short-term scientific achievements. If a grant program is designed to support seed research, how should we evaluate our grant efforts? Scientific citation indexing is widely used to evaluate the success of a research organization. For example, Hutchins et al. (2016) introduced a new method that better quantifies the influence of a research article by making novel use of its co-citation network to field-normalize the number of citations it has received. A major problem of existing metrics such as the journal impact factor and the h-index is implicit bias.

In this study, we propose a new clustering method to measure the influence of papers in all areas of science. To see the structure of the entire set of relationships, we apply a stochastic block model (SBM) to large-scale citation network data. The SBM yields a matrix divided into blocks that represent the relationships among research fields.

Keywords: Research performance evaluation, Stochastic block model, Pointwise mutual information, Co-citation network analysis
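
To make the block matrix described in this abstract more concrete, here is a minimal sketch that computes block-to-block citation densities once every paper has been assigned to a block. It does not fit an SBM; the block labels are assumed to be given (in practice they would come from the fitting step), and the function name and toy data are hypothetical.

```python
from collections import Counter


def block_density_matrix(edges, block_of):
    """Fraction of possible directed edges realized between each pair of blocks."""
    blocks = sorted(set(block_of.values()))
    size = Counter(block_of.values())
    edge_count = Counter((block_of[s], block_of[t]) for s, t in edges)
    density = {}
    for r in blocks:
        for c in blocks:
            # Within a block, a paper citing itself is not a possible edge.
            possible = size[r] * (size[c] - 1 if r == c else size[c])
            density[(r, c)] = edge_count[(r, c)] / possible if possible else 0.0
    return density


# Hypothetical toy data: four papers in two assumed blocks (fields).
toy_blocks = {"p1": "stats", "p2": "stats", "p3": "ml", "p4": "ml"}
toy_edges = [("p1", "p2"), ("p3", "p4"), ("p4", "p3"), ("p3", "p1")]
for pair, value in sorted(block_density_matrix(toy_edges, toy_blocks).items()):
    print(pair, value)
```

Large diagonal entries indicate fields that mostly cite themselves, while off-diagonal entries indicate citation flow between fields.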

Title: Bibliometric indicators in describing the performance of individual researchers

Author: Elizabeth Vieira

Abstract

Individual researchers play a central role in the scientific system, so it is important to study their behaviour and performance in relation to their scientific activities. Researchers are frequently the object of research evaluations and analyses and, worldwide, it is accepted that evaluations are vital in improving scientific systems. Peer review is generally regarded as the best method available to select the most promising researchers, and bibliometric indicators can help inform this process, although they are not free of limitations. Using a particular process for the selection of junior and senior researchers in Portugal (Investigador FCT), a 70% overlap was found between the final decisions of the peer review process and the results given by bibliometric indicators. The results show the promising role of bibliometric techniques in informing the peer review process.

Title: Understanding research trends based on paper abstracts using topic modeling

Authors: Mio Takei, Junji Nakano, Kota Hattori, Tomokazu Fujino, Keisuke Honda

Abstract

The production of data is expanding at an astonishing pace. Analysis using statistical science is indispensable for processing the data. In fact, many organizations analyze their data with statistical approaches and try to promote their businesses. In this paper, we investigate research movements and trends in statistical science using Latent Dirichlet Allocation (LDA).

Abstracts from academic papers, which are the most objective output of research activities, are available from the Web of Science (WoS) and are good data for observing research trends. Therefore, we analyzed these data to understand current research movements and trends in statistical science.

We applied an LDA model to paper abstracts and estimated topics. We then aggregated these topic distributions over time to find research movements.

Our results can be used to identify the topics in statistical science that are likely to rise, or decline, in the future. We can use these results to determine the special research themes of our institute in order to promote active statistical research in Japan. Furthermore, we can help researchers discover promising research topics.

Keywords: latent Dirichlet allocation (LDA), paper abstract, statistical science, topic modeling
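
The aggregation step described in this abstract (combining per-abstract topic distributions over time) might be sketched as follows. This is only an illustration under stated assumptions: the per-document topic proportions are taken as already estimated by some LDA implementation, averaging by publication year is one possible aggregation, and the function name and toy values are hypothetical.

```python
from collections import defaultdict


def topic_trends(doc_topics):
    """Average per-document topic proportions within each publication year."""
    sums = {}
    counts = defaultdict(int)
    for year, proportions in doc_topics:
        if year not in sums:
            sums[year] = [0.0] * len(proportions)
        for k, p in enumerate(proportions):
            sums[year][k] += p
        counts[year] += 1
    # Years are returned in chronological order so trends can be read off directly.
    return {year: [s / counts[year] for s in sums[year]] for year in sorted(sums)}


# Hypothetical input: (publication year, topic proportions from a fitted LDA model).
toy_docs = [
    (2005, [0.8, 0.2]),
    (2005, [0.6, 0.4]),
    (2014, [0.3, 0.7]),
]
for year, mean_proportions in topic_trends(toy_docs).items():
    print(year, [round(p, 2) for p in mean_proportions])
```

A topic whose yearly mean proportion grows over the period would be a candidate rising topic, and one whose proportion shrinks a candidate declining topic.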

Title: Temporal Expertise Profiling using Keyword Co-Occurrence Network

Authors: Pedro Belém, Pedro Ribeiro, Fernando Silva

Abstract

Research activity has in itself a significant social component. Not only are most research publications authored by more than one person, but scientific study and discovery are also commonly built upon the previous work of many researchers. The idea of a profile that summarizes a researcher's career, which can be decades long, is therefore important to facilitate the understanding of this social research dynamic. In this work we provide a method to generate a profile that showcases the research interests of a researcher and also how those interests evolved over time. This problem of linking researchers to their research interests is normally called Temporal Expertise Profiling. We applied our method to a Portuguese repository of scientific publication metadata, the Authenticus system. We first detect which research areas are present in the Authenticus data using keyword co-occurrence techniques. Then we link the publications of a researcher to those areas, considering the time of publication. Finally, we propose a number of strategies to display this information, both the interests and their evolution, in a user-friendly way.
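
As a rough illustration of the keyword co-occurrence idea used in this abstract, the sketch below builds a weighted co-occurrence network from per-publication keyword lists. It covers only the network construction, not the detection of research areas or the temporal profiling; the function name, the lowercase normalization, and the toy keyword lists are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(publications):
    """Count how often each unordered pair of keywords appears in the same publication."""
    weights = Counter()
    for keywords in publications:
        # Normalize case and drop duplicates so each pair counts once per publication.
        unique = sorted({keyword.lower() for keyword in keywords})
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical keyword lists, one per publication.
toy_publications = [
    ["Graph Mining", "Network Motifs", "Subgraphs"],
    ["Network Motifs", "Graph Mining"],
    ["Topic Modeling", "Graph Mining"],
]
for pair, weight in cooccurrence_network(toy_publications).most_common():
    print(pair, weight)
```

Groups of keywords that are densely connected in such a network can then be treated as candidate research areas, for example by applying a community detection algorithm.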

Title: Collaborative research among different research fields to induce innovation: Academic papers trends on the PM2.5 environmental issues in China and Japan

Authors: Yuji Mizukami, Keisuke Honda, Frederick Kin Hing Phoa, Junji Nakano

Abstract

In the field of business administration, innovation is the act of creating new value by using a "new connection", a "new point of view", a "new way of thinking", or a "new usage method". In recent years, the promotion of innovation has been strongly encouraged in the academic field in Japan, and attempts are also being made to create new value through connections between fields of research. Moreover, along with the move to promote collaborative research among these research fields, research is being conducted to grasp and promote the degree of such collaboration. In this research, with the aim of providing indices for measuring that degree, we present indices that quantitatively indicate the degree of fusion between different fields and the distance between the fields.

As a case study, we explore the characteristics of PM2.5 research in China and Japan by network analysis and consider the field of environmental research. Since the 1990s, the Chinese economy has grown remarkably. On the other hand, environmental problems are getting worse; the damage caused by PM2.5 is especially serious, and an early solution is required. Research on PM2.5 is remarkably active in China, and many papers have been produced. In Japan, research on PM2.5 transported from China is advancing. We examine the differences between China and Japan in collaborative research across fields in PM2.5 research.

Keywords: Research performance evaluation, Network analysis, Co-author analysis, Innovation, PM2.5

Title: Hierarchical Expert Profiling using Heterogeneous Information Networks

Authors: Jorge Silva, Fernando Silva, Pedro Ribeiro

Abstract

Linking an expert to his or her knowledge areas is still a challenging research problem. The task is usually divided into two steps: identifying the knowledge areas/topics in the text corpus and assigning them to the experts. Common approaches to the expert profiling task are based on the Latent Dirichlet Allocation (LDA) algorithm. As a result, they require pre-defining the number of topics to be identified, which is not ideal in most cases. Furthermore, LDA generates a list of independent topics without any kind of relationship between them. Expert profiles created using this kind of flat topic list have been reported as highly redundant and often either too specific or too general. In this work, we propose a methodology that addresses these limitations by creating hierarchical expert profiles, where the knowledge areas of a researcher are mapped along different granularity levels, from broad areas to more specific ones. For this purpose, we explore the rich structure and semantics of Heterogeneous Information Networks (HINs). Our strategy is divided into two parts. First, we create a data-driven topical hierarchy by discovering overlapping communities and ranking the nodes inside each community. Second, we map experts into the topical hierarchy, generating their hierarchical knowledge profiles. We tested our approach on a subset of authors from the Authenticus database and showed that we are capable of generating accurate profiles.

Title: Author Name Identification using Dirichlet-multinomial Regression topic model

Authors: Tomokazu Fujino, Keisuke Honda, Hiroka Hamada

Abstract

In this talk, we propose a new framework for extracting, from an academic document database such as the Web of Science, a complete list of the articles written by researchers who belong to a specific research or educational institute. The framework is based on Latent Dirichlet Allocation (LDA), which is a topic model. To improve the framework, we use various techniques and indices, such as synonym retrieval, inverse document frequency, and Dirichlet-multinomial Regression (DMR). By using DMR, it is possible to reflect observed features of an article, such as the authors' affiliations, in the topic distribution derived by LDA.

A numerical example is presented to illustrate the framework, and we will discuss how much improvement is possible by using the DMR topic model.