11.3 Structural Variation Analysis (SVA)

The theory of structural variation was originally published in a 2012 JASIST paper, "Predictive effects of structural variation on citation counts," along with a few proof-of-concept case studies.

In a nutshell, the question is whether and how one can quantify the transformative potential of ideas expressed in a newly published paper or even at an early stage when the initial idea is proposed.

The theory of structural variation builds on the literature on scientific creativity, in particular the broadly defined recombination philosophy. That philosophy rests on the observation that scientific discoveries and innovations are, to a great extent, new ideas that connect seemingly unrelated ones. In other words, a new bridge across previously disparate islands often gives an early sign of a subsequently successful discovery. On the other hand, such bridges may be necessary but are not sufficient. For example, if no traffic uses the new bridge, it generates no impact after all.

The SVA function is available in the public release of CiteSpace from version 4.0.R1 onward. There is still a lot of work to do. The procedure is much more complex than the other scientometric analytic functions provided in CiteSpace. Collect your data carefully so that it covers the areas of your interest as fully as possible. The procedure is also computationally intensive. Modern laptops appear to be capable of completing these operations within a reasonable waiting time. In any case, be patient.

I will post more detailed baby-step instructions later on. In the meantime, if you have questions, comments, or suggestions, please use the feedback box here or leave messages on Facebook and/or science.net.

Here is a brief introduction. Under the Analytics menu, select the Structural Variation Analysis (SVA) radio button and, if needed, modify your project properties before pressing GO! to start the procedure. Two of the project properties are particularly relevant: the look back years and the maximum number of neighboring links to retain. For either property, a value of -1 means unlimited; otherwise, provide a positive integer, for example, 5 to retain links with the 5 'nearest' neighbors.
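To make the two properties more concrete, here is a minimal Python sketch of one plausible reading of them. It is not CiteSpace's code: the function names, the weighted-link dictionary, and the rule that a link survives if it is among the strongest links of at least one endpoint are all assumptions made for illustration. Only the convention that -1 means unlimited comes from the description above.

```python
from collections import defaultdict

UNLIMITED = -1  # per the text, -1 means "no limit" for both properties


def in_look_back(ref_year, citing_year, look_back_years):
    """Hypothetical check: is a cited reference recent enough to keep?"""
    if look_back_years == UNLIMITED:
        return True
    return citing_year - ref_year <= look_back_years


def retain_nearest_links(weighted_links, max_links):
    """Keep, for each node, only its max_links strongest links.

    weighted_links maps (a, b) pairs to a similarity weight. A link is kept
    if it ranks among the strongest links of at least one of its endpoints
    (an assumed rule, chosen for illustration).
    """
    if max_links == UNLIMITED:
        return dict(weighted_links)
    by_node = defaultdict(list)
    for (a, b), weight in weighted_links.items():
        by_node[a].append((weight, (a, b)))
        by_node[b].append((weight, (a, b)))
    kept = {}
    for links in by_node.values():
        # Strongest links first; ties keep their original order.
        links.sort(key=lambda item: item[0], reverse=True)
        for weight, pair in links[:max_links]:
            kept[pair] = weight
    return kept


# Toy data: four references with made-up link weights.
links = {("A", "B"): 0.9, ("A", "C"): 0.5, ("A", "D"): 0.1, ("B", "C"): 0.4}
pruned = retain_nearest_links(links, 1)
```

With a limit of 1, the weak B-C link is dropped because neither B nor C ranks it as their strongest link, while A-D survives because it is the only link D has.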

CiteSpace will then ask a few questions along the way. The one you really need to think about is the window of the analysis, i.e. the number of years you want to consider starting from the year a paper is published, which is when its signals are sent to the system as a whole. The default value is 2 years. If you have a powerful computer, you may extend it to a longer period of time or even to the very first year available in your dataset. Another parameter you can control is the Top N value for node selection. If your computer is powerful, you can choose 100 or higher; on a less powerful machine, expect to wait a while with such values. You may also choose to turn on Pathfinder network scaling, but check only the box for individual slices, not the one for the merged network.

Once the visualization window is up, you will see two lists on the left; previous versions of CiteSpace showed only one. The first list is the cited references. The second list is the citing papers, i.e. papers that cite the references shown in the visualized network. If you check the box next to a citing paper, you may see dashed red lines as shown below. These lines illustrate the novel links that the checked paper added to the network formed prior to its publication. You can sort each column in the table by clicking on the first number at the top of the column (not the column header as you would expect :-).
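The idea of novel links can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats every pair of references cited together by a paper as a candidate co-citation link and reports the pairs that are not yet in the prior network. The function name and the toy reference IDs are hypothetical, and CiteSpace's own procedure may differ in detail.

```python
from itertools import combinations


def novel_links(existing_links, cited_refs):
    """Return co-citation links a paper would add to the prior network.

    existing_links: iterable of (ref_a, ref_b) pairs already in the network.
    cited_refs: references cited by the newly published paper.
    Links are undirected, so (a, b) and (b, a) count as the same link.
    """
    existing = {frozenset(pair) for pair in existing_links}
    candidates = {
        frozenset(pair) for pair in combinations(sorted(set(cited_refs)), 2)
    }
    return sorted(tuple(sorted(link)) for link in candidates - existing)


# Toy network formed before the paper was published.
existing = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
# References cited by the checked paper.
paper_refs = ["r2", "r3", "r4"]

added = novel_links(existing, paper_refs)
print(added)
```

In this toy case the r2-r3 link already exists, so only the two links reaching r4 are reported as novel. These are the kind of links that would be drawn as dashed red lines.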

Here are the meanings of the columns in the second table:

citation frequency of the citing paper, the change of modularity triggered by the paper, the change of cluster linkage, the divergence of centrality, the incremental change rate, the transformative change rate, the year of publication, and the reference of the paper.

See the 2012 paper for more detailed explanations of these metrics.
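As a rough intuition for the modularity column, the sketch below computes the standard Newman modularity of a small network before and after one bridging link is added, keeping the cluster assignment fixed. This is an assumption-laden toy: the function name, the fixed clusters, and the plain difference are illustrative only, and the 2012 paper remains the reference for how the metric is actually defined in CiteSpace.

```python
def modularity(edges, cluster_of):
    """Newman modularity of an undirected, unweighted network.

    Q = sum over clusters c of  L_c / m - (d_c / (2m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    inside = {}
    degree = {}
    for a, b in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        degree[ca] = degree.get(ca, 0) + 1
        degree[cb] = degree.get(cb, 0) + 1
        if ca == cb:
            inside[ca] = inside.get(ca, 0) + 1
    return sum(
        inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Two "islands": a triangle in cluster X and a triangle in cluster Y.
cluster_of = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y", "f": "Y"}
prior = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")]

# A new paper adds one bridging link between the islands.
bridged = prior + [("c", "d")]

q_before = modularity(prior, cluster_of)
q_after = modularity(bridged, cluster_of)
delta_q = q_after - q_before
```

Here the bridge lowers modularity, because the two islands are now less cleanly separated. A paper whose novel links cross cluster boundaries would change the structure in this way, whereas links added inside a single cluster would not.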

A few data files are exported to the project folder. For example, you can find a spreadsheet of these metrics there, which you can study further with statistical analysis tools.

Novel links added by a paper to the existing network of cited references are shown in red dashed lines. The star marks the position of the citing paper itself, which was in turn cited by other papers. In this example, the paper apparently bridges the complex network area and the citation mapping area, among other things.

A PowerPoint with Chinese annotations made by Li Jie is available at http://blog.sciencenet.cn/blog-496649-838067.html, item #15.

The solid purple lines and dashed red lines form a new bridge over two islands. The string of red stars marks the citing papers contributing to these novel connections.

References