5.1 Collect Data from the Web of Science

The primary source of data for CiteSpace is the Web of Science.

Most importantly, the dataset should include cited references in order to maximize the potential of CiteSpace.
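Once you have an exported file in hand (the download steps appear later in this section), a quick check can tell you whether the cited references actually made it into the file. The following is a minimal sketch, not part of CiteSpace. It assumes the Plain Text export uses two-letter field tags at the start of a line, with PT opening a record, CR holding cited references, and ER closing a record. The function name count_records and the sample records are made up for illustration.

```python
def count_records(lines):
    """Return (total records, records that contain a CR field)."""
    total = 0
    with_refs = 0
    has_cr = False
    for line in lines:
        if line.startswith("PT "):
            # A new record begins; reset the flag.
            has_cr = False
        elif line.startswith("CR "):
            # The record carries at least one cited reference.
            has_cr = True
        elif line.rstrip() == "ER":
            # End of record; tally it.
            total += 1
            if has_cr:
                with_refs += 1
    return total, with_refs


# Made-up sample in the assumed tagged layout (not real Web of Science data).
sample = """FN Example Export
VR 1.0
PT J
AU Author, A
TI Example record with references
CR Author B, 2000, J EXAMPLE, V1, P1
   Author C, 2001, J EXAMPLE, V2, P2
ER

PT J
AU Author, D
TI Example record without references
ER

EF
""".splitlines()

print(count_records(sample))  # (2, 1)
```

If the second number is zero, or far below the first, the export was probably saved without the cited references and should be downloaded again.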

The Web of Science offers several ways to search for bibliographic records. The most basic one is called, of course, Basic Search, which covers topic, author, and several other searchable fields. The following example shows a topic search for “CiteSpace” with a timespan of 2004 to 2015.

I recommend watching the video that demonstrates the steps of collecting data from the Web of Science.

The topic search found 16 results. The results are initially displayed by publication date, from the newest to the oldest. You can switch to a different order, for example, by the number of citations from the highest to the lowest, so that you can quickly narrow the results down to a small subset of the most highly cited records.

If the results are sorted by Times Cited – highest to lowest, you will notice that the record with the highest times cited is the 2006 JASIST paper on CiteSpace II, with 246 citations. The topic search found 36 records. You can download these 36 records; however, they would not be representative. If you follow the Create Citation Report link, you will see that you can expand the 36 records to about 337 records that cited the set of 36 records. We refer to this way of obtaining more potentially relevant records as citation expansion. Since the only thing we know is that each record in the expanded set cited at least one of the original records, any given record may turn out to be less relevant because of the diversity of how authors cite. Let’s see if we can do better than finding 337 records related by citation indexing.
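To make the idea of citation expansion concrete, here is a small sketch of the set logic involved. It is only an illustration of the concept, not how the Web of Science computes a Citation Report. The function name citation_expansion, the record ids, and the toy citation data are all hypothetical.

```python
def citation_expansion(seed_ids, cites):
    """Return the ids of records that cite at least one seed record.

    cites maps a record id to the set of record ids that it cites.
    """
    seeds = set(seed_ids)
    return {rid for rid, refs in cites.items() if seeds & set(refs)}


# Toy data (made up): A and B stand for the records found by a topic search.
cites = {
    "B": {"A"},       # a seed record can itself cite another seed
    "C": {"A"},
    "D": {"A", "B"},
    "E": {"X"},       # cites neither seed, so it is left out
}

expanded = citation_expansion({"A", "B"}, cites)
print(sorted(expanded))  # ['B', 'C', 'D']
```

Note that membership only requires a single citation to any seed record, which is why some records in an expanded set can be only loosely related to the original topic.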

You may also notice that the 2004 PNAS and the 2010 JASIST papers on CiteSpace were NOT on the list, although they are certainly about CiteSpace and their citations would put them on the list too. Thus, this example shows that you should be careful when using the topic search alone to construct your own dataset.

Under the Citation Network panel, the 104 Times Cited is a clickable link. If you click on it, it will bring you to the list of 104 records that cited the 2004 PNAS paper. The 2006 JASIST paper should be on the list. If we sort the list by Times Cited, we will see the 2006 JASIST paper at the top.

Now if you click on Create Citation Report on the right, you will get access to all the records that cite this set, i.e. the citation expansion we want.

The Citation Report shows, among other things, 732 citing articles. These 732 articles would form the expanded set. In fact, you can go even further by adding your search results to the Marked List ►Create Citation Report ►Citing Articles. I will leave it to you to explore in the Web of Science.

To download a set of records from the Web of Science, pull down the menu starting with Save to EndNote online and select Save to Other File Formats.

Then you will need to enter the number of records, the content, and the file format in a dialog box like the following. For CiteSpace, include Full Record and Cited References and select Plain Text as the file format. When you save the file, make sure the file name starts with the word ‘download’ and the file extension is .txt. This naming convention will give you more flexibility later on. For example, you can easily hide files from CiteSpace by adding a prefix to the names of the files you want CiteSpace to skip.
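The naming convention can be sketched in a few lines. The helper names is_citespace_input and hide_from_citespace, the prefix skip_, and the example file names are my own choices for illustration. The check is case-sensitive here, which is an assumption rather than a documented behavior of CiteSpace.

```python
from pathlib import Path


def is_citespace_input(name):
    """True if a file name starts with 'download' and ends with '.txt'."""
    return name.startswith("download") and name.endswith(".txt")


def hide_from_citespace(path, prefix="skip_"):
    """Rename a file so that its name no longer starts with 'download'."""
    path = Path(path)
    target = path.with_name(prefix + path.name)
    path.rename(target)
    return target


# Hypothetical file names in a data folder.
names = [
    "download_1-500.txt",
    "skip_download_501-1000.txt",  # prefixed, so it would be skipped
    "notes.txt",
    "download.csv",
]
selected = [n for n in names if is_citespace_input(n)]
print(selected)  # ['download_1-500.txt']
```

Removing the prefix later brings a file back into play without downloading it again.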