6 Discussion, conclusions and limitations

The previous chapter presented the findings of the experiment, describing each hypothesis and how it was tested using several variables. In this chapter the results of the experiment are discussed based on the literature from Chapter 5. Following this, conclusions are drawn and the limitations of the experiment are discussed. For easy reference, the hypothesis results and the complete results of the multiple regression analysis are listed below.

6.1 List of hypotheses and results

Table 8.1 Hypothesis results

6.2 Multiple regression results

Table 8.2 Multiple regression results

* p < .05, ** p < .001

6.3 Theory review

Hypothesis 1 states that the discovery of fully accessible titles is significantly higher compared to titles which are not fully accessible. Discovery was measured as the number of Book visits a title received in the Google Book Search program during the experimentation period. The results of the experiment confirmed the hypothesis, which was in line with expectations, as both the library and information sciences and the field of e-commerce predicted these results. In the library and information sciences, direct access is linked to a greater research impact. As stated by Harnad et al., the full potential of scientific output is reached when all barriers to access are removed. One of these barriers is obscurity, which makes a document unavailable to potential readers. Publishing in Open Access makes scientific output available to search engines, resulting in higher discovery rates. In the field of e-commerce, search costs are part of the transaction costs. Lowering search costs by making the contents of scientific books fully accessible should have a positive effect, which was confirmed by the results.

Book visits are used as an approximation of discovery: it was not possible to measure whether a Book visit was made by a ‘new’ reader or by a ‘returning’ reader. Therefore we cannot state that 78 Book visits are equal to 78 new readers of that title. If we assume that a percentage of those Book visits are made by returning readers, then the differences in Book visits between the sets still convey relevant information on the discovery rate. Further research is needed to measure the percentage of new vs. returning readers, and whether accessibility influences this.

Hypothesis 2 states that the online consultation (e.g. pages read or number of downloads) of fully accessible titles is significantly higher compared to titles which are not fully accessible. Online consultation was measured as the number of monthly page views a title received in the Google Book Search program during the experimentation period. The results of the experiment confirmed the hypothesis, which is – again – in line with expectations. Online consultation is closely linked to the amount of information that is directly available. It should therefore not come as a surprise that a fully accessible document receives more online consultation. Furthermore, research by JISC shows that readers consult several pages of an online book (JISC, 2009). This reading behaviour may also contribute to the number of pages read.

Hypothesis 3 states that the citation rate of fully accessible titles is significantly higher compared to titles which are not fully accessible. The citation rate is measured as the difference per title as found in the Google Scholar search engine during the experimentation period. This hypothesis could not be confirmed. While research on Open Access articles points to a higher citation rate, this is not an unexpected result. As writing a book – the most common form of publication in humanities and social sciences – takes considerably longer than writing an article, the effects may not be visible within the experimentation period.

Apart from the period, the number of titles within a scientific field may influence the results. Citation research usually uses 100 or more articles that are all connected to the same scientific field (Antelman, 2004; Eysenbach, 2006). The sets used in the experiment do not contain the same number of related titles. While it is not easy to place exact boundaries between scientific fields within the sets, it is safe to say that the maximum number of titles rooted in one scientific field will not exceed 20. Further research using a larger collection of books over a longer period may give a more comprehensive description of the influence of Open Access publishing of books on citation rates.

Hypothesis 4 states that the sales figures of fully accessible titles are significantly higher compared to titles which are not fully accessible. The sales figures used were the monthly number of sales per title. Contrary to expectations, the hypothesis could not be confirmed. The data does not suggest any relationship between accessibility and sales of academic books. Neither a negative nor a positive influence could be found.

Using the theory of value creating factors of e-business firms (Amit & Zott, 2001), sales of academic books are supported by two mechanisms: complementarities and transaction efficiency. The Open Access version of an academic book could be considered to be a ‘vertical complementarity’ – an extra service provided by the publisher. Prospective buyers could examine the contents of the book before purchasing it. Given the low acceptance of e-books, it does not seem likely that paper books will be replaced in the near future. Like most publishers, AUP enables online purchasing of its titles through several online vendors and its own web shop. The transaction efficiency is therefore greatly enhanced, which should also support more sales. Further research is needed to find the factors that affect sales figures of academic books.

Hypothesis 5 states that the discovery of titles disseminated through both the institutional repository and the Google Book Search program is significantly higher compared to titles disseminated through one of those channels. Discovery was measured as the number of Book visits a title received in the Google Book Search program during the experimentation period. The hypothesis could not be confirmed. Again, this was not in line with expectations based on the literature review. The information seeking behaviour of researchers in the realm of the humanities and social sciences is described as depending on different channels (Shen, 2007). The combination with research on multichannel management (Neslin & Shankar, 2009) – where the use of multiple channels is associated with higher sales volumes – led to hypothesis 5. A contributing factor may be the very large difference in performance of the channels. In the Google Book Search program, the mean number of monthly Book Views is 90; in the AUP repository, the corresponding mean is 4. Both channels are available in an academic environment. At this point it is not clear whether the large differences between the dissemination channels are caused by users outside the academic environment, or by the searching preferences of researchers.

Hypothesis 6 states that the online consultation (e.g. pages read or number of downloads) of titles disseminated through both the institutional repository and the Google Book Search program is significantly higher compared to titles disseminated through one of those channels. Online consultation was measured as the number of monthly page views a title received in the Google Book Search program, combined with the number of monthly page views and the number of monthly downloads a title received in the AUP repository. Here the results present a mixed picture. Contrary to expectations and the results of hypothesis 5, the amount of online consultation through a single channel – the Google Book Search program – is higher than through the combined channels of the repository and the Google Book Search program. The same does not hold true for the monthly page views or the monthly downloads from the repository: no relation to accessibility could be established. More research may confirm or deny whether single channels perform better than multiple channels.

Hypothesis 7 states that the sales figures of titles disseminated through both the institutional repository and the Google Book Search program are significantly higher compared to titles disseminated through one of those channels. The sales figures used were the monthly number of sales per title. Contrary to expectations, the hypothesis could not be confirmed. As discussed under hypothesis 5, channel management research associates higher sales volumes with multichannel usage (Neslin & Shankar, 2009). In this experiment, there was no such effect. From the results of hypothesis 1 and hypothesis 2 we may conclude that ‘Open Access channels’ are successful in informing prospective readers. The reasons why this does not lead to significantly higher sales cannot be distilled from the data collected for this experiment.

6.4 Conclusions

Research on the effects of free online accessibility of books is scarce, especially the effects on academic books. This paper describes the first experiment of this scale that involves sales figures and online consultation. Furthermore, no other research on the effects of dissemination channels for Open Access publishing has been reported. As Open Access is gaining momentum as a dissemination model (NRC, 2009), there is a greater need for knowledge of the effects it has on all stakeholders. While this paper is a start, many questions remain unanswered.

The findings of this paper reaffirm the notion that removing barriers to access leads to more discovery and more online consultation of publications. Authors profit directly from Open Access publishing as it enables them to spread their ideas to a maximum number of readers, and helps build their reputation. Academic publishers also profit from Open Access publishing as an efficient means of disseminating scientific knowledge. These effects can be directly measured, which may be a useful tool for marketing purposes.

While online usage is higher for fully accessible titles, this did not translate into higher sales figures. The reasons for that remain unclear. As can be seen from the regression analysis, print run and publication in English have a measurable effect on sales; the subjects “Dutch Literature – Education” and “Japan; Culture – History” are also found to be significant. The print run is based on sales expectations from the publisher; therefore the relation with sales is obvious. The larger market for books in English may account for the measured effect. Furthermore, the books on the subject “Dutch Literature – Education” are aimed at secondary schools, where sales to new students may account for the effects. Lastly, the sales of the books on “Japan; Culture – History” may be the result of remaindering, but this cannot be confirmed. More important is the validity of the regression model. The fit of the model can be expressed as the percentage of the variability in the outcome that is accounted for by the variables in the model (R²). Here, the percentage is 26.8%. Therefore, more than 70% of the variability in the sales figures cannot be explained from the collected data.
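To make the interpretation of R² concrete, the following minimal sketch computes R² as one minus the ratio of residual to total variability, and derives the unexplained share from it. The function names and the toy numbers are illustrative assumptions and are not taken from the data or software used in the experiment; only the value of 26.8% comes from the regression results.

```python
# Illustrative sketch only: function names and toy data are assumptions,
# not the data or software used in the experiment.

def r_squared(observed, predicted):
    """Share of the variability in `observed` accounted for by `predicted`."""
    mean_observed = sum(observed) / len(observed)
    # Total variability of the outcome around its mean.
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    # Variability that remains after subtracting the model's predictions.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


def unexplained_share(r2):
    """Share of the variability that the model does not account for."""
    return 1.0 - r2


if __name__ == "__main__":
    # Hypothetical monthly sales and model predictions for four titles.
    toy_sales = [2, 4, 6, 8]
    toy_predicted = [3, 3, 7, 7]
    print(f"toy R^2: {r_squared(toy_sales, toy_predicted):.3f}")

    # R^2 reported for the sales regression in this chapter.
    print(f"unexplained share: {unexplained_share(0.268):.1%}")
```

With the reported R² of 26.8%, the unexplained share is 73.2%, which is the basis for the statement that more than 70% of the variability in sales remains unexplained.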

At this moment, the paper publication seems to be the preferred format for extensive use. Due to the possibilities created by publishers’ web shops and online bookstores like Amazon, ordering a book can literally be done in seconds. Therefore other restrictions may hamper the sales of academic books. One of those restrictions may be the lack of budget in university libraries. If that is the case, publishing in Open Access is still useful by making unaffordable books available. This also has far-reaching implications for academic publishers in search of a new business model. A sustainable business model cannot be built exclusively on extra sales generated from Open Access publishing. Knowing which factors influence the sales of academic books – paper or online – is very useful information for finding new business models.

Within the experimentation period, the increased online usage did not lead to a higher citation rate. This result is not surprising, as the period is relatively short for scientific disciplines where books – instead of articles – are the norm. Furthermore, as AUP publishes titles in a wide range of scientific fields, the number of titles rooted in one scientific discipline is small. The regression analysis on citations did reveal a significant positive effect of book visits. Although research on Open Access articles points to a higher citation rate, the share of the variability in citations accounted for by the variables in the regression model (R²) is just 12.3%. Therefore, the results of the regression analysis should be interpreted with caution.

Research on Open Access publishing mainly focuses on articles and, as far as dissemination is covered, the dissemination channel is an institutional repository. This channel is of course usable for disseminating Open Access books, but publishers can also employ the Google Book Search program as an alternative. The difference in performance between the two channels is large: where the titles in the AUP repository received 4 monthly page views on average, the titles in the Google Book Search program received 90 monthly Book Views on average. In other words: titles in the Google Book Search program are viewed – on average – more than 20 times as often as titles in the repository. It may come as no surprise that the regression analysis revealed a very strong correlation between book visits and page views combined with accessibility. In contrast to this, the regression analysis of the repository downloads did not produce such clear-cut results. A significant effect of the subject “Culture” was found, but again the share of the variability in downloads accounted for by the variables in the regression model (R²) is small: 22.7%.

The repository environment is closely linked to the academic community, while the Google search engine is used by almost all internet users. This may explain some of the differences, but at this point not enough information is available on the information searching behaviour of scholars in the humanities and social sciences to understand which channel is used most to discover and consult online books.

6.5 Limitations

For this particular experiment, no models were available; therefore no best practices could be assessed for guidance. The sample (n = 400) is relatively large and, in order to remove bias, the titles were placed in different sets using publication year, subject, print run and language. The properties publication year, subject and print run fall within a wide range, ensuring that one aspect does not dominate the results. Still, all titles were published by one academic publisher. If certain aspects of a publisher – such as reputation or marketing budget – influence the results, this could not be tested in this experiment. Furthermore, for citation analyses, the experimentation period is relatively short and the number of titles used is low. In citation analysis, the period usually encompasses several years instead of nine months, while the number of titles within one scientific field is usually higher than 100. Also, at this moment no established standard for the scientific impact of academic books exists, such as set by the Institute for Scientific Information.

6.6 Value of this paper

The added value of this paper could be determined through several factors. Firstly, while Open Access publishing is widely debated in the library and information sciences, there is very little empirical research available on academic books. As far as could be established, this is the first paper on Open Access publishing that is based on experimental data. Secondly, a theoretical framework is developed based on the unification of numerous theoretical arguments and perspectives from both the library and information sciences and the field of e-commerce. Again, this may be the first paper attempting to unify these two fields of knowledge or to use the models from the field of e-commerce on the Open Access phenomenon. Linked to this is the notion that Open Access publishing of books may be considered to be a multichannel endeavour. While multichannel management research on book sales is not new – the case of Amazon.com versus brick-and-mortar stores is quite well known – there is no other research available on multichannel management for Open Access publishing.

Advances in information technology have enabled new forms of disseminating scientific knowledge. This paper attempts to form a bridge between the disciplines that study different aspects of this development.