Bioinformatics textbooks, where to?

The current state of affairs

The field of bioinformatics is rapidly evolving, a situation that has led to a dearth of educational materials covering the most recent developments in the field. In sequence analysis, in particular, the most widely used textbooks (Algorithms on Strings, Trees, and Sequences, by Dan Gusfield, and Biological Sequence Analysis by Richard Durbin, Sean Eddy, Anders Krogh, and Graeme Mitchison) were published in 1997 and 1998, respectively, and until recently no new textbooks had appeared in this field (though lately there has been a flurry of developments, with textbooks from Enno Ohlebusch, and from Veli Mäkinen, Djamal Belazzougui, Fabio Cunial, and Alexandru I. Tomescu, for example).

At the same time, the bioinformatics community has been increasingly active in making educational materials available online (e.g. a new textbook on algorithms https://stepic.org/Bioinformatics-Algorithms-2, a hands-on collection of bioinformatics algorithms http://rosalind.info, both from Pavel Pevzner, and a collection of implementations of bioinformatics algorithms from Ben Langmead https://github.com/BenLangmead/comp-genomics-class/). These initial efforts towards creating and disseminating educational materials are, so far, isolated, and they depend on the active involvement of just a few members of the community.

Furthermore, while these efforts provide a valuable alternative to traditional printed textbooks, they still suffer from one of the most significant limitations of the traditional model - the lack of flexibility in adapting to a rapidly evolving field. Updating existing material and adding new material to these resources requires substantial effort from the small number of scientists who have developed and maintain them.

I conjecture that there are two main reasons for this situation. First, putting together the complete set of resources (text, exercises, slides, etc.) for an entire textbook or course is extremely time-consuming. Second, there is little incentive for scientists to devote this level of effort. Why spend 6 months or more writing one textbook (which leads to one line on one's CV), when you could, instead, be doing research, advising students, and contributing to multiple papers?

A textbook-journal?

I have been thinking for a while about an alternative model for the publication of educational materials - running textbooks like journals. Scientists can submit a single "chapter", a lesson plan, a slide deck, a video, or a problem set. The submissions are evaluated by reviewers who focus specifically on the pedagogical quality of the materials, in a collaborative (rather than the usual adversarial) review process aimed at best serving the educational needs of our community. In addition, a rich web-based framework allows a community of users to help identify and correct mistakes, clarify the content, etc. Finally, faculty and students are provided with a mechanism for collating multiple materials into a course-specific collection that can be shared online, in PDF or epub formats, and even printed and sold at low cost to students who prefer hard copies.

Such a model would address a number of the problems I highlighted above:

    • Focusing on just one "book chapter" would lower the barrier to entry, encouraging more scientists to become involved and making it feasible for them to do so.

    • An appropriately designed journal model for educational materials would provide a mechanism for rewarding contributions. The educational resources could be reported individually on CVs, and their impact measured, e.g., by tracking the number of courses or students that use them.

    • The lag time between the introduction of new approaches or concepts in the field and the creation of appropriate educational materials would be greatly reduced, allowing education to catch up more rapidly with cutting-edge research.

    • A properly designed web framework would create a flexible and dynamic counterpart to physical books, increasing engagement and interaction between students and educators, and providing a valuable base for educational research through the integration of appropriate analytics into the framework.

Some drawbacks

The model I proposed above is not without faults. A concern frequently raised by those with whom I have discussed this idea is that materials created by different researchers, without substantial (and expensive) editorial intervention, would lack consistency, undermining their effectiveness as a teaching tool.

Another concern is that the community of scientists willing to create such resources is simply too small - the model is only sustainable if enough people contribute materials and participate in their review.

Other major concerns are costs and funding mechanisms. It is unlikely that such a system could be effectively set up with just volunteers, and even if the initial development could be funded with federal or foundation funds, a certain level of operating costs is unavoidable: maintenance of the underlying computational infrastructure, paid editorial staff, etc. The experience of open-access journals in recent years indicates that such costs cannot be avoided and are quite substantial. How can the effort be sustained long-term? Asking scientists to create new materials and also pay "publication fees" is unlikely to be a popular strategy. Nor can the entire system be placed behind a paywall. Either strategy undermines the main principle behind the ideas presented here. Are there other sustainable funding strategies that preserve openness and accessibility? Should one charge for "premium services"? What would those be?

Other comments

I have approached several publishers with this idea and they simply proposed the traditional edited book model - I would identify a number of authors, each of whom would write a chapter, after which the publisher would print and sell the book while the authors collect some token amount of royalties.

While discussing the ideas above with colleagues I became aware of other similar efforts in the community:

http://collections.plos.org/compbiol-education - a collection of educational articles published in PLOS Computational Biology. This collection is so far fairly static (i.e., not too different from an edited book), and doesn't include the broader set of educational materials one may want to have.

CourseSource - this is a resource primarily targeted at biology education; however, it embodies most if not all of the ideas I discussed above. Should I simply contribute to CourseSource? The fact that I had not heard about CourseSource until very recently, and that nobody had mentioned it to me, raises important questions about the visibility and adoption of educational resources. The success of the "educational journal" model critically relies on its wide adoption within the community. What is necessary for such a model to take off?