Call Classification
Call classification is a difficult topic. My main reference material has been the following:
[1] The Bats of Britain and Ireland, Jon Russ, Alana Books ISBN 0 9536049 0 X
[2] Echolocation by the barbastelle bat, Barbastella barbastellus, A. Denzinger, B. M. Siemers, A. Schaub, H. Schnitzler, J Comp Physiol A (2001) 187: 521-528
[3] Acoustic Identification of Twelve Species of Echolocating Bat by Discriminant Function Analysis and Neural Networks, S. Parsons and G. Jones, The Journal of Experimental Biology 203, 2641–2656 (2000)
[4] Classification of Echolocation Calls from 14 Species of Bat by Support Vector Machines and Ensembles of Neural Networks, R.D. Redgwell, J.M. Szewczak, G. Jones, S. Parsons, Algorithms 2009, 2, 907-924.
[5] British Bat Calls, A Guide to Species Identification, Jon Russ, Pelagic Publishing, ISBN 978-1-907807-25-1
The BatExplorer software performs an initial assessment to identify the bat calls, and you can then work through the recording semi-automatically to assign a species. Noise, reflections and multiple species in the same recording all cause problems, so some manual intervention is often required. As mentioned elsewhere, there is no real "gold standard" call that makes classification 100% certain. The semi-automatic classifier in BatExplorer gives a confidence assessment with each classification, which is very useful because it lets you judge quickly whether additional work is needed.
A further complication arises in setting up the fast Fourier transform (FFT) that generates the sonogram. I've found that a combination of a 1024 and a 512 point FFT with a Blackman-Harris window function and an 80-95% overlap gives the most reliable results. A larger number of points in the FFT gives better frequency resolution at the expense of poorer time resolution. Too many points makes classification very difficult, so I leave it set at 512 points to start with and switch to 1024 points only if more frequency detail is needed. The table below shows the different results obtained for different numbers of points in the FFT:
256 Point FFT
512 Point FFT
1024 Point FFT
2048 Point FFT
You can readily identify the bat as a Natterer's in the first two, but the last two become progressively less recognisable against published data such as that in reference [1]. So my advice is to take care with your analysis software, and don't change the defaults unless you know exactly what you are doing.
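The trade-off between frequency and time resolution can be put into rough numbers. The sketch below is only an illustration: the 384 kHz sample rate and the 90% overlap are assumed values (substitute your own detector's settings), and the function name is made up for this example. It also ignores the extra smearing introduced by the window function, so the real frequency resolution will be somewhat coarser than the bin width shown.

```python
def fft_resolution(sample_rate_hz, n_points, overlap):
    """Return (bin width in Hz, window length in ms, hop between frames in ms)."""
    bin_width_hz = sample_rate_hz / n_points          # spacing of frequency bins
    window_ms = 1000.0 * n_points / sample_rate_hz    # time spanned by one FFT frame
    hop_ms = window_ms * (1.0 - overlap)              # step between successive frames
    return bin_width_hz, window_ms, hop_ms


SAMPLE_RATE_HZ = 384_000  # assumed value, not taken from any particular detector

for n in (256, 512, 1024, 2048):
    bin_hz, win_ms, hop_ms = fft_resolution(SAMPLE_RATE_HZ, n, overlap=0.9)
    print(f"{n:5d} points: {bin_hz:7.1f} Hz per bin, "
          f"window {win_ms:5.2f} ms, hop {hop_ms:5.3f} ms")
```

At this assumed sample rate a 2048 point frame spans more than 5 ms, which is as long as or longer than many short frequency-modulated calls, so the whole sweep is smeared across only a few frames. That is consistent with the larger FFTs looking less like the published sonograms.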
The simple sonogram plots in reference [1] have stood the test of time (I think the book is about to be re-released in an updated form) and this book is an essential reference for anyone contemplating serious work in this area. There are online resources too, and the Bristol University resource is particularly useful. Jon Russ's sequel to reference [1] has now been published [5]. The sonogram data has been brought up to date and includes data for all the recently discovered species like Alcathoe and Nathusius's Pipistrelle, and includes zero crossing and frequency division data for most cases. This book provides a useful range of information for amateur and professional ecologists alike, and captures a great deal of information about call analysis that you might otherwise have to learn the hard way.
A further problem is caused by the recording medium itself. In the old days, the tape head limited the frequencies recorded and tended to clip the calls. Saving calls in MP3 or a similar compressed format has a similar effect: it can clip the recording peaks and slug the frequency response.
Yet another problem is the Doppler shift in the call as the bat flies towards or away from you (typically about +/- 0.5 kHz). Add all this to the variability of the call itself and the reflections from the surroundings, and you wonder how you can identify anything at all!
Finally, a further problem I've discovered since investigating third-party classifiers is that feature models built with one make of detector do not necessarily work well with a different make. This is particularly noticeable when the signal is at a relatively high level, close to the dynamic range limit of the detector. My suspicion is that the learning algorithms are also learning the non-linear characteristics of the detector the models were built with. More to follow once I get some more work done on this over the winter months.
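One simple way to check whether a recording has clipped peaks is to count the samples sitting at or very near full scale. This is a minimal sketch, not part of any analysis package: it assumes the samples have already been read in as 16-bit signed integers (for example from an uncompressed WAV file), and the function names and the one-count margin are my own choices for illustration.

```python
import math


def clipped_fraction(samples, full_scale=32767, margin=1):
    """Fraction of samples within `margin` counts of full scale (16-bit assumed)."""
    if not samples:
        return 0.0
    threshold = full_scale - margin
    hits = sum(1 for s in samples if abs(s) >= threshold)
    return hits / len(samples)


def peak_dbfs(samples, full_scale=32768):
    """Peak level in dB relative to full scale; 0 dBFS means no headroom left."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")
    return 20.0 * math.log10(peak / full_scale)


# Made-up sample values purely to exercise the functions.
example = [0, 12000, 32767, 32767, -32768, -9000]
print(f"clipped fraction: {clipped_fraction(example):.2f}")
print(f"peak level: {peak_dbfs(example):.1f} dBFS")
```

A recording with more than a handful of samples at full scale, or a peak level within a fraction of a dB of 0 dBFS, deserves a closer look before its calls are measured.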
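The size of the Doppler shift can be estimated from the standard formula for a moving source and a stationary listener. The numbers below are assumptions chosen for illustration (a 45 kHz call, a flight speed of 4 m/s straight towards or away from the detector, and sound travelling at 343 m/s), not measurements.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees C


def doppler_shift_hz(call_hz, bat_speed_m_s, speed_of_sound_m_s=SPEED_OF_SOUND_M_S):
    """Shift heard by a stationary detector.

    A positive bat_speed_m_s means the bat is approaching; negative means receding.
    """
    observed_hz = call_hz * speed_of_sound_m_s / (speed_of_sound_m_s - bat_speed_m_s)
    return observed_hz - call_hz


CALL_HZ = 45_000.0  # assumed call frequency for the example

for speed in (4.0, -4.0):
    shift = doppler_shift_hz(CALL_HZ, speed)
    print(f"bat speed {speed:+.1f} m/s: shift {shift / 1000.0:+.2f} kHz")
```

With these assumed values the shift comes out at roughly half a kilohertz in either direction. A bat flying across your line of sight rather than directly along it produces a smaller shift.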