Books:
Pattern Classification
Richard O. Duda, Peter E. Hart, David G. Stork. Wiley-Interscience; 2nd edition (October 2000)
Pattern Recognition and Machine Learning (Information Science and Statistics)
Christopher M. Bishop. Springer; 1st edition 2006, corrected 2nd printing 2011 (October 1, 2007)
http://research.microsoft.com/en-us/um/people/cmbishop/PRML/
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Trevor Hastie, Robert Tibshirani, Jerome Friedman. Springer; 2nd edition 2009, corrected 3rd printing (February 9, 2009)
http://www-stat.stanford.edu/~tibs/ElemStatLearn/
Statistics for Terrified Biologists
Helmut van Emden. Wiley-Blackwell; 1st edition (April 28, 2008)
Articles:
On the Role of Sparse and Redundant Representations in Image Processing, M. Elad et al., Proceedings of the IEEE, Special Issue on Applications of Sparse Representation & Compressive Sensing, 98: 972-982 (2010).
Self-Organized Formation of Topologically Correct Feature Maps, T. Kohonen, Biological Cybernetics 43: 59–69 (1982).
A tutorial on spectral clustering. Ulrike von Luxburg, Statistics and Computing 17(4): 395-416 (2007).
A Global Geometric Framework for Nonlinear Dimensionality Reduction, J. B. Tenenbaum, V. de Silva, J. C. Langford, Science 290: 2319–2323 (2000).
Nonlinear Dimensionality Reduction by Locally Linear Embedding, S. T. Roweis and L. K. Saul, Science 290: 2323–2326 (2000).
An introduction to ROC analysis, T. Fawcett, Pattern Recognition Letters (2006).
P-Values are Random Variables, D. J. Murdoch et al., The American Statistician (2008).
Statistics review 3: Hypothesis testing and P values, E. Whitley and J. Ball, Critical Care 6: 222-225 (2002).