[Bishop07] C. Bishop and J. Lasserre, Generative or discriminative? Getting the best of both worlds, in Bayesian Statistics, Oxford University Press, 2007, pp. 3-24.
[Sha03] F. Sha and F. Pereira, Shallow parsing with conditional random fields, Proceedings of HLT-NAACL 2003, pp. 134–141, 2003.
[Kwok04] J. T. Y. Kwok and I. W. H. Tsang, The pre-image problem in kernel methods, IEEE transactions on neural networks, vol. 15, no. 6, pp. 1517–1525, Nov. 2004.
[Quadrianto10] N. Quadrianto, A. J. Smola, L. Song, and T. Tuytelaars, Kernelized sorting, IEEE transactions on pattern analysis and machine intelligence, vol. 32, no. 10, pp. 1809-1821, Oct. 2010.
[Grauman05] K. Grauman and T. Darrell, The pyramid match kernel: discriminative classification with sets of image features, in Proc. of the Tenth IEEE International Conference on Computer Vision (ICCV’05), 2005, pp. 1458-1465.
[Lin02] C.-W. Hsu and C.-J. Lin, A comparison of methods for multiclass support vector machines, IEEE transactions on neural networks, vol. 13, no. 2, pp. 415–425, Mar. 2002.
[Basak07] D. Basak, S. Pal, and D. C. Patranabis, Support Vector Regression, Neural Information Processing - Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
[Chalimourda04] A. Chalimourda, B. Schölkopf, and A. J. Smola, Experimentally optimal nu in support vector regression for different noise models and parameter settings, Neural networks, vol. 17, no. 1, pp. 127-141, Jan. 2004.
[Smola04] A. J. Smola and B. Schölkopf, A tutorial on support vector regression, Statistics and Computing, vol. 14, pp. 199-222, 2004.
[Fawcett06] T. Fawcett, An introduction to ROC analysis, Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, Jun. 2006.
[Demsar06] J. Demsar, Statistical Comparisons of Classifiers over Multiple Data Sets, Journal of Machine Learning Research, vol. 7, pp. 1-30, 2006.
[Ding08] C. Ding, X. He, H. D. Simon, and R. Jin, On the Equivalence of Nonnegative Matrix Factorization and K-means - Spectral Clustering, Lawrence Berkeley National Laboratory, 2008.
[Dhillon04] I. S. Dhillon, Y. Guan, and B. Kulis, Kernel k-means, Spectral Clustering and Normalized Cuts, in Proceedings of the 10th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '04, 2004, pp. 551-556.
[Weston05] J. Weston, B. Schölkopf, and O. Bousquet, Joint Kernel Maps, Proceedings of the 8th International Workshop on Artificial Neural Networks, IWANN 2005, Springer-Verlag, 2005, pp. 176-191.
[Joachims09] T. Joachims, T. Hofmann, Y. Yue, and C.-N. Yu, Predicting structured objects with support vector machines, Communications of the ACM, vol. 52, no. 11, 2009.
[Quoc12] Q. V. Le, M. Ranzato, R. Monga, M. Devin, K. Chen, G. S. Corrado, J. Dean, and A. Y. Ng, Building High-Level Features using Large Scale Unsupervised Learning, in Proceedings of the Twenty-Ninth International Conference on Machine Learning, 2012.
[Vevaldi12] A. Vedaldi and A. Zisserman, Efficient additive kernels via explicit feature maps, IEEE transactions on pattern analysis and machine intelligence, vol. 34, no. 3, pp. 480-492, Mar. 2012.
[Lin12] J. Lin and A. Kolcz, Large-scale machine learning at Twitter, Proceedings of the 2012 international conference on Management of Data - SIGMOD '12, p. 793, 2012.