Interesting Papers

These are papers/reviews/tutorials that I am reading or have enjoyed reading. The list below is a bit dated. We now keep track of interesting papers and publications via Mendeley.

Showing 97 items
Tag Topic/Name URL/Link Type Description Year
Bayesian Networks Great List of papers on Bayesian Learning http://cocosci.berkeley.edu/tom/bayes.html Review Great List of papers on Bayesian Learning January 12, 2010 
Machine Learning Measuring and testing dependence by correlation of distances http://projecteuclid.org/euclid.aos/1201012979 Paper A new distance-based measure in [0,1] of dependence between two vectors. It accounts for non-linear dependencies and is 0 only if the two vectors are independent January 25, 2007
Computational Biology Enhancing scatterplots with smoothed densities http://bioinformatics.oxfordjournals.org/cgi/content/abstract/20/5/623 Paper Plotting high density scatter plots with smoothing and transparency January 16, 2003 
Computational Biology How does multiple testing correction work? http://www.nature.com/nbt/journal/v27/n12/full/nbt1209-1135.html Review Multiple hypothesis testing and correction in biology January 7, 2010 
Computational Biology Simcluster: clustering enumeration gene expression data on the simplex space http://www.biomedcentral.com/1471-2105/8/246 Paper Clustering of gene expression; uses the Aitchisonean distance metric, which is useful for any data that lives in simplex space (for example, probabilities or data that sum to a constant) December 27, 2009
Biology Sequencing technologies — the next generation http://www.nature.com/nrg/journal/vaop/ncurrent/abs/nrg2626.html Review Review on next generation sequencing December 17, 2009 
Computational Biology ARTS: Accurate Recognition of Transcription Starts in Human http://www.fml.tuebingen.mpg.de/raetsch/suppl/arts Paper Multiple string kernels with SVMs for TSS prediction November 16, 2006 
Machine Learning Clustering with shallow trees http://arxiv.org/abs/0910.0767# Paper Clustering method that is intermediary between single linkage hierarchical clustering and affinity propagation November 16, 2009 
Biology ChIP–seq: advantages and challenges of a maturing technology http://www.nature.com/nrg/journal/v10/n10/full/nrg2641.html Review Review paper on ChIP-seq and its applications September 26, 2009 
Computational Biology High-throughput chromatin information enables accurate tissue-specific prediction of transcription factor binding sites http://nar.oxfordjournals.org/cgi/content/full/37/1/14 Paper Integration of chromatin mark data improves TFBS prediction September 22, 2009 
Machine Learning Deep Belief Networks http://www.iro.umontreal.ca/~lisa/publications/index.php?page=publication&kind=single&ID=209 Review Review by Yoshua Bengio on Deep Belief Networks September 21, 2009 
Computational Dendroscope http://www-ab.informatik.uni-tuebingen.de/software/dendroscope/welcome.html Software Software for visualizing massive networks and trees September 19, 2009 
Machine Learning Lasso, Elastic net and Ridge regression code by Friedman, Tibshirani, Hastie http://www-stat.stanford.edu/~tibs/glmnet-matlab/ Software MATLAB and R code (glmnet package) September 8, 2009
Computational Biology BedTools: utilities for comparing genomic features in BED format http://people.virginia.edu/~arq5x/bedtools.html Software BedTools: utilities for comparing genomic features in BED format September 1, 2009 
Machine Learning VOWPAL WABBIT: Sparse online learning via truncated gradient http://www.research.rutgers.edu/~lihong/pub/Langford09Sparse-JMLR.pdf Paper Very fast online learning August 24, 2009 
Biology Long noncoding RNAs: functional surprises from the RNA world http://genesdev.cshlp.org/content/23/13/1494.short?rss=1 Review Review on long non-coding RNAs July 30, 2009
Boosting ASSEMBLE: Exploiting Unlabeled Data in Ensemble Methods http://www.rpi.edu/~bennek/kdd-KristinBennett1.pdf Paper Semi supervised boosting July 18, 2002 
Machine Learning Review on semi supervised learning http://pages.cs.wisc.edu/~jerryzhu/research/ssl/semireview.html Review Review on semi supervised learning July 23, 2009 
Boosting Entropy regularized boosting tutorial http://www.cse.ucsc.edu/~manfred/pubs/tut/icml2009/micml.pdf Review Manfred Warmuth's talk on Entropy Regularized Boosting July 14, 2009 
Machine Learning Tutorial on Machine Learning reductions http://hunch.net/~reductions_tutorial/ Review How to convert one type of learning problem into another July 14, 2009 
Machine Learning Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1423975 Review Good review on recommendation systems, collaborative filtering and the like July 30, 2005
Machine Learning Network-constrained regularization and variable selection for analysis of genomic data http://bioinformatics.oxfordjournals.org/cgi/content/full/24/9/1175 Paper Network-constrained regularized regression July 7, 2009
Computational Biology From DNA sequence to transcriptional behaviour: a quantitative approach http://www.nature.com/nrg/journal/v10/n7/abs/nrg2591.html Review Transcription, Sequence and nucleosome positioning. Review by Eran Segal June 27, 2009 
Biology Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes http://genesdev.cshlp.org/content/23/12/1379.full Review Next gen sequencing of transcriptomes June 21, 2009 
Machine Learning Measuring classifier performance: a coherent alternative to the area under the ROC curve http://www.springerlink.com/content/y35743hp7010g354/ Paper An alternative to AUC for measuring classifier performance June 21, 2009
Computational Biology Analytical methods for inferring functional effects of single base pair substitutions in human cancers http://www.springerlink.com/content/c86418m69u475231/ Review Inferring functions from mutations in cancer June 16, 2009
Machine Learning Active learning tutorial http://hunch.net/~active_learning/ Review Active learning tutorial June 15, 2009 
Machine Learning Learning Nonlinear Dynamic Models http://arxiv.org/abs/0905.3369 Paper A different approach for learning HMM/DBN type models June 12, 2009 
Computational GNU Linear programming library http://www.gnu.org/software/glpk/ Software GNU Linear programming library June 9, 2009 
Machine Learning The Entire Regularization Path for the Support Vector Machine http://www.jmlr.csail.mit.edu/papers/volume5/hastie04a/hastie04a.pdf  Paper How to efficiently search the space of regularization parameter C for an SVM June 9, 2009 
Computational Biology Genome-wide association analysis by lasso penalized logistic regression http://bioinformatics.oxfordjournals.org/cgi/content/full/25/6/714 Paper When the number of features is >> number of training examples this is a good methodology to try June 9, 2009 
Boosting Topics in Regularization and Boosting http://www-stat.stanford.edu/~hastie/THESES/saharon_rosset.pdf Review Great thesis on various types of regularization in boosting and SVMs June 9, 2009 
Machine Learning Grafting: fast, incremental feature selection by gradient descent in function space http://portal.acm.org/citation.cfm?id=944976 Paper The regularization term can be used to decide when to stop feature selection (a stopping criterion for boosting) March 19, 2003
Biology Deep cap analysis gene expression (CAGE) http://www.biotechniques.com/biotechniques/multimedia/archive/00003/BTN_A_000112802_O_3724a.pdf Review Description of Deep CAGE technology for identification of TSS May 28, 2009 
Biology Fundamental concepts in genetics http://www.nature.com/nrg/series/fundamental/index.html Review Nature Review papers on genetics May 26, 2009 
Biology Genetic Mapping in Human Disease http://www.sciencemag.org/cgi/content/full/322/5903/881 Review Review on genome wide association studies by David Altshuler May 27, 2008
Computational Biology Aneuploidy prediction and tumor classification with heterogeneous hidden conditional random fields http://bioinformatics.oxfordjournals.org/cgi/content/full/25/10/1307 Paper L1 regularized optimization models for CNV (Rob Schapire) May 24, 2009 
Computational Biology Statistical Inference in mRNA-Seq: Exploratory Data Analysis and Differential Expression http://www.bepress.com/ucbbiostat/paper247/ Paper mRNA-seq data normalization and differential expression May 14, 2009 
Computational Probabilistic inference using MCMC methods http://www.cs.toronto.edu/~radford/ftp/review.pdf Review MCMC, Gibbs sampling and other sampling methods September 27, 1993 
Boosting BOOSTING ALGORITHMS: REGULARIZATION, PREDICTION AND MODEL FITTING ftp://ftp.stat.math.ethz.ch/Research-Reports/Other-Manuscripts/buhlmann/BuehlmannHothorn_Boosting-rev2.pdf Review A great statistical review of boosting (regression and classification) June 4, 2007
Machine Learning VFML (Very Fast Machine Learning) toolkit http://www.cs.washington.edu/dm/vfml/ Software VFML (Very Fast Machine Learning) toolkit for very fast online learning with decision trees and Bayesian learning April 18, 2009
Computational On Estimation of a Probability Density Function and Mode http://projecteuclid.org/euclid.aoms/1177704472 Paper Kernel density estimation May 28, 1962 
Machine Learning Modification of Correlation Kernels in SVM, KPCA and KCCA in Texture Classification http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=01556208 Paper Various kernels for sequence/waveform data May 8, 2009 
Machine Learning Pattern Recognition Using Higher-Order Local Autocorrelation coefficients http://www.google.com/search?q=Pattern+Recognition+Using+Higher-Order+Local+Autocorrelation+coefficients Paper Efficient computation of higher order cross-correlation kernels June 24, 2002 
Machine Learning Comparison of Combining Methods of Correlation Kernels in kPCA and kCCA for Texture Classification with Kansei Information http://www.springerlink.com/index/804l53602706185l.pdf Paper Various kernels for sequence waveform data May 28, 2007 
Machine Learning Signal Theory for SVM Kernel Design with applications to parameter estimation and sequence kernels http://eprints.ecs.soton.ac.uk/15121/1/paper.ps Paper Kernels for sequences and waveform signals May 7, 2009 
Machine Learning Computing a nearest symmetric positive definite matrix http://www.maths.manchester.ac.uk/~nareports/narep126.pdf Paper At times a matrix is not symmetric positive definite. This paper explains how to get the nearest psd matrix. Useful for kernel computations. May 17, 1988 
Computational Notes on Functionals and Functional Derivatives http://julian.tau.ac.il/~bqs/functionals/functionals.html Review Useful for understanding functional gradient descent  
Boosting mBoost package documentation http://cran.okada.jp.org/web/packages/mboost/mboost.pdf Software Documentation of the mBoost R package by Peter Buhlmann May 2, 2009 
Teaching 10 simple rules to mix teaching with research http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000358 Review  April 27, 2009 
Machine Learning Apache Mahout http://lucene.apache.org/mahout/ Software MapReduce based Machine Learning implementation April 18, 2009 
Machine Learning IBM Parallel Machine Learning Toolbox http://www.alphaworks.ibm.com/tech/pml Software K-means and SVM parallelized; NOT open source April 18, 2009
Computational Biology Approaches to comparative sequence analysis: towards a functional view of vertebrate genomes http://www.nature.com/nrg/journal/v9/n4/full/nrg2185.html Review Review on comparative sequence analysis April 16, 2008 
Machine Learning A kernel for time series based on global alignments http://arxiv.org/PS_cache/cs/pdf/0610/0610033v1.pdf Paper Kernels for time series data that is not phased (synchronized) October 2, 2006 
Machine Learning LibSVM: A Library for Support Vector Machines http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf Software Great documentation on implementation details of various types of SVMs for classification, regression, density estimation etc. April 16, 2009 
Machine Learning Analysis of Switching Dynamics with Competing Support Vector Machines http://www.csie.ntu.edu.tw/~cjlin/papers/ijcnntime.pdf Paper Weighted SVMs for segmentation of mixed signals  
Machine Learning Cost-Sensitive Learning by Cost-Proportionate Example Weighting http://www.google.com/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fhunch.net%2F~jl%2Fprojects%2Freductions%2Fcosting%2FfinalICDM2003.pdf&ei=zvXmScbcA5ectAO35OXnAQ&usg=AFQjCNFvXBQ2pffOG7g_8x76HrmunUiQ8A&sig2=BQ8LYVNIrz_bsCRYlnkpYw Paper Cost sensitive learning - includes the fabled weighted SVM April 17, 2003 
Machine Learning Map-Reduce for Machine Learning on Multicore http://www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf Paper Parallelization of machine learning algorithms October 10, 2006 
Computational Biology Software package for primary analysis of Illumina next gen sequencing assays http://sgenomics.org/swift/ Software Highly parallelized C++ for primary data analysis of second gen sequencing assays January 24, 2009 
Computational Biology SNP imputation in association studies http://www.nature.com/nbt/journal/v27/n4/abs/nbt0409-349.html Review Eran Halperin's review on the use of SNPs and Haplotypes for association studies PART 2 April 13, 2009 
Computational Biology Maximizing power in association studies http://www.nature.com/nbt/journal/v27/n3/full/nbt0309-255.html Review Eran Halperin's review on genome wide association studies PART 1 April 13, 2009 
Computational Convex Optimization http://www.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf Review Book by Stephen Boyd March 6, 2009 
Computational Biology Efficient and accurate P-value computation for Position Weight Matrices http://www.almob.org/content/2/1/15 Paper Thresholds for PWMs based on a p-value cutoff December 11, 2007 
Computational CloudBurst: Highly Sensitive Short Read Mapping with MapReduce http://apps.sourceforge.net/mediawiki/cloudburst-bio/index.php?title=CloudBurst Software Massive parallelization of tag-to-genome mapping and k-mer manipulation. Based on Google's MapReduce and Hadoop March 18, 2009
Boosting iBoost: Boosting with item set mining http://www.kyb.mpg.de/bs/people/hiroto/iboost/ Software boosting itemsets January 24, 2009 
Biology E2F in vivo binding specificity: Comparison of consensus versus nonconsensus binding sites http://genome.cshlp.org/content/18/11/1763 Paper Discusses TFs that bind sites that do not have consensus motifs November 13, 2008
Machine Learning Support Vector Regression http://cs.ecs.baylor.edu/~hamerly/courses/5325_08s/papers/svm/smola2004regression.pdf Review Tutorial on Support vector regression November 28, 2008 
Boosting Gboost: Graph boosting http://www.kyb.mpg.de/bs/people/nowozin/gboost/ Software Code for boosting with graph mining January 24, 2009 
Bayesian Networks Graphical Models, Exponential Families, and Variational Inference http://www.nowpublishers.com/product.aspx?product=MAL&doi=2200000001 Review Extensive review on graphical models February 25, 2009 
Biology Nucleosome positioning and gene regulation: advances through genomics http://www.nature.com/nrg/journal/v10/n3/full/nrg2522.html Review Great review on the effect of nucleosome positioning on gene regulation February 21, 2009 
Computational Complexity of Finite Functions http://www.cs.columbia.edu/~rocco/Teaching/S09/6998/Boppana-Sipser-complexity.ps Review Excellent review paper on computational complexity by Boppana and Sipser August 15, 1989
Computational Extremal Combinatorics http://lovelace.thi.informatik.uni-frankfurt.de/~jukna/EC_Book/index.html Review Great book on Advanced topics in computational complexity February 16, 2009 
Computational Biology CoreBoost_HM http://www.ncbi.nlm.nih.gov/pubmed/18997002 Paper Boosting to predict TSS using sequence + chromatin mod data January 6, 2009 
Computational Biology CoreBoost http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1852414 Paper Boosting to predict TSS February 7, 2009 
Computational NoteBooks http://cscs.umich.edu/~crshalizi/notebooks/ Review Great set of links to reading material for over 400 topics February 5, 2009 
Machine Learning Olivier Bousquet, Stéphane Boucheron and Gábor Lugosi, "Introduction to Statistical Learning Theory" http://www.stat.cmu.edu/~larry/=sml2008/BBL.pdf Review Review of Statistical Learning Theory February 5, 2009 
Machine Learning Survey on active learning http://pages.cs.wisc.edu/~bsettles/active-learning Review  January 24, 2009 
Computational MATLAB CVX package for convex optimization http://www.stanford.edu/~boyd/cvx/ Software MATLAB CVX package for convex optimization January 24, 2009 
Computational Biology Predicting Unobserved Phenotypes for Complex Traits from Whole-Genome SNP Data http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000231 Paper Predicting phenotypic traits from SNP data. Try boosting on it. November 23, 2008 
Computational Biology Activity motifs reveal principles of timing in transcriptional control of the yeast metabolic network http://www.nature.com/nbt/journal/v26/n11/abs/nbt.1499.html Paper Potential project for graph boosting November 23, 2008 
Computational Biology A novel method for comparing topological models of protein structures enhanced with ligand information http://bioinformatics.oxfordjournals.org/cgi/content/short/24/23/2698?rss=1 Paper Protein representation November 21, 2008 
Machine Learning Random Forests http://www.springerlink.com/content/u0p06167n6173512/ Paper Bootstrap based method for creating regression and classification trees November 5, 2004 
Computational Biology Bowtie: Ultra fast short read aligner http://bowtie-bio.sourceforge.net/ Software Fast alignment of tags to genomes using indexing November 5, 2008 
Computational Reducing the Space Requirement of Suffix Trees http://www.zbh.uni-hamburg.de/staff/kurtz/papers/Kur1999.pdf Paper How to implement suffix trees efficiently November 26, 1999 
Computational Biology MUMmer: Ultra fast genome aligner http://mummer.sourceforge.net/ Software Very fast sequence matching and aligning November 5, 2008
Computational Biology SeqAn: C++ sequence library http://www.seqan.de Software C++ library for sequence manipulation November 5, 2008 
Boosting Gradient Tree Boosting for Training Conditional Random Fields http://jmlr.csail.mit.edu/papers/v9/dietterich08a.html Paper sequence labeling method November 4, 2008 
Boosting The boosting approach to machine learning: An overview http://www.cs.princeton.edu/~schapire/uncompress-papers.cgi/msri.ps Review Introductory review on boosting November 29, 2003 
Boosting An introduction to boosting and leveraging http://www.ee.technion.ac.il/~rmeir/Publications/MeiRae03.pdf Review Detailed review on Boosting and ensemble methods November 29, 2003 
Computational Biology Boolean implication networks derived from large scale, whole genome microarray datasets http://genomebiology.com/2008/9/10/R157 Paper Extracting Boolean implications from microarray data; could be used as a pre-processing step before learning
Computational Biology Extracting binary signals from microarray time-course data http://nar.oxfordjournals.org/cgi/content/full/gkm284v1 Paper Simple method for discretization of microarray data (mostly time course data or data that spans a large dynamic range per gene) May 1, 2007 
Machine Learning Least Angle and L1 Regression: a Review http://arxiv.org/pdf/0802.0964 Review An interesting new method for regression October 27, 2008
Boosting Sparse Boosting http://jmlr.csail.mit.edu/papers/volume7/buehlmann06a/buehlmann06a.pdf Paper A Boosting technique for regression October 11, 2006 
Boosting Improved Boosting Algorithms Using Confidence-rated Predictions http://www.springerlink.com/content/k8134wq0824k7042/ Paper Excellent paper on efficient implementation of AdaBoost and variants (such as abstaining) December 30, 1999
Computational Conjugate gradient method http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf Review Extremely lucidly explained tutorial on the Conjugate gradient method August 4, 1994 
Machine Learning A tutorial introduction to the minimum description length principle http://arxiv.org/abs/math/0406077 Review Review of the MDL principle June 4, 2004 
Computational Compressive Sensing http://igorcarron.googlepages.com/cs Review A great site on the methods of compressive sensing (a method for compression and transfer of information) October 27, 2008 