RT-Rank

RT-Rank is an open source project consisting of various machine learning algorithms that use regression trees.  We implemented it to compete in the Yahoo Learning to Rank Challenge, where it placed 7th globally on Set 2 and 8th on Set 1 (1st on the evaluation set of Set 2).  The package includes large-scale (parallel) implementations of Gradient Boosting, Random Forests, and a novel technique, IGBRT (Initialized Gradient Boosted Regression Trees).  The decision tree (our variation of CART) is written in C++; the Python scripts implement the decision tree ensembles.
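
To give a rough feel for what "initialized" boosting means, here is a minimal sketch in Python. It is not RT-Rank's code: the names fit_stump, boost_from and mse are hypothetical, the trees are one-feature stumps rather than the CART variant, and the starting predictions are a constant stand-in for what would, under our reading of IGBRT, come from a Random Forest. It only shows gradient boosting with squared loss continuing from a given set of initial predictions.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree on a single feature (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All feature values are identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost_from(xs, ys, initial_predictions, rounds=50, learning_rate=0.1):
    """Gradient boosting with squared loss, started from given predictions."""
    predictions = list(initial_predictions)
    trees = []
    for _ in range(rounds):
        # For squared loss the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x)
                       for p, x in zip(predictions, xs)]
    return trees, predictions


def mse(ys, predictions):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Toy data (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 1.0, 1.0, 3.0, 3.0]

# Assumption: a constant guess stands in for Random Forest predictions.
start = [sum(ys) / len(ys)] * len(ys)

trees, final = boost_from(xs, ys, start)
print(mse(ys, start), mse(ys, final))
```

The better the starting predictions, the less work the boosting rounds have to do, which is the intuition behind combining the two methods.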

The software makes it easy to run multi-threaded Boosting, Random Forests, and a combination of the two, for either regression or classification.  The data files must follow the SVM-Light standard as described here.
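
For readers unfamiliar with the SVM-Light format, each line holds a target value, an optional qid:<id> query identifier (used for ranking data), a list of <feature>:<value> pairs, and an optional comment after "#". The following Python sketch parses one such line. The function name parse_svmlight_line and the example line are illustrative assumptions, not part of RT-Rank.

```python
def parse_svmlight_line(line):
    """Parse one SVM-Light line into (target, qid, features).

    Returns None for blank or comment-only lines.
    """
    # Drop an optional trailing comment introduced by '#'.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    tokens = body.split()
    target = float(tokens[0])
    qid = None
    features = {}
    for token in tokens[1:]:
        key, value = token.split(":", 1)
        if key == "qid":
            qid = value                      # query id, kept as a string
        else:
            features[int(key)] = float(value)  # sparse feature index -> value
    return target, qid, features


# Made-up example line: relevance 2, query 10, three sparse features.
example = "2 qid:10 1:0.5 3:1.25 7:0 # doc-42"
print(parse_svmlight_line(example))
```

Features that do not appear on a line are treated as zero, which is why a dictionary keyed by feature index is a natural representation here.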

The theory behind these algorithms is explained in our recently published research paper.

If you use this algorithm in your research, please cite:

@Article{MohanChWe10,
title = {Web-Search Ranking with Initialized Gradient Boosted Regression Trees},
author = {Ananth Mohan and Zheng Chen and Kilian Q. Weinberger},
journal = {Journal of Machine Learning Research, Workshop and Conference Proceedings},
volume = {14},
year = {2011},
pages = {77-89}
}


(We thank Craig MacDonald for providing a helpful bugfix concerning the parsing of input files.)