Documentation

ReForeSt Overview

ReForeSt is a distributed, scalable implementation of the Random Forest (RF) learning algorithm that targets fast, memory-efficient processing. Its main contributions are threefold: (i) it provides a novel approach to implementing RF in a distributed environment, designed for efficient in-memory processing; (ii) it is faster and more memory efficient than MLlib, the de facto standard; and (iii) its level of parallelism is self-configuring.

ReForeSt and its documentation are designed for developers and data scientists who are familiar with the Spark environment and the MLlib library. Please consult the documentation for both before starting with ReForeSt.

Downloading

Get ReForeSt from the downloads page of the project website. ReForeSt is built on top of Apache Spark and requires Spark to run.

Quick Start

The following examples show how to train a random forest with ReForeSt:

Random Forest learning algorithm in Apache Spark
Random Forest learning algorithm in Apache Spark with sub-tree local computation
Random Rotations learning algorithm

API Docs

ReForeSt (scaladoc)

ReForeSt has been developed at Smartlab - DIBRIS - University of Genoa.
