Goodreads Datasets

NOTE: Our datasets have moved!

Please see our new webpage for instructions on downloading these datasets. This Google site, along with the download links in our previous Google Drive, will be deprecated soon.

====================================

The datasets were collected in late 2017 from goodreads.com. We scraped only users' public shelves, i.e. shelves that anyone can view on the web without logging in. User IDs and review IDs are anonymized.

We collected these datasets for academic use only. Please do not redistribute them or use them for commercial purposes.

If you are using our datasets, please cite the following papers:

If you have any questions or find any bugs regarding these datasets, feel free to contact Mengting Wan (m5wan@ucsd.edu). 

Latest Updates

We updated several files in May 2019. We really appreciate those who helped us identify duplicates and bugs in the previous version!

Overview

We collected three groups of datasets: (1) meta-data of the books, (2) user-book interactions (users' public shelves) and (3) users' detailed book reviews. These datasets can be merged together by matching book/user/review ids. 
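As a rough illustration of merging by IDs, the sketch below joins a few made-up records in plain Python. The field names (book_id, user_id, review_id, title, rating, review_text) and the sample values are assumptions for illustration only; check the "Books", "Shelves" and "Reviews" pages for the actual record layout.

```python
# Hypothetical sample records; field names and values are illustrative assumptions,
# not taken from the actual dataset files.
books = [
    {"book_id": "b1", "title": "Example Book One"},
    {"book_id": "b2", "title": "Example Book Two"},
]
interactions = [
    {"user_id": "u1", "book_id": "b1", "rating": 5},
    {"user_id": "u2", "book_id": "b1", "rating": 3},
    {"user_id": "u1", "book_id": "b2", "rating": 4},
]
reviews = [
    {"review_id": "r1", "user_id": "u1", "book_id": "b1", "review_text": "Loved it."},
]


def merge_records(books, interactions, reviews):
    """Attach book meta-data and any matching review to each interaction."""
    # Index books by book id and reviews by (user id, book id) for fast lookup.
    books_by_id = {b["book_id"]: b for b in books}
    reviews_by_key = {(r["user_id"], r["book_id"]): r for r in reviews}

    merged = []
    for row in interactions:
        book = books_by_id.get(row["book_id"], {})
        review = reviews_by_key.get((row["user_id"], row["book_id"]), {})
        merged.append({
            **row,
            "title": book.get("title"),
            "review_id": review.get("review_id"),
            "review_text": review.get("review_text"),
        })
    return merged


merged = merge_records(books, interactions, reviews)
for row in merged:
    print(row)
```

Interactions without a matching review keep review_id and review_text as None, which mirrors a left join on the interaction records. For the full-size files, a dataframe library or a database would likely be a better fit than in-memory dictionaries.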

Basic Statistics of the Complete Book Graph:

Note that the complete interaction dataset is very large! We extracted several medium-size subsets by genre and recommend using these subsets for experimentation first (see "By Genre" for details).

(Meta-Data of Books)

We collected detailed meta-data about 2.36M books. Please see the "Books" page for dataset details and sample records.

Quick links:

(User-Book Interactions)

We collected more than 229M user-book interactions. Please see the "Shelves" page for dataset details and sample records.

Quick links (these files can be very large; consider using the genre-wise datasets if your resources are limited):

(Book Review Texts)

We further re-scraped more than 15M records with detailed review text. Please see the "Reviews" page for details and sample records.

Quick links:

(Operate the Datasets)

We created several Jupyter notebooks that illustrate how to download and read these datasets and that provide some basic explorations of the data.
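For readers who want a starting point before opening the notebooks, here is a minimal sketch of streaming a large file record by record. It assumes a file is gzip-compressed JSON with one record per line; that layout, the function name read_json_lines, and the demo file name are assumptions made here, so confirm the real format on the dataset pages. The demo writes a tiny sample file so the snippet runs on its own.

```python
import gzip
import json
import os
import tempfile


def read_json_lines(path, limit=None):
    """Yield records from a gzip-compressed file with one JSON object per line.

    Stops after `limit` records when a limit is given, so a large file can be
    sampled without reading it to the end.
    """
    yielded = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if limit is not None and yielded >= limit:
                break
            line = line.strip()
            if not line:
                continue  # skip blank lines
            yield json.loads(line)
            yielded += 1


# Build a small stand-in file (hypothetical name and fields) to demonstrate usage.
demo_path = os.path.join(tempfile.mkdtemp(), "sample_books.json.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as handle:
    for i in range(5):
        handle.write(json.dumps({"book_id": str(i), "title": f"Book {i}"}) + "\n")

first_three = list(read_json_lines(demo_path, limit=3))
print(len(first_three))
```

Because the function is a generator, only one line is held in memory at a time, which matters for the larger interaction and review files.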

Quick links:

By Genre

Children

Download Links:

Comics & Graphic

Download Links:

Fantasy & Paranormal

History & Biography

Mystery, Thriller & Crime

Poetry

Download Links:

Romance

Download Links:

Young Adult

Download Links: