Sample Blocking Rules for Falcon Data Sets
Sample blocking rule sequences
Products
Blocking rule sequence 1
modelno_title_jaccard < 0.14 AND title_dice < 0.33 AND price_reldiff >= 0.74 AND modelno_title_overlap < 0.5
modelno_title_cosine < 0.23 AND modelno_overlap < 0.5 AND brand_jaccard < 0.13 AND modelno_title_dice < 0.06 AND title_dice < 0.61
Blocking rule sequence 2
title_overlap < 1.5 AND modelno_title_cosine < 0.41 AND title_jaccard < 0.22 AND price_reldiff >= 0.28
title_dice < 0.17 AND modelno_jaccard < 0.8 AND title_jaccard < 0.71 AND pcategory1_dice < 0.14
Songs
Blocking rule sequence 1
artist_name_jaccard < 0.29
title_jaccard < 0.94 AND release_dice < 0.94
title_jaccard < 0.94 AND title_dice < 0.97 AND title_dice3g < 0.8
Blocking rule sequence 2
release_jaccard < 0.41
title_dice < 0.96 AND title_jaccard < 0.94
Citations
Blocking rule sequence 1
journal_jaccard < 0.29
title_jaccard < 0.94 AND authors_dice < 0.94
title_jaccard < 0.94 AND title_dice < 0.97 AND title_dice3g < 0.8
Blocking rule sequence 2
title_dice < 0.61 AND title_jaccard < 0.57
title_jaccard < 0.32
authors_dice < 0.49 AND authors_jaccard < 0.14
authors_dice < 0.13 AND title_overlap < 5.5
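How a rule sequence like the ones above might be applied to a candidate pair can be sketched in a few lines of Python. This is a minimal illustration and not Falcon's actual implementation. It assumes that a rule fires when all of its AND-ed predicates hold, and that a pair is dropped as soon as any rule in the sequence fires; the page itself does not state these semantics. The helper names (parse_rule, rule_fires, is_blocked) and the feature values are hypothetical.

```python
import operator
import re

# Comparison operators that can appear in a predicate.
_OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}
# One predicate: feature name, operator, numeric threshold.
_PREDICATE = re.compile(r"^\s*(\w+)\s*(<=|>=|<|>)\s*([0-9.]+)\s*$")


def parse_rule(text):
    """Parse 'f1 < 0.2 AND f2 >= 0.5' into (feature, op, threshold) tuples."""
    predicates = []
    for part in text.split(" AND "):
        match = _PREDICATE.match(part)
        if match is None:
            raise ValueError(f"cannot parse predicate: {part!r}")
        name, op, threshold = match.groups()
        predicates.append((name, _OPS[op], float(threshold)))
    return predicates


def rule_fires(rule, features):
    """Assumed semantics: a rule fires only if every predicate holds."""
    return all(op(features[name], threshold) for name, op, threshold in rule)


def is_blocked(sequence, features):
    """Assumed semantics: a pair is dropped if any rule in the sequence fires."""
    return any(rule_fires(rule, features) for rule in sequence)


# Songs, blocking rule sequence 2 (copied from the list above).
songs_seq2 = [parse_rule(r) for r in (
    "release_jaccard < 0.41",
    "title_dice < 0.96 AND title_jaccard < 0.94",
)]

# Hypothetical feature values for two candidate pairs (not real data).
likely_match = {"release_jaccard": 0.9, "title_dice": 1.0, "title_jaccard": 1.0}
likely_nonmatch = {"release_jaccard": 0.1, "title_dice": 0.3, "title_jaccard": 0.2}

print(is_blocked(songs_seq2, likely_match))     # False: no rule fires, pair is kept
print(is_blocked(songs_seq2, likely_nonmatch))  # True: the first rule fires, pair is dropped
```

Under these assumptions the order of rules within a sequence does not change which pairs survive, only how early a pair is dropped.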