Mining rare sequential patterns
This page presents the results and experiments used for the ILP 2017 article.
Download the article here.
Abstract
We present an approach to mining meaningful rare sequential patterns, based on the declarative programming paradigm of Answer Set Programming (ASP). The new setting of rare sequential pattern mining is introduced. To cope with the huge number of meaningless rare patterns, our ASP approach provides an easy way to encode expert constraints on the expected patterns. We use clingo 5.0 as the solver.
Authors
- A. Samet
- T. Guyet
- B. Negrevergne
Encodings
ASP-Rare Sequence Miner
- Instance example: instance.lp
- Encoding for the rare pattern mining task: sequences_rarepatterns_fg_opt.lp
- Command: clingo 0 -c params.lp instance.lp sequences_rarepatterns_fg_opt.lp
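
To make the mining task concrete, here is a minimal brute-force sketch in Python. It is not the ASP encoding. It assumes that a pattern is a sequence of items, that its support is the number of database sequences containing it as a subsequence (gaps allowed), and that a pattern is rare when its support is non-zero and strictly below a threshold. The function names, the toy database and the threshold are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` consumes the iterator up to the first match,
    # so later items must be found further along the sequence.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences of `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def rare_patterns(database, threshold, max_length):
    """Enumerate patterns with 0 < support < threshold (assumed definition of rare)."""
    items = sorted({item for seq in database for item in seq})
    found = {}
    for length in range(1, max_length + 1):
        for pattern in product(items, repeat=length):
            s = support(pattern, database)
            if 0 < s < threshold:
                found[pattern] = s
    return found


# Hypothetical toy database of four sequences.
database = [("a", "b", "c"), ("a", "c"), ("a", "b"), ("b", "c")]
rare = rare_patterns(database, threshold=3, max_length=3)
for pattern, s in rare.items():
    print(pattern, s)
```

This exhaustive enumeration is exponential in the pattern length and is only meant to illustrate the definitions on very small inputs.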
ASP-Minimal Rare Sequence Miner
An optimized version for mining minimal rare patterns has been designed:
- Encoding: sequences_rarepatterns_mri_fg_opt.lp
- Command: clingo 0 -c params.lp instance.lp sequences_rarepatterns_mri_fg_opt.lp --enum-mode=domRec --heuristic=Domain --dom-mod=3,16
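
As an illustration of what "minimal" means here, the following Python sketch checks a single pattern by brute force. It is not the ASP encoding and says nothing about how the solver options above work. It assumes that a rare pattern has a non-zero support strictly below the threshold, and that a rare pattern is minimal when every proper sub-pattern is frequent. Because support can only grow when items are removed, checking the sub-patterns obtained by deleting one item is enough. All names and the toy database are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences of `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def is_minimal_rare(pattern, database, threshold):
    """Assumed definition: rare (0 < support < threshold), all sub-patterns frequent."""
    s = support(pattern, database)
    if not (0 < s < threshold):
        return False  # frequent, or absent from the database
    for i in range(len(pattern)):
        sub = pattern[:i] + pattern[i + 1:]  # delete the i-th item
        if sub and support(sub, database) < threshold:
            return False  # a shorter pattern is already rare
    return True


# Hypothetical toy database of four sequences.
database = [("a", "b", "c"), ("a", "c"), ("a", "b"), ("b", "c")]
print(is_minimal_rare(("a", "b"), database, threshold=3))
print(is_minimal_rare(("a", "b", "c"), database, threshold=3))
```

On this toy database, ("a", "b") is rare while both of its items are frequent, whereas ("a", "b", "c") is rare but contains the already rare ("a", "b").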
Experiments
Data
Simulated datasets used to evaluate computing performance:
- Dataset generator: generator.py (run it with the -h option for details on how to use it)
- Dataset ZIP: a set of databases of simulated sequences, with mean sequence lengths ranging from 10 to 20. For example, the file database_100_2_10.lp contains a database of 100 records with a mean sequence length of 10; the 2 is the dataset id.
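
The naming convention described above can be decoded with a short Python helper. This is a sketch that assumes every file follows the database_<records>_<id>_<mean length>.lp layout of the example; the function and class names are hypothetical and are not part of generator.py.

```python
import re
from typing import NamedTuple


class DatasetInfo(NamedTuple):
    n_records: int    # number of records in the database
    dataset_id: int   # dataset id
    mean_length: int  # mean sequence length


# Assumed layout: database_<records>_<id>_<mean length>.lp
_NAME_RE = re.compile(r"^database_(\d+)_(\d+)_(\d+)\.lp$")


def parse_dataset_name(filename):
    """Decode the three numeric fields of a dataset file name."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected dataset file name: {filename!r}")
    return DatasetInfo(*(int(group) for group in match.groups()))


info = parse_dataset_name("database_100_2_10.lp")
print(info.n_records, info.dataset_id, info.mean_length)
```

Such a helper can be handy for grouping benchmark runs by database size or mean sequence length.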
Apriori Rare & MRG_EXP
Our ASP encodings are compared with procedural approaches.