Struct SVM ADMM Code

Code webpage: ADMM for Structural SVMs with augmented L1 regularizers

This page contains information about the software, StructSVM-ADMM, which has the implementation of ADMM for Sparse Structural SVMs with augmented L1 regularizers.

ADMM for Structural SVMs with augmented L1 regularizers

Various mining tasks involve prediction of highly specialized and structured objects like graphs, trees and sequences. Structural SVMs have become a powerful alternative for learning and efficiently predicting such structured objects. Structural SVMs with augmented L1 regularizers help in obtaining sparse models. Sparse models lead to feature identification, extraction and also aid in reducing the complexity of computationally intensive inference procedures.

Given a sample of structured

inputs and outputs {(xi,yi)}i=1,2,..n ∈ (X x Y), Structural SVMs with augmented L1 regularizers solve an optimization problem:

minw,ξ β1||w||1+ β22 (||w||2)2 + C Σi=1,2,…,n ξi

s.t. wT Δfi(y) Li(y) - ξi, ∀ i, ∀ y∈Y.

Δfi(y)=f(xi,yi) - f(xi,y) is the difference feature vector and Li(y)=L(yi,y) is a suitable function capturing the dissimilarity between the two outputs yi and y.

When β1=1 and β2=zero we recover L1 regularized structural SVM. When β1 and β2 are non-zero we recover elastic net regularized structural SVM problem.

The designed ADMM solves a sequence of three problems one of which makes use of sequential dual method developed for training L2 regularized structural SVM while the other two problems have easy-to-compute closed-form solutions.

Structural SVMs are used in various applications like sequence labeling, parse-tree classification, image segmentation etc.,

Download

You can download the current version of StructSVM-ADMM for Sparse Structural SVMs from the link struct_svm_admm.zip

This software is available free only for non-commercial purposes. Please cite the paper mentioned below if you intend to use this program.

How to Cite

Please cite the following paper :

ADMM for Training Sparse Structural SVMs with Augmented L1 regularizers.

P.Balamurugan, Anusha Posinasetty, Shirish Shevade.

In SIAM International Conference on Data Mining , 2016.

The bibtex for the paper is :

@article{StructSVMADMM, title = {ADMM for Training Sparse Structural SVMs with Augmented $\ell_1$ regularizers}, author = {P. Balamurugan and Anusha Posinasetty and Shirish Shevade}, journal = {Proceedings of the Sixteenth SIAM International Conference on Data Mining}, year = {2016} }

How To Use

Running ADMM on example Data

The struct_svm_admm directory contains a sub-folder data in which two sample data files sampledata and OCR200 are given.

You can try running L1 ADMM on sampledata by typing the following command:

./structsvm_sdm_learn -c 1 -w 2 data/sampledata

You should get a primal objective value ≈ 123.10923143.

The performance and sparsity of the model obtained on the sample train data using L1 ADMM is given in the graphs below:

You can try running Elastic Net ADMM on sampledata by typing the following command:

./structsvm_sdm_learn -c 1 -w 3 data/sampledata

You should get a primal objective value ≈ 119.94880210.

The performance and sparsity of the model obtained on the sample train data using Elastic Net ADMM is given in the graphs below:

You can try running similar experiments on the OCR200 data by typing the following commands:

./structsvm_sdm_learn -c 1 -w 2 data/OCR200

./structsvm_sdm_learn -c 1 -w 3 data/OCR200

For L1 ADMM you should get a primal objective value ≈ 430.93739342.

For Elastic Net ADMM you should get a primal objective value ≈ 409.27678099.

The performance and sparsity of the model obtained on OCR200 using L1 ADMM and Elastic Net ADMM are given in the graphs below:

Description of feature vector used in the implementation

For sequence labeling application, the input is made of parts x=(x1,x2,…,xT) and the output is also made of corresponding parts y=(y1,y2,…,yT). The feature vector for the input (x,y) given by f(x,y) is constructed using the first-order features and second-order features. The first-order features are obtained by placing the features xt at yt-th position. The second-order features are obtained by considering the label assignment to the label pair yt-yt-1 at every stage t=1,2,…,T.

Input File Format

A). The input file format should be of SVMStruct format.

example :

4 qid:1 5:-1 345:0.3 2101:0.89 6 qid:1 1:-1.5 5:0.2 39:1.98 3 qid:2 567:-0.24 689:0.8 792:0.21 

In case your data has the CoNLL text chunking format, please go to Step. B.

If the input file format is different from these formats, then the reader module has to be changed.

All other modules which assume this format should also be changed.

B). CoNLL text chunking data has the following form :

But CC O we PRP B-NP certainly RB O like IN O what WP B-NP we PRP B-NP 've VBP O seen VBN O so RB O far RB O . . O " " O 

This data has a column-format (number of columns=3) and a suitable template file is required to extract features from this data.

The template file contains template rules of the following form :

U00:%x[-2,0] U01:%x[-1,0] U02:%x[0,0] U03:%x[1,0] U04:%x[2,0] U05:%x[-1,0]/%x[0,0] U06:%x[0,0]/%x[1,0] U10:%x[-2,1] U11:%x[-1,1]     U12:%x[0,1]  U13:%x[1,1] U14:%x[2,1] U15:%x[-2,1]/%x[-1,1] U16:%x[-1,1]/%x[0,1] U17:%x[0,1]/%x[1,1] U18:%x[1,1]/%x[2,1]     U20:%x[-2,1]/%x[-1,1]/%x[0,1] U21:%x[-1,1]/%x[0,1]/%x[1,1] U22:%x[0,1]/%x[1,1]/%x[2,1] B 

Similar templates are available for other Bio-datasets like BioCreative and BioNLP, the data of which is also in the column format.

For such datasets, please use the utility conll2svmlightconverter by typing the following in the shell prompt:

conll2svmlightconverter template_file train_file test_file validation_file

This will extract the features in SVMStruct format in a new file, the name for which is displayed in the shell prompt.

To know more about this utility, just type conll2svmlightconverter in the shell prompt.

Once you get the features in SVMStruct format, you can start training as described in Step. 8.

The conll2svmlightconverter tool is modelled similar to the feature generation procedure in CRFSGD by Leon Bottou.

Acknowledgment

The implementation uses the SVMStruct API used in Sequence Tagging and available at http://www.cs.cornell.edu/People/tj/svm_light/svm_hmm.html.

From SVMStruct implementation, the following files have been borrowed with minor modifications :

1. svm_common.h

2. svm_common.c

3. svm_learn.h

4. svm_struct_api.c

5. svm_struct_api.h

6. svm_struct_api_types.h

Contact

For any queries, bugs, suggestions please contact: