Claudia Perlich             
 


 
7 East 18th Street, 8th Floor
New York, NY 10003
Email: claudia@dstillery.com
Cell:  (914) 409 5609



I joined Dstillery (formerly Media6Degrees) as Chief Scientist in February 2010. In this role, I design, develop, analyze and optimize the machine learning systems that find prospective customers for brands and target them with online display ads. Previously, I worked in Data Analytics Research at IBM’s Watson Research Center, concentrating on data analytics and machine learning for complex real-world domains and applications. I graduated with a PhD in Information Systems from NYU and an MA in Computer Science from the University of Colorado. You can run into me at academic and industry events or at some of my teaching engagements at NYU, MIT, Wharton or Columbia. I am currently acting as general chair for KDD 2014, the oldest conference in Data Mining, which draws 1200 participants from academia, government and industry. This year's special theme is "Data Mining for the Social Good".

Current Projects and Responsibilities

My core day-to-day responsibilities as Chief Scientist include:
  • The reliable estimation of our targeting models. We are building on the order of 2000 models per week based on browsing behavior. The dimensionality of the problem is ~100 Million and we are tracking ~100 Million active users.
  • Supervision of our real time scoring engine that applies the models to identify the target segments of browsers.
  • Bid Optimization for RTB systems that incorporate the time of day, inventory and targeting score of a browser to select the appropriate bid price.
  • Performance evaluation of systems/models and campaigns both in vitro and in vivo.
  • Support business development and analytics by developing new analytical tools. Most recently, we built a matching score for publishers and marketers.
  • Reporting of analytic developments to the board and CEO/CTO.

Awards and Recognitions

PopTech Fellow: Rockefeller Foundation's Bellagio Residence on "Big Data and City Resilience" 2013

"4 under 40": Emerging Leaders Award of the American Marketing Association 2013

ARF Great Minds Innovation Award: Grand Winner 2013

CRAIN's "40 under 40"
 
Best Industry Paper KDD 2012: "Inventory Modeling and Bid-Optimization in Targeted Online Advertising"
 
People's Choice Award Runner Up at Wharton's EGII: "What Works in the New Age of Advertising & Marketing"

Best Research Paper KDD 2011: "Leakage in Data Mining: Formulation, Detection, and Avoidance"

Winner KDD CUP 2009 Fast Challenge: “Fast Challenge for CRM”

Finalist in the INFORMS Edelman competition 2009: "Operations Research Improves Sales Force Productivity"

Winner KDD CUP 2008 Task 1 and Task 2: “Identifying Breast Cancer”

Winner INFORMS Data Mining Contest 2008: “Identifying Pneumonia Patients”

Winner KDD CUP 2007 Task 2: “Predicting movie popularity for NETFLIX”

Data Mining Practice Prize at KDD 2007: “Predictive modeling for marketing”

Winner ILP Challenge 2005: “Genetic classification”

Runner Up KDD CUP 2003 Task 1: “Predicting Citation Rates”

Selected Publications

Journal Papers

 
"Machine learning for targeted display advertising: Transfer learning in action"
C. Perlich, B. Dalessandro, T. Raeder, O. Stitelman, F. Provost. Machine Learning Journal 2014.

"Leakage in Data Mining: Formulation, Detection, and Avoidance"
S. Kaufman, S. Rosset, C. Perlich, O. Stitelman. Forthcoming in Transactions on Knowledge Discovery from Data
 
"On Cross-Validation and Stacking: Building Seemingly Predictive Models on Random Data"
Claudia Perlich, Grzegorz Swirszcz. SIGKDD Explorations 12(2) (2010) 11-15

 
“Social Media Analytics: The Next Generation of Analytics-Based Marketing Seeks Insights from Blogs”

R. Lawrence, P. Melville, C. Perlich, et al. Forthcoming ORMS Today 37(1) (2010)


“On Data-Driven Analysis of User-Generated Content”

C. Perlich, et al. Forthcoming IEEE Intelligent Systems 25(1) (2010) 12-17


“Medical Data Mining: Insights from Winning Two Competitions”

S. Rosset, C. Perlich, G. Swirszcz, P. Melville, Y. Liu. Forthcoming Journal of Data Mining and Knowledge Discovery 20 (3) (2010), 439-468


“Winning the KDD Cup Orange Challenge with Ensemble Selection”

A. Niculescu-Mizil, C. Perlich, et al. Journal of Machine Learning Research W&CP 7 (2009) 23-34


“Operations Research Improves Sales Force Productivity at IBM”

R. Lawrence, C. Perlich, S. Rosset, et al. Interfaces 40(1) (2010) 33-46


“Breast Cancer Identification: KDD Cup Winners Report”

C. Perlich, P. Melville, Y. Liu, G. Swirszcz, S. Rosset and R. Lawrence. SIGKDD Explorations 10(2) (2008) 39-42


“Making the Most of Your Data: KDD Cup 2007 ‘How Many Ratings’ Winner’s Report”

S. Rosset, C. Perlich, Y. Liu. In SIGKDD Explorations 9(2) (2007) 66-69

 

“Analytics-driven solutions for customer targeting and sales force allocation”

J. Arroyo, M. Callahan, M. Collins, A. Ershov, I. Khabibrakhmanov, R. Lawrence, S. Mahatma, M. Niemaszyk, C. Perlich, S. Rosset, S. Weiss. IBM Systems Journal 46 (4) (2007)

 

“A Market-Based Framework for Bankruptcy Prediction”

Reisz, A.S. and C. Perlich. Journal of Financial Stability 3(2) (2007) 85-131

 

“Ranking-Based Evaluation of Regression Models”

Rosset, S., C. Perlich, and B. Zadrozny, Knowledge and Information Systems 12 (3) 2006 331-329

 

“ACORA: Distribution-Based Aggregation for Relational Learning from Identifier Attributes”

Perlich, C. and F. Provost. Journal of Machine Learning 62 (2006) 65-105

 

“Temporal Resolution of Uncertainty and Corporate Debt Yields: An Empirical Investigation”

Reisz, A.S. and C. Perlich. Journal of Business 79 (2006) 731-770

 

“Predicting Citation Rates for Physics Papers: Constructing Features for an Ordered Probit Model”

Perlich, C., F. Provost, and S. Macskassy. In SIGKDD Explorations (2004) 154-155

 

“Tree Induction vs. Logistic Regression: A Learning Curve Analysis”

Perlich, C., F. Provost, and J. Simonoff. Journal of Machine Learning Research 4 (2003) 211-255

 

Conference and Workshop Papers

"Scalable Supervised Dimensionality Reduction Using Clustering"

T. Raeder, C. Perlich, B. Dalessandro, O. Stitelman, F. Provost. 19th SIGKDD International Conference on Knowledge Discovery and Data Mining 2013.


"Using Co-visitation Networks For Classifying Non-Intentional Traffic"

O. Stitelman, C. Perlich, B. Dalessandro, R. Hook, T. Raeder, F. Provost. 19th SIGKDD International Conference on Knowledge Discovery and Data Mining 2013.


"Diagnosing Non-Intended Traffic In Real Time Bidding Advertising Exchanges Using Co-Visitation Network Graphs"

O. Stitelman, C. Perlich, R. Hook, B. Dalessandro, F. Provost. 4th Workshop on Information in Networks (WIN), 2012


"Evaluating and Optimizing Online Advertising: Forget the click, but there are good proxies"
B. Dalessandro, R. Hook, C. Perlich. EMPGENS2 (2012)

"Bid Optimizing and Inventory Scoring in Targeted Online Advertising"
C. Perlich, B. Dalessandro, R. Hook, O. Stitelman, T. Raeder, F. Provost. 18th SIGKDD International Conference on Knowledge Discovery and Data Mining 2012

"Design Principles of Massive, Robust Prediction Systems"
T. Raeder, O. Stitelman, B. Dalessandro, C. Perlich, F. Provost. 18th SIGKDD International Conference on Knowledge Discovery and Data Mining 2012


"Causally Motivated Attribution for Online Advertising"
B. Dalessandro, O. Stitelman, C. Perlich, F. Provost. Under Review at ADKDD workshop at the 18th SIGKDD International Conference on Knowledge Discovery and Data Mining 2012


"Leakage in Data Mining: Formulation, Detection, and Avoidance"
S. Kaufman, S. Rosset, C. Perlich. 17th SIGKDD International Conference on Knowledge Discovery and Data Mining 2011. Best Research Paper Award

"Latent Graphical Models for Quantifying and Predicting Patent Quality"
Y. Liu, Z. Kou, C. Perlich, R. Lawrence. 17th SIGKDD International Conference on Knowledge Discovery and Data Mining 2011


"Estimating The Effect Of Online Display Advertising On Browser Conversion In Observational Data"
O. Stitelman, C. Perlich, B. Dalessandro, F. Provost. The Fifth International Workshop on Data Mining and Audience Intelligence for Online Advertising at SIGKDD 2011

"Biased Modeling Performance in Stacking using Cross-Validation"
C. Perlich, G. Swirszcz. 5th Annual Machine Learning Symposium, New York Academy of Science, 2010

“A Predictive Perspective on Measures of Influence in Social Networks”

P. Melville, C. Perlich, E. Meliksetian, R. Lawrence. 2nd Workshop on Information in Networks (WIN), 2010

 
“Machine Learning for Social Media Analytics”

P. Melville, et al. 4th Annual Machine Learning Symposium, New York Academy of Science, 2009

 
“Predicting Links in Dyadic Domains”

C. Perlich, G. Swirszcz and R. Lawrence. The 1st Workshop on Information in Networks, NYU, 2009

 

“Content-based Link Prediction for Patent Marketing”

C. Perlich, G. Swirszcz and R. Lawrence. International Workshop on Recommendation-based Industrial Applications at RECSYS 2009


“Spatial-temporal causal modeling for climate change attribution”

A. Lozano, H. Li, A. Niculescu-Mizil, Y. Liu, C. Perlich, J. Hosking, N. Abe. SIGKDD International Conference on Knowledge Discovery and Data Mining 2009

 

“Winners Report: KDD Cup Breast Cancer Identification”

C. Perlich, P. Melville, Y. Liu, G. Swirszcz, S. Rosset and R. Lawrence. The KDD CUP and Workshop on Mining Medical Data at SIGKDD 2008

 

“Graphical Models for Workforce Classification”

Y. Liu, Z. Kou, C. Perlich, R. Lawrence. 

SIGKDD International Conference on Knowledge Discovery and Data Mining 2008

 

“Mining Political Blog Networks”

W. Gryc, Y. Liu, C. Perlich, R. D. Lawrence. 

Networks in Political Science Conference at Harvard 2008

 

“Making the Most of Your Data: KDD Cup 2007 ‘How Many Ratings’ Winner’s Report”

S. Rosset, C. Perlich, Y. Liu 

KDD Cup and Workshop at SIGKDD 2007

 

“A Data Mining Case Study: Analytics-driven solutions for customer targeting and sales force allocation”

R. Lawrence, C. Perlich, S. Rosset, I. Khabibrakhmanov, S. Mahatma, S. Weiss. 

Second Workshop on Data Mining Case Studies and Practice Prize at SIGKDD 2007

 

“Looking for Great Ideas: Analyzing the Innovation Jam”

Mary Helander, Rick Lawrence, Yan Liu, Claudia Perlich, Chandan Reddy, Saharon Rosset. Workshop on Web Mining and Social Network Analysis at SIGKDD 2007

 

“High Quantile Modeling for Customer Wallet Estimation with Other Applications”

Perlich, C., S. Rosset, R. Lawrence, and B. Zadrozny, 13th SIGKDD International Conference on Knowledge Discovery and Data Mining 2007

 

“Identifying Bundles of Product Options using Mutual Information Clustering”

Perlich, C., SIAM International Conference on Data Mining 2007

 

“Discriminative Embedding for Classification Tasks in Complex Relational and Network Domains”

Perlich, C., Workshop on Novel Applications of Dimensionality Reduction at NIPS 2006

 

“Quantile Modeling for Marketing”

Perlich, C., S. Rosset and B. Zadrozny. Workshop on Data Mining for Business Applications at 12th SIGKDD International Conference on Knowledge Discovery and Data Mining 2006

 

“A New Multi-View Regression Approach with an Application to Customer Wallet Estimation”

Merugu, S., S. Rosset and C. Perlich. 12th SIGKDD International Conference on Knowledge Discovery and Data Mining 2006

 

“Wallet Estimation Models”

Rosset, S., C. Perlich, B. Zadrozny, S. Merugu, S. Weiss and R. Lawrence. International Workshop on Customer Relationship Management: Data Mining Meets Marketing, NYU 2005

 

“Relational Learning for Customer Relationship Management”

Perlich, C., and Z. Huang. International Workshop on Customer Relationship Management: Data Mining Meets Marketing, NYU 2005

 

“Approaching the ILP Challenge 2005: Class-Conditional Bayesian Propositionalization for Genetic Classification”

Perlich, C. Inductive Logic Programming (ILP) 2005

 

“Gene Classification: Issues and Challenges for Relational Learning”

Perlich, C. and S. Merugu. Workshop on Multi-Relational Data Mining (MRDM), at 11th SIGKDD International Conference on Knowledge Discovery and Data Mining 2005

 

“Ranking-Based Evaluation of Regression Models”

Perlich, C., S. Rosset and B. Zadrozny. International Conference on Data Mining (ICDM) 2005

 

“Learning from Identifier Attributes: Distribution-Based Aggregation for Relational Learning”

Perlich, C. and F. Provost. Dagstuhl Seminar 05051, 2005

 

“Aggregation-Based Feature Invention and Relational Concept Classes”

Perlich, C. and F. Provost. Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2003, 167-176

 

“Citation-Based Document Classification”

Perlich, C. Workshop on Information Technology and Systems (WITS) 2003

 

“Aggregation and Concept Complexity in Relational Learning”

Perlich, C. and F. Provost. Workshop on Learning Statistical Models from Relational Data (SRL), at IJCAI 2003

 

“Relational Learning Problems and Simple Models”

Provost, F., C. Perlich and S. Macskassy. Workshop on Learning Statistical Models from Relational Data (SRL), at IJCAI 2003

 

“ACORA: Automated Construction of Relational Attribute”

Perlich, C. Prototype Track at Workshop on Information Technology and Systems (WITS) 2003

 

“Discovering Knowledge from Relational Data Extracted from Business News”

Bernstein, A., S. Clearwater, S. Hill, C. Perlich and F. Provost. Workshop on Multi-Relational Data Mining (MRDM), at Eighth SIGKDD International Conference on Knowledge Discovery and Data Mining 2002

 

“A Modular Approach to Relational Data Mining”

Perlich, C. and F. Provost. American Conference on Information Systems (AMCIS) 2002

 

“Modeling of Scholastic Aptitude Tests”

Weigend, A.S., C. Perlich and M. Brehler. 

International Conference on Neural Information Processing (ICONIP) 1996

 

Invited Book Chapters

“Database Mining for Marketing”

Perlich, C. and M. Saar-Tsechansky. In Encyclopedia of Marketing, 2010

 

“Learning Curves in Machine Learning”

Perlich, C. In Encyclopedia of Machine Learning, C. Sammut and G. Webb Editors, Springer 2009

 

“Quantile Modeling for Wallet Estimation”

Perlich, C. and S. Rosset 

Statistical Methods in eCommerce Research

 

“Aggregation for Predictive Modeling with Relational Data”

Perlich, C. and F. Provost 

In Encyclopedia of Data Warehousing and Mining 2004

 

“Modeling Quantiles”

Perlich, C., S. Rosset and B. Zadrozny. Encyclopedia of Data Warehousing and Mining, Second Edition



“Robust Regression Evaluation”

Perlich, C., S. Rosset and B. Zadrozny. Encyclopedia of Data Warehousing and Mining, Second Edition


Tutorials

“Predictive Modeling in the Wild: Success Factors in Data Mining Competitions and Real-Life Projects”

At SIGKDD International Conference on Knowledge Discovery and Data Mining 2009


Patents

YOR820050714 Ranking-Based Method for Evaluating Customer Wallet Models

YOR820060081 Method for Predicting Customer Wallet

YOR820060057 Method for Customer-Choice Based Bundling of Product Options

YOR920090427 Model for Market Impact Analysis of Part Removal from Complex Products