Part 2. Data Analytics and Machine Learning

Topics

Materials

Session Leaders

1st Week

(8/10, 14:00~15:30)

  • Causal Inference versus Prediction

- Are they substitutes or complements?

- Hofman, J. M., Sharma, A., & Watts, D. J. (2017). Prediction and Explanation in Social Systems. Science, 355(6324), 486-488.

- Athey, S. (2017). Beyond Prediction: Using Big Data for Policy Problems. Science, 355(6324), 483-485.

  • Brogaard, J., Engelberg, J., & Parsons, C. A. (2014). Networks and Productivity: Causal Evidence from Editor Rotations. Journal of Financial Economics, 111(1), 251-270.
  • Bertsimas, D., Brynjolfsson, E., Reichman, S., & Silberholz, J. (2015). OR Forum—Tenure Analytics: Models for Predicting Research Impact. Operations Research, 63(6), 1246-1261. ("Moneyball for Professors," MIT Sloan Management Review. Dec 14, 2016)
  • Optional

- Park, J., Kim, J., Pang, M. S., & Lee, B. (2017). Offender or Guardian? An Empirical Analysis of Ride-Sharing and Sexual Assault. KAIST Working Paper.

- Gerber, M. S. (2014). Predicting Crime Using Twitter and Kernel Density Estimation. Decision Support Systems, 61, 115-125.

Jiyong Park

2nd Week

(8/16, 14:00~15:30)

  • Predictive Analytics
  • Slide Download
  • Dhar, V., Geva, T., Oestreicher-Singer, G., & Sundararajan, A. (2014). Prediction in Economic Networks. Information Systems Research, 25(2), 264-284.
  • Geva, T., Oestreicher-Singer, G., Efron, N., & Shimshoni, Y. (2017). Using Forum and Search Data for Sales Prediction of High-Involvement Products. MIS Quarterly, 41(1), 65-82.
  • Optional

- Goel, S., & Goldstein, D. G. (2014). Predicting Individual Behavior with Social Networks. Marketing Science, 33(1), 82-93.

- Shmueli, G., & Koppius, O. (2011). Predictive Analytics in Information Systems Research. MIS Quarterly, 35(3), 553-572.

Yoonseock Son

3rd Week

(8/23, 14:00~15:30)

  • Machine Learning and Deep Learning

- How can they be applied to empirical research?


- How we're teaching computers to understand pictures (Fei-Fei Li, 2015)

  • Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining Satellite Imagery and Machine Learning to Predict Poverty. Science, 353(6301), 790-794.
  • Shin, D., He, S., Lee, G., Whinston, A., Cetintas, S., & Lee, K. (2016). Content Complexity, Similarity, and Consistency in Social Media: A Deep Learning Approach. UT Austin Working Paper. Available at SSRN: https://ssrn.com/abstract=2830377
  • Optional

- Lee, D., Hosanagar, K., & Nair, H. (2017). Advertising Content and Consumer Engagement on Social Media: Evidence from Facebook. Management Science, forthcoming. Available at SSRN: https://ssrn.com/abstract=2290802

- Park, J., Kim, J., Cho, D., & Lee, B. (2017). Pitching with Style: The Role of Entrepreneur’s Speech in Crowdfunding Success. KAIST Working Paper.

- Zhang, S., Lee, D., Singh, P. V., & Srinivasan, K. (2017). How Much Is an Image Worth? Airbnb Property Demand Estimation Leveraging Large Scale Image Analytics. CMU Working Paper. Available at SSRN: https://ssrn.com/abstract=2976021

- Lin, Y. K., Chen, H., Brown, R. A., & Li, S. H. (2017). Healthcare Predictive Analytics for Risk Profiling in Chronic Care: A Bayesian Multitask Learning Approach. MIS Quarterly, 41(2), 473-495.

Jiyong Park

For an excellent guide to reconciling econometrics and machine learning, see the two articles below by Hal Varian, chief economist at Google and professor emeritus at UC Berkeley (he is also the co-author, with Shapiro, of the book "Information Rules: A Strategic Guide to the Network Economy").

They also serve as a good wrap-up for Parts 1 and 2 of this Summer Session.


  • An elementary introduction to econometrics for computer scientists:

- Varian, H. R. (2016). Causal Inference in Economics and Marketing. Proceedings of the National Academy of Sciences (PNAS), 113(27), 7310-7315.

  • An elementary introduction to machine learning and big data for economists:

- Varian, H. R. (2014). Big Data: New Tricks for Econometrics. Journal of Economic Perspectives, 28(2), 3-27.

Prerequisite for Python

- This is a well-designed course for Python beginners. It is short, easy, and free.

- If you are not familiar with Python, taking this basic course before the technical sessions below is highly recommended.

4th Week

(8/19, 15:00~16:30)

  • (Technical Session) Exploiting Unstructured Data

(Step 1) Web crawling

- Kim, J., & Park, J. (2017). Does Facial Expression Matter Even Online? An Empirical Analysis of Creator’s Facial Expression of Emotion and Crowdfunding Success. KAIST Working Paper.

Jongho Kim

(Data Scientist at NICE P&I)

5th Week

(8/26, 15:00~16:30)

  • (Technical Session) Exploiting Unstructured Data

(Step 2) Applying cloud-based APIs

Jongho Kim

(Data Scientist at NICE P&I)