People used to call me “Whopper Jr.” because I would cut a Burger King Whopper in half and make it last for two meals a day. My family was in the lowest income decile in my country and eventually went bankrupt. Rather than give in to resentment or frustration, I relied on my will and determination: I thought hard about how to survive on my own and began developing financial strategies. I had to leave university and earn money in various ways, including working as a day laborer in construction and packing parcels in a factory at night. Even during this time, I kept organizing my math studies and answered others’ questions as a mentor on math-related community websites. When I discovered that undergraduates complained that calculus textbooks were too heavy and hard to understand, I took an idea from my nickname. To overcome my hardship and keep studying, I started my own business, writing junior-sized calculus books compiled from what I had organized and filming lecture videos. The business was profitable enough to ease my heavy financial burden, and I was able to attend and graduate from Yonsei University.
I have a high level of concentration that does not falter even in a crisis. I regard each problem as an opportunity to grow through the process of analyzing and solving it. The ability to stay focused until a final result is obtained is critical for producing reliable work and contributing to academia. In this respect, I know how to pursue research calmly and persistently.
I personally run an academic scholarship foundation that helps students who are worn out from studying or struggling financially. It is called “Woo's Academic Mentoring Foundation (WAMF)” and has been operating since 2020. Having experienced financial difficulties myself, and having only recently reached a reasonably stable student life, I run this foundation to help students whose studies are held back by their financial situation, as mine were, and to pass positive energy on to them. The amount may not be large yet, but I plan to continue this activity even after I become a professor. I dream of a future in which many students like me can immerse themselves happily in research without worrying about money.
(a) Building a patent scaling dataset as a standard for academic innovation research (2023 - Present)
This ongoing project with my best friend, Kyoungwon Kim, builds on more than four years of experience studying patents. We mainly use the PATSTAT database with SQL and Python, supplemented by PatentsView. We also expect to link this dataset with founding-year and VC-flag data (Michael Ewens and Matt Marx, "Entrepreneurial Patents," Working Paper, 2023). I also plan to apply natural language processing (NLP) techniques to our dataset, building on Professor Harry Mamaysky's Big Data in Finance class.
(b) Web scraping for automatic filing downloads from EDGAR (2023)
EDGAR, the Electronic Data Gathering, Analysis, and Retrieval system, is the primary system through which companies and others submit documents to the U.S. Securities and Exchange Commission (SEC). I wrote a Python web scraper that automatically downloads filings such as SEC Form DEF 14A, also called a definitive proxy statement. This form is intended to give security holders enough information to vote confidently at an upcoming shareholders’ meeting. It is most commonly used as an annual meeting proxy and is filed in advance of a company’s annual meeting. I collected 117,587 DEF 14A filing records covering 12,705 unique CIK codes (the Central Index Key, or CIK, is used on the SEC's computer systems to identify corporations and individuals who have filed disclosures with the SEC).
(c) An integrated database of private companies for academia (2022 - 2023)
I consolidated approximately 260,000 scattered audit reports of domestic private companies into a single database, the first of its kind in Korea. I built it to address the lack of usable information on private companies, whose data remain scattered despite countless corporate studies. The database forms the basis of my ongoing research on “Overcoming endogeneity issues in IPOs” and “Economic shocks to companies participating in North Korea business.” The former contributes to solving the fundamental problem caused by biased sample selection in IPO research. The latter can contribute to unraveling the economic specificity of North Korea. I am collaborating with several professors on these studies, which is developing my ability to analyze data from an economic perspective as well as a statistical one.
(d) Machine learning code developed as a teaching assistant for “Data Mining Theory and Application” (2021)
I delivered weekly lectures in this class based on Python code I organized myself, and I uploaded each piece of code to my GitHub page. The code covers regression, variable selection methods, principal component analysis (PCA), factor analysis (FA), clustering, the k-nearest neighbors algorithm, association rules, collaborative filtering, social network analysis, logistic regression, decision trees, artificial neural networks, self-organizing maps (SOM), and ensemble methods (including bagging, boosting, and random forests).