I undertook this project under the guidance of Professor Yesim Orhun (Ross School of Business). I learned to download and collate LinkedIn data and to check the "activities and societies" field of LinkedIn profiles for membership in a list of student organizations. All of the code and a detailed description of the project are available on my GitHub page.
This project involved the following steps:
Collating information on student fraternities and sororities at the University of Michigan (in file orgs_list.csv).
Writing code to parse and collate information from LinkedIn, using two approaches:
Direct download using Selenium and BeautifulSoup (scrape_linkedin_public_final.py), followed by parsing the downloaded page information with BeautifulSoup (readin_linkedin_final.py).
Downloading LinkedIn information using the Proxycurl profile-scraping API (proxycurl_linkedin_final.py).
Writing code (add_fratindicator_final.py) to create a dummy indicator for membership in a fraternity organization, along with a variable holding the name of that organization, based on the "activities" variable in the LinkedIn dataset. I did this for the data created by the direct-download approach, but the code can easily be adjusted to work with other files.
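To illustrate the last step, here is a minimal sketch of how a membership indicator could be built from the "activities" text. It is not the code in add_fratindicator_final.py. The function names, the output fields frat_member and frat_name, and the matching rule (case-insensitive, whole-phrase match) are all assumptions made for this example. The organization names are passed in as a plain list because the column layout of orgs_list.csv is not described here.

```python
import re


def build_org_pattern(org_names):
    """Compile one case-insensitive regex that matches any organization name as a whole phrase."""
    cleaned = [name.strip() for name in org_names if name and name.strip()]
    if not cleaned:
        raise ValueError("org_names must contain at least one non-empty name")
    # Longest names first, so a longer name is preferred over a shorter one
    # that starts at the same position in the text.
    cleaned.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in cleaned)
    # The lookarounds stop a name from matching inside a longer word.
    return re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)", re.IGNORECASE)


def add_frat_indicator(rows, org_names, text_field="activities"):
    """Return copies of rows with a 0/1 indicator and the matched organization name.

    The output keys 'frat_member' and 'frat_name' are hypothetical names
    chosen for this sketch.
    """
    pattern = build_org_pattern(org_names)
    # Map lower-cased names back to the spelling used in the organization list.
    canonical = {name.strip().lower(): name.strip() for name in org_names if name and name.strip()}
    out = []
    for row in rows:
        text = row.get(text_field) or ""  # treat a missing field as empty text
        match = pattern.search(text)
        new_row = dict(row)
        new_row["frat_member"] = 1 if match else 0
        new_row["frat_name"] = canonical[match.group(1).lower()] if match else ""
        out.append(new_row)
    return out


# Made-up example data, for illustration only.
orgs = ["Delta Phi", "Alpha Delta Phi", "Sigma Chi"]
profiles = [
    {"name": "A", "activities": "Member, alpha delta phi; Ski Club"},
    {"name": "B", "activities": "Chess Club"},
    {"name": "C", "activities": None},
]
result = add_frat_indicator(profiles, orgs)
for row in result:
    print(row["name"], row["frat_member"], row["frat_name"])
```

Only the first matching organization in each profile is recorded. A profile that lists several organizations would need a variant that collects every match, for example with pattern.findall.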