Cryptocurrency prices from coinmarketcap.com
The code below scrapes cryptocurrency price data from coinmarketcap.com. The default price resolution is 5 minutes (the highest available), but it can easily be changed to hourly, daily, etc. Technically, price data for every cryptocurrency listed on coinmarketcap.com can be downloaded, but doing so is time-consuming because of the size of the data and the site's rate limit. A sample of the data (an Excel file) is uploaded in the Data section on this page; it contains 5-minute prices of the 100 largest cryptocurrencies from their inception until May 18, 2021.
Cryptocurrencies' OHLCV data from Binance and Bitfinex
The code below can be used to scrape OHLCV (open, high, low, close, volume) data for all cryptocurrencies on Binance and Bitfinex, at any frequency the exchanges provide (all you need to do is change the frequency specified in the code). Part of this data is used in Baur and Hoang (2022).
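As a rough illustration of what such a request can look like, here is a minimal sketch for Binance only, using the exchange's public klines endpoint. It is not the code shared on this page: the function names, the default symbol BTCUSDT and the output field names are my assumptions for the example, and only the first six fields of each candle are kept.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Binance REST endpoint for candlestick (kline) data.
BINANCE_KLINES_URL = "https://api.binance.com/api/v3/klines"


def parse_klines(raw):
    """Convert raw kline arrays into dictionaries of OHLCV values.

    Each raw candle is a list whose first six entries are:
    open time (ms since epoch), open, high, low, close, volume.
    """
    rows = []
    for k in raw:
        rows.append({
            "open_time": datetime.fromtimestamp(k[0] / 1000, tz=timezone.utc),
            "open": float(k[1]),
            "high": float(k[2]),
            "low": float(k[3]),
            "close": float(k[4]),
            "volume": float(k[5]),
        })
    return rows


def fetch_klines(symbol="BTCUSDT", interval="1h", limit=500):
    """Download one page of candles; change `interval` to change the frequency."""
    query = urlencode({"symbol": symbol, "interval": interval, "limit": limit})
    with urlopen(f"{BINANCE_KLINES_URL}?{query}", timeout=30) as resp:
        return parse_klines(json.load(resp))
```

Calling fetch_klines(interval="5m") or fetch_klines(interval="1d") switches the frequency. A full history requires paging through time with repeated requests, which this sketch leaves out.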
EDGAR 13F
The code below scrapes and parses data on the quarterly holdings of institutional investors reported in Form 13F. The current version only works for reports in XML format, which has been mandatory since mid-2013. Hence, the code provides complete data from 2014 onwards. A version that can parse reports in text files (which were commonly used before 2013) is being developed.
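To give an idea of the parsing step, here is a minimal sketch that reads a 13F information table in XML and returns one record per holding. It is an illustration only, not the code shared on this page. The element names (infoTable, nameOfIssuer, cusip, value, sshPrnamt) are the ones I assume the XML information table uses, and the function name and output keys are hypothetical. Tags are matched by local name so that the sketch does not depend on the exact XML namespace.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace such as '{uri}cusip' down to 'cusip'."""
    return tag.rsplit("}", 1)[-1]


def _to_int(text):
    """Parse an integer field, returning None when it is missing or empty."""
    text = (text or "").replace(",", "").strip()
    return int(text) if text else None


def parse_info_table(xml_text):
    """Return a list of holdings from a 13F information table (str or bytes)."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        if _local(elem.tag) != "infoTable":
            continue
        # Collect the text of every leaf element below this holding.
        fields = {
            _local(child.tag): (child.text or "").strip()
            for child in elem.iter()
            if child is not elem and len(child) == 0
        }
        rows.append({
            "issuer": fields.get("nameOfIssuer"),
            "cusip": fields.get("cusip"),
            "value": _to_int(fields.get("value")),      # units as reported in the filing
            "shares": _to_int(fields.get("sshPrnamt")),
        })
    return rows
```

Each filing would be parsed this way and tagged with the filer's CIK and the report period before the quarterly files are stacked.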
One might want to link the data obtained from EDGAR with the commercial institutional holdings data provided by Thomson Reuters via WRDS. However, the two databases use different institution identifiers (the Central Index Key, CIK, in EDGAR and the manager number, MGRNO, in Thomson Reuters). I therefore developed a procedure that links institutions across the two databases using their names and holdings. In the code below, I use 2020Q1 holdings data from both datasets to link CIK and MGRNO. More details about the procedure can be found in Hoang and Yang (2021).
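The general idea of matching on names and holdings can be sketched as follows. This is a simplified illustration and not the procedure in Hoang and Yang (2021). The name cleaning, the similarity measures (string similarity for names, Jaccard overlap of held CUSIPs), the equal weighting and the 0.6 threshold are all assumptions I chose for the example.

```python
import re
from difflib import SequenceMatcher

# Common legal suffixes dropped before comparing names (illustrative list).
_SUFFIXES = {"inc", "llc", "lp", "ltd", "corp", "co", "plc", "the"}


def normalize_name(name):
    """Lower-case a name, remove punctuation and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in _SUFFIXES)


def jaccard(a, b):
    """Share of securities held in common by two portfolios."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_institutions(edgar, tr, name_weight=0.5, min_score=0.6):
    """Map each CIK to the best-matching MGRNO, if any clears min_score.

    edgar: {cik: (name, set_of_cusips)}, tr: {mgrno: (name, set_of_cusips)}
    """
    links = {}
    for cik, (e_name, e_hold) in edgar.items():
        best, best_score = None, min_score
        for mgrno, (t_name, t_hold) in tr.items():
            name_sim = SequenceMatcher(
                None, normalize_name(e_name), normalize_name(t_name)
            ).ratio()
            score = name_weight * name_sim + (1 - name_weight) * jaccard(e_hold, t_hold)
            if score > best_score:
                best, best_score = mgrno, score
        if best is not None:
            links[cik] = best
    return links
```

Using holdings alongside names helps when two managers have near-identical names, or when the same manager is spelled differently in the two databases. The pairwise loop is quadratic in the number of institutions, so a real run would need some form of blocking.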
Routine and Opportunistic Insiders classification (following Cohen, Malloy & Pomorski, 2012)
The code classifies insiders as routine or opportunistic. Only a subset of insiders (about 25%) appears in the final classification, because the classification only considers insiders with at least three consecutive years of trading history. The process is described in detail in Cohen et al. (2012), but can be briefly summarized as follows (as implemented in the code):
Identify insiders (identifier: PERSONID) that have at least three years of trading history. Keep only those insiders and the associated companies (identifier: CUSIP).
Once a three-year trading history is identified, check whether the insider traded in the same calendar month in each of those three years. If yes, the insider is classified as a routine insider in all trades thereafter. If not, the insider is classified as an opportunistic insider thereafter.
An insider identified as opportunistic can be reclassified as a routine insider once the same-month trading pattern is detected. Once an insider becomes routine, he/she remains routine in all trades thereafter.
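The steps above can be sketched for a single insider as follows. This is a minimal illustration of the logic as summarized here, not the code shared on this page. The function name, the input format (a list of (year, month) pairs for one PERSONID) and the choice to start labelling in the year after the first complete three-year window are my assumptions.

```python
from collections import defaultdict


def classify_insider(trades):
    """Label each trading year of one insider as routine or opportunistic.

    trades: iterable of (year, month) pairs, one per trade.
    Returns {year: label} for the years traded after the first
    complete window of three consecutive trading years.
    """
    months_by_year = defaultdict(set)
    for year, month in trades:
        months_by_year[year].add(month)

    labels = {}
    status = None  # unknown until three consecutive years are observed
    for year in sorted(months_by_year):
        # Trades in this year carry the status determined from earlier years.
        if status is not None:
            labels[year] = status
        # Routine is permanent; otherwise re-check the window ending this year.
        if status != "routine":
            window = [months_by_year.get(year - k) for k in range(3)]
            if all(window):  # traded in each of the three consecutive years
                same_month = window[0] & window[1] & window[2]
                status = "routine" if same_month else "opportunistic"
    return labels
```

For example, an insider who trades in January 2005, February 2006 and March 2007, and then every March from 2008 to 2010, would be opportunistic in 2008 and 2009 and routine from 2010 on. Insiders who never trade in three consecutive years get no label, which is why only a subset of insiders ends up classified.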