Web servers record every page request in a log file. Each entry typically includes the time of the request and the page accessed, along with various additional details. Most servers generate these logs automatically, commonly using the Common Log Format, or a variation of it, standardized by the World Wide Web Consortium (W3C). The log of the Securities and Exchange Commission's (SEC) Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system records every page request for SEC filings.
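To make the log format concrete, here is a minimal Python sketch that parses one Common Log Format entry into its fields. The sample line, the `CLF_PATTERN` regex, and the `parse_clf_line` helper are my own illustrations, not part of any SEC tooling, and the EDGAR log files themselves may use a different layout.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamps look like 02/Jan/2020:13:55:36 +0000
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash means no bytes were returned
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Made-up sample entry, for illustration only
sample_line = (
    '192.0.2.1 - - [02/Jan/2020:13:55:36 +0000] '
    '"GET /filings/example.txt HTTP/1.1" 200 2326'
)
sample_entry = parse_clf_line(sample_line)
```

Each parsed entry carries the request time and the requested page, which are the two pieces of information that matter most for counting downloads.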
The raw EDGAR log files can be found at SEC.gov | EDGAR Log File Data Sets.
Processed files from 2003 to 2017 can be found at Request Access to EDGAR Log File Data (google.com).
Charlotte Zhou, a CUHK-SZ alumna whom I met at the University of Chicago, and I processed the files from 2020 to 2024. We downloaded the raw EDGAR log data and extracted the download date and accession ID of each request. We then counted the number of downloads per day for each accession ID and CIK. Finally, we used the SEC’s index to attach additional information to each accession ID, such as form type, filing date, and company name. Below are some descriptive statistics prepared by Charlotte.
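As a rough illustration of the counting step (not our actual processing code), the Python sketch below tallies downloads per day, CIK, and accession ID. It assumes the log rows have already been read into records with columns named `date`, `cik`, and `accession`; those column names, the `count_downloads` helper, and the sample values are hypothetical.

```python
import csv
import io
from collections import Counter


def count_downloads(rows):
    """Count log rows per (date, cik, accession) key."""
    counts = Counter()
    for row in rows:
        key = (row["date"], row["cik"], row["accession"])
        counts[key] += 1
    return counts


# Made-up sample rows; the column names are assumptions, not the SEC's layout
SAMPLE_CSV = """date,cik,accession
2020-01-02,1000001,0000000000-20-000001
2020-01-02,1000001,0000000000-20-000001
2020-01-03,1000002,0000000000-20-000002
"""

sample_counts = count_downloads(csv.DictReader(io.StringIO(SAMPLE_CSV)))

for (date, cik, accession), n in sorted(sample_counts.items()):
    print(date, cik, accession, n)
```

Keying the counter on the full (date, CIK, accession ID) tuple keeps one count per filing per day, which can then be joined to the index information by accession ID.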