Every web server owner is legally required to keep a log file that records every hit on every resource (HTML page, image file, JavaScript, etc.) for all sites hosted by the server. This log file contains the hits generated by Internet users as well as those of crawlers, including the well-known Googlebot, Google's indexing bot.
Here is an example of a hit recorded in a log file:
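2016-05-31 00:03:55 GET /robots.txt 66.249.66.19 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - 200
In order, we can read:
The date (2016-05-31)
The time (00:03:55)
The HTTP method (GET)
The requested URL (/robots.txt)
The client IP address (66.249.66.19)
The user-agent (Googlebot 2.1)
An empty field, shown as a hyphen (typically the referer)
The HTTP status code (200)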
Log formats differ from site to site depending on the server technology (IIS, Apache, nginx, etc.), the server version and, more generally, its configuration. The example given may not match the format used on your site.
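To make the field order concrete, here is a minimal Python sketch that splits the sample hit into named fields. It assumes exactly the space-separated layout of the example (date, time, method, URL, client IP, user-agent, referer, status); the `Hit` type and `parse_hit` function are hypothetical names, and reading the hyphen as an empty referer is an assumption. A different server configuration would need a different parser.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One log line, using the field order of the sample hit (an assumption)."""
    date: str
    time: str
    method: str
    url: str
    client_ip: str
    user_agent: str
    referer: str  # "-" when the field is empty (assumed to be the referer)
    status: int


def parse_hit(line: str) -> Optional[Hit]:
    """Split a space-separated log line; return None if it does not fit."""
    parts = line.split()
    # The sample format has exactly 8 fields and ends with a numeric status.
    if len(parts) != 8 or not parts[7].isdigit():
        return None
    date, time, method, url, client_ip, user_agent, referer, status = parts
    return Hit(date, time, method, url, client_ip, user_agent, referer, int(status))


SAMPLE = (
    "2016-05-31 00:03:55 GET /robots.txt 66.249.66.19 "
    "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - 200"
)

hit = parse_hit(SAMPLE)
print(hit)
```

Lines that do not match this layout (comments, other formats) simply return None, so they can be counted or inspected separately rather than silently misread.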
Benefits of a log analysis:
The aim of such an analysis is to follow Googlebot's activity and measure its impact on your SEO. It shows precisely which URLs Google ignores and which ones its crawlers favor, since Googlebot may revisit those dozens of times a day. Log analysis also helps locate the resources that can cause a loss of link equity. If, in your logs (filtered on Googlebot), 30% of hits land on 404 or 500 errors, it means that you are wasting 30% of your crawl budget on useless pages, at the expense of your strategic pages!
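The share of wasted crawl budget described above can be estimated with a few lines of Python. This is only a sketch: `googlebot_error_share` is a hypothetical helper, the input is assumed to be (user-agent, status code) pairs already extracted from the logs, "error" is taken to mean 404 or any 5xx status, and the sample data is invented purely for illustration.

```python
from typing import Iterable, Tuple


def googlebot_error_share(hits: Iterable[Tuple[str, int]]) -> float:
    """Fraction of Googlebot hits that ended in a 404 or a 5xx status."""
    # Matching on the user-agent string alone is an assumption: it can be
    # spoofed, so a real audit may want a stricter check.
    statuses = [status for ua, status in hits if "googlebot" in ua.lower()]
    if not statuses:
        return 0.0
    errors = sum(1 for s in statuses if s == 404 or 500 <= s <= 599)
    return errors / len(statuses)


# Illustrative, made-up data: 10 Googlebot hits, 3 of them errors,
# plus one hit from a regular browser that must be ignored.
BOT = "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)"
example_hits = (
    [(BOT, 200)] * 7
    + [(BOT, 404)] * 2
    + [(BOT, 500)]
    + [("Mozilla/5.0+(Windows+NT+10.0)", 404)]
)

share = googlebot_error_share(example_hits)
print(f"{share:.0%} of Googlebot hits ended in an error")
```

With this made-up sample, 3 of the 10 Googlebot hits are errors, which corresponds to the 30% scenario described in the text.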
To go ahead with a log analysis, you need to provide at least 30 rolling days of logs filtered on Googlebot, containing at least the following information:
Date
Complete URL, including the request parameters
Referer
User-agent
HTTP Status Code
Domain associated with the URL
Protocol
These fields are all contained in the default formats for Apache, nginx, Varnish and IIS.
The logs may contain comments beginning with a #.
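As an illustration of the filtering involved, the sketch below keeps only the Googlebot lines of a log and skips comment lines starting with #. The function name `googlebot_lines` is hypothetical, the match is a plain case-insensitive search for "googlebot" in the line (an assumption that may need tightening for your format), and the three sample lines are invented apart from the hit quoted earlier.

```python
from typing import Iterable, Iterator


def googlebot_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield non-comment log lines that mention Googlebot."""
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comments, which begin with '#'.
        if not line or line.startswith("#"):
            continue
        if "googlebot" in line.lower():
            yield line


sample_log = [
    "#Fields: example comment line mentioning Googlebot",
    "2016-05-31 00:03:55 GET /robots.txt 66.249.66.19 "
    "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - 200",
    "2016-05-31 00:04:10 GET /index.html 203.0.113.7 Mozilla/5.0+(Windows+NT+10.0) - 200",
]

kept = list(googlebot_lines(sample_log))
print(len(kept), "Googlebot line(s) kept")
```

Because the function accepts any iterable of strings, it can be fed an open file object directly, which avoids loading 30 days of logs into memory at once.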
We draw your attention to the importance of the 'Referer' field, which is essential for detecting visits and for the quality of the report data.
If you use a CDN, the Referer field may be absent from your log files by default. We encourage you to check this point and ask your provider to enable the field before sending us your first logs.
How to send us these .log files?