HTTP Logs Analysis using Microsoft Log Parser

While there are several tools freely available on the web to analyze your website traffic and they are doing great at this (Google AnalyticsGoogle Webmaster ToolBing Webmaster tool …). These tools provide great and free value to track your traffic and troubleshoot potential issues on your website. As any tool available they have some limitations and the need to find alternative/complementary solutions becomes necessary.

In this post I will discuss the use of Microsoft Log Parser to analyze “hits” on your web server  Any website of different size or complexity comes to have these different types of problems with time:

1)    Change of URL
2)    Removing old pages
3)    Error pages

To some extend the tools mention above will show you these errors, but they might not be exactly what you seek in a real data analysis perspective. Let’s take for example Error pages, some of your pages crashes sending HTTP 500 Status Code, you might not be able to recover data using the normal Google Analytics Javascript depending of how you are treating these crashes.

One way to get access to these data is to analyze you web server logs (if they are active of course). So as not to get too detailed in the explanation find below some utility code that will help you troubleshoot issues in your application. (After installing Log Parser you will be able to run the below syntax from command line)

HTTP 200 OK from Google Bots
LogParser.exe “SELECT date, count(*) as hit INTO HTTP200.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE ‘%%google%%’ AND sc-status = ‘200’ GROUP BY date ORDER BY date” -i:w3c -groupSize:800×600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:”HTTP 200 Hits”

HTTP 301 Permantly Moved Google Bots
LogParser.exe “SELECT date, count(*) as hit INTO HTTP301.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE ‘%%google%%’ AND sc-status = ‘301’ GROUP BY date ORDER BY date” -i:w3c -groupSize:800×600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:”HTTP 301 Hits”

HTTP 4xx Not Found / Gone Google Bots
LogParser.exe “SELECT date, count(*) as hit INTO HTTP4xx.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE ‘%%google%%’ AND sc-status >= 400 AND sc-status < 500 GROUP BY date ORDER BY date” -i:w3c -groupSize:800×600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:”HTTP 4xx Hits”

These queries will produce nice graphs of how much HTTP 200,301,4xx hits you receive per day while the Google bot is crawling you site.

You can also easily find out the same thing for your users by changing the cs(User-Agent) LIKE ‘%%google%%’ to cs(User-Agent) NOT LIKE ‘%%bot%%’.

Of course these are approximated to a certain level, because not all bots add the keyword “bot” to use user-agent.

Hoping this can come in handy. If you have more queries to share, drop by and put a comment.
Further readings :

IIS: Redirection from non-www to www domain

The problem today, is that we have a great website but search engines are indexing the instead of, this is commonly seen on the web and there are several ways to archive a good result for making the non-www to www domain. This redirection should be a 301 Permanently Moved, otherwise you will might lose your search engine indexed page or become duplicate content for your non-www and www domain. Here are easy steps how to archive a quick and clean Permanent Redirection using IIS.

Consider the case where we already have a website in IIS called:

  • Go to IIS Manager
  • Create a new website that point to the same directory as your existing one
  • Select the newly created website, open the properties box
  • In the option button “When connecting to this resource the content should come from” should be change to “A redirection to a URL
  • Specify the URL
  • Select the check box that says “A permanent redirection for this resource.”

IIS Log Archiving

You need to archive your IIS Log often so as not to get your log folder full with HTTP Logs.

I have been searching for some quick implemented solutions for performing this IIS Log archiving task and found some quiet nice discussions and article about it. Here are the links to the different post and forums that talk about a solution to solve this issue: (Tool to automate maintenance of IIS Log)

On my side i need something with a bit more functionality so, i modified some of the scripts that i could find on the different article related above and came up with a solution that can.

  • Compress each log file found in your websites folder
  • FTP the compressed files on a foreign server ( Keeping historic of your IIS log ) Uses Chillkat Free FTP ActiveX
  • Delete them from your disk afterward

You can launch this process everyday and there will be no log that is older than a specified number of days on your server.

Requirement for this solution to work:

You can download the script here.

See the entire script in the full post.

Continue reading “IIS Log Archiving”