IIS Web Log Hit Counter

Mehedi Shams

Sep 9, 2016

CPOL

Counting hits of the sites that run through IIS, using IIS logs

Download source code - 66.4 KB

Introduction

IIS keeps a detailed history of site hits in its log files. Sometimes you need to count the number of hits for the sites running under IIS, and these logs are enough to do it.

Background

A practical scenario is decommissioning a server: before shutting it down, we need to see which sites are still being used, so that they can be hosted on a new server and the business can continue.

IIS keeps the history of site hits in log files, which usually reside in "\inetpub\logs\LogFiles". A typical folder snap is:

Here, each of the folders corresponds to a particular node in IIS. E.g., the above snap corresponds to the following IIS structure.

To determine which log folder belongs to which site, you can find the Site ID in IIS site properties as follows. Here the site ID for 'AdvancedSearch' site is 3, hence the folder 'W3SVC3' contains the log files for this site (and any subsites).
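
The mapping from site ID to log folder can be sketched in a few lines. This is only an illustration: the class and method names are hypothetical and not part of the downloadable source, and the log root is assumed to be passed in by the caller.

```csharp
using System.IO;

static class LogFolderSketch
{
    // Hypothetical helper: builds the log folder path for an IIS site ID.
    // IIS names each site's folder "W3SVC" followed by the site ID.
    public static string ForSite(string logRoot, int siteId)
    {
        return Path.Combine(logRoot, "W3SVC" + siteId);
    }
}
```

For the 'AdvancedSearch' site above, LogFolderSketch.ForSite(@"\inetpub\logs\LogFiles", 3) would yield a path ending in 'W3SVC3'.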

And a typical log file structure is as follows:

Here, the page hit sequence is default.aspx -> Display.aspx -> MoreInfo.aspx.

A new session then starts later with a hit on default.aspx, followed by a reload (or postback) of the same page (second line from the bottom). The lines in between are not real hits; they are requests for the related stylesheets and scripts, so we have to exclude them from the count.

Using the Code

The code is simple. It is mainly a couple of string operations with some specific checks.

The program interface is as follows. It has a folder browse dialog that is used to locate the log folder. As an added feature, there is a date check: if a date is provided, only the hits on or before that date (<=) are counted.

It keeps the counts in a dictionary object:

Dictionary<string, int> SiteAndCount = new Dictionary<string, int>();

The sites (and subsites) to be counted are listed in the configuration file under the SITE_NAMES key; the other keys are self-explanatory. With the sample configuration below, hits are counted for Books, PearlSBuck, SidneySheldon and SatyajitRoy, and the output is written to a text file named "Hit_Stats.txt", located in the same directory as the executable file. The #Fields meta-data line of a log file shows that the requested URL is recorded under the header "cs-uri-stem" (please see the snap of the log file structure above).

<add key ="APP_TITLE" value="Web Log Counter"/>
<add key ="STAT_FILE_NAME" value="Hit_Stats.txt"/>
<add key ="SITE_NAMES" value="Books, PearlSBuck, SidneySheldon, SatyajitRoy"/>
<add key ="URL_HEADER" value="cs-uri-stem"/>
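
The article does not show how the SITE_NAMES value is turned into the list of sites. Since the sample value has a space after each comma, the entries need trimming before they can match a URL. A minimal sketch, assuming a hypothetical helper (the class and method names are not from the source):

```csharp
using System;
using System.Linq;

static class SiteListSketch
{
    // Hypothetical helper: splits a comma-separated SITE_NAMES value
    // and trims the whitespace around each entry.
    public static string[] Parse(string siteNames)
    {
        return siteNames
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.Trim())
            .Where(s => s.Length > 0)   // Drop entries that were only whitespace.
            .ToArray();
    }
}
```

In the application, the input would presumably come from ConfigurationManager.AppSettings["SITE_NAMES"].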

A separate log file is created for each day, so we need to browse and parse all log files for the sites. This is accomplished in the following code:

foreach (string LogFile in Files)
{
    StatusLabel.Text = "Parsing file: " + LogFile;
    FileCountLabel.Text = "Processing file: " + count++.ToString() + 
                          " of " + Files.Count().ToString();
    Application.DoEvents(); // Refresh the labels.

    // 'using' closes the reader even if ParseFile throws an exception.
    using (StreamReader SReader = new StreamReader(LogFile))
    {
        ParseFile(SReader, ConfigurationManager.AppSettings["URL_HEADER"]);
    }
}
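
How the Files collection is filled is not shown in the article. One possible way, sketched here under the assumption that the log files carry the usual ".log" extension and may sit in subfolders of the selected folder (the class and method names are hypothetical):

```csharp
using System;
using System.IO;
using System.Linq;

static class LogFileListSketch
{
    // Hypothetical helper: returns all *.log files under logFolder,
    // including subfolders, sorted by path.
    public static string[] Find(string logFolder)
    {
        return Directory
            .GetFiles(logFolder, "*.log", SearchOption.AllDirectories)
            .OrderBy(f => f, StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

Sorting is optional; it only makes the progress labels advance in a predictable order.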

All lines are read and processed in a WHILE loop. Lines are discarded until the last meta-data line (the one starting with "#Fields") is reached (please see the file structure snap above).

while (Line != null)
{
    do
        Line = SReader.ReadLine();
    while (Line != null && !Line.StartsWith("#Fields", StringComparison.Ordinal));
                                              // Read through the #-ed lines, these are
                                              // meta info. StartsWith (unlike Substring)
                                              // does not throw on lines shorter than 7 chars.

The #Fields line is used to determine the column index of the URL field. First the line is split, then the code checks whether the index was already determined. If not, it loops over the field names until it finds the URL header, sets the index, and breaks out to proceed to the next step.

if (Line == null)       // End of file reached without another #Fields line.
    break;

Strings = Line.Split(' ');
if (UrlIndex == -1)     // UrlIndex = -1 means the index has not been obtained yet. 
                        // After the first match it holds the column of the URL field.
    for (int i = 0; i < Strings.Length; i++)    // Check every field, including the last.
    {
        if (Strings[i].Equals(UrlHeader))
        {
            UrlIndex = i - 1;   // This line might be like '#Fields: 
                                // date time s-ip cs-method cs-uri-stem cs-uri-query s-port'
            break;              // Subsequent lines will not have the '#Fields:' token. 
                                // Hence reduce the index by 1.
        }
    }
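
The index lookup can also be written as a small stand-alone function, which makes it easy to try out on a single #Fields line. This is a sketch rather than the code from the download; the class and method names are hypothetical.

```csharp
using System;

static class FieldIndexSketch
{
    // Returns the zero-based column of fieldName in the data lines,
    // or -1 if the line is not a #Fields line or the field is absent.
    public static int Find(string fieldsLine, string fieldName)
    {
        const string prefix = "#Fields:";
        if (fieldsLine == null || !fieldsLine.StartsWith(prefix, StringComparison.Ordinal))
            return -1;

        // Dropping the prefix first means no "minus one" correction is needed.
        string[] names = fieldsLine.Substring(prefix.Length)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return Array.IndexOf(names, fieldName);
    }
}
```

For the line '#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port', looking up "cs-uri-stem" gives column 4.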

The next bit of code processes the data lines until the next meta-data block is reached (please see the file structure snap above). First, it checks whether a date check was requested. If so, it only counts site hits on or before (<=) that date. Otherwise, it counts hits irrespective of dates.

Line = SReader.ReadLine();  // Read the line next to the #Fields line, 
                            // these subsequent lines actually contain the site hits.
while (Line != null && !Line.StartsWith("#"))   // Parse all the lines until 
                            // the next meta-data starts (#), or end of file is reached (NULL).
                            // StartsWith does not throw on an empty line.
{
    bool SiteHitFound = false;

    Strings = Line.Split(' ');
    if (!CheckFindAllHits.Checked)                          // If date check was intended.
    {
        // Check for a valid date format (yyyy-MM-dd) in the first field.
        IsSuccess = Regex.IsMatch(Strings[0], @"^\d{4}-\d{2}-\d{2}$");
        if (IsSuccess)
            HitDate = Convert.ToDateTime(Strings[0]);

        // Count the line only if it has a valid date on or before the check date;
        // lines after the check date (or without a valid date) are skipped.
        if (IsSuccess && HitDate <= Convert.ToDateTime(HitsBeforeThisDate.Text))
            CheckSiteHit(ref CurrentSite, Strings, UrlIndex, ref LastSite, ref SiteHitFound);
    }
    else
        CheckSiteHit(ref CurrentSite, Strings, UrlIndex, ref LastSite, ref SiteHitFound);
    Line = SReader.ReadLine();  // Proceed with the next line in the log file.
}
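
The date check on its own can be sketched with DateTime.TryParseExact, which validates and parses the date in one step and so avoids the separate regular expression. This is an alternative to the code above, not the author's implementation, and the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

static class DateFilterSketch
{
    // Returns true if dateField is a valid yyyy-MM-dd date on or before cutOff.
    public static bool IsOnOrBefore(string dateField, DateTime cutOff)
    {
        DateTime hitDate;
        bool ok = DateTime.TryParseExact(dateField, "yyyy-MM-dd",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out hitDate);
        return ok && hitDate.Date <= cutOff.Date;   // Compare dates only, ignore time.
    }
}
```

A line whose first field is not a date is simply rejected instead of raising an exception.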

The "CheckSiteHit" method does the actual counting. It goes through the configured sites and checks whether the URL of the current line contains one of them. It also keeps track of the last site encountered, so that consecutive lines for the same site are ignored: the first line is the real hit, while the following ones are requests for CSS, JS, images, etc., or a postback (POST), as explained above (please see the file structure snap above). A side effect is that two genuine back-to-back visits to the same site are counted once. If a hit is found, the site is added to the dictionary (or its counter is incremented if it is already there).

private void CheckSiteHit(ref string CurrentSite, 
string[] Strings, int UrlIndex, ref string LastSite, ref bool SiteHitFound)
{
    if (UrlIndex < 0 || UrlIndex >= Strings.Length)
        return;     // URL column unknown or line too short; nothing to check.

    foreach (string Site in Sites)
    {   // Check if any of the sites to be counted matches the URI string.
        // Ordinal comparison is used because URLs are not culture-specific text.
        if (Strings[UrlIndex].IndexOf(Site, StringComparison.OrdinalIgnoreCase) > -1)
        {
            if (!LastSite.Equals(Site))        // There might be consecutive site listings 
                                               // whereas the later ones usually contain 
                                               // CSS, JS, image etc.
            {
                SiteHitFound = true;
                CurrentSite = LastSite = Site; // Hence, only if it is a new site, 
                                               // then add it to counter.
                break;                         // Proceed with the next line.
            }
        }
    }
    
    if (SiteHitFound)
    {
        string Key = CurrentSite.ToUpper();    // Dictionary keys are stored in upper case.
        int value;
        if (!SiteAndCount.TryGetValue(Key, out value))
            SiteAndCount.Add(Key, 1);          // If the site is not found 
                                               // in the dictionary then 
                                               // add it and start the counter.
        else
            SiteAndCount[Key] = value + 1;     // Else increase the count of 
                                               // the site.
    }
}
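
To see the effect of ignoring consecutive lines for the same site, the counting rule can be isolated into a function that works on a plain list of URLs. This is a sketch that mirrors the logic above without the WinForms and file handling parts; the class name, method name and sample URLs are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class HitCountSketch
{
    // Counts one hit per run of consecutive URLs that belong to the same site.
    public static Dictionary<string, int> Count(IEnumerable<string> urls, string[] sites)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        string lastSite = "";
        foreach (string url in urls)
        {
            foreach (string site in sites)
            {
                if (url.IndexOf(site, StringComparison.OrdinalIgnoreCase) < 0)
                    continue;                       // URL does not belong to this site.
                if (!lastSite.Equals(site))         // New site, so this is a real hit.
                {
                    int value;
                    counts.TryGetValue(site, out value);    // value stays 0 if absent.
                    counts[site] = value + 1;
                    lastSite = site;
                    break;                          // Proceed with the next URL.
                }
            }
        }
        return counts;
    }
}
```

With the sites Books and SatyajitRoy and the URL sequence /Books/default.aspx, /Books/style.css, /Books/script.js, /SatyajitRoy/default.aspx, /Books/default.aspx, the stylesheet and script lines are skipped, so Books ends up with 2 hits and SatyajitRoy with 1.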

Finally, it writes the contents of the dictionary to the stats file.

// 'using' flushes and closes the file, even if a write fails.
using (StreamWriter SWriter = new StreamWriter(ConfigurationManager.AppSettings["STAT_FILE_NAME"]))
{
    foreach (KeyValuePair<string, int> entry in SiteAndCount)
        SWriter.WriteLine(entry.Key + ":\t" + entry.Value.ToString());
}

StatusLabel.Text = "Finished parsing.";
FileCountLabel.Text = "Processed file: " + (--count).ToString() + 
                      " of " + Files.Count().ToString();
MessageBox.Show("Finished parsing. Please see the stats in file: " + 
                 ConfigurationManager.AppSettings["STAT_FILE_NAME"]);

That's it! :)

History

9th September, 2016: Initial version