Introducing Server Side Analytics for ASP.NET Core

20 Aug 2018 · CPOL · 5 min read
Simple middleware to add server side analytics functions to ASP.NET Core

Introduction

I wanted to keep track of visitors and know the usual web analytics stuff: visitors, source, nationality, behaviour and so on.
But client-side analytics are not so reliable:

  • Ad blockers interfere with them
  • Using a third-party service means annoying the user with those huge cookie consent banners
  • They can significantly increase the loading time of the web application
  • They don't register API calls or any other non-HTML requests

So I developed a very simple server-side analytics system for .NET Core, which is running on my website.

The Middleware

The idea is to implement a middleware that will be invoked on every request, no matter if a route was specified or not.

This middleware will be put into the request pipeline and set up using only fluent methods.
It writes each incoming request into a generic store after the processing of the request is completed.

The middleware is inserted into the request pipeline by calling the UseServerSideAnalytics extension method in app startup.

This method requires an IAnalyticStore, the interface behind which our received requests will be stored.

C#
public void Configure(IApplicationBuilder app)
{
   app.UseServerSideAnalytics(new MongoAnalyticStore("mongodb://192.168.0.11/matteo"));
}

Inside the extension, I create a FluidAnalyticBuilder and bind it to the request pipeline via the method Use.

C#
public static FluidAnalyticBuilder UseServerSideAnalytics
               (this IApplicationBuilder app,IAnalyticStore repository)
{
    var builder = new FluidAnalyticBuilder(repository);
    app.Use(builder.Run);
    return builder;
}

The FluidAnalyticBuilder is a fluent class that handles the configuration of the analytics we want to collect (like filtering out unwanted URLs, IP addresses and so on) and implements the core of the system in the method Run.

In this method, ServerSideAnalytics will use two methods of the store:

  • ResolveCountryCodeAsync: retrieves the country code of the remote IP address, if it is known.
    If it is not known, CountryCode.World is expected.
  • StoreWebRequestAsync: stores the received request into the database

C#
internal async Task Run(HttpContext context, Func<Task> next)
{
     //Pass the command to the next task in the pipeline
     await next.Invoke();

     //This request should be filtered out ?
     if (_exclude?.Any(x => x(context)) ?? false)
     {
         return;
     }

     //Let's build our structure with collected data
     var req = new WebRequest
     {
           //When
           Timestamp = DateTime.Now,

           //Who
           Identity = context.UserIdentity(),
           RemoteIpAddress = context.Connection.RemoteIpAddress,

           //What
           Method = context.Request.Method,
           UserAgent = context.Request.Headers["User-Agent"],
           Path = context.Request.Path.Value,
           IsWebSocket = context.WebSockets.IsWebSocketRequest,

           //From where
           //Ask the store to resolve the geo code of given ip address
           CountryCode = await _store.ResolveCountryCodeAsync(context.Connection.RemoteIpAddress)
      };

     //Store the request into the store
     await _store.StoreWebRequestAsync(req);
}

(Maybe, I should add other fields to collected requests? Let me know. 😊)

Via the List<Func<HttpContext, bool>> _exclude, the builder also provides easy methods to filter out requests that we don't care about.

C#
//Startup.cs
// This method gets called by the runtime. Use this method to configure the HTTP request pipeline.

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
     app.UseDeveloperExceptionPage();
     app.UseBrowserLink();
     app.UseDatabaseErrorPage();

     app.UseAuthentication();

     // Let's create our middleware using MongoDB to store data
     app.UseServerSideAnalytics(new MongoAnalyticStore("mongodb://localhost/matteo"))

     // Requests under these URL prefixes will not be recorded
     .ExcludePath("/js", "/lib", "/css")

     // Requests ending with these extensions will not be recorded
     .ExcludeExtension(".jpg", ".ico", "robots.txt", "sitemap.xml")

     // I don't want to track my own activity on the website
     .Exclude(x => x.UserIdentity() == "matteo")

     // Nor requests coming from my home Wi-Fi
     .ExcludeIp(IPAddress.Parse("192.168.0.1"))

     // Requests coming from localhost will not be recorded
     .ExcludeLoopBack();

     app.UseStaticFiles();
}
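
To make the idea of the exclusion list more concrete, here is a minimal, self-contained sketch of how fluent Exclude methods could be layered on a list of predicates. It is not the library's actual code: RequestInfo is a hypothetical stand-in for HttpContext so the sample runs without ASP.NET Core, ExclusionBuilder and IsExcluded are illustrative names, and the prefix/suffix matching rules are my assumption about how ExcludePath and ExcludeExtension behave.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the parts of HttpContext the filters look at.
public sealed class RequestInfo
{
    public string Path { get; set; } = "/";
    public string Identity { get; set; } = "";
}

// Hypothetical fluent builder: every method adds a predicate and returns "this",
// so calls can be chained as in the Startup example.
public sealed class ExclusionBuilder
{
    private readonly List<Func<RequestInfo, bool>> _exclude =
        new List<Func<RequestInfo, bool>>();

    // Generic filter: any predicate returning true excludes the request.
    public ExclusionBuilder Exclude(Func<RequestInfo, bool> predicate)
    {
        _exclude.Add(predicate);
        return this;
    }

    // Assumption: paths are matched by prefix, ignoring case.
    public ExclusionBuilder ExcludePath(params string[] prefixes) =>
        Exclude(r => prefixes.Any(p =>
            r.Path.StartsWith(p, StringComparison.OrdinalIgnoreCase)));

    // Assumption: extensions are matched by suffix, ignoring case.
    public ExclusionBuilder ExcludeExtension(params string[] suffixes) =>
        Exclude(r => suffixes.Any(s =>
            r.Path.EndsWith(s, StringComparison.OrdinalIgnoreCase)));

    // Same check the middleware performs before storing a request.
    public bool IsExcluded(RequestInfo request) =>
        _exclude.Any(rule => rule(request));
}
```

With this shape, adding a new kind of filter only means adding one more method that wraps Exclude with the right predicate.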

And that is all the middleware. 😀

The Store

As you saw above, the middleware writes collected data into a generic store expressed by the interface IAnalyticStore, the component that handles all the dirty work of this job.

I wrote three stores:

In the attached code, you will find a sample site using SQLite, so no external process is needed to run the example.

The store has to implement an interface with two methods invoked by Server Side Analytics and some methods to query stored requests.

The query methods live in the interface because isolating database types is cool, but it also means that you cannot cast an Expression<Func<MyType,bool>> to Expression<Func<WebRequest,bool>>, no matter how similar MyType and WebRequest are.

We will see the use of those methods in the last part of the article regarding the exposition of our data inside the web application.

C#
public interface IAnalyticStore
{
    /// <summary>
    /// Store received request. Internally invoked by ServerSideAnalytics
    /// </summary>
    /// <param name="request">Request collected by ServerSideAnalytics</param>
    /// <returns></returns>
    Task StoreWebRequestAsync(WebRequest request);

    /// <summary>
    /// Return the number of unique identities that made at least one request on that day
    /// </summary>
    /// <param name="day"></param>
    /// <returns></returns>
    Task<long> CountUniqueIndentitiesAsync(DateTime day);

    /// <summary>
    /// Return the number of unique identities that made at least one request inside the given time interval
    /// </summary>
    /// <param name="from"></param>
    /// <param name="to"></param>
    /// <returns></returns>
    Task<long> CountUniqueIndentitiesAsync(DateTime from, DateTime to);

    /// <summary>
    /// Return the raw number of request served in the time interval
    /// </summary>
    /// <param name="from"></param>
    /// <param name="to"></param>
    /// <returns></returns>
    Task<long> CountAsync(DateTime from, DateTime to);

    /// <summary>
    /// Return distinct Ip Address served during that day
    /// </summary>
    /// <param name="day"></param>
    /// <returns></returns>
    Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime day);

    /// <summary>
    /// Return distinct IP addresses served during given time interval
    /// </summary>
    /// <param name="from"></param>
    /// <param name="to"></param>
    /// <returns></returns>
    Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime from, DateTime to);

    /// <summary>
    /// Return any request that was served during this time range
    /// </summary>
    /// <param name="from"></param>
    /// <param name="to"></param>
    /// <returns></returns>
    Task<IEnumerable<WebRequest>> InTimeRange(DateTime from, DateTime to);

    /// <summary>
    /// Return all the request made by this identity
    /// </summary>
    /// <param name="identity"></param>
    /// <returns></returns>
    Task<IEnumerable<WebRequest>> RequestByIdentityAsync(string identity);

    /// <summary>
    /// Add a geocoding ip range.
    /// </summary>
    /// <param name="from"></param>
    /// <param name="to"></param>
    /// <param name="countryCode"></param>
    /// <returns></returns>
    Task StoreGeoIpRangeAsync(IPAddress from, IPAddress to, CountryCode countryCode);

    /// <summary>
    /// Makes the geo IP resolution of an incoming request. Internally invoked by ServerSideAnalytics
    /// </summary>
    /// <param name="address"></param>
    /// <returns></returns>
    Task<CountryCode> ResolveCountryCodeAsync(IPAddress address);

    /// <summary>
    /// Remove all item in request collection
    /// </summary>
    /// <returns></returns>
    Task PurgeRequestAsync();

    /// <summary>
    /// Remove all items in geo ip resolution collection
    /// </summary>
    /// <returns></returns>
    Task PurgeGeoIpAsync();
}
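
To give a feel for what the query methods are expected to return, here is a small in-memory sketch of two of them. It is only an illustration under my own assumptions, not one of the real stores: StoredRequest and InMemoryRequestLog are hypothetical names, the methods are synchronous instead of returning Task, the time interval is treated as inclusive on both ends, and the class is not thread safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down request record (the real WebRequest has more fields).
public sealed class StoredRequest
{
    public DateTime Timestamp { get; set; }
    public string Identity { get; set; } = "";
}

// Hypothetical in-memory log mirroring the semantics of CountAsync and
// CountUniqueIndentitiesAsync, without a database.
public sealed class InMemoryRequestLog
{
    private readonly List<StoredRequest> _requests = new List<StoredRequest>();

    public void Store(StoredRequest request) => _requests.Add(request);

    // Raw number of requests in the interval (inclusive bounds are an assumption).
    public long Count(DateTime from, DateTime to) =>
        _requests.LongCount(r => r.Timestamp >= from && r.Timestamp <= to);

    // Number of distinct identities seen in the interval.
    public long CountUniqueIdentities(DateTime from, DateTime to) =>
        _requests.Where(r => r.Timestamp >= from && r.Timestamp <= to)
                 .Select(r => r.Identity)
                 .Distinct()
                 .LongCount();

    // Number of distinct identities seen on the calendar day of "day".
    public long CountUniqueIdentities(DateTime day) =>
        _requests.Where(r => r.Timestamp.Date == day.Date)
                 .Select(r => r.Identity)
                 .Distinct()
                 .LongCount();
}
```

A database-backed store would express the same filters as queries against its own entity type, which is exactly why these methods are part of the interface.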

Identities

You may have noticed that every WebRequest has a field named Identity. This is because the most important thing to know is who did what.
But how is it evaluated?

  • If our own identity cookie is already set, we reuse its value
  • Otherwise, if the request comes from a registered user, we use the username
  • If not, we use the default AspNetCore cookie (ai_user in the code)
  • If that is not available, we use the connection id of the current context
  • Then we save the result in our own cookie, so we don't have to do it again

In code:

C#
public static string UserIdentity(this HttpContext context)
{ 
     var user = context.User?.Identity?.Name;
     const string identityString = "identity";

     string identity;

     if (!context.Request.Cookies.ContainsKey(identityString))
     {
          if (string.IsNullOrWhiteSpace(user))
          {
               identity = context.Request.Cookies.ContainsKey("ai_user")
                        ? context.Request.Cookies["ai_user"]
                        : context.Connection.Id;
          }
          else
          {
               identity = user;
          }
          context.Response.Cookies.Append(identityString, identity);
      }
      else
      {
          identity = context.Request.Cookies[identityString];
      }
      return identity;
}

IP Geocoding

One of the most interesting pieces of data in any analytics system is where your users come from.
So the IAnalyticStore of SSA implements methods for geocoding the IP address of incoming requests.

Sadly, in 2018, IPv6 is a well-established protocol but Int128 is not a well-established data type, especially in databases.

So we need to implement a cool workaround to have an efficient query on our database.
Or at least this is the strategy that I used in my three stores. If you have a better idea, you can implement your own analytic store or, even better, contribute to the project.

We are going to save every IP address range as a pair of strings.

Algorithm:

  • If the IP address is IPv4, it is mapped to IPv6 so both kinds can be stored together
  • Then we take the bytes of the resulting IPv6 address
  • We reverse them, because the address bytes come in network (big-endian) order while BigInteger reads its byte array as little-endian; without this step, the numbers would not follow the order of the addresses
  • Now we have an array of bytes that represents a very big number
  • We print this number zero-padded to a fixed number of digits, so the strings can be compared correctly by the database
    (from 000000000000000000000000000000000000000 to 340282366920938463463374607431768211455)

Or in code:

C#
private const string StrFormat = "000000000000000000000000000000000000000";

public static string ToFullDecimalString(this IPAddress ip)
{
    return (new BigInteger(ip.MapToIPv6().GetAddressBytes().Reverse().ToArray())).ToString(StrFormat);
}

I implemented this function in ServerSideAnalytics.ServerSideExtensions.ToFullDecimalString, so if you want to reuse it, you don't have to go mad like I did.
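
If you want to experiment with the idea outside the library, here is a self-contained sketch of the same technique. It is my own variant, not the library's code: IpSortKey and ToSortableString are hypothetical names, and unlike the one-liner above it appends a zero byte before building the BigInteger. BigInteger reads its byte array as a signed little-endian value, so I assume that without that extra byte an IPv6 address whose first byte is 0x80 or higher (for example fe80::1) would come out negative.

```csharp
using System;
using System.Net;
using System.Numerics;

public static class IpSortKey
{
    // 2^128 - 1 has 39 decimal digits, so every key is padded to this width.
    private const int Width = 39;

    public static string ToSortableString(IPAddress ip)
    {
        // 16 bytes in network (big-endian) order; IPv4 becomes ::ffff:a.b.c.d
        byte[] bytes = ip.MapToIPv6().GetAddressBytes();

        // BigInteger expects little-endian input, so flip the byte order.
        Array.Reverse(bytes);

        // A trailing 0x00 (the most significant byte) keeps the value non-negative.
        byte[] unsigned = new byte[bytes.Length + 1];
        Array.Copy(bytes, unsigned, bytes.Length);

        // Zero-pad so that ordinal string comparison matches numeric order.
        return new BigInteger(unsigned).ToString().PadLeft(Width, '0');
    }
}
```

Because every key has the same length, comparing two keys as plain strings gives the same result as comparing the addresses numerically, which is what the range query relies on.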

Now that we have our IP address normalized into a well-defined string format, finding the corresponding country saved in our database is really simple.

C#
public async Task<CountryCode> ResolveCountryCodeAsync(IPAddress address)
{
     var addressString = address.ToFullDecimalString();

     using (var db = GetContext())
     {
         var found = await db.GeoIpRange.FirstOrDefaultAsync
                                   (x => x.From.CompareTo(addressString) <= 0 &&
                                    x.To.CompareTo(addressString) >= 0);

         return found?.CountryCode ?? CountryCode.World;
    }
}

But to query the database, we need to fill it first.
Finding a reliable and cheap database of countries and their IP address ranges can be quite difficult.

For this reason, I wrote three more analytic stores that act as wrappers around an existing one to provide fallback geo-IP resolution.

If the first repository doesn't contain a valid IP range for the client, it will ask the second one and so on.
If a valid geo-IP is found at the end of the chain, it is saved into the main store.
I wrote three of them; if you want to add more, please contribute on GitHub.
You can find these analytic stores in ServerSideAnalytics.Extensions.

  • IpApiAnalyticStore: Add ip-geocoding using Ip Api (ip-api.com)
  • IpInfoAnalyticStore: Add ip-geocoding using Ip Info (ipinfo.io)
  • IpStackAnalyticStore: Add ip-geocoding using Ip Stack (ipstack.com)

Personally, I'm using a pre-loaded IP range database with all three failovers enabled:

C#
public IAnalyticStore GetAnalyticStore()
{
    var store = (new MongoAnalyticStore("mongodb://localhost/"))
                        .UseIpStackFailOver("IpStackAPIKey")
                        .UseIpApiFailOver()
                        .UseIpInfoFailOver();
    return store;
}

Let's see how one of them works inside, as an example:

C#
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace ServerSideAnalytics.Extensions
{
    class IpApiAnalyticStore : IAnalyticStore
    {
        readonly IAnalyticStore _store;

        public IpApiAnalyticStore(IAnalyticStore store)
        {
            _store = store;
        }

        public Task<long> CountAsync(DateTime from, DateTime to) => _store.CountAsync(from, to);

        public Task<long> CountUniqueIndentitiesAsync(DateTime day) => 
                                    _store.CountUniqueIndentitiesAsync(day);

        public Task<long> CountUniqueIndentitiesAsync(DateTime from, DateTime to) => 
                                   _store.CountUniqueIndentitiesAsync(from, to);

        public Task<IEnumerable<WebRequest>> InTimeRange(DateTime from, DateTime to) => 
                                  _store.InTimeRange(from, to);

        public Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime day) => 
                                  _store.IpAddressesAsync(day);

        public Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime from, DateTime to) => 
                                 _store.IpAddressesAsync(from,to);

        public Task PurgeGeoIpAsync() => _store.PurgeGeoIpAsync();

        public Task PurgeRequestAsync() => _store.PurgeRequestAsync();

        public Task<IEnumerable<WebRequest>> RequestByIdentityAsync(string identity) => 
                                 _store.RequestByIdentityAsync(identity);

        public async Task<CountryCode> ResolveCountryCodeAsync(IPAddress address)
        {
            try
            {
                var resolved = await _store.ResolveCountryCodeAsync(address);

                if(resolved == CountryCode.World)
                {
                    var ipstr = address.ToString();
                    var response = await (new HttpClient()).GetStringAsync
                                         ($"http://ip-api.com/json/{ipstr}");

                    var obj = JsonConvert.DeserializeObject(response) as JObject;
                    // ip-api.com returns the ISO country code in the "countryCode" field
                    resolved = (CountryCode)Enum.Parse(typeof(CountryCode), 
                                obj["countryCode"].ToString());

                    await _store.StoreGeoIpRangeAsync(address, address, resolved);

                    return resolved;
                }

                return resolved;
            }
            catch (Exception)
            {
                return CountryCode.World;
            }
        }

        public Task StoreGeoIpRangeAsync(IPAddress from, IPAddress to, CountryCode countryCode)
        {
            return _store.StoreGeoIpRangeAsync(from, to, countryCode);
        }

        public Task StoreWebRequestAsync(WebRequest request)
        {
            return _store.StoreWebRequestAsync(request);
        }
    }
}

And that's all, folks! :)

This article was originally posted at https://matteofabbri.org/read/server-side-analytics

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Chief Technology Officer Genny Mobility
Italy