Interfaces and Abstract Classes

18 Sep 2014, CPOL, 28 min read
A deep explanation on when to use interfaces and abstract classes.

Introduction

I often see people asking what the difference is between an interface and an abstract class, and most answers focus only on their different traits, not on how to use them.

That is, most answers say things like:

  • An abstract class can have an implementation while an interface can't;
  • In .NET we don't have multiple inheritance, so we can't use multiple abstract base classes, but we can implement multiple interfaces;
  • An interface is a contract, an abstract class is more than that (this one actually says nothing useful to me, but it is a common answer).

Well, there are many other answers that go in the same direction. Regardless of how correct those answers are, they don't answer the real question (even if sometimes it isn't asked explicitly):

When should we use an interface and when should we use an abstract class?

Short answer: If we want to create a method that receives an object to call one or more of its methods, expecting those methods to have different implementations according to the received object, we should ask for an interface. Now, if we need to provide an object for such a method, we should prefer inheriting from an abstract class instead of implementing the interface directly.

I know, that short answer is not very clear and doesn't explain why we should do that, so this article is all about explaining it.

Input arguments - Always use interfaces

Imagine that you are developing a system and, at certain moments, that system must generate logs. I am not going to discuss whether logging should be part of the main logic or not. What I am going to discuss is the fact that you may write the log to a text file, to a database or to something else.

At this moment, you simply know that you want to be able to do things like these:

C#
logger.Log("An error happened in module X.");
logger.ConcatLog("An error happened in module ", moduleName, ".");
logger.FormatLog("An error happened in module {0}.", moduleName);

If it is not clear enough, the difference between the three methods is that the first receives only a single string, the second receives a params array of objects that are concatenated together and the third receives a string with placeholders (the {0}) and then an array of objects that will fill the placeholders.

It is easy to write an interface for these methods:

C#
interface ILogger
{
  void Log(string message);
  void ConcatLog(params object[] messageParts);
  void FormatLog(string format, params object[] parameters);
}

Note that at this moment you simply don't need to care at all how those will be implemented. You only need an interface to be able to do the calls. If you think about other methods that you might need, it is OK to add them to the interface right now. What you are saying is "I need to call these methods and I don't care how they are implemented".

Abstract class - Provide a base implementation for those who may want it

In my opinion, an abstract class is useful if you want to provide a basic implementation for an interface when you believe some (or most) of its methods will always be implemented with the same code, or if you believe that new methods with default implementations may be added in the future.

For example, regardless of whether we are going to log to a database, to a text file or even send messages through TCP/IP, the methods ConcatLog() and FormatLog() may be implemented like this:

C#
public void ConcatLog(params object[] messageParts)
{
  string message = string.Concat(messageParts);
  Log(message);
}
public void FormatLog(string format, params object[] parameters)
{
  string message = string.Format(format, parameters);
  Log(message);
}

So, an abstract class can actually implement those two methods and keep the Log() method abstract. Then, when developing the FileLogger, the DatabaseLogger and the TcpIpLogger, any developer may use the abstract class that implements two of the three methods to avoid repetitive code.
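
As a minimal sketch of that idea, the abstract class could look like the code below. The names LoggerBase and MemoryLogger are assumptions made for this illustration; MemoryLogger only stands in for a FileLogger or DatabaseLogger so the example has no external dependencies.

```csharp
using System.Collections.Generic;

public interface ILogger
{
  void Log(string message);
  void ConcatLog(params object[] messageParts);
  void FormatLog(string format, params object[] parameters);
}

// Hypothetical base class: implements two of the three methods
// and leaves only Log() to the concrete loggers.
public abstract class LoggerBase : ILogger
{
  public abstract void Log(string message);

  public void ConcatLog(params object[] messageParts)
  {
    Log(string.Concat(messageParts));
  }

  public void FormatLog(string format, params object[] parameters)
  {
    Log(string.Format(format, parameters));
  }
}

// Stand-in for a FileLogger or DatabaseLogger: it keeps the
// messages in memory and only needs to write Log().
public sealed class MemoryLogger : LoggerBase
{
  private readonly List<string> _messages = new List<string>();

  public IList<string> Messages
  {
    get { return _messages; }
  }

  public override void Log(string message)
  {
    _messages.Add(message);
  }
}
```

A concrete logger written this way has a single method to implement, and any code that receives an ILogger can use it without knowing about LoggerBase.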

Why not start with the abstract class?

Considering my example, a common question is: Why not start with the abstract class and avoid the creation of a "useless" interface?

Well, who guarantees that the interface is useless?

For example, my actual implementation of those methods is not completely right. It doesn't validate the input parameters, so if ConcatLog() is called with null, an ArgumentNullException will be thrown saying that args is null. But ConcatLog() receives messageParts, not args: that name belongs to the parameter of string.Concat(). A similar problem happens with FormatLog(). Of course, we can solve that problem by correcting the abstract class if we are its authors, but what happens if the abstract class comes in a compiled library and we are only users of that library?

If the developers of the compiled library always receive things by an interface, regardless of the existence of the abstract class, we are free to completely reimplement the methods, giving the right error messages if needed.
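
To illustrate, here is a sketch of such a full reimplementation, under the assumption that we only have the interface to work with. ValidatingMemoryLogger is a hypothetical name, and storing the messages in a list is just a convenient stand-in for a real destination.

```csharp
using System;
using System.Collections.Generic;

public interface ILogger
{
  void Log(string message);
  void ConcatLog(params object[] messageParts);
  void FormatLog(string format, params object[] parameters);
}

// Hypothetical full reimplementation: it uses no base class, so it
// can validate its arguments and report its own parameter names.
public sealed class ValidatingMemoryLogger : ILogger
{
  private readonly List<string> _messages = new List<string>();

  public IList<string> Messages
  {
    get { return _messages; }
  }

  public void Log(string message)
  {
    if (message == null)
      throw new ArgumentNullException("message");

    _messages.Add(message);
  }

  public void ConcatLog(params object[] messageParts)
  {
    // Without this check, string.Concat() would complain about "args".
    if (messageParts == null)
      throw new ArgumentNullException("messageParts");

    Log(string.Concat(messageParts));
  }

  public void FormatLog(string format, params object[] parameters)
  {
    if (format == null)
      throw new ArgumentNullException("format");

    if (parameters == null)
      throw new ArgumentNullException("parameters");

    Log(string.Format(format, parameters));
  }
}
```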

As another example, what about a NullLogger? That is, you give a logger as a parameter to a method that expects a logger, yet your logger doesn't do anything.

You may implement a NullLogger with the abstract class, but two of the three methods will waste time formatting/concatenating the message that will not be logged. A NullLogger that implements all the interface methods to do nothing will be faster. This kind of "no action" object helps to avoid checking for null and is so common that there's even a design pattern for it: Null Object Pattern.
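
A NullLogger written directly against the interface can be sketched as below. The shared Instance field and the private constructor are my own additions for this illustration; the only essential part is that all three methods are empty.

```csharp
public interface ILogger
{
  void Log(string message);
  void ConcatLog(params object[] messageParts);
  void FormatLog(string format, params object[] parameters);
}

// Null Object: every method is implemented directly and does nothing,
// so no string is ever concatenated or formatted.
public sealed class NullLogger : ILogger
{
  // One shared instance is enough, as the class has no state.
  public static readonly NullLogger Instance = new NullLogger();

  private NullLogger()
  {
  }

  public void Log(string message)
  {
  }

  public void ConcatLog(params object[] messageParts)
  {
  }

  public void FormatLog(string format, params object[] parameters)
  {
  }
}
```

Code that expects an ILogger can receive NullLogger.Instance instead of null and keep calling the log methods unconditionally.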

Finally, we never know the kinds of statistics our users need. Imagine that someone wants to create a logger that not only logs the message but also counts how many times each one of the log methods was called. If we start with the abstract class that always redirects to the Log() method, we will only be able to count how many times the Log() method was called, regardless of which methods the users were actually calling. By fully implementing an interface we avoid this problem, as we can generate the statistics for each method individually.

So, putting ourselves in the position of developers who give components to others, we must try to get the components completely right and also allow users to reimplement anything they see fit, be it because we made a mistake or simply because they have a special need.
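
One possible shape for such a statistics logger is sketched below. CountingLogger and DiscardLogger are hypothetical names, the counters are not thread-safe, and forwarding to an inner logger is an assumption of this sketch rather than a requirement.

```csharp
using System;

public interface ILogger
{
  void Log(string message);
  void ConcatLog(params object[] messageParts);
  void FormatLog(string format, params object[] parameters);
}

// Hypothetical statistics logger: it implements ILogger directly and
// forwards each call to another logger, counting the calls per method.
public sealed class CountingLogger : ILogger
{
  private readonly ILogger _inner;

  public CountingLogger(ILogger inner)
  {
    if (inner == null)
      throw new ArgumentNullException("inner");

    _inner = inner;
  }

  public int LogCalls { get; private set; }
  public int ConcatLogCalls { get; private set; }
  public int FormatLogCalls { get; private set; }

  public void Log(string message)
  {
    LogCalls++;
    _inner.Log(message);
  }

  public void ConcatLog(params object[] messageParts)
  {
    ConcatLogCalls++;
    _inner.ConcatLog(messageParts);
  }

  public void FormatLog(string format, params object[] parameters)
  {
    FormatLogCalls++;
    _inner.FormatLog(format, parameters);
  }
}

// Stand-in target that simply discards everything it receives.
public sealed class DiscardLogger : ILogger
{
  public void Log(string message) { }
  public void ConcatLog(params object[] messageParts) { }
  public void FormatLog(string format, params object[] parameters) { }
}
```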

An abstract class that comes with an implementation yet is fully virtual

A possible solution to the previous problem is to create an abstract class that implements ConcatLog() and FormatLog() and still lets them be overridden (so, makes them virtual), completely avoiding the existence of an interface.

This works pretty well, as users may reimplement any method of the class. So, looking at this problem in isolation, an abstract class that comes with an implementation but keeps all the methods virtual is a better solution than an interface. But continue reading, as I will explore the differences even further.
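
A rough sketch of this class-only alternative follows. The class names Logger and TracingLogger are assumptions made for the example; TracingLogger overrides one default and inherits the other to show that both choices remain open.

```csharp
using System.Collections.Generic;

// Hypothetical class-only design: there is no interface, and every
// method can be overridden.
public abstract class Logger
{
  public abstract void Log(string message);

  public virtual void ConcatLog(params object[] messageParts)
  {
    Log(string.Concat(messageParts));
  }

  public virtual void FormatLog(string format, params object[] parameters)
  {
    Log(string.Format(format, parameters));
  }
}

// Overrides one default and inherits the other, recording which
// methods end up being executed.
public sealed class TracingLogger : Logger
{
  public readonly List<string> Calls = new List<string>();

  public override void Log(string message)
  {
    Calls.Add("Log");
  }

  public override void FormatLog(string format, params object[] parameters)
  {
    Calls.Add("FormatLog");
    base.FormatLog(format, parameters);
  }
}
```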

Future changes

Here is a big difference between an interface and an abstract class.

If a new method is added to an interface, all classes that implement it must change to include the new method, even if all implementations are identical.

If a new method is added to an abstract class, as long as it comes with a default implementation, nothing else needs to change. Users will simply see that there's a new method there that they can invoke or override.

Of course, a new abstract method on an abstract class is as problematic as a new method in an interface. But with default implementations, we can see that the abstract class has an advantage.

Yet, there's another question that we should ask: Why are we going to add another method to an interface or abstract class?

It is guaranteed that already existing compiled code that's working with the old version of the interface or abstract class is not going to call the new method, so why create such a breaking change?

Well, I will try to list some reasons to do this:

  1. The code is still under development and we simply saw that we need more methods (in this case, we can say there's no "existing code" that will break);
  2. The code is in a library given to many different users, and the users are asking for new methods they consider to be missing (so, the users will use the new method as soon as it is available);
  3. We are changing the code that calls the actual interface or class and we want new variants of the existing methods to make things faster;
  4. Similar to the previous case, we are changing the code that calls the interfaces or abstract classes, but now we see that we need new methods that simply can't be simulated with the existing ones. It is not simply a question of performance;
  5. We have many complaints that some methods have too many parameters and are hard to use, so the users want easier-to-use overloads.

Well, I am pretty sure there are many other reasons, but I will stop here.

For the first situation, we can change things freely, as we didn't "close" the code yet. The only things that will matter will be our purpose and our own rules regarding the possible evolution of the application.

For the second situation I will use the .NET Stream class as an example. In the first version it didn't support timeouts, yet that was a very common need and it was added, with a default implementation that says it doesn't support timeouts (CanTimeout returns false) and throws an InvalidOperationException if we try to use the read or write timeout properties. It would be wrong to change an already existing interface to add those properties, as this would break all the existing implementations, yet it would be possible to create another interface with the extra functionality. In any case, in .NET the Stream is an abstract class, not an interface, and the new properties were added with a default implementation.

For the third situation, we can add those extra methods with a default implementation in an abstract class. So, those who want to use the faster implementation can. Those who don't will keep their code working without changes. Unfortunately, changing an interface will cause breaking changes. So we may ignore the change, keeping the slower speed, we may cause the breaking change (and leave some users furious) or we may add an extra interface, which our code will try to use with a checked cast, falling back to the "default" implementation when the cast fails. Unfortunately, again, adding an extra interface and the interface lookup affects performance so, in some cases, we will lose the performance boost that we were looking for.
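
The "extra interface + checked cast" option can be sketched as follows. IBatchLogger, LogWriter and the two small loggers are hypothetical names invented for this example; the one-method ILogger is a reduced version of the interface used earlier.

```csharp
using System.Collections.Generic;

public interface ILogger
{
  void Log(string message);
}

// Hypothetical extra interface, added later for implementations
// that can write many messages in a single operation.
public interface IBatchLogger
{
  void LogMany(IEnumerable<string> messages);
}

public static class LogWriter
{
  public static void WriteAll(ILogger logger, IEnumerable<string> messages)
  {
    // Checked cast: null when the logger lacks the extra interface.
    var batchLogger = logger as IBatchLogger;
    if (batchLogger != null)
    {
      batchLogger.LogMany(messages);
      return;
    }

    // "Default" implementation for loggers that only know ILogger.
    foreach (string message in messages)
      logger.Log(message);
  }
}

// Old implementation: untouched by the new interface.
public sealed class SimpleLogger : ILogger
{
  public int LogCalls;

  public void Log(string message) { LogCalls++; }
}

// New implementation: opts in to the faster path.
public sealed class BatchLogger : ILogger, IBatchLogger
{
  public int LogCalls;
  public int BatchCalls;

  public void Log(string message) { LogCalls++; }
  public void LogMany(IEnumerable<string> messages) { BatchCalls++; }
}
```

Existing implementations such as SimpleLogger keep compiling, at the cost of one `as` check on every WriteAll() call.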

For the fourth situation, we will have a breaking change using either an interface or an abstract class. So there's no difference here.

For the fifth situation, we can solve the problem by adding extension methods that work as the simpler-to-use overloads (this works for both interfaces and classes, but users must add the right using clause to see these extensions), or we can add the overloads to the abstract class without problems, even if they aren't virtual. Adding them to the interface will be a breaking change but, compared to the extension methods, it has the advantage that anyone who sees the interface sees all the overloads.
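
As an illustration of the extension method option, consider the sketch below. IDetailedLogger, its three-parameter Log() method and the default values are all assumptions made up for the example.

```csharp
using System.Collections.Generic;

// Hypothetical interface with a single, parameter-heavy method.
public interface IDetailedLogger
{
  void Log(string category, int severity, string message);
}

// The easier-to-use overload lives outside the interface. Callers only
// see it when this class's namespace is imported (this sketch uses the
// global namespace, so it is always visible here).
public static class DetailedLoggerExtensions
{
  public static void Log(this IDetailedLogger logger, string message)
  {
    // The default category and severity are assumptions of this sketch.
    logger.Log("General", 0, message);
  }
}

// Stand-in implementation that only records what it receives.
public sealed class RecordingLogger : IDetailedLogger
{
  public readonly List<string> Entries = new List<string>();

  public void Log(string category, int severity, string message)
  {
    Entries.Add(category + "|" + severity + "|" + message);
  }
}
```

RecordingLogger implements a single method, yet callers can write logger.Log("Disk is full.") as if the overload were part of the interface.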

Interface + Abstract class

Except for the fourth problem, which is always a breaking change, we can solve all the problems by always having an interface + an abstract class, allowing us to add new methods that are virtual and always visible (which is better than the extension methods).

It is not a problem to add a new method to an interface if all the implementations are based on an abstract class that you are providing, as the abstract class can give the default implementation. This can actually create a "design pattern" where for each interface there's an abstract class, even if it doesn't give any default implementation in its first version.

So, those who prefer to revisit their code if the interface changes can implement it directly (and if everything is well documented, they shouldn't blame you for breaking changes if/when the interface changes). Users who prefer a "default behavior" instead of reviewing the interface changes simply inherit from the abstract class. If this is done, we return to the short answer I gave at the beginning of the article: method parameters should always use the interface, so they can support any implementation (based on the abstract class or not), yet the implementations should try to use the abstract class to avoid breaking changes.

Big or small interfaces? - Advanced

I finished with the direct differences between interfaces and abstract classes and I believe from this point on the article becomes advanced. So, if you don't want to see the advanced text, you can jump to the topic Summing it up.

Now I have a question, for which I know two opposite answers: Should we use big or small interfaces?

I just talked about method overloads. Do you think interfaces should present all the possible overloads, or should they have only the method with the most parameters, leaving the overloads to separate helper classes?

Many people will say that we should use the Interface Segregation Principle (also known as ISP), that we should only give the methods with the most parameters and that any overload should be elsewhere. Well, actually ISP doesn't talk about method overloads, it talks about interfaces that do more than what they should do. For example, most of my IValueConverters used in WPF don't need the ConvertBack() method because I use one-way bindings, yet the interface forces me to implement that method by throwing a NotSupportedException. It is not an overload of an existing method, it is a method with a different purpose that's not used.

So, the overloads aren't necessarily an ISP violation, yet it became implicitly understood by those who follow the principle that we should not put the overloads in the interface. This returns to the problem already presented that we may not be able to monitor how many times each method was called, as we will only be able to override the method with most parameters. Aside from that, how do we make these extra methods accessible?

The normal solution now is to use extension methods. With extension methods we can "add" methods with an implementation even to interfaces. Yet, they aren't real methods of the interface and they will not appear to the user if the source file isn't importing the right namespace. This may cause a lot of confusion for developers who know that an interface "has" a certain method but find that the calls are simply not compiling.

Also, those methods are seen as "implementation" on top of the interface. That is, we are not depending on interfaces, we are still depending on implementations (even if they are small) which then depend on the interface.

So, we have two options:

  • Small interfaces + extension methods: Any overload that simply fills some defaults is put as an extension method and the interface stays intact. We will not break any code that depends on or implements the interface, as the interface doesn't change. The new overloads will not be real interface methods and users may not see them if they miss a using clause. Overriding those overloads is not possible (but there's a work-around that I will present later);
  • Big interfaces: Any overload is put as part of the interfaces. So, adding a new overload is a breaking change for the interfaces, yet we can solve the trouble for those who want to implement them by also giving abstract classes on top of the interfaces, which we will need to update when the interfaces change. As the methods are part of the interfaces, they can always receive specific implementation (useful to know exactly which methods are called and in remoting scenarios where it is preferable to send less data instead of sending the defaults) and they are always visible when we have access to the interface.

Contrary to what most developers say, I usually prefer the big interfaces, as long as I keep the abstract classes to avoid the breaking changes. But I know, this is a personal preference.

Interface Segregation Principle (ISP) and "overridable" extension methods

Extension methods are static methods written in static classes that can be invoked as if they were member methods of their targeted types.

As they are static, they can't be overridden. Yet there's a work-around for this. The extension method should first try to cast the received object to an interface that implements the same functionality it does. If the cast is valid, it should use that version. If not, it continues with its own implementation.

We can see this in some LINQ methods. For example, the ElementAt() method first tries to cast the enumerable to an IList<T>, which already has an indexer (very fast for lists and arrays), and only when the cast fails does it enumerate the enumerable until it reaches the requested index.

To achieve this kind of functionality we usually keep a base interface with the minimum required methods (in our example, an ILogger with only the Log() method) then we have sub-interfaces, like IFormatLogger and IConcatLogger, each one with a single specialization (usually a single method, but it is possible to have some related methods together).

With this approach we can extend base interfaces and still allow the "overridable" behavior at run-time and, better yet, the extra methods could be added in different assemblies.
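
Applied to the logger example, an "overridable" extension method could be sketched as below. Only the ILogger and IFormatLogger names come from the text; LoggerExtensions, PlainLogger and RawFormatLogger are hypothetical.

```csharp
using System.Collections.Generic;

public interface ILogger
{
  void Log(string message);
}

// Optional specialization, following the IFormatLogger idea above.
public interface IFormatLogger : ILogger
{
  void FormatLog(string format, params object[] parameters);
}

public static class LoggerExtensions
{
  public static void FormatLog(
    this ILogger logger, string format, params object[] parameters)
  {
    // "Overridden" version: used when the logger provides its own.
    var formatLogger = logger as IFormatLogger;
    if (formatLogger != null)
    {
      formatLogger.FormatLog(format, parameters);
      return;
    }

    // Default version for loggers that only implement ILogger.
    logger.Log(string.Format(format, parameters));
  }
}

// Knows nothing about formatting: it gets the default version.
public sealed class PlainLogger : ILogger
{
  public readonly List<string> Messages = new List<string>();

  public void Log(string message) { Messages.Add(message); }
}

// Keeps the format unformatted, as a remoting logger might do.
public sealed class RawFormatLogger : IFormatLogger
{
  public readonly List<string> Formats = new List<string>();

  public void Log(string message) { }

  public void FormatLog(string format, params object[] parameters)
  {
    Formats.Add(format);
  }
}
```

Callers that hold a plain ILogger always write logger.FormatLog(...); which version runs is decided at run-time by the cast.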

The bad parts of this approach are:

  • When implementing the base interface, users may not be aware of all the extra methods that could be overridden, which is even more problematic as users must know both the extension method and the interface used by that method;
  • There's an extra interface lookup. Such a lookup is usually pretty fast, but this means that creating extra methods to add a little speed boost to the base interface may end up not giving any benefit, as we gain in the method itself but we lose with that extra lookup.

Interface Segregation Principle and Decoration

The Interface Segregation Principle splits large interfaces into many small interfaces. Regardless of the other discussions about abstract classes, interfaces, method overriding and performance, this technique has another problem: It interacts terribly with decoration.

Decoration is when we implement an interface (or even abstract class) to add some behavior and then we redirect to another implementation. For example, we may use a decorator to count how many times each method is invoked, used to generate statistics at the end of the application, yet we keep calling the methods of another implementation to do the real job.

When we have a single class or interface with many methods we know how many methods we must implement. But when we have many segregated interfaces, this is not so easy. We may need to have decorators that implement exactly the same interfaces as the target objects, which is not viable in many situations.

To make things clear, I will use streams as an example. In .NET there is a single Stream abstract class. In Java the same concept is split across several types (InputStream and OutputStream, for example). I will not focus on the Java version, but the .NET Stream class could be divided into:

  • Read part: As there are read-only streams;
  • Write part: As there are write-only streams;
  • Time out part: Both the read and the write parts may or may not support timing-out;
  • Seeking: When dealing with files (and some other streams), we may "reposition" what we see on that stream. In video files, for example, we may go to minute 45 without waiting for 45 minutes and we may go back to the beginning at any moment. Yet, most streams are forward only.

As everything is inside a single class in .NET, if we want to create a decorator we know that we must simply inherit a single class and override all the methods that we can.

If we could have many different interface combinations, how would we create a decorator?

  • If the decorator implements only the Read interface without implementing the Timeout one, it is not important if the base object supports time-out, users of the decorator will not have access to the timeout anymore;
  • If the decorator implements the timeout interface, then users can simply assume that such an instance supports timeouts, but this will not be true if the decorated object doesn't have such support. Throwing exceptions when the timeout is configured seems wrong, as the decorator implements the interface. Ignoring the timeout configuration when the inner stream doesn't support it is wrong too, after all the code may be ignoring timeout alternatives because the decorator apparently supports timeouts. Writing a default implementation for timeouts is too much and will possibly cause bugs if there's no method to cancel a pending action (oh... I forgot to put an ICancellableStream in the list of possibilities). Putting a CanTimeout inside a timeout interface is counter-intuitive, as users expect to see that interface only when the object supports timing out. After all, many developers forget to check if an object is not null. Now imagine checking that an object is not null, that it implements an interface and that it also has a CanTimeout of true.

So, should we create a decorator class per possible combination? That is, a decorator for the read interface only, one for the read and timeout interface, one for write, one for read and write... and I think you got the idea, this will require many, many implementations to give all the possible combinations.

Should we only create decorators at run-time, analyzing the existing objects instead of hard-coding them? This will make generating decorators pretty hard and it may not be possible in all situations, as some constrained environments don't allow generating code at run-time.

Should all the interfaces have a CanSomething so decorators can implement all the interfaces and still tell which actions they don't support? This is how the Stream class tells if it supports something or not, but it is a single class with many possibilities. An interface with a single kind of action saying that it doesn't support that action is awkward. And, if the decorators must have all the interfaces, why not have a base class that lists all the possible actions with those CanSomething properties to start with?

Well, that's what the .NET Stream class does. By not having independent interfaces, it is guaranteed that all the Stream implementations will have all the same methods, even if some of them throw NotSupportedExceptions.

So, for particular scenarios, abstract classes may still be a better option than segregated interfaces, be it for the ease of creating adapters or for their performance.

Interface Segregation Principle + Easier decoration

I already presented how it is possible to use "overridable" extension methods. So, why not extend the concept even further to allow easier decoration?

Actually the biggest problem with decoration is the fact that the casts are done directly over the received objects. So, either the object implements the interface or not. It doesn't have the chance to say: "Hey, even if I implement the interface, in this particular situation ignore it".

That's what the CanSomething properties were trying to tell. But in a well done architecture the interfaces will never need to have a property to tell if they work or not. Yet, the framework will not simply cast the received instance to another interface directly. It will "ask" for that other interface.

This can be done in a very configurable manner using events and static events, which I am not going to discuss in this article as it is already getting long, or it can be very simple, like having a TryGetService<T>() in the main interface.

So, this is how a Stream could look if built entirely of segregated interfaces:

C#
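using System; // IDisposable and TimeSpan

public interface IStream:
  IDisposable
{
  // Returns the requested service, or null when it isn't supported.
  T TryGetService<T>();
}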

public interface IReadStream:
  IStream
{
  // The partial methods may read less than the amount requested.
  int PartialRead(byte[] buffer);
  int PartialRead(byte[] buffer, int offset);
  int PartialRead(byte[] buffer, int offset, int length);

  // These methods will either read all the requested amount
  // or throw an exception if the stream ends before that amount.
  void Read(byte[] buffer);
  void Read(byte[] buffer, int offset);
  void Read(byte[] buffer, int offset, int length);
}

public interface IWriteStream:
  IStream
{
  // The partial methods may write less than the amount requested.
  int PartialWrite(byte[] buffer);
  int PartialWrite(byte[] buffer, int offset);
  int PartialWrite(byte[] buffer, int offset, int length);

  // These methods will either write all the requested amount
  // or throw an exception if the stream is full before all
  // the data was written.
  void Write(byte[] buffer);
  void Write(byte[] buffer, int offset);
  void Write(byte[] buffer, int offset, int length);
}

// Maybe the size should be in another interface, but I consider
// that we should not be able to reposition to the end if we don't 
// know the size of the stream, so I put both together.
public interface IReadPositionableStream:
  IReadStream
{
  long Position { get; }
  long Size { get; }
}

public interface IWritePositionableStream:
  IWriteStream
{
  void SetPosition(long value);
  void SetSize(long value);
}

public interface IReadTimeoutStream:
  IReadStream
{
  TimeSpan? ReadTimeout { get; set; }
}

public interface IWriteTimeoutStream:
  IWriteStream
{
  TimeSpan? WriteTimeout { get; set; }
}

With these interfaces, code that requires a read stream that supports timeouts may have an input parameter of type IReadTimeoutStream.

Code that wants to write to a stream that may be repositioned may request an IWritePositionableStream directly, yet it may try to configure a timeout by requesting the IWriteTimeoutStream.

During such a request, a decorator may check whether its inner stream supports that interface. If the inner stream doesn't support it, the decorator may return null to tell that it doesn't support the service, even though in its code it may actually implement all those interfaces, redirecting to the inner stream when possible. It can also delegate the decoration to another object, so it doesn't need to implement all the interfaces itself, it only needs to know how to find/create the other implementation.

So, an important thing to remember is:

If you want to use segregated interfaces, don't do direct casts. Ask if the object can do the extra actions by calling a method that may return an object of the right interface or null. This will greatly simplify decorations.
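
The sketch below shows one way a decorator could apply this rule. The interfaces are reduced versions of the ones presented above, the `where T : class` constraint is an assumption added so that null can be returned, and CountingReadStream and ZeroReadStream are hypothetical names.

```csharp
using System;

// Reduced, hypothetical versions of the article's interfaces.
// The "where T : class" constraint is an assumption of this sketch;
// it lets TryGetService<T>() return null for unsupported services.
public interface IStream : IDisposable
{
  T TryGetService<T>() where T : class;
}

public interface IReadStream : IStream
{
  int PartialRead(byte[] buffer, int offset, int length);
}

public interface IReadTimeoutStream : IReadStream
{
  TimeSpan? ReadTimeout { get; set; }
}

// Hypothetical decorator that counts the bytes read. It implements the
// timeout interface, but only offers it when the inner stream has it.
public sealed class CountingReadStream : IReadTimeoutStream
{
  private readonly IReadStream _inner;
  private readonly IReadTimeoutStream _innerTimeout; // null if unsupported

  public CountingReadStream(IReadStream inner)
  {
    if (inner == null)
      throw new ArgumentNullException("inner");

    _inner = inner;
    _innerTimeout = inner.TryGetService<IReadTimeoutStream>();
  }

  public long BytesRead { get; private set; }

  public T TryGetService<T>() where T : class
  {
    // Refuse the timeout service when the decorated stream lacks it.
    if (typeof(T) == typeof(IReadTimeoutStream) && _innerTimeout == null)
      return null;

    return this as T;
  }

  public int PartialRead(byte[] buffer, int offset, int length)
  {
    int read = _inner.PartialRead(buffer, offset, length);
    BytesRead += read;
    return read;
  }

  public TimeSpan? ReadTimeout
  {
    get { return _innerTimeout != null ? _innerTimeout.ReadTimeout : null; }
    set
    {
      // Only reachable through a direct cast, which callers should avoid.
      if (_innerTimeout == null)
        throw new NotSupportedException("The inner stream has no timeouts.");

      _innerTimeout.ReadTimeout = value;
    }
  }

  public void Dispose()
  {
    _inner.Dispose();
  }
}

// Stand-in inner stream without timeout support: it "reads" zeros.
public sealed class ZeroReadStream : IReadStream
{
  public T TryGetService<T>() where T : class
  {
    return this as T;
  }

  public int PartialRead(byte[] buffer, int offset, int length)
  {
    Array.Clear(buffer, offset, length);
    return length;
  }

  public void Dispose() { }
}
```

A caller that asks the decorator for IReadTimeoutStream gets null here, because the wrapped ZeroReadStream has no timeouts, even though the decorator class itself implements that interface.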

Multiple inheritance

When seeing comparisons between interfaces and abstract classes it is common to see arguments that a class can implement many interfaces but can only inherit from one abstract class (talking about .NET... C++ supports multiple inheritance). So it is common to say that interfaces are the way to go if we want multiple inheritance in .NET.

Well, that's actually a terrible argument. Using an abstract class we have the benefit of a default implementation. For example, I use base classes that implement the INotifyPropertyChanged interface and raise the event accordingly. Considering there's an INotifyPropertyChanging interface too, I would love to have the possibility to implement both interfaces with a default implementation by inheriting from two base classes, one for each interface, and this is the kind of inheritance that I would not like to expose (private inheritance, also not available in .NET). I would like users of my class to see only my class inheriting directly from object and implementing the interfaces, yet I would not need to write that repetitive code for the interfaces.

So, considering that we may want abstract classes because of their default implementation and considering that interfaces can't have a default implementation, the possibility of implementing many interfaces doesn't help achieve multiple inheritance at all. We will still need to provide an implementation for all of them.

Finally, if we return to the previous topic, if frameworks "asked" for a service instead of doing direct casts, we could very well live with classes that are limited to implement either one base class or one interface, as any extra service could be discovered at run-time by doing the right call and by receiving different instances that implement the other interfaces if needed.

In the end, the fact that our classes may implement many interfaces is a nice-to-have feature, not a required one, and it may be a needed feature to be able to use already existing frameworks that use casts to interfaces to search for features. Looking at the problem differently, those frameworks were created that way only because there's support for multiple interfaces. Maybe all the frameworks would work better if multiple interface implementations were never supported.

Summing it up

Right things

  • If you want to call an abstraction, call an interface, not an abstract class. That is, input parameters should use the type of an interface;
  • If you want to give default implementations to an interface, create an abstract class that implements it, regardless of how many methods are kept abstract;
  • If you want to implement either an interface or an abstract class to be able to pass a parameter to a method that expects the interface, use the interface if you prefer to implement all the methods and to have errors that force you to revalidate your implementation if new methods are added to the interface in the future. Use the abstract class if you prefer to use default implementations, regardless of whether they exist today or may be added in the future.

Maybe some developers will consider the next item debatable, especially because I never saw a well-known pattern talking about it but, to me, it is the right thing to do:

  • If you are writing a framework that may ask for different services of an object, don't cast. Prefer using a method or, even better, an event to ask for that. This allows for easier decoration (as an object that implements a specific interface can still say that it doesn't support it because the decorated object doesn't) and users of the framework would be able to give implementations of the interface even to third-party created objects.

Debatable things

  1. Use of small interfaces is better;
  2. The Interface Segregation Principle makes the code more maintainable;
  3. Use of big interfaces is better;
  4. You should not create an interface when you can create an abstract class and keep all the methods as virtual or abstract;
  5. You should write interfaces in one assembly and their default implementations (if any) in another one.

The first two points are based on the idea that we should not force users to implement big interfaces if we are only interested in a small part of it, but such segregation may make it hard to know what are all the possible methods that can be used or implemented and may cause some performance loss when we do actually want to use the "extra" functionalities;

The third point goes in the opposite direction, making it clear that all the possibilities are available at a single place, so users know all the methods they can decorate and call, without having to learn anything about "helper" classes;

The fourth point tries to avoid duplications. If an abstract class can have all its methods reimplemented, there's no need for an extra interface to tell the same (and this may actually make things a little faster, as apparently the virtual dispatch of class methods is faster than the virtual dispatch of interface methods). But aside from possible performance and multiple inheritance issues (as you may need to use a base class for something else) this may create the wrong pattern, both because you may forget and put fields or non-virtual methods on the class and because of the next point;

The fifth point is related to how the interfaces are used. Code that needs to use an interface doesn't care if the implementations used an abstract class as the base or not. Such code doesn't need to reference any assembly aside from the one with the interfaces. This allows using completely different implementations without having to load any code that will not be used. Unfortunately, when you want to use the default implementation, you will need to reference two assemblies (the interface one + the default implementation).

Wrong thing

  • You should always use interfaces. Abstract classes are bad because class inheritance is bad.

This point, well, I think that there are enough differences, each one has its strong points and they can complement each other, so there's no need to choose one over the other in all situations.

The Sample

The sample application isn't useful as an application, being useful only as a "proof of concept".

Its purpose is to show how we can combine Interface Segregation Principle, Interfaces + Abstract classes and decoration by "requesting services" instead of doing direct casts.

In fact, it has the IStream and its sub-interfaces presented in this article + abstract classes that implement them and an adapter to redirect all the calls to a normal .NET Stream. If we were seriously trying to recreate the Streams we should avoid redirecting to the .NET Stream, but this is only a sample.

I believe this sample shows the potential of this solution (as it can be expanded "without limits" while keeping decoration possible), yet it also shows the problem of segregated interfaces, as something that could be done with a single abstract class + a single implementation is done through many classes and interfaces, making it hard to figure out how to use or implement them directly. So, it works both as a proof of the potential and as a proof of the problems of these patterns.

Diamond Problem

Actually, having a TryGetService<T> in the basic stream creates a kind of Diamond Problem, considered a big problem of multiple inheritance. As TryGetService<T> can return another object, which also has a TryGetService<T>, that other object may have a different implementation for it. This is simply an error, as every object returned should have the same implementation for that method.

To solve this problem, in the sample I created the StreamPart abstract class. Each "part" of the stream actually redirects to the main stream when the TryGetService<T> is invoked, so we can say that they don't have their own implementation (or that they use the same implementation). If they had their own implementation it would be very strange, as we could ask the main stream for a service and get it, yet from that service itself such request could fail. All in all, this is a problem with the architecture in which the objects may answer if they support a service or not by possibly giving new objects but, as you can see, it can be solved by always redirecting to the "first" object.
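
The redirection idea can be sketched as below. This is not the sample's actual code: the StreamPart shown here is my own minimal reconstruction from the description above, and MainStream, ZeroReadPart, the `where T : class` constraint and the Dispose() behavior are assumptions made for the illustration.

```csharp
using System;

public interface IStream : IDisposable
{
  T TryGetService<T>() where T : class;
}

public interface IReadStream : IStream
{
  int PartialRead(byte[] buffer, int offset, int length);
}

// Every part forwards service requests to the main stream, so all
// the objects of one stream share a single TryGetService<T>() logic.
public abstract class StreamPart : IStream
{
  private readonly IStream _mainStream;

  protected StreamPart(IStream mainStream)
  {
    if (mainStream == null)
      throw new ArgumentNullException("mainStream");

    _mainStream = mainStream;
  }

  public T TryGetService<T>() where T : class
  {
    return _mainStream.TryGetService<T>();
  }

  public void Dispose()
  {
    _mainStream.Dispose();
  }
}

// Hypothetical read part that "reads" zeros.
public sealed class ZeroReadPart : StreamPart, IReadStream
{
  public ZeroReadPart(IStream mainStream) : base(mainStream)
  {
  }

  public int PartialRead(byte[] buffer, int offset, int length)
  {
    Array.Clear(buffer, offset, length);
    return length;
  }
}

// The "first" object: the only place where services are resolved.
public sealed class MainStream : IStream
{
  private readonly ZeroReadPart _readPart;

  public MainStream()
  {
    _readPart = new ZeroReadPart(this);
  }

  public T TryGetService<T>() where T : class
  {
    // Try the main object first, then the parts it owns.
    return (this as T) ?? (_readPart as T);
  }

  public void Dispose() { }
}
```

With this arrangement, asking the read part for a service gives the same answer as asking the main stream, which is the consistency that the paragraph above requires.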

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)