Introduction to Terracotta

VinayakIyer

Nov 24, 2013

CPOL

A brief introduction to Terracotta

Introduction

Computing has changed considerably over time, from low-end single-processor machines to large multiprocessor systems. Computing power keeps growing, as do the reach and speed of the internet, and large-scale systems now handle huge volumes of transactions and computation. As a system grows, so does its complexity. In a large-scale architecture, many application servers handle the application's data at different stages. Every application needs to maintain its state information. For example, a web application must keep a user's authentication information so that the user does not have to authenticate again and again.

Clustering

When the load on an application increases, the system becomes more vulnerable and more likely to fail. To address this, we cluster machines: the same application runs on several machines, which protects against the failure of any one server and also maintains the performance of the application. If one server fails, the other servers in the cluster take the incoming requests and serve them, so the user experience remains intact. Clustering addresses two main problems of any application: scalability and failover. Scalability is a measure of how well a system performs under increasing load. With clustering, the load is spread over a set of physical servers, so we can add servers as the load grows and replace any server that fails. As a result, we can achieve linear scalability, at least in theory.
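The routing behavior described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `ClusterRouter` class, its methods, and the server names are hypothetical and do not come from any clustering product. It hands out servers in round-robin order and skips any server that has been marked as failed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical round-robin router; illustrates failover only, not a real load balancer.
class ClusterRouter {
    private final List<String> servers;
    private final Set<String> failed = new HashSet<>();
    private int next = 0;

    ClusterRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    void markFailed(String server) {
        failed.add(server);
    }

    void markRecovered(String server) {
        failed.remove(server);
    }

    // Returns the next healthy server, trying each server at most once per call.
    String route() {
        for (int attempt = 0; attempt < servers.size(); attempt++) {
            String candidate = servers.get(next);
            next = (next + 1) % servers.size();
            if (!failed.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy server available");
    }

    public static void main(String[] args) {
        ClusterRouter router = new ClusterRouter(Arrays.asList("app1", "app2", "app3"));
        System.out.println(router.route());
        router.markFailed("app2");
        // app2 is skipped from now on; the remaining servers absorb its requests.
        System.out.println(router.route());
        System.out.println(router.route());
    }
}
```

Routing alone does not preserve user state: a request that moves from a failed server to a healthy one still needs the session data to be available there, which is the failover problem discussed next.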

Failover is the ability of an application to resume, in real time, from the state it was in when a failure occurred. After a successful failover, the system preserves the user's state as it was at the point of failure, and the user continues from that same state.

To support failover, the state of the user must survive the failure of a server. One way to achieve this is Java serialization. Serialization is the process of converting a Java object into a stream of bytes that can be sent across the network to another server and turned back into an object there. Using Java serialization, we can serialize the object that holds the user's state and pass it to all the servers in the cluster. If any one server fails, the other servers have the same user state, so the user's next request still returns the expected content.
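A minimal sketch of this round trip, using only the standard `java.io` serialization classes, might look as follows. The `UserSession` and `SessionReplication` names are hypothetical and chosen for illustration; the network transfer between servers is left out, and the byte array stands in for what would be sent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical user-state class; not taken from the article.
class UserSession implements Serializable {
    private static final long serialVersionUID = 1L;
    final String userName;
    final boolean authenticated;

    UserSession(String userName, boolean authenticated) {
        this.userName = userName;
        this.authenticated = authenticated;
    }
}

class SessionReplication {
    // Converts the whole object into bytes, as a server would before sending it to its peers.
    static byte[] serialize(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the object from bytes, as a peer server would after receiving them.
    static UserSession deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (UserSession) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new UserSession("alice", true));
        UserSession copy = deserialize(wire);
        System.out.println(copy.userName + " authenticated=" + copy.authenticated
                + " (" + wire.length + " bytes)");
    }
}
```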

This approach has drawbacks. It serializes the whole object again even when only a small part of the state has changed, and it sends the whole object to every other server each time, which consumes a lot of bandwidth. This makes serialization-based replication quite inefficient, so we need to look at alternative techniques that are more efficient and perform better.
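The cost can be made visible with a small experiment. The sketch below assumes a hypothetical `CartSession` class with a large list and one small field, and compares the serialized size of the whole object with the size of a message that carries only the changed field. The field-only message is an illustrative stand-in for a delta, not a format used by any particular product.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical session state: one large part and one small, frequently changing field.
class CartSession implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> cartItems = new ArrayList<>();
    String lastPage = "/home";
}

class SerializationCost {
    // Number of bytes Java serialization produces for the given object.
    static int serializedSize(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    static CartSession sessionWithItems(int count) {
        CartSession session = new CartSession();
        for (int i = 0; i < count; i++) {
            session.cartItems.add("item-" + i);
        }
        return session;
    }

    public static void main(String[] args) {
        CartSession session = sessionWithItems(1000);
        session.lastPage = "/checkout"; // a minor change to one field

        // Whole-object replication resends everything, including the unchanged cart.
        int fullObject = serializedSize(session);
        // A delta would only need to name the field and carry its new value.
        int fieldOnly = serializedSize(new String[] {"lastPage", session.lastPage});

        System.out.println("whole object: " + fullObject + " bytes");
        System.out.println("changed field only: " + fieldOnly + " bytes");
    }
}
```

Multiplied by the number of servers in the cluster and the number of updates per second, this difference is what makes whole-object replication expensive.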

Terracotta

Terracotta is an open source, enterprise, JVM-level clustering solution. JVM-level clustering lets an application be deployed on several JVMs that nevertheless interact with each other as if they were a single JVM. Terracotta uses bytecode instrumentation (BCI) to overcome the limitations of Java serialization. In general, BCI is a technique for modifying the bytecode of an application's classes, typically as they are loaded, so that the application's behavior can be changed without touching its source code. Using BCI, Terracotta identifies exactly which properties of an object have changed and propagates only those changes.

How It Works

In conventional clustering, the data is replicated on all the servers, and whenever an object changes on one server, the change is replicated to all the others. Terracotta, in contrast, uses a client/server architecture in which a dedicated Terracotta server maintains the state of the objects. When an object changes, the change is pushed to the Terracotta server. Using the information gathered through BCI, the Terracotta server then pushes only the required changes to the other servers, dynamically at runtime.
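The hub-and-spoke flow can be simulated in plain Java. The sketch below does not use the Terracotta API and involves no bytecode instrumentation: `CentralServer`, `AppNode`, and `FieldChange` are hypothetical names, and writes are reported explicitly through `write` rather than detected automatically. It only illustrates the idea that nodes exchange field-level changes through one central server and pick them up on demand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One field-level change: which object, which field, and the new value.
final class FieldChange {
    final String objectId;
    final String field;
    final Object value;

    FieldChange(String objectId, String field, Object value) {
        this.objectId = objectId;
        this.field = field;
        this.value = value;
    }
}

// Stand-in for the dedicated server: keeps an ordered log of field changes.
class CentralServer {
    private final List<FieldChange> log = new ArrayList<>();

    synchronized void publish(FieldChange change) {
        log.add(change);
    }

    synchronized List<FieldChange> changesSince(int index) {
        return new ArrayList<>(log.subList(index, log.size()));
    }
}

// Stand-in for one application server holding a local copy of the shared state.
class AppNode {
    private final CentralServer server;
    private final Map<String, Map<String, Object>> local = new HashMap<>();
    private int applied = 0; // number of log entries already applied locally

    AppNode(CentralServer server) {
        this.server = server;
    }

    // Sends only the changed field to the server, then catches up locally.
    void write(String objectId, String field, Object value) {
        server.publish(new FieldChange(objectId, field, value));
        sync();
    }

    // Pulls pending changes on demand before answering a read.
    Object read(String objectId, String field) {
        sync();
        Map<String, Object> fields = local.getOrDefault(objectId, Collections.emptyMap());
        return fields.get(field);
    }

    // Applies the changes this node has not seen yet; returns how many were applied.
    int sync() {
        List<FieldChange> pending = server.changesSince(applied);
        for (FieldChange change : pending) {
            local.computeIfAbsent(change.objectId, id -> new HashMap<>())
                 .put(change.field, change.value);
        }
        applied += pending.size();
        return pending.size();
    }

    public static void main(String[] args) {
        CentralServer server = new CentralServer();
        AppNode nodeA = new AppNode(server);
        AppNode nodeB = new AppNode(server);

        nodeA.write("session-42", "authenticated", Boolean.TRUE);
        System.out.println("nodeB sees authenticated=" + nodeB.read("session-42", "authenticated"));

        nodeA.write("session-42", "lastPage", "/cart");
        System.out.println("changes nodeB had to fetch: " + nodeB.sync());
    }
}
```

In this simulation each node talks only to the central server, so a write costs one small message regardless of how many nodes exist, and a node that never reads an object never pays for its updates.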

As a result, all the servers end up with the same data; the only difference is that changes are injected on demand rather than replicated everywhere all the time. Instead of replicating every change to all the servers, each server only has to keep the one Terracotta server up to date, so the replication overhead is reduced to that of updating a single server. We can also cluster the Terracotta server itself, so that we can recover even if it fails.

Conclusion

In large-scale deployments where data changes frequently, Terracotta is useful because it propagates only the data that has changed, and only when it is needed. This helps improve the performance and efficiency of the system.